I’ve just come back from seeing Hidden Figures. My brain is on fire.
The film tells the story of the first Computers – black women! – who performed the calculations that enabled the US to send men into space. Who not only performed those calculations, but invented them. Who solved fiendishly difficult problems that had never been solved before. It is a masterpiece of storytelling, of filmmaking, and of setting history right. Because these are stories we cannot tell often enough.
Despite the realities of history, we have allowed that story to be rewritten. We continue to believe a story that says that the colour of your skin, and the number of your X chromosomes, dictates what you can and can’t do. What you are, and aren’t, good at.
We have allowed the history of computing to become overwhelmingly white, overwhelmingly male, and overwhelmingly false. We have allowed the history of space to become the history of a few white men.
Hidden Figures shocked me in its all-too-accurate depiction of overt racism and sexism – and it didn’t even deal much with the violence, or the hatred. It made me realise how far we have come. But it also made me realise how far we still have to go.
We still have politicians telling us that your worth is dictated by your race. Or your religion. Or your clothes.
We still have teachers telling girls that computing isn’t really for them. That maths isn’t for them. That science is for boys. Most of them don’t say it outright now. But in a hundred tiny ways, girls are steered away from the “hard stuff”. In the questions directed to the boys, because the girls might not know the answer. In the career suggestions. In the level of support and encouragement.
But we know that women were central to the start of computing. From Ada Lovelace to Grace Hopper. From Katherine Johnson to Anita Borg. Women have figured in computing every step of the way – and they have frequently had to overcome extreme obstacles to do so, while all white men had to do was pick up the opportunities strewn at their feet.
I still get shocked looks when I tell people I am a Computer Scientist. A PhD in Computer Science is still something women aren’t expected to have. And the numbers of women going into technical fields remain horrifically low, because we are constantly told it’s not really our kind of thing. And too many of us believe it.
So go see Hidden Figures. Take your kids. Take your students. Take your friends. And tell your students, your children, and everyone you meet, that the colour of their skin and the arrangement of their chromosomes tells them ABSOLUTELY NOTHING about who they are and what they can become.
The sky isn’t the limit anymore. We’re aiming for the stars.
Now that it’s school holidays I finally have time to think. Part 1 of my “Demystifying the dig tech curriculum” post went a little viral – as much as a post on a niche Computer Science Education blog can – and there were a lot of requests for Part 2, so here it is. Once again, all constructive feedback gratefully received!
It’s probably best to read Part 1 first, as I’m starting where I left off. This is the beginning of the next strand: Processes and production skills.
Collecting, managing and analysing data
Collect, explore and sort data, and use digital systems to present the data creatively (ACTDIP003)
Collect, access and present different types of data using simple software to create information and solve problems (ACTDIP009)
Acquire, store and validate different types of data, and use a range of software to interpret and visualise data to create information (ACTDIP016)
“F-2 Collect, explore and sort data, and use digital systems to present the data creatively”
This is another simple one that F-2 teachers are already doing: collect some data, like the hair colour of everyone in the class, or everyone’s favourite fruit. Then you can either graph it (simple pie charts, bar graphs, etc.) or draw infographic-style pictures to represent it (although you would, of course, include better labelling than this lazy Computer Scientist has! :)
This one explicitly says to use digital systems, so you would probably use a spreadsheet package like Excel or Google Sheets.
“3-4: Collect, access and present different types of data using simple software to create information and solve problems.”
Another thing you’re already doing: collect some data, graph it using a spreadsheet program or some online tool, and try to answer questions like: What is the most popular fruit in class JB? Or better yet, incorporate it with some environmental sustainability and graph which class collected the most rubbish on Clean Up Australia day. And that’s a good one because you can graph it by weight or by piece count, or by rubbish type: wrappers vs bottles vs cans, for example. And you can then try to graph which class has made the most improvement to the environment – which type of rubbish is worst, and who has collected the most of it? Ok, I might be getting carried away here, but you can see how this goes. These types of activities always work best when they’re tied in with other things the kids are already doing, so that they can see the point.
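For teachers (or older students) who’d like to peek under the hood of what the spreadsheet is doing, here is a hypothetical sketch in Python – the fruit survey data is made up – that tallies discrete values and prints a rough text “bar chart”:

```python
from collections import Counter

# Hypothetical survey results: each entry is one student's favourite fruit.
favourite_fruit = ["strawberry", "apple", "strawberry", "orange",
                   "apple", "strawberry", "banana", "strawberry"]

# Counter does the tallying a spreadsheet pivot would.
counts = Counter(favourite_fruit)

# Print a simple text "bar chart" of the results, most popular first.
for fruit, count in counts.most_common():
    print(f"{fruit:10} {'#' * count} ({count})")
```

The same few lines answer the “most popular fruit in class JB” question directly: the first row printed is the winner.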
“5-6: Acquire, store and validate different types of data, and use a range of software to interpret and visualise data to create information”
This is really the same as the examples above, but ramped up a little. The data above was all discrete, simple values that you can aggregate: 5 blondes, 4 brunettes, 1 redhead. Or 6 prefer strawberries, 3 prefer apples, and 1 prefers oranges. For 5-6 you could start collecting continuous data, like heights, which is a bit more complicated. If you graphed the heights of everyone in the class as a bar chart you would likely wind up with 25 different entries on the graph, so you need to aggregate them somehow, like 100-110cm, 111-120, etc. This is a frequency graph, or histogram. You can still use a spreadsheet to graph this, or you can make it an art and communication lesson by getting them to draw pictures – in drawing software like Paint on a PC, or drawing or whiteboard apps on tablets – that are creative but accurate, for example using relative size or number of items to represent the different values.
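The binning step is the only new idea here, and it can be sketched in a few lines of Python (the heights are invented, and `bin_heights` is a hypothetical helper, not anything the curriculum names):

```python
# Hypothetical heights (in cm) for a small class of students.
heights = [104, 108, 112, 115, 118, 121, 103, 117, 125, 119,
           111, 106, 122, 114, 109]

def bin_heights(heights, bin_width=10, start=100):
    """Aggregate continuous values into ranges, e.g. 100-109, 110-119..."""
    bins = {}
    for h in heights:
        low = start + ((h - start) // bin_width) * bin_width
        label = f"{low}-{low + bin_width - 1}"
        bins[label] = bins.get(label, 0) + 1
    return bins

# A rough text histogram: one '#' per student in each range.
for label, count in sorted(bin_heights(heights).items()):
    print(f"{label}: {'#' * count}")
```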
Creating Digital Solutions by:
Investigating and defining
Follow, describe and represent a sequence of steps and decisions (algorithms) needed to solve simple problems (ACTDIP004)
Define simple problems, and describe and follow a sequence of steps and decisions (algorithms) needed to solve them (ACTDIP010)
Define problems in terms of data and functional requirements drawing on previously solved problems (ACTDIP017)
“F-2: Follow, describe and represent a sequence of steps and decisions (algorithms) needed to solve simple problems”
This is our first encounter with the dreaded A word: Algorithms. You’ve actually already met algorithms in the form of recipes, LEGO instructions (which are particularly interesting because they are largely visual, not text based), and any set of instructions you’ve ever followed. Kids will be familiar with these too, so you have a starting point. Next time you do a cooking activity, talk about the steps you follow, the decisions you have to make – like: is the chocolate melted yet? If so move on to the next step, if not, heat it some more. Introduce the word “algorithm” for the recipe, and boom! – This line in the curriculum is nailed.
“3-4: Define simple problems, and describe and follow a sequence of steps and decisions (algorithms) needed to solve them”
Here the kids need to define a problem, like “how can someone find my desk from the door of the room?” and then specify the set of steps that someone would need to solve that problem. Then they can follow each other’s steps and see if they work. This is a fun one because you tell the kids to follow the instructions as dumbly as possible, to find problems with the algorithm. For example, if the instructions say “Walk towards the back of the room until you get to my desk” they will walk into any furniture that happens to be between them and the desk if they are really following the instructions blindly, the way a robot would. Similarly, if the instructions say “walk through the door” and the door is closed, there is a step missing that could result in a very sore nose!
“5-6: Define problems in terms of data and functional requirements drawing on previously solved problems”
To be honest, this one seems a bit strange in this context. To me it makes sense in terms of software design, but for Grade 5/6 kids it seems a bit of a reach. For kids programming in Scratch, Snap, or other visual programming languages, you can only really skim the surface of an instruction like this.
So I take it to mean that you need to lay out the problem clearly before you start to write a program to solve it. Spell out what you are going to do, and what data you are going to do it with – for example if a student is writing a program to draw a spiral, they might ask for user input for the colour and size of the spiral. This is their data. And functional requirements simply mean defining exactly what the program is going to do – in the above example, the functional requirements might be “draw a spiral in a colour specified by the user, with as many layers as the user asks for.”
The other part of this one is to relate the new problem to previous problems. How is this problem similar to something you have solved before? eg drawing a spiral is like drawing a circle, but your line has to move outwards each time rather than finishing where it started.
The next row in the table has to come as something of a relief to lower primary teachers. 🙂 The good news is that it’s actually not hard for 5-6 teachers either.
Generating and designing
Design a user interface for a digital system (ACTDIP018)
Design, modify and follow simple algorithms involving sequences of steps, branching, and iteration (repetition) (ACTDIP019)
5-6: Design a user interface for a digital system
This step is not nearly as daunting as it sounds. In a Scratch program it simply means laying out how a user will use the program. So if it’s a game where you have to click on, say, balloons, in order to pop them, the user interface will be the start button, maybe a speed selector, and using the mouse to click on the balloons. Any interaction the program makes possible – any way the user can do something to change something in the program – is part of the user interface. So anything the user will be able to do – choose a pen colour, change the speed or size of an object, click on something to make it react – is part of the user interface and can be described before the program is actually written.
These goals are all aimed at teaching planning, as well as tech stuff. It’s really easy, even for experts, to just sit down and start doing stuff before they think about what it is they are trying to do. So here the curriculum is saying “let’s think about what we’re going to do, and even write it down, before we actually try to do it.”
5-6: Design, modify and follow simple algorithms involving sequences of steps, branching, and iteration (repetition)
Here we’re talking about algorithms, or recipes, not code. So this one is about being able to lay out the steps involved in solving a problem, using iteration (repeated steps – for example “keep beating until the egg whites are stiff”, “keep walking until you reach the stairs”, or “while there’s still LEGO on the floor, pick up a piece and put it in the box”) and branching (testing to see whether something is true – for example “if it’s dark, turn on the light”, or “if the cake is a chocolate cake, add cocoa, otherwise add vanilla essence”).
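Those everyday examples translate almost word for word into code. A hypothetical Python sketch of the LEGO and cake examples (the lists and names are invented for illustration):

```python
# Iteration: "while there's still LEGO on the floor,
# pick up a piece and put it in the box".
floor = ["brick", "brick", "plate", "minifig"]
box = []
while floor:                # keep going until the floor is empty
    box.append(floor.pop())

# Branching: "if the cake is a chocolate cake, add cocoa,
# otherwise add vanilla essence".
def flavouring(cake):
    if cake == "chocolate":
        return "cocoa"
    else:
        return "vanilla essence"
```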
Producing and implementing
Implement simple digital solutions as visual programs with algorithms involving branching (decisions) and user input (ACTDIP011)
Implement digital solutions as simple visual programs involving branching, iteration (repetition), and user input (ACTDIP020)
3-4: Implement simple digital solutions as visual programs with algorithms involving branching (decisions) and user input
Write a program in a visual programming language such as Scratch, Snap, or Blockly that has decision points (if statements, also referred to in the curriculum as “branching”), and user input. An example might be a program that asks the user if they want to draw a square. If they answer yes then it draws a square, otherwise it says “Bye Bye then!”
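In class this would be Scratch or Snap blocks, but the same logic can be sketched in Python (a hypothetical example, with the user’s answer passed in as a parameter; note that at 3-4 the four sides are spelled out one by one, since iteration hasn’t been introduced yet):

```python
def square_or_goodbye(answer):
    # Branching (an "if") on the user's input.
    if answer.lower() == "yes":
        # Spell out every line and turn of the square, one by one.
        return ["forward 100", "right 90",
                "forward 100", "right 90",
                "forward 100", "right 90",
                "forward 100", "right 90"]
    else:
        return ["say 'Bye Bye then!'"]
```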
5-6: Implement digital solutions as simple visual programs involving branching, iteration (repetition), and user input
This is the same as the 3-4 version except now we’re including iteration (loops, or repetition). So the simple square program above can be rewritten to make the repeated bits simpler and just say “do the line and the turn 4 times” instead of spelling out 4 sets of lines and turns.
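A hypothetical text sketch of the loop version (in class this would be Scratch’s “repeat 4” block rather than Python):

```python
def square_or_goodbye(answer):
    # Branching on the user's input, as before...
    if answer.lower() == "yes":
        steps = []
        for _ in range(4):   # ...but iteration: "do the line and the turn 4 times"
            steps.extend(["forward 100", "right 90"])
        return steps
    return ["say 'Bye Bye then!'"]
```

The loop produces exactly the same eight steps as the spelled-out version, which is a nice point to make with the kids: iteration changes how you *say* it, not what happens.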
Once again this has become a little long and I’m not finished yet! So in the best entertainment traditions I am going to end on a cliffhanger (ok, maybe it’s less of a cliff and more of a small step, but work with me here!) and leave you keen for part 3 (I hope!). The good news is that Part 3 will be much shorter. If you find this useful, please share it widely.
These posts are also going to form the basis of a series of Professional Development workshops run by former students of mine (only for Victoria at this stage, but there’s hope for online workshops in the future), so if you’re interested in those, and/or coding workshops that are directly tied to your existing curriculum, please fill out the poll below (purely to gauge interest levels, no names or contact details will be recorded) and watch this space!
Just went to an awesome talk by Nvidia’s Will Ramey called Deep Learning Demystified. These are my notes – raw, stream of consciousness, and unedited. But interesting! (I hope)
Algorithms that learn have in the past relied on experts writing programs with classifiers to, e.g., detect edges or look for particular features. First the programmer has to conceive of the classifiers, then connect them together in some kind of logic tree and characterise the features sought – e.g. wheels are round – so you get a system for distinguishing between cars, trucks and buses, but it doesn’t cope with changes such as a rainy day or a foggy day. Then you have to manually create new classifiers and work out how to make them work. It doesn’t scale or translate to new problems.
New approach of deep learning uses neural networks to learn from the examples that you give it. You don’t have to manually create classifiers, you can use examples in the data itself so that the network creates the classifiers. The major advantage is that it’s really easy to extend. If you train it on cars in the daytime and then want to use it on a foggy day you just provide more foggy pictures and retrain the network. You don’t have to create new hypotheses for what might work, you leave that to the network.
With GPU accelerators the process of training and retraining the networks as you expand the amount of data is relatively fast and doesn’t require a lot of manual effort.
Stunningly effective for internet services, medicine, media, security, autonomous machines etc
Deep learning is also being applied as a tool to understand and learn from massive amounts of data.
NASA uses deep learning to better understand the content of satellite images and provide feedback to farmers and scientists studying the ecosystem about how to manage their work and increase crop productivity etc.
How do you do deep learning?
Start with an untrained neural network. Deep neural networks have a large number of layers, but might not be well connected. There are topologies of NNs that are known to be good at image classification, or object recognition, or signals analysis, etc.
In its untrained state the network is just a bunch of math functions, plus weights that determine how the outputs of those functions are communicated to the next level. It can’t do anything. To train a NN to distinguish between dogs and cats we assemble a training set of images. We use a deep learning framework to feed the images through the NN one at a time and check the output. If we’re only trying to distinguish between cats and dogs there will be two output nodes with confidence levels – how confident is the NN that it might be a dog, and how confident that it might be a cat? The framework already knows the answer and will evaluate whether the NN infers the correct answer. If so, it rewards the neural net by strengthening the weights of those nodes that contributed most to the correct answer, and reducing the weights of the nodes that didn’t contribute. When the NN infers the incorrect answer it decreases the weights that contributed most, and so on. You keep showing the same collection of images over and over, continuing the training, and the NN gets very good at it. It’s almost a Skinnerian psychology experiment, but without electric shocks. Showing the dataset once is called an “epoch”. It takes many, many epochs to train the network properly, and that’s the job of the deep learning framework.
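The notes above describe the heart of the training loop. A deliberately tiny, made-up sketch of that reward/punish idea – a single layer of weights and two invented “features” per animal, nothing like a real deep learning framework – might look like this:

```python
# Toy "supervised learning" loop in the spirit of the talk:
# strengthen weights after correct answers, weaken them after wrong ones.
# The data is entirely made up: two features per animal
# (say, ear pointiness and snout length); label 1 = dog, 0 = cat.
training_set = [
    ((0.2, 0.9), 1), ((0.1, 0.8), 1), ((0.3, 0.7), 1),  # dogs
    ((0.9, 0.2), 0), ((0.8, 0.1), 0), ((0.7, 0.3), 0),  # cats
]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1

def infer(features):
    """Weighted sum of features -> dog (1) or cat (0)."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

# One pass over the data set is an "epoch"; it takes many of them.
for epoch in range(50):
    for features, label in training_set:
        error = label - infer(features)   # 0 if correct, +/-1 if wrong
        # Adjust each weight in proportion to how much it contributed.
        for i, x in enumerate(features):
            weights[i] += learning_rate * error * x
        bias += learning_rate * error

correct = sum(infer(f) == label for f, label in training_set)
print(f"{correct}/{len(training_set)} classified correctly")
```

Real frameworks do this with millions of weights, backpropagation through many layers, and images rather than two numbers, but the reward/punish loop is recognisably the same shape.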
This is supervised learning.
Now you have a trained network that can distinguish between dogs and cats. But nothing else. If you were to show it a raccoon it would probably give you a low probability value for both dogs and cats. The trained network still has all the flexibility you need for it to learn. But in most cases once you deploy it it doesn’t need to be able to learn anymore.
A colleague at Nvidia trained a neural net to recognise cats from a USB camera and set up a system to turn on the sprinkler system when there’s a cat on his lawn, to scare them away.
In some cases in the trained network there may be nodes that don’t contribute to the answer. The framework can pay attention to that and automatically remove nodes that don’t make a difference either way, or sometimes fuse layers to save time. And now you can integrate your optimised model into your application.
Deep Learning algorithms are evolving very rapidly, and it is challenging to keep up with them. Training NNs is incredibly computationally expensive, and you don’t necessarily know what the correct topology is at the start. So you might need to tweak it many times and train one, learn from that, and then explore different possibilities, which increases the computational needs and makes it more expensive.
Once again these are rough notes, unedited, taken during an invited talk at SC16. Some fascinating ideas in here, I hope the notes are useful to you.
Maria Klawe, Harvey Mudd College Diversity and Inclusion in Supercomputing
Why diversity and inclusion matters:
increasing supply to meet demand
access to great jobs
better solutions to the world’s problems
We can’t meet that demand without enticing more people into these fields. Big data and data analytics are changing every aspect of society today. Healthcare, climate change, clean energy. Big data, supercomputing, machine learning are going to be crucial.
If you want to make a difference to the world, if you want flexibility, travel, these are really great jobs.
When you have a diverse team working on a problem you get better solutions. It can be diversity because of country, race, religion, gender, sexual orientation – greater diversity means better solutions.
Harvey Mudd is a small college focusing on science, engineering and liberal arts. Every student believes it is their responsibility to help everyone else succeed. They emphasise collaboration and communication from day 1.
30-35% of their graduates go on to get a PhD. Second only to Caltech.
In 1996, faculty and students were around 20% female. In 2016, women are close to 40% of faculty and 50% of students.
Racial diversity is also increasing.
It’s not enough to get them in the door. You have to ensure that the experience once they get there is such that they can thrive. That the environment you put them in is one where they can do well.
Females weren’t having as good an experience as the male students, because, for example, all of the speakers were white males. Usually older white males, because the faculty were older white males and inviting their friends.
“They’ll never let me tell them how to run the machine, because they say ‘you’re a girl, you couldn’t possibly know how to do this!'”
These are small things, but when it happens over and over again there is a sense of lack of belonging.
Faculty racial diversity is harder to increase than student diversity, because of low turnover. So we are training our faculty search committees on how to find a diverse pool of applicants, interview them so they have a good experience, etc. But the moment we take our eye off the ball we hire all white males.
Hypothesis: If you make the Supercomputing environment supportive and engaging for all, build confidence and community among underrepresented groups, and demystify the path to success, a highly diverse population will come, thrive, and succeed.
Minorities, like anyone, love the chance to work in an environment where the work you do makes a difference.
Make sure that incoming students feel that they belong, regardless of prior experience. We have students arriving with very different levels of prior experience.
Passionate students with prior experience in CS scare other students off.
So separate incoming students by prior experience. But you will still have some students who just know more. So I meet with the scary students who answer all the questions separately, and I give them the chance to connect separately, so that the other students have a chance to contribute in class.
If you are a female mathematician or computer scientist you go through life with people having low expectations of your technical knowledge. Which makes us more reluctant to ask for help because it seems like we’re not as competent as we should be. This is also true for HM students and many adults at all kinds of levels. So we work hard to set the expectation that asking for help and hard work are far more important than any innate ability.
In our intro classes HM have a black section and a gold section. The gold section is for the kids with no prior experience. The black section is for those who’ve had a year of CS in high school. And there’s a 42 section for the ones who know heaps.
How do you get the gold students to the same level as the kids who’ve already taken a lot of CS? You’ve got to add some extra material to the black course, and it has to be interesting and fascinating, but it has to have nothing to do with the follow-on courses, so that they don’t enter the next courses privileged over the gold students.
If you tell someone they’re going to be a great CS student, it doesn’t make it happen if they come from a background where that doesn’t seem likely. You tell them “just take the next course”. And then they start to find that they’re good at it. And you keep on doing that. You make each course engaging and supportive, and make it that they can work hard and be successful. It’s also very important to give early internship and/or research experience. They need to know that what they are working on will make a difference in the world. Then they are more likely to stay in the course.
HM typically takes 40-60 students to the Grace Hopper Women in Computing conference. They take students to conferences with diverse communities to foster that sense of belonging.
They build community – clubs – and ensure access to role models – faculty, speakers, mentors from industry.
Suppose you have a very rigorous, challenging, graduate course. And suppose your students talk about it as “that’s the course where you find out if you really belong in this discipline or not. It’s a really tough course.”
That’s the worst possible way to frame a course. Because you weed out the people with low confidence.
Instead you frame it as rigorous, exciting, and interesting, and you make it clear that if you work hard and ask for help you can do well. Because anyone can do that!
You don’t need to demystify the path to success just for minorities. You need to do it for everyone. To ensure that it’s not just the dominant group who get all the tips and advantages and who therefore succeed.
Best ways to attract diverse graduate students:
Recruit diverse undergraduates for summer internships (research and industry) (start early!)
Engage diverse faculty in recruiting to PhD programs.
Recruit at institutions with diverse students.
Include D&I reports on grant applications. Funding agencies should start asking for info on the diversity of the student body and what the institution is doing to increase it. The single easiest way to get institutions to change is to tie funding to progress. So we need to do this for research funding as well.
Include diverse speakers at every SC conference. Members of program committees. Videos that you make.
(Author aside – on a personal note I would say ban your show floor companies from using women in high heels and tight clothing to interact with the crowds!)
Having a really positive experience early on makes them much more likely to do a PhD.
Numerous studies have shown that we are all biased. In one study, biology faculty – departments with an even gender split – were given two sets of resumes, identical except for the first name: Jonathan vs Jennifer. Both male and female faculty rated Jonathan higher than Jennifer, and were willing to pay Jonathan more.
Faculty members at MIT trying to recruit would call faculty at other institutions to ask for recommendations of PhD students, and both male and female faculty would come up with white males. But if they were then prompted for women and people of colour they would come up with several extra names. It’s not evil. It’s not deliberate. It’s completely unconscious. It’s just part of growing up in a culture where, e.g., all doctors are male, so you tend to think that doctors are male.
So you don’t just post an ad someplace, you actually reach out and ask people, and you ask explicitly for diversity.
It’s illegal in North America to ask about partners, marriage, and plans to have children. You need to make very sure that your recruiters know they can’t ask those questions.
“It gets really tiring to be the female who’s arguing for women. If you’re black it gets very tiring to be the black person. But this is the responsibility for everyone, not just the members of underrepresented groups.”
One of the big obstacles to diversity in computing is that people don’t see its relevance to them. They can’t picture themselves as part of the field. They don’t really know what the field looks like, but they have vague images of spotty youths in darkened basements with endless supplies of pizza and coke.
That could not be further from my experience of computing, and of computing professionals. Everyone has a different story, and well over half of the people I talk to never actually intended to go into computing. In fact I was one of those people. I confessed at a school assembly recently that I actually wanted to do medicine, but I didn’t get in. I played with computers a bit in school and thought they were fun, but I never saw myself as a computer person.
I took Computer Science as a fill in subject in first year, and was rather surprised to find myself doing all CS by third year. I was even more startled to find myself in honours, and the PhD came as a complete shock. I certainly never intended to become an academic, and teaching was not on the agenda at all. You may have noticed I suck at predicting my career path.
So my path into CS was not normal, but it turns out that not normal is quite normal where CS careers are concerned. No two stories are alike.
Among the many mind blowing things I did today, I was slightly startled to find myself participating in a user experience test. But conferences are full of surprises and unexpected connections. Having a background in usability myself, I couldn’t say no to the opportunity to pick holes in someone else’s software.
User experience tests are interesting. You get given a task – not a set of instructions, but a goal to complete – and set free on the system to see how you go about solving it. Is it easy or hard? Which features of the system help you, and which ones get in your way?
The User Experience guy who conducted the test, Craig, could not possibly have been nicer, and he went to great lengths to explain that it was the website on trial here, not me. “You can’t do anything wrong!” he told me. I was relaxed, and comfortable, and interested in the site. It looked like being a fun experience.
But an interesting thing happened. Despite a constant flow of reassurance and encouragement from Craig, I began to get stressed. There were… let me be tactful… some issues with the website. I couldn’t complete some of the tasks. If I’d been hooked up to any kind of physiological monitor you’d have seen my heart rate skyrocket. My breathing became rapid. I began to sweat.
Why? Because I couldn’t complete the tasks. Tasks that were a test of THE WEBSITE. Not of me. But a little voice inside my head started to say “I’m going to look stupid because I can’t figure this out.”
“Someone smarter would know that bit of jargon.”
“It’s probably staring me right in the face, but I just can’t see it. I’m so dumb.”
Over and over Craig told me how helpful this was, how I was doing great, and how it was so useful to see the issues I had with it. He told me that they could never have figured out the problems without me. He was lovely. Honestly, you could not pick a nicer, less scary person to conduct the test. But because I was having a hard time using a website that I knew there were issues with, I was judging myself.
That’s why usability matters.
Because when a piece of software is hard to use, crashes, or does something unexpected, even people with a PhD in usability blame themselves. People work with complex systems all the time – laptops, tablets, sound systems, even televisions. And in subtle (and not so subtle) ways those systems tell us that we are dumb. That we are too stupid to figure them out. That we make mistakes all the time.
As Katharine Frase pointed out, we try to learn to use fundamentally unusable systems, when those systems should actually be learning to work with us.
So next time you struggle with a piece of software, repeat after me: I’m not hopeless with computers. Computers are hopeless with me.
I’m going to focus here on ideas that will, I hope, inspire people to look more into HPC and computational science. I won’t cover the awards, although I congratulate all recipients and recognise their contribution to the field.
Once again I’m live blogging, unedited, unvarnished, and limited by my typing speed, so please forgive any errors and confusing bits. Ask questions, talk about the issues, engage with me! I’d love to chat.
Notes from the SC16 keynote
Keynote: Dr. Katharine Frase – Cognitive Computing: How Can We Accelerate Human Decision Making, Creativity and Innovation Using Techniques from Watson and Beyond?
Introduction: How do we make sense of the exabytes of information that have accumulated? HPC is helping us to understand ourselves, our world, and even our universe. And the future will bring even more exciting discoveries.
John West from TACC. HPC community faces two key challenges – first we need to grow the workforce to meet the growing demand for HPC. 200,000 jobs in computing go unfilled each year (in the USA). HPC is just a portion of this, but recruiting HPC professionals today is already hard and going to get worse. We need more minorities, we need to cast the net wide and deep, bring in every voice, every budding expert, and every moonshot idea. We need diverse backgrounds to bring diverse ideas to the table if we want to be successful in meeting our challenges.
22% of the SC16 student class are female. Women and minorities often don’t choose computing as a profession due to the perception that computing doesn’t have a direct impact on society. This needs to change!
We could send about 450,000,000 snapchats in the next second over SCinet. Amazing bandwidth!
Dr Katharine Frase, IBM.
Think about how you got here. Most of us relied on our phones to wake us up, remind us what we’re doing today, update us on what happened overnight, track our steps, navigate with GPS, and give us traffic information. Imagine that you were here at SC06. Almost everything I just said was not true. Almost without realising it, our lives have been transformed. By the use of data, algorithms, and, to some degree, systems learning our preferences.
We need to move beyond answers, however adaptive, flexible and realtime. How do we move towards co-creation? Co-hypothesizing? How do we move towards systems that can help us ask the right questions?
The challenge of big data is not always that it’s big. People in this room have been extracting enormous insights from data for generations. But it’s not the structured data that we’re drowning in; it’s the unstructured stuff. It’s the velocity of the data. How do we separate the signal from the noise? And it’s the veracity. The truthfulness of the data.
A project with the city of Dublin to figure out how we could get better info from the GPS systems on their buses. Everything requires transfers in Dublin, you can’t get one bus. You have to keep transferring. So every time a bus is late you jeopardise your transfer. Couldn’t we use the GPS system to track each bus and find out what the best bus route is for today, given how buses are actually moving, rather than how they are scheduled to travel?
We found there were a number of buses that thought they were at the bottom of the Liffey river or in the basement of the Parliament building. It’s noisy data. So how do we extract insight without spending all our time cleaning up the data? How do we rely on the wisdom of crowds?
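To make the "buses at the bottom of the Liffey" problem concrete, here is a minimal sketch of one common first pass at noisy GPS data: discard fixes that fall outside a plausible service area. The bounding box, field names, and sample fixes are my own illustrative assumptions, not the actual Dublin pipeline.

```python
# Hypothetical sketch: filter implausible GPS fixes from noisy bus data.
# Bounds and field names are illustrative assumptions, not the real system.

# A rough plausibility window around Dublin (lat_min, lat_max, lon_min, lon_max).
DUBLIN_BOUNDS = (53.20, 53.45, -6.45, -6.05)

def plausible(fix, bounds=DUBLIN_BOUNDS):
    """Keep a GPS fix only if it falls inside the expected service area."""
    lat_min, lat_max, lon_min, lon_max = bounds
    return lat_min <= fix["lat"] <= lat_max and lon_min <= fix["lon"] <= lon_max

fixes = [
    {"bus": 46, "lat": 53.349, "lon": -6.260},  # near O'Connell Bridge: plausible
    {"bus": 46, "lat": 0.0, "lon": 0.0},        # GPS dropout: "bottom of the river"
]
clean = [f for f in fixes if plausible(f)]
```

A real system would go further (snapping fixes to known routes, smoothing over time), but even this crude filter shows why data cleaning eats so much of the effort.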
We focus on the architecture and the technology, but we need to think in another way – how do humans and systems interact? The first era of computing was counting things. The second is programmable systems. We get the same reliable answer to the same question regardless of how you ask it. We decided what those questions were going to be according to the structure of the system. We learnt to speak the way a computer thinks. The computer shaped our capabilities. We’re still, even with more usable languages like Python, learning to speak the way computers think.
So what do we mean by a cognitive system? What’s different?
The first is that a cognitive system understands language the way people use it. Natural language, machine learning, etc.
The second is that the system learns rather than being programmed. So the system gets smarter every day (with the right feedback) which means we can get a different answer depending on context – on where I am, or when it is. It responds to the most recent information. Maybe what I thought was true yesterday isn’t true anymore. Or I’ve given feedback to the system so that it gives you a subtly better answer next time based on your experience and feedback of last time.
The system can understand imagery, language and unstructured data as humans do. We have always built systems to expand human capacity, but computers have constrained our capacity as well.
I’m amazed now how anybody lived not knowing for 3 weeks at a time what the reaction was to that letter I sent. Can you imagine not being able to text your kids and find out where they are?
The challenge we face now is really one of complexity. If we’re going to have systems that help us think and decide, we need a better understanding of how people actually think and decide. Daniel Kahneman’s “Thinking, Fast and Slow” describes the two sides of our brains. The fast side gives us reactions without us even knowing how we think about it, and the slow side thinks carefully about things. The slow side usually wins, but not right away.
You’re trying to think of the name of a guy you see in the grocery store that you know from work. The name Mike pops into your head, and you know it’s wrong, but you can’t get past that name to the real one. That’s the fast side blocking the slow side. Later the slow side pops up the name, but too late.
Humans have a tendency to prefer information that reinforces what we think we already know or what we expect. It affects what we retain from what we read and how we react when somebody argues with us. And we structure our experiments to find the results that we expect. We also overestimate the chance that something will happen that has happened before, and underestimate the chances of something happening that has never happened before (Trump!).
Cognitive systems can help us with this. It can tackle our unconscious cognitive bias, and take into account that the last coin flip does not influence the next one, even though a lot of people think it does.
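The coin-flip point is easy to check for yourself. This little simulation (my own illustration, not from the talk) shows that the probability of heads is about the same whether the previous flip was heads or tails:

```python
# Illustrative sketch: independent coin flips carry no memory.
import random

random.seed(42)
flips = [random.choice("HT") for _ in range(100_000)]

# Outcomes that follow a head vs. outcomes that follow a tail.
after_h = [b for a, b in zip(flips, flips[1:]) if a == "H"]
after_t = [b for a, b in zip(flips, flips[1:]) if a == "T"]

p_heads_after_h = after_h.count("H") / len(after_h)
p_heads_after_t = after_t.count("H") / len(after_t)
# Both hover around 0.5: the previous flip tells you nothing about the next.
```

The gambler's fallacy is exactly the belief that these two numbers should differ.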
A cardiologist felt he was at his best when he was in the hospital and there was a resident in the ward who could summarise the patient’s situation. History, overnight events, and test results. Then the cardiologist felt he could best use his experience in support of that patient. Katharine had the opportunity to have a copy of the documentation for each patient, and she participated in the panicked 5 minutes of reading through somebody’s chart before the next person walks through. We saw 5 patients that were very similar, and when the 6th patient walked in the cardiologist replayed the same script from the first 5, even though there were significant differences from the first 5.
Cognitive systems can be a statistically fresh set of eyes and help us become aware of, and deal with, our unconscious biases.
Machine learning: image recognition and speech recognition. There is error in how humans recognise objects – sometimes you see something and think it’s actually something else. Speech is harder. The error rate is influenced by the size of the neural network you can throw at the problem.
We now have massive datasets for training a system for speech, images, etc. We are trying to build systems to pass the radiology exam – a difficult problem even for humans. Can we train machine learning systems with huge datasets and have them do better?
You can’t just hand them unstructured datasets and have them come up with the right answer. They’re not omniscient. But we have huge increases in computational power and huge drop in cost, which makes these things possible.
The sheer volume of expertise being thrown at AI and cognitive computing is only increasing.
Why does this matter?
Why do we need to move forward in these new relationships with systems?
Katharine used to be involved with Watson when it played Jeopardy. How do you get from long-tail natural language on a game show, with its odd English, to real understanding?
We went from there and tried to tackle oncology. We plugged in medical literature and started training the system to recognise the vocabulary of health. What is a symptom? A diagnosis? A treatment? We needed to add new features. In health, timing matters – if you get the fever before you get the rash, it makes a difference. The system needed to learn what it’s reading and how humans interact with that.
We need to get expertise out to the broader community. Most people with cancer will not see an expert in their particular variation. They’ll see whoever is local and hope they get the right treatment. So we need to get that info and expertise out to improve the quality of local care.
U of Wisconsin is doing clinical trial matching. There are nearly 14,000,000 Americans with cancer. Less than 5% are involved in clinical trials. To match a patient to a trial there can be up to 46 characteristics you need to build that cohort. Gender, ethnicity, type of tumour, genetics, etc. You need to find that patient and reach the physician to suggest the treatment. So how can we use cognitive computing to get the word out to physicians that they might want to suggest this treatment?
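At its core, building a trial cohort is a matching problem: check each patient against the trial's eligibility criteria. A hedged sketch, where the field names and criteria are invented for illustration (real trials can require dozens of characteristics, as noted above):

```python
# Illustrative sketch of cohort matching; fields and criteria are hypothetical.

def matches(patient, criteria):
    """True if the patient satisfies every eligibility criterion."""
    return all(patient.get(field) == value for field, value in criteria.items())

trial_criteria = {"tumour_type": "melanoma", "braf_v600e": True}

patients = [
    {"id": 1, "tumour_type": "melanoma", "braf_v600e": True},
    {"id": 2, "tumour_type": "melanoma", "braf_v600e": False},
    {"id": 3, "tumour_type": "lung", "braf_v600e": True},
]
cohort = [p for p in patients if matches(p, trial_criteria)]
```

The hard part in practice isn't this filter – it's extracting those dozens of characteristics from unstructured clinical notes, which is where the cognitive system earns its keep.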
Dept of Vet Affairs is trying to create a cognitive system to take the 10,000 vets with cancer and increase capacity 30x to get the right treatment to each individual.
Our systems have to fit into the workflow of those people who are doing the real work. We know the syndrome of an IT system that only makes sense to the computer scientist. The bigger problem is that it doesn’t fit into the workflow. The physician has to turn his back on the patient in order to do something with the system. So however capable, the system sits on the shelf.
The workflow of the human. The practitioner. How do we get the insights of everything we know we can do surfaced to the human in a way that is timely and fits into the way they want to do their jobs?
Picture this. A young girl has a persistent cough. It’s cancer. She gets referred to an oncologist, who sees her medical record. The cognitive system can tell the oncologist how it compares to the 15,000,000 cases the system has in its records. It prompts the oncologist to ask particular questions: like her own preferences about her care. For example, it is important to her to keep her hair, or, because she has small children, she needs to come in less frequently. The system can take the specifics and recommend treatment options with likelihood of clinical success. The system reminds the doctor to ask her if anything has changed in the last two weeks. He would probably have asked her anyway, but the system reminds him. So a new piece of information comes in that can change the treatment options significantly. The options come back ranked according to clinical success, and according to how well they align with her preferences. The oncologist and patient can then make really informed choices.
If you like and trust your doctor it significantly improves your chances of a positive outcome, and you volunteer extra information. The system can help us in these dialog-driven and intangible ways.
Let’s talk about congestive heart failure. Difficult to diagnose in advance; all you can do is treat it after it shows up, to mitigate the symptoms. The law has changed so that hospitals are no longer fully compensated if they readmit a patient for the same problem too soon. So a large hospital asked for help predicting which patients they would see again soon. It turns out the data is there and it is possible to predict it. And it has nothing to do with clinical markers. The best predictor? Did they have dementia? Addictive behaviour? Were they living alone? These are all social effects that predict whether they will be back or not. It’s not really a clinical answer. It’s obvious when you look at it, but not necessarily obvious when you are looking from a clinical perspective.
86,000,000 people in the USA are prediabetic. The rates of diabetes are exploding. We have 56 years of info about diabetes patients. It’s not a perfect dataset, but it contains treatment as well as lifestyle stuff. So we want to create a cognitive database of this information so that we can start producing advice about things that can be done to delay the onset of diabetes for those that are prediabetic, or delay the consequences of diabetes such as blindness and gangrene.
Yes clinical data, genomics, exogenous data like fitbits are important, but the social context and cultural context is important. We can’t make the same dietary recommendations to all cultures, for example.
As we start amassing more data and looking at it as a whole, can we start identifying new factors that allow us to intervene earlier?
Let’s look at finance. Most audits rely on the structured financial info of the firm. But the real keys are not in that data. Most audit firms can’t manage the size of the real data, so they just take a sample and hope it’s representative. So can you use a cognitive system to examine the full data of the firm and surface patterns to the auditors so that they can ask the hard questions and do the higher level thinking? We want the computers to do the routine things, the data intensive things, to free up the humans to do the higher level thinking.
What about World Peace? “I would have Watson help create World Peace. However, because this is unrealistic, I would have Watson improve education around the globe.” 6th grader.
The education challenge: we have unsatisfactory numbers for HS & college graduation rates, college debt, retraining our veterans and reintegrating them into society. In the developing world the problems are different. You can’t build enough schools or train enough teachers fast enough.
Education is very data poor. There’s no such thing as a clinical trial in education. Every trial is beset by small volumes of subjects and the fear that we didn’t control all the variables. Do we really know what we think we know about learning styles? about autism? A big data, data-driven approach could change this.
What could a system do to support a teacher in the classroom? It could grab all the info that’s already in the data systems of your school that you can’t get to, to give you insights into your students. How did they perform last year? Are they gifted? Are they troubled? The system can help you with the data we already have in schools that is not surfaced to the teacher. Teachers do their planning on the computer late at night. On the planning side the teacher can get up-to-the-minute info about her students in all their classes. If John’s grades are dropping everywhere, that’s different from him failing only in yours. You can get recommendations of curated content that the school has access to, tailored to individual students. She can chat with the other teachers who teach that student. She can get that content on her tablet and send messages forward to other teachers who are about to teach that student.
The ability to informally create a coordinated care feeling around that child is the single thing teachers find most valuable. Schools put in the effort to put care around the bottom and the top students. Those of us who have fabulous kids in the great unwashed middle, technology can help us surround those kids with personal care.
The more engaged a student is, the better they learn. They become learners rather than repeaters. They become explorers and discoverers. Sesame and IBM are developing platforms and products to address early childhood education. Not just to create smarter children but kinder children. Sesame tackles the tough issues, but until now it’s been in broadcast mode. It’s the same Elmo for every child. What if it weren’t? What if a parent could help design the way in which their child experiences it? My son is crazy about dinosaurs, can we do the alphabet content with dinosaurs? My daughter is crazy about horses, let’s put the stories in that context.
How do we engage children? 40 years ago TV was magic. Now kids try to interact with the TV. So how do we use technology to engage these youngest learners? There’s a vocabulary gap by the time the kids reach the age of 4 that is frighteningly predictive of the rest of their lives. How do we change that? How do we make it not be true?
Questions: Do doctors & teachers get better when they use cognitive systems?
How do you measure betterness? Most individual physicians don’t think they need the help. 🙂 But in the aggregate quality goes up. One of the big challenges for a teacher – you’ve always been a 4th grade teacher. You’ve been asked to teach 3rd grade. It’s a new standard. What should you be expecting your students to come in with? You don’t want other people to know that you have questions. You’re afraid it will undermine your standing in the school. But we can provide an anonymous place to get that info without feeling like you’re embarrassing yourself.
This ability to start using systems in support of what we do in the classroom is extraordinarily powerful. Education isn’t cancer research. The financial incentive to create these tools doesn’t exist. Fighting cancer is an industry with cash, education is not.
Schools are already spending money on tech. They’re not necessarily spending it in a way that helps the teacher. What is the threshold of affordability that starts getting to adoption, and then scales to help us drop costs? Every teacher and every school has particular things they wish the system could do. How do we start from the subset that is affordable, and then raise the bar? As in every industry, business process change is harder than we as technologists ever want to admit. So you have to start with the individuals that really want to change.
How do we deal with the lack of diversity in HPC and machine learning, and help the systems avoid the bias that comes with biased input data?
If the system gets biased data it’s going to give biased output. But the system can actually help us identify biases in the datasets. It can surface the problems like disproportionate gender representation, missing cultural/ethnic groups etc, that we might not otherwise be aware of.
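One simple way a system could surface this kind of problem is to compare each group's share of the training data against its share of a reference population. This is my own illustrative sketch, not IBM's method; the groups, reference shares, and threshold are all assumptions:

```python
# Illustrative sketch: flag groups under-represented in a dataset
# relative to a reference population. All values are hypothetical.
from collections import Counter

def underrepresented(samples, reference, threshold=0.5):
    """Return groups whose actual share is below threshold * expected share."""
    counts = Counter(samples)
    total = len(samples)
    flagged = []
    for group, expected_share in reference.items():
        actual_share = counts.get(group, 0) / total
        if actual_share < threshold * expected_share:
            flagged.append(group)
    return flagged

# Expect a roughly even split, but the dataset skews heavily male.
reference = {"female": 0.5, "male": 0.5}
data = ["male"] * 90 + ["female"] * 10
flags = underrepresented(data, reference)
```

Even a crude check like this makes the skew visible before it silently becomes biased output.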
Well that was awesome. So many different fields where cognitive computing can make a difference, so many ways HPC is changing the world. If these summaries are useful to you, please share, retweet, and leave me a message!