Notes on Diversity in HPC – Maria Klawe

Once again these are rough notes, unedited, taken during an invited talk at SC16. Some fascinating ideas in here, I hope the notes are useful to you.

Maria Klawe, Harvey Mudd College
Diversity and Inclusion in Supercomputing
Why diversity and inclusion matter:

  • increasing supply to meet demand
  • access to great jobs
  • better solutions to the world’s problems

We can’t meet that demand without enticing more people into these fields. Big data and data analytics are changing every aspect of society today. Healthcare, climate change, clean energy. Big data, supercomputing, machine learning are going to be crucial.
If you want to make a difference to the world, if you want flexibility, travel, these are really great jobs.
When you have a diverse team working on a problem you get better solutions. It can be diversity because of country, race, religion, gender, sexual orientation – greater diversity means better solutions.
Harvey Mudd is a small college focusing on science, engineering and liberal arts. Every student believes it is their responsibility to help everyone else succeed. They emphasise collaboration and communication from day 1.
30-35% of their graduates go on to get a PhD. Second only to Caltech.
In 1996 faculty and students were around 20% female. In 2016 women are close to 40% of faculty and 50% of students.
Racial diversity is also increasing.
It’s not enough to get them in the door. You have to ensure that the experience once they get there is such that they can thrive. That the environment you put them in is one where they can do well.
Females weren’t having as good an experience as the male students, because, for example, all of the speakers were white males. Usually older white males, because the faculty were older white males and inviting their friends.
“They’ll never let me tell them how to run the machine, because they say ‘you’re a girl, you couldn’t possibly know how to do this!’”
These are small things, but when it happens over and over again there is a sense of lack of belonging.
Faculty racial diversity is harder to increase than student diversity, because of low turnover. So we are training our faculty search committees on how to find a diverse pool of applicants, interview them so they have a good experience, etc. But the moment we take our eye off the ball we hire all white males.
Hypothesis: If you make the Supercomputing environment supportive and engaging for all, build confidence and community among underrepresented groups, and demystify the path to success, a highly diverse population will come, thrive, and succeed.
Minorities, like anyone, love the chance to work in an environment where the work you do makes a difference.
Make sure that incoming students feel that they belong, regardless of prior experience. We have students arriving with very different levels of prior experience.
Passionate students with prior experience in CS scare other students off.
So separate incoming students by prior experience. But you will still have some students who just know more. So I meet with the scary students who answer all the questions separately, and I give them the chance to connect separately, so that the other students have a chance to contribute in class.
If you are a female mathematician or computer scientist you go through life with people having low expectations of your technical knowledge. Which makes us more reluctant to ask for help because it seems like we’re not as competent as we should be. This is also true for HM students and many adults at all kinds of levels. So we work hard to set the expectation that asking for help and hard work are far more important than any innate ability.
In our intro classes HM have a black section and a gold section. The gold section is for the kids with no prior experience. The black section is for those who’ve had a year of CS in high school. And there’s a 42 section for the ones who know heaps.
How do you get the gold students to the same level as the kids who’ve already taken a lot of CS? You’ve got to add some extra material to the black course, and it has to be interesting and fascinating, but it has to have nothing to do with the follow-on courses, so that they don’t enter the next courses privileged over the gold students.
If you tell someone they’re going to be a great CS student, it doesn’t make it happen if they come from a background where that doesn’t seem likely. You tell them “just take the next course”. And then they start to find that they’re good at it. And you keep on doing that. You make each course engaging and supportive, and make it that they can work hard and be successful. It’s also very important to give early internship and/or research experience. They need to know that what they are working on will make a difference in the world. Then they are more likely to stay in the course.
HM typically takes 40-60 students to the Grace Hopper Women in Computing conference. They take students to conferences with diverse communities to foster that sense of belonging.
They build community – clubs – and ensure access to role models – faculty, speakers, mentors from industry.
Suppose you have a very rigorous, challenging, graduate course. And suppose your students talk about it as “that’s the course where you find out if you really belong in this discipline or not. It’s a really tough course.”
That’s the worst possible way to frame a course. Because you weed out the people with low confidence.
Instead you frame it as rigorous, exciting, and interesting, and you make it clear that if you work hard and ask for help you can do well. Because anyone can do that!
You don’t need to demystify the path to success just for minorities. You need to do it for everyone. To ensure that it’s not just the dominant group who get all the tips and advantages and who therefore succeed.
Best ways to attract diverse graduate students:

  • Recruit diverse undergraduates for summer internships (research and industry) (start early!)
  • Engage diverse faculty in recruiting to PhD programs.
  • Recruit at institutions with diverse students.
  • Include D&I reports on grant applications. Funding agencies to start asking for info on the diversity of the student body and what the institution is doing to increase it. The single easiest way to get institutions to change is to tie funding to progress. So we need to do this for research funding as well.
  • Include diverse speakers at every SC conference. Members of program committees. Videos that you make.
  • (Author aside – on a personal note I would say ban your show floor companies from using women in high heels and tight clothing to interact with the crowds!)

Having a really positive experience early on makes them much more likely to do a PhD.
Numerous studies have shown that we are all biased. There was a study that showed that scientists in Biology with equal faculty gender split were given two sets of resumes. Identical except for first name – Jonathan vs Jennifer. Both male and female faculty rated Jonathan higher than Jennifer, and were willing to pay Jonathan more.
Faculty members at MIT trying to recruit would call other faculty at other institutions to ask for recommendations of PhD students, and both male and female faculty would come up with white males. But if they were then prompted for women and people of colour they would come up with several extra names. It’s not evil. It’s not deliberate. It’s completely unconscious. It’s just part of growing up in a culture where, e.g., all doctors are male so you tend to think that doctors are male.
So you don’t just post an ad someplace, you actually reach out and ask people, and you ask explicitly for diversity.
It’s illegal in North America to ask about partners, marriage, and plans to have children. You need to make very sure that your recruiters know they can’t ask those questions.
It gets really tiring to be the female who’s arguing for women. If you’re black it gets very tiring to be the black person. But this is the responsibility of everyone, not just the members of underrepresented groups.


Why Compute?

One of the big obstacles to diversity in computing is that people don’t see its relevance to them. They can’t picture themselves as part of the field. They don’t really know what the field looks like, but they have vague images of spotty youths in darkened basements with endless supplies of pizza and coke.

That could not be further from my experience of computing, and of computing professionals. Everyone has a different story, and well over half of the people I talk to never actually intended to go into computing. In fact I was one of those people. I confessed at a school assembly recently that I actually wanted to do medicine, but I didn’t get in. I played with computers a bit in school and thought they were fun, but I never saw myself as a computer person.

I took Computer Science as a fill in subject in first year, and was rather surprised to find myself doing all CS by third year. I was even more startled to find myself in honours, and the PhD came as a complete shock. I certainly never intended to become an academic, and teaching was not on the agenda at all. You may have noticed I suck at predicting my career path.

So my path into CS was not normal, but it turns out that not normal is quite normal where CS careers are concerned. No two stories are alike.

So now, thanks to the generous and enthusiastic support of my interviewees, I am capturing those stories. Check out my YouTube channel, Why Compute, at

You might be surprised. This is what a Computer Scientist looks like.


The usability of imposter syndrome

Among the many mind blowing things I did today, I was slightly startled to find myself participating in a user experience test. But conferences are full of surprises and unexpected connections. Having a background in usability myself, I couldn’t say no to the opportunity to pick holes in someone else’s software.

User experience tests are interesting. You get given a task – not a set of instructions, but a goal to complete – and set free on the system to see how you go about solving it. Is it easy or hard? Which features of the system help you, and which ones get in your way?

The User Experience guy who conducted the test, Craig, could not possibly have been nicer, and he went to great lengths to explain that it was the website on trial here, not me. “You can’t do anything wrong!” he told me. I was relaxed, and comfortable, and interested in the site. It looked like being a fun experience.

But an interesting thing happened. Despite a constant flow of reassurance and encouragement from Craig, I began to get stressed. There were… let me be tactful… some issues with the website. I couldn’t complete some of the tasks. If I’d been hooked up to any kind of physiological monitor you’d have seen my heart rate skyrocket. My breathing became rapid. I began to sweat.

Why? Because I couldn’t complete the tasks. Tasks that were a test of THE WEBSITE. Not of me. But a little voice inside my head started to say “I’m going to look stupid because I can’t figure this out.”

“Someone smarter would know that bit of jargon.”

“It’s probably staring me right in the face, but I just can’t see it. I’m so dumb.”

Over and over Craig told me how helpful this was, how I was doing great, and how it was so useful to see the issues I had with it. He told me that they could never have figured out the problems without me. He was lovely. Honestly, you could not pick a nicer, less scary person to conduct the test. But because I was having a hard time using a website that I knew there were issues with, I was judging myself.


Right there.

That’s why usability matters.

Because when a piece of software is hard to use, crashes, or does something unexpected, even people with a PhD in usability blame themselves. People work with complex systems all the time – laptops, tablets, sound systems, even televisions. And in subtle (and not so subtle) ways those systems tell us that we are dumb. That we are too stupid to figure them out. That we make mistakes all the time.

As Katharine Frase pointed out, we try to learn to use fundamentally unusable systems, when those systems should actually be learning to work with us.

So next time you struggle with a piece of software, repeat after me: I’m not hopeless with computers. Computers are hopeless with me.



SC16 Keynote

I’m going to focus here on ideas that will, I hope, inspire people to look more into HPC and computational science. I won’t cover the awards, although I congratulate all recipients and recognise their contribution to the field.

Once again I’m live blogging, unedited, unvarnished, and limited by my typing speed, so please forgive any errors and confusing bits. Ask questions, talk about the issues, engage with me! I’d love to chat.

Notes from the SC16 keynote
Keynote: Dr. Katharine Frase – Cognitive Computing: How Can We Accelerate Human Decision Making, Creativity and Innovation Using Techniques from Watson and Beyond?

Introduction: How do we make sense of the exabytes of information that have accumulated? HPC is helping us to understand ourselves, our world, and even our universe. And the future will bring even more exciting discoveries.
John West from TACC. HPC community faces two key challenges – first we need to grow the workforce to meet the growing demand for HPC. 200,000 jobs in computing go unfilled each year (in the USA). HPC is just a portion of this, but recruiting HPC professionals today is already hard and going to get worse. We need more minorities, we need to cast the net wide and deep, bring in every voice, every budding expert, and every moonshot idea. We need diverse backgrounds to bring diverse ideas to the table if we want to be successful in meeting our challenges.
22% of the SC16 student class are female. Women and minorities often don’t choose computing as a profession due to the perception that computing doesn’t have a direct impact on society. This needs to change!
We could send about 450,000,000 snapchats in the next second over SCinet. Amazing bandwidth!
Dr Katharine Frase, IBM.
Think about how you got here. Most of us relied on our phones to wake us up, remind us what we’re doing today, update us on what happened overnight, track our steps, navigate with GPS, and give us traffic information. Imagine that you were here at SC06. Almost everything I just said was not true. Almost without realising it our lives have been transformed by the use of data, algorithms, and to some degree learning our preferences.
We need to move beyond answers, however adaptive, flexible and realtime. How do we move towards co-creation? Co-hypothesizing? How do we move towards systems that can help us ask the right questions?
The challenge of big data is not always that it’s big. People in this room have been extracting enormous insights from data for generations. But it’s not the structured data that we’re drowning in. It’s the unstructured stuff. It’s the velocity of the data. How do we separate the signal from the noise? And it’s the veracity. The truthfulness of the data.
A project with the city of Dublin to figure out how we could get better info from the GPS systems on their buses. Everything requires transfers in Dublin, you can’t get one bus. You have to keep transferring. So every time a bus is late you jeopardise your transfer. Couldn’t we use the GPS system to track each bus and find out what the best bus route is for today, given how buses are actually moving, rather than how they are scheduled to travel?
We found there were a number of buses that thought they were at the bottom of the Liffey river or in the basement of the Parliament building. It’s noisy data. So how do we extract insight without spending all our time cleaning up the data? How do we rely on the wisdom of crowds?
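To make the idea of cheap sanity checks on noisy GPS data a bit more concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration, not something from the talk or the Dublin project: the bounding box, the speed limit, and the (seconds, lat, lon) record format are all invented. It drops fixes that fall outside a rough service area or that would require the bus to travel impossibly fast.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical bounding box roughly covering greater Dublin (an assumption).
LAT_MIN, LAT_MAX = 53.20, 53.50
LON_MIN, LON_MAX = -6.50, -6.00
MAX_SPEED_KMH = 100.0  # assumed upper bound for a city bus


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def plausible_fixes(fixes):
    """Keep fixes inside the box that do not imply an impossible jump.

    `fixes` is a list of (seconds, lat, lon) tuples for one bus, sorted by time.
    """
    kept = []
    for t, lat, lon in fixes:
        if not (LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX):
            continue  # outside the assumed service area
        if kept:
            t0, lat0, lon0 = kept[-1]
            hours = (t - t0) / 3600.0
            if hours <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_km(lat0, lon0, lat, lon) / hours > MAX_SPEED_KMH:
                continue  # implied speed is not believable for a bus
        kept.append((t, lat, lon))
    return kept


# Made-up fixes for a single bus.
fixes = [
    (0, 53.3498, -6.2603),    # city centre
    (60, 53.3510, -6.2610),   # a short hop a minute later
    (120, 0.0, 0.0),          # obviously bogus fix
    (180, 53.4500, -6.0500),  # inside the box, but too far to reach in time
    (240, 53.3525, -6.2620),
]
clean = plausible_fixes(fixes)
print([t for t, _, _ in clean])
```

A filter like this would not catch a bus that reports itself in the river, since that point is still inside the box and close to the previous fix. Catching that would need something like a map of the road network, which is beyond this sketch.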
We focus on the architecture and the technology, but we need to think in another way – how do humans and systems interact? The first era of computing was counting things. The second is programmable systems. We get the same reliable answer to the same question regardless of how you ask it. We decided what those questions were going to be according to the structure of the system. We learnt to speak the way a computer thinks. The computer shaped our capabilities. We’re still, even with more usable languages like Python, learning to speak the way computers think.
So what do we mean by a cognitive system? What’s different?
The first is that a cognitive system understands language the way people use it. Natural language, machine learning, etc.
The second is that the system learns rather than being programmed. So the system gets smarter every day (with the right feedback) which means we can get a different answer depending on context – on where I am, or when it is. It responds to the most recent information. Maybe what I thought was true yesterday isn’t true anymore. Or I’ve given feedback to the system so that it gives you a subtly better answer next time based on your experience and feedback of last time.
The system can understand imagery, language and unstructured data as humans do. We have always built systems to expand human capacity, but computers have constrained our capacity as well.
I’m amazed now how anybody lived not knowing for 3 weeks at a time what the reaction was to that letter I sent. Can you imagine not being able to text your kids and find out where they are?
The challenge we face now is really one of complexity. If we’re going to have systems that help us think and decide, we need a better understanding of how people actually think and decide. 
Daniel Kahneman’s “Thinking, Fast and Slow” describes the two sides of our brains. The fast side gives us reactions without us even knowing how we think about it, and the slow side thinks carefully about things. The slow side usually wins, but not right away.
You’re trying to think of the name of a guy you see in the grocery store that you know from work. The name Mike pops into your head, and you know it’s wrong, but you can’t get past that name to the real one. That’s the fast side blocking the slow side. Later the slow side pops up the name, but too late.
Humans have a tendency to prefer information that reinforces what we think we already know or what we expect. It affects what we retain from what we read and how we react when somebody argues with us. And we structure our experiments to find the results that we expect. We also overestimate the chance that something will happen that has happened before, and underestimate the chances of something happening that has never happened before (Trump!).
Cognitive systems can help us with this. They can tackle our unconscious cognitive bias, and take into account that the last coin flip does not influence the next one, even though a lot of people think it does.
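The coin flip point is easy to check for yourself. The short Python simulation below is my own sketch, not something shown in the talk. It flips a simulated fair coin many times and looks at how often heads comes up straight after a head, and straight after three heads in a row. The seed and the number of flips are arbitrary choices.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(200_000)]  # True means heads

# Look only at flips that immediately follow a head.
after_heads = [nxt for prev, nxt in zip(flips, flips[1:]) if prev]
share_heads_after_heads = sum(after_heads) / len(after_heads)

# Look only at flips that follow three heads in a row.
after_streak = [
    flips[i]
    for i in range(3, len(flips))
    if flips[i - 3] and flips[i - 2] and flips[i - 1]
]
share_heads_after_streak = sum(after_streak) / len(after_streak)

print(f"heads after a head:      {share_heads_after_heads:.3f}")
print(f"heads after three heads: {share_heads_after_streak:.3f}")
```

Both shares should land very close to one half. A streak of heads does not make tails "due".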
A cardiologist felt he was at his best when he was in the hospital and there was a resident in the ward who could summarise the patient’s situation. History, overnight events, and test results. Then the cardiologist felt he could best use his experience in support of that patient. Katharine had the opportunity to have a copy of the documentation for each patient, and she participated in the panicked 5 minutes of reading through somebody’s chart before the next person walks through. We saw 5 patients that were very similar, and when the 6th patient walked in the cardiologist replayed the same script from the first 5, even though there were significant differences from the first 5.
Cognitive systems can be a statistically fresh set of eyes and help us become aware of, and deal with, our unconscious biases.
Machine learning: Image recognition and speech recognition. There is error in how humans recognise objects – sometimes you see something and think it’s actually something else. Speech is harder. The error rate is influenced by the size of the neural network to deal with it.
We now have massive datasets for training a system for speech, images, etc. We are trying to build systems to pass the radiology exam – a difficult problem even for humans. Can we train machine learning systems with huge datasets and have them do better?
You can’t just hand them unstructured datasets and have them come up with the right answer. They’re not omniscient. But we have huge increases in computational power and huge drop in cost, which makes these things possible.
The sheer volume of expertise being thrown at AI and cognitive computing is only increasing.
Why does this matter?
Why do we need to move forward in these new relationships with systems?
Katharine used to be involved with Watson, when it played Jeopardy. How do you go really native from language in the long tail on a game show, with poor English?
We went from there and tried to tackle oncology. We plugged in medical literature and started training the system to recognise the vocabulary of health. What is a symptom? diagnosis? treatment? We needed to add new features. In health timing matters – if you get the fever before you get the rash, it makes a difference. The system needed to learn what it’s reading and how humans interact with that.
We need to get expertise out to the broader community. Most people with cancer will not see an expert in their particular variation. They’ll see whoever is local and hope they get the right treatment. So we need to get that info and expertise out to improve the quality of local care.
U of Wisconsin is doing clinical trial matching. There are nearly 14,000,000 Americans with cancer. Less than 5% are involved in clinical trials. To match a patient to a trial there can be up to 46 characteristics you need to build that cohort. Gender, ethnicity, type of tumour, genetics, etc. You need to find that patient and reach the physician to suggest the treatment. So how can we use cognitive computing to get the word out to physicians that they might want to suggest this treatment?
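At its simplest, matching a patient to a trial is a check that every characteristic the trial cares about has an acceptable value for that patient. The Python sketch below shows only that basic idea. The trial names, the characteristics, and the data layout are all hypothetical, and it says nothing about how the Wisconsin system or Watson actually does it.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    name: str
    # Each criterion maps a characteristic to the set of acceptable values.
    criteria: dict = field(default_factory=dict)


def matches(patient: dict, trial: Trial) -> bool:
    """True when the patient satisfies every criterion the trial lists."""
    return all(patient.get(key) in allowed for key, allowed in trial.criteria.items())


def eligible_trials(patient: dict, trials: list) -> list:
    """Names of all trials whose criteria the patient meets."""
    return [t.name for t in trials if matches(patient, t)]


# Entirely made-up trials and patient, for illustration only.
trials = [
    Trial("TRIAL-A", {"tumour_type": {"lung"}, "stage": {"II", "III"}}),
    Trial("TRIAL-B", {"tumour_type": {"breast"}, "gene_marker": {"BRCA1", "BRCA2"}}),
]
patient = {"tumour_type": "lung", "stage": "III", "sex": "F"}

print(eligible_trials(patient, trials))
```

A characteristic missing from the patient record counts as a non-match here, which is a deliberately cautious choice. With up to 46 characteristics per trial, and much of the information buried in free-text notes, the hard part is getting clean values into that patient record in the first place.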
Dept of Vet Affairs is trying to create a cognitive system to take the 10,000 vets with cancer and increase their capacity by 30x to get the right treatment to each individual.
Our systems have to fit into the workflow of those people who are doing the real work. We know the syndrome of an IT system that only makes sense to the computer scientist. The bigger problem is that it doesn’t fit into the workflow. The physician has to turn his back on the patient in order to do something with the system. So however capable, the system sits on the shelf.
The workflow of the human. The practitioner. How do we get the insights of everything we know we can do surfaced to the human in a way that is timely and fits into the way they want to do their jobs.
Picture this. A young girl has a persistent cough. It’s cancer. She gets referred to an oncologist, who sees her medical record. The cognitive system can tell the oncologist how it compares to the 15,000,000 cases the system has in its records. It prompts the oncologist to ask particular questions: like her own preferences about her care. For example, it is important to her to keep her hair, or because she has small children she needs to come in less frequently. The system can take the specifics and recommend treatment options with likelihood of clinical success. The system reminds the doctor to ask her if anything has changed in the last two weeks. He would probably have asked her anyway, but the system reminds him. So a new piece of information comes in that can change the treatment options significantly. The options come back ranked according to success clinically, and according to how well they align with her preferences. The oncologist and patient can then make really informed choices.
If you like and trust your doctor it significantly improves your chances of a positive outcome, and you volunteer extra information. The system can help us in these dialog-driven and intangible ways.
Let’s talk about congestive heart failure. Difficult to diagnose in advance, all you can do is treat it after it shows up, to mitigate the symptoms. The law has changed so that hospitals are no longer fully compensated if they readmit a patient for the same problem too soon. So a large hospital wanted them to help predict which patients they would see again soon. It turns out the data is there and it is possible to predict it. And it has nothing to do with clinical markers. The best predictor? Did they have dementia? Addictive behaviour? Living alone? These are all social effects that predict whether they will be back or not. It’s not really a clinical answer. It’s obvious when you look at it, but not necessarily obvious when you are looking from a clinical perspective.
86,000,000 people in the USA are prediabetic. The rates of diabetes are exploding. We have 56 years of info about diabetes patients. It’s not a perfect dataset, but it contains treatment as well as lifestyle stuff. So we want to create a cognitive database of this information so that we can start producing advice about things that can be done to delay the onset of diabetes for those that are prediabetic, or delay the consequences of diabetes such as blindness and gangrene.
Yes clinical data, genomics, exogenous data like fitbits are important, but the social context and cultural context is important. We can’t make the same dietary recommendations to all cultures, for example.
As we start amassing more data and looking at it as a whole, can we start identifying new factors that allow us to intervene earlier?
Let’s look at finance. Most audits rely on the structured financial info of the firm. But the real keys are not in that data. Most audit firms can’t manage the size of the real data, so they just take a sample and hope it’s representative. So can you use a cognitive system to examine the full data of the firm and surface patterns to the auditors so that they can ask the hard questions and do the higher level thinking? We want the computers to do the routine things, the data intensive things, to free up the humans to do the higher level thinking.
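Here is a small Python sketch of why "take a sample and hope it’s representative" is risky when the interesting records are rare. The ledger, the amounts, the threshold, and the 1% sample size are all invented for the example. It plants 20 unusual transactions among 100,000 and compares what a random sample surfaces with what a scan of everything surfaces.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical ledger: 100,000 ordinary transactions.
ledger = [{"id": i, "amount": round(rng.uniform(10, 500), 2)} for i in range(100_000)]

# Plant 20 unusual transactions far outside the normal range.
unusual_ids = set(rng.sample(range(100_000), 20))
for i in unusual_ids:
    ledger[i]["amount"] = 250_000.00


def flag(transactions, threshold=10_000):
    """Ids of transactions whose amount exceeds the (assumed) threshold."""
    return {t["id"] for t in transactions if t["amount"] > threshold}


sample = rng.sample(ledger, 1_000)  # a 1% random sample
found_in_sample = flag(sample)
found_in_full = flag(ledger)

print(f"unusual transactions found in the sample: {len(found_in_sample)}")
print(f"unusual transactions found in full scan:  {len(found_in_full)}")
```

With only 20 unusual records in 100,000, a 1% sample is expected to contain about 0.2 of them, so most samples see none at all. The full scan finds every one, which is the case for letting the machine read everything and leaving the hard questions to the auditors.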
What about World Peace? “I would have Watson help create World Peace. However, because this is unrealistic, I would have Watson improve education around the globe.” 6th grader.
This education challenge: we have unsatisfactory numbers for HS & college graduation rates, college debt, retraining our veterans and reintegrating them into society. In the developing world the problems are different. You can’t build enough schools or train enough teachers fast enough.
Education is very data poor. There’s no such thing as a clinical trial in education. Every trial is beset by small volumes of subjects and the fear that we didn’t control all the variables. Do we really know what we think we know about learning styles? about autism? A big data, data-driven approach could change this.
What could a system do to support a teacher in the classroom? It could grab all the info that’s already in the data systems of your school that you can’t get to, to give you insights into your students. How did they perform last year? Are they gifted? Are they troubled? The system can help you with the data we already have in schools that is not surfaced to the teacher. Teachers do their planning on the computer late at night. On the planning side the teacher can get up to the minute info about her students in all their classes. If John’s grades are dropping everywhere, it’s different to if he’s only failing in yours. You can get recommendations of curated content that the school has access to, tailored to individual students. She can chat with the other teachers who teach that student. She can get that content on her tablet and send messages forward to other teachers who are about to teach that student.
The ability to informally create a coordinated care feeling around that child is the single thing teachers find most valuable. Schools put in the effort to put care around the bottom and the top students. Those of us who have fabulous kids in the great unwashed middle, technology can help us surround those kids with personal care.
The more engaged a student is, the better they learn. They become learners rather than repeaters. They become explorers and discoverers. Sesame and IBM are developing platforms and products to address early childhood education. Not just to create smarter children but kinder children. Sesame tackles the tough issues, but til now it’s been in broadcast mode. It’s the same Elmo for every child. What if it weren’t? What if a parent could help design the way in which their child experiences it? My son is crazy about dinosaurs, can we do the alphabet content with dinosaurs? My daughter is crazy about horses, let’s put the stories in that context.
How do we engage children? 40 years ago TV was magic. Now kids try to interact with the TV. So how do we use technology to engage these youngest learners? There’s a vocabulary gap by the time the kids reach the age of 4 that is frighteningly predictive of the rest of their lives. How do we change that? How do we make it not be true?

Do doctors & teachers get better when they use cognitive systems?
How do you measure betterness? Most individual physicians don’t think they need the help. 🙂 But in the aggregate quality goes up. One of the big challenges for a teacher – you’ve always been a 4th grade teacher. You’ve been asked to teach 3rd grade. It’s a new standard. What should you be expecting your students to come in with? You don’t want other people to know that you have questions. You’re afraid it will undermine your standing in the school. But we can provide an anonymous place to get that info without feeling like you’re embarrassing yourself.
This ability to start using systems in support of what we do in the classroom is extraordinarily powerful.
Education isn’t cancer research. The financial incentive to create these tools doesn’t exist. Fighting cancer is an industry with cash, education is not.
Schools are already spending money on tech. They’re not necessarily spending it in a way that helps the teacher. What is that threshold of affordability that starts getting to adoption, and then scales to help us drop costs? Every teacher and every school has particular things they wish the system could do. How do we start from the subset that is affordable and then raise the bar? As in every industry, business process change is harder than we as technologists ever want to admit that it is. So you have to start with the individuals that really want to change.

How do we deal with the lack of diversity in HPC and machine learning, and help the systems avoid the bias that comes with biased input data?
If the system gets biased data it’s going to give biased output. But the system can actually help us identify biases in the datasets. It can surface the problems like disproportionate gender representation, missing cultural/ethnic groups etc, that we might not otherwise be aware of.
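The simplest version of "surfacing" a skewed dataset is just counting. This Python sketch is my own illustration with made-up records and an arbitrary threshold, not anything described in the talk. It reports each group’s share of a dataset and flags any group that falls below a chosen minimum.

```python
from collections import Counter


def representation_report(records, attribute, min_share=0.3):
    """Map each value of `attribute` to (share of records, flagged as under-represented)."""
    counts = Counter(r.get(attribute, "unknown") for r in records)
    total = sum(counts.values())
    return {value: (n / total, n / total < min_share) for value, n in counts.items()}


# Made-up training records: 8 labelled male, 2 labelled female.
records = [{"gender": "male"}] * 8 + [{"gender": "female"}] * 2

report = representation_report(records, "gender")
for value, (share, flagged) in report.items():
    note = "  <- under-represented" if flagged else ""
    print(f"{value}: {share:.0%}{note}")
```

A count like this only catches groups that are present but scarce. Groups missing from the data entirely never show up in the report, so you still need a person to ask who should have been there.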


Well that was awesome. So many different fields where cognitive computing can make a difference, so many ways HPC is changing the world. If these summaries are useful to you, please share, retweet, and leave me a message!


HPC matters, Impact on Precision Medicine

I’m sitting in the first real session of the technical programme at SC16: the Plenary session of HPC Impact on Precision medicine. I’m going to try to live blog as much of the conference as possible. Bear in mind it’s what I see and hear, filtered through what I find interesting and what I have time to actually get down in text, so please don’t take it as a true and correct record of the session. But if it inspires you to look a little further into a topic, or gives you some perspective on the use of HPC – High Performance Computing – then it’s a worthwhile effort.

Here goes!

“We have a war on cancer going back many decades, but we can’t say we’ve cracked the problem.”

“We are now on the path to produce computers that can do 10**18 operations per second, but we’re also about using computers that operate at that scale to solve problems. These tools will be central to cracking the problem of cancer.”

“Wherever there are questions that tie our brain in knots, HPC is untangling them.”

Precision medicine takes account of individual variations to tailor the most appropriate treatments to patients. It used to be called personalised medicine.

Healthcare spending in the US is heading towards 11 trillion dollars in 2021. Precision medicine provides tools to enable better outcomes for patients. Zeroing in on the most appropriate care is the goal.

It can help to save the lives of children and adults who have exhausted the possibilities of traditional medicine.

Cancer moonshot aims to accelerate cancer research, using precision medicine.

The session has 5 distinguished experts with 5 different perspectives on precision medicine.

We have Fred Streitz from Lawrence Livermore National Laboratory; Mitchell Cohen from Colorado, a surgeon and translational science investigator; Warren Kibbe from the National Cancer Institute; Steve Scott from Cray Inc; and Marti Head from GlaxoSmithKline.

Mitchell Cohen talks about physiological state recognition, formerly known as clinical acumen – i.e. knowing the state the patient is in. Recognising the situation and adapting to it. Much of the general public believes that they will get sick, go to hospital, and have a tricorder tell them what’s wrong and how to fix it. In practice you’re looking for an experienced clinician who knows what he/she is looking at. It’s about experience.

The question is: can we use computational modeling and HPC and what we know about biology and physiology to model that experienced clinician?

Unfortunately the current state of the art in Physiological state recognition is not that great.

The state of the art of ICU precision medicine is a 4×6 index card with notes and test results scrawled on it. This is how we take care of the sickest patients. We use our combined clinical gestalt to make decisions based on data which may be wrong and outdated. We treat one parameter and one threshold – we’re treating univariately in a multi-variate world. If you get the astute clinician you’re going to do well. If you don’t get that astute clinician things might not go so well.

We ask – can we sequence the cancer and you and figure out which treatment will work?

I have a hope that we look back at that several years from now and say that’s not really precision medicine, that was just better pathology. The real precision medicine is identifying their physiological state – where they are at any point in time – and where they will be in 2 minutes time, an hour, a day, and how we can modify their trajectory towards better health.

The beauty of HPC is that it can sit in that chasm between model driven basic biology and data driven medicine.

This will fundamentally change the art and practice of medicine.

Warren Kibbe talks about the cancer moonshot. The National Cancer Institute’s mission is to develop the scientific evidence base for understanding cancer, and lessen the burden of cancer around the world. In 2016 there will be 1,700,000 new cancer cases and 600,000 cancer deaths in the USA alone, and nearly 14,000,000 new cancer cases around the world. It’s a really important disease for us to understand, and we are understanding it in fundamental ways. The good news is that the mortality rate of cancer has been declining since 2007.

Precision medicine will lead to fundamental understanding of the complex interplay between genetics, epigenetics, nutrition, environment and clinical presentation and lead to evidence based treatments.

Cancer requires biological understanding, advances in scientific methods, instrumentation, technology, data, and computation. We need to be able to model and predict cancer in a very different way to the way we do now.

We need to share data and share ideas around the world and around the research community, such as the Genomic Data Commons. We need to get the data into the cloud so that people can access it more effectively. We’ve done something fundamentally different for cancer. We want to have every patient’s data across the country and across the world be accessible from a prediction standpoint.

Go to to read the recommendations for the best ideas we have on the cancer moonshot.

Steve Scott – we’re trying to bring together all the information we have available to better understand the patient’s situation. We’re dealing with increasingly complex and large datasets. We need to include all the information we have about the population, about medical literature, about the environment, and mine it for the long tail, the statistical outliers, so that we can better understand the individual.

We have databases filled with a vast number of cancer mutations. Historically what we’ve done is look at the most common ones and find good treatments for those. That gives you good treatments for the most common cancers.  HPC and data analytics give you the power to look at the uncommon ones too, and use the entire database. These are the sort of understandings that can lead to precision treatments for individuals.

One group created scalable software to do complex pattern matching on a graph database of medical literature. Using this they can answer some complex questions about medical treatments and technologies, and develop specific treatments and even diagnose individual rare cases.

The emerging data analytics world has a disconnect with clinicians. We need to work together and produce practical solutions that clinicians can use now.

Computational needs are getting more and more complex. The real power of supercomputers depends more on ability to move data – on memory and interconnect – than on operations per second.
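One common way to see why data movement can matter more than raw operations per second is the roofline model, which was not mentioned in the talk and is added here only as an illustration. It bounds attainable performance by the smaller of peak compute and memory bandwidth multiplied by how much work is done per byte moved. The machine numbers below are invented, not those of any real system.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline bound: performance is capped by compute or by data movement.

    All arguments are illustrative assumptions:
      peak_gflops     - peak compute rate of the machine (GFLOP/s)
      bandwidth_gb_s  - sustained memory bandwidth (GB/s)
      flops_per_byte  - arithmetic intensity of the kernel (FLOP per byte moved)
    """
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0

# A data-heavy kernel doing 0.25 FLOP per byte (graph analytics tends to sit
# at the low-intensity end) is limited by bandwidth, far below peak.
memory_bound = attainable_gflops(PEAK, BANDWIDTH, 0.25)

# A compute-heavy kernel doing 20 FLOP per byte can reach the compute peak.
compute_bound = attainable_gflops(PEAK, BANDWIDTH, 20.0)

print(memory_bound, compute_bound)
```

On this made-up machine, any kernel doing fewer than 10 operations per byte (peak divided by bandwidth) is limited by data movement rather than by arithmetic.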

Precision medicine is moving towards large scale graph analytics and machine learning and we can learn from HPC disciplines to make this possible. We need to build solutions that actual clinicians can use without having to be computer scientists.

The ExAC database is a tool that is being used by clinicians and is a great example of a usable system that is useful to the people at the coalface.

Fred Streitz from LLNL

We say that HPC matters, and in this community we know that’s true. But it’s not always easy to convince the rest of the world. We come up with stories about the development of the iPhone or the creation of a Boeing jet, but we’re now talking about using HPC to change medicine, and that will matter to everybody.

We have 3 different pilot programs at 3 different scales. Pilot 1 is at the cellular level, looking at predictive models for pre-clinical screening. If you give a drug to a particular cell line, even one that’s ostensibly identical, you don’t get identical results. We have developed a large and growing database of these responses, so the question is whether we can use machine learning to uncover the predictive patterns in that pre-clinical data.

Pilot 2 is focused at the atomic/protein level. Looking at the RAS protein mutation that causes unrestrained growth, which is responsible for around 30% of human cancers, and some of the really nasty ones like pancreatic cancer and lung cancer. We want to identify targetable areas where we can develop therapeutics.

Pilot 3 is focused on the population scale, developing an effective national cancer surveillance programme, looking at all the data we already have about who has had cancer and what treatment and response they’ve had. Using natural language processing to combine all the different data sets from state to state and then using machine learning to find stuff out.

In this research, partnership is crucial. The partnership changes how you view the problem. You need to put people together with different skills and perspectives and that’s how you solve the problem.

Marti Head from GSK. We think of disease as a war and it’s an imperfect metaphor. It’s often the response of our own bodies that causes the real damage and is the real disease. Cancer we think of as something that’s coming after us, but really it’s a part of who we are. A part of our genetics, our physiology, and our environment. We need to take a holistic approach to all that I am as an individual and how that contributes to throwing my body out of homeostasis, away from health, and towards disease.

We have this amazing increase in the data available to us. The clinical data. The data that’s locked up within pharmaceutical companies. The data is complex and complicated and on different scales, and we need a transformation in the way that we discover drugs and deliver them to our patients. We need to be able to combine all the different types of data and create a holistic view of us and how we manage our health.

It takes us 5-7 years to go from a disease hypothesis to a drug candidate that can be used in the clinic, and then there are years of clinical trials before we know whether that disease hypothesis is anything close to true. If we want to work at an individual level we have to go faster. We can’t just take the same processes we’re already doing, shrink the whitespace and rush. We need to transform the way we do things. This is why HPC matters.

Question session

We have only a few hundred thousand genomes available to us now. We need a lot more data. We need to understand the questions we’re trying to ask and the problems we’re trying to solve, and find as many creative ways as possible to fill the data gaps.

Solving cancer requires us to understand fundamentals about biology and we’re not even close to that. We need a tremendous amount of data and much better predictive models than we currently have. In the short term we can look at what we can do with the data we already know how to generate and get better at doing that.

We could start really impacting patients’ lives within 5 years. We are going for a fundamental paradigm shift in the way we do medicine, but each incremental step helps our clinicians save lives today. Even a small incremental improvement in our understanding of biology can change things at the bedside right now.

So that’s my rapid brain dump of the plenary session. If anything is unclear, please ask me! Comment, discuss, engage. 🙂

All mistakes are doubtless my own and not the responsibility of the speakers.




Why send high school students to SC?


Why would we send four year 10 students to an academic conference? What could they possibly gain from it?

Supercomputing is an extraordinary conference. There are certainly plenty of talks that will go over the students’ heads – and indeed over mine, despite the PhD in Computer Science, and my years in academia before I became a high school teacher.

But there is so much more than that. Last year we heard about how supercomputing allows us to model the world in so much detail that we can measure the impact of closing schools on the spread of a pandemic. Or of shutting down public transport. Or just asking people to stay home.

We heard how modeling of earthquakes actually changed our understanding of the earthquake risks in areas of California, changing building codes, crisis plans, and insurance rates.

We heard Alan Alda talk about science communication, and thought a lot about how it doesn’t matter how amazing your science is: if you can’t communicate it, it really hasn’t happened.

And we got to talk to scientists and programmers from organisations like NASA, Nvidia, NOAA (National Oceanic and Atmospheric Administration), Texas Advanced Computing Center, and SGI. The kids got to ask their own questions, and hear countless stories about how people got into Computation, what impact it has on their work, and how it changes the world.

Most importantly they got excited about the power of Computation, and they were able to bring that excitement home and share it with the rest of the school. Each year a teacher goes with the students and changes their own understanding of Computation, and how it relates to their particular discipline. This is my third time at SC, and every time the impact on me and my teaching has been huge.

This opportunity to link high school students and teachers with Computer Science academia and the supercomputing industry is extraordinary. And I actually think the impact goes both ways. Academics and professionals get to share their work with bright, enthusiastic young people, and think about how to present their field to a different audience. And we get to take back a new, intense understanding of the incredible change supercomputing is creating in the world.

The ripple effect of the stories we take home with us is vast. You can change the world with supercomputing, and you can change the world with young people. Combine the two and who knows what will happen? We are casting pebbles of knowledge and enthusiasm into the universe.

Everyone we spoke to at SC15 had a story, and everyone was generous and encouraging with the students.

So if you’re at SC16 and you see us around, come and say Hi. Share your story, and we’ll share ours. Who knows what effect your pebble might have?


Supercomputing Odyssey

Today I am in Salt Lake City, Utah, ready for the start of SC16. Four year 10s and I will spend a week learning about the applications of Supercomputing, and being blown away by the technology, effect, and passion of those involved in Supercomputing.

From NASA to Nvidia, from earthquakes to nuclear power, supercomputing is increasingly pervasive. It’s an extraordinary opportunity for 4 year 10 students and one high school Computer Science teacher. Our challenge is to bring back and share as much of the wonder, the passion, and the extraordinary reach of supercomputing as we can.

From Computational Science to just-in-time delivery, from the A380 we flew in on to Amazon’s book deliveries, from pandemic management to climate change, supercomputing is changing our understanding of, and our approach to, the universe.

I’ll be blogging about it here as much as possible, and you can follow our journey at the student blog, or on Twitter as @scjmss, or on Facebook.

Our trip is generously funded by Monash University Faculty of Information Technology and Papercut Software International.

Come join us and learn how supercomputing is shaping your world!

