John Guttag

Dugald C. Jackson Professor of Computer Science and Engineering

Machine Learning Medicine

John Guttag brings “small data” analyses to predictions of patient risks and treatment benefits.

By: Eric Bender

As we stockpile enormous quantities of healthcare data, and wonder what exactly to do with it all, machine learning algorithms can help us predict which patients are most likely to need a specific medical treatment or to benefit from that treatment, says John Guttag, MIT’s Dugald C. Jackson Professor of Electrical Engineering and Computer Science.

While machine learning can be defined many ways, “the basic notion is that it turns programming on its head,” he says. “Usually we take data, feed it into a program and out comes an answer. In supervised machine learning, you take the data and the outcomes, and put those into a program that produces another program that can then make predictions with unseen data.”
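To make the "program that produces another program" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Guttag's method: the `train` function, the nearest-centroid technique, and the toy numbers are all assumptions chosen for brevity. The point is the shape of the computation: data and known outcomes go in, and a new prediction function comes out.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def train(examples: Sequence[Vector], outcomes: Sequence[int]) -> Callable[[Vector], int]:
    """Learn a nearest-centroid classifier and return it as a function."""
    # Group the training examples by their known outcome.
    by_label: dict[int, list[Vector]] = {}
    for x, y in zip(examples, outcomes):
        by_label.setdefault(y, []).append(x)

    # Compute the mean feature vector (centroid) for each outcome.
    centroids = {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in by_label.items()
    }

    def predict(x: Vector) -> int:
        # Pick the outcome whose centroid is closest (squared Euclidean distance).
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
        )

    # The returned function is the "other program" produced by training.
    return predict


# Made-up toy data: two features per example, outcome 0 or 1.
examples = [(1.0, 1.2), (0.8, 1.0), (3.0, 3.1), (3.2, 2.9)]
outcomes = [0, 0, 1, 1]

model = train(examples, outcomes)
print(model((0.9, 1.1)), model((3.1, 3.0)))  # prints "0 1"
```

Here `model` was never written by hand; it was generated from the examples, and it can be applied to inputs that were not in the training data.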

“One of the things you can do with machine learning is generate hypotheses,” Guttag adds. “It’s a powerful tool that allows you to do things that you can’t do with traditional statistical hypothesis-checking techniques.”

However, applying machine learning to medical data is far from straightforward, says Guttag, who researches these algorithms at MIT and commercializes them in the startup Health[at]Scale Technologies.

“You hear about tens of millions of patients, tens of thousands of variables per patient, and think, Wow, this is a big data problem,” he says. “But by the time you focus on a particular condition and patient population, you end up stratifying your data so you never have quite enough. And so what we typically end up grappling with technically is a small data problem.”

One effective machine-learning approach with “small data” is what Guttag calls a computational biomarker.

Rather than relying on direct measures of, say, cholesterol or certain proteins in the blood, “we can apply sophisticated computational techniques and construct an artificial biomarker that is actually more informative than the directly measured ones,” he explains. For example, analyzing electrocardiogram data for signs of small heart defects that are invisible to the human eye might yield better predictions about who will benefit from an implantable defibrillator.

Among many other health applications for machine learning, Guttag and his colleagues are studying ways to identify the best candidates for clinical trials of drugs or medical devices, to predict which patients are most likely to acquire new infections during their hospital care, and to predict the outcomes of routing patients to different care providers.

Pinpointing patients

A veteran software engineer, Guttag began studying medical uses for machine learning when he took a sabbatical from MIT at Massachusetts General Hospital (MGH). He had planned to bring his expertise in wireless networking and software-defined radios to build better medical devices. “But I saw there was a huge gap not so much in the device end of the world but in the data science end of the world — increasingly, medicine was accumulating vast amounts of information and yet didn’t know what to do with it,” he says. “So I reoriented myself, to figure instead how we can take this treasure trove of data and turn it into useful information.”

Guttag focused on building models for more personalized medicine, predicting which patients would respond well to treatment, which would not respond, and which might respond adversely. The models could also aid in analyzing who actually needs treatment.

To take an example from cardiac care, each year about two million people in the United States suffer a heart attack. “Most of those people will do just fine afterwards, but a small fraction of them will do very badly,” he says. “Figuring out who needs what is a hard problem.”

Some can benefit from an implantable cardiac defibrillator, which can shock the heart into restarting. But only a small fraction of patients who get the implants end up using them. “Right now, we’re not very good at defining who would benefit from them,” Guttag says. “We wanted to find a better indicator.”

Damage to the heart muscle makes it electrically unstable. If the damage is large, doctors can see instabilities in electrocardiogram data, which helps them to make a decision about an implant. If the damage is small, however, they can’t see it. But Guttag and his colleagues showed that by examining many hours of heartbeats, they could detect small instabilities that provide good indicators for treatment with a defibrillator.

In one collaboration with a healthcare provider, a joint MIT/MGH project will examine one way to cut down on healthcare-associated infections.

“An embarrassingly large fraction of the people admitted to a hospital end up with an infection totally unrelated to the reason for their admission,” Guttag explains. “This is a huge burden on the patients and a huge burden on the healthcare system.”

Aiming to cut down on these infections, he and his colleagues built a predictive model that taps into the hospital information system and can assess each patient’s likelihood of contracting a specific infection within a few days. Testing patients at greatest risk before they’re symptomatic can speed their treatments and reduce the chances for the infection to spread.
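The article does not describe how the team's model works internally. As a rough illustration of the general pattern (score each patient, then test those at highest risk first), here is a small Python sketch that assumes a logistic model. The feature names, weights, intercept, and patient records are all hypothetical; a real system would learn its parameters from hospital data.

```python
import math

# Hypothetical coefficients, for illustration only.
WEIGHTS = {"age_over_65": 0.8, "recent_antibiotics": 1.1, "days_in_hospital": 0.15}
INTERCEPT = -3.0


def infection_risk(patient: dict) -> float:
    """Logistic model: map a weighted sum of features to a probability in (0, 1)."""
    z = INTERCEPT + sum(w * patient.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def patients_to_test(patients: dict, k: int) -> list:
    """Return the IDs of the k patients with the highest predicted risk."""
    ranked = sorted(patients, key=lambda pid: infection_risk(patients[pid]), reverse=True)
    return ranked[:k]


# Made-up patient records keyed by an anonymous ID.
patients = {
    "A": {"age_over_65": 1, "recent_antibiotics": 1, "days_in_hospital": 10},
    "B": {"age_over_65": 0, "recent_antibiotics": 0, "days_in_hospital": 2},
    "C": {"age_over_65": 1, "recent_antibiotics": 0, "days_in_hospital": 4},
}

print(patients_to_test(patients, 2))  # prints "['A', 'C']"
```

Ranking by predicted risk rather than applying a fixed cutoff is one possible design choice: it lets a hospital match the number of early tests to the capacity it has on a given day.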

Partnering for clinical change

Guttag knew that pharmaceutical companies, medical device makers, healthcare providers, and payers all may be more comfortable working with a startup than licensing intellectual property from academia. That was the thinking behind the launch of Health[at]Scale Technologies, which offers validated machine learning technologies to such partners.

“If you’re a pharmaceutical company, this technology will be useful in targeting drugs,” Guttag says. “If you’re a healthcare provider, it’s useful in choosing treatments. If you’re a healthcare payer, it’s useful in deciding what you should pay for and what you should not pay for.”

One likely role in the company’s collaborations with pharmaceutical and device companies is to aid in selecting appropriate patients for clinical trials. “These trials, particularly the large-scale trials, are very expensive, and most of them fail,” he points out. “If you could design the trial around individuals who are likely to benefit, you could dramatically reduce the chance of failure.”

But picking the best patient candidates is tricky. “A big clinical trial is a few thousand patients,” Guttag says. “That’s not a big number.” This opens up the opportunity to use “small data” approaches that draw effectively on data from related clinical trials (including trials that fail), normal clinical records, and other non-trial sources to assist in designing better trials.

Working with healthcare payers, Health[at]Scale can analyze healthcare records, insurance claims, prescription information, and other data to predict “the right therapy in the right amount delivered by the right provider to the right patient at the right time,” Guttag says.

And in a collaboration with providers and payers, Health[at]Scale Technologies is matching surgical patients with appropriate providers and post-acute care patients with appropriate skilled nursing facilities.

Broadening the scope of data

While Health[at]Scale brings proven analytic techniques to healthcare problems, in his day job at MIT Guttag and his students are pursuing two open challenges in medical research.

One line of work studies methods to correlate the effects of treatments across different patient populations: for instance, using data from one disease to predict outcomes for a different disease, or using data from one population to make accurate predictions about another.

The second effort looks at opportunities in applying computer vision to medicine.

Doctors often gain useful information just from eyeballing patients, Guttag points out. In a physical exam, for example, a doctor might pinch a patient’s finger and see how long it takes to get pink again, or look for little black spots on the toes of someone at risk of a circulatory disease. “With a camera and sophisticated image analysis, could we actually measure these things and provide quantitative information?” he asks. “Could we track progression in ways that are very hard to do with the human eye? We don’t know the answers, but we think that in the next couple of years we will.”