Principal Investigator Caroline Uhler
Project Website https://www.ericandwendyschmidtcenter.org/
Project Start Date July 2023
The Eric and Wendy Schmidt Center (EWSC) is enabling a new field of interdisciplinary research at the intersection of the data and life sciences, aimed at improving human health. Researchers and partners work together to make the biological questions of our time key drivers for foundational advances in machine learning -- and vice versa.
The Center brings together a global network of scientists from academia and industry to promote interdisciplinary research across the data and life sciences, with the goal of transforming biology and ultimately improving human health.
It draws collaborators from across the Broad community, including MIT, Harvard, and Harvard teaching hospitals. It also partners with efforts already underway at the Broad, such as the Models, Inference & Algorithms (MIA) Initiative and the cross-institutional Machine Learning for Health (ML4H) effort.
The Center is directed by Caroline Uhler, who is a core institute member of the Broad and, at MIT, a full professor of electrical engineering and computer science and a faculty member of the Institute for Data, Systems, and Society.
Advances in biomedical technologies -- including next-generation sequencing, single-cell genomics, and medical imaging -- are resulting in an explosion of data. That’s giving researchers the opportunity to answer some of the most fundamental questions about the “programs” of life. Yet self-driving cars, advertising, and recommender systems remain the key drivers of advances in machine learning today.
Areas of Focus include:
(*) Cells & Optimal Perturbation Design -- With the development of genetic technologies to precisely alter, or "perturb," cells comes the opportunity to understand cell-state transitions, which are fundamental to any biological process. But the huge number of ways we can perturb cells makes it challenging to test these alterations in the lab. That's why the Eric and Wendy Schmidt Center is developing novel active learning frameworks that can home in on which perturbations can bring about desired cell-state transitions -- and provide other insights into how cells work.
(*) Tissues & Causal Representation Learning -- Through recent technological developments, we can now obtain RNA sequencing data from whole tissue sections without losing cell location information. But the computational methods for analyzing these spatial datasets are still inadequate. In particular, we need methods that can seamlessly integrate images, sequencing data, and 3D coordinates from large-scale spatial transcriptomic studies. To that end, the Eric and Wendy Schmidt Center is developing causal representation learning methods that allow us to integrate different kinds of data to uncover the mechanisms of how tissues are organized in health and disease.
(*) Organisms & Multimodal Representation Learning -- With the rise of biobanks around the world, we are entering an era where there will be millions of individuals with whole genome sequences, detailed health histories, and high-resolution imaging phenotypes. With this comes the opportunity to develop exquisite characterizations of diseases and more accurate predictions of who will respond best to therapies. Motivated by the goal of making precision medicine a reality, we are developing multimodal representation learning methods to better utilize rich, multimodal clinical datasets from partner health care systems.