Prof. Stephen Bates

X-Window Consortium Career Development Assistant Professor of Electrical Engineering and Computer Science

Primary DLC

Department of Electrical Engineering and Computer Science

MIT Room: 32-D758

Areas of Interest and Expertise

Artificial Intelligence and Machine Learning
Information Science and Systems
Optimization and Game Theory
Systems Theory, Control, and Autonomy

Research Summary

Bates uses data and AI for reliable decision-making in the presence of uncertainty. In particular, he develops tools for statistical inference with AI models, data impacted by strategic behavior, and settings with distribution shift. Bates also works on applications in life sciences and sustainability. He previously worked as a postdoc in the Statistics and EECS departments at the University of California at Berkeley (UC Berkeley). Bates received a B.S. in statistics and mathematics from Harvard University and a Ph.D. from Stanford University.

Professor Bates believes that conceptual, algorithmic, and mathematical advances enable us to use data and AI models to better understand complex patterns in the physical and social world and to build reliable automated systems. To this end, he focuses on developing statistical principles and formal frameworks to understand challenging types of data that are increasingly important. In particular, Professor Bates works on:

(*) Statistical inference with AI systems. AI models based on deep neural networks are increasingly used in real-world systems. Their use is motivated by the fact that they have the best performance with high-dimensional data, such as image and natural language data. However, the standard statistical toolbox does not apply here; users seeking assurances about the reliability of these models, such as confidence intervals on predictions or bounds on the false discovery rate across multiple decisions, are left with little recourse based on the existing literature. Bates seeks to build out a rich statistical toolbox for AI models, so that researchers can use these powerful systems while remaining on solid statistical ground. Work in this theme builds on core statistical techniques such as resampling methods, multiple hypothesis testing, and empirical process theory.

(*) Data impacted by strategic behavior and information asymmetry. Data emerging from systems with human decision-makers is increasingly important, and the possibility of strategic behavior raises new inferential challenges. For example, profit-sensitive pharmaceutical companies sponsor clinical trials, which are then analyzed according to some statistical protocol, and are heavily rewarded for drugs that are approved. Correctly analyzing data affected by strategic agents is critical, and Bates is developing methods for this that build on concepts from decision theory, game theory, and statistics.

(*) Shifting distributions and feedback loops. More broadly, data are increasingly collected from dynamic environments with shifting distributions, and these shifts can be caused by changes made to the system or policy. Bates works to extend statistical methods to such non-i.i.d. settings. For example, consider protein design, where the analyst has access to some set of proteins and their associated fitness scores. The goal is to design a new protein that has higher fitness than those seen previously. The analyst might fit a model predicting fitness from protein structure, and then choose a good candidate protein to synthesize and measure in a wet-lab experiment. This process is repeated several times, so there is a feedback loop: the model the analyst fits affects the subsequent data collection. Such non-i.i.d. settings with shifting distributions are increasingly relevant to modern data analysis, and it is essential to create techniques to address them.

He is especially interested in applications in the life sciences and sustainability.

Recent Work