Prof. Omar Ahmed Metwally Khattab

Assistant Professor of Electrical Engineering and Computer Science

Primary DLC

Department of Electrical Engineering and Computer Science


Research Summary

Professor Khattab joined the Department of Electrical Engineering and Computer Science as an assistant professor in July. He is also affiliated with the Computer Science and Artificial Intelligence Laboratory (CSAIL). His research develops new algorithms and abstractions for declarative AI programming and for composing retrieval and reasoning. Khattab previously worked as a research scientist at Databricks. He received a B.S. in computer science from Carnegie Mellon University and a Ph.D. in computer science from Stanford University.

Professor Khattab's research spans two overarching directions, each consolidated in an influential open-source research system. Together, the two systems are downloaded millions of times a month.

(1) Building Reliable AI Systems with Language Models -- Khattab built the DSPy framework, a programming model for declaratively expressing and automatically optimizing Natural Language Programs, i.e. modular software systems that use natural language to specify parts of their behavior. In this line of work, his research develops:
(*) Natural Language Programs and their abstractions and optimizers, as in DSPy (ICLR’24 Spotlight) and its predecessor DSP. This work includes state-of-the-art systems like STORM (NAACL’24), IReRa, PATH, and PAPILLON (NAACL’25), and optimizers like MIPRO (EMNLP’24) and BetterTogether (EMNLP’24).
(*) Retrieval-based NLP Systems like ColBERT-QA (TACL’21), Baleen (NeurIPS’21 Spotlight), Hindsight (ICLR’22), and ARES (NAACL’24).

(2) Developing Effective & Efficient Retrieval Models -- Khattab built the ColBERT retrieval model, which has been central to shaping the modern landscape of information retrieval. In this line of work, his research develops:
(*) Retrieval Models like ColBERT (SIGIR’20), ColBERTv2 (NAACL’22), and UDAPDR (EMNLP’23).
(*) Scalable Retrieval Infrastructure like PLAID (CIKM’22), WARP, and DeepImpact (SIGIR’21).

Recent Work