Prof. Omar Ahmed Metwally Khattab

Assistant Professor of Electrical Engineering and Computer Science

Primary DLC

Department of Electrical Engineering and Computer Science


Research Summary

Professor Khattab's research spans two overarching directions, each consolidated in an influential open-source research system. Together, the two systems are downloaded millions of times a month.

(1) Building Reliable AI Systems with Language Models -- Khattab built the DSPy framework, a programming model for declaratively expressing and automatically optimizing Natural Language Programs, i.e., modular software systems that use natural language to specify parts of their behavior. In this line of work, his research develops:
(*) Natural Language Programs and their abstractions and optimizers, as in DSPy (ICLR’24 Spotlight) and its predecessor DSP. This work includes state-of-the-art systems like STORM (NAACL’24), IReRa, PATH, and PAPILLON (NAACL’25), and optimizers like MIPRO (EMNLP’24) and BetterTogether (EMNLP’24).
(*) Retrieval-based NLP Systems like ColBERT-QA (TACL’21), Baleen (NeurIPS’21 Spotlight), Hindsight (ICLR’22), and ARES (NAACL’24).

(2) Developing Effective & Efficient Retrieval Models -- Khattab built the ColBERT retrieval model, which has been central to the development of the modern landscape of information retrieval. In this line of work, his research develops:
(*) Retrieval Models like ColBERT (SIGIR’20), ColBERTv2 (NAACL’22), and UDAPDR (EMNLP’23).
(*) Scalable Retrieval Infrastructure like PLAID (CIKM’22), WARP, and DeepImpact (SIGIR’21).

Recent Work