Prof. Aude Oliva

Director of Strategic Industry Engagement, MIT Schwarzman College of Computing
Director, MIT-IBM Watson AI Lab
Co-Lead, MIT AI Hardware Program
Senior Research Scientist, CSAIL

Primary DLC

Computer Science and Artificial Intelligence Laboratory

MIT Room: 32-D432

Assistant

Samantha Smiley
smileys@mit.edu

Areas of Interest and Expertise

Automatic Visual Understanding
Computational Perception and Cognition
Human Visual Intelligence
Big Data

Research Summary

In her role as director of strategic industry engagement, Professor Oliva develops and manages relationships between the College and its corporate collaborators. The goal of these academic-industry collaborations is to develop and translate novel computing and artificial intelligence research into tools with real-world impact. To that end, she works with partners interested in large-scale, multi-faceted engagements at MIT spanning research, student support, community building, and public outreach. In her stewardship role in the MIT AI Hardware Program, Oliva builds and facilitates pipelines that deliver AI hardware and software systems with significantly improved energy efficiency, and she promotes career opportunities and visibility for students, researchers, and participating companies. As inaugural lead of the MIT-Amazon Science Hub, she launched the multi-year collaboration between MIT and Amazon to support innovative research in AI, robotics, computing, and engineering. As of Fall 2022, Oliva serves on the committee on Research Computing and Data in the new Office of Research Computing and Data, part of the Office of the Vice President for Research.

Professor Oliva's cross-disciplinary research in Computational Neuroscience, Computational Cognition, and Computer Vision bridges theory, experiment, and application, accelerating discovery by bringing a novel way of thinking to problems that cross these fields.

(*) Computational Neuroscience -- High-resolution, spatiotemporally resolved neuroimaging is a sort of Holy Grail for neuroscience: it would let us capture when, where, and in what form information flows through the human brain during mental operations. Within the team, we study the fundamental neural mechanisms of human perception and cognition and develop computational models inspired by brain architecture. We are developing a state-of-the-art human brain mapping approach that fuses functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), and convolutional neural network (CNN) modeling to investigate the neural flow of perceived or imagined events. Unpacking the structure of operations such as sensory perception, memory, imagination, action, and prediction in the human brain has far-reaching implications for understanding not just typical brain function, but also how these functions can be maintained or even augmented in the face of internal (disease or injury) and external (information overload) challenges.
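A minimal sketch of one common way such MEG-fMRI fusion can be framed computationally is representational similarity analysis: a representational dissimilarity matrix (RDM) computed from an fMRI region is correlated with MEG RDMs at each time point, yielding a time course of when that region's representational geometry appears in the MEG signal. The array shapes, variable names, and random data below are illustrative assumptions, not the lab's actual pipeline.

```python
# Illustrative sketch of MEG-fMRI fusion via representational similarity
# analysis (RSA); shapes and names are hypothetical.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rdm(patterns):
    """Condensed representational dissimilarity matrix:
    correlation distance between response patterns for every pair of conditions."""
    return pdist(patterns, metric="correlation")

# Hypothetical data: responses to n_conditions stimuli.
n_conditions = 92
meg = np.random.randn(n_conditions, 306, 120)   # conditions x MEG sensors x time points
fmri_roi = np.random.randn(n_conditions, 500)   # conditions x voxels in one brain region

fmri_rdm = rdm(fmri_roi)

# Correlate the fMRI RDM with the MEG RDM at every time point:
# the resulting curve shows when the region's representational
# geometry emerges in the MEG signal.
fusion_timecourse = np.array([
    spearmanr(rdm(meg[:, :, t]), fmri_rdm).correlation
    for t in range(meg.shape[2])
])
```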

(*) Computational Cognition -- Understanding cognition at the individual level facilitates communication between natural and artificial systems, leading to better interfaces, devices, and neuroprosthetics for both healthy and disabled people. Our work has shown that events carry an attribute of memorability: a predictive value of whether a novel event will later be remembered or forgotten. Memorability is not an inexplicable phenomenon: people tend to remember and forget the same images, faces, words, and graphs. Importantly, we are developing computational models that predict what people will remember, either as they encode an event or even before they witness it. Cognitive-level algorithms of memory will be a game changer for society, with applications ranging from accurate medical diagnostic tools to educational materials that anticipate people's needs and compensate when cognition fails.
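A minimal sketch of how memorability prediction can be posed as supervised regression: a pretrained CNN backbone is fine-tuned to map an image to a scalar memorability score. The backbone choice, hyperparameters, and fake batch below are illustrative assumptions, not the group's published models.

```python
# Hypothetical sketch: regressing an image's memorability score with a CNN.
import torch
import torch.nn as nn
from torchvision import models

class MemorabilityRegressor(nn.Module):
    """Pretrained backbone with a single-output head predicting a
    scalar memorability score in [0, 1] for each image."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights="IMAGENET1K_V1")
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)
        self.backbone = backbone

    def forward(self, images):
        return torch.sigmoid(self.backbone(images)).squeeze(1)

model = MemorabilityRegressor()
criterion = nn.MSELoss()   # regress toward human memorability scores
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a fake batch.
images = torch.randn(8, 3, 224, 224)   # batch of images
scores = torch.rand(8)                 # ground-truth memorability in [0, 1]
optimizer.zero_grad()
loss = criterion(model(images), scores)
loss.backward()
optimizer.step()
```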

(*) Computer Vision -- Inspired by strategies from human vision and cognition, we build deep learning models of object, place, and event recognition. To this end, we are building a core of visual knowledge (e.g., Places, a large-scale dataset of 10 million annotated images, and Moments in Time, a large-scale dataset of 1 million annotated short videos) that can be used to train artificial systems on visual and auditory event understanding and common-sense tasks, such as identifying where the agent is (i.e., the place), what objects are within reach, what surprising events may occur, which actions people are performing, and what may happen next.
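A minimal sketch of how a scene-recognition network trained on such a dataset might be queried to answer "where is the agent?": the image is preprocessed, passed through the network, and the top-scoring scene categories are read off. The checkpoint file, label file, and image name are hypothetical placeholders.

```python
# Illustrative sketch: classifying the scene category of an image with a CNN
# trained on a large scene dataset such as Places; file names are hypothetical.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(num_classes=365)                     # 365 scene categories
model.load_state_dict(torch.load("places365_resnet50.pth"))  # hypothetical checkpoint
model.eval()

labels = [line.strip() for line in open("places365_labels.txt")]  # hypothetical label file

image = preprocess(Image.open("kitchen.jpg")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(image), dim=1)[0]

top5 = probs.topk(5)
for p, idx in zip(top5.values, top5.indices):
    print(f"{labels[idx.item()]}: {p.item():.2f}")  # e.g. the agent is likely in a kitchen
```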

Recent Work