DiCarlo Laboratory

Principal Investigator James DiCarlo

Project Website http://dicarlolab.mit.edu/

How do you recognize the items on your desk? The faces of your loved ones? The words on this page? The research goal is to understand the neuronal algorithms and circuits that underlie visual object recognition -- an understanding that might help change the world. Concretely, we seek to understand how the visual system transforms each image from an initial, pixel-like representation to a new, remarkably powerful form of representation -- one that can support our seemingly effortless ability to solve these object recognition tasks in the real world. We are focused on the core "invariance" problem -- the ability to distinguish among objects despite dramatic image variation. To approach this very difficult problem, the work of our research group is directed along three main lines:

(1) Elucidating Neuronal Object Codes -- One key direction is to experimentally measure and analyze the patterns of neuronal spiking activity (“codes”) found at the highest levels of the ventral visual stream (primate inferior temporal cortex, IT). At this high level, those neuronal codes have solved the “invariance” problem. While one should not be surprised that such codes exist in the brain, their discovery and continued deeper understanding enable us to focus on the algorithms that construct the codes.

(2) The Quest for Underlying Algorithms -- Discovering the key algorithms requires a tight interplay between experiment and theory. For example, we recently discovered that the key invariance properties of neuronal object codes are plastic and can be built from unsupervised, natural visual experience. To explore the potential power of such ideas, we and our collaborators implement and screen large families of brain-constrained models and test them on real-world problems. More generally, we are building a systematic foundation to bring together neuronal data, mechanistic models, and human recognition performance.

(3) The Circuits that Implement those Algorithms -- Clever computational algorithms do not exist in a vacuum, but must be implemented in specific neuronal circuits in the brain tissue. We employ high-resolution MRI and fMRI, microfocal stereo x-ray methods, and optogenetic tools to understand the spatial layout of those circuits in the ventral visual cortex. This information will provide clues about the algorithms at work. It will also allow us to interact with those neuronal circuits, both to test hypotheses and potentially to enable new brain-machine interfaces.

The research goal of our laboratory is to understand the mechanisms underlying visual object recognition. Specifically, we seek to understand how sensory input is transformed by the brain from an initial representation (essentially a photograph on the retina) to a new, remarkably powerful form of representation -- one that can support our seemingly effortless ability to solve the computationally difficult problem of object recognition. We are particularly focused on patterns of neuronal activity in the highest levels of the ventral visual stream (primate inferior temporal cortex, IT) that likely directly underlie recognition. At these high levels, individual neurons can have the remarkable response property of being highly selective for object identity, even though each object’s image on the retinal surface is highly variable -- for example, due to changes in object position, distance, pose, lighting, and background clutter. Understanding the creation of such neuronal responses by transformations carried out along the ventral visual processing stream is the key to understanding visual recognition.

To approach these very difficult problems, the work of our laboratory is directed along three main lines: 1) characterize the computational usefulness of patterns of IT neuronal activity for supporting immediate visual object recognition, 2) test and develop computational theories of how visual input is transformed along the ventral processing stream from a pixel-wise representation to a powerful representation in IT, and 3) understand the spatial organization of this representation. Our primary research approaches are neurophysiology in awake, behaving non-human primates, functional brain imaging (fMRI), human psychophysics, and computational modeling. Across all of these endeavors, we aim to develop innovative methods and tools to facilitate this work in our laboratory and others. Our approaches are often synergistic with those of other MIT laboratories, and this has greatly enhanced our progress.