Entry Date:
February 11, 2019

Bridge the Gap Between Human and Machine Vision


The past five years have seen considerable progress in using deep neural networks (DNNs) to model responses in the visual cortex. DNNs are now the most successful biologically inspired models in computer vision, making them invaluable tools for studying the computations performed by the human visual system. Recent work has shown that these models achieve accuracy on par with human performance in many tasks. We have also shown that computer vision models share a hierarchical correspondence with neural object representations.

DNNs have typically adopted a feedforward architecture that sequentially transforms visual signals into complex representations, akin to the human ventral stream. Although models with a purely feedforward architecture can easily recognize whole objects, they often mislabel objects in challenging conditions, such as incongruent object-background pairings or ambiguous and partially occluded inputs. In contrast, models that incorporate recurrent connections are robust to partial occlusion, suggesting that recurrent processing is important for object recognition.
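
The contrast between a single feedforward sweep and recurrent processing can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions, not the models used in this work: the layer sizes, random weights, ReLU nonlinearity, and the function names `feedforward_pass` and `recurrent_pass` are all hypothetical. It shows that the first time step of a simple recurrent layer matches the feedforward response, while later time steps let the layer's own state reshape that response.

```python
import numpy as np


def relu(x):
    """Rectified linear unit, applied elementwise."""
    return np.maximum(x, 0.0)


def feedforward_pass(x, weights):
    """One bottom-up sweep: each layer transforms the previous output once."""
    h = x
    for W in weights:
        h = relu(W @ h)
    return h


def recurrent_pass(x, W_in, W_rec, n_steps):
    """Hold the input fixed and let the layer's state feed back for n_steps.

    Returns the list of hidden states, one per time step.
    """
    h = np.zeros(W_rec.shape[0])
    states = []
    for _ in range(n_steps):
        # Bottom-up drive plus feedback from the layer's previous state.
        h = relu(W_in @ x + W_rec @ h)
        states.append(h.copy())
    return states


# Hypothetical toy dimensions and random weights (assumptions for illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=8)                       # stand-in for a visual input
W_in = rng.normal(scale=0.3, size=(4, 8))    # bottom-up weights
W_rec = rng.normal(scale=0.3, size=(4, 4))   # recurrent (lateral) weights

ff_out = feedforward_pass(x, [W_in])
rec_states = recurrent_pass(x, W_in, W_rec, n_steps=5)
```

Because the recurrent state starts at zero, `rec_states[0]` equals `ff_out`; the later entries of `rec_states` show how the representation evolves over time, which is the kind of temporal behavior that timing constraints from the ventral stream could be compared against.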

To continue bridging the gap between human and computer vision, we explore how the duration and sequencing of ventral stream processes can be used as constraints to guide the development of computational models with recurrent architectures.