High-Thoughput Connectomics

Principal Investigator Nir Shavit

Project Website http://www.nsf.gov/awardsearch/showAward?AWD_ID=1447786&HistoricalAwards=false

Project Start Date September 2014

Project End Date  August 2018

Connectomics is the science of mapping the connectivity between neuronal structures to help us understand how brains work. Using the analogy of astronomy, connectomics researchers wish to build 'telescopes' that will allow scientists to accurately view the brain. However, as in astronomy, the raw data collected by microtomes and electron microscopes, the instruments of connectomics, is too large to store effectively, and must be analyzed at very high computation rates. Our goal is to research, develop, and deploy a software architecture that enables high-throughput analysis of connectomics data at the speed at which it is being acquired. We will develop the first computational infrastructure to support high-throughput connectomics without human intervention. If successful, this system will allow for the first time the mapping of a cortical column of a small mammalian brain (1 cubic millimeter), and hopefully within a few years the mapping of significant sections of a mammalian cortex.

The solution to the big data problem of connectomics is a new high-throughput connectomics software architecture that we call MapRecurse. MapRecurse, named so because it bears some resemblance to the widely used MapReduce framework, will provide a unified way of specifying sequences of computational steps and validation tests to be applied to the collected data. Key to MapRecurse will be the ability to layout data and computation in a structured way that preserves locality. Using it, programmers will be able to apply fast, less accurate segmentation algorithms to low resolutions of the data in order to quickly compute a first version of the output neural network graph. Domain-specific graph theoretical methods will then check for correctness of the graph and identify areas of inconsistencies that are in need of further refinement. MapRecurse will then apply bottom-up, local processing with slower, more accurate segmentation and reconstruction algorithms to higher resolutions of the data, verifying and correcting any errors. The iterations progress recursively and in parallel across multiple cores, giving the approach its name. We believe that MapRecurse and the data structures and algorithms developed here will find applications in other high-throughput applications, such as, in astronomy, biology, social media applications, or economics.