Entry Date:
January 19, 2017

Making Big Data Accessible on Personal Devices: Big Network Algorithms, External Memory, and Data Streams

Principal Investigator Erik Demaine

Project Start Date December 2015

Project End Date
 November 2019

Big networks are constantly growing in both size and relevance: from social networks such as Facebook and Twitter, to brain networks, gene regulatory networks, and health/disease networks. The traditional approach to analyzing such big datasets is to use powerful supercomputers (clusters), ideally large enough to store the data in main memory. The downsides to this approach are that many potential users of big data lack such powerful computational resources (e.g. point-of-sale Bitcoin blockchain analysis), and it can be difficult to solve unexpected problems within such a large infrastructure (e.g. image analysis after the Boston Marathon Bombing). The algorithms developed in this project will enable the processing of huge datasets on computational devices with a limited amount of fast memory, connected to a relatively slow external data source.

This project will investigate the extent to which complex network analysis can be performed on a single computer, even a mobile device such as a smartphone. To this end, the project will develop external-memory, cache-oblivious, and streaming algorithms for analyzing and understanding big network data, even on relatively weak computational devices. These algorithms will make big data analysis accessible to a much broader audience, enabling new applications. The approach uniquely combines advanced algorithmic techniques, including approximation algorithms, parameterized algorithms, graph algorithms, graph structure theory, and computational geometry, to solve real-world problems on big networks.