Entry Date:
December 27, 2011

Smart Data Structures

Principal Investigator Anant Agarwal


As multicore processors become prevalent, the complexity of programming is skyrocketing. One major difficulty is efficiently orchestrating collaboration among threads through shared data structures. Unfortunately, choosing and hand-tuning data structure algorithms to get good performance across a variety of machines and inputs is a herculean task to add to the fundamental difficulty of getting a parallel program correct. To help mitigate these complexities, we have developed a new class of parallel data structures called Smart Data Structures that leverage online machine learning to adapt automatically. We have prototyped and evaluated an open source library of Smart Data Structures for common parallel programming needs and demonstrated significant improvements over the best existing algorithms under a variety of conditions. Results indicate that learning is a promising technique for balancing and adapting to complex, time-varying tradeoffs and achieving the best performance available.

Standard parallel data structures consist of data storage, an interface, and algorithms. The storage organizes the data, the interface specifies the operations threads may apply to the data to manipulate it, and the algorithms implement the interfaces while preserving the correct concurrent semantics. Storage and algorithms are often controlled by knobs: the thresholds or other parameters that program implementation behaviors and heuristics. Knobs are typically configured via one-size-fits-all static defaults provided by the library programmer. When the defaults perform sub-optimally, programmers must hand-tune the knobs; typically, they do so through trial and error and special cases in the code which increase complexity and reduce readability. Though often necessary, dynamic tuning of the knobs is typically beyond the reach of the programmer.

Smart Data Structures avoid the need for cumbersome hand-tuning and provide dynamic tuning by augmenting standard data structures with an online learning engine that automatically optimizes the knobs. Through learning, Smart Data Structures balance complex tradeoffs to find ideal knob settings and react to changes in the system or inputs that affect these tradeoffs.