Entry Date:
February 28, 2007

Supervised Meta Optimization

Principal Investigator Saman Amarasinghe


Compiler development is a black art. Compiler writers are expected to create effective and inexpensive solutions to NP-hard problems such as instruction scheduling and register allocation. Because compilers cannot practically find optimal solutions to such problems, compiler writers devise heuristics that find approximate solutions for a large class of applications. However, to achieve satisfactory performance, developers are forced to tweak their heuristics iteratively.

To this end, we propose Meta Optimization: the use of machine-learning techniques to automatically search for effective compiler heuristics. We have applied two different machine-learning approaches to this problem. One technique uses genetic programming to optimize compiler priority functions, while the other uses a labeled training set to create a classifier. It is important to note that in both cases, we are improving the compiler, not the application that is being compiled. In other words, after applying our techniques, the compiler's effectiveness will improve; we expect speedups on benchmarks for which the compiler heuristic was not trained. Furthermore, neither of the two approaches discussed below increases compilation time.

Computing unroll factors using supervised learning: In this work we focus on loop unrolling, a well-known optimization for exposing instruction level parallelism. Using the Open Research Compiler as a testbed, we demonstrate how one can use supervised learning techniques to model the appropriateness of loop unrolling.
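To make the transformation concrete, here is a source-level sketch of what loop unrolling does, using a simple array-sum loop. The unroll factor of 4 is an arbitrary choice for illustration; the real optimization operates on the compiler's intermediate representation, not Python source.

```python
def sum_rolled(xs):
    """Original loop: one element per iteration."""
    total = 0
    for i in range(len(xs)):
        total += xs[i]
    return total

def sum_unrolled_by_4(xs):
    """Unrolled by 4: fewer loop-condition tests per element, and four
    independent additions per iteration expose instruction-level parallelism."""
    total = 0
    n = len(xs)
    i = 0
    while i + 4 <= n:
        total += xs[i] + xs[i + 1] + xs[i + 2] + xs[i + 3]
        i += 4
    while i < n:  # epilogue handles the leftover iterations
        total += xs[i]
        i += 1
    return total
```

Whether unrolling actually helps depends on properties of the loop (trip count, body size, register pressure), which is exactly what the learned model must predict.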

We use more than 1,100 loops -- drawn from 46 benchmarks -- to train supervised learning algorithms to recognize when loop unrolling is advantageous. The resulting classifiers can predict with high accuracy whether a novel loop (i.e., one that was not in the training set) benefits from loop unrolling. We evaluate the ramifications of improved prediction accuracies using the Open Research Compiler (ORC) and the Itanium 2 architecture.
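The shape of the supervised-learning setup can be sketched as follows. The loop features (trip count, body size in operations) and labels here are invented stand-ins; the real system extracts far richer features from the ORC's representation of each of the 1,100+ loops, and uses stronger classifiers than this minimal nearest-neighbor example.

```python
import math

# Hypothetical labeled training set:
# (trip_count, body_ops) -> does the loop benefit from unrolling?
training = [
    ((100, 4), True),    # hot loop, tiny body: unrolling pays off
    ((200, 6), True),
    ((8, 40), False),    # short trip count, large body: little benefit
    ((4, 25), False),
]

def predict_unroll(loop_features):
    """1-nearest-neighbor classifier: return the label of the closest
    training loop in feature space."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    _, label = min(training, key=lambda ex: dist(ex[0], loop_features))
    return label
```

A trained classifier of this kind is consulted once per loop at compile time, which is why the approach adds essentially no compilation overhead.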

Automatically creating priority functions using genetic programming: In this work we use genetic programming to automatically search for effective compiler priority functions. Priority functions are nearly ubiquitous in compiler algorithms. For instance, instruction schedulers use priority (cost) functions to determine which instruction to schedule first. By restricting our search to priority functions -- rather than searching for an entire heuristic -- we drastically reduce the size of the search space. Furthermore, the quality of code generated by a compiler is highly dependent on its priority functions. We investigate priority functions for three optimizations: hyperblock formation, register allocation, and data prefetching. We show that optimizing the priority functions associated with these optimizations substantially improves performance: a 23% improvement for hyperblock formation when creating application-specific compilers, and a 9% improvement for a general-purpose compiler.
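The genetic-programming search can be sketched as follows. A candidate priority function is an expression tree over instruction features; here the features, the toy instruction set, and the "oracle" priority that fitness is measured against are all invented for illustration -- in the real system, fitness comes from compiling and running benchmarks with the candidate priority function installed in the compiler.

```python
import random

random.seed(0)

# Hypothetical instruction features for a toy scheduling problem.
FEATURES = ["latency", "succs"]
OPS = ["+", "-", "*"]

def random_tree(depth=2):
    """Grow a random expression tree over features, constants, and operators."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(FEATURES + [random.uniform(-2.0, 2.0)])
    return (random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, feats):
    """Evaluate an expression tree on one instruction's feature dictionary."""
    if isinstance(tree, str):
        return feats[tree]
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    a, b = evaluate(left, feats), evaluate(right, feats)
    return a + b if op == "+" else a - b if op == "-" else a * b

# Invented stand-in for measured benchmark performance: a "true" priority
# that a good candidate should agree with when ranking instructions.
def oracle(feats):
    return feats["latency"] + 2 * feats["succs"]

INSTRS = [{"latency": random.randint(1, 10), "succs": random.randint(0, 5)}
          for _ in range(30)]

def fitness(tree):
    """Fraction of instruction pairs the candidate ranks like the oracle."""
    good = pairs = 0
    for i in range(len(INSTRS)):
        for j in range(i + 1, len(INSTRS)):
            a, b = INSTRS[i], INSTRS[j]
            pairs += 1
            if (evaluate(tree, a) > evaluate(tree, b)) == \
               (oracle(a) > oracle(b)):
                good += 1
    return good / pairs

# Evolve: keep the 10 fittest trees each generation, refill the population
# with mutants of the survivors (crude whole-tree mutation for brevity;
# real GP also uses subtree crossover).
def mutate(tree):
    return random_tree() if random.random() < 0.5 else tree

pop = [random_tree() for _ in range(40)]
for gen in range(15):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(30)]

best = max(pop, key=fitness)
```

Because the search only explores priority functions, every candidate plugs into a fixed, otherwise unchanged heuristic, which is what keeps the search space tractable.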