Entry Date: October 25, 2017

A Vertically-Integrated Approach to Resource Efficient Shared Computing

Principal Investigator: Daniel Sanchez Martin


Current shared computing platforms, from small clusters to large datacenters, suffer from low utilization, wasting billions of dollars in energy and infrastructure every year. Low utilization stems from a disconnect between layers of the hardware and software stack. The goal of this proposal is to investigate and develop integrated intra- and inter-node resource management techniques that provide both near-peak utilization and guaranteed high performance in shared environments.

To this end, this project consists of three main thrusts:

(1) Elastic multicore systems, which combine recent hardware support for fast resource management with a novel software runtime to make hardware adaptation work for, not against, performance guarantees. Elastic multicores will use different hardware resources (such as cores, caches, and power) to achieve a given performance target as efficiently as possible, and safely share resources among guaranteed-performance and best-effort applications.

(2) Novel solutions to enable collaborative multi-tenancy, where resource-intensive workloads are co-scheduled and placed using fine-grained, automatically collected resource usage profiles that account for factors such as cache and memory bandwidth sharing.

(3) A shared system prototype that enables QF computing users to aggressively colocate applications on shared many-core nodes. The system will guarantee the latency requirements of performance-critical tasks (such as Al Jazeera video processing) while achieving high system utilization through intelligent placement of batch tasks such as HPC and data analytics jobs.
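As a rough illustration of the kind of profile-driven placement described in thrusts (2) and (3), the sketch below greedily colocates tasks on nodes using per-task resource profiles. It is not the project's actual scheduler: the `Task`, `Node`, and `place` names, the three profiled resources, the ordering rule (latency-critical tasks first), and the tightest-fit heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Hypothetical set of profiled resources; a real system may track others.
RESOURCES = ("cores", "cache_mb", "mem_bw_gbps")


@dataclass
class Task:
    name: str
    demand: dict                    # profiled demand per resource
    latency_critical: bool = False  # guaranteed-performance vs. best-effort


@dataclass
class Node:
    name: str
    capacity: dict
    tasks: list = field(default_factory=list)

    def free(self):
        """Remaining capacity per resource after the tasks already placed."""
        return {
            r: self.capacity[r] - sum(t.demand[r] for t in self.tasks)
            for r in RESOURCES
        }

    def fits(self, task):
        free = self.free()
        return all(task.demand[r] <= free[r] for r in RESOURCES)

    def slack_after(self, task):
        """Sum of normalized free capacity if `task` were placed here."""
        free = self.free()
        return sum((free[r] - task.demand[r]) / self.capacity[r] for r in RESOURCES)


def place(tasks, nodes):
    """Greedy placement. Returns (assignment, unplaced task names).

    Assumed policy: latency-critical tasks are placed first so their
    demands are reserved; batch tasks then fill the remaining capacity,
    largest core demand first. Each task goes to the fitting node that
    would be left with the least slack (tightest fit).
    """
    assignment, unplaced = {}, []
    ordered = sorted(tasks, key=lambda t: (not t.latency_critical, -t.demand["cores"]))
    for task in ordered:
        candidates = [n for n in nodes if n.fits(task)]
        if not candidates:
            unplaced.append(task.name)
            continue
        best = min(candidates, key=lambda n: n.slack_after(task))
        best.tasks.append(task)
        assignment[task.name] = best.name
    return assignment, unplaced
```

A scheduler along these lines would be fed profiles collected automatically at run time. The sketch reserves capacity statically and ignores interference effects that a real system would have to model, such as cache contention that per-task demands do not capture.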