Principal Investigator Samuel Madden
Project Website http://bigdata.csail.mit.edu/?q=node/36
BlinkDB is a database system that runs on top of Hadoop (MapReduce): it accepts SQL queries and translates them into MapReduce jobs. The key idea is that, rather than running a query over the entire data set, it runs the query on a precomputed random sample of the data and uses sampling theory to estimate the true query answer.
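To make the sampling idea concrete, here is a minimal sketch in Python of estimating a SUM aggregate from a random sample. It is not BlinkDB code: the function name `estimate_sum`, the synthetic data, the uniform sampling scheme, and the normal-approximation confidence interval are all assumptions chosen for illustration, and the description above does not say how BlinkDB itself builds its samples or computes its estimates.

```python
import math
import random


def estimate_sum(sample, population_size, z=1.96):
    """Estimate SUM over a full data set from a uniform random sample.

    Hypothetical helper for illustration only. Returns (estimate, half_width),
    where half_width is an approximate confidence-interval half-width based on
    the normal approximation (z=1.96 corresponds to roughly 95%).
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two sampled rows")
    mean = sum(sample) / n
    # Unbiased sample variance of the sampled values.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # Standard error of the sample mean (finite-population correction ignored).
    stderr = math.sqrt(variance / n)
    # Scale the per-row mean up to the size of the full data set.
    return population_size * mean, z * population_size * stderr


# Synthetic stand-in for one numeric column of a large table.
rng = random.Random(0)
population = [rng.uniform(0, 100) for _ in range(100_000)]

# The sample is drawn once, ahead of time, and reused across queries.
sample = rng.sample(population, 1_000)

estimate, half_width = estimate_sum(sample, len(population))
true_sum = sum(population)
print(f"estimated SUM: {estimate:.0f} +/- {half_width:.0f}")
print(f"true SUM:      {true_sum:.0f}")
```

The query here touches 1,000 rows instead of 100,000, and the half-width gives a rough indication of how far the estimate may be from the true answer. A larger sample narrows that interval at the cost of more work per query.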