Where Industry Meets Innovation


32 mins
ILP Video

Big Data in Engineering

Sanjay Sarma
Professor of Mechanical Engineering
Director, MIT/SUTD Collaboration Office
Former Chairman of Research and Co-Founder of The Auto-ID Center at MIT
Big data is often associated with health care, social networks and computer systems. However, big data is also making an impact in engineering, in applications ranging from smart cities to manufacturing. I will give examples of big data collection systems for monitoring urban infrastructure, including buildings, street lights and vehicles. I will then talk about how RFID can be used to generate and utilize big data in the supply chain and in manufacturing. The proliferation of sensors is making it possible to collect data about things we had little information about in the past. But the generation of this data comes with challenges: what does one do with the big data? I will talk about the applications we have been developing over the years.

39 mins
ILP Video

Data Curation at Scale: The Data Tamer System

Michael Stonebraker
Adjunct Professor
MIT Computer Science and Artificial Intelligence Laboratory
Data curation is the act of discovering data sources of interest, cleaning and transforming the new data, semantically integrating it with other local data sources, and deduplicating the resulting composite. There has been much research on the various components of curation (especially data integration and deduplication). However, there has been little work on collecting all of the curation components into an integrated end-to-end system.

In addition, most of the previous work will not scale to the sizes of problems that we are finding in the field. For example, one web aggregator requires the curation of 80,000 URLs, and a biotech company has the problem of curating 8,000 spreadsheets. At this scale, data curation cannot be a manual (human) effort, but must entail machine learning approaches with a human assist only when necessary.

This talk describes Data Tamer, an end-to-end curation system we have built at M.I.T. and the Qatar Computing Research Institute (QCRI). It expects as input a sequence of data sources to add to a composite being constructed over time. A new source is subjected to machine learning algorithms to perform attribute identification, grouping of attributes into tables, transformation of incoming data and deduplication. When necessary, a human can be asked for guidance. Also, Data Tamer includes a data visualization component so a human can examine a data source at will and specify manual transformations.

We have run Data Tamer on three real-world enterprise curation problems, and it has been shown to lower curation cost by about 90% relative to the currently deployed production software.
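
The "machine first, human only when necessary" idea in this abstract can be illustrated with a small deduplication sketch. This is not Data Tamer's code or algorithm: the function names, the string-similarity measure (Python's standard `difflib`), and both thresholds are assumptions chosen purely for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def deduplicate(records, auto_merge=0.9, ask_human=0.6):
    """Deduplicate records, queueing uncertain pairs for human review.

    Hypothetical thresholds: a record whose best match scores at or above
    auto_merge is dropped as a duplicate; a best match between ask_human
    and auto_merge is kept but flagged for a person to check.
    """
    kept = []
    needs_review = []
    for rec in records:
        best_score, best_match = 0.0, None
        for existing in kept:
            score = similarity(rec, existing)
            if score > best_score:
                best_score, best_match = score, existing
        if best_score >= auto_merge:
            continue  # confident duplicate: no human needed
        if best_score >= ask_human:
            needs_review.append((rec, best_match))  # uncertain: ask a human
        kept.append(rec)
    return kept, needs_review


names = ["Acme Corp", "acme corp ", "Acme Corporation", "Globex"]
kept, needs_review = deduplicate(names)
```

Here the second name is merged automatically, while "Acme Corporation" is kept and paired with "Acme Corp" for review. A real system would compare structured attributes rather than single strings and would learn its thresholds rather than hard-code them.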

40 mins
ILP Video

Managing Innovation in a Crowd

Mohamed Mostagir
Postdoctoral Associate
MIT Laboratory for Information and Decision Systems
Crowdsourcing is an emerging technology where innovation and production are outsourced to the public through an open call. A major problem in crowdsourcing is that identifying skilled labor can be challenging, since anyone is allowed to participate and the fleeting, temporary nature of the relationship between firm and workers may provide an incentive for workers to misrepresent their skills. This misrepresentation increases the possibility of assigning tasks to workers who will not be able to finish them, leading to inefficiencies and lost revenues. We show that despite this lack of information about workers' skills, we can develop a compensation scheme that leads workers to not only reveal their true skills, but to sort themselves into an organizational hierarchy as if they were working cooperatively on the same project, even though they have no precise knowledge of each other and no inclination to cooperate.

36 mins
ILP Video

From Signals to Text-searchable repositories: Learning patterns from Big Data

Daniela Rus
Professor of Computer Science and Engineering
Director, MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)
When we need to solve an optimization problem, we usually use the best available algorithm or software, or try to improve it. We have started exploring a different approach: instead of improving the algorithm, reduce the input data and run the existing algorithm on the reduced data to obtain the desired output much faster. In this talk I will describe the core-set approach to realize this idea. I will also describe iDiary, a system that turns large GPS sensor signals collected from smartphones into textual descriptions of the trajectories and supports textual queries.
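
The "reduce the data, then reuse the existing algorithm" idea can be sketched as follows. This is a minimal illustration only: it uses plain uniform sampling with equal weights, the simplest stand-in for a core-set, not the constructions presented in the talk. The function names, the one-dimensional data, and the sample size are all assumptions.

```python
import random


def uniform_coreset(points, m, rng):
    """Pick m points uniformly at random, each carrying weight n/m.

    Assumption: uniform sampling stands in for a real core-set
    construction, which would typically sample points non-uniformly.
    """
    weight = len(points) / m
    return [(p, weight) for p in rng.sample(points, m)]


def weighted_mean(weighted_points):
    """The 'existing algorithm': the centroid of weighted 1-D points."""
    total_weight = sum(w for _, w in weighted_points)
    return sum(p * w for p, w in weighted_points) / total_weight


def cost(points, center):
    """Sum of squared distances from every point to the center."""
    return sum((p - center) ** 2 for p in points)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(5.0, 2.0) for _ in range(10_000)]

# Same algorithm, run once on all the data and once on the reduced set.
full_center = weighted_mean([(p, 1.0) for p in data])
small_center = weighted_mean(uniform_coreset(data, 500, rng))
```

The center computed from 500 weighted points should land close to the center computed from all 10,000, so evaluating `cost(data, small_center)` gives a value only slightly above the optimum while the algorithm touched 5% of the input.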

44 mins
ILP Video

"Divide and Conquer" Machine Learning to Exploit Big Data Knowledge Discovery

Una-May O'Reilly
Principal Research Scientist
Leader, Evolutionary Design and Optimization Group
Director, The Alfa Group: Any Scale Learning for All
MIT Computer Science and Artificial Intelligence Laboratory
Machine learning algorithms underpin advanced knowledge discovery from data. They provide data models which facilitate predictions and prescriptions. They support the detection of patterns and hidden insights from data. My group has taken a "Divide and Conquer" approach to knowledge discovery from large datasets by developing a suite of systems which provide elastic, robust and scalable communities of thousands of nodes, each executing a machine learning algorithm either independently or in coordination. I will describe some of these systems: FlexGP runs cloud-based genetic programming for regression, ECStar uses massive-scale volunteer compute to learn decision rules, BPPR derives the consensus of multiple clusterings, while SCALE adaptively fuses the results of heterogeneous ML algorithms. Some of these systems support our Blood Pressure Knowledge Discovery Project, which applies machine learning to a large repository of ICU patients' blood pressure waveforms.
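
A toy version of the divide-and-conquer pattern, covering only the "independent" case: split the data into shards, fit one model per shard, and fuse the predictions by averaging. This is not FlexGP, ECStar, BPPR or SCALE; the least-squares learner, the round-robin split, and the plain averaging are assumptions chosen to keep the sketch short.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single shard."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def split(items, k):
    """Deal items round-robin into k shards."""
    return [items[i::k] for i in range(k)]


def train_shards(xs, ys, k):
    """Fit one model per shard; each fit is independent of the others."""
    return [fit_line(sx, sy) for sx, sy in zip(split(xs, k), split(ys, k))]


def fused_predict(models, x):
    """Fuse the shard models by averaging their predictions."""
    return mean(a * x + b for a, b in models)


xs = [float(i) for i in range(100)]
ys = [2.0 * x + 1.0 for x in xs]  # noise-free toy data
models = train_shards(xs, ys, k=4)
```

Because each shard is fitted on its own, the `fit_line` calls could run on separate nodes. `fused_predict(models, 10.0)` then combines their answers, which for this noise-free data is approximately 21.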
