Dr. James R. Glass

Senior Research Scientist
Head, Spoken Language Systems Group (SLS)

Primary DLC

Computer Science and Artificial Intelligence Laboratory

MIT Room: 32-G444

Areas of Interest and Expertise

Speech Recognition and Understanding
Unsupervised Speech Pattern Discovery
Acoustic Scene Analysis
Language Acquisition
Spoken Language Systems (SLS)
Human/Machine Communication
Signal Representation
Pattern Classification

Recent Work

  • Video

    Jim Glass - 2018 R&D Conference

    November 21, 2018 | Conference Video | Duration: 38:41

    Towards Learning Spoken Language through Vision

    Despite continuous advances over many decades, automatic speech recognition remains fundamentally a supervised learning problem that requires large quantities of annotated training data to achieve good performance. This requirement is arguably the major reason that fewer than 2% of the world's languages have achieved some form of ASR capability. Such a learning scenario also stands in stark contrast to the way humans learn language, which inspires us to consider approaches that involve more learning and less supervision.

    In our recent research towards unsupervised learning of spoken language, we are investigating the role that visual contextual information can play in learning word-like units from unannotated speech. This talk will outline our ongoing research at CSAIL to develop deep learning models that associate images with unconstrained spoken descriptions, and will present analyses indicating that the models learn correspondences between objects in images and their spoken instantiations (a minimal illustrative sketch of such an association model follows this entry).

    2018 MIT Research and Development Conference
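
A minimal Python/PyTorch sketch of one common way to set up the kind of audio-visual association model described in the abstract above: an image encoder and a speech encoder map both modalities into a shared embedding space, trained with a margin-based ranking loss so that matched image/spoken-caption pairs score higher than mismatched in-batch pairs. All module names, dimensions, and hyperparameters below are assumptions for illustration, not the SLS group's actual implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ImageEncoder(nn.Module):
        """Maps an image (3 x 224 x 224) to a unit-length embedding."""
        def __init__(self, embed_dim=512):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),              # global spatial pooling
            )
            self.proj = nn.Linear(64, embed_dim)

        def forward(self, images):
            x = self.conv(images).flatten(1)
            return F.normalize(self.proj(x), dim=-1)

    class AudioEncoder(nn.Module):
        """Maps a log-mel spectrogram (1 x 40 x T) of a spoken caption to an embedding."""
        def __init__(self, embed_dim=512):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 64, (40, 11), padding=(0, 5)), nn.ReLU(),  # collapse the frequency axis
                nn.AdaptiveAvgPool2d(1),              # pool over time
            )
            self.proj = nn.Linear(64, embed_dim)

        def forward(self, spectrograms):
            x = self.conv(spectrograms).flatten(1)
            return F.normalize(self.proj(x), dim=-1)

    def ranking_loss(img_emb, aud_emb, margin=0.2):
        """Matched image/caption pairs should outscore in-batch impostors by `margin`."""
        sims = img_emb @ aud_emb.t()                  # (B, B) similarity matrix
        pos = sims.diag().unsqueeze(1)                # matched-pair similarities
        # Penalize impostor captions for each image and impostor images for each caption.
        cost = (margin + sims - pos).clamp(min=0) + (margin + sims - pos.t()).clamp(min=0)
        cost = cost.masked_fill(torch.eye(sims.size(0), dtype=torch.bool), 0.0)
        return cost.mean()

    # Usage: both encoders embed into one space, so retrieval in either direction
    # (image -> spoken caption, or spoken caption -> image) is a nearest-neighbor search.
    images = torch.randn(8, 3, 224, 224)
    speech = torch.randn(8, 1, 40, 1024)              # batch of 40-band spectrograms, 1024 frames
    loss = ranking_loss(ImageEncoder()(images), AudioEncoder()(speech))
    loss.backward()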