ILP Institute Insider
June 3, 2010
The Zen of Zue
CSAIL Lab Director Victor Zue Makes Machines Talk…
And Blue Chip Corporate Sponsors Listen
It was the title of an award-winning situation comedy that ran for eight years on television.
It’s also the long-running question posed every day by Victor Zue, Director of MIT’s Computer Science and Artificial Intelligence Lab (CSAIL). Aimed at the sometimes frustrating relationship between humans and the speech technology behind automated customer “service” phone systems, Zue’s perpetual query du jour is simply:
“Who’s the boss?”
“What bugs me is that in examining the interaction between those systems and humans, you have to question who is really in charge of the conversation,” says Zue with equal parts good-humored impatience and scholarly interest in this intersection of speech understanding and artificial intelligence, his research specialty.
“We should be the ones who decide how to interact with these machines, but today those systems dictate how you talk to them because they’re still too dumb to deal with free-form dialogue,” he explains. “They force you to engage in their restricted directed dialogs to make reservations, resolve a service problem or other task instead of being capable of having a natural language conversation as you would with a travel agent, for example.”
Over the last decade, these systems have admittedly come a long way in understanding what people say. But speech recognition alone, adds Zue, is only one of the many human language technology components needed to create a truly intelligent system: one that can hear the subtle difference between “Austin” and “Boston” (machines are still “hard of hearing,” says the researcher), analyze context, and incorporate other information so that these systems can finally carry on what passes for intelligent interaction with users.
With further refinement, they will also be able to analyze online sources, through what is called aggregate review and other capabilities, and extract useful information from them: the best places to eat, the most interesting things to do, where to go, how best to get there, and much more.
CSAIL and the Zen of Zue
Among the prototypes coming out of the Spoken Language Systems Group, which Zue founded and led before becoming director of CSAIL, are the Mercury system, a conversational flight schedule and passenger reservation system, and the Jupiter system, an effort to perfect a natural language interface that retrieves weather information.
“They’re pretty good now,” says Zue with characteristic modesty. “Most people who use them are happy with the experience and want to use them again. The other challenge in building systems like these, though, is to build the technology in a way that makes it easy to create new applications to help people find the many different types of information they need. Establishing a generic technology that lets you rapidly develop and port these applications to different domains is a serious research topic we deal with.”
It’s also one of many research topics being pursued at CSAIL, MIT’s largest research laboratory by far, with more than 850 staff, approximately 100 of whom are Principal Investigators, representing a cross-section of talent from ten departments and four MIT schools.
With approximately 50 research groups working on hundreds of diverse projects, CSAIL researchers are focused on finding innovative ways to make systems and machines operate faster, better, safer, easier, and more efficiently for the benefit of humanity. That talent is being applied to such disciplines as robotics, genomics, computer graphics, computer security, cloud computing, and healthcare, to name a few areas of interest.
In addition to creating an environment conducive to finding solutions to the challenges CSAIL researchers are exploring, Zue is also interested in finding corporate sponsors with whom to collaborate in joint research efforts, both through the MIT Industrial Liaison Program (ILP), and CSAIL’s own Industry Affiliates Program (IAP).
The IAP currently numbers among its members IBM, Microsoft, Google, Yahoo!, Cisco, Ford, Northrop Grumman, Nokia, Akamai, and Quanta Computer (the Taiwanese OEM for Apple, HP, Lenovo, Sony, and others, and producer of one-third of the world’s laptop computers).
“The IAP is in its third year of being our industry gateway for companies interested in working with us and it provides an introductory exposure to what we do, how we do it, and what we might be able to do together,” Zue explains. “The goal is to transition them from there to becoming a sponsor.”
“We did not set out to recruit exclusively the world’s leading companies. However, it is apparent that companies with a commitment to being the best – those with long-term vision and a reputation for being forward-thinking providers of very successful products and technologies – believe that CSAIL can help them maintain their excellence.”
In fact, corporate interest in sponsoring research has increased dramatically in recent years, the CSAIL director points out.
“Today, approximately 25% of our research funding comes from industry, which is quite a change from fifteen years ago when nearly 100% of funding came from government-sponsored research,” notes Zue. “In part, that’s because our field has matured and business sees more commercial possibilities from our work, but it’s also due to a conscious effort on our part to make sure our work is relevant. Working with industry gives us the opportunity to see technology move into the real world.”
Djoo Don’t Say
Paradoxically, experience as a foreign student in the real world of American higher education led Victor Zue to move into the digital world of machine/human interaction and speech recognition. Because he couldn’t speak English well at first, his desire to converse like a native speaker triggered his interest in understanding what constitutes native language ability and how humans and speech recognition systems might ultimately negotiate its many subtleties.
“I wanted to know things like how did you go from ‘did you’ to ‘djoo,’” remembers Zue. “From there it was a short distance to the field of spoken dialogue systems for me and how computer-based systems could be made to understand English and interact intelligently with humans.”
Made to understand English, yes. Made to respond appropriately on a consistent basis to human interrogators, not so much.
“We have a lot of blooper tapes that captured some funny dialogues,” Zue admits.
An early city guide spoken dialogue system, for example, was being demonstrated to a visitor who would ask it questions, among them how to get from one location to another. The machine was working perfectly and after its final answer the visitor said, “Thank you very much.” The system replied with the out-of-left-field but institution-pleasing default, “What would you like to know about MIT?”
“Sometimes machines don’t know what they don’t know, which can make things frustrating,” says the lab director philosophically. “They’re made to take the domain knowledge they have and maximize the validity of what they think is the best answer when they should just be able to reply, ‘I don’t understand what you’re saying.’ Unfortunately, the machine’s knowledge base is still only what you put into it.”
Some of CSAIL’s own collective knowledge base will be on display at next year’s MIT 150th Anniversary celebration, for which the lab was one of a half dozen university entities chosen to give a day-and-a-half symposium in April on a cross-section of key lab disciplines. “We are organizing the event with the ILP, and the program will feature MIT thought leaders on the Institute’s impact and role in the Information Age,” says Zue.
Until that anniversary symposium, Zue will continue trying to solve the mysteries of the human/machine language interface, content that he chose the right career path after all.
“If I hadn’t gone into this branch of computer science I might have become a medical doctor,” says Zue in a tone that conveys equal parts confidence and relief at a narrow escape. “I’m so glad I didn’t go there.”