Ben Vigoda, the founder and CEO of a new startup, Gamalon, Inc., has created a technology that automates the process of model writing, transforming it from a tool for data scientists into a full-fledged machine learning technology that learns from data by itself.
Andrew Ng, one of the progenitors of Google Brain and an advocate for the resurgence of neural networks in recent years, told his origin story in a TED talk. He said that he tried to make a self-flying helicopter by programming a mathematical model of the helicopter, but no matter how hard he worked on his mathematical model of the aerodynamics and the helicopter mechanics, he could not get the helicopter to fly. Then, he decided to just rely on data from the helicopter’s sensors to train machine learning algorithms, and the helicopter flew amazingly well. Ng says that he learned from this experience to not try to write complicated models himself, and instead rely on deep learning algorithms.
Ben Vigoda, the founder and CEO of a new startup, Gamalon, Inc., in Cambridge, Mass., had almost the same formative experience as Andrew Ng, but he reached a completely different conclusion. At his previous MIT machine learning chip startup, Lyric Semiconductor, Inc., later acquired by Analog Devices, it took a long time and a lot of funding to develop working mathematical models for signal processing applications. But Vigoda decided not to give up on writing mathematical models. Instead, he decided that what was needed were tools for developing and debugging mathematical models and fitting them to data more quickly. Vigoda says, “Writing these kinds of models, fitting them to data, seeing what was wrong, and then improving the model was slow and arduous. It could take many weeks to try out one idea for improving a model, and often it would take years to get to a working system. For regular programming, we have compilers, debuggers, profilers, etc., all of this technology that makes it very easy to rapidly write and improve your programs. This development environment enables agile programming. But for mathematical (Bayesian) models, the kinds that scientists write in order to fit their data, we didn’t have any analogous development tools.”
Vigoda created Gamalon and, with funding from DARPA, set out to build these tools. “The first result was that we were able to build and test models much faster. We could write scientific (Bayesian) models and test them against the data almost instantly, and see which models and which parts of models help to explain the data most effectively.”
But that was just the beginning. The real ‘ah-ha’ moment came when they realized that they could begin to replace the human modeler. “We found that the development tools we had built for humans could actually be used to guide the computer to make its own changes to the models and autonomously test these new models. By automating the process of model writing, we transformed the technology from a tool for data scientists into a full-fledged machine learning technology that learns from data, by itself. Already the system is performing very well on traditional machine learning tasks like image recognition and natural language processing.” Gamalon calls this new invention Bayesian Program Synthesis (BPS), and it performs quite differently from deep learning. The company has a demonstration video in which the system learns to recognize drawings in a side-by-side comparison with “Quick, Draw!”, a drawing recognition app from Google. Gamalon’s BPS system learns from just a few training examples rather than thousands or millions, from one person rather than thousands, runs on an iPad rather than on hundreds of servers, and learns almost instantly rather than taking days, weeks or months to complete its learning computations. These improvements in performance promise to make the machine learning community take notice.
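To make the idea of a computer proposing models and testing them against data more concrete, here is a minimal Python sketch of Bayesian model comparison. It is a toy illustration only and is not Gamalon's BPS algorithm, whose internals are not described here. The three candidate models, the coin-flip style data, and all function names are assumptions made for the example. Each candidate is scored by its exact marginal likelihood (how well it explains the data after averaging over its unknown parameters), and the scores are turned into a posterior over models.

```python
from math import exp, lgamma, log

# Toy data format (an assumption): a list of groups, each a list of 0/1 outcomes.

def log_beta_bernoulli(obs):
    """Log marginal likelihood of 0/1 data under an unknown bias with a uniform prior."""
    heads = sum(obs)
    tails = len(obs) - heads
    # Integral of p^heads * (1-p)^tails over p in [0, 1] is Beta(heads+1, tails+1).
    return lgamma(heads + 1) + lgamma(tails + 1) - lgamma(heads + tails + 2)

def log_evidence_fair(groups):
    """Candidate 1: every outcome is a fair coin flip (no free parameters)."""
    return sum(len(g) for g in groups) * log(0.5)

def log_evidence_shared(groups):
    """Candidate 2: one unknown bias shared by all groups."""
    return log_beta_bernoulli([x for g in groups for x in g])

def log_evidence_separate(groups):
    """Candidate 3: each group has its own unknown bias."""
    return sum(log_beta_bernoulli(g) for g in groups)

# Hypothetical candidate set; a real system would generate candidates itself.
CANDIDATES = {
    "fair": log_evidence_fair,
    "shared_bias": log_evidence_shared,
    "separate_bias": log_evidence_separate,
}

def model_posterior(groups):
    """Posterior probability of each candidate, assuming a uniform prior over models."""
    scores = {name: fn(groups) for name, fn in CANDIDATES.items()}
    top = max(scores.values())
    weights = {name: exp(s - top) for name, s in scores.items()}  # stable normalization
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def best_model(groups):
    """Name of the candidate that best explains the data."""
    posterior = model_posterior(groups)
    return max(posterior, key=posterior.get)
```

With two groups that behave very differently, for example `best_model([[1] * 9 + [0], [0] * 9 + [1]])`, the candidate with a separate bias per group scores highest, while balanced data favors the simpler fair-coin model. The marginal likelihood automatically penalizes extra parameters that the data does not need, which is one reason a Bayesian score is useful when comparing models of different complexity.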
Gamalon is now announcing the first two commercial applications created using its BPS technology, Gamalon Structure and Gamalon Match. “It’s machine learning as a service,” Vigoda explains. “We can host it on any of the major cloud providers.”
Companies can use the service to structure, clean, prepare, and integrate data derived from disparate sources. Vigoda says, “More than 90% of enterprise data is unusable, because it is unstructured. Companies have large collections of little blobs of free-form text such as product descriptions, customer names and addresses, spoken queries that have been converted to text, insurance claims notes, doctor’s notes, etc. They need to convert each blob of text into a database row with the right columns. There is not really a good way to do this right now, so companies outsource to Mechanical Turk or professional services firms, and they get a lot of errors. Our new product, Gamalon Structure, solves this data structuring problem.”
“Furthermore, if you want to link and integrate multiple data sources together, you need to use data integration or data prep software,” explains Vigoda. “You then pay ten times as much as you paid for the software to review the results and eliminate errors and redundancies. The integration takes months, and then you can have dozens of people reviewing the results.” Gamalon Match solves this data integration problem.
One Gamalon customer has hundreds of brick-and-mortar stores across the U.S. and wants to use Gamalon to set up an inventory system for a home delivery service. “They need to link to their stores and figure out what products are on the shelves,” says Vigoda. That may sound straightforward until you consider “how many different ways there are to describe a case of Diet Coke,” adds Vigoda. “Our system goes into all the databases, reads them all, and figures out what’s available in each store, so when the driver gets to a store they know the product will be waiting.”
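The Diet Coke problem can be illustrated with a short Python sketch. This is a deliberately simple token-overlap heuristic, not Gamalon Match or BPS, and it is shown only to make the matching task concrete. The alias table, the catalog entries, the 0.5 threshold, and the function names are all hypothetical.

```python
import re

# Hypothetical alias table mapping store-specific shorthand to canonical words.
ALIASES = {"dt": "diet", "cs": "case", "coca": "coke", "cola": "coke"}

def tokens(text):
    """Lowercase the text, split it into alphanumeric tokens, and apply aliases."""
    return {ALIASES.get(t, t) for t in re.findall(r"[a-z0-9]+", text.lower())}

def similarity(a, b):
    """Jaccard similarity between the token sets of two descriptions (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def best_match(description, catalog, threshold=0.5):
    """Return the catalog entry most similar to the description, or None if too weak."""
    if not catalog:
        return None
    best = max(catalog, key=lambda entry: similarity(description, entry))
    return best if similarity(description, best) >= threshold else None

# Hypothetical canonical product catalog.
CATALOG = [
    "Diet Coke case 24",
    "Sprite case 24",
    "Diet Coke 2 liter bottle",
]
```

Under these assumptions, both `best_match("DT COKE CS 24", CATALOG)` and `best_match("Coca-Cola Diet, 24 ct case", CATALOG)` resolve to the same catalog entry, while an unrelated description such as "garden hose 50 ft" returns `None`. A hand-written alias table like this breaks down quickly as the number of stores and naming conventions grows, which is the gap that learned matching approaches aim to close.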
A manufacturing and wholesale customer, meanwhile, “wants to know what’s going on at each of its hundreds of resellers and who they’re selling to,” says Vigoda. “So we go in and connect all those databases, line them up with their contracts list, and get an incredible view of how products are moving through their distribution channel.”
Gamalon’s eventual goal is to provide “the ubiquitous middleware layer for all SaaS software,” says Vigoda. “Today, there are hundreds of different enterprise SaaS apps available, each of which stores data differently, and every company buys a different mix. We can provide a single global view of what your company is selling and who you’re selling to, where the inventory is coming from and how much you’re paying for it, all without needing to migrate to a single centralized database system. We could replace database systems of record with machine intelligence that indexes your enterprise information. We are excited to see where these first products take us!”
About STEX25 and MIT’s Industrial Liaison Program (ILP)
STEX25 is a startup accelerator focused on fostering collaboration between MIT-connected startups and member companies of MIT’s Industrial Liaison Program (ILP). STEX25 is managed by MIT Startup Exchange and its parent, the ILP. The ILP is a key player in making industrial connections for MIT, with over 220 of the world’s leading companies using their ILP memberships to advance research agendas at MIT.