Entry Date:
January 18, 2017

Solving Information-Integration Problems Using Category Theory

Principal Investigator David I Spivak

Project Start Date January 2016

Project End Date June 2017


Simple business questions can be surprisingly hard to answer. For a large company, a question like "how many employees are tax-exempt?" may require querying hundreds of databases that use multiple data models and possibly inconsistent definitions: for example, is a "contractor" also an "employee"? Over the past 5 years, this I-Corps team has developed a new technology, based on category theory, for performing information-integration tasks such as querying, combining, and evolving databases. Category theory is a branch of mathematics that has already revolutionized several areas of computer science, including functional programming. It provides theoretical guidance missing from the widely used relational model of data, and this team has used it to build a prototype software tool, FQL (categoricaldata.net/fql.html), for integrating databases more quickly and accurately than existing relational tools. But FQL is still an academic prototype. The purpose of this project is to perform customer discovery activities so that the team can better understand (1) the market demand for this new technology and (2) the exact form that an industrial-strength tool should take, e.g. a programming language, a Java library, or a cloud service.

During the I-Corps program, the team will interview 100+ potential customers, including IT managers of large enterprise companies in a variety of sectors, ETL and database specialists familiar with existing tools, and engineers who use databases in their process design. The team will use this customer information to find a product-market fit for the technology (e.g. are the customers interested in platforms or in-house solutions, how customized do the proposed solutions need to be, and what is the best way of communicating the proposed technology) and to develop an industrial-strength tool for categorical data. Specifically, the team will develop use-case scenarios, alpha-test the existing FQL tool with potential customers, and use the results to validate and refine those use cases. By the end of the I-Corps program, the team plans to have a demo, a clear commercialization plan, and a minimum viable product.