4.5.23-AI-Isola

Conference Video | Duration: 32:10
April 5, 2023
  • Video details
    Perhaps the most crucial factor in the performance of modern AI systems is the scale and quality of their training data. It is well known that the bigger the data, the better the performance. This talk asks: rather than just building bigger data, can we design better data? For this purpose, I will turn to synthetic data and explore three kinds: images sampled from generative models (GANs), images synthesized by radiance fields (NeRFs), and procedural images generated by simple scripts (“noise”). I will show how these data sources support new approaches to model training, and I will argue that data of these kinds is not just a cheap substitute for “real data”; it can in fact be better than the real thing.