Principal Investigator: Boris Katz
Project Website: http://cbmm.mit.edu/research/projects-thrust/visual-intelligence/multi-sentence…
Existing approaches to labeling images and videos with natural-language sentences generate either a single sentence or a collection of unrelated sentences. Humans, however, produce a coherent set of sentences that reference one another and describe the salient activities and relationships depicted.