An Engagement Learning Approach to Generating Massive Labeled Datasets for Training AI Systems

Principal Investigators:

James Landay, Michael Bernstein, and Fei-Fei Li

TRI Liaison:

Project Summary

We pursue an “engagement learning” data labeling approach that learns to trade off what an AI needs (the knowledge value of a label to the AI model) against what people want to engage with (the engagement value of the label in the current situation). We explore this strategy using an in-car conversational agent, aiming to engage potentially millions of drivers and passengers to generate a massive labeled dataset. We see parent-child and family-based interactions around knowledge sharing and conceptual learning as a natural fit for commonsense annotation tasks that adults alone typically find mundane. By scaffolding these everyday, intrinsically motivated activities, our proposed deep reinforcement learning agent can both assemble informative labeled datasets to train AI systems and deliver positive in-car experiences to people.
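The summary does not say how knowledge value and engagement value are combined; the proposed agent would learn that trade-off with deep reinforcement learning. As a rough illustration of the trade-off only, the sketch below assumes both values are already estimated as numbers between 0 and 1 and mixes them with a fixed weight. The names `Candidate`, `pick_question`, and `trade_off`, and the example questions and scores, are hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A question the in-car agent could ask (hypothetical structure)."""
    question: str
    knowledge_value: float   # assumed estimate in [0, 1]: usefulness of the label to the model
    engagement_value: float  # assumed estimate in [0, 1]: how engaging the question is right now


def pick_question(candidates, trade_off=0.5):
    """Return the candidate with the highest weighted score.

    trade_off=1.0 considers only knowledge value; trade_off=0.0 considers
    only engagement value. A fixed weight is a simplification: the proposal
    describes learning this balance rather than setting it by hand.
    """
    if not candidates:
        raise ValueError("no candidates to choose from")

    def score(c):
        return trade_off * c.knowledge_value + (1 - trade_off) * c.engagement_value

    return max(candidates, key=score)


# Made-up example values, for illustration only.
candidates = [
    Candidate("What color is that truck?", 0.2, 0.9),
    Candidate("What is the person on the corner doing?", 0.8, 0.6),
    Candidate("How many lanes does this road have?", 0.9, 0.1),
]

best = pick_question(candidates, trade_off=0.5)
print(best.question)
```

With an equal weighting, the question that is moderately strong on both dimensions wins over questions that are strong on only one, which is the intuition behind balancing what the model needs against what people want to talk about.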

We tackle Machine Learning, NLP, and Human-Computer Interaction challenges to create a continually growing knowledge base about the visual world, including naturalistic driving environments.

Research Goals

Assembling a massive, high-quality commonsense knowledge base.
Developing deep reinforcement learning agents for harvesting this data.
Examining novel design opportunities for family-based car experiences.