
Robust and Interpretable Machine Learning via Natural Language Explanations


Principal Investigators:

Chris Manning, Percy Liang, and Dan Jurafsky

TRI Liaisons:

Thomas Kollar and Adrien Gaidon

Project Summary

We will use natural language to make machine learning models cheaper to train, less likely to pick up on spurious correlations, and able to explain their behavior.

The next generation of intelligent vehicles will (i) depend heavily on machine learning and (ii) need to interact with humans and operate in complex, safety-critical situations. Natural language will let humans control and interact with machine learning systems in a way that is more natural, safer, and more powerful.

Research Goals

  • Develop a framework for rapidly training classifiers from natural language explanations, building on (visual) question answering models and datasets (see the first sketch after this list).
  • Develop adversarial learning techniques that find features which are predictive of the task yet uncorrelated with complex confounders expressed in natural language (see the second sketch after this list).
  • Develop new neural models that can perform reasoning and generate natural language explanations of the individual reasoning steps.
  • Evaluate the extent to which natural language enables sample-efficient learning and robust generalization to novel test scenarios.
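
The following is a minimal sketch of the first goal: compiling a natural language explanation into a weak labeling function whose outputs can then train an ordinary classifier. The explanation template, the compile_explanation parser, and the example data are illustrative assumptions, not the project's actual framework.

    import re

    def compile_explanation(explanation):
        """Compile a toy explanation such as
        'label positive because the text contains "great"'
        into a labeling function over raw text examples."""
        # Hypothetical, hard-coded explanation pattern for illustration only.
        m = re.match(r'label (\w+) because the text contains "(.+)"', explanation)
        if m is None:
            raise ValueError("unsupported explanation form")
        label, keyword = m.group(1), m.group(2)

        def labeling_fn(text):
            # Emit the label when the condition fires; abstain (None) otherwise.
            return label if keyword in text.lower() else None

        return labeling_fn

    # Apply the compiled function to unlabeled examples to produce weak labels,
    # then train any standard classifier on the resulting (text, label) pairs.
    lf = compile_explanation('label positive because the text contains "great"')
    print([lf(t) for t in ["A great ride", "Too slow to react"]])
    # -> ['positive', None]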
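Below is a minimal sketch of the second goal, assuming a gradient-reversal setup in the style of domain-adversarial training (Ganin and Lempitsky, 2015): an encoder learns features that predict the task label while an adversarial head tries, and is forced to fail, to predict a confounder from those features. The network sizes, PyTorch setup, and random stand-in data are illustrative assumptions.

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; reverses (and scales) gradients
        on the backward pass, so the encoder is trained to defeat the
        confounder head while the head itself still learns normally."""

        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lam * grad_output, None

    encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
    task_head = nn.Linear(32, 2)        # predicts the task label
    confounder_head = nn.Linear(32, 2)  # adversary: predicts the confounder

    params = (list(encoder.parameters()) + list(task_head.parameters())
              + list(confounder_head.parameters()))
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    # One illustrative training step on random stand-in data; in the project,
    # confounder labels would be derived from natural language descriptions.
    x = torch.randn(16, 64)
    y_task = torch.randint(0, 2, (16,))
    y_conf = torch.randint(0, 2, (16,))

    opt.zero_grad()
    z = encoder(x)
    task_loss = loss_fn(task_head(z), y_task)
    # The adversary sees reversed gradients: it improves at reading the
    # confounder, while the encoder is pushed to erase that information.
    adv_loss = loss_fn(confounder_head(GradReverse.apply(z, 1.0)), y_conf)
    (task_loss + adv_loss).backward()
    opt.step()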