
Salesforce open sources research to advance state of the art in AI for common sense reasoning


Ten percent is a considerable margin by which to improve on the state of the art in any field. This is what Salesforce Research has just achieved in common sense reasoning for deep learning language models.

In their paper, Explain Yourself! Leveraging Language Models for Commonsense Reasoning, to be presented tomorrow at the Association for Computational Linguistics (ACL) 2019 annual meeting, Salesforce researchers unveil two important contributions: CoSE, a dataset for Commonsense Explanations; and CAGE, a model for Commonsense Auto-Generated Explanation. ZDNet took the opportunity for a Q&A with two of the Salesforce Research Scientists who worked on this, Nazneen Rajani and Bryan McCann.

Creating a common sense reasoning dataset

As a reminder, Salesforce research is focused on question answering, as a way to facilitate access to data via Einstein. We have previously seen how other Salesforce researchers investigated the use of knowledge graphs toward the same end.

Rajani and McCann’s work takes a different approach, but also builds on a number of previous contributions. Common sense reasoning is an open problem for some of the world’s leading researchers. For example, one of the key ingredients in building CAGE was OpenAI GPT. Dubbing this language model, recently open sourced by Elon Musk’s OpenAI, “too dangerous” to be released into the wild may have been overly cautious.

Nevertheless, it is the state of the art in language models. As Rajani and McCann point out, these natural language processing networks are limited to text alone, which is a poor substitute for living in the real world. So, researchers train the models by having them read a mind-boggling amount of text, including all of Wikipedia, thousands of books, and, in other approaches, results from querying Google, too.

These models are tested using a multiple-choice test called Commonsense Question Answering (CQA), which contains questions that require common sense reasoning to answer. In typical deep learning fashion, models are trained on a few examples from CQA, then tested on a different set of questions. Compared to humans, these well-read neural networks have been known to perform quite poorly on this task.
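To make the evaluation setup concrete, here is a minimal sketch of how accuracy on held-out multiple-choice questions could be computed. The class and function names, the placeholder model, and the two sample questions are all assumptions made up for illustration; they are not taken from CQA or from the Salesforce code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class MultipleChoiceQuestion:
    """Hypothetical container for one multiple-choice question."""
    question: str
    choices: Tuple[str, ...]
    answer_index: int  # position of the correct choice


def accuracy(
    questions: Sequence[MultipleChoiceQuestion],
    predict: Callable[[MultipleChoiceQuestion], int],
) -> float:
    """Fraction of questions for which the predicted index is correct."""
    if not questions:
        raise ValueError("need at least one question to score")
    correct = sum(1 for q in questions if predict(q) == q.answer_index)
    return correct / len(questions)


def always_first(q: MultipleChoiceQuestion) -> int:
    """Placeholder standing in for a trained model: always picks choice 0."""
    return 0


# Invented held-out questions, kept separate from any training examples.
held_out = [
    MultipleChoiceQuestion(
        "Where would you keep milk cold?",
        ("refrigerator", "oven", "bookshelf"),
        0,
    ),
    MultipleChoiceQuestion(
        "What do people usually wear on their feet?",
        ("hats", "shoes", "gloves"),
        1,
    ),
]

score = accuracy(held_out, always_first)
print(f"held-out accuracy: {score:.2f}")
```

A real evaluation would replace `always_first` with the trained model's prediction function and score it on the benchmark's own test split.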

Rajani and McCann created a dataset modeled after CQA, but they also included explanations, in addition to answers to the questions. This is how they created CoSE, a dataset for Commonsense Explanations. As Rajani said, CoSE v1.0 has 8,500 examples and v1.11 has 10,962 examples, including training and validation sets. By deep learning standards, this is not a lot of data.

Rajani and McCann acknowledge this, and growing the dataset is one of their goals for future work. McCann said they would like to extend this dataset collection process to other benchmarks in the field that contain free-form text, structured information, and visual signals from images or video, so that they can train models that explain many different domains.

Explanations were generated using crowdsourcing on Mechanical Turk. Turkers were asked to provide an answer to each question, explain the answer, and highlight the part of the question that led them to the explanation. Let us note that, as recent research in knowledge graph quality processing using Mechanical Turk has shown, crowdsourcing is a feasible solution for such tasks.
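The three things Turkers supplied (an answer, an explanation, and a highlighted part of the question) suggest a simple record shape. The sketch below is an assumption for illustration only: the field names, the sample record, and the sanity checks are hypothetical and do not reflect the actual CoSE file format or the quality constraints the researchers used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExplainedExample:
    """Hypothetical record for one crowdsourced annotation."""
    question: str
    choices: Tuple[str, ...]
    answer: str        # the choice the annotator selected
    explanation: str   # free-form justification for the answer
    highlight: str     # part of the question that prompted the explanation

    def problems(self) -> List[str]:
        """Return basic consistency problems (illustrative checks only)."""
        found = []
        if self.answer not in self.choices:
            found.append("answer is not one of the choices")
        if self.highlight not in self.question:
            found.append("highlight does not appear in the question")
        if not self.explanation.strip():
            found.append("explanation is empty")
        return found


# Invented sample record, not drawn from the dataset.
example = ExplainedExample(
    question="Where would you keep milk cold?",
    choices=("refrigerator", "oven", "bookshelf"),
    answer="refrigerator",
    explanation="A refrigerator keeps food and drinks at a low temperature.",
    highlight="keep milk cold",
)

print(example.problems())  # an empty list means no problems were found
```

Checks of this kind can only catch mechanical slips. Judging whether an explanation is actually sensible still needs a human reviewer.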

Rajani mentioned that, even though the team had set initial constraints on the quality of the explanations, some examples fell through the cracks and needed to be re-annotated. It took about three weeks to design the task and collect the data. CoSE can be used, and further enhanced, by other researchers, and it is available on GitHub.
