Jonathan Berant
The Blavatnik School of Computer Science – Tel-Aviv University
The ability to produce and understand utterances that contain previously unseen structures is fundamental to the human experience of language. Nevertheless, sequence-to-sequence models pre-trained on large amounts of text have repeatedly been shown to struggle to generalize to new structures in the context of semantic parsing and visual question answering. In this talk, I will (a) review some of the progress made in recent years on building models that generalize to new compositions, and (b) present some of our insights into the factors underlying the inability of pre-trained language models to generalize to new structures.