Jonathan Berant
The Blavatnik School of Computer Science – Tel-Aviv University
The ability to produce and understand utterances that contain previously unseen structures is fundamental to the human experience of language. Nevertheless, sequence-to-sequence models pre-trained on large amounts of text have repeatedly been shown to struggle to generalize to new structures in the context of semantic parsing and visual question answering. In this talk, I will (a) review some of the progress made in recent years on building models that generalize to new compositions, and (b) present some of our insights into the factors underlying the inability of pre-trained language models to generalize to new structures.