Lemke, Tyll Robin; Schäfer, Lisa; Drenhaus, Heiner; Reich, Ingo

Script Knowledge Constrains Ellipses in Fragments – Evidence from Production Data and Language Modeling

Proceedings of the Society for Computation in Linguistics, 3, 2020.

We investigate the effect of script-based (Schank and Abelson, 1977) extralinguistic context on the omission of words in fragments. Our data, elicited with a production task, show that predictable words are omitted more often than unpredictable ones, as predicted by the Uniform Information Density (UID) hypothesis (Levy and Jaeger, 2007).

We take into account effects of linguistic and extralinguistic context on predictability and propose a method for estimating the surprisal of words in the presence of ellipsis. Our study extends previous evidence for UID in two ways: First, we show that not only local linguistic context but also extralinguistic context determines the likelihood of omissions. Second, we find UID effects on the omission of content words.
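
For readers unfamiliar with the term, surprisal is conventionally defined as the negative log probability of a word given its context, so more predictable words carry lower surprisal. The sketch below only illustrates that definition. The function name `surprisal` and the toy probabilities are assumptions made for illustration; they are not taken from the paper and do not reproduce the authors' estimation method for utterances with ellipsis.

```python
import math


def surprisal(prob: float) -> float:
    """Return surprisal in bits: -log2 P(word | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


# Hypothetical probabilities of a word given a script-based context
# (toy numbers chosen for illustration, not data from the study).
toy_probs = {"pasta": 0.5, "water": 0.25, "umbrella": 0.03125}

# Higher probability in context means lower surprisal.
toy_surprisal = {word: surprisal(p) for word, p in toy_probs.items()}

for word, bits in toy_surprisal.items():
    print(f"{word}: {bits:.1f} bits")
```

Under UID, words with low surprisal in this sense would be the more likely candidates for omission, which is the pattern the abstract reports.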