Kermes, Hannah; Teich, Elke

Average surprisal of parts-of-speech

Corpus Linguistics 2017, Birmingham, UK, 2017.

We present an approach for investigating the differences between lexical words and function words, and their respective parts-of-speech, from an information-theoretic point of view (cf. Shannon, 1949). We use average surprisal (AvS) to measure the amount of information transmitted by a linguistic unit. We expect function words to be more predictable (lower AvS) and lexical words to be less predictable (higher AvS). We also assume that the AvS of function words is fairly constant across time and registers, while the AvS of lexical words varies more with time and register.
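
As a rough illustration of the measure, the sketch below computes average surprisal per part-of-speech on a toy tagged sequence. It rests on assumptions not stated in the abstract: surprisal is taken as -log2 p(word | preceding word), probabilities are unsmoothed relative frequencies from the same data, and the function names and tag labels are hypothetical. The authors' actual context definition and estimation method may differ.

```python
import math
from collections import Counter, defaultdict


def token_surprisals(words):
    """Surprisal in bits of each word given the preceding word.

    Assumption: a one-word left context with unsmoothed relative
    frequencies. The first word has no context and is skipped.
    """
    bigrams = Counter(zip(words, words[1:]))
    contexts = Counter(words[:-1])
    return [
        -math.log2(bigrams[(prev, word)] / contexts[prev])
        for prev, word in zip(words, words[1:])
    ]


def average_surprisal_by_pos(tagged):
    """Mean surprisal per part-of-speech tag.

    `tagged` is a list of (word, tag) pairs; the tags are illustrative.
    """
    words = [word for word, _ in tagged]
    totals = defaultdict(float)
    counts = Counter()
    # Pair each surprisal value with the tag of the word it belongs to.
    for (_, tag), bits in zip(tagged[1:], token_surprisals(words)):
        totals[tag] += bits
        counts[tag] += 1
    return {tag: totals[tag] / counts[tag] for tag in totals}


toy = [
    ("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
    ("on", "ADP"), ("the", "DET"), ("mat", "NOUN"),
]
avs = average_surprisal_by_pos(toy)
```

In this toy sequence the nouns follow a context ("the") that has two possible continuations, so they receive a higher AvS than the determiner, whose only context ("on") fully predicts it. A realistic study would need a large corpus and a smoothed language model.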