Modelling the Information Density of Event Sequences in Texts

Project A3 Completed

Project A3 aims at collecting formalized knowledge about prototypical sequences of events – script knowledge – from data, and using it to improve algorithms for natural language processing and our understanding of linguistic encoding choice and interpretation in human communication. The project will develop methods for learning scripts with wide coverage from unannotated texts and extend the representations of script events with information about their preconditions and effects to keep track of causal connections between events.

These deeper and wider-coverage script models will be applied to various natural language processing tasks, and used to model pragmatic interpretations; we will use and extend the Rational Speech Act (RSA) model as a framework for modelling pragmatic inferences and explore how the RSA model can be related to existing notions used in the SFB, specifically the UID hypothesis.

Keywords: psycholinguistics, computational linguistics, crowdsourcing, script knowledge, world knowledge, cognitive modeling, predictability