Modeling and Measuring Information Density
Classical language models predict a word given a sequence of preceding words. We will extend this to condition on knowledge from the environment, that is, to condition not only on the linguistic context but also on context from the real world. In one branch of the project, we will consider language models that additionally condition on an image.
Knowledge of the image in whose context the text was produced should help to predict the next word. In a second branch of the project, we will consider knowledge bases, question-answer data sets, and states of a game as additional context. The surprisal, and hence the predictability, of an utterance like “Pawn from E2 to E4” depends on the current state of a chess game.
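The chess example can be sketched numerically. The following toy snippet (with made-up probabilities, not output of any actual model from the project) shows how the surprisal of the same utterance changes with the game state it is conditioned on:

```python
import math

# Toy conditional distribution P(move | game state).
# All probabilities are illustrative assumptions for this sketch.
P_MOVE_GIVEN_STATE = {
    "opening": {"Pawn from E2 to E4": 0.40, "Knight from G1 to F3": 0.25},
    "endgame": {"Pawn from E2 to E4": 0.01, "King from E1 to E2": 0.30},
}

def surprisal_bits(move: str, state: str) -> float:
    """Surprisal -log2 P(move | state): low for predictable moves."""
    return -math.log2(P_MOVE_GIVEN_STATE[state][move])

# The same utterance is far more predictable in the opening
# than in the endgame, so its surprisal is much lower there.
s_opening = surprisal_bits("Pawn from E2 to E4", "opening")
s_endgame = surprisal_bits("Pawn from E2 to E4", "endgame")
print(round(s_opening, 2), round(s_endgame, 2))  # → 1.32 6.64
```

A context-conditioned language model would replace the hand-coded table with learned probabilities, but the measurement of information density via surprisal is the same.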
Keywords: language modelling, long-range dependencies, memory
Other Area-B Projects
- Information Density in English Scientific Writing: A Diachronic Perspective B1
- Cognitive Modelling of Information Density for Discourse Relations B2
- Information Theory and Ellipsis Redundancy B3
- Neural Feature and Representation Learning for Information Density Based Translationese Classification B6
- Modelling Human Translation with a Noisy Channel B7