Alves, Diego; Bagdasarov, Sergei; Teich, Elke

Cognitive Signatures of Multi-Word Expressions: Reading-Time and Surprisal

Kr. Ojha, Atul; Barbu Mititelu, Verginica; Constant, Mathieu; Stoyanova, Ivelina; Seza Doğruöz, A.; Rademaker, Alexandre (Eds.): Proceedings of the 22nd Workshop on Multiword Expressions (MWE 2026), Association for Computational Linguistics, pp. 48-53, Rabat, Morocco, 2026, ISBN 979-8-89176-363-0.

This study investigates whether eye-tracking measures predict if a word is the final token of a multi-word expression (MWE), focusing on two understudied MWE types: fixed expressions (e.g., "due to") and phrasal verbs (e.g., "turn out"). Using mixed-effects logistic regression, we compared tokens in MWE contexts with the same tokens in non-MWE contexts. Results reveal a clear difference in processing. For fixed expressions, reading-time measures significantly predict MWEhood. In contrast, phrasal verbs show no consistent predictive effects. Additionally, we compared the reading-time models to models that included GPT-2 surprisal as a predictor. While surprisal does predict MWEhood, it fails to capture the distinction between types. These findings highlight the need to consider MWE typology in models of formulaic language processing.
