Publications

Mosbach, Marius; Khokhlova, Anna; Hedderich, Michael; Klakow, Dietrich

On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers Inproceedings

Findings of the Association for Computational Linguistics: EMNLP 2020, Association for Computational Linguistics, pp. 2502-2516, 2020.

Fine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. Very little is, however, understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three different pre-trained models: BERT, RoBERTa, and ALBERT, and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across different models, fine-tuning tasks, and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require a careful interpretation.
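
The probing pipeline itself is not reproduced on this page; as a rough, hypothetical illustration of sentence-level probing with a pooled sentence representation (assuming the Hugging Face transformers and scikit-learn libraries; the sentences, labels, and mean-pooling choice are invented stand-ins, not the paper's exact setup):

import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Toy probing data: sentences labelled for some linguistic property.
sentences = ["The cat sleeps.", "The cats sleep.", "A dog barks.", "Dogs bark."]
labels = [0, 1, 0, 1]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence):
    # Mean-pool the final-layer token vectors into one sentence vector;
    # the same probe can be run on a fine-tuned checkpoint for comparison.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, tokens, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

X = [embed(s) for s in sentences]
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("probe accuracy (training data):", probe.score(X, labels))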

@inproceedings{mosbach-etal-2020-interplay-fine,
title = {On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers},
author = {Marius Mosbach and Anna Khokhlova and Michael Hedderich and Dietrich Klakow},
url = {https://www.aclweb.org/anthology/2020.findings-emnlp.227},
doi = {https://doi.org/10.18653/v1/2020.findings-emnlp.227},
year = {2020},
date = {2020},
booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2020},
pages = {2502-2516},
publisher = {Association for Computational Linguistics},
abstract = {Fine-tuning pre-trained contextualized embedding models has become an integral part of the NLP pipeline. At the same time, probing has emerged as a way to investigate the linguistic knowledge captured by pre-trained models. Very little is, however, understood about how fine-tuning affects the representations of pre-trained models and thereby the linguistic knowledge they encode. This paper contributes towards closing this gap. We study three different pre-trained models: BERT, RoBERTa, and ALBERT, and investigate through sentence-level probing how fine-tuning affects their representations. We find that for some probing tasks fine-tuning leads to substantial changes in accuracy, possibly suggesting that fine-tuning introduces or even removes linguistic knowledge from a pre-trained model. These changes, however, vary greatly across different models, fine-tuning tasks, and probing tasks. Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning have a positive effect on probing accuracy that is larger than just using the pre-trained model with a strong pooling method. Based on our findings, we argue that both positive and negative effects of fine-tuning on probing require a careful interpretation.},
pubstate = {published},
type = {inproceedings}
}

Project:   B4

Crible, Ludivine; Demberg, Vera

When Do We Leave Discourse Relations Underspecified? The Effect of Formality and Relation Type Journal Article

Discours, 2020.

Speakers have several options when they express a discourse relation: they can leave it implicit, or make it explicit, usually through a connective. Although not all connectives can go with every relation, there is one that is particularly frequent and compatible with very many discourse relations, namely and. In this paper, we investigate the effect of discourse relation type and text genre on the production and perception of underspecified relations of contrast and consequence signalled by and. We combine a corpus study of spoken English, a production experiment and a perception experiment in order to test two hypotheses: (1) and is more compatible with relations of consequence than of contrast, due to factors of cognitive complexity and conceptual differences; (2) and is more compatible with informal than formal genres, because of requirements of recipient design. The three studies partially converge in identifying a stable effect of relation type and genre on the production and perception of underspecified relations of consequence and contrast marked by and.

@article{Crible2020,
title = {When Do We Leave Discourse Relations Underspecified? The Effect of Formality and Relation Type},
author = {Ludivine Crible and Vera Demberg},
url = {https://journals.openedition.org/discours/10848},
doi = {https://doi.org/10.4000/discours.10848},
year = {2020},
date = {2020},
journal = {Discours},
number = {26},
abstract = {Speakers have several options when they express a discourse relation: they can leave it implicit, or make it explicit, usually through a connective. Although not all connectives can go with every relation, there is one that is particularly frequent and compatible with very many discourse relations, namely and. In this paper, we investigate the effect of discourse relation type and text genre on the production and perception of underspecified relations of contrast and consequence signalled by and. We combine a corpus study of spoken English, a production experiment and a perception experiment in order to test two hypotheses: (1) and is more compatible with relations of consequence than of contrast, due to factors of cognitive complexity and conceptual differences; (2) and is more compatible with informal than formal genres, because of requirements of recipient design. The three studies partially converge in identifying a stable effect of relation type and genre on the production and perception of underspecified relations of consequence and contrast marked by and.},
pubstate = {published},
type = {article}
}

Project:   B2

Avgustinova, Tania

Surprisal in Intercomprehension Book Chapter

Slavcheva, Milena; Simov, Kiril; Osenova, Petya; Boytcheva, Svetla (Ed.): Knowledge, Language, Models, INCOMA Ltd., pp. 6-19, Shoumen, Bulgaria, 2020, ISBN 978-954-452-062-5.

A large-scale interdisciplinary research collaboration at Saarland University (Crocker et al. 2016) investigates the hypothesis that language use may be driven by the optimal utilization of the communication channel. The information-theoretic concepts of entropy (Shannon, 1949) and surprisal (Hale 2001; Levy 2008) have gained in popularity due to their potential to predict human linguistic behavior. The underlying assumption is that there is a certain total amount of information contained in a message, which is distributed over the individual units constituting it. Capturing this distribution of information is the goal of surprisal-based modeling, with the intention of predicting the processing effort experienced by humans upon encountering these units. The ease of processing linguistic material is thus correlated with its contextually determined predictability, which may be appropriately indexed by Shannon’s notion of information. The pervasiveness of multilingualism suggests that human language competence is used quite robustly, drawing on various types of information and employing multi-source compensatory and guessing strategies. While it is not realistic to require every single person to master several languages, it is certainly beneficial to strive for and promote a significantly higher degree of receptive skills facilitating access to other languages. Taking advantage of linguistic similarity – genetic, typological or areal – is the key to acquiring such abilities as efficiently as possible. Awareness that linguistic structures known from a specific language apply to other varieties in which similar phenomena are detectable is indeed essential.
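
Since several entries on this page rely on these notions, here is a minimal numerical sketch of surprisal and entropy (all probabilities invented): surprisal is the negative log-probability of a unit in its context, and entropy is the expected surprisal over a distribution.

import math

def surprisal(p):
    # Surprisal in bits (Hale 2001; Levy 2008): -log2 of contextual probability.
    return -math.log2(p)

def entropy(dist):
    # Shannon entropy in bits: the expected surprisal over a distribution.
    return sum(p * surprisal(p) for p in dist.values() if p > 0)

# A hypothetical next-word distribution in some context.
next_word = {"dog": 0.5, "cat": 0.3, "platypus": 0.2}
print(surprisal(next_word["platypus"]))  # rare continuation -> high surprisal
print(entropy(next_word))                # overall uncertainty at this point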

@inbook{TAfestGA,
title = {Surprisal in Intercomprehension},
author = {Tania Avgustinova},
editor = {Milena Slavcheva and Kiril Simov and Petya Osenova and Svetla Boytcheva},
url = {https://www.coli.uni-saarland.de/~tania/ta-pub/Avgustinova2020.Festschrift.pdf},
year = {2020},
date = {2020},
booktitle = {Knowledge, Language, Models},
isbn = {978-954-452-062-5},
pages = {6-19},
publisher = {INCOMA Ltd.},
address = {Shoumen, Bulgaria},
abstract = {A large-scale interdisciplinary research collaboration at Saarland University (Crocker et al. 2016) investigates the hypothesis that language use may be driven by the optimal utilization of the communication channel. The information-theoretic concepts of entropy (Shannon, 1949) and surprisal (Hale 2001; Levy 2008) have gained in popularity due to their potential to predict human linguistic behavior. The underlying assumption is that there is a certain total amount of information contained in a message, which is distributed over the individual units constituting it. Capturing this distribution of information is the goal of surprisal-based modeling, with the intention of predicting the processing effort experienced by humans upon encountering these units. The ease of processing linguistic material is thus correlated with its contextually determined predictability, which may be appropriately indexed by Shannon’s notion of information. The pervasiveness of multilingualism suggests that human language competence is used quite robustly, drawing on various types of information and employing multi-source compensatory and guessing strategies. While it is not realistic to require every single person to master several languages, it is certainly beneficial to strive for and promote a significantly higher degree of receptive skills facilitating access to other languages. Taking advantage of linguistic similarity – genetic, typological or areal – is the key to acquiring such abilities as efficiently as possible. Awareness that linguistic structures known from a specific language apply to other varieties in which similar phenomena are detectable is indeed essential.},
pubstate = {published},
type = {inbook}
}

Project:   C4

Tourtouri, Elli

Rational redundancy in situated communication PhD Thesis

Saarland University, Saarbrücken, 2020.

Contrary to the Gricean maxims of Quantity (Grice, 1975), it has been repeatedly shown that speakers often include redundant information in their utterances (over-specifications). Previous research on referential communication has long debated whether this redundancy is the result of speaker-internal or addressee-oriented processes, while it is also unclear whether referential redundancy hinders or facilitates comprehension. We present a bounded-rational account of referential redundancy, according to which any word in an utterance, even if it is redundant, can be beneficial to comprehension, to the extent that it facilitates the reduction of listeners’ uncertainty regarding the target referent in a co-present visual scene. Information-theoretic metrics, such as Shannon’s entropy (Shannon, 1948), were employed in order to quantify this uncertainty in bits of information, and gain an estimate of the cognitive effort related to referential processing. Under this account, speakers may, therefore, utilise redundant adjectives in order to reduce the visually-determined entropy (and thereby their listeners’ cognitive effort) more uniformly across their utterances. In a series of experiments, we examined both the comprehension and the production of over-specifications in complex visual contexts. Our findings are in line with the bounded-rational account. Specifically, we present evidence that: (a) in view of complex visual scenes, listeners’ processing and identification of the target referent may be facilitated by the use of redundant adjectives, as well as by a more uniform reduction of uncertainty across the utterance, and (b) that, while both speaker-internal and addressee-oriented processes are at play in the production of over-specifications, listeners’ processing concerns may also influence the encoding of redundant adjectives, at least for some speakers, who encode redundant adjectives more frequently when these adjectives contribute to a more uniform reduction of referential entropy.
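
A toy version of the entropy metric described above (the scene and its attributes are invented): each word filters the candidate referents in the visual scene, and the entropy over the remaining candidates indexes the listener's uncertainty.

import math

def referent_entropy(candidates):
    # Uncertainty in bits over the referents still compatible with the
    # utterance so far, assuming they are equally likely (Shannon, 1948).
    return math.log2(len(candidates)) if candidates else 0.0

# Hypothetical scene: four objects with colour and shape attributes.
scene = [("red", "ball"), ("red", "cube"), ("blue", "cube"), ("blue", "pen")]

# "the red ball": "ball" alone would identify the target, so "red" is
# redundant -- but it spreads the entropy reduction across the utterance.
after_red = [o for o in scene if o[0] == "red"]
after_ball = [o for o in after_red if o[1] == "ball"]
for label, cands in [("scene", scene), ("+red", after_red), ("+ball", after_ball)]:
    print(label, referent_entropy(cands), "bits")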

@phdthesis{Tourtouri2020,
title = {Rational redundancy in situated communication},
author = {Elli Tourtouri},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/29453},
doi = {https://doi.org/10.22028/D291-31436},
year = {2020},
date = {2020},
school = {Saarland University},
address = {Saarbr{\"u}cken},
abstract = {Contrary to the Gricean maxims of Quantity (Grice, 1975), it has been repeatedly shown that speakers often include redundant information in their utterances (over-specifications). Previous research on referential communication has long debated whether this redundancy is the result of speaker-internal or addressee-oriented processes, while it is also unclear whether referential redundancy hinders or facilitates comprehension. We present a bounded-rational account of referential redundancy, according to which any word in an utterance, even if it is redundant, can be beneficial to comprehension, to the extent that it facilitates the reduction of listeners’ uncertainty regarding the target referent in a co-present visual scene. Information-theoretic metrics, such as Shannon’s entropy (Shannon, 1948), were employed in order to quantify this uncertainty in bits of information, and gain an estimate of the cognitive effort related to referential processing. Under this account, speakers may, therefore, utilise redundant adjectives in order to reduce the visually-determined entropy (and thereby their listeners’ cognitive effort) more uniformly across their utterances. In a series of experiments, we examined both the comprehension and the production of over-specifications in complex visual contexts. Our findings are in line with the bounded-rational account. Specifically, we present evidence that: (a) in view of complex visual scenes, listeners’ processing and identification of the target referent may be facilitated by the use of redundant adjectives, as well as by a more uniform reduction of uncertainty across the utterance, and (b) that, while both speaker-internal and addressee-oriented processes are at play in the production of over-specifications, listeners’ processing concerns may also influence the encoding of redundant adjectives, at least for some speakers, who encode redundant adjectives more frequently when these adjectives contribute to a more uniform reduction of referential entropy.},
pubstate = {published},
type = {phdthesis}
}

Project:   C3

Batliner, Anton; Möbius, Bernd

Prosody in automatic speech processing Book Chapter

Gussenhoven, Carlos; Chen, Aoju (Ed.): The Oxford Handbook of Language Prosody, Chap. 46, Oxford University Press, pp. 633-645, 2020, ISBN 9780198832232.

Automatic speech processing (ASP) is understood as covering word recognition, the processing of higher linguistic components (syntax, semantics, and pragmatics), and the processing of computational paralinguistics (CP), which deals with speaker states and traits. This chapter attempts to track the role of prosody in ASP from the word level up to CP. A short history of the field from 1980 to 2020 distinguishes the early years (until 2000)—when the prosodic contribution to the modelling of linguistic phenomena, such as accents, boundaries, syntax, semantics, and dialogue acts, was the focus—from the later years, when the focus shifted to paralinguistics; prosody ceased to be visible. Different types of predictor variables are addressed, among them high-performance power features as well as leverage features, which can also be employed in teaching and therapy.

@inbook{Batliner/Moebius:2020,
title = {Prosody in automatic speech processing},
author = {Anton Batliner and Bernd M{\"o}bius},
editor = {Carlos Gussenhoven and Aoju Chen},
url = {https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780198832232.001.0001/oxfordhb-9780198832232-e-42},
doi = {https://doi.org/10.1093/oxfordhb/9780198832232.013.42},
year = {2020},
date = {2020},
booktitle = {The Oxford Handbook of Language Prosody},
chapter = {46},
isbn = {9780198832232},
pages = {633-645},
publisher = {Oxford University Press},
abstract = {Automatic speech processing (ASP) is understood as covering word recognition, the processing of higher linguistic components (syntax, semantics, and pragmatics), and the processing of computational paralinguistics (CP), which deals with speaker states and traits. This chapter attempts to track the role of prosody in ASP from the word level up to CP. A short history of the field from 1980 to 2020 distinguishes the early years (until 2000)—when the prosodic contribution to the modelling of linguistic phenomena, such as accents, boundaries, syntax, semantics, and dialogue acts, was the focus—from the later years, when the focus shifted to paralinguistics; prosody ceased to be visible. Different types of predictor variables are addressed, among them high-performance power features as well as leverage features, which can also be employed in teaching and therapy.},
pubstate = {published},
type = {inbook}
}

Project:   C1

Karpiński, Maciej; Andreeva, Bistra; Asu, Eva Liina; Beňuš, Štefan; Daugavet, Anna; Mády, Katalin

Central and Eastern Europe Book Chapter

Gussenhoven, Carlos; Chen, Aoju (Ed.): The Oxford Handbook of Language Prosody, Chap. 15, Oxford University Press, pp. 225-235, 2020, ISBN 9780198832232.

The languages of Central and Eastern Europe addressed in this chapter form a typologically divergent collection that includes Slavic (Belarusian, Bulgarian, Czech, Macedonian, Polish, Russian, pluricentric Bosnian-Croatian-Montenegrin-Serbian, Slovak, Slovenian, Ukrainian), Baltic (Latvian, Lithuanian), Finno-Ugric (Hungarian, Finnish, Estonian), and Romance (Romanian). Their prosodic features and structures have been explored to various depths, from different theoretical perspectives, sometimes on the basis of relatively sparse material. Still, enough is known to see that their typological divergence as well as other factors contribute to vivid differences in their prosodic systems. While belonging to intonational languages, they differ in pitch patterns and their usage, duration, and rhythm (some involve phonological duration), as well as prominence mechanisms, accentuation, and word stress (fixed or mobile). Several languages in the area have what is referred to by different traditions as pitch accents, tones or syllable accents, or intonations.

@inbook{Karpinski/etal:2020,
title = {Central and Eastern Europe},
author = {Maciej Karpi{\'n}ski and Bistra Andreeva and Eva Liina Asu and {\v{S}}tefan Be{\v{n}}u{\v{s}} and Anna Daugavet and Katalin M{\'a}dy},
editor = {Carlos Gussenhoven and Aoju Chen},
url = {https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780198832232.001.0001/oxfordhb-9780198832232-e-14},
year = {2020},
date = {2020},
booktitle = {The Oxford Handbook of Language Prosody},
chapter = {15},
isbn = {9780198832232},
pages = {225-235},
publisher = {Oxford University Press},
abstract = {The languages of Central and Eastern Europe addressed in this chapter form a typologically divergent collection that includes Slavic (Belarusian, Bulgarian, Czech, Macedonian, Polish, Russian, pluricentric Bosnian-Croatian-Montenegrin-Serbian, Slovak, Slovenian, Ukrainian), Baltic (Latvian, Lithuanian), Finno-Ugric (Hungarian, Finnish, Estonian), and Romance (Romanian). Their prosodic features and structures have been explored to various depths, from different theoretical perspectives, sometimes on the basis of relatively sparse material. Still, enough is known to see that their typological divergence as well as other factors contribute to vivid differences in their prosodic systems. While belonging to intonational languages, they differ in pitch patterns and their usage, duration, and rhythm (some involve phonological duration), as well as prominence mechanisms, accentuation, and word stress (fixed or mobile). Several languages in the area have what is referred to by different traditions as pitch accents, tones or syllable accents, or intonations.},
pubstate = {published},
type = {inbook}
}

Project:   C1

Abdullah, Badr M.; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages Inproceedings

Proceedings of Interspeech 2020, pp. 477-481, 2020.

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain.
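
The two low-level descriptors examined in the paper correspond roughly to standard mel-spectral and MFCC front-ends; a sketch using the librosa library (the file path and feature settings are placeholders, not the paper's configuration):

import librosa

# Load an utterance (hypothetical path); 16 kHz is common for speech tasks.
signal, sr = librosa.load("utterance.wav", sr=16000)

# Spectral features: log mel-spectrogram.
mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=40)
log_mel = librosa.power_to_db(mel)

# Cepstral features: MFCCs computed from the mel spectrum.
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)

print(log_mel.shape, mfcc.shape)  # (n_mels, frames), (n_mfcc, frames)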

@inproceedings{abdullah_etal_is2020,
title = {Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages},
author = {Badr M. Abdullah and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://arxiv.org/abs/2008.00545},
doi = {https://doi.org/10.21437/Interspeech.2020-2930},
year = {2020},
date = {2020},
booktitle = {Proceedings of Interspeech 2020},
pages = {477-481},
abstract = {State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain.},
pubstate = {published},
type = {inproceedings}
}

Projects:   C1 C4

Abdullah, Badr M.; Kudera, Jacek; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification Inproceedings

Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020), International Committee on Computational Linguistics (ICCL), pp. 128-139, Barcelona, Spain (Online), 2020.

Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification (LID). In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness or non-linguists’ perception of language similarity. While our analysis shows that the language representation space indeed captures language relatedness to a great extent, we find perceptual confusability to be the best predictor of the language representation similarity.
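
The representation analysis can be approximated along these lines (a sketch; the embeddings below are random stand-ins for utterance vectors taken from a trained LID network): average per language, then compare languages by cosine similarity and correlate the result with relatedness or perceptual confusability measures.

import numpy as np

rng = np.random.default_rng(0)
# Random stand-ins for per-utterance embeddings from a trained LID model.
embeddings = {lang: rng.normal(size=(100, 64))
              for lang in ["bg", "cs", "hr", "pl", "ru", "uk"]}

# One centroid per language.
centroids = {lang: e.mean(axis=0) for lang, e in embeddings.items()}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pairwise language-similarity matrix.
langs = sorted(centroids)
for l1 in langs:
    print(l1, " ".join(f"{cosine(centroids[l1], centroids[l2]):+.2f}" for l2 in langs))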

@inproceedings{abdullah_etal_vardial2020,
title = {Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification},
author = {Badr M. Abdullah and Jacek Kudera and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://www.aclweb.org/anthology/2020.vardial-1.12},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020)},
pages = {128-139},
publisher = {International Committee on Computational Linguistics (ICCL)},
address = {Barcelona, Spain (Online)},
abstract = {Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification (LID). In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness or non-linguists’ perception of language similarity. While our analysis shows that the language representation space indeed captures language relatedness to a great extent, we find perceptual confusability to be the best predictor of the language representation similarity.},
pubstate = {published},
type = {inproceedings}
}

Projects:   C1 C4

Köhn, Arne; Wichlacz, Julia; Torralba, Álvaro; Höller, Daniel; Hoffmann, Jörg; Koller, Alexander

Generating Instructions at Different Levels of Abstraction Inproceedings

Proceedings of the 28th International Conference on Computational Linguistics, International Committee on Computational Linguistics, pp. 2802-2813, Barcelona, Spain (Online), 2020.

When generating technical instructions, it is often convenient to describe complex objects in the world at different levels of abstraction. A novice user might need an object explained piece by piece, while for an expert, talking about the complex object (e.g. a wall or railing) directly may be more succinct and efficient. We show how to generate building instructions at different levels of abstraction in Minecraft. We introduce the use of hierarchical planning to this end, a method from AI planning which can capture the structure of complex objects neatly. A crowdsourcing evaluation shows that the choice of abstraction level matters to users, and that an abstraction strategy which balances low-level and high-level object descriptions compares favorably to ones which don’t.

@inproceedings{kohn-etal-2020-generating,
title = {Generating Instructions at Different Levels of Abstraction},
author = {Arne K{\"o}hn and Julia Wichlacz and {\'A}lvaro Torralba and Daniel H{\"o}ller and J{\"o}rg Hoffmann and Alexander Koller},
url = {https://aclanthology.org/2020.coling-main.252/},
doi = {https://doi.org/10.18653/v1/2020.coling-main.252},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
pages = {2802-2813},
publisher = {International Committee on Computational Linguistics},
address = {Barcelona, Spain (Online)},
abstract = {When generating technical instructions, it is often convenient to describe complex objects in the world at different levels of abstraction. A novice user might need an object explained piece by piece, while for an expert, talking about the complex object (e.g. a wall or railing) directly may be more succinct and efficient. We show how to generate building instructions at different levels of abstraction in Minecraft. We introduce the use of hierarchical planning to this end, a method from AI planning which can capture the structure of complex objects neatly. A crowdsourcing evaluation shows that the choice of abstraction level matters to users, and that an abstraction strategy which balances low-level and high-level object descriptions compares favorably to ones which don't.},
pubstate = {published},
type = {inproceedings}
}

Project:   A7

Köhn, Arne; Wichlacz, Julia; Schäfer, Christine; Torralba, Álvaro; Hoffmann, Jörg; Koller, Alexander

MC-Saar-Instruct: a Platform for Minecraft Instruction Giving Agents Inproceedings

Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue, Association for Computational Linguistics, pp. 53-56, 1st virtual meeting, 2020.

We present a comprehensive platform to run human-computer experiments where an agent instructs a human in Minecraft, a 3D blocksworld environment. This platform enables comparisons between different agents by matching users to agents. It performs extensive logging and takes care of all boilerplate, allowing new agents to be easily incorporated and evaluated. Our environment is prepared to evaluate any kind of instruction-giving system, recording the interaction and all actions of the user. We provide example architects, a Wizard-of-Oz architect and set-up scripts to automatically download, build and start the platform.

@inproceedings{Hoeller2020IJCAIb,
title = {MC-Saar-Instruct: a Platform for Minecraft Instruction Giving Agents},
author = {Arne K{\"o}hn and Julia Wichlacz and Christine Sch{\"a}fer and {\'A}lvaro Torralba and J{\"o}rg Hoffmann and Alexander Koller},
url = {https://www.aclweb.org/anthology/2020.sigdial-1.7},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue},
pages = {53-56},
publisher = {Association for Computational Linguistics},
address = {1st virtual meeting},
abstract = {We present a comprehensive platform to run human-computer experiments where an agent instructs a human in Minecraft, a 3D blocksworld environment. This platform enables comparisons between different agents by matching users to agents. It performs extensive logging and takes care of all boilerplate, allowing new agents to be easily incorporated and evaluated. Our environment is prepared to evaluate any kind of instruction-giving system, recording the interaction and all actions of the user. We provide example architects, a Wizard-of-Oz architect and set-up scripts to automatically download, build and start the platform.},
pubstate = {published},
type = {inproceedings}
}

Project:   A7

Höller, Daniel; Bercher, Pascal; Behnke, Gregor; Biundo, Susanne

HTN Plan Repair via Model Transformation Inproceedings

Proceedings of the 43rd German Conference on Artificial Intelligence (KI), Springer, pp. 88-101, 2020.

To make planning feasible, planning models abstract from many details of the modeled system. When executing plans in the actual system, the model might be inaccurate at a critical point, and plan execution may fail. There are two options to handle this case: the previous solution can be modified to address the failure (plan repair), or the planning process can be re-started from the new situation (re-planning). In HTN planning, discarding the plan and generating a new one from the novel situation is not easily possible, because the HTN solution criteria make it necessary to take already executed actions into account. Therefore, all approaches to repairing plans in the literature are based on specialized algorithms. In this paper, we discuss the problem in detail and introduce a novel approach that makes it possible to use unchanged, off-the-shelf HTN planning systems to repair broken HTN plans. That way, no specialized solvers are needed.

@inproceedings{Hoeller2020KI,
title = {HTN Plan Repair via Model Transformation},
author = {Daniel H{\"o}ller and Pascal Bercher and Gregor Behnke and Susanne Biundo},
url = {https://link.springer.com/chapter/10.1007/978-3-030-58285-2_7},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 43rd German Conference on Artificial Intelligence (KI)},
pages = {88-101},
publisher = {Springer},
abstract = {To make planning feasible, planning models abstract from many details of the modeled system. When executing plans in the actual system, the model might be inaccurate at a critical point, and plan execution may fail. There are two options to handle this case: the previous solution can be modified to address the failure (plan repair), or the planning process can be re-started from the new situation (re-planning). In HTN planning, discarding the plan and generating a new one from the novel situation is not easily possible, because the HTN solution criteria make it necessary to take already executed actions into account. Therefore, all approaches to repairing plans in the literature are based on specialized algorithms. In this paper, we discuss the problem in detail and introduce a novel approach that makes it possible to use unchanged, off-the-shelf HTN planning systems to repair broken HTN plans. That way, no specialized solvers are needed.},
pubstate = {published},
type = {inproceedings}
}

Project:   A7

Häuser, Katja; Kray, Jutta; Borovsky, Arielle

Great expectations: Evidence for graded prediction of grammatical gender Inproceedings

CogSci, 2020.

Language processing is predictive in nature. But how do people balance multiple competing options as they predict upcoming meanings? Here, we investigated whether readers generate graded predictions about the grammatical gender of nouns. Sentence contexts were manipulated so that they strongly biased people’s expectations towards two or more nouns that had the same grammatical gender (single bias condition), or they biased multiple genders from different grammatical classes (multiple bias condition). Our expectation was that unexpected articles should lead to elevated reading times (RTs) in the single-bias condition, when probabilistic expectations towards a particular gender are violated. Indeed, the results showed greater sensitivity among language users towards unexpected articles in the single-bias condition; however, RTs on unexpected gender-marked articles were facilitated rather than slowed. Our data confirm that difficulty in sentence processing is modulated by uncertainty about predicted information, and suggest that readers make graded predictions about grammatical gender.

@inproceedings{haeuser2020great,
title = {Great expectations: Evidence for graded prediction of grammatical gender},
author = {Katja H{\"a}user and Jutta Kray and Arielle Borovsky},
url = {https://link.springer.com/article/10.3758/s13415-015-0340-0},
doi = {https://doi.org/10.3758/s13415-015-0340-0},
year = {2020},
date = {2020},
booktitle = {CogSci},
abstract = {Language processing is predictive in nature. But how do people balance multiple competing options as they predict upcoming meanings? Here, we investigated whether readers generate graded predictions about the grammatical gender of nouns. Sentence contexts were manipulated so that they strongly biased people's expectations towards two or more nouns that had the same grammatical gender (single bias condition), or they biased multiple genders from different grammatical classes (multiple bias condition). Our expectation was that unexpected articles should lead to elevated reading times (RTs) in the single-bias condition, when probabilistic expectations towards a particular gender are violated. Indeed, the results showed greater sensitivity among language users towards unexpected articles in the single-bias condition; however, RTs on unexpected gender-marked articles were facilitated rather than slowed. Our data confirm that difficulty in sentence processing is modulated by uncertainty about predicted information, and suggest that readers make graded predictions about grammatical gender.},
pubstate = {published},
type = {inproceedings}
}

Project:   A5

Zhai, Fangzhou; Demberg, Vera; Koller, Alexander

Story Generation with Rich Details Inproceedings

Proceedings of the 28th International Conference on Computational Linguistics (CoLing 2020), International Committee on Computational Linguistics, pp. 2346-2351, Barcelona, Spain (Online), 2020.

Automatically generated stories need to be not only coherent, but also interesting. Apart from realizing a story line, the text also needs to include rich details to engage the readers. We propose a model that features two different generation components: an outliner, which advances the main story line to realize global coherence; a detailer, which supplies relevant details to the story in a locally coherent manner. Human evaluations show our model substantially improves the informativeness of generated text while retaining its coherence, outperforming various baselines.
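
A toy control loop for the two-component design sketched above (the generator functions are trivial stand-ins for the paper's neural components, not its method):

def detailer(event, i):
    # Stand-in detail generator: elaborate on every other plot event.
    return f"A detail about: {event}" if i % 2 == 0 else None

def generate_story(n_events):
    # Toy outliner/detailer loop: the outliner advances the main story
    # line; the detailer interleaves locally coherent details.
    story = []
    for i in range(n_events):
        event = f"Plot event {i + 1}."  # stand-in outliner output
        story.append(event)
        detail = detailer(event, i)
        if detail:
            story.append(detail)
    return story

print("\n".join(generate_story(3)))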

@inproceedings{zhai-etal-2020-story,
title = {Story Generation with Rich Details},
author = {Fangzhou Zhai and Vera Demberg and Alexander Koller},
url = {https://www.aclweb.org/anthology/2020.coling-main.212},
doi = {https://doi.org/10.18653/v1/2020.coling-main.212},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics (CoLing 2020)},
pages = {2346-2351},
publisher = {International Committee on Computational Linguistics},
address = {Barcelona, Spain (Online)},
abstract = {Automatically generated stories need to be not only coherent, but also interesting. Apart from realizing a story line, the text also needs to include rich details to engage the readers. We propose a model that features two different generation components: an outliner, which advances the main story line to realize global coherence; a detailer, which supplies relevant details to the story in a locally coherent manner. Human evaluations show our model substantially improves the informativeness of generated text while retaining its coherence, outperforming various baselines.},
pubstate = {published},
type = {inproceedings}
}

Project:   A3

Torabi Asr, Fatemeh; Demberg, Vera

Interpretation of Discourse Connectives Is Probabilistic: Evidence From the Study of But and Although Journal Article

Discourse Processes, 57, pp. 376-399, 2020.

Connectives can facilitate the processing of discourse relations by helping comprehenders to infer the intended coherence relation holding between two text spans. Previous experimental studies have focused on pairs of connectives that are very different from one another to be able to compare and formalize the distinguishing effects of these particles in discourse comprehension. In this article, we compare two connectives, but and although, which overlap in terms of the relations they can signal. We demonstrate in a set of carefully controlled studies that while a connective can be a marker of several discourse relations, it can have a specific fine-grained biasing effect on linguistic inferences and that this bias can be derived (or predicted) from the connectives’ distribution of relations found in production data. The effects that we find speak to the ambiguity of discourse connectives, in general, and the different functions of but and although, in particular. These effects cannot be explained within the earlier accounts of discourse connectives, which propose that each connective has a core meaning or processing instruction. Instead, we here lay out a probabilistic account of connective meaning and interpretation, which is based on the distribution of connectives in production and is supported by our experimental findings.
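
The distributional core of this account is easy to make concrete: estimate P(relation | connective) from production data and treat it as the comprehender's interpretation bias. All counts below are invented for illustration.

# Hypothetical production counts of relations signalled by each connective.
counts = {
    "but":      {"contrast": 700, "concession": 200, "correction": 100},
    "although": {"contrast": 250, "concession": 650, "correction": 100},
}

def relation_bias(connective):
    # Normalise counts into P(relation | connective); on the probabilistic
    # account, interpretation preferences track this distribution.
    total = sum(counts[connective].values())
    return {rel: n / total for rel, n in counts[connective].items()}

print(relation_bias("but"))       # biased towards contrast
print(relation_bias("although"))  # biased towards concession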

@article{torabi2020interpretation,
title = {Interpretation of Discourse Connectives Is Probabilistic: Evidence From the Study of But and Although},
author = {Fatemeh Torabi Asr and Vera Demberg},
url = {https://www.tandfonline.com/doi/full/10.1080/0163853X.2019.1700760},
doi = {https://doi.org/10.1080/0163853X.2019.1700760},
year = {2020},
date = {2020-01-27},
journal = {Discourse Processes},
pages = {376-399},
volume = {57},
number = {4},
abstract = {Connectives can facilitate the processing of discourse relations by helping comprehenders to infer the intended coherence relation holding between two text spans. Previous experimental studies have focused on pairs of connectives that are very different from one another to be able to compare and formalize the distinguishing effects of these particles in discourse comprehension. In this article, we compare two connectives, but and although, which overlap in terms of the relations they can signal. We demonstrate in a set of carefully controlled studies that while a connective can be a marker of several discourse relations, it can have a specific fine-grained biasing effect on linguistic inferences and that this bias can be derived (or predicted) from the connectives’ distribution of relations found in production data. The effects that we find speak to the ambiguity of discourse connectives, in general, and the different functions of but and although, in particular. These effects cannot be explained within the earlier accounts of discourse connectives, which propose that each connective has a core meaning or processing instruction. Instead, we here lay out a probabilistic account of connective meaning and interpretation, which is based on the distribution of connectives in production and is supported by our experimental findings.},
pubstate = {published},
type = {article}
}

Project:   B2

Mecklinger, Axel; Höltje, Gerrit; Ranker, Lika; Eschmann, Kathrin

Unexpected but plausible: The consequences of disconfirmed predictions for episodic memory formation Miscellaneous

CNS 2020 Virtual meeting, Abstract Book, pp. 53 (B79), 2020.

@miscellaneous{Mecklinger_etal2020,
title = {Unexpected but plausible: The consequences of disconfirmed predictions for episodic memory formation},
author = {Axel Mecklinger and Gerrit H{\"o}ltje and Lika Ranker and Kathrin Eschmann},
year = {2020},
date = {2020},
booktitle = {CNS 2020 Virtual meeting, Abstract Book},
pages = {53 (B79)},
pubstate = {published},
type = {miscellaneous}
}

Project:   A6

Chen, Yu; Avgustinova, Tania

Machine Translation from an Intercomprehension Perspective Inproceedings

Proc. Fourth Conference on Machine Translation (WMT), Volume 3: Shared Task Papers, pp. 192-196, Florence, Italy, 2019.

Within the first shared task on machine translation between similar languages, we present our first attempts at Czech-to-Polish machine translation from an intercomprehension perspective. We propose methods based on the mutual intelligibility of the two languages, taking advantage of their orthographic and phonological similarity, in the hope of improving over our baselines. The translation results are evaluated using BLEU. On this metric, none of our proposals could outperform the baselines on the final test set. The current setups are rather preliminary, and there are several potential improvements we can try in the future.
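
BLEU evaluation of the kind reported here can be reproduced with the sacrebleu library (the sentence pairs below are invented placeholders):

import sacrebleu

# Hypothetical Czech-to-Polish system outputs and Polish references.
hypotheses = ["ksiazka jest na stole", "widze kota"]
references = [["książka jest na stole", "widzę kota"]]  # one reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)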

@inproceedings{csplMT,
title = {Machine Translation from an Intercomprehension Perspective},
author = {Yu Chen and Tania Avgustinova},
url = {https://aclanthology.org/W19-5425},
doi = {https://doi.org/10.18653/v1/W19-5425},
year = {2019},
date = {2019},
booktitle = {Proc. Fourth Conference on Machine Translation (WMT), Volume 3: Shared Task Papers},
pages = {192-196},
address = {Florence, Italy},
abstract = {Within the first shared task on machine translation between similar languages, we present our first attempts at Czech-to-Polish machine translation from an intercomprehension perspective. We propose methods based on the mutual intelligibility of the two languages, taking advantage of their orthographic and phonological similarity, in the hope of improving over our baselines. The translation results are evaluated using BLEU. On this metric, none of our proposals could outperform the baselines on the final test set. The current setups are rather preliminary, and there are several potential improvements we can try in the future.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Ankener, Christine

The influence of visual information on word predictability and processing effort PhD Thesis

Saarland University, Saarbruecken, Germany, 2019.

A word’s predictability or surprisal in linguistic context, as determined by cloze probabilities or language models (e.g., Frank, 2013a), is related to processing effort, in that less expected words take more effort to process (e.g., Hale, 2001). This shows how, in purely linguistic contexts, rational approaches have been proven valid to predict and formalise results from language processing studies. However, the surprisal (or predictability) of a word may also be influenced by extra-linguistic factors, such as visual context information, as given in situated language processing. While, in the case of linguistic contexts, it is known that the incrementally processed information affects the mental model (e.g., Zwaan and Radvansky, 1998) at each word in a probabilistic way, no such observations have been made so far in the case of visual context information. Although it has been shown that in the visual world paradigm (VWP), anticipatory eye movements suggest that listeners exploit the scene to predict what will be mentioned next (Altmann and Kamide, 1999), it is so far unclear how visual information actually affects expectations for and processing effort of target words. If visual context effects on word processing effort can be observed, we hypothesise that rational concepts can be extended in order to formalise these effects, thereby making them statistically accessible for language models. In a series of experiments, I hence observe how visual information – which is inherently different from linguistic context, for instance in its non-incremental, at-once accessibility – affects target words. Our findings are a clear and robust demonstration that the non-linguistic context can immediately influence both lexical expectations and surprisal-based processing effort, as assessed by two different on-line measures of effort (a pupillary and an EEG one). Finally, I use surprisal to formalise the measured results and propose an extended formula to take visual information into account.
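
The proposed extension amounts to conditioning surprisal on the visual scene as well as the linguistic prefix; a loose numerical illustration (all probabilities invented, not the thesis's formula):

import math

def surprisal(p):
    # Processing-effort index in bits: -log2 of contextual probability.
    return -math.log2(p)

# Hypothetical P(word | linguistic context) vs. P(word | linguistic + visual).
p_linguistic = {"nail": 0.60, "hammer": 0.10, "banana": 0.05}
p_situated = {"nail": 0.45, "hammer": 0.45, "banana": 0.01}  # scene shows a hammer

for w in ("hammer", "banana"):
    print(w, surprisal(p_linguistic[w]), "->", surprisal(p_situated[w]))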

@phdthesis{Ankener_Diss_2019,
title = {The influence of visual information on word predictability and processing effort},
author = {Christine Ankener},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/27905},
doi = {https://doi.org/10.22028/D291-28451},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {A word’s predictability or surprisal in linguistic context, as determined by cloze probabilities or language models (e.g., Frank, 2013a), is related to processing effort, in that less expected words take more effort to process (e.g., Hale, 2001). This shows how, in purely linguistic contexts, rational approaches have been proven valid to predict and formalise results from language processing studies. However, the surprisal (or predictability) of a word may also be influenced by extra-linguistic factors, such as visual context information, as given in situated language processing. While, in the case of linguistic contexts, it is known that the incrementally processed information affects the mental model (e.g., Zwaan and Radvansky, 1998) at each word in a probabilistic way, no such observations have been made so far in the case of visual context information. Although it has been shown that in the visual world paradigm (VWP), anticipatory eye movements suggest that listeners exploit the scene to predict what will be mentioned next (Altmann and Kamide, 1999), it is so far unclear how visual information actually affects expectations for and processing effort of target words. If visual context effects on word processing effort can be observed, we hypothesise that rational concepts can be extended in order to formalise these effects, thereby making them statistically accessible for language models. In a series of experiments, I hence observe how visual information – which is inherently different from linguistic context, for instance in its non-incremental, at-once accessibility – affects target words. Our findings are a clear and robust demonstration that the non-linguistic context can immediately influence both lexical expectations and surprisal-based processing effort, as assessed by two different on-line measures of effort (a pupillary and an EEG one). Finally, I use surprisal to formalise the measured results and propose an extended formula to take visual information into account.},
pubstate = {published},
type = {phdthesis}
}

Project:   A5

Jágrová, Klára

Reading Polish with Czech Eyes: Distance and Surprisal in Quantitative, Qualitative, and Error Analyses of Intelligibility PhD Thesis

Saarland University, Saarbruecken, Germany, 2019.

In CHAPTER I, I first introduce the thesis in the context of the project workflow in section 1. I then summarise the methods and findings from the project publications about the languages in focus. There I also introduce the relevant concepts and terminology viewed in the literature as possible predictors of intercomprehension and processing difficulty. CHAPTER II presents a quantitative (section 4) and a qualitative (section 5) analysis of the results of the cooperative translation experiments. The focus of this thesis – the language pair PL-CS – is explained and the hypotheses are introduced in section 6. The experiment website is introduced in section 7 with an overview of the participants, the different experiments conducted, and the sections in which they are discussed. In CHAPTER IV, free translation experiments are discussed in which two different sets of individual word stimuli were presented to Czech readers: (i) cognates that are transformable with regular PL-CS correspondences (section 12) and (ii) the 100 most frequent PL nouns (section 13). CHAPTER V presents the findings of experiments in which PL NPs in two different linearisation conditions were presented to Czech readers (sections 14.1-14.6). A short digression is made when I turn to experiments with PL internationalisms which were presented to German readers (section 14.7). CHAPTER VI discusses the methods and results of cloze translation experiments with highly predictable target words in sentential context (section 15) and random context with sentences from the cooperative translation experiments (section 16). A final synthesis of the findings, together with an outlook, is provided in CHAPTER VII.

@phdthesis{Jagrova_Diss_2019,
title = {Reading Polish with Czech Eyes: Distance and Surprisal in Quantitative, Qualitative, and Error Analyses of Intelligibility},
author = {Kl{\'a}ra J{\'a}grov{\'a}},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32995},
doi = {https://doi.org/10.22028/D291-32708},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {In CHAPTER I, I first introduce the thesis in the context of the project workflow in section 1. I then summarise the methods and findings from the project publications about the languages in focus. There I also introduce the relevant concepts and terminology viewed in the literature as possible predictors of intercomprehension and processing difficulty. CHAPTER II presents a quantitative (section 4) and a qualitative (section 5) analysis of the results of the cooperative translation experiments. The focus of this thesis – the language pair PL-CS – is explained and the hypotheses are introduced in section 6. The experiment website is introduced in section 7 with an overview of the participants, the different experiments conducted, and the sections in which they are discussed. In CHAPTER IV, free translation experiments are discussed in which two different sets of individual word stimuli were presented to Czech readers: (i) cognates that are transformable with regular PL-CS correspondences (section 12) and (ii) the 100 most frequent PL nouns (section 13). CHAPTER V presents the findings of experiments in which PL NPs in two different linearisation conditions were presented to Czech readers (sections 14.1-14.6). A short digression is made when I turn to experiments with PL internationalisms which were presented to German readers (section 14.7). CHAPTER VI discusses the methods and results of cloze translation experiments with highly predictable target words in sentential context (section 15) and random context with sentences from the cooperative translation experiments (section 16). A final synthesis of the findings, together with an outlook, is provided in CHAPTER VII.},
pubstate = {published},
type = {phdthesis}
}


Project:   C4
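
To make the notion of "regular PL-CS correspondences" from the abstract above concrete: these are recurrent orthographic (and phonological) mappings between Polish and Czech cognates. The following is a minimal illustrative sketch in Python; the rule inventory and word list are a small hand-picked sample chosen for exposition, not the stimulus set used in the thesis.

    # Minimal sketch: applying a hand-picked sample of regular Polish-Czech
    # orthographic correspondences to candidate cognates. The rules and words
    # below are illustrative only, not the thesis's actual inventory.
    CORRESPONDENCES = [
        ("rz", "ř"),   # PL rzeka -> CS řeka  'river'
        ("w",  "v"),   # PL woda  -> CS voda  'water'
        ("g",  "h"),   # PL noga  -> CS noha  'leg'
        ("ó",  "ů"),   # PL sól   -> CS sůl   'salt'
    ]

    def transform(pl_word: str) -> str:
        """Apply each correspondence left to right; rule order matters in general."""
        cs_word = pl_word
        for pl, cs in CORRESPONDENCES:
            cs_word = cs_word.replace(pl, cs)
        return cs_word

    for w in ["rzeka", "woda", "noga", "sól"]:
        print(f"{w} -> {transform(w)}")
    # rzeka -> řeka, woda -> voda, noga -> noha, sól -> sůl

In practice such rules have contextual restrictions and their ordering matters; naive global substitution over-applies them to non-cognates, which is exactly why transformability is a graded predictor of intercomprehension rather than a guarantee.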

Venhuizen, Noortje; Crocker, Matthew W.; Brouwer, Harm

Semantic Entropy in Language Comprehension Journal Article

Entropy, 21, pp. 1159, 2019.

Language is processed on a more or less word-by-word basis, and the processing difficulty induced by each word is affected by our prior linguistic experience as well as our general knowledge about the world. Surprisal and entropy reduction have been independently proposed as linking theories between word processing difficulty and probabilistic language models. Extant models, however, are typically limited to capturing linguistic experience and hence cannot account for the influence of world knowledge. A recent comprehension model by Venhuizen, Crocker, and Brouwer (2019, Discourse Processes) improves upon this situation by instantiating a comprehension-centric metric of surprisal that integrates linguistic experience and world knowledge at the level of interpretation and combines them in determining online expectations. Here, we extend this work by deriving a comprehension-centric metric of entropy reduction from this model. In contrast to previous work, which has found that surprisal and entropy reduction are not easily dissociated, we do find a clear dissociation in our model. While both surprisal and entropy reduction derive from the same cognitive process – the word-by-word updating of the unfolding interpretation – they reflect different aspects of this process: state-by-state expectation (surprisal) versus end-state confirmation (entropy reduction).
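
Surprisal and entropy reduction are both standard information-theoretic quantities; the paper derives comprehension-centric versions defined over model-internal meaning representations. The toy sketch below, with made-up probabilities rather than the paper's model, only illustrates how the two quantities are computed and why they can come apart: surprisal depends on the probability of the incoming word alone, while entropy reduction depends on how that word reshapes the whole distribution over outcomes.

    import math

    def surprisal(p: float) -> float:
        """Surprisal in bits: -log2 P(word | context)."""
        return -math.log2(p)

    def entropy(dist) -> float:
        """Shannon entropy in bits over a distribution of possible outcomes."""
        return -sum(p * math.log2(p) for p in dist if p > 0)

    # Toy example: distribution over four possible end states of the
    # unfolding interpretation, before and after processing one word.
    before = [0.4, 0.3, 0.2, 0.1]
    after  = [0.7, 0.2, 0.1]       # the word ruled out one state

    er = max(0.0, entropy(before) - entropy(after))
    print(f"surprisal of the word: {surprisal(0.25):.2f} bits")  # if P(word|ctx) = 0.25
    print(f"entropy reduction:     {er:.2f} bits")

A word can be individually unexpected (high surprisal) while barely changing the uncertainty over end states (low entropy reduction), and vice versa, which is the dissociation the paper reports.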

@article{Venhuizen2019,
title = {Semantic Entropy in Language Comprehension},
author = {Noortje Venhuizen and Matthew W. Crocker and Harm Brouwer},
url = {https://www.mdpi.com/1099-4300/21/12/1159},
doi = {https://doi.org/10.3390/e21121159},
year = {2019},
date = {2019-11-27},
journal = {Entropy},
pages = {1159},
volume = {21},
number = {12},
abstract = {Language is processed on a more or less word-by-word basis, and the processing difficulty induced by each word is affected by our prior linguistic experience as well as our general knowledge about the world. Surprisal and entropy reduction have been independently proposed as linking theories between word processing difficulty and probabilistic language models. Extant models, however, are typically limited to capturing linguistic experience and hence cannot account for the influence of world knowledge. A recent comprehension model by Venhuizen, Crocker, and Brouwer (2019, Discourse Processes) improves upon this situation by instantiating a comprehension-centric metric of surprisal that integrates linguistic experience and world knowledge at the level of interpretation and combines them in determining online expectations. Here, we extend this work by deriving a comprehension-centric metric of entropy reduction from this model. In contrast to previous work, which has found that surprisal and entropy reduction are not easily dissociated, we do find a clear dissociation in our model. While both surprisal and entropy reduction derive from the same cognitive process - the word-by-word updating of the unfolding interpretation - they reflect different aspects of this process: state-by-state expectation (surprisal) versus end-state confirmation (entropy reduction).},
pubstate = {published},
type = {article}
}


Project:   A1

Shi, Wei; Demberg, Vera

Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains Inproceedings

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, pp. 5789-5795, Hong Kong, China, 2019.

Implicit discourse relation classification is one of the most difficult tasks in discourse parsing. Previous studies have generally focused on extracting better representations of the relational arguments. In order to solve the task, however, it is additionally necessary to capture what events are expected to cause or follow each other. Current discourse relation classifiers fall short in this respect. We here show that this shortcoming can be effectively addressed by using the bidirectional encoder representations from transformers (BERT) proposed by Devlin et al. (2019), which were trained on a next sentence prediction task and thus encode a representation of likely next sentences. The BERT-based model outperforms the current state of the art in 11-way classification by 8% points on the standard PDTB dataset. Our experiments also demonstrate that the model can be successfully ported to other domains: on the BioDRB dataset, the model outperforms the state-of-the-art system by around 15% points.
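
At its core, the modelling recipe the abstract describes is sentence-pair classification: the two relational arguments are given to BERT as segments A and B, so the encoder can exploit what its next-sentence-prediction pre-training learned about likely sentence successions. The sketch below, using the Hugging Face transformers library, shows only this general recipe; the checkpoint, label set, and example arguments are placeholder assumptions, not the authors' released code or data.

    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    # Sketch of the general recipe (not the paper's released code): the two
    # relational arguments are encoded as a BERT sentence pair.
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=11)  # 11-way PDTB-style classification

    arg1 = "The company reported record profits."   # hypothetical Arg1
    arg2 = "Its stock price rose sharply."          # hypothetical Arg2

    inputs = tokenizer(arg1, arg2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted = logits.argmax(dim=-1).item()
    print(f"predicted relation id: {predicted}")

Note that the classification head is randomly initialised here; the model would need fine-tuning on labelled PDTB (or BioDRB) argument pairs before its predictions are meaningful.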

@inproceedings{shi-demberg-2019-next,
title = {Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains},
author = {Wei Shi and Vera Demberg},
url = {https://www.aclweb.org/anthology/D19-1586},
doi = {https://doi.org/10.18653/v1/D19-1586},
year = {2019},
date = {2019-11-03},
booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
pages = {5789-5795},
publisher = {Association for Computational Linguistics},
address = {Hong Kong, China},
abstract = {Implicit discourse relation classification is one of the most difficult tasks in discourse parsing. Previous studies have generally focused on extracting better representations of the relational arguments. In order to solve the task, however, it is additionally necessary to capture what events are expected to cause or follow each other. Current discourse relation classifiers fall short in this respect. We here show that this shortcoming can be effectively addressed by using the bidirectional encoder representations from transformers (BERT) proposed by Devlin et al. (2019), which were trained on a next sentence prediction task and thus encode a representation of likely next sentences. The BERT-based model outperforms the current state of the art in 11-way classification by 8% points on the standard PDTB dataset. Our experiments also demonstrate that the model can be successfully ported to other domains: on the BioDRB dataset, the model outperforms the state-of-the-art system by around 15% points.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2
