Publications

Brandt, Erika; Andreeva, Bistra; Möbius, Bernd

Information density and vowel dispersion in the productions of Bulgarian L2 speakers of German Inproceedings

Proceedings of the 19th International Congress of Phonetic Sciences, pp. 3165-3169, Melbourne, Australia, 2019.

We investigated the influence of information density (ID) on vowel space size in L2. Vowel dispersion was measured for the stressed tense vowels /i:, o:, a:/ and their lax counterparts /I, O, a/ in read speech from six German speakers, six advanced and six intermediate Bulgarian speakers of German. The Euclidean distance between the center of the vowel space and the formant values for each speaker was used as a measure of vowel dispersion. ID was calculated as the surprisal of the triphone given the preceding context. We found a significant positive correlation between surprisal and vowel dispersion in German native speakers. The advanced L2 speakers showed a significant positive relationship between these two measures, while this was not observed in intermediate L2 vowel productions. The intermediate speakers raised their vowel space, reflecting native Bulgarian vowel raising in unstressed positions.

@inproceedings{Brandt2019,
title = {Information density and vowel dispersion in the productions of Bulgarian L2 speakers of German},
author = {Erika Brandt and Bistra Andreeva and Bernd M{\"o}bius},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/29548},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 19th International Congress of Phonetic Sciences},
pages = {3165-3169},
address = {Melbourne, Australia},
abstract = {We investigated the influence of information density (ID) on vowel space size in L2. Vowel dispersion was measured for the stressed tense vowels /i:, o:, a:/ and their lax counterparts /I, O, a/ in read speech from six German speakers, six advanced and six intermediate Bulgarian speakers of German. The Euclidean distance between the center of the vowel space and the formant values for each speaker was used as a measure of vowel dispersion. ID was calculated as the surprisal of the triphone given the preceding context. We found a significant positive correlation between surprisal and vowel dispersion in German native speakers. The advanced L2 speakers showed a significant positive relationship between these two measures, while this was not observed in intermediate L2 vowel productions. The intermediate speakers raised their vowel space, reflecting native Bulgarian vowel raising in unstressed positions.},
pubstate = {published},
type = {inproceedings}
}


Project:   C1

van Genabith, Josef; España-Bonet, Cristina; Lapshinova-Koltunski, Ekaterina

Analysing Coreference in Transformer Outputs Inproceedings

Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019), Association for Computational Linguistics, pp. 1-12, Hong Kong, China, 2019.

We analyse coreference phenomena in three neural machine translation systems trained with different data settings with or without access to explicit intra- and cross-sentential anaphoric information. We compare system performance on two different genres: news and TED talks. To do this, we manually annotate (the possibly incorrect) coreference chains in the MT outputs and evaluate the coreference chain translations. We define an error typology that aims to go further than pronoun translation adequacy and includes types such as incorrect word selection or missing words. The features of coreference chains in automatic translations are also compared to those of the source texts and human translations. The analysis shows stronger potential translationese effects in machine translated outputs than in human translations.

@inproceedings{lapshinovaEtal:2019iscoMT,
title = {Analysing Coreference in Transformer Outputs},
author = {Josef van Genabith and Cristina Espa{\~n}a-Bonet and Ekaterina Lapshinova-Koltunski},
url = {https://www.aclweb.org/anthology/D19-6501},
doi = {https://doi.org/10.18653/v1/D19-6501},
year = {2019},
date = {2019},
booktitle = {Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019)},
pages = {1-12},
publisher = {Association for Computational Linguistics},
address = {Hong Kong, China},
abstract = {We analyse coreference phenomena in three neural machine translation systems trained with different data settings with or without access to explicit intra- and cross-sentential anaphoric information. We compare system performance on two different genres: news and TED talks. To do this, we manually annotate (the possibly incorrect) coreference chains in the MT outputs and evaluate the coreference chain translations. We define an error typology that aims to go further than pronoun translation adequacy and includes types such as incorrect word selection or missing words. The features of coreference chains in automatic translations are also compared to those of the source texts and human translations. The analysis shows stronger potential translationese effects in machine translated outputs than in human translations.},
pubstate = {published},
type = {inproceedings}
}


Project:   B6

Biswas, Rajarshi; Mogadala, Aditya; Barz, Michael; Sonntag, Daniel; Klakow, Dietrich

Automatic Judgement of Neural Network-Generated Image Captions Inproceedings

7th International Conference on Statistical Language and Speech Processing (SLSP2019), vol. 11816, Ljubljana, Slovenia, 2019.

Manual evaluation of individual results of natural language generation tasks is one of the bottlenecks. It is very time consuming and expensive if it is, for example, crowdsourced. In this work, we address this problem for the specific task of automatic image captioning. We automatically generate human-like judgements on grammatical correctness, image relevance and diversity of the captions obtained from a neural image caption generator. For this purpose, we use pool-based active learning with uncertainty sampling and represent the captions using fixed size vectors from Google’s Universal Sentence Encoder. In addition, we test common metrics, such as BLEU, ROUGE, METEOR, Levenshtein distance, and n-gram counts and report F1 score for the classifiers used under the active learning scheme for this task. To the best of our knowledge, our work is the first in this direction and promises to reduce time, cost, and human effort.

@inproceedings{Biswas2019,
title = {Automatic Judgement of Neural Network-Generated Image Captions},
author = {Rajarshi Biswas and Aditya Mogadala and Michael Barz and Daniel Sonntag and Dietrich Klakow},
url = {https://link.springer.com/chapter/10.1007/978-3-030-31372-2_22},
year = {2019},
date = {2019},
booktitle = {7th International Conference on Statistical Language and Speech Processing (SLSP2019)},
address = {Ljubljana, Slovenia},
abstract = {Manual evaluation of individual results of natural language generation tasks is one of the bottlenecks. It is very time consuming and expensive if it is, for example, crowdsourced. In this work, we address this problem for the specific task of automatic image captioning. We automatically generate human-like judgements on grammatical correctness, image relevance and diversity of the captions obtained from a neural image caption generator. For this purpose, we use pool-based active learning with uncertainty sampling and represent the captions using fixed size vectors from Google’s Universal Sentence Encoder. In addition, we test common metrics, such as BLEU, ROUGE, METEOR, Levenshtein distance, and n-gram counts and report F1 score for the classifiers used under the active learning scheme for this task. To the best of our knowledge, our work is the first in this direction and promises to reduce time, cost, and human effort.},
pubstate = {published},
type = {inproceedings}
}


Project:   B4

Lange, Lukas; Hedderich, Michael; Klakow, Dietrich

Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels Inproceedings

Inui, Kentaro; Jiang, Jing; Ng, Vincent; Wan, Xiaojun (Ed.): Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, pp. 3552-3557, Hong Kong, China, 2019.

In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy. Previous works have shown that significant improvements can be reached by injecting information about the confusion between clean and noisy labels in this additional training data into the classifier training. However, for noise estimation, these approaches either do not take the input features (in our case word embeddings) into account, or they need to learn the noise modeling from scratch which can be difficult in a low-resource setting. We propose to cluster the training data using the input features and then compute different confusion matrices for each cluster. To the best of our knowledge, our approach is the first to leverage feature-dependent noise modeling with pre-initialized confusion matrices. We evaluate on low-resource named entity recognition settings in several languages, showing that our methods improve upon other confusion-matrix based methods by up to 9%.

@inproceedings{lange-etal-2019-feature,
title = {Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels},
author = {Lukas Lange and Michael Hedderich and Dietrich Klakow},
editor = {Kentaro Inui and Jing Jiang and Vincent Ng and Xiaojun Wan},
url = {https://aclanthology.org/D19-1362/},
doi = {https://doi.org/10.18653/v1/D19-1362},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
pages = {3552-3557},
publisher = {Association for Computational Linguistics},
address = {Hong Kong, China},
abstract = {In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy. Previous works have shown that significant improvements can be reached by injecting information about the confusion between clean and noisy labels in this additional training data into the classifier training. However, for noise estimation, these approaches either do not take the input features (in our case word embeddings) into account, or they need to learn the noise modeling from scratch which can be difficult in a low-resource setting. We propose to cluster the training data using the input features and then compute different confusion matrices for each cluster. To the best of our knowledge, our approach is the first to leverage feature-dependent noise modeling with pre-initialized confusion matrices. We evaluate on low-resource named entity recognition settings in several languages, showing that our methods improve upon other confusion-matrix based methods by up to 9%.},
pubstate = {published},
type = {inproceedings}
}


Project:   B4

Reich, Ingo

Saulecker und supergemütlich! Pilotstudien zur fragmentarischen Verwendung expressiver Adjektive. Incollection

d'Avis, Franz; Finkbeiner, Rita (Ed.): Expressivität im Deutschen, De Gruyter, pp. 109-128, Berlin, Boston, 2019.

Schaut man auf dem Kika die „Jungs-WG“ oder „Durch die Wildnis“, dann ist gefühlt jede dritte Äußerung eine isolierte Verwendung eines expressiven Adjektivs der Art „Mega!“. Ausgehend von dieser ersten impressionistischen Beobachtung wird in diesem Artikel sowohl korpuslinguistisch wie auch experimentell der Hypothese nachgegangen, dass expressive Adjektive in fragmentarischer Verwendung signifikant akzeptabler sind als deskriptive Adjektive. Während sich diese Hypothese im Korpus zunächst weitgehend bestätigt, zeigen die experimentellen Untersuchungen zwar, dass expressive Äußerungen generell besser bewertet werden als deskriptive Äußerungen, die ursprüngliche Hypothese lässt sich aber nicht bestätigen. Die Diskrepanz zwischen den korpuslinguistischen und den experimentellen Ergebnissen wird in der Folge auf eine Unterscheidung zwischen individuenbezogenen und äußerungsbezogenen (expressiven) Adjektiven zurückgeführt und festgestellt, dass die Korpusergebnisse die Verteilung äußerungsbezogener expressiver Adjektive nachzeichnen, während sich die Experimente alleine auf individuenbezogene (expressive) Adjektive beziehen. Die ursprüngliche Hypothese wäre daher in dem Sinne zu qualifizieren, dass sie nur Aussagen über die isolierte Verwendung äußerungsbezogener Adjektive macht.

@incollection{Reich2019,
title = {Saulecker und supergem{\"u}tlich! Pilotstudien zur fragmentarischen Verwendung expressiver Adjektive.},
author = {Ingo Reich},
editor = {Franz d'Avis and Rita Finkbeiner},
url = {https://www.degruyter.com/document/doi/10.1515/9783110630190-005/html},
doi = {https://doi.org/10.1515/9783110630190-005},
year = {2019},
date = {2019},
booktitle = {Expressivit{\"a}t im Deutschen},
pages = {109-128},
publisher = {De Gruyter},
address = {Berlin, Boston},
abstract = {Schaut man auf dem Kika die „Jungs-WG“ oder „Durch die Wildnis“, dann ist gef{\"u}hlt jede dritte {\"A}u{\ss}erung eine isolierte Verwendung eines expressiven Adjektivs der Art „Mega!“. Ausgehend von dieser ersten impressionistischen Beobachtung wird in diesem Artikel sowohl korpuslinguistisch wie auch experimentell der Hypothese nachgegangen, dass expressive Adjektive in fragmentarischer Verwendung signifikant akzeptabler sind als deskriptive Adjektive. W{\"a}hrend sich diese Hypothese im Korpus zun{\"a}chst weitgehend best{\"a}tigt, zeigen die experimentellen Untersuchungen zwar, dass expressive {\"A}u{\ss}erungen generell besser bewertet werden als deskriptive {\"A}u{\ss}erungen, die urspr{\"u}ngliche Hypothese l{\"a}sst sich aber nicht best{\"a}tigen. Die Diskrepanz zwischen den korpuslinguistischen und den experimentellen Ergebnissen wird in der Folge auf eine Unterscheidung zwischen individuenbezogenen und {\"a}u{\ss}erungsbezogenen (expressiven) Adjektiven zur{\"u}ckgef{\"u}hrt und festgestellt, dass die Korpusergebnisse die Verteilung {\"a}u{\ss}erungsbezogener expressiver Adjektive nachzeichnen, w{\"a}hrend sich die Experimente alleine auf individuenbezogene (expressive) Adjektive beziehen. Die urspr{\"u}ngliche Hypothese w{\"a}re daher in dem Sinne zu qualifizieren, dass sie nur Aussagen {\"u}ber die isolierte Verwendung {\"a}u{\ss}erungsbezogener Adjektive macht.},
pubstate = {published},
type = {incollection}
}


Project:   B3

Scholman, Merel

Coherence relations in discourse and cognition: comparing approaches, annotations, and interpretations PhD Thesis

Saarland University, Saarbrücken, Germany, 2019.

When readers comprehend a discourse, they do not merely interpret each clause or sentence separately; rather, they assign meaning to the text by creating semantic links between the clauses and sentences. These links are known as coherence relations (cf. Hobbs, 1979; Sanders, Spooren & Noordman, 1992). If readers are not able to construct such relations between the clauses and sentences of a text, they will fail to fully understand that text. Discourse coherence is therefore crucial to natural language comprehension in general. Most frameworks that propose inventories of coherence relation types agree on the existence of certain coarse-grained relation types, such as causal relations (relation types belonging to the causal class include Cause or Result relations), and additive relations (e.g., Conjunctions or Specifications). However, researchers often disagree on which finer-grained relation types hold and, as a result, there is no uniform set of relations that the community has agreed on (Hovy & Maier, 1995). Using a combination of corpus-based studies and off-line and on-line experimental methods, the studies reported in this dissertation examine distinctions between types of relations. The studies are based on the argument that coherence relations are cognitive entities, and distinctions of coherence relation types should therefore be validated using observations that speak to both the descriptive adequacy and the cognitive plausibility of the distinctions. Various distinctions between relation types are investigated on several levels, corresponding to the central challenges of the thesis. First, the distinctions that are made in approaches to coherence relations are analysed by comparing the relational classes and assessing the theoretical correspondences between the proposals. An interlingua is developed that can be used to map relational labels from one approach to another, therefore improving the interoperability between the different approaches. Second, practical correspondences between different approaches are studied by evaluating datasets containing coherence relation annotations from multiple approaches. A comparison of the annotations from different approaches on the same data corroborates the interlingua, but also reveals systematic patterns of discrepancies between the frameworks that are caused by different operationalizations. Finally, in the experimental part of the dissertation, readers’ interpretations are investigated to determine whether readers are able to distinguish between specific types of relations that cause the discrepancies between approaches. Results from off-line and on-line studies provide insight into readers’ interpretations of multi-interpretable relations, individual differences in interpretations, anticipation of discourse structure, and distributional differences between languages on readers’ processing of discourse. In sum, the studies reported in this dissertation contribute to a more detailed understanding of which types of relations comprehenders construct and how these relations are inferred and processed.

@phdthesis{Scholman_diss_2019,
title = {Coherence relations in discourse and cognition: comparing approaches, annotations, and interpretations},
author = {Merel Scholman},
url = {http://nbn-resolving.de/urn:nbn:de:bsz:291--ds-278687},
doi = {https://doi.org/10.22028/D291-27868},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbr{\"u}cken, Germany},
abstract = {When readers comprehend a discourse, they do not merely interpret each clause or sentence separately; rather, they assign meaning to the text by creating semantic links between the clauses and sentences. These links are known as coherence relations (cf. Hobbs, 1979; Sanders, Spooren & Noordman, 1992). If readers are not able to construct such relations between the clauses and sentences of a text, they will fail to fully understand that text. Discourse coherence is therefore crucial to natural language comprehension in general. Most frameworks that propose inventories of coherence relation types agree on the existence of certain coarse-grained relation types, such as causal relations (relation types belonging to the causal class include Cause or Result relations), and additive relations (e.g., Conjunctions or Specifications). However, researchers often disagree on which finer-grained relation types hold and, as a result, there is no uniform set of relations that the community has agreed on (Hovy & Maier, 1995). Using a combination of corpus-based studies and off-line and on-line experimental methods, the studies reported in this dissertation examine distinctions between types of relations. The studies are based on the argument that coherence relations are cognitive entities, and distinctions of coherence relation types should therefore be validated using observations that speak to both the descriptive adequacy and the cognitive plausibility of the distinctions. Various distinctions between relation types are investigated on several levels, corresponding to the central challenges of the thesis. First, the distinctions that are made in approaches to coherence relations are analysed by comparing the relational classes and assessing the theoretical correspondences between the proposals. An interlingua is developed that can be used to map relational labels from one approach to another, therefore improving the interoperability between the different approaches. Second, practical correspondences between different approaches are studied by evaluating datasets containing coherence relation annotations from multiple approaches. A comparison of the annotations from different approaches on the same data corroborates the interlingua, but also reveals systematic patterns of discrepancies between the frameworks that are caused by different operationalizations. Finally, in the experimental part of the dissertation, readers’ interpretations are investigated to determine whether readers are able to distinguish between specific types of relations that cause the discrepancies between approaches. Results from off-line and on-line studies provide insight into readers’ interpretations of multi-interpretable relations, individual differences in interpretations, anticipation of discourse structure, and distributional differences between languages on readers’ processing of discourse. In sum, the studies reported in this dissertation contribute to a more detailed understanding of which types of relations comprehenders construct and how these relations are inferred and processed.},
pubstate = {published},
type = {phdthesis}
}


Project:   B2

Juzek, Tom; Fischer, Stefan; Krielke, Marie-Pauline; Degaetano-Ortlieb, Stefania; Teich, Elke

Challenges of parsing a historical corpus of Scientific English Miscellaneous

Historical Corpora and Variation (Book of Abstracts), Cagliari, Italy, 2019.

In this contribution, we outline our experiences with syntactically parsing a diachronic historical corpus. We report on how errors like OCR inaccuracies, end-of-sentence inaccuracies, etc. propagate bottom-up and how we approach such errors by building on existing machine learning approaches for error correction. The Royal Society Corpus (RSC; Kermes et al. 2016) is a collection of scientific text from 1665 to 1869 and contains ca. 10 000 documents and 30 million tokens. Using the RSC, we wish to describe and model how syntactic complexity changes as Scientific English of the late modern period develops. Our focus is on how common measures of syntactic complexity, e.g. length in tokens, embedding depth, and number of dependants, relate to estimates of information content. Our hypothesis is that Scientific English develops towards the use of shorter sentences with fewer clausal embeddings and increasingly complex noun phrases over time, in order to accommodate an expansion on the lexical level.

@miscellaneous{Juzek2019a,
title = {Challenges of parsing a historical corpus of Scientific English},
author = {Tom Juzek and Stefan Fischer and Marie-Pauline Krielke and Stefania Degaetano-Ortlieb and Elke Teich},
url = {https://convegni.unica.it/hicov/files/2019/01/Juzek-et-al.pdf},
year = {2019},
date = {2019},
booktitle = {Historical Corpora and Variation (Book of Abstracts)},
address = {Cagliari, Italy},
abstract = {In this contribution, we outline our experiences with syntactically parsing a diachronic historical corpus. We report on how errors like OCR inaccuracies, end-of-sentence inaccuracies, etc. propagate bottom-up and how we approach such errors by building on existing machine learning approaches for error correction. The Royal Society Corpus (RSC; Kermes et al. 2016) is a collection of scientific text from 1665 to 1869 and contains ca. 10 000 documents and 30 million tokens. Using the RSC, we wish to describe and model how syntactic complexity changes as Scientific English of the late modern period develops. Our focus is on how common measures of syntactic complexity, e.g. length in tokens, embedding depth, and number of dependants, relate to estimates of information content. Our hypothesis is that Scientific English develops towards the use of shorter sentences with fewer clausal embeddings and increasingly complex noun phrases over time, in order to accommodate an expansion on the lexical level.},
pubstate = {published},
type = {miscellaneous}
}


Project:   B1

Juzek, Tom; Fischer, Stefan; Krielke, Marie-Pauline; Degaetano-Ortlieb, Stefania; Teich, Elke

Annotation quality assessment and error correction in diachronic corpora: Combining pattern-based and machine learning approaches Miscellaneous

52nd Annual Meeting of the Societas Linguistica Europaea (Book of Abstracts), 2019.

@miscellaneous{Juzek2019,
title = {Annotation quality assessment and error correction in diachronic corpora: Combining pattern-based and machine learning approaches},
author = {Tom Juzek and Stefan Fischer and Marie-Pauline Krielke and Stefania Degaetano-Ortlieb and Elke Teich},
year = {2019},
date = {2019},
booktitle = {52nd Annual Meeting of the Societas Linguistica Europaea (Book of Abstracts)},
pubstate = {published},
type = {miscellaneous}
}


Project:   B1

Degaetano-Ortlieb, Stefania; Menzel, Katrin; Teich, Elke

Typical linguistic patterns of English history texts from the eighteenth to the nineteenth century Book Chapter

Moskowich, Isabel; Crespo, Begoña; Puente-Castelo, Luis; Monaco, Leida Maria (Ed.): Writing History in Late Modern English: Explorations of the Coruña Corpus, John Benjamins, pp. 58-81, Amsterdam, 2019.

@inbook{Degaetano-Ortlieb2019b,
title = {Typical linguistic patterns of English history texts from the eighteenth to the nineteenth century},
author = {Stefania Degaetano-Ortlieb and Katrin Menzel and Elke Teich},
editor = {Isabel Moskowich and Bego{\~n}a Crespo and Luis Puente-Castelo and Leida Maria Monaco},
url = {https://benjamins.com/catalog/z.225.04deg},
year = {2019},
date = {2019},
booktitle = {Writing History in Late Modern English: Explorations of the Coru{\~n}a Corpus},
pages = {58-81},
publisher = {John Benjamins},
address = {Amsterdam},
pubstate = {published},
type = {inbook}
}


Project:   B1

Krielke, Marie-Pauline; Fischer, Stefan; Degaetano-Ortlieb, Stefania; Teich, Elke

System and use of wh-relativizers in 200 years of English scientific writing Miscellaneous

10th International Corpus Linguistics Conference, Cardiff, Wales, UK, 2019.

We investigate the diachronic development of wh-relativizers in English scientific writing in the late modern period, characterized by an initially richly populated paradigm in the late 17th/early 18th century and a reduction to only a few options by the mid 19th century. To explain this reduction, we take the perspective of rational communication, according to which language users, while striving for successful communication, seek to reduce their effort. Previous work has shown that production effort is directly linked to the number of options at a given choice point (Milin et al. 2009, Linzen and Jaeger 2016). This effort is appropriately indexed by entropy: The more options with equal/similar probability, the higher the entropy, i.e. the higher the production effort. Similarly, processing effort is correlated with predictability in context – surprisal (Levy 2008). Highly predictable, conventionalized patterns are easier to produce and comprehend than less predictable ones. Assuming that language users strive for ease in communication, diachronically they are likely to (a) develop a preference for which options to use and discard others to reduce entropy, and (b) converge on how to use those options to reduce surprisal. We test this for the changing use of wh-relativizers in scientific text in the late modern period. Many scholars have investigated variation in relativizer choice in standard spoken and written varieties (e.g. Guy and Bayley 1995; Biber et al. 1999; Lehmann 2001; Hinrichs et al. 2015), in vernacular speech (e.g. Romaine 1982, Tottie and Harvie 2000; Tagliamonte 2002; Tagliamonte et al. 2005; Levey 2006), and from synchronic and diachronic perspectives (e.g. Romaine 1980; Ball 1996; Hundt et al. 2012; Nevalainen 2012, Nevalainen and Raumolin-Brunberg 2002). While stylistic variability of the different options in written present day English is well known (see Biber et al. 1999; Leech et al. 2009), we know little about the diachronic development of relativizers according to register, e.g. in scientific writing. Also, most research only considers the most common relativizers (e.g. which, that, zero) still in use in present day English. Here, we study a more comprehensive set of relativizers across scientific and “general language” (mix of registers) from a diachronic perspective. Possible paradigmatic change is analyzed by diachronic word embeddings (cf. Fankhauser and Kupietz 2017), allowing us to select items affected by change. Then we assess the change (reduction/expansion) of a paradigm estimating its entropy over time. To check whether changes are specific to scientific language, we compare with uses in general language. Finally, we inspect possible changes in the predictability of selected wh-relativizers involved in paradigmatic change estimating their surprisal over time, looking for traces of conventionalization (cf. Degaetano-Ortlieb and Teich 2016, 2018).

@miscellaneous{Krielke2019b,
title = {System and use of wh-relativizers in 200 years of English scientific writing},
author = {Marie-Pauline Krielke and Stefan Fischer and Stefania Degaetano-Ortlieb and Elke Teich},
url = {https://stefaniadegaetano.files.wordpress.com/2019/05/cl2019_paper_266.pdf},
year = {2019},
date = {2019},
booktitle = {10th International Corpus Linguistics Conference},
address = {Cardiff, Wales, UK},
abstract = {We investigate the diachronic development of wh-relativizers in English scientific writing in the late modern period, characterized by an initially richly populated paradigm in the late 17th/early 18th century and a reduction to only a few options by the mid 19th century. To explain this reduction, we take the perspective of rational communication, according to which language users, while striving for successful communication, seek to reduce their effort. Previous work has shown that production effort is directly linked to the number of options at a given choice point (Milin et al. 2009, Linzen and Jaeger 2016). This effort is appropriately indexed by entropy: The more options with equal/similar probability, the higher the entropy, i.e. the higher the production effort. Similarly, processing effort is correlated with predictability in context – surprisal (Levy 2008). Highly predictable, conventionalized patterns are easier to produce and comprehend than less predictable ones. Assuming that language users strive for ease in communication, diachronically they are likely to (a) develop a preference for which options to use and discard others to reduce entropy, and (b) converge on how to use those options to reduce surprisal. We test this for the changing use of wh-relativizers in scientific text in the late modern period. Many scholars have investigated variation in relativizer choice in standard spoken and written varieties (e.g. Guy and Bayley 1995; Biber et al. 1999; Lehmann 2001; Hinrichs et al. 2015), in vernacular speech (e.g. Romaine 1982, Tottie and Harvie 2000; Tagliamonte 2002; Tagliamonte et al. 2005; Levey 2006), and from synchronic and diachronic perspectives (e.g. Romaine 1980; Ball 1996; Hundt et al. 2012; Nevalainen 2012, Nevalainen and Raumolin-Brunberg 2002). While stylistic variability of the different options in written present day English is well known (see Biber et al. 1999; Leech et al. 2009), we know little about the diachronic development of relativizers according to register, e.g. in scientific writing. Also, most research only considers the most common relativizers (e.g. which, that, zero) still in use in present day English. Here, we study a more comprehensive set of relativizers across scientific and “general language” (mix of registers) from a diachronic perspective. Possible paradigmatic change is analyzed by diachronic word embeddings (cf. Fankhauser and Kupietz 2017), allowing us to select items affected by change. Then we assess the change (reduction/expansion) of a paradigm estimating its entropy over time. To check whether changes are specific to scientific language, we compare with uses in general language. Finally, we inspect possible changes in the predictability of selected wh-relativizers involved in paradigmatic change estimating their surprisal over time, looking for traces of conventionalization (cf. Degaetano-Ortlieb and Teich 2016, 2018).},
pubstate = {published},
type = {miscellaneous}
}
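The entropy notion invoked in this abstract (the more options with equal or similar probability at a choice point, the higher the entropy, i.e. the higher the production effort) can be sketched in a few lines of Python. The relativizer counts below are invented for illustration and are not taken from the study:

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a distribution estimated
    from raw frequency counts."""
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical paradigms: many near-equiprobable options (early period)
# vs. a few dominant options (late period).
early = {"which": 30, "that": 25, "who": 20, "whom": 15, "whereof": 10}
late = {"which": 70, "that": 25, "who": 5}

# A richer, flatter paradigm yields higher entropy, i.e. more
# production effort at the choice point.
assert entropy(early) > entropy(late)
```

On this toy data, discarding options and skewing the distribution toward a preferred relativizer lowers the paradigm's entropy, which is the reduction the abstract describes.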


Project:   B1

Degaetano-Ortlieb, Stefania; Krielke, Marie-Pauline; Scheurer, Franziska; Teich, Elke

A diachronic perspective on efficiency in language use: that-complement clause in academic writing across 300 years Inproceedings

Proceedings of the 10th International Corpus Linguistics Conference, Cardiff, Wales, UK, 2019.

Efficiency in language use and the role of predictability in context have attracted many researchers from different fields (Zipf 1949; Landau 1969; Fidelholtz 1975; Jurafsky et al. 1998; Bybee and Scheibman 1999; Genzel and Charniak 2002; Aylett and Turk 2004; Hawkins 2004; Piantadosi et al. 2009; Jaeger 2010). The analysis of reduction processes, where linguistic units are reduced or omitted, has enhanced our knowledge of efficiency in communication. Possible factors affecting the retention or omission of an optional element include discourse context (cf. Thompson and Mulac 1991), the amount of information a unit transmits given its context (known as surprisal, cf. Jaeger 2010), or the complexity of the syntagmatic environment (Rohdenburg 1998). So far, the role that change in language use plays has received less attention.
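Surprisal in the sense used here (Jaeger 2010) is the negative log probability of a unit given its context. A minimal sketch with an invented toy corpus, not the paper's data:

```python
import math
from collections import Counter

# Toy corpus; the bigram counts stand in for a real language model.
words = "i know that she left i know that he said that".split()
bigrams = Counter(zip(words, words[1:]))
unigrams = Counter(words)

def surprisal(word, context):
    """Surprisal in bits: -log2 P(word | context), with the probability
    estimated from bigram relative frequencies."""
    return -math.log2(bigrams[(context, word)] / unigrams[context])

# In this toy corpus, "that" after "know" is fully predictable
# (P = 1, surprisal 0 bits): exactly the kind of low-information
# optional element that is a candidate for omission.
assert surprisal("that", "know") == 0.0
```

A highly predictable complementizer transmits little information, so omitting it loses little; the abstract's question is how such trade-offs play out diachronically.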

@inproceedings{Degaetano-Ortlieb2019b,
title = {A diachronic perspective on efficiency in language use: that-complement clause in academic writing across 300 years},
author = {Stefania Degaetano-Ortlieb and Marie-Pauline Krielke and Franziska Scheurer and Elke Teich},
url = {https://stefaniadegaetano.files.wordpress.com/2019/05/abstract_that-comp_final.pdf},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 10th International Corpus Linguistics Conference},
address = {Cardiff, Wales, UK},
abstract = {Efficiency in language use and the role of predictability in context have attracted many researchers from different fields (Zipf 1949; Landau 1969; Fidelholtz 1975; Jurafsky et al. 1998; Bybee and Scheibman 1999; Genzel and Charniak 2002; Aylett and Turk 2004; Hawkins 2004; Piantadosi et al. 2009; Jaeger 2010). The analysis of reduction processes, where linguistic units are reduced or omitted, has enhanced our knowledge of efficiency in communication. Possible factors affecting the retention or omission of an optional element include discourse context (cf. Thompson and Mulac 1991), the amount of information a unit transmits given its context (known as surprisal, cf. Jaeger 2010), or the complexity of the syntagmatic environment (Rohdenburg 1998). So far, the role that change in language use plays has received less attention.},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Degaetano-Ortlieb, Stefania

Hybridization effects in literary texts Inproceedings

Proceedings of the 10th International Corpus Linguistics Conference, Cardiff, Wales, UK, 2019.

We present an analysis of subregisters, whose differentiation is still a difficult task due to their hybridity: they conform to a presumed “norm” while encompassing something “new”. We focus on texts at the interface between what Halliday (2002: 177) calls two opposite “cultures”, literature and science (here: science fiction texts). Texts belonging to one register will exhibit similar choices of lexico-grammatical features. Hybrid texts at the intersection between two registers will reflect a mixture of particular features (cf. Degaetano-Ortlieb et al. 2014, Biber et al. 2015, Teich et al. 2013, 2016, Underwood 2016). Consider example (1) taken from Mary Shelley’s Frankenstein. While traditionally grounded as a literary text, it shows a registerial nuance from the influential register of science. This encompasses phrases (bold) also found in scientific articles from that period (e.g. in the Royal Society Corpus, cf. Kermes et al. 2016), verbs related to scientific endeavor (e.g. become acquainted, examine, observe, discover), and scientific terminology (e.g. anatomy, decay, corruption, vertebrae, inflammable air) packed into complex nominal phrases (underlined). Note that features marking this registerial nuance include not only lexical but also grammatical features.

(1) I became acquainted with the science of anatomy, but this was not sufficient; I must also observe the natural decay and corruption of the human body. […] Now I was led to examine the cause and progress of this decay. I succeeded in discovering the cause of generation and life. (Frankenstein, Mary Shelley, 1818/1823).

Thus, we hypothesize that hybrid registers, while mainly resembling their traditional register in the use of lexico-grammatical features (H1 register resemblance), will also show particular lexico-grammatical nuances of their influential register (H2 registerial nuance). In particular, we are interested in (a) variation across registers to see which lexico-grammatical features are involved in hybridization effects and (b) intra-textual variation (e.g. across chapters) to analyze in which parts of a text hybridization effects are most prominent.

@inproceedings{Degaetano-Ortlieb2019c,
title = {Hybridization effects in literary texts},
author = {Stefania Degaetano-Ortlieb},
url = {https://stefaniadegaetano.files.wordpress.com/2019/05/abstact_cl2019_hybridization_final.pdf},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 10th International Corpus Linguistics Conference},
address = {Cardiff, Wales, UK},
abstract = {We present an analysis of subregisters, whose differentiation is still a difficult task due to their hybridity: they conform to a presumed “norm” while encompassing something “new”. We focus on texts at the interface between what Halliday (2002: 177) calls two opposite “cultures”, literature and science (here: science fiction texts). Texts belonging to one register will exhibit similar choices of lexico-grammatical features. Hybrid texts at the intersection between two registers will reflect a mixture of particular features (cf. Degaetano-Ortlieb et al. 2014, Biber et al. 2015, Teich et al. 2013, 2016, Underwood 2016). Consider example (1) taken from Mary Shelley’s Frankenstein. While traditionally grounded as a literary text, it shows a registerial nuance from the influential register of science. This encompasses phrases (bold) also found in scientific articles from that period (e.g. in the Royal Society Corpus, cf. Kermes et al. 2016), verbs related to scientific endeavor (e.g. become acquainted, examine, observe, discover), and scientific terminology (e.g. anatomy, decay, corruption, vertebrae, inflammable air) packed into complex nominal phrases (underlined). Note that features marking this registerial nuance include not only lexical but also grammatical features. (1) I became acquainted with the science of anatomy, but this was not sufficient; I must also observe the natural decay and corruption of the human body. […] Now I was led to examine the cause and progress of this decay. I succeeded in discovering the cause of generation and life. (Frankenstein, Mary Shelley, 1818/1823). Thus, we hypothesize that hybrid registers, while mainly resembling their traditional register in the use of lexico-grammatical features (H1 register resemblance), will also show particular lexico-grammatical nuances of their influential register (H2 registerial nuance). 
In particular, we are interested in (a) variation across registers to see which lexico-grammatical features are involved in hybridization effects and (b) intra-textual variation (e.g. across chapters) to analyze in which parts of a text hybridization effects are most prominent.},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Degaetano-Ortlieb, Stefania; Piper, Andrew

The Scientization of Literary Study Inproceedings

Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature at NAACL 2019, Association for Computational Linguistics, pp. 18-28, Minneapolis, MN, USA, 2019.

Scholarly practices within the humanities have historically been perceived as distinct from the natural sciences. We look at literary studies, a discipline strongly anchored in the humanities, and hypothesize that over the past half-century literary studies has instead undergone a process of “scientization”, adopting linguistic behavior similar to the sciences. We test this using methods based on information theory, comparing a corpus of literary studies articles (around 63,400) with corpora of standard English and scientific English, respectively. We show evidence for “scientization” effects in literary studies, though at a more muted level than scientific English, suggesting that literary studies occupies a middle ground with respect to standard English in the larger space of academic disciplines. More generally, our methodology can be applied to investigate the social positioning and development of language use across different domains (e.g. scientific disciplines, language varieties, registers).

@inproceedings{degaetano-ortlieb-piper-2019-scientization,
title = {The Scientization of Literary Study},
author = {Stefania Degaetano-Ortlieb and Andrew Piper},
url = {https://aclanthology.org/W19-2503},
doi = {https://doi.org/10.18653/v1/W19-2503},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature at NAACL 2019},
pages = {18-28},
publisher = {Association for Computational Linguistics},
address = {Minneapolis, MN, USA},
abstract = {Scholarly practices within the humanities have historically been perceived as distinct from the natural sciences. We look at literary studies, a discipline strongly anchored in the humanities, and hypothesize that over the past half-century literary studies has instead undergone a process of “scientization”, adopting linguistic behavior similar to the sciences. We test this using methods based on information theory, comparing a corpus of literary studies articles (around 63,400) with corpora of standard English and scientific English, respectively. We show evidence for “scientization” effects in literary studies, though at a more muted level than scientific English, suggesting that literary studies occupies a middle ground with respect to standard English in the larger space of academic disciplines. More generally, our methodology can be applied to investigate the social positioning and development of language use across different domains (e.g. scientific disciplines, language varieties, registers).},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Degaetano-Ortlieb, Stefania; Teich, Elke

Toward an optimal code for communication: the case of scientific English Journal Article

Corpus Linguistics and Linguistic Theory, 18, pp. 1-33, 2019.

We present a model of the linguistic development of scientific English from the mid-seventeenth to the late-nineteenth century, a period that witnessed significant political and social changes, including the evolution of modern science. There is a wealth of descriptive accounts of scientific English, both from a synchronic and a diachronic perspective, but only a few attempts at a unified explanation of its evolution. The explanation we offer here is a communicative one: while external pressures (specialization, diversification) push for an increase in expressivity, communicative concerns pull toward convergence on particular options (conventionalization). What emerges over time is a code which is optimized for written, specialist communication, relying on specific linguistic means to modulate information content. As we show, this is achieved by the systematic interplay between lexis and grammar. The corpora we employ are the Royal Society Corpus (RSC) and, for comparative purposes, the Corpus of Late Modern English Texts (CLMET). We build various diachronic, computational n-gram language models of these corpora and then apply formal measures of information content (here: relative entropy and surprisal) to detect the linguistic features significantly contributing to diachronic change, estimate the (changing) level of information of features, and capture the time course of change.
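The relative entropy measure mentioned here compares the word distributions of two time periods. Below is a minimal unigram sketch with additive smoothing; the two "period" samples are toy sentences, not RSC data:

```python
import math
from collections import Counter

def kl_divergence(counts_a, counts_b, alpha=0.1):
    """Relative entropy D(A || B) in bits over a shared vocabulary,
    with additive (Lidstone) smoothing so no probability is zero."""
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    d = 0.0
    for w in vocab:
        p = (counts_a.get(w, 0) + alpha) / total_a
        q = (counts_b.get(w, 0) + alpha) / total_b
        d += p * math.log2(p / q)
    return d

# Toy "period" samples: the larger the divergence, the more the later
# usage has drifted away from the earlier model.
period_1700 = Counter("the air doth rise and the water doth fall".split())
period_1850 = Counter("the gas expands and the pressure falls".split())

# Identical samples diverge by zero; differing samples diverge positively.
assert kl_divergence(period_1700, period_1700) == 0.0
assert kl_divergence(period_1850, period_1700) > 0.0
```

In a real diachronic setup, higher-order n-gram models replace the unigram counts, and per-word contributions to the divergence are what single out the features driving change.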


@article{Degaetano-Ortlieb2019d,
title = {Toward an optimal code for communication: the case of scientific English},
author = {Stefania Degaetano-Ortlieb and Elke Teich},
url = {https://www.degruyter.com/document/doi/10.1515/cllt-2018-0088/html?lang=en},
doi = {https://doi.org/10.1515/cllt-2018-0088},
year = {2019},
date = {2019},
journal = {Corpus Linguistics and Linguistic Theory},
pages = {1-33},
volume = {18},
number = {1},
abstract = {We present a model of the linguistic development of scientific English from the mid-seventeenth to the late-nineteenth century, a period that witnessed significant political and social changes, including the evolution of modern science. There is a wealth of descriptive accounts of scientific English, both from a synchronic and a diachronic perspective, but only a few attempts at a unified explanation of its evolution. The explanation we offer here is a communicative one: while external pressures (specialization, diversification) push for an increase in expressivity, communicative concerns pull toward convergence on particular options (conventionalization). What emerges over time is a code which is optimized for written, specialist communication, relying on specific linguistic means to modulate information content. As we show, this is achieved by the systematic interplay between lexis and grammar. The corpora we employ are the Royal Society Corpus (RSC) and, for comparative purposes, the Corpus of Late Modern English Texts (CLMET). We build various diachronic, computational n-gram language models of these corpora and then apply formal measures of information content (here: relative entropy and surprisal) to detect the linguistic features significantly contributing to diachronic change, estimate the (changing) level of information of features, and capture the time course of change.},
pubstate = {published},
type = {article}
}


Project:   B1

Krielke, Marie-Pauline; Degaetano-Ortlieb, Stefania; Menzel, Katrin; Teich, Elke

Paradigmatic change and redistribution of functional load: The case of relative clauses in scientific English Miscellaneous

Symposium on Corpus Approaches to Lexicogrammar (Book of Abstracts), Edge Hill University, 2019.

@miscellaneous{Krielke2019,
title = {Paradigmatic change and redistribution of functional load: The case of relative clauses in scientific English},
author = {Marie-Pauline Krielke and Stefania Degaetano-Ortlieb and Katrin Menzel and Elke Teich},
year = {2019},
date = {2019},
booktitle = {Symposium on Corpus Approaches to Lexicogrammar (Book of Abstracts)},
address = {Edge Hill University},
pubstate = {published},
type = {miscellaneous}
}


Project:   B1

Menzel, Katrin; Teich, Elke

Medical discourse across 300 years: insights from the Royal Society Corpus Inproceedings

2nd International Conference on Historical Medical Discourse (CHIMED-2), 2019.

@inproceedings{Menzel2019b,
title = {Medical discourse across 300 years: insights from the Royal Society Corpus},
author = {Katrin Menzel and Elke Teich},
year = {2019},
date = {2019},
booktitle = {2nd International Conference on Historical Medical Discourse (CHIMED-2)},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Degaetano-Ortlieb, Stefania; Teich, Elke; Khamis, Ashraf; Kermes, Hannah

An Information-Theoretic Approach to Modeling Diachronic Change in Scientific English Book Chapter

Suhr, Carla; Nevalainen, Terttu; Taavitsainen, Irma (Ed.): From Data to Evidence in English Language Research, Brill, pp. 258-281, Leiden, 2019.

We present an information-theoretic approach to investigate diachronic change in scientific English. Our main assumption is that over time scientific English has become increasingly dense, i.e. linguistic constructions allowing dense packing of information are progressively used. So far, diachronic change in scientific writing has been investigated by means of frequency-based approaches (see e.g. Halliday (1988); Atkinson (1998); Biber (2006b, c); Biber and Gray (2016); Banks (2008); Taavitsainen and Pahta (2010)). We use information-theoretic measures (entropy, surprisal; Shannon (1949)) to assess features previously stated to change over time and to discover new, latent features from the data itself that are involved in diachronic change. For this, we use the Royal Society Corpus (RSC) (Kermes et al. (2016)), which spans the period 1665 to 1869. We present three kinds of analyses: nominal compounding (typical of academic writing), modal verbs (shown to have changed in frequency over time), and an analysis based on part-of-speech trigrams to detect new features that change diachronically. We show how information-theoretic measures help to investigate, evaluate and detect features involved in diachronic change.

@inbook{Degaetano-Ortlieb2019,
title = {An Information-Theoretic Approach to Modeling Diachronic Change in Scientific English},
author = {Stefania Degaetano-Ortlieb and Elke Teich and Ashraf Khamis and Hannah Kermes},
editor = {Carla Suhr and Terttu Nevalainen and Irma Taavitsainen},
url = {https://brill.com/display/book/edcoll/9789004390652/BP000014.xml},
doi = {https://doi.org/10.1163/9789004390652},
year = {2019},
date = {2019},
booktitle = {From Data to Evidence in English Language Research},
pages = {258-281},
publisher = {Brill},
address = {Leiden},
abstract = {We present an information-theoretic approach to investigate diachronic change in scientific English. Our main assumption is that over time scientific English has become increasingly dense, i.e. linguistic constructions allowing dense packing of information are progressively used. So far, diachronic change in scientific writing has been investigated by means of frequency-based approaches (see e.g. Halliday (1988); Atkinson (1998); Biber (2006b, c); Biber and Gray (2016); Banks (2008); Taavitsainen and Pahta (2010)). We use information-theoretic measures (entropy, surprisal; Shannon (1949)) to assess features previously stated to change over time and to discover new, latent features from the data itself that are involved in diachronic change. For this, we use the Royal Society Corpus (RSC) (Kermes et al. (2016)), which spans the period 1665 to 1869. We present three kinds of analyses: nominal compounding (typical of academic writing), modal verbs (shown to have changed in frequency over time), and an analysis based on part-of-speech trigrams to detect new features that change diachronically. We show how information-theoretic measures help to investigate, evaluate and detect features involved in diachronic change.},
pubstate = {published},
type = {inbook}
}


Project:   B1

Wichlacz, Julia; Torralba, Álvaro; Hoffmann, Jörg

Construction-Planning Models in Minecraft Inproceedings

Proceedings of the 2nd Workshop on Hierarchical Planning at ICAPS 2019, pp. 1-5, 2019.

Minecraft is a videogame that offers many interesting challenges for AI systems. In this paper, we focus on construction scenarios where an agent must build a complex structure made of individual blocks. As higher-level objects are formed of lower-level objects, the construction can naturally be modelled as a hierarchical task network. We model a house-construction scenario in classical and HTN planning and compare the advantages and disadvantages of both kinds of models.
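The hierarchical decomposition idea can be sketched in a few lines: compound tasks expand into subtasks until only primitive place-block actions remain. The task names and method structure below are invented for illustration and are not the paper's actual models:

```python
# Toy hierarchical task network for house construction. Each compound
# task maps to one decomposition method (a list of subtasks).
HTN = {
    "build_house": ["build_wall", "build_wall", "build_roof"],
    "build_wall": ["place_block", "place_block", "place_block"],
    "build_roof": ["place_block", "place_block"],
}

def decompose(task):
    """Recursively expand a task into a sequence of primitive actions."""
    if task not in HTN:          # primitive action: no further expansion
        return [task]
    plan = []
    for subtask in HTN[task]:
        plan.extend(decompose(subtask))
    return plan

# Two 3-block walls plus a 2-block roof yield 8 primitive actions.
assert decompose("build_house") == ["place_block"] * 8
```

A full HTN planner would additionally handle multiple methods per task, preconditions, and world state; the sketch only shows the task hierarchy that distinguishes HTN from classical planning models.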

@inproceedings{Wichlacz2019,
title = {Construction-Planning Models in Minecraft},
author = {Julia Wichlacz and {\'A}lvaro Torralba and J{\"o}rg Hoffmann},
url = {https://www.semanticscholar.org/paper/Construction-Planning-Models-in-Minecraft-Wichlacz-Torralba/d2ffb1c4b815f1b245f248d436baf9a3c28cc148},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 2nd Workshop on Hierarchical Planning at ICAPS 2019},
pages = {1-5},
abstract = {Minecraft is a videogame that offers many interesting challenges for AI systems. In this paper, we focus on construction scenarios where an agent must build a complex structure made of individual blocks. As higher-level objects are formed of lower-level objects, the construction can naturally be modelled as a hierarchical task network. We model a house-construction scenario in classical and HTN planning and compare the advantages and disadvantages of both kinds of models.},
pubstate = {published},
type = {inproceedings}
}


Project:   A7

Köhn, Arne; Koller, Alexander

Talking about what is not there: Generating indefinite referring expressions in Minecraft Inproceedings

Proceedings of the 12th International Conference on Natural Language Generation, Association for Computational Linguistics, pp. 1-10, Tokyo, Japan, 2019.

When generating technical instructions, it is often necessary to describe an object that does not exist yet. For example, an NLG system which explains how to build a house needs to generate sentences like “build a wall of height five to your left” and “now build a wall on the other side.” Generating (indefinite) referring expressions to objects that do not exist yet is fundamentally different from generating the usual definite referring expressions, because the new object must be distinguished from an infinite set of possible alternatives. We formalize this problem and present an algorithm for generating such expressions, in the context of generating building instructions within the Minecraft video game.

@inproceedings{Köhn2019,
title = {Talking about what is not there: Generating indefinite referring expressions in Minecraft},
author = {Arne K{\"o}hn and Alexander Koller},
url = {https://www.aclweb.org/anthology/W19-8601},
doi = {https://doi.org/10.18653/v1/W19-8601},
year = {2019},
date = {2019},
booktitle = {Proceedings of the 12th International Conference on Natural Language Generation},
pages = {1-10},
publisher = {Association for Computational Linguistics},
address = {Tokyo, Japan},
abstract = {When generating technical instructions, it is often necessary to describe an object that does not exist yet. For example, an NLG system which explains how to build a house needs to generate sentences like “build a wall of height five to your left” and “now build a wall on the other side.” Generating (indefinite) referring expressions to objects that do not exist yet is fundamentally different from generating the usual definite referring expressions, because the new object must be distinguished from an infinite set of possible alternatives. We formalize this problem and present an algorithm for generating such expressions, in the context of generating building instructions within the Minecraft video game.},
pubstate = {published},
type = {inproceedings}
}


Project:   A7

Höltje, Gerrit; Lubahn, Bente; Mecklinger, Axel

The congruent, the incongruent, and the unexpected: Event-related potentials unveil the processes involved in schematic encoding Journal Article

Neuropsychologia, 131, pp. 285-293, 2019.

Learning is most effective when new information can be related to a preexisting knowledge structure or schema. In the present study, event-related potentials (ERPs) were used to investigate the temporal dynamics of the processes by which activated schemata support the encoding of schema-congruent information. Participants learned category exemplar words that were either semantically congruent or incongruent with a preceding category cue phrase. Congruent words were composed of expected (high typicality, HT) and unexpected (low typicality, LT) category exemplars. On the next day, recognition memory for the exemplars and the category cues they were presented with was tested. Semantically related lures were used in order to ascertain that memory judgements were based on episodic memory for specific category exemplars. Generally, congruent (HT and LT) exemplars were remembered better than incongruent exemplars. ERPs recorded during the encoding of the exemplar words were compared for subsequently remembered and forgotten items. Subsequent memory effects (SME) emerged in the N400 time window at frontal electrodes and did not differ between congruent and incongruent exemplars. In the same epoch, an SME with a parietal distribution was specific for congruent exemplars, suggesting that activated schemata strengthened memory for congruent exemplars by supporting the encoding of item-specific details. Subsequently remembered LT exemplars were associated with a late frontal positivity that is assumed to reflect expectancy mismatch-related processing such as the contextual integration of an unexpected word by the suppression of strongly expected words. A correlation analysis revealed that the greater the involvement of the processes reflected by the frontal positivity, the lower the level of false positive memory responses in the test phase one day later. 
These results suggest that the contextual integration of schema-congruent but unexpected events involves a weakening of the representations of semantically related but unstudied items in memory, and thereby benefits subsequent memory.

@article{Höltje2019,
title = {The congruent, the incongruent, and the unexpected: Event-related potentials unveil the processes involved in schematic encoding},
author = {Gerrit H{\"o}ltje and Bente Lubahn and Axel Mecklinger},
url = {https://www.sciencedirect.com/science/article/pii/S0028393219301228?via%3Dihub},
doi = {https://doi.org/10.1016/j.neuropsychologia.2019.05.013},
year = {2019},
date = {2019},
journal = {Neuropsychologia},
pages = {285-293},
volume = {131},
abstract = {Learning is most effective when new information can be related to a preexisting knowledge structure or schema. In the present study, event-related potentials (ERPs) were used to investigate the temporal dynamics of the processes by which activated schemata support the encoding of schema-congruent information. Participants learned category exemplar words that were either semantically congruent or incongruent with a preceding category cue phrase. Congruent words were composed of expected (high typicality, HT) and unexpected (low typicality, LT) category exemplars. On the next day, recognition memory for the exemplars and the category cues they were presented with was tested. Semantically related lures were used in order to ascertain that memory judgements were based on episodic memory for specific category exemplars. Generally, congruent (HT and LT) exemplars were remembered better than incongruent exemplars. ERPs recorded during the encoding of the exemplar words were compared for subsequently remembered and forgotten items. Subsequent memory effects (SME) emerged in the N400 time window at frontal electrodes and did not differ between congruent and incongruent exemplars. In the same epoch, an SME with a parietal distribution was specific for congruent exemplars, suggesting that activated schemata strengthened memory for congruent exemplars by supporting the encoding of item-specific details. Subsequently remembered LT exemplars were associated with a late frontal positivity that is assumed to reflect expectancy mismatch-related processing such as the contextual integration of an unexpected word by the suppression of strongly expected words. A correlation analysis revealed that the greater the involvement of the processes reflected by the frontal positivity, the lower the level of false positive memory responses in the test phase one day later. 
These results suggest that the contextual integration of schema-congruent but unexpected events involves a weakening of the representations of semantically related but unstudied items in memory, and thereby benefits subsequent memory.},
pubstate = {published},
type = {article}
}


Project:   A6
