Publications

Degaetano-Ortlieb, Stefania; Menzel, Katrin; Teich, Elke

The course of grammatical change in scientific writing: Interdependency between convention and productivity Inproceedings

Proceedings of the Corpus and Language Variation in English Research Conference (CLAVIER), Bari, Italy, 2017.

We present an empirical approach to analyzing the course of usage change in scientific writing. A great deal of linguistic research has dealt with grammatical changes, showing their gradual course of change, which nearly always progresses stepwise (see e.g. Bybee et al. 1994, Hopper and Traugott 2003, Lee 2011, De Smet and Van de Velde 2013). Less well understood is under which conditions these changes occur. According to De Smet (2016), specific expressions increase in frequency in one grammatical context, adopting a more conventionalized use, which in turn makes them available in closely related grammatical contexts.

@inproceedings{Degaetano-Ortlieb2017b,
title = {The course of grammatical change in scientific writing: Interdependency between convention and productivity},
author = {Stefania Degaetano-Ortlieb and Katrin Menzel and Elke Teich},
url = {https://stefaniadegaetano.files.wordpress.com/2017/07/clavier2017-degaetano-etal_accepted_final.pdf},
year = {2017},
date = {2017},
booktitle = {Proceedings of the Corpus and Language Variation in English Research Conference (CLAVIER)},
address = {Bari, Italy},
abstract = {We present an empirical approach to analyzing the course of usage change in scientific writing. A great deal of linguistic research has dealt with grammatical changes, showing their gradual course of change, which nearly always progresses stepwise (see e.g. Bybee et al. 1994, Hopper and Traugott 2003, Lee 2011, De Smet and Van de Velde 2013). Less well understood is under which conditions these changes occur. According to De Smet (2016), specific expressions increase in frequency in one grammatical context, adopting a more conventionalized use, which in turn makes them available in closely related grammatical contexts.},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Menzel, Katrin; Degaetano-Ortlieb, Stefania

The diachronic development of combining forms in scientific writing Journal Article

Lege Artis. Language yesterday, today, tomorrow. The Journal of University of SS Cyril and Methodius in Trnava. Warsaw: De Gruyter Open, 2, pp. 185-249, 2017.
This paper addresses the diachronic development of combining forms in English scientific texts over approximately 350 years, from the early stages of the first scholarly journals published in English to contemporary English scientific publications. It presents a critical discussion of the category of combining forms and a case study examining the role of selected combining forms in two diachronic English corpora.

@article{Menzel2017,
title = {The diachronic development of combining forms in scientific writing},
author = {Katrin Menzel and Stefania Degaetano-Ortlieb},
url = {https://www.researchgate.net/publication/321776056_The_diachronic_development_of_combining_forms_in_scientific_writing},
year = {2017},
date = {2017},
journal = {Lege Artis. Language yesterday, today, tomorrow. The Journal of University of SS Cyril and Methodius in Trnava. Warsaw: De Gruyter Open},
pages = {185-249},
volume = {2},
number = {2},
abstract = {This paper addresses the diachronic development of combining forms in English scientific texts over approximately 350 years, from the early stages of the first scholarly journals published in English to contemporary English scientific publications. It presents a critical discussion of the category of combining forms and a case study examining the role of selected combining forms in two diachronic English corpora.},
pubstate = {published},
type = {article}
}


Project:   B1

Degaetano-Ortlieb, Stefania; Fischer, Stefan; Demberg, Vera; Teich, Elke

An information-theoretic account on the diachronic development of discourse connectors in scientific writing Inproceedings

39th DGfS AG1, Saarbrücken, Germany, 2017.

@inproceedings{Degaetano-Ortlieb2017c,
title = {An information-theoretic account on the diachronic development of discourse connectors in scientific writing},
author = {Stefania Degaetano-Ortlieb and Stefan Fischer and Vera Demberg and Elke Teich},
year = {2017},
date = {2017},
publisher = {39th DGfS AG1},
address = {Saarbr{\"u}cken, Germany},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Knappen, Jörg; Fischer, Stefan; Kermes, Hannah; Teich, Elke; Fankhauser, Peter

The making of the Royal Society Corpus Inproceedings

21st Nordic Conference on Computational Linguistics (NoDaLiDa) Workshop on Processing Historical Language, pp. 7-11, Gothenburg, Sweden, 2017.
The Royal Society Corpus is a corpus of Early and Late Modern English built in an agile process covering publications of the Royal Society of London from 1665 to 1869 (Kermes et al., 2016), with a size of approximately 30 million words. In this paper we provide details on two aspects of the building process, namely the mining of patterns for OCR correction and the improvement and evaluation of part-of-speech tagging.

@inproceedings{Knappen2017,
title = {The making of the Royal Society Corpus},
author = {J{\"o}rg Knappen and Stefan Fischer and Hannah Kermes and Elke Teich and Peter Fankhauser},
url = {https://www.researchgate.net/publication/331648134_The_Making_of_the_Royal_Society_Corpus},
year = {2017},
date = {2017},
booktitle = {21st Nordic Conference on Computational Linguistics (NoDaLiDa) Workshop on Processing Historical Language},
pages = {7-11},
address = {Gothenburg, Sweden},
abstract = {The Royal Society Corpus is a corpus of Early and Late Modern English built in an agile process covering publications of the Royal Society of London from 1665 to 1869 (Kermes et al., 2016), with a size of approximately 30 million words. In this paper we provide details on two aspects of the building process, namely the mining of patterns for OCR correction and the improvement and evaluation of part-of-speech tagging.},
pubstate = {published},
type = {inproceedings}
}


Project:   B1

Kermes, Hannah; Teich, Elke

Average surprisal of parts-of-speech Inproceedings

Corpus Linguistics 2017, Birmingham, UK, 2017.

We present an approach to investigate the differences between lexical words and function words and the respective parts-of-speech from an information-theoretical point of view (cf. Shannon, 1949). We use average surprisal (AvS) to measure the amount of information transmitted by a linguistic unit. We expect to find function words to be more predictable (having a lower AvS) and lexical words to be less predictable (having a higher AvS). We also assume that function words' AvS is fairly constant over time and registers, while AvS of lexical words is more variable depending on time and register.

@inproceedings{Kermes2017,
title = {Average surprisal of parts-of-speech},
author = {Hannah Kermes and Elke Teich},
url = {https://www.birmingham.ac.uk/Documents/college-artslaw/corpus/conference-archives/2017/general/paper207.pdf},
year = {2017},
date = {2017},
publisher = {Corpus Linguistics 2017},
address = {Birmingham, UK},
abstract = {We present an approach to investigate the differences between lexical words and function words and the respective parts-of-speech from an information-theoretical point of view (cf. Shannon, 1949). We use average surprisal (AvS) to measure the amount of information transmitted by a linguistic unit. We expect to find function words to be more predictable (having a lower AvS) and lexical words to be less predictable (having a higher AvS). We also assume that function words' AvS is fairly constant over time and registers, while AvS of lexical words is more variable depending on time and register.},
pubstate = {published},
type = {inproceedings}
}


Project:   B1
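The average-surprisal (AvS) measure described in the abstract above can be sketched concretely. The snippet below is only an illustration of the general idea, not the authors' implementation: it assumes an add-one-smoothed bigram language model, and the tiny tagged corpus is invented.

```python
import math
from collections import Counter

# Toy (word, POS) corpus; in practice this would be a large corpus
# such as the Royal Society Corpus.
tagged = [("the", "DT"), ("cell", "NN"), ("divides", "VBZ"),
          ("the", "DT"), ("nucleus", "NN"), ("of", "IN"),
          ("the", "DT"), ("cell", "NN")]

words = [w for w, _ in tagged]
unigrams = Counter(words)
bigrams = Counter(zip(words, words[1:]))

def surprisal(prev, word):
    """Bigram surprisal -log2 p(word | prev), add-one smoothed."""
    p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(unigrams))
    return -math.log2(p)

# Average surprisal (AvS) per POS tag: mean surprisal of all tokens
# carrying that tag (the corpus-initial token has no bigram context).
per_tag = {}
for (prev, _), (word, tag) in zip(tagged, tagged[1:]):
    per_tag.setdefault(tag, []).append(surprisal(prev, word))
avs = {tag: sum(s) / len(s) for tag, s in per_tag.items()}
```

On realistic data one would expect function-word tags (e.g. DT, IN) to show lower and more stable AvS than lexical tags (NN, VBZ), which is the hypothesis the paper tests.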

Degaetano-Ortlieb, Stefania; Teich, Elke

Modeling intra-textual variation with entropy and surprisal: Topical vs. stylistic patterns Inproceedings

Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Association for Computational Linguistics, pp. 68-77, Vancouver, Canada, 2017.

We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group).

@inproceedings{Degaetano-Ortlieb2017,
title = {Modeling intra-textual variation with entropy and surprisal: Topical vs. stylistic patterns},
author = {Stefania Degaetano-Ortlieb and Elke Teich},
url = {https://aclanthology.org/W17-2209},
doi = {https://doi.org/10.18653/v1/W17-2209},
year = {2017},
date = {2017},
booktitle = {Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature},
pages = {68-77},
publisher = {Association for Computational Linguistics},
address = {Vancouver, Canada},
abstract = {We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group).},
pubstate = {published},
type = {inproceedings}
}


Project:   B1
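The entropy side of the method described in the abstract above can be illustrated with a toy computation: the Shannon entropy of a pattern's distribution over article sections is low when the pattern is concentrated in one section (section-typical) and maximal when it is spread evenly. All pattern names and counts below are invented for illustration.

```python
import math

# Invented occurrence counts of lexico-grammatical patterns per
# research-article section.
pattern_counts = {
    "we_show":    {"intro": 10, "methods": 1, "results": 2,  "discussion": 1},
    "p_value":    {"intro": 0,  "methods": 3, "results": 12, "discussion": 1},
    "in_general": {"intro": 4,  "methods": 4, "results": 4,  "discussion": 4},
}

def section_entropy(counts):
    """Shannon entropy (bits) of a pattern's distribution over sections."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)

entropy = {pat: section_entropy(c) for pat, c in pattern_counts.items()}
# "in_general" is uniform over four sections, so its entropy is the
# maximal 2 bits; lower-entropy patterns are typical of one section.
```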

Sekicki, Mirjana; Staudte, Maria

Cognitive load in the visual world: The facilitatory effect of gaze Miscellaneous

39th Annual Meeting of the Cognitive Science Society, London, UK, 2017.
  1. Does following a gaze cue influence the cognitive load required for processing the corresponding linguistic referent?
  2. Is considering the gaze cue costly? Is there a distribution of cognitive load between the cue and the referent?
  3. Can a gaze cue have a disruptive effect on processing the linguistic referent?

@miscellaneous{Sekicki2017,
title = {Cognitive load in the visual world: The facilitatory effect of gaze},
author = {Mirjana Sekicki and Maria Staudte},
year = {2017},
date = {2017},
publisher = {39th Annual Meeting of the Cognitive Science Society},
address = {London, UK},
abstract = {1. Does following a gaze cue influence the cognitive load required for processing the corresponding linguistic referent? 2. Is considering the gaze cue costly? Is there a distribution of cognitive load between the cue and the referent? 3. Can a gaze cue have a disruptive effect on processing the linguistic referent?},
pubstate = {published},
type = {miscellaneous}
}


Project:   A5

Vogels, Jorrig; Howcroft, David M.; Demberg, Vera

Referential overspecification in response to the listener's cognitive load Inproceedings

International Cognitive Linguistics Conference, Tartu, Estonia, 2017.

According to the Uniform Information Density hypothesis (UID; Jaeger 2010, inter alia), speakers strive to distribute information equally over their utterances. They do this to avoid both peaks and troughs in information density, which may lead to processing difficulty for the listener. Several studies have shown how speakers consistently make linguistic choices that result in a more equal distribution of information (e.g., Jaeger 2010, Mahowald, Fedorenko, Piantadosi, & Gibson 2013, Piantadosi, Tily, & Gibson 2011). However, it is not clear whether speakers also adapt the information density of their utterances to the processing capacity of a specific addressee. For example, when the addressee is involved in a difficult task that is clearly reducing his cognitive capacity for processing linguistic information, will the speaker lower the overall information density of her utterances to accommodate the reduced processing capacity?

@inproceedings{Vogels2017,
title = {Referential overspecification in response to the listener's cognitive load},
author = {Jorrig Vogels and David M. Howcroft and Vera Demberg},
year = {2017},
date = {2017},
publisher = {International Cognitive Linguistics Conference},
address = {Tartu, Estonia},
abstract = {According to the Uniform Information Density hypothesis (UID; Jaeger 2010, inter alia), speakers strive to distribute information equally over their utterances. They do this to avoid both peaks and troughs in information density, which may lead to processing difficulty for the listener. Several studies have shown how speakers consistently make linguistic choices that result in a more equal distribution of information (e.g., Jaeger 2010, Mahowald, Fedorenko, Piantadosi, & Gibson 2013, Piantadosi, Tily, & Gibson 2011). However, it is not clear whether speakers also adapt the information density of their utterances to the processing capacity of a specific addressee. For example, when the addressee is involved in a difficult task that is clearly reducing his cognitive capacity for processing linguistic information, will the speaker lower the overall information density of her utterances to accommodate the reduced processing capacity?},
pubstate = {published},
type = {inproceedings}
}


Project:   A4

Häuser, Katja; Demberg, Vera; Kray, Jutta

Age-differences in recovery from prediction error: Evidence from a simulated driving and combined sentence verification task. Inproceedings

39th Annual Meeting of the Cognitive Science Society, 2017.

@inproceedings{Häuser2017,
title = {Age-differences in recovery from prediction error: Evidence from a simulated driving and combined sentence verification task.},
author = {Katja H{\"a}user and Vera Demberg and Jutta Kray},
year = {2017},
date = {2017-10-17},
publisher = {39th Annual Meeting of the Cognitive Science Society},
pubstate = {published},
type = {inproceedings}
}


Project:   A4

Howcroft, David M.; Klakow, Dietrich; Demberg, Vera

The Extended SPaRKy Restaurant Corpus: Designing a Corpus with Variable Information Density Inproceedings

Proc. Interspeech 2017, pp. 3757-3761, 2017.

Natural language generation (NLG) systems rely on corpora for both hand-crafted approaches in a traditional NLG architecture and for statistical end-to-end (learned) generation systems. Limitations in existing resources, however, make it difficult to develop systems which can vary the linguistic properties of an utterance as needed. For example, when users’ attention is split between a linguistic and a secondary task such as driving, a generation system may need to reduce the information density of an utterance to compensate for the reduction in user attention. We introduce a new corpus in the restaurant recommendation and comparison domain, collected in a paraphrasing paradigm, where subjects wrote texts targeting either a general audience or an elderly family member. This design resulted in a corpus of more than 5000 texts which exhibit a variety of lexical and syntactic choices and differ with respect to average word & sentence length and surprisal. The corpus includes two levels of meaning representation: flat ‘semantic stacks’ for propositional content and Rhetorical Structure Theory (RST) relations between these propositions.

@inproceedings{Howcroft2017b,
title = {The Extended SPaRKy Restaurant Corpus: Designing a Corpus with Variable Information Density},
author = {David M. Howcroft and Dietrich Klakow and Vera Demberg},
url = {http://dx.doi.org/10.21437/Interspeech.2017-1555},
doi = {https://doi.org/10.21437/Interspeech.2017-1555},
year = {2017},
date = {2017-10-17},
booktitle = {Proc. Interspeech 2017},
pages = {3757-3761},
abstract = {Natural language generation (NLG) systems rely on corpora for both hand-crafted approaches in a traditional NLG architecture and for statistical end-to-end (learned) generation systems. Limitations in existing resources, however, make it difficult to develop systems which can vary the linguistic properties of an utterance as needed. For example, when users’ attention is split between a linguistic and a secondary task such as driving, a generation system may need to reduce the information density of an utterance to compensate for the reduction in user attention. We introduce a new corpus in the restaurant recommendation and comparison domain, collected in a paraphrasing paradigm, where subjects wrote texts targeting either a general audience or an elderly family member. This design resulted in a corpus of more than 5000 texts which exhibit a variety of lexical and syntactic choices and differ with respect to average word & sentence length and surprisal. The corpus includes two levels of meaning representation: flat ‘semantic stacks’ for propositional content and Rhetorical Structure Theory (RST) relations between these propositions.},
pubstate = {published},
type = {inproceedings}
}


Project:   A4

Howcroft, David M.; Vogels, Jorrig; Demberg, Vera

G-TUNA: a corpus of referring expressions in German, including duration information Inproceedings

Proceedings of the 10th International Conference on Natural Language Generation, Association for Computational Linguistics, pp. 149-153, Santiago de Compostela, Spain, 2017.

Corpora of referring expressions elicited from human participants in a controlled environment are an important resource for research on automatic referring expression generation. We here present G-TUNA, a new corpus of referring expressions for German. Using the furniture stimuli set developed for the TUNA and D-TUNA corpora, our corpus extends these corpora by providing data collected in a simulated driving dual-task setting, and additionally provides exact duration annotations for the spoken referring expressions. This corpus will hence allow researchers to analyze the interaction between referring expression length and speech rate, under conditions where the listener is under high vs. low cognitive load.

@inproceedings{W17-3522,
title = {G-TUNA: a corpus of referring expressions in German, including duration information},
author = {David M. Howcroft and Jorrig Vogels and Vera Demberg},
url = {http://www.aclweb.org/anthology/W17-3522},
doi = {https://doi.org/10.18653/v1/W17-3522},
year = {2017},
date = {2017},
booktitle = {Proceedings of the 10th International Conference on Natural Language Generation},
pages = {149-153},
publisher = {Association for Computational Linguistics},
address = {Santiago de Compostela, Spain},
abstract = {Corpora of referring expressions elicited from human participants in a controlled environment are an important resource for research on automatic referring expression generation. We here present G-TUNA, a new corpus of referring expressions for German. Using the furniture stimuli set developed for the TUNA and D-TUNA corpora, our corpus extends these corpora by providing data collected in a simulated driving dual-task setting, and additionally provides exact duration annotations for the spoken referring expressions. This corpus will hence allow researchers to analyze the interaction between referring expression length and speech rate, under conditions where the listener is under high vs. low cognitive load.},
pubstate = {published},
type = {inproceedings}
}


Project:   A4

Howcroft, David M.; Demberg, Vera

Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking Inproceedings

Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, Association for Computational Linguistics, pp. 958-968, Valencia, Spain, 2017.

While previous research on readability has typically focused on document-level measures, recent work in areas such as natural language generation has pointed out the need of sentence-level readability measures. Much of psycholinguistics has focused for many years on processing measures that provide difficulty estimates on a word-by-word basis. However, these psycholinguistic measures have not yet been tested on sentence readability ranking tasks. In this paper, we use four psycholinguistic measures: idea density, surprisal, integration cost, and embedding depth to test whether these features are predictive of readability levels. We find that psycholinguistic features significantly improve performance by up to 3 percentage points over a standard document-level readability metric baseline.

@inproceedings{Howcroft2017,
title = {Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking},
author = {David M. Howcroft and Vera Demberg},
url = {http://www.aclweb.org/anthology/E17-1090},
year = {2017},
date = {2017-10-17},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers},
pages = {958-968},
publisher = {Association for Computational Linguistics},
address = {Valencia, Spain},
abstract = {While previous research on readability has typically focused on document-level measures, recent work in areas such as natural language generation has pointed out the need of sentence-level readability measures. Much of psycholinguistics has focused for many years on processing measures that provide difficulty estimates on a word-by-word basis. However, these psycholinguistic measures have not yet been tested on sentence readability ranking tasks. In this paper, we use four psycholinguistic measures: idea density, surprisal, integration cost, and embedding depth to test whether these features are predictive of readability levels. We find that psycholinguistic features significantly improve performance by up to 3 percentage points over a standard document-level readability metric baseline.},
pubstate = {published},
type = {inproceedings}
}


Project:   A4

Roth, Michael; Thater, Stefan; Ostermann, Simon; Pinkal, Manfred

Aligning Script Events with Narrative Texts Inproceedings

Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), Association for Computational Linguistics, Vancouver, Canada, 2017.

Script knowledge plays a central role in text understanding and is relevant for a variety of downstream tasks. In this paper, we consider two recent datasets which provide a rich and general representation of script events in terms of paraphrase sets.

We introduce the task of mapping event mentions in narrative texts to such script event types, and present a model for this task that exploits rich linguistic representations as well as information on temporal ordering. The results of our experiments demonstrate that this complex task is indeed feasible.

@inproceedings{ostermann-EtAl:2017:starSEM,
title = {Aligning Script Events with Narrative Texts},
author = {Michael Roth and Stefan Thater and Simon Ostermann and Manfred Pinkal},
url = {http://www.aclweb.org/anthology/S17-1016},
year = {2017},
date = {2017-10-17},
booktitle = {Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)},
publisher = {Association for Computational Linguistics},
address = {Vancouver, Canada},
abstract = {Script knowledge plays a central role in text understanding and is relevant for a variety of downstream tasks. In this paper, we consider two recent datasets which provide a rich and general representation of script events in terms of paraphrase sets. We introduce the task of mapping event mentions in narrative texts to such script event types, and present a model for this task that exploits rich linguistic representations as well as information on temporal ordering. The results of our experiments demonstrate that this complex task is indeed feasible.},
pubstate = {published},
type = {inproceedings}
}


Project:   A3

Nguyen, Dai Quoc; Nguyen, Dat Quoc; Modi, Ashutosh; Thater, Stefan; Pinkal, Manfred

A Mixture Model for Learning Multi-Sense Word Embeddings Inproceedings

Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), Association for Computational Linguistics, pp. 121-127, Vancouver, Canada, 2017.

Word embeddings are now a standard technique for inducing meaning representations for words. To get good representations, it is important to take into account the different senses of a word. In this paper, we propose a mixture model for learning multi-sense word embeddings.

Our model generalizes previous work in that it allows inducing different weights for different senses of a word. The experimental results show that our model outperforms previous models on standard evaluation tasks.

@inproceedings{nguyen-EtAl:2017:starSEM,
title = {A Mixture Model for Learning Multi-Sense Word Embeddings},
author = {Dai Quoc Nguyen and Dat Quoc Nguyen and Ashutosh Modi and Stefan Thater and Manfred Pinkal},
url = {http://www.aclweb.org/anthology/S17-1015},
year = {2017},
date = {2017},
booktitle = {Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)},
pages = {121-127},
publisher = {Association for Computational Linguistics},
address = {Vancouver, Canada},
abstract = {Word embeddings are now a standard technique for inducing meaning representations for words. To get good representations, it is important to take into account the different senses of a word. In this paper, we propose a mixture model for learning multi-sense word embeddings. Our model generalizes previous work in that it allows inducing different weights for different senses of a word. The experimental results show that our model outperforms previous models on standard evaluation tasks.},
pubstate = {published},
type = {inproceedings}
}


Projects:   A2 A3

Nguyen, Dai Quoc; Nguyen, Dat Quoc; Chu, Cuong Xuan; Thater, Stefan; Pinkal, Manfred

Sequence to Sequence Learning for Event Prediction Inproceedings

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Asian Federation of Natural Language Processing, pp. 37-42, Taipei, Taiwan, 2017.

This paper presents an approach to the task of predicting an event description from a preceding sentence in a text. Our approach explores sequence-to-sequence learning using a bidirectional multi-layer recurrent neural network. Our approach substantially outperforms previous work in terms of the BLEU score on two datasets derived from WikiHow and DeScript, respectively.

Since the BLEU score is not easy to interpret as a measure of event prediction, we complement our study with a second evaluation that exploits the rich linguistic annotation of gold paraphrase sets of events.

@inproceedings{nguyen-EtAl:2017:I17-2,
title = {Sequence to Sequence Learning for Event Prediction},
author = {Dai Quoc Nguyen and Dat Quoc Nguyen and Cuong Xuan Chu and Stefan Thater and Manfred Pinkal},
url = {http://www.aclweb.org/anthology/I17-2007},
year = {2017},
date = {2017-10-17},
booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
pages = {37-42},
publisher = {Asian Federation of Natural Language Processing},
address = {Taipei, Taiwan},
abstract = {This paper presents an approach to the task of predicting an event description from a preceding sentence in a text. Our approach explores sequence-to-sequence learning using a bidirectional multi-layer recurrent neural network. Our approach substantially outperforms previous work in terms of the BLEU score on two datasets derived from WikiHow and DeScript, respectively. Since the BLEU score is not easy to interpret as a measure of event prediction, we complement our study with a second evaluation that exploits the rich linguistic annotation of gold paraphrase sets of events.},
pubstate = {published},
type = {inproceedings}
}


Projects:   A3 A2

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

Overspecifications efficiently manage referential entropy in situated communication Inproceedings

Paper presented at the 39th Annual Conference of the German Linguistic Society (DGfS), Saarland University, Saarbrücken, Germany, 2017.

@inproceedings{Tourtourietal2017a,
title = {Overspecifications efficiently manage referential entropy in situated communication},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
year = {2017},
date = {2017},
booktitle = {Paper presented at the 39th Annual Conference of the German Linguistic Society (DGfS)},
publisher = {Saarland University},
address = {Saarbr{\"u}cken, Germany},
pubstate = {published},
type = {inproceedings}
}


Projects:   A1 C3

Delogu, Francesca; Crocker, Matthew W.; Drenhaus, Heiner

Teasing apart coercion and surprisal: Evidence from ERPs and eye-movements Journal Article

Cognition, 161, pp. 46-59, 2017.

Previous behavioral and electrophysiological studies have presented evidence suggesting that coercion expressions (e.g., began the book) are more difficult to process than control expressions like read the book. While this processing cost has been attributed to a specific coercion operation for recovering an event-sense of the complement (e.g., began reading the book), an alternative view based on the Surprisal Theory of language processing would attribute the cost to the relative unpredictability of the complement noun in the coercion compared to the control condition, with no need to postulate coercion-specific mechanisms. In two experiments, monitoring eye-tracking and event-related potentials (ERPs), respectively, we sought to determine whether there is any evidence for coercion-specific processing cost above and beyond the difficulty predicted by surprisal, by contrasting coercing and control expressions with a further control condition in which the predictability of the complement noun was similar to that in the coercion condition (e.g., bought the book). While the eye-tracking study showed significant effects of surprisal and a marginal effect of coercion on late reading measures, the ERP study clearly supported the surprisal account. Overall, our findings suggest that the coercion cost largely reflects the surprisal of the complement noun, with coercion-specific operations possibly influencing later processing stages.

@article{Brouwer2017,
title = {Teasing apart coercion and surprisal: Evidence from ERPs and eye-movements},
author = {Francesca Delogu and Matthew W. Crocker and Heiner Drenhaus},
url = {https://www.sciencedirect.com/science/article/pii/S0010027716303122},
doi = {https://doi.org/10.1016/j.cognition.2016.12.017},
year = {2017},
date = {2017},
journal = {Cognition},
pages = {46-59},
volume = {161},
abstract = {Previous behavioral and electrophysiological studies have presented evidence suggesting that coercion expressions (e.g., began the book) are more difficult to process than control expressions like read the book. While this processing cost has been attributed to a specific coercion operation for recovering an event-sense of the complement (e.g., began reading the book), an alternative view based on the Surprisal Theory of language processing would attribute the cost to the relative unpredictability of the complement noun in the coercion compared to the control condition, with no need to postulate coercion-specific mechanisms. In two experiments, monitoring eye-tracking and event-related potentials (ERPs), respectively, we sought to determine whether there is any evidence for coercion-specific processing cost above and beyond the difficulty predicted by surprisal, by contrasting coercing and control expressions with a further control condition in which the predictability of the complement noun was similar to that in the coercion condition (e.g., bought the book). While the eye-tracking study showed significant effects of surprisal and a marginal effect of coercion on late reading measures, the ERP study clearly supported the surprisal account. Overall, our findings suggest that the coercion cost largely reflects the surprisal of the complement noun, with coercion-specific operations possibly influencing later processing stages.},
pubstate = {published},
type = {article}
}

Project:   A1

Brouwer, Harm; Crocker, Matthew W.; Venhuizen, Noortje

Neural semantics Journal Article

From Semantics to Dialectometry: Festschrift in Honour of John Nerbonne, pp. 75-83, 2017.

The study of language is ultimately about meaning: how can meaning be constructed from linguistic signal, and how can it be represented? The human language comprehension system is highly efficient and accurate at attributing meaning to linguistic input. Hence, in trying to identify computational principles and representations for meaning construction, we should consider how these could be implemented at the neural level in the brain. Here, we introduce a framework for such a neural semantics. This framework offers meaning representations that are neurally plausible (can be implemented in neural hardware), expressive (capture negation, quantification, and modality), compositional (capture complex propositional meaning as the sum of its parts), graded (are probabilistic in nature), and inferential (allow for inferences beyond literal propositional content). Moreover, it is shown how these meaning representations can be constructed incrementally, on a word-by-word basis in a neurocomputational model of language processing.

@article{Brouwer2017b,
title = {Neural semantics},
author = {Harm Brouwer and Matthew W. Crocker and Noortje Venhuizen},
url = {https://research.rug.nl/en/publications/from-semantics-to-dialectometry-festschrift-in-honor-of-john-nerb},
year = {2017},
date = {2017},
journal = {From Semantics to Dialectometry: Festschrift in Honour of John Nerbonne},
pages = {75-83},
abstract = {The study of language is ultimately about meaning: how can meaning be constructed from linguistic signal, and how can it be represented? The human language comprehension system is highly efficient and accurate at attributing meaning to linguistic input. Hence, in trying to identify computational principles and representations for meaning construction, we should consider how these could be implemented at the neural level in the brain. Here, we introduce a framework for such a neural semantics. This framework offers meaning representations that are neurally plausible (can be implemented in neural hardware), expressive (capture negation, quantification, and modality), compositional (capture complex propositional meaning as the sum of its parts), graded (are probabilistic in nature), and inferential (allow for inferences beyond literal propositional content). Moreover, it is shown how these meaning representations can be constructed incrementally, on a word-by-word basis in a neurocomputational model of language processing.},
pubstate = {published},
type = {article}
}

Project:   A1

Brouwer, Harm; Crocker, Matthew W.

On the proper treatment of the N400 and P600 in language comprehension Journal Article

Frontiers in Psychology, 8, 2017, ISSN 1664-1078.

Event-Related Potentials (ERPs)—stimulus-locked, scalp-recorded voltage fluctuations caused by post-synaptic neural activity—have proven invaluable to the study of language comprehension. Of interest in the ERP signal are systematic, reoccurring voltage fluctuations called components, which are taken to reflect the neural activity underlying specific computational operations carried out in given neuroanatomical networks (cf. Näätänen and Picton, 1987). For language processing, the N400 component and the P600 component are of particular salience (see Kutas et al., 2006, for a review). The typical approach to determining whether a target word in a sentence leads to differential modulation of these components, relative to a control word, is to look for effects on mean amplitude in predetermined time-windows on the respective ERP waveforms, e.g., 350–550 ms for the N400 component and 600–900 ms for the P600 component. The common mode of operation in psycholinguistics, then, is to tabulate the presence/absence of N400- and/or P600-effects across studies, and to use this categorical data to inform neurocognitive models that attribute specific functional roles to the N400 and P600 component (see Kuperberg, 2007; Bornkessel-Schlesewsky and Schlesewsky, 2008; Brouwer et al., 2012, for reviews).

Here, we assert that this Waveform-based Component Structure (WCS) approach to ERPs leads to inconsistent data patterns, and hence, misinforms neurocognitive models of the electrophysiology of language processing. The reason for this is that the WCS approach ignores the latent component structure underlying ERP waveforms (cf. Luck, 2005), thereby leading to conclusions about component structure that do not factor in spatiotemporal component overlap of the N400 and the P600. This becomes particularly problematic when spatiotemporal component overlap interacts with differential P600 modulations due to task demands (cf. Kolk et al., 2003). While the problem of spatiotemporal component overlap is generally acknowledged, and occasionally invoked to account for within-study inconsistencies in the data, its implications are often overlooked in psycholinguistic theorizing that aims to integrate findings across studies. We believe WCS-centric theorizing to be the single largest reason for the lack of convergence regarding the processes underlying the N400 and the P600, thereby seriously hindering the advancement of neurocognitive theories and models of language processing.

@article{Brouwer2017,
title = {On the proper treatment of the N400 and P600 in language comprehension},
author = {Harm Brouwer and Matthew W. Crocker},
url = {https://www.frontiersin.org/articles/10.3389/fpsyg.2017.01327/full},
doi = {10.3389/fpsyg.2017.01327},
year = {2017},
date = {2017},
journal = {Frontiers in Psychology},
volume = {8},
abstract = {Event-Related Potentials (ERPs)—stimulus-locked, scalp-recorded voltage fluctuations caused by post-synaptic neural activity—have proven invaluable to the study of language comprehension. Of interest in the ERP signal are systematic, reoccurring voltage fluctuations called components, which are taken to reflect the neural activity underlying specific computational operations carried out in given neuroanatomical networks (cf. N{\"a}{\"a}t{\"a}nen and Picton, 1987). For language processing, the N400 component and the P600 component are of particular salience (see Kutas et al., 2006, for a review). The typical approach to determining whether a target word in a sentence leads to differential modulation of these components, relative to a control word, is to look for effects on mean amplitude in predetermined time-windows on the respective ERP waveforms, e.g., 350–550 ms for the N400 component and 600–900 ms for the P600 component. The common mode of operation in psycholinguistics, then, is to tabulate the presence/absence of N400- and/or P600-effects across studies, and to use this categorical data to inform neurocognitive models that attribute specific functional roles to the N400 and P600 component (see Kuperberg, 2007; Bornkessel-Schlesewsky and Schlesewsky, 2008; Brouwer et al., 2012, for reviews).

Here, we assert that this Waveform-based Component Structure (WCS) approach to ERPs leads to inconsistent data patterns, and hence, misinforms neurocognitive models of the electrophysiology of language processing. The reason for this is that the WCS approach ignores the latent component structure underlying ERP waveforms (cf. Luck, 2005), thereby leading to conclusions about component structure that do not factor in spatiotemporal component overlap of the N400 and the P600. This becomes particularly problematic when spatiotemporal component overlap interacts with differential P600 modulations due to task demands (cf. Kolk et al., 2003). While the problem of spatiotemporal component overlap is generally acknowledged, and occasionally invoked to account for within-study inconsistencies in the data, its implications are often overlooked in psycholinguistic theorizing that aims to integrate findings across studies. We believe WCS-centric theorizing to be the single largest reason for the lack of convergence regarding the processes underlying the N400 and the P600, thereby seriously hindering the advancement of neurocognitive theories and models of language processing.},
pubstate = {published},
type = {article}
}

Project:   A1

Rabs, Elisabeth; Drenhaus, Heiner; Delogu, Francesca; Crocker, Matthew W.

The influence of script knowledge on language processing: Evidence from ERPs Miscellaneous

23rd AMLaP Conference, Lancaster, UK, 2017.

Previous research has shown that the semantic expectedness of a word – as established by the linguistic context – is negatively correlated with N400 amplitude. While such evidence has been used to argue that the N400 indexes semantic integration processes, findings can often be explained in terms of facilitated lexical retrieval, which, among other factors, is influenced by lexical/semantic priming. In the present study we examine this issue by manipulating script event knowledge – a person’s knowledge about structured event sequences – which has been previously shown to modulate the N400. An ERP-study (German) investigated whether N400 modulation by a mentioned script event is due to priming alone, or is further sensitive to linguistic cues which would be expected to modulate script influence.

@miscellaneous{Rabs2017,
title = {The influence of script knowledge on language processing: Evidence from ERPs},
author = {Elisabeth Rabs and Heiner Drenhaus and Francesca Delogu and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/320988782_The_Influence_of_Script_Knowledge_on_Language_Processing_Evidence_from_ERPs},
year = {2017},
date = {2017},
publisher = {23rd AMLaP Conference},
address = {Lancaster, UK},
abstract = {Previous research has shown that the semantic expectedness of a word – as established by the linguistic context – is negatively correlated with N400 amplitude. While such evidence has been used to argue that the N400 indexes semantic integration processes, findings can often be explained in terms of facilitated lexical retrieval, which, among other factors, is influenced by lexical/semantic priming. In the present study we examine this issue by manipulating script event knowledge – a person’s knowledge about structured event sequences – which has been previously shown to modulate the N400. An ERP-study (German) investigated whether N400 modulation by a mentioned script event is due to priming alone, or is further sensitive to linguistic cues which would be expected to modulate script influence.},
pubstate = {published},
type = {miscellaneous}
}

Project:   A1
