Publications

Yung, Frances Pik Yu; Anuranjana, Kaveri; Scholman, Merel; Demberg, Vera

Label distributions help implicit discourse relation classification Inproceedings

Proceedings of the 3rd Workshop on Computational Approaches to Discourse (October 2022, Gyeongju, Republic of Korea and Online), International Conference on Computational Linguistics, pp. 48–53, 2022.

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses. Recently, DiscoGeM, a novel multi-domain corpus, which contains 10 crowd-sourced labels per relational instance, has become available. In this paper, we analyse the co-occurrences of relations in DiscoGeM and show that they are systematic and characteristic of text genre. We then test whether information on multi-label distributions in the data can help implicit relation classifiers. Our results show that incorporating multiple labels in parser training can improve its performance, and yield label distributions which are more similar to human label distributions, compared to a parser that is trained on just a single most frequent label per instance.
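The training contrast at issue can be made concrete with a short sketch: a classifier trained against the full crowd label distribution rather than the single majority label. This is an illustrative PyTorch fragment, not the authors' model; the encoder, the size of the sense inventory, and the example distributions are all invented.

import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SENSES = 14  # hypothetical size of the relation sense inventory

classifier = nn.Linear(768, NUM_SENSES)  # stand-in for an encoder + head
features = torch.randn(2, 768)           # dummy argument-pair encodings

# E.g. one instance labelled sense 0 by 6/10 crowd workers and sense 1 by 4/10.
soft_targets = torch.tensor([[0.6, 0.4] + [0.0] * (NUM_SENSES - 2),
                             [0.1, 0.0] + [0.0] * (NUM_SENSES - 3) + [0.9]])

log_probs = F.log_softmax(classifier(features), dim=-1)

# Soft-label objective: cross-entropy against the full distribution,
# equivalent to KL divergence up to the (constant) entropy of the targets.
soft_loss = -(soft_targets * log_probs).sum(dim=-1).mean()

# Majority-label baseline: collapse each distribution to its argmax.
hard_loss = F.nll_loss(log_probs, soft_targets.argmax(dim=-1))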

@inproceedings{Yungetal2022,
title = {Label distributions help implicit discourse relation classification},
author = {Frances Pik Yu Yung and Kaveri Anuranjana and Merel Scholman and Vera Demberg},
url = {https://aclanthology.org/2022.codi-1.7},
year = {2022},
date = {2022},
booktitle = {Proceedings of the 3rd Workshop on Computational Approaches to Discourse (October 2022, Gyeongju, Republic of Korea and Online)},
pages = {48–53},
publisher = {International Conference on Computational Linguistics},
abstract = {Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses. Recently, DiscoGeM, a novel multi-domain corpus, which contains 10 crowd-sourced labels per relational instance, has become available. In this paper, we analyse the co-occurrences of relations in DiscoGeM and show that they are systematic and characteristic of text genre. We then test whether information on multi-label distributions in the data can help implicit relation classifiers. Our results show that incorporating multiple labels in parser training can improve its performance, and yield label distributions which are more similar to human label distributions, compared to a parser that is trained on just a single most frequent label per instance.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Shi, Wei; Demberg, Vera

Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain Inproceedings

Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), 2021.

Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in the 1990s). We here tackle the task of implicit discourse relation classification on the biomedical domain, for which the Biomedical Discourse Relation Bank (BioDRB; Prasad et al., 2011) is available. We show that entity information can be used to improve discourse relational argument representation. In a first step, we show that explicitly marked instances that are content-wise similar to the target relations can be used to achieve good performance in the cross-domain setting using a simple unsupervised voting pipeline. As a further step, we show that with the linked entity information from the first step, a transformer which is augmented with entity-related information (KBERT; Liu et al., 2020) sets the new state-of-the-art performance on the dataset, outperforming the large pre-trained BioBERT (Lee et al., 2020) model by 2% points.
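The first step described above can be pictured with a small sketch of an unsupervised voting pipeline: an implicit instance receives the majority label of the explicitly marked instances most similar to it in content. The embedding function, the instance bank, and k below are placeholders, not the paper's actual setup.

import numpy as np
from collections import Counter

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Explicit instances: (embedding of the argument pair, known relation sense).
rng = np.random.default_rng(0)
explicit_bank = [(rng.normal(size=300), sense)
                 for sense in ["contrast", "cause", "cause", "conjunction"]]

def vote_label(target_vec, bank, k=3):
    """Majority vote over the k most content-similar explicit instances."""
    ranked = sorted(bank, key=lambda item: cosine(target_vec, item[0]),
                    reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]

print(vote_label(rng.normal(size=300), explicit_bank))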

@inproceedings{shi2021entity,
title = {Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain},
author = {Wei Shi and Vera Demberg},
url = {https://aclanthology.org/2021.acl-short.116.pdf},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)},
abstract = {Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s). We here tackle the task of implicit discourse relation classification on the biomedical domain, for which the Biomedical Discourse Relation Bank (BioDRB; Prasad et al., 2011) is available. We show that entity information can be used to improve discourse relational argument representation. In a first step, we show that explicitly marked instances that are content-wise similar to the target relations can be used to achieve good performance in the cross-domain setting using a simple unsupervised voting pipeline. As a further step, we show that with the linked entity information from the first step, a transformer which is augmented with entity-related information (KBERT; Liu et al., 2020) sets the new state of the art performance on the dataset, outperforming the large pre-trained BioBERT (Lee et al., 2020) model by 2% points.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Marchal, Marian; Scholman, Merel; Demberg, Vera

Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin Inproceedings

Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI 2021), Association for Computational Linguistics, pp. 84-94, Punta Cana, Dominican Republic and Online, 2021.

Cross-linguistic research on discourse structure and coherence marking requires discourse-annotated corpora and connective lexicons in a large number of languages. However, the availability of such resources is limited, especially for languages for which linguistic resources are scarce in general, such as Nigerian Pidgin. In this study, we demonstrate how a semi-automatic approach can be used to source connectives and their relation senses and develop a discourse-annotated corpus in a low-resource language. Connectives and their relation senses were extracted from a parallel corpus combining automatic (PDTB end-to-end parser) and manual annotations. This resulted in Naija-Lex, a lexicon of discourse connectives in Nigerian Pidgin with English translations. The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.
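The semi-automatic extraction step can be illustrated schematically: connectives identified on the English side of the parallel corpus (e.g., by the PDTB end-to-end parser) are projected through word alignments to collect Nigerian Pidgin lexicon candidates. The data structures below are invented purely for illustration.

from collections import defaultdict, Counter

def project_connectives(parallel_pairs):
    """Collect Pidgin lexicon candidates from parsed English connectives.

    Each pair: (en_tokens, pcm_tokens, alignment, connectives), where
    alignment maps English token indices to Pidgin token indices and
    connectives lists (english_token_index, relation_sense) items,
    e.g. from a PDTB-style parser run on the English side.
    """
    lexicon = defaultdict(Counter)
    for en_tokens, pcm_tokens, alignment, connectives in parallel_pairs:
        for idx, sense in connectives:
            for tgt in alignment.get(idx, []):
                lexicon[pcm_tokens[tgt]][sense] += 1  # candidate + sense count
    return lexicon

pair = (["but", "he", "came"], ["but", "e", "come"],
        {0: [0], 1: [1], 2: [2]}, [(0, "contrast")])
print(dict(project_connectives([pair])))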

@inproceedings{marchal-etal-2021-semi,
title = {Semi-automatic discourse annotation in a low-resource language: Developing a connective lexicon for Nigerian Pidgin},
author = {Marian Marchal and Merel Scholman and Vera Demberg},
url = {https://aclanthology.org/2021.codi-main.8/},
doi = {https://doi.org/10.18653/v1/2021.codi-main.8},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI 2021)},
pages = {84-94},
publisher = {Association for Computational Linguistics},
address = {Punta Cana, Dominican Republic and Online},
abstract = {Cross-linguistic research on discourse structure and coherence marking requires discourse-annotated corpora and connective lexicons in a large number of languages. However, the availability of such resources is limited, especially for languages for which linguistic resources are scarce in general, such as Nigerian Pidgin. In this study, we demonstrate how a semi-automatic approach can be used to source connectives and their relation senses and develop a discourse-annotated corpus in a low-resource language. Connectives and their relation senses were extracted from a parallel corpus combining automatic (PDTB end-to-end parser) and manual annotations. This resulted in Naija-Lex, a lexicon of discourse connectives in Nigerian Pidgin with English translations. The lexicon shows that the majority of Nigerian Pidgin connectives are borrowed from its English lexifier, but that there are also some connectives that are unique to Nigerian Pidgin.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Demberg, Vera; Torabi Asr, Fatemeh; Scholman, Merel

DiscAlign for Penn and RST Discourse Treebanks Miscellaneous

Linguistic Data Consortium, Philadelphia, 2021, ISBN 1-58563-975-3.

DiscAlign for Penn and RST Discourse Treebanks was developed by Saarland University. It consists of alignment information for the discourse annotations contained in Penn Discourse Treebank Version 2.0 (LDC2008T05) (PDTB 2.0) and RST Discourse Treebank (LDC2002T07) (RST-DT). PDTB 2.0 and RST-DT annotations overlap for 385 newspaper articles in sections 6, 11, 13, 19 and 23 of the Wall Street Journal corpus contained in Treebank-2 (LDC95T7). DiscAlign for Penn and RST Discourse Treebanks contains approximately 6,700 alignments between PDTB 2.0 and RST-DT relations.
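The release's concrete file format is not described here, so the following loader only illustrates, under the assumption of a hypothetical tab-separated layout, how the roughly 6,700 alignments could be joined back to the two source corpora.

import csv

def load_alignments(path):
    """Yield (doc_id, pdtb_relation_id, rst_relation_id) triples.

    Assumes a hypothetical TSV layout; the actual LDC distribution
    may organise the alignment information differently.
    """
    with open(path, newline="", encoding="utf8") as f:
        for row in csv.reader(f, delimiter="\t"):
            doc_id, pdtb_rel, rst_rel = row[:3]
            yield doc_id, pdtb_rel, rst_rel

# for doc_id, pdtb_rel, rst_rel in load_alignments("discalign.tsv"):
#     ... look up the PDTB sense and RST-DT relation label and compare ...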

@miscellaneous{Demberg_etal_DiscAlign,
title = {DiscAlign for Penn and RST Discourse Treebanks},
author = {Vera Demberg and Fatemeh Torabi Asr and Merel Scholman},
url = {https://catalog.ldc.upenn.edu/LDC2021T16},
doi = {https://doi.org/10.35111/cf0q-c454},
year = {2021},
date = {2021},
isbn = {1-58563-975-3},
publisher = {Linguistic Data Consortium},
address = {Philadelphia},
abstract = {DiscAlign for Penn and RST Discourse Treebanks was developed by Saarland University. It consists of alignment information for the discourse annotations contained in Penn Discourse Treebank Version 2.0 (LDC2008T05) (PDTB 2.0) and RST Discourse Treebank (LDC2002T07) (RST-DT). PDTB 2.0 and RST-DT annotations overlap for 385 newspaper articles in sections 6, 11, 13, 19 and 23 of the Wall Street Journal corpus contained in Treebank-2 (LDC95T7). DiscAlign for Penn and RST Discourse Treebanks contains approximately 6,700 alignments between PDTB 2.0 and RST-DT relations.},
pubstate = {published},
type = {miscellaneous}
}

Project:   B2

Crible, Ludivine; Demberg, Vera

The role of non-connective discourse cues and their interaction with connectives Journal Article

Pragmatics & Cognition, 27, pp. 313-338, 2021, ISSN 0929-0907.

The disambiguation and processing of coherence relations is often investigated with a focus on explicit connectives, such as but or so. Other, non-connective cues from the context also facilitate discourse inferences, although their precise disambiguating role and interaction with connectives have been largely overlooked in the psycholinguistic literature so far. This study reports on two crowdsourcing experiments that test the role of contextual cues (parallelism, antonyms, resultative verbs) in the disambiguation of contrast and consequence relations. We compare the effect of contextual cues in conceptually different relations, and with connectives that differ in their semantic precision. Using offline tasks, our results show that contextual cues significantly help disambiguating contrast and consequence relations in the absence of connectives. However, when connectives are present in the context, the effect of cues only holds if the connective is acceptable in the target relation. Overall, our study suggests that cues are decisive on their own, but only secondary in the presence of connectives. These results call for further investigation of the complex interplay between connective types, contextual cues, relation types and other linguistic and cognitive factors.

@article{Crible2021,
title = {The role of non-connective discourse cues and their interaction with connectives},
author = {Ludivine Crible and Vera Demberg},
url = {https://www.jbe-platform.com/content/journals/10.1075/pc.20003.cri},
doi = {https://doi.org/10.1075/pc.20003.cri},
year = {2021},
date = {2021},
journal = {Pragmatics & Cognition},
pages = {313 - 338},
volume = {27},
number = {2},
abstract = {The disambiguation and processing of coherence relations is often investigated with a focus on explicit connectives, such as but or so. Other, non-connective cues from the context also facilitate discourse inferences, although their precise disambiguating role and interaction with connectives have been largely overlooked in the psycholinguistic literature so far. This study reports on two crowdsourcing experiments that test the role of contextual cues (parallelism, antonyms, resultative verbs) in the disambiguation of contrast and consequence relations. We compare the effect of contextual cues in conceptually different relations, and with connectives that differ in their semantic precision. Using offline tasks, our results show that contextual cues significantly help disambiguating contrast and consequence relations in the absence of connectives. However, when connectives are present in the context, the effect of cues only holds if the connective is acceptable in the target relation. Overall, our study suggests that cues are decisive on their own, but only secondary in the presence of connectives. These results call for further investigation of the complex interplay between connective types, contextual cues, relation types and other linguistic and cognitive factors.},
pubstate = {published},
type = {article}
}

Project:   B2

Yung, Frances Pik Yu; Jungbluth, Jana; Demberg, Vera

Limits to the Rational Production of Discourse Connectives Journal Article

Frontiers in Psychology, 12, pp. 1729, 2021.

Rational accounts of language use such as the uniform information density hypothesis, which asserts that speakers distribute information uniformly across their utterances, and the rational speech act (RSA) model, which suggests that speakers optimize the formulation of their message by reasoning about what the comprehender would understand, have been hypothesized to account for a wide range of language use phenomena. We here specifically focus on the production of discourse connectives. While there is some prior work indicating that discourse connective production may be governed by RSA, that work uses a strongly gamified experimental setting. In this study, we aim to explore whether speakers reason about the interpretation of their conversational partner also in more realistic settings. We thereby systematically vary the task setup to tease apart effects of task instructions and effects of the speaker explicitly seeing the interpretation alternatives for the listener. Our results show that the RSA-predicted effect of connective choice based on reasoning about the listener is only found in the original setting where explicit interpretation alternatives of the listener are available for the speaker. The effect disappears when the speaker has to reason about listener interpretations. We furthermore find that rational effects are amplified by the gamified task setting, indicating that meta-reasoning about the specific task may play an important role and potentially limit the generalizability of the found effects to more naturalistic every-day language use.
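The RSA reasoning loop tested here can be made concrete with a toy computation: a literal listener interprets a connective in proportion to its semantic fit and the prior over relations, and a pragmatic speaker chooses the connective under which that listener most likely recovers the intended relation. The lexicon values and rationality parameter below are invented, not the paper's stimuli or estimates.

import numpy as np

connectives = ["but", "although"]
relations = ["contrast", "concession"]

# Semantic fit: rows = connectives, columns = relations (toy compatibility).
lexicon = np.array([[1.0, 0.6],
                    [0.3, 1.0]])
prior = np.array([0.5, 0.5])  # prior over relations

# Literal listener: P(relation | connective), proportional to fit * prior.
l0 = lexicon * prior
l0 = l0 / l0.sum(axis=1, keepdims=True)

# Pragmatic speaker: P(connective | relation), softmax over alpha * log L0.
alpha = 4.0  # rationality parameter
s1 = np.exp(alpha * np.log(l0)).T  # rows = relations, cols = connectives
s1 = s1 / s1.sum(axis=1, keepdims=True)

# Speaker choice probabilities for a concession relation: favours "although".
print(dict(zip(connectives, s1[relations.index("concession")])))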

@article{yungJungbluthDemberg2021,
title = {Limits to the Rational Production of Discourse Connectives},
author = {Frances Pik Yu Yung and Jana Jungbluth and Vera Demberg},
url = {https://www.frontiersin.org/article/10.3389/fpsyg.2021.660730},
doi = {https://doi.org/10.3389/fpsyg.2021.660730},
year = {2021},
date = {2021-05-28},
journal = {Frontiers in Psychology},
pages = {1729},
volume = {12},
abstract = {Rational accounts of language use such as the uniform information density hypothesis, which asserts that speakers distribute information uniformly across their utterances, and the rational speech act (RSA) model, which suggests that speakers optimize the formulation of their message by reasoning about what the comprehender would understand, have been hypothesized to account for a wide range of language use phenomena. We here specifically focus on the production of discourse connectives. While there is some prior work indicating that discourse connective production may be governed by RSA, that work uses a strongly gamified experimental setting. In this study, we aim to explore whether speakers reason about the interpretation of their conversational partner also in more realistic settings. We thereby systematically vary the task setup to tease apart effects of task instructions and effects of the speaker explicitly seeing the interpretation alternatives for the listener. Our results show that the RSA-predicted effect of connective choice based on reasoning about the listener is only found in the original setting where explicit interpretation alternatives of the listener are available for the speaker. The effect disappears when the speaker has to reason about listener interpretations. We furthermore find that rational effects are amplified by the gamified task setting, indicating that meta-reasoning about the specific task may play an important role and potentially limit the generalizability of the found effects to more naturalistic every-day language use.},
pubstate = {published},
type = {article}
}

Project:   B2

Köhne-Fuetterer, Judith; Drenhaus, Heiner; Delogu, Francesca; Demberg, Vera

The online processing of causal and concessive discourse connectives Journal Article

Linguistics, 59, pp. 417-448, 2021.

While there is a substantial amount of evidence for language processing being a highly incremental and predictive process, we still know relatively little about how top-down discourse based expectations are combined with bottom-up information such as discourse connectives. The present article reports on three experiments investigating this question using different methodologies (visual world paradigm and ERPs) in two languages (German and English). We find support for highly incremental processing of causal and concessive discourse connectives, causing anticipation of upcoming material. Our visual world study shows that anticipatory looks depend on the discourse connective; furthermore, the German ERP study revealed an N400 effect on a gender-marked adjective preceding the target noun, when the target noun was inconsistent with the expectations elicited by the combination of context and discourse connective. Moreover, our experiments reveal that the facilitation of downstream material based on earlier connectives comes at the cost of reversing original expectations, as evidenced by a P600 effect on the concessive relative to the causal connective.

@article{koehne2021online,
title = {The online processing of causal and concessive discourse connectives},
author = {Judith K{\"o}hne-Fuetterer and Heiner Drenhaus and Francesca Delogu and Vera Demberg},
url = {https://doi.org/10.1515/ling-2021-0011},
doi = {https://doi.org/10.1515/ling-2021-0011},
year = {2021},
date = {2021-03-04},
journal = {Linguistics},
pages = {417-448},
volume = {59},
number = {2},
abstract = {While there is a substantial amount of evidence for language processing being a highly incremental and predictive process, we still know relatively little about how top-down discourse based expectations are combined with bottom-up information such as discourse connectives. The present article reports on three experiments investigating this question using different methodologies (visual world paradigm and ERPs) in two languages (German and English). We find support for highly incremental processing of causal and concessive discourse connectives, causing anticipation of upcoming material. Our visual world study shows that anticipatory looks depend on the discourse connective; furthermore, the German ERP study revealed an N400 effect on a gender-marked adjective preceding the target noun, when the target noun was inconsistent with the expectations elicited by the combination of context and discourse connective. Moreover, our experiments reveal that the facilitation of downstream material based on earlier connectives comes at the cost of reversing original expectations, as evidenced by a P600 effect on the concessive relative to the causal connective.},
pubstate = {published},
type = {article}
}

Projects:   A1 B2 B3

Hoek, Jet; Scholman, Merel; Sanders, Ted J. M.

Is there less agreement when the discourse is underspecified? Inproceedings

Proceedings of the Integrating Perspectives on Discourse Annotation (DiscAnn) Workshop, University of Tübingen, Germany, 2021.

When annotating coherence relations, inter-annotator agreement tends to be lower on implicit relations than on relations that are explicitly marked by means of a connective or a cue phrase. This paper explores one possible explanation for this: the additional inferencing involved in interpreting implicit relations compared to explicit relations. If this is the main source of disagreements, agreement should be highly related to the specificity of the connective. Using the CCR framework, we annotated relations from TED talks that were marked by a very specific marker, marked by a highly ambiguous connective, or not marked by means of a connective at all. We indeed reached higher inter-annotator agreement on explicit than on implicit relations. However, agreement on underspecified relations was not necessarily in between, which is what would be expected if agreement on implicit relations mainly suffers because annotators have less specific instructions for inferring the relation.
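Agreement in such annotation studies is typically quantified with chance-corrected coefficients; as a minimal illustration, Cohen's kappa on toy annotations (the paper's own coefficient, label set, and data will of course differ).

from sklearn.metrics import cohen_kappa_score

# Two hypothetical annotators labelling the same five relation instances.
annotator_a = ["cause", "contrast", "cause", "condition", "cause"]
annotator_b = ["cause", "contrast", "concession", "condition", "cause"]

# Chance-corrected agreement; 1.0 = perfect, 0.0 = chance level.
print(round(cohen_kappa_score(annotator_a, annotator_b), 2))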

@inproceedings{hoek-etal-2021-discann,
title = {Is there less agreement when the discourse is underspecified?},
author = {Jet Hoek and Merel Scholman and Ted J. M. Sanders},
url = {https://aclanthology.org/2021.discann-1.1/},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Integrating Perspectives on Discourse Annotation (DiscAnn) Workshop},
address = {University of T{\"u}bingen, Germany},
abstract = {When annotating coherence relations, interannotator agreement tends to be lower on implicit relations than on relations that are explicitly marked by means of a connective or a cue phrase. This paper explores one possible explanation for this: the additional inferencing involved in interpreting implicit relations compared to explicit relations. If this is the main source of disagreements, agreement should be highly related to the specificity of the connective. Using the CCR framework, we annotated relations from TED talks that were marked by a very specific marker, marked by a highly ambiguous connective, or not marked by means of a connective at all. We indeed reached higher inter-annotator agreement on explicit than on implicit relations. However, agreement on underspecified relations was not necessarily in between, which is what would be expected if agreement on implicit relations mainly suffers because annotators have less specific instructions for inferring the relation.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Yung, Frances Pik Yu; Scholman, Merel; Demberg, Vera

A practical perspective on connective generation Inproceedings

Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI), Association for Computational Linguistics, pp. 72-83, Punta Cana, Dominican Republic and Online, 2021.

In data-driven natural language generation, we typically know what relation should be expressed and need to select a connective to lexicalize it. In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model). Comparing these methods to the distributions of connective choices from a human connective insertion task, we find mixed results: for some relations, it is acceptable to lexicalize them using any of the connectives that mark this relation. However, for other relations (temporals, concessives) either a more detailed relation distinction needs to be introduced, or a more sophisticated connective choice module would be necessary.
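Two of the simple methods contrasted in the abstract can be sketched directly: uniform random choice among connectives known to mark a relation, and scoring candidate connectives with a generic language model. The sense-to-connective map below is a toy fragment, and the language model is abstracted into a scoring callable rather than tied to any particular library.

import random

SENSE_TO_CONNECTIVES = {
    "concession": ["although", "even though", "but"],
    "result": ["so", "therefore", "as a result"],
}

def random_baseline(sense):
    """Pick uniformly among connectives known to mark the relation."""
    return random.choice(SENSE_TO_CONNECTIVES[sense])

def lm_baseline(sense, arg1, arg2, score):
    """score(text) -> log-probability under any generic language model;
    return the candidate connective that makes the joined text most probable."""
    candidates = SENSE_TO_CONNECTIVES[sense]
    return max(candidates, key=lambda c: score(f"{arg1} {c} {arg2}"))

print(random_baseline("result"))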

@inproceedings{yung-etal-2021-practical,
title = {A practical perspective on connective generation},
author = {Frances Pik Yu Yung and Merel Scholman and Vera Demberg},
url = {https://aclanthology.org/2021.codi-main.7},
doi = {https://doi.org/10.18653/v1/2021.codi-main.7},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI)},
pages = {72-83},
publisher = {Association for Computational Linguistics},
address = {Punta Cana, Dominican Republic and Online},
abstract = {In data-driven natural language generation, we typically know what relation should be expressed and need to select a connective to lexicalize it. In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model). Comparing these methods to the distributions of connective choices from a human connective insertion task, we find mixed results: for some relations, it is acceptable to lexicalize them using any of the connectives that mark this relation. However, for other relations (temporals, concessives) either a more detailed relation distinction needs to be introduced, or a more sophisticated connective choice module would be necessary.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Scholman, Merel; Dong, Tianai; Yung, Frances Pik Yu; Demberg, Vera

Comparison of methods for explicit discourse connective identification across various domains Inproceedings

Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI), Association for Computational Linguistics, pp. 95-106, Punta Cana, Dominican Republic and Online, 2021.

Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles. We here assess the performance on explicit connective identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize to different datasets, namely written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB) and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parse methods in all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and connectives that prove difficult to parse, in order to highlight the areas where gains can be made.
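The simple heuristic used as a baseline can be pictured as list-based string matching followed by span-level precision and recall; the connective list and matching window below are illustrative, not the paper's exact configuration.

CONNECTIVES = {"but", "because", "however", "as a result"}

def heuristic_spans(tokens):
    """Return (start, end) spans whose tokens match a listed connective."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 4, len(tokens)) + 1):
            if " ".join(tokens[i:j]).lower() in CONNECTIVES:
                spans.append((i, j))
    return spans

def precision_recall(predicted, gold):
    """Span-level precision and recall against gold connective spans."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    return tp / max(len(predicted), 1), tp / max(len(gold), 1)

tokens = "He left early because he was tired".split()
print(heuristic_spans(tokens))  # [(3, 4)]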

@inproceedings{scholman-etal-2021-comparison,
title = {Comparison of methods for explicit discourse connective identification across various domains},
author = {Merel Scholman and Tianai Dong and Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/2021.codi-main.9},
doi = {https://doi.org/10.18653/v1/2021.codi-main.9},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI)},
pages = {95-106},
publisher = {Association for Computational Linguistics},
address = {Punta Cana, Dominican Republic and Online},
abstract = {Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles. We here assess the performance on explicit connective identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize to different datasets, namely written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB) and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parse methods in all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and connectives that prove difficult to parse, in order to highlight the areas where gains can be made.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Shi, Wei

Addressing the data bottleneck in implicit discourse relation classification PhD Thesis

Saarland University, Saarbruecken, Germany, 2020.

When humans comprehend language, their interpretation consists of more than just the sum of the content of the sentences. Additional logic and semantic links (known as coherence relations or discourse relations) are inferred between sentences/clauses in the text. The identification of discourse relations is beneficial for various NLP applications such as question-answering, summarization, machine translation, information extraction, etc. Discourse relations are categorized into implicit and explicit discourse relations depending on whether there is an explicit discourse marker between the arguments. In this thesis, we mainly focus on implicit discourse relation classification, given that with the explicit markers acting as informative cues, the explicit relations are relatively easier to identify for machines. The recent neural network-based approaches in particular suffer from insufficient training (and test) data. As shown in Chapter 3 of this thesis, we start out by showing to what extent the limited data size is a problem in implicit discourse relation classification and propose data augmentation methods with the help of cross-lingual data. We then propose several approaches for better exploiting and encoding various types of existing data in the discourse relation classification task. Most of the existing machine learning methods train on sections 2-21 of the PDTB and test on section 23, which includes fewer than 800 implicit discourse relation instances in total. With the help of cross-validation, we argue that the standard test section of the PDTB is too small to draw conclusions upon. With more test samples in the cross-validation, we would come to very different conclusions about whether a feature is generally useful. Second, we propose a simple approach to automatically extract samples of implicit discourse relations from multilingual parallel corpora via back-translation. After back-translating from target languages, it is easy for the discourse parser to identify those examples that are originally implicit but explicit in the back-translations. Having those additional data in the training set, the experiments show significant improvements in different settings. Finally, better encoding ability is also of crucial importance for improving classification performance. We propose different methods, including a sequence-to-sequence neural network and a memory component, to help obtain a better representation of the arguments. We also show that having the correct next sentence is beneficial for the task within and across domains, with the help of the BERT (Devlin et al., 2019) model. When it comes to a new domain, it is beneficial to integrate external domain-specific knowledge. In Chapter 8, we show that with the entity enhancement, the performance on BioDRB is improved significantly compared with other BERT-based methods. In sum, the studies reported in this dissertation contribute to addressing the data bottleneck problem in implicit discourse relation classification and propose corresponding approaches that achieve 54.82% and 69.57% on PDTB and BioDRB respectively.


When humans comprehend language, their interpretation consists of more than just the sum of the content of the sentences. Additional logical and semantic links (known as coherence relations or discourse relations) are inferred between sentences in the text. Identifying discourse relations benefits various NLP applications such as question answering, summarization, machine translation, information extraction, etc. Discourse relations are divided into implicit and explicit discourse relations, depending on whether there is an explicit discourse marker between the arguments. In this thesis, we focus mainly on the classification of implicit discourse relations, since the explicit markers serve as helpful cues and the explicit relations are relatively easy for machines to identify. Various approaches have been proposed that achieve impressive results in implicit discourse relation classification. Most of them, however, suffer from the fact that the available data are insufficient for neural-network-based methods. In this thesis, we first address the problem of limited data for this task and then propose data augmentation methods based on cross-lingual data. Finally, we propose several methods for better encoding the arguments from various angles. Most existing machine learning methods are trained on sections 2-21 of the PDTB and tested on section 23, which contains fewer than 800 implicit discourse relation instances in total. Using cross-validation, we argue that the standard test section of the PDTB is too small to draw conclusions from. With more test samples in the cross-validation, we would reach different conclusions about whether a feature is generally beneficial for this task or not, especially when using a relatively large label set. If we rely only on our small standard test set, we risk drawing wrong conclusions about which features are helpful. Second, we propose a simple approach for automatically extracting samples of implicit discourse relations from multilingual parallel corpora via back-translation. It is motivated by the explicitation process that occurs when humans translate a text. After back-translation from the target languages, it is easy for the discourse parser to identify those examples that are originally implicit but explicit in the back-translations. With these additional data in the training set, the experiments show significant improvements in various settings. Initially, we use only French-English pairs, have no control over quality, and focus mostly on intra-sentential relations. To address these issues, we later extend the idea with more preprocessing steps and more language pairs. With majority votes from different language pairs, the mapped implicit labels become more reliable. Finally, better encoding ability is also of crucial importance for improving classification performance. We propose a new model consisting of a classifier and a sequence-to-sequence model. Besides predicting the correct label, they are also trained to produce a representation of the discourse relation arguments by attempting to predict the arguments including a suitable implicit connective. This novel secondary task forces the internal representation to encode the semantics of the relation arguments more completely and to perform a more fine-grained classification. To further capture general knowledge in contexts, we also employ a memory network to obtain an explicit context representation of training examples. For each test instance, we generate a knowledge vector by weighted reading of the memory. We evaluate the proposed model under various conditions, and the results show that the model with the memory network can facilitate the prediction of discourse relations by selecting examples with similar semantic representations and discourse relations. Even though better understanding, encoding, and semantic interpretation are essential and useful for the task of implicit discourse relation classification, they only do part of the job. A good implicit discourse relation classifier should also be aware of upcoming events, causes, consequences, etc., in order to encode discourse expectations into the sentence representations. With the help of the recently proposed BERT model, we investigate whether having the correct next sentence is beneficial for the task or not. The experimental results show that removing the next-sentence prediction task strongly degrades performance both within and across domains. The limited ability of BioBERT to learn domain-specific knowledge, i.e., entity information, entity relations, etc., motivates us to integrate external knowledge into the pre-trained language models. We propose an unsupervised method using information retrieval and knowledge graph techniques, under the assumption that if two instances share similar entities in both relational arguments, they are very likely to carry the same or a similar discourse relation. The approach achieves results on BioDRB comparable to baseline models. We then use the extracted relevant entities to enhance the pre-trained model K-BERT so that it encodes the meaning of the arguments better, outperforming the original BERT and BioBERT by 6.5% and 2% in accuracy, respectively. In sum, this dissertation contributes to addressing the data bottleneck problem in implicit discourse relation classification and proposes corresponding approaches in several respects, including demonstrating the limited-data problem and the risks of drawing conclusions from it; acquiring automatically annotated data through the explicitation process in human translation between English and other languages; better representations of discourse relation arguments; and entity enhancement with an unsupervised method and a pre-trained language model.
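The back-translation idea at the core of the data-augmentation chapters can be sketched schematically: if a relation that is implicit in the English original surfaces as an explicit connective in the back-translation, the connective's sense serves as a silver label, and voting across several pivot languages makes the mapped labels more reliable. The translation and parsing functions below are placeholders for real MT systems and a PDTB-style parser.

from collections import Counter

def harvest_silver_label(arg1, arg2, translate, back_translate, parse_connective):
    """Return a silver relation sense for an implicit (arg1, arg2) pair.

    translate / back_translate stand in for MT systems (e.g. EN -> FR -> EN);
    parse_connective stands in for a discourse parser that returns the sense
    of an explicit connective found in the back-translation, or None.
    """
    back = back_translate(translate(f"{arg1} {arg2}"))
    return parse_connective(back)  # sense, if the relation became explicit

def majority_label(silver_labels):
    """Vote over silver labels harvested via several pivot languages."""
    votes = Counter(label for label in silver_labels if label is not None)
    return votes.most_common(1)[0][0] if votes else None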

@phdthesis{Shi_Diss_2020,
title = {Addressing the data bottleneck in implicit discourse relation classification},
author = {Wei Shi},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/30143},
doi = {https://doi.org/10.22028/D291-32711},
year = {2020},
date = {2020},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {When humans comprehend language, their interpretation consists of more than just the sum of the content of the sentences. Additional logic and semantic links (known as coherence relations or discourse relations) are inferred between sentences/clauses in the text. The identification of discourse relations is beneficial for various NLP applications such as question-answering, summarization, machine translation, information extraction, etc. Discourse relations are categorized into implicit and explicit discourse relations depending on whether there is an explicit discourse marker between the arguments. In this thesis, we mainly focus on the implicit discourse relation classification, given that with the explicit markers acting as informative cues, the explicit relations are relatively easier to identify for machines. The recent neural network-based approaches in particular suffer from insufficient training (and test) data. As shown in Chapter 3 of this thesis, we start out by showing to what extent the limited data size is a problem in implicit discourse relation classification and propose data augmentation methods with the help of cross-lingual data. And then we propose several approaches for better exploiting and encoding various types of existing data in the discourse relation classification task. Most of the existing machine learning methods train on sections 2-21 of the PDTB and test on section 23, which only includes a total of less than 800 implicit discourse relation instances. With the help of cross validation, we argue that the standard test section of the PDTB is too small to draw conclusions upon. With more test samples in the cross validation, we would come to very different conclusions about whether a feature is generally useful. Second, we propose a simple approach to automatically extract samples of implicit discourse relations from multilingual parallel corpus via back-translation. After back-translating from target languages, it is easy for the discourse parser to identify those examples that are originally implicit but explicit in the back-translations. Having those additional data in the training set, the experiments show significant improvements on different settings. Finally, having better encoding ability is also of crucial importance in terms of improving classification performance. We propose different methods including a sequence-to-sequence neural network and a memory component to help have a better representation of the arguments. We also show that having the correct next sentence is beneficial for the task within and across domains, with the help of the BERT (Devlin et al., 2019) model. When it comes to a new domain, it is beneficial to integrate external domain-specific knowledge. In Chapter 8, we show that with the entity-enhancement, the performance on BioDRB is improved significantly, comparing with other BERT-based methods. In sum, the studies reported in this dissertation contribute to addressing the data bottleneck problem in implicit discourse relation classification and propose corresponding approaches that achieve 54.82% and 69.57% on PDTB and BioDRB respectively.


Wenn Menschen Sprache verstehen, besteht ihre Interpretation aus mehr als nur der Summe des Inhalts der S{\"a}tze. Zwischen S{\"a}tzen im Text werden zus{\"a}tzliche logische und semantische Verkn{\"u}pfungen (sogenannte Koh{\"a}renzrelationen oder Diskursrelationen) hergeleitet. Die Identifizierung von Diskursrelationen ist f{\"u}r verschiedene NLP-Anwendungen wie Frage- Antwort, Zusammenfassung, maschinelle {\"U}bersetzung, Informationsextraktion usw. von Vorteil. Diskursrelationen werden in implizite und explizite Diskursrelationen unterteilt, je nachdem, ob es eine explizite Diskursrelationen zwischen den Argumenten gibt. In dieser Arbeit konzentrieren wir uns haupts{\"a}chlich auf die Klassifizierung der impliziten Diskursrelationen, da die expliziten Marker als hilfreiche Hinweise dienen und die expliziten Beziehungen f{\"u}r Maschinen relativ leicht zu identifizieren sind. Es wurden verschiedene Ans{\"a}tze vorgeschlagen, die bei der impliziten Diskursrelationsklassifikation beeindruckende Ergebnisse erzielt haben. Die meisten von ihnen leiden jedoch darunter, dass die Daten f{\"u}r auf neuronalen Netzen basierende Methoden unzureichend sind. In dieser Arbeit gehen wir zun{\"a}chst auf das Problem begrenzter Daten bei dieser Aufgabe ein und schlagen dann Methoden zur Datenanreicherung mit Hilfe von sprach{\"u}bergreifenden Daten vor. Zuletzt schlagen wir mehrere Methoden vor, um die Argumente aus verschiedenen Aspekten besser kodieren zu k{\"o}nnen. Die meisten der existierenden Methoden des maschinellen Lernens werden auf den Abschnitten 2-21 der PDTB trainiert und auf dem Abschnitt 23 getestet, der insgesamt nur weniger als 800 implizite Diskursrelationsinstanzen enth{\"a}lt. Mit Hilfe der Kreuzvalidierung argumentieren wir, dass der Standardtestausschnitt der PDTB zu klein ist um daraus Schlussfolgerungen zu ziehen. Mit mehr Teststichproben in der Kreuzvalidierung w{\"u}rden wir zu anderen Schlussfolgerungen dar{\"u}ber kommen, ob ein Merkmal f{\"u}r diese Aufgabe generell vorteilhaft ist oder nicht, insbesondere wenn wir einen relativ gro{\ss}en Labelsatz verwenden. Wenn wir nur unseren kleinen Standardtestsatz herausstellen, laufen wir Gefahr, falsche Schl{\"u}sse dar{\"u}ber zu ziehen, welche Merkmale hilfreich sind. Zweitens schlagen wir einen einfachen Ansatz zur automatischen Extraktion von Samples impliziter Diskursrelationen aus mehrsprachigen Parallelkorpora durch R{\"u}ck{\"u}bersetzung vor. Er ist durch den Explikationsprozess motiviert, wenn Menschen einen Text {\"u}bersetzen. Nach der R{\"u}ck{\"u}bersetzung aus den Zielsprachen ist es f{\"u}r den Diskursparser leicht, diejenigen Beispiele zu identifizieren, die urspr{\"u}nglich implizit, in den R{\"u}ck{\"u}bersetzungen aber explizit enthalten sind. Da diese zus{\"a}tzlichen Daten im Trainingsset enthalten sind, zeigen die Experimente signifikante Verbesserungen in verschiedenen Situationen. Wir verwenden zun{\"a}chst nur franz{\"o}sisch-englische Paare und haben keine Kontrolle {\"u}ber die Qualit{\"a}t und konzentrieren uns meist auf die satzinternen Relationen. Um diese Fragen in Angriff zu nehmen, erweitern wir die Idee sp{\"a}ter mit mehr Vorverarbeitungsschritten und mehr Sprachpaaren. Mit den Mehrheitsentscheidungen aus verschiedenen Sprachpaaren sind die gemappten impliziten Labels zuverl{\"a}ssiger. Schlie{\ss}lich ist auch eine bessere Kodierf{\"a}higkeit von entscheidender Bedeutung f{\"u}r die Verbesserung der Klassifizierungsleistung. 
Wir schlagen ein neues Modell vor, das aus einem Klassifikator und einem Sequenz-zu-Sequenz-Modell besteht. Neben der korrekten Vorhersage des Labels werden sie auch darauf trainiert, eine Repr{\"a}sentation der Diskursrelationsargumente zu erzeugen, indem sie versuchen, die Argumente einschlie{\ss}lich eines geeigneten impliziten Konnektivs vorherzusagen. Die neuartige sekund{\"a}re Aufgabe zwingt die interne Repr{\"a}sentation dazu, die Semantik der Relationsargumente vollst{\"a}ndiger zu kodieren und eine feink{\"o}rnigere Klassifikation vorzunehmen. Um das allgemeine Wissen in Kontexten weiter zu erfassen, setzen wir auch ein Ged{\"a}chtnisnetzwerk ein, um eine explizite Kontextrepr{\"a}sentation von Trainingsbeispielen f{\"u}r Kontexte zu erhalten. F{\"u}r jede Testinstanz erzeugen wir durch gewichtetes Lesen des Ged{\"a}chtnisses einen Wissensvektor. Wir evaluieren das vorgeschlagene Modell unter verschiedenen Bedingungen und die Ergebnisse zeigen, dass das Modell mit dem Speichernetzwerk die Vorhersage von Diskursrelationen erleichtern kann, indem es Beispiele ausw{\"a}hlt, die eine {\"a}hnliche semantische Repr{\"a}sentation und Diskursrelationen aufweisen. Auch wenn ein besseres Verst{\"a}ndnis, eine Kodierung und semantische Interpretation f{\"u}r die Aufgabe der impliziten Diskursrelationsklassifikation unerl{\"a}sslich und n{\"u}tzlich sind, so leistet sie doch nur einen Teil der Arbeit. Ein guter impliziter Diskursrelationsklassifikator sollte sich auch der bevorstehenden Ereignisse, Ursachen, Folgen usw. bewusst sein, um die Diskurserwartung in die Satzdarstellungen zu kodieren. Mit Hilfe des k{\"u}rzlich vorgeschlagenen BERT-Modells versuchen wir herauszufinden, ob es f{\"u}r die Aufgabe vorteilhaft ist, den richtigen n{\"a}chsten Satz zu haben oder nicht. Die experimentellen Ergebnisse zeigen, dass das Entfernen der Aufgabe zur Vorhersage des n{\"a}chsten Satzes die Leistung sowohl innerhalb der Dom{\"a}ne als auch dom{\"a}nen{\"u}bergreifend stark beeintr{\"a}chtigt. Die begrenzte F{\"a}higkeit von BioBERT, dom{\"a}nenspezifisches Wissen, d.h. Entit{\"a}tsinformationen, Entit{\"a}tsbeziehungen etc. zu erlernen, motiviert uns, externes Wissen in die vortrainierten Sprachmodelle zu integrieren. Wir schlagen eine un{\"u}berwachte Methode vor, bei der Information-Retrieval-System und Wissensgraphen-Techniken verwendet werden, mit der Annahme, dass, wenn zwei Instanzen {\"a}hnliche Entit{\"a}ten in beiden relationalen Argumenten teilen, die Wahrscheinlichkeit gro{\ss} ist, dass sie die gleiche oder eine {\"a}hnliche Diskursrelation haben. Der Ansatz erzielt vergleichbare Ergebnisse auf BioDRB, verglichen mit Baselinemodellen. Anschlie{\ss}end verwenden wir die extrahierten relevanten Entit{\"a}ten zur Verbesserung des vortrainierten Modells K-BERT, um die Bedeutung der Argumente besser zu kodieren und das urspr{\"u}ngliche BERT und BioBERT mit einer Genauigkeit von 6,5% bzw. 2% zu {\"u}bertreffen. Zusammenfassend tr{\"a}gt diese Dissertation dazu bei, das Problem des Datenengpasses bei der impliziten Diskursrelationsklassifikation anzugehen, und schl{\"a}gt entsprechende Ans{\"a}tze in verschiedenen Aspekten vor, u.a. 
die Darstellung des begrenzten Datenproblems und der Risiken bei der Schlussfolgerung daraus; die Erfassung automatisch annotierter Daten durch den Explikationsprozess w{\"a}hrend der manuellen {\"U}bersetzung zwischen Englisch und anderen Sprachen; eine bessere Repr{\"a}sentation von Diskursrelationsargumenten; Entity-Enhancement mit einer un{\"u}berwachten Methode und einem vortrainierten Sprachmodell.},
pubstate = {published},
type = {phdthesis}
}

Project:   B2

Scholman, Merel; Demberg, Vera; Sanders, Ted J. M.

Individual differences in expecting coherence relations: Exploring the variability in sensitivity to contextual signals in discourse Journal Article

Discourse Processes, 57, pp. 844-861, 2020.

The current study investigated how a contextual list signal influences comprehenders’ inference generation of upcoming discourse relations and whether individual differences in working memory capacity and linguistic experience influence the generation of these inferences. Participants were asked to complete two-sentence stories, the first sentence of which contained an expression of quantity (a few, multiple). Several individual-difference measures were calculated to explore whether individual characteristics can explain the sensitivity to the contextual list signal. The results revealed that participants were sensitive to a contextual list signal (i.e., they provided list continuations), and this sensitivity was modulated by the participants’ linguistic experience, as measured by an author recognition test. The results showed no evidence that working memory affected participants’ responses. These results extend prior research by showing that contextual signals influence participants’ coherence-relation-inference generation. Further, the results of the current study emphasize the importance of individual reader characteristics when it comes to coherence-relation inferences.

@article{Scholman2020,
title = {Individual differences in expecting coherence relations: Exploring the variability in sensitivity to contextual signals in discourse},
author = {Merel Scholman and Vera Demberg and Ted J. M. Sanders},
url = {https://www.tandfonline.com/doi/full/10.1080/0163853X.2020.1813492},
doi = {https://doi.org/10.1080/0163853X.2020.1813492},
year = {2020},
date = {2020-10-02},
journal = {Discourse Processes},
pages = {844-861},
volume = {57},
number = {10},
abstract = {The current study investigated how a contextual list signal influences comprehenders’ inference generation of upcoming discourse relations and whether individual differences in working memory capacity and linguistic experience influence the generation of these inferences. Participants were asked to complete two-sentence stories, the first sentence of which contained an expression of quantity (a few, multiple). Several individual-difference measures were calculated to explore whether individual characteristics can explain the sensitivity to the contextual list signal. The results revealed that participants were sensitive to a contextual list signal (i.e., they provided list continuations), and this sensitivity was modulated by the participants’ linguistic experience, as measured by an author recognition test. The results showed no evidence that working memory affected participants’ responses. These results extend prior research by showing that contextual signals influence participants’ coherence-relation-inference generation. Further, the results of the current study emphasize the importance of individual reader characteristics when it comes to coherence-relation inferences.},
pubstate = {published},
type = {article}
}

Project:   B2

Crible, Ludivine; Demberg, Vera

When Do We Leave Discourse Relations Underspecified? The Effect of Formality and Relation Type Journal Article

Discours, 2020.

Speakers have several options when they express a discourse relation: they can leave it implicit, or make it explicit, usually through a connective. Although not all connectives can go with every relation, there is one that is particularly frequent and compatible with very many discourse relations, namely and. In this paper, we investigate the effect of discourse relation type and text genre on the production and perception of underspecified relations of contrast and consequence signalled by and. We combine a corpus study of spoken English, a production experiment and a perception experiment in order to test two hypotheses: (1) and is more compatible with relations of consequence than of contrast, due to factors of cognitive complexity and conceptual differences; (2) and is more compatible with informal than formal genres, because of requirements of recipient design. The three studies partially converge in identifying a stable effect of relation type and genre on the production and perception of underspecified relations of consequence and contrast marked by and.

@article{Crible2020,
title = {When Do We Leave Discourse Relations Underspecified? The Effect of Formality and Relation Type},
author = {Ludivine Crible and Vera Demberg},
url = {https://journals.openedition.org/discours/10848},
doi = {https://doi.org/10.4000/discours.10848},
year = {2020},
date = {2020},
journal = {Discours},
number = {26},
abstract = {Speakers have several options when they express a discourse relation: they can leave it implicit, or make it explicit, usually through a connective. Although not all connectives can go with every relation, there is one that is particularly frequent and compatible with very many discourse relations, namely and. In this paper, we investigate the effect of discourse relation type and text genre on the production and perception of underspecified relations of contrast and consequence signalled by and. We combine a corpus study of spoken English, a production experiment and a perception experiment in order to test two hypotheses: (1) and is more compatible with relations of consequence than of contrast, due to factors of cognitive complexity and conceptual differences; (2) and is more compatible with informal than formal genres, because of requirements of recipient design. The three studies partially converge in identifying a stable effect of relation type and genre on the production and perception of underspecified relations of consequence and contrast marked by and.},
pubstate = {published},
type = {article}
}

Project:   B2

Torabi Asr, Fatemeh; Demberg, Vera

Interpretation of Discourse Connectives Is Probabilistic: Evidence From the Study of But and Although Journal Article

Discourse Processes, 57, pp. 376-399, 2020.

Connectives can facilitate the processing of discourse relations by helping comprehenders to infer the intended coherence relation holding between two text spans. Previous experimental studies have focused on pairs of connectives that are very different from one another to be able to compare and formalize the distinguishing effects of these particles in discourse comprehension. In this article, we compare two connectives, but and although, which overlap in terms of the relations they can signal. We demonstrate in a set of carefully controlled studies that while a connective can be a marker of several discourse relations, it can have a specific fine-grained biasing effect on linguistic inferences and that this bias can be derived (or predicted) from the connectives’ distribution of relations found in production data. The effects that we find speak to the ambiguity of discourse connectives, in general, and the different functions of but and although, in particular. These effects cannot be explained within the earlier accounts of discourse connectives, which propose that each connective has a core meaning or processing instruction. Instead, we here lay out a probabilistic account of connective meaning and interpretation, which is based on the distribution of connectives in production and is supported by our experimental findings.
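The distributional notion underlying this account can be written down in a few lines: estimate P(relation | connective) from production counts and read each connective's bias off the resulting distribution. The counts below are invented for illustration only.

from collections import Counter

# Hypothetical production counts of (connective, relation) pairs.
production = Counter({
    ("but", "contrast"): 70, ("but", "concession"): 30,
    ("although", "contrast"): 25, ("although", "concession"): 75,
})

def relation_distribution(connective):
    """Estimate P(relation | connective) from the production counts."""
    counts = {rel: n for (conn, rel), n in production.items()
              if conn == connective}
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()}

print(relation_distribution("although"))  # biased towards concession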

@article{torabi2020interpretation,
title = {Interpretation of Discourse Connectives Is Probabilistic: Evidence From the Study of But and Although},
author = {Fatemeh Torabi Asr and Vera Demberg},
url = {https://www.tandfonline.com/doi/full/10.1080/0163853X.2019.1700760},
doi = {https://doi.org/10.1080/0163853X.2019.1700760},
year = {2020},
date = {2020-01-27},
journal = {Discourse Processes},
pages = {376-399},
volume = {57},
number = {4},
abstract = {Connectives can facilitate the processing of discourse relations by helping comprehenders to infer the intended coherence relation holding between two text spans. Previous experimental studies have focused on pairs of connectives that are very different from one another to be able to compare and formalize the distinguishing effects of these particles in discourse comprehension. In this article, we compare two connectives, but and although, which overlap in terms of the relations they can signal. We demonstrate in a set of carefully controlled studies that while a connective can be a marker of several discourse relations, it can have a specific fine-grained biasing effect on linguistic inferences and that this bias can be derived (or predicted) from the connectives’ distribution of relations found in production data. The effects that we find speak to the ambiguity of discourse connectives, in general, and the different functions of but and although, in particular. These effects cannot be explained within the earlier accounts of discourse connectives, which propose that each connective has a core meaning or processing instruction. Instead, we here lay out a probabilistic account of connective meaning and interpretation, which is based on the distribution of connectives in production and is supported by our experimental findings.},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project:   B2

Shi, Wei; Demberg, Vera

Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains Inproceedings

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, pp. 5789-5795, Hong Kong, China, 2019.

Implicit discourse relation classification is one of the most difficult tasks in discourse parsing. Previous studies have generally focused on extracting better representations of the relational arguments. In order to solve the task, it is however additionally necessary to capture what events are expected to cause or follow each other. Current discourse relation classifiers fall short in this respect. We here show that this shortcoming can be effectively addressed by using the bidirectional encoder representations from transformers (BERT) proposed by Devlin et al. (2019), which were trained on a next-sentence prediction task, and thus encode a representation of likely next sentences. The BERT-based model outperforms the current state of the art in 11-way classification by 8% points on the standard PDTB dataset. Our experiments also demonstrate that the model can be successfully ported to other domains: on the BioDRB dataset, the model outperforms the state-of-the-art system by around 15% points.

@inproceedings{shi-demberg-2019-next,
title = {Next Sentence Prediction helps Implicit Discourse Relation Classification within and across Domains},
author = {Wei Shi and Vera Demberg},
url = {https://www.aclweb.org/anthology/D19-1586},
doi = {https://doi.org/10.18653/v1/D19-1586},
year = {2019},
date = {2019-11-03},
booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
pages = {5789-5795},
publisher = {Association for Computational Linguistics},
address = {Hong Kong, China},
abstract = {Implicit discourse relation classification is one of the most difficult tasks in discourse parsing. Previous studies have generally focused on extracting better representations of the relational arguments. In order to solve the task, it is however additionally necessary to capture what events are expected to cause or follow each other. Current discourse relation classifiers fall short in this respect. We here show that this shortcoming can be effectively addressed by using the bidirectional encoder representations from transformers (BERT) proposed by Devlin et al. (2019), which were trained on a next-sentence prediction task, and thus encode a representation of likely next sentences. The BERT-based model outperforms the current state of the art in 11-way classification by 8% points on the standard PDTB dataset. Our experiments also demonstrate that the model can be successfully ported to other domains: on the BioDRB dataset, the model outperforms the state-of-the-art system by around 15% points.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   B2

Scholman, Merel

Coherence relations in discourse and cognition: comparing approaches, annotations, and interpretations PhD Thesis

Saarland University, Saarbruecken, Germany, 2019.

When readers comprehend a discourse, they do not merely interpret each clause or sentence separately; rather, they assign meaning to the text by creating semantic links between the clauses and sentences. These links are known as coherence relations (cf. Hobbs, 1979; Sanders, Spooren & Noordman, 1992). If readers are not able to construct such relations between the clauses and sentences of a text, they will fail to fully understand that text. Discourse coherence is therefore crucial to natural language comprehension in general. Most frameworks that propose inventories of coherence relation types agree on the existence of certain coarse-grained relation types, such as causal relations (relation types belonging to the causal class include Cause or Result relations), and additive relations (e.g., Conjunctions or Specifications). However, researchers often disagree on which finer-grained relation types hold and, as a result, there is no uniform set of relations that the community has agreed on (Hovy & Maier, 1995). Using a combination of corpus-based studies and off-line and on-line experimental methods, the studies reported in this dissertation examine distinctions between types of relations. The studies are based on the argument that coherence relations are cognitive entities, and distinctions of coherence relation types should therefore be validated using observations that speak to both the descriptive adequacy and the cognitive plausibility of the distinctions. Various distinctions between relation types are investigated on several levels, corresponding to the central challenges of the thesis. First, the distinctions that are made in approaches to coherence relations are analysed by comparing the relational classes and assessing the theoretical correspondences between the proposals. An interlingua is developed that can be used to map relational labels from one approach to another, therefore improving the interoperability between the different approaches. Second, practical correspondences between different approaches are studied by evaluating datasets containing coherence relation annotations from multiple approaches. A comparison of the annotations from different approaches on the same data corroborates the interlingua, but also reveals systematic patterns of discrepancies between the frameworks that are caused by different operationalizations. Finally, in the experimental part of the dissertation, readers’ interpretations are investigated to determine whether readers are able to distinguish between specific types of relations that cause the discrepancies between approaches. Results from off-line and on-line studies provide insight into readers’ interpretations of multi-interpretable relations, individual differences in interpretations, anticipation of discourse structure, and distributional differences between languages on readers’ processing of discourse. In sum, the studies reported in this dissertation contribute to a more detailed understanding of which types of relations comprehenders construct and how these relations are inferred and processed.

@phdthesis{Scholman_diss_2019,
title = {Coherence relations in discourse and cognition: comparing approaches, annotations, and interpretations},
author = {Merel Scholman},
url = {http://nbn-resolving.de/urn:nbn:de:bsz:291--ds-278687},
doi = {https://doi.org/10.22028/D291-27868},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {When readers comprehend a discourse, they do not merely interpret each clause or sentence separately; rather, they assign meaning to the text by creating semantic links between the clauses and sentences. These links are known as coherence relations (cf. Hobbs, 1979; Sanders, Spooren & Noordman, 1992). If readers are not able to construct such relations between the clauses and sentences of a text, they will fail to fully understand that text. Discourse coherence is therefore crucial to natural language comprehension in general. Most frameworks that propose inventories of coherence relation types agree on the existence of certain coarse-grained relation types, such as causal relations (relation types belonging to the causal class include Cause or Result relations), and additive relations (e.g., Conjunctions or Specifications). However, researchers often disagree on which finer-grained relation types hold and, as a result, there is no uniform set of relations that the community has agreed on (Hovy & Maier, 1995). Using a combination of corpus-based studies and off-line and on-line experimental methods, the studies reported in this dissertation examine distinctions between types of relations. The studies are based on the argument that coherence relations are cognitive entities, and distinctions of coherence relation types should therefore be validated using observations that speak to both the descriptive adequacy and the cognitive plausibility of the distinctions. Various distinctions between relation types are investigated on several levels, corresponding to the central challenges of the thesis. First, the distinctions that are made in approaches to coherence relations are analysed by comparing the relational classes and assessing the theoretical correspondences between the proposals. An interlingua is developed that can be used to map relational labels from one approach to another, therefore improving the interoperability between the different approaches. Second, practical correspondences between different approaches are studied by evaluating datasets containing coherence relation annotations from multiple approaches. A comparison of the annotations from different approaches on the same data corroborates the interlingua, but also reveals systematic patterns of discrepancies between the frameworks that are caused by different operationalizations. Finally, in the experimental part of the dissertation, readers’ interpretations are investigated to determine whether readers are able to distinguish between specific types of relations that cause the discrepancies between approaches. Results from off-line and on-line studies provide insight into readers’ interpretations of multi-interpretable relations, individual differences in interpretations, anticipation of discourse structure, and distributional differences between languages on readers’ processing of discourse. In sum, the studies reported in this dissertation contribute to a more detailed understanding of which types of relations comprehenders construct and how these relations are inferred and processed.},
pubstate = {published},
type = {phdthesis}
}

Copy BibTeX to Clipboard

Project:   B2

Zhai, Fangzhou; Demberg, Vera; Shkadzko, Pavel; Shi, Wei; Sayeed, Asad

A Hybrid Model for Globally Coherent Story Generation Inproceedings

Proceedings of the Second Workshop on Storytelling, Association for Computational Linguistics, pp. 34-45, Florence, Italy, 2019.

Automatically generating globally coherent stories is a challenging problem. Neural text generation models have been shown to perform well at generating fluent sentences from data, but they usually fail to keep track of the overall coherence of the story after a couple of sentences. Existing work that incorporates a text planning module succeeded in generating recipes and dialogues, but appears quite data-demanding. We propose a novel story generation approach that generates globally coherent stories from a fairly small corpus. The model exploits a symbolic text planning module to produce text plans, thus reducing the demand of data; a neural surface realization module then generates fluent text conditioned on the text plan. Human evaluation showed that our model outperforms various baselines by a wide margin and generates stories which are fluent as well as globally coherent.

@inproceedings{Fangzhou2019,
title = {A Hybrid Model for Globally Coherent Story Generation},
author = {Fangzhou Zhai and Vera Demberg and Pavel Shkadzko and Wei Shi and Asad Sayeed},
url = {https://aclanthology.org/W19-3404},
doi = {https://doi.org/10.18653/v1/W19-3404},
year = {2019},
date = {2019},
booktitle = {Proceedings of the Second Workshop on Storytelling},
pages = {34-45},
publisher = {Association for Computational Linguistics},
address = {Florence, Italy},
abstract = {Automatically generating globally coherent stories is a challenging problem. Neural text generation models have been shown to perform well at generating fluent sentences from data, but they usually fail to keep track of the overall coherence of the story after a couple of sentences. Existing work that incorporates a text planning module succeeded in generating recipes and dialogues, but appears quite data-demanding. We propose a novel story generation approach that generates globally coherent stories from a fairly small corpus. The model exploits a symbolic text planning module to produce text plans, thus reducing the demand of data; a neural surface realization module then generates fluent text conditioned on the text plan. Human evaluation showed that our model outperforms various baselines by a wide margin and generates stories which are fluent as well as globally coherent.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Projects:   A3 B2

Yung, Frances Pik Yu; Scholman, Merel; Demberg, Vera

Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task Inproceedings

Friedrich, Annemarie; Zeyrek, Deniz; Hoek, Jet (Ed.): Linguistic Annotation Workshop at ACL. LAW XIII 2019, pp. 16-25, Stroudsburg, PA, 2019, ISBN 978-1-950737-38-3.

The perspective of being able to crowd-source coherence relations bears the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora. It would also open the avenue to answering new research questions: Collecting annotations from a larger number of individuals per instance would allow us to investigate the distribution of inferred relations, and to study individual differences in coherence relation interpretation. However, annotating coherence relations with untrained workers is not trivial. We here propose a novel two-step annotation procedure, which extends an earlier method by Scholman and Demberg (2017a). In our approach, coherence relation labels are inferred from connectives that workers insert into the text. We show that the proposed method leads to replicable coherence annotations, and analyse the agreement between the obtained relation labels and annotations from PDTB and RST-DT on the same texts.

@inproceedings{Yung2019,
title = {Crowdsourcing Discourse Relation Annotations by a Two-Step Connective Insertion Task},
author = {Frances Pik Yu Yung and Merel Scholman and Vera Demberg},
editor = {Annemarie Friedrich and Deniz Zeyrek and Jet Hoek},
url = {https://aclanthology.org/W19-4003.pdf},
doi = {https://doi.org/10.22028/D291-30470},
year = {2019},
date = {2019-08-01},
isbn = {978-1-950737-38-3},
pages = {16-25},
publisher = {Linguistic Annotation Workshop at ACL. LAW XIII 2019},
address = {Stroudsburg, PA},
abstract = {The perspective of being able to crowd-source coherence relations bears the promise of acquiring annotations for new texts quickly, which could then increase the size and variety of discourse-annotated corpora. It would also open the avenue to answering new research questions: Collecting annotations from a larger number of individuals per instance would allow us to investigate the distribution of inferred relations, and to study individual differences in coherence relation interpretation. However, annotating coherence relations with untrained workers is not trivial. We here propose a novel two-step annotation procedure, which extends an earlier method by Scholman and Demberg (2017a). In our approach, coherence relation labels are inferred from connectives that workers insert into the text. We show that the proposed method leads to replicable coherence annotations, and analyse the agreement between the obtained relation labels and annotations from PDTB and RST-DT on the same texts.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   B2

Shi, Wei; Yung, Frances Pik Yu; Demberg, Vera

Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification Inproceedings

Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019, Association for Computational Linguistics, pp. 12-21, Minneapolis, USA, 2019.

Implicit discourse relation classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues. A principal bottleneck to further improvement is the shortage of training data (ca. 18k instances in the Penn Discourse Treebank (PDTB)). Shi et al. (2017) proposed to acquire additional data by exploiting connectives in translation: human translators mark discourse relations which are implicit in the source language explicitly in the translation. Using back-translations of such explicitated connectives improves discourse relation parsing performance. This paper addresses the open question of whether the choice of the translation language matters, and whether multiple translations into different languages can be effectively used to improve the quality of the additional data.

@inproceedings{Shi2019,
title = {Acquiring Annotated Data with Cross-lingual Explicitation for Implicit Discourse Relation Classification},
author = {Wei Shi and Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/W19-2703},
doi = {https://doi.org/10.18653/v1/W19-2703},
year = {2019},
date = {2019-06-06},
booktitle = {Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019},
pages = {12-21},
publisher = {Association for Computational Linguistics},
address = {Minneapolis, USA},
abstract = {Implicit discourse relation classification is one of the most challenging and important tasks in discourse parsing, due to the lack of connectives as strong linguistic cues. A principal bottleneck to further improvement is the shortage of training data (ca. 18k instances in the Penn Discourse Treebank (PDTB)). Shi et al. (2017) proposed to acquire additional data by exploiting connectives in translation: human translators mark discourse relations which are implicit in the source language explicitly in the translation. Using back-translations of such explicitated connectives improves discourse relation parsing performance. This paper addresses the open question of whether the choice of the translation language matters, and whether multiple translations into different languages can be effectively used to improve the quality of the additional data.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   B2

Demberg, Vera; Scholman, Merel; Torabi Asr, Fatemeh

How compatible are our discourse annotation frameworks? Insights from mapping RST-DT and PDTB annotations Journal Article

Dialogue & Discourse, 10 (1), pp. 87-135, 2019.

Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes comparison of the annotations difficult, thereby also preventing researchers from searching the corpora in a unified way, or using all annotated data jointly to train computational systems. Several theoretical proposals have recently been made for mapping the relational labels of different frameworks to each other, but these proposals have so far not been validated against existing annotations. The two largest discourse relation annotated resources, the Penn Discourse Treebank and the Rhetorical Structure Theory Discourse Treebank, have however been annotated on the same text, allowing for a direct comparison of the annotation layers. We propose a method for automatically aligning the discourse segments, and then evaluate existing mapping proposals by comparing the empirically observed against the proposed mappings. Our analysis highlights the influence of segmentation on subsequent discourse relation labeling, and shows that while agreement between frameworks is reasonable for explicit relations, agreement on implicit relations is low. We identify several sources of systematic discrepancies between the two annotation schemes and discuss consequences of these discrepancies for future annotation and for the training of automatic discourse relation labellers.

@article{Demberg2019,
title = {How compatible are our discourse annotation frameworks? Insights from mapping RST-DT and PDTB annotations},
author = {Vera Demberg and Merel Scholman and Fatemeh Torabi Asr},
url = {http://arxiv.org/abs/1704.08893},
year = {2019},
date = {2019-06-01},
journal = {Dialogue & Discourse},
pages = {87-135},
volume = {10},
number = {1},
abstract = {Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes comparison of the annotations difficult, thereby also preventing researchers from searching the corpora in a unified way, or using all annotated data jointly to train computational systems. Several theoretical proposals have recently been made for mapping the relational labels of different frameworks to each other, but these proposals have so far not been validated against existing annotations. The two largest discourse relation annotated resources, the Penn Discourse Treebank and the Rhetorical Structure Theory Discourse Treebank, have however been annotated on the same text, allowing for a direct comparison of the annotation layers. We propose a method for automatically aligning the discourse segments, and then evaluate existing mapping proposals by comparing the empirically observed against the proposed mappings. Our analysis highlights the influence of segmentation on subsequent discourse relation labeling, and shows that while agreement between frameworks is reasonable for explicit relations, agreement on implicit relations is low. We identify several sources of systematic discrepancies between the two annotation schemes and discuss consequences of these discrepancies for future annotation and for the training of automatic discourse relation labellers.},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project:   B2
