Publications

Yung, Frances Pik Yu; Ahmad, Mansoor; Scholman, Merel; Demberg, Vera

Prompting Implicit Discourse Relation Annotation Inproceedings Forthcoming

Proceedings of the Linguistic Annotation Workshop at the European Chapter of the Association for Computational Linguistics (EACL), 2024.

Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers. Nonetheless, ChatGPT’s performance in the task of implicit discourse relation classification, prompted by a standard multiple-choice question, is still far from satisfactory and considerably inferior to state-of-the-art supervised approaches. This work investigates several proven prompting techniques to improve ChatGPT’s recognition of discourse relations. In particular, we experimented with breaking down the classification task, which involves numerous abstract labels, into smaller subtasks. Nonetheless, the experimental results show that the inference accuracy hardly changes even with sophisticated prompt engineering, suggesting that implicit discourse relation classification is not yet resolvable under zero-shot or few-shot settings.

@inproceedings{yung-etal-2024-prompting,
title = {Prompting Implicit Discourse Relation Annotation},
author = {Frances Pik Yu Yung and Mansoor Ahmad and Merel Scholman and Vera Demberg},
year = {2024},
date = {2024},
booktitle = {Proceedings of the Linguistic Annotation Workshop at the European Chapter of the Association for Computational Linguistics (EACL)},
abstract = {Pre-trained large language models, such as ChatGPT, achieve outstanding performance in various reasoning tasks without supervised training and were found to have outperformed crowdsourcing workers. Nonetheless, ChatGPT's performance in the task of implicit discourse relation classification, prompted by a standard multiple-choice question, is still far from satisfactory and considerably inferior to state-of-the-art supervised approaches. This work investigates several proven prompting techniques to improve ChatGPT's recognition of discourse relations. In particular, we experimented with breaking down the classification task, which involves numerous abstract labels, into smaller subtasks. Nonetheless, the experimental results show that the inference accuracy hardly changes even with sophisticated prompt engineering, suggesting that implicit discourse relation classification is not yet resolvable under zero-shot or few-shot settings.},
pubstate = {forthcoming},
type = {inproceedings}
}

Project:   B2

Yung, Frances Pik Yu; Scholman, Merel; Zikanova, Sarka; Demberg, Vera

DiscoGeM 2.0: A parallel corpus of English, German, French and Czech implicit discourse relations Inproceedings Forthcoming

The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024.

@inproceedings{yung-etal-2024,
title = {DiscoGeM 2.0: A parallel corpus of English, German, French and Czech implicit discourse relations},
author = {Frances Pik Yu Yung and Merel Scholman and Sarka Zikanova and Vera Demberg},
year = {2024},
date = {2024},
booktitle = {The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
pubstate = {forthcoming},
type = {inproceedings}
}

Project:   B2

Lin, Pin-Jie; Saeed, Muhammed; Scholman, Merel; Demberg, Vera

Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin Inproceedings Forthcoming

The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024.

@inproceedings{lin-et-al-2024,
title = {Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin},
author = {Pin-Jie Lin and Muhammed Saeed and Merel Scholman and Vera Demberg},
year = {2024},
date = {2024},
booktitle = {The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
pubstate = {forthcoming},
type = {inproceedings}
}

Project:   B2

Scholman, Merel; Marchal, Marian; Demberg, Vera

Connective comprehension in adults: The influence of lexical transparency, frequency, and individual differences Journal Article Forthcoming

Discourse Processes, 2024.

This study aims to investigate: (1) whether differences in how well people understand the meaning and intended use of English connectives are dependent on connective-internal features, and (2) whether any individual variation between participants can be explained by participants’ linguistic experience, word knowledge, non-verbal IQ, and intrinsic motivation.

@article{Scholman_etal_2024,
title = {Connective comprehension in adults: The influence of lexical transparency, frequency, and individual differences},
author = {Merel Scholman and Marian Marchal and Vera Demberg},
year = {2024},
date = {2024},
journal = {Discourse Processes},
abstract = {This study aims to investigate: (1) whether differences in how well people understand the meaning and intended use of English connectives are dependent on connective-internal features, and (2) whether any individual variation between participants can be explained by participants’ linguistic experience, word knowledge, non-verbal IQ, and intrinsic motivation.},
pubstate = {forthcoming},
type = {article}
}

Project:   B2

Bourgonje, Peter; Lin, Pin-Jie

Projecting Annotations for Discourse Relations: Connective Identification for Low-Resource Languages Inproceedings Forthcoming

Proceedings of the 5th Workshop on Computational Approaches to Discourse at the European Chapter of the Association for Computational Linguistics (EACL), Association for Computational Linguistics, Malta, 2024.

@inproceedings{Bourgonje-etal-2024,
title = {Projecting Annotations for Discourse Relations: Connective Identification for Low-Resource Languages},
author = {Peter Bourgonje and Pin-Jie Lin},
year = {2024},
date = {2024},
booktitle = {Proceedings of the 5th Workshop on Computational Approaches to Discourse at the European Chapter of the Association for Computational Linguistics (EACL)},
publisher = {Association for Computational Linguistics},
address = {Malta},
pubstate = {forthcoming},
type = {inproceedings}
}

Project:   B2

Varghese, Nobel; Yung, Frances Pik Yu; Anuranjana, Kaveri; Demberg, Vera

Exploiting Knowledge about Discourse Relations for Implicit Discourse Relation Classification Inproceedings

Strube, Michael; Braud, Chloe; Hardmeier, Christian; Li, Junyi Jessy; Loaiciga, Sharid; Zeldes, Amir (Ed.): Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023), Association for Computational Linguistics, pp. 99-105, Toronto, Canada, 2023.

In discourse relation recognition, the classification labels are typically represented as one-hot vectors. However, the categories are in fact not all independent of one another; on the contrary, there are several frameworks that describe the labels’ similarities (e.g., by sorting them into a hierarchy or describing them in terms of features (Sanders et al., 2021)). Recently, several methods for representing the similarities between labels have been proposed (Zhang et al., 2018; Wang et al., 2018; Xiong et al., 2021). We here explore and extend the Label Confusion Model (Guo et al., 2021) for learning a representation for discourse relation labels. We explore alternative ways of informing the model about the similarities between relations, by representing relations in terms of their names (and parent category), their typical markers, or in terms of CCR features that describe the relations. Experimental results show that exploiting label similarity improves classification results.

@inproceedings{varghese-etal-2023-exploiting,
title = {Exploiting Knowledge about Discourse Relations for Implicit Discourse Relation Classification},
author = {Nobel Varghese and Frances Pik Yu Yung and Kaveri Anuranjana and Vera Demberg},
editor = {Michael Strube and Chloe Braud and Christian Hardmeier and Junyi Jessy Li and Sharid Loaiciga and Amir Zeldes},
url = {https://doi.org/10.18653/v1/2023.codi-1.13},
doi = {10.18653/v1/2023.codi-1.13},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)},
pages = {99-105},
publisher = {Association for Computational Linguistics},
address = {Toronto, Canada},
abstract = {In discourse relation recognition, the classification labels are typically represented as one-hot vectors. However, the categories are in fact not all independent of one another; on the contrary, there are several frameworks that describe the labels' similarities (e.g., by sorting them into a hierarchy or describing them in terms of features (Sanders et al., 2021)). Recently, several methods for representing the similarities between labels have been proposed (Zhang et al., 2018; Wang et al., 2018; Xiong et al., 2021). We here explore and extend the Label Confusion Model (Guo et al., 2021) for learning a representation for discourse relation labels. We explore alternative ways of informing the model about the similarities between relations, by representing relations in terms of their names (and parent category), their typical markers, or in terms of CCR features that describe the relations. Experimental results show that exploiting label similarity improves classification results.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Lin, Pin-Jie; Saeed, Muhammed; Chang, Ernie; Scholman, Merel

Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin Inproceedings

Proceedings of the 24th INTERSPEECH conference, 2023.

Developing effective spoken language processing systems for low-resource languages poses several challenges due to the lack of parallel data and limited resources for fine-tuning models. In this work, we aim to improve both text classification and translation of Nigerian Pidgin (Naija) by collecting a large-scale parallel English-Pidgin corpus, and we further propose a framework of cross-lingual adaptive training that includes both continual and task-adaptive training so as to adapt a base pre-trained model to low-resource languages. Our studies show that English pre-trained language models serve as a stronger prior than multilingual language models on English-Pidgin tasks, with up to 2.38 BLEU improvements; we also demonstrate that augmenting orthographic data and using task-adaptive training with back-translation can have a significant impact on model performance.

@inproceedings{lin-et-al-2023,
title = {Low-Resource Cross-Lingual Adaptive Training for Nigerian Pidgin},
author = {Pin-Jie Lin and Muhammed Saeed and Ernie Chang and Merel Scholman},
url = {https://arxiv.org/abs/2307.00382},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 24th INTERSPEECH conference},
abstract = {Developing effective spoken language processing systems for low-resource languages poses several challenges due to the lack of parallel data and limited resources for fine-tuning models. In this work, we aim to improve both text classification and translation of Nigerian Pidgin (Naija) by collecting a large-scale parallel English-Pidgin corpus, and we further propose a framework of cross-lingual adaptive training that includes both continual and task-adaptive training so as to adapt a base pre-trained model to low-resource languages. Our studies show that English pre-trained language models serve as a stronger prior than multilingual language models on English-Pidgin tasks, with up to 2.38 BLEU improvements; we also demonstrate that augmenting orthographic data and using task-adaptive training with back-translation can have a significant impact on model performance.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Hoek, Jet; Scholman, Merel

Expressing non-volitional causality in English Book Chapter

Jędrzejowski, Łukasz; Fleczoreck, Constanze (Ed.): Micro- and Macro-variation of Causal Clauses: Synchronic and Diachronic Insights, John Benjamins Publishing Company, pp. 167–183, Amsterdam, 2023.

English because is assumed to be polysemous in that it can be used to mark causal relations in all domains. The current study examines this claim and explores the suitability of because to mark non-volitional content relations. In a parallel corpus study, we investigate how causal relations translated into Dutch using doordat (prototypically marking non-volitional causal relations), omdat (marking content relations), and want (marking epistemic and speech act relations) were originally expressed in English. The results show that while omdat and want are indeed typically translations of because in English, non-volitional doordat is not. A qualitative analysis reveals that non-volitional causality is more often expressed in English in a single discourse unit or using a connective restricted to the content domain. These findings have important consequences for the presumed domain generality of English because and call for a reconsideration of English translation recommendations for doordat.

@inbook{hoek-scholman-2023,
title = {Expressing non-volitional causality in English},
author = {Jet Hoek and Merel Scholman},
editor = {Łukasz Jędrzejowski and Constanze Fleczoreck},
url = {https://benjamins.com/catalog/slcs.231.06hoe},
year = {2023},
date = {2023},
booktitle = {Micro- and Macro-variation of Causal Clauses: Synchronic and Diachronic Insights},
pages = {167–183},
publisher = {John Benjamins Publishing Company},
address = {Amsterdam},
abstract = {

English because is assumed to be polysemous in that it can be used to mark causal relations in all domains. The current study examines this claim and explores the suitability of because to mark non-volitional content relations. In a parallel corpus study, we investigate how causal relations translated into Dutch using doordat (prototypically marking non-volitional causal relations), omdat (marking content relations), and want (marking epistemic and speech act relations) were originally expressed in English. The results show that while omdat and want are indeed typically translations of because in English, non-volitional doordat is not. A qualitative analysis reveals that non-volitional causality is more often expressed in English in a single discourse unit or using a connective restricted to the content domain. These findings have important consequences for the presumed domain generality of English because and call for a reconsideration of English translation recommendations for doordat.
},
pubstate = {published},
type = {inbook}
}

Project:   B2

Marchal, Marian; Scholman, Merel; Demberg, Vera

How Statistical Correlations Influence Discourse-Level Processing: Clause Type as a Cue for Discourse Relations Journal Article

Journal of Experimental Psychology: Learning, Memory, and Cognition, Advance online publication, 2023.

Linguistic phenomena (e.g., words and syntactic structure) co-occur with a wide variety of meanings. These systematic correlations can help readers to interpret a text and create predictions about upcoming material. However, to what extent these correlations influence discourse processing is still unknown. We address this question by examining whether clause type serves as a cue for discourse relations. We found that the co-occurrence of gerund-free adjuncts and specific discourse relations found in natural language is also reflected in readers’ offline expectations for discourse relations. However, we also found that clause structure did not facilitate the online processing of these discourse relations, nor that readers have a preference for these relations in a paraphrase selection task. The present research extends previous research on discourse relation processing, which mostly focused on lexical cues, by examining the role of non-semantic cues. We show that readers are aware of correlations between clause structure and discourse relations in natural language, but that, unlike what has been found for lexical cues, this information does not seem to influence online processing and discourse interpretation.

@article{marchal-etal-2023,
title = {How Statistical Correlations Influence Discourse-Level Processing: Clause Type as a Cue for Discourse Relations},
author = {Marian Marchal and Merel Scholman and Vera Demberg},
url = {https://doi.org/10.1037/xlm0001270},
year = {2023},
date = {2023},
journal = {Journal of Experimental Psychology: Learning, Memory, and Cognition},
note = {Advance online publication},
abstract = {

Linguistic phenomena (e.g., words and syntactic structure) co-occur with a wide variety of meanings. These systematic correlations can help readers to interpret a text and create predictions about upcoming material. However, to what extent these correlations influence discourse processing is still unknown. We address this question by examining whether clause type serves as a cue for discourse relations. We found that the co-occurrence of gerund-free adjuncts and specific discourse relations found in natural language is also reflected in readers’ offline expectations for discourse relations. However, we also found that clause structure did not facilitate the online processing of these discourse relations, nor that readers have a preference for these relations in a paraphrase selection task. The present research extends previous research on discourse relation processing, which mostly focused on lexical cues, by examining the role of non-semantic cues. We show that readers are aware of correlations between clause structure and discourse relations in natural language, but that, unlike what has been found for lexical cues, this information does not seem to influence online processing and discourse interpretation.
},
pubstate = {published},
type = {article}
}

Project:   B2

Yung, Frances Pik Yu; Scholman, Merel; Lapshinova-Koltunski, Ekaterina; Pollkläsener, Christina; Demberg, Vera

Investigating Explicitation of Discourse Connectives in Translation Using Automatic Annotations Inproceedings

Stoyanchev, Svetlana; Joty, Shafiq; Schlangen, David; Dusek, Ondrej; Kennington, Casey; Alikhani, Malihe (Ed.): Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), Association for Computational Linguistics, pp. 21-30, Prague, Czechia, 2023.

Discourse relations have different patterns of marking across different languages. As a result, discourse connectives are often added, omitted, or rephrased in translation. Prior work has shown a tendency for explicitation of discourse connectives, but such work was conducted using restricted sample sizes due to difficulty of connective identification and alignment. The current study exploits automatic methods to facilitate a large-scale study of connectives in English and German parallel texts. Our results based on over 300 types and 18000 instances of aligned connectives and an empirical approach to compare the cross-lingual specificity gap provide strong evidence of the Explicitation Hypothesis. We conclude that discourse relations are indeed more explicit in translation than texts written originally in the same language. Automatic annotations allow us to carry out translation studies of discourse relations on a large scale. Our methodology using relative entropy to study the specificity of connectives also provides more fine-grained insights into translation patterns.

@inproceedings{yung-etal-2023-investigating,
title = {Investigating Explicitation of Discourse Connectives in Translation Using Automatic Annotations},
author = {Frances Pik Yu Yung and Merel Scholman and Ekaterina Lapshinova-Koltunski and Christina Pollkl{\"a}sener and Vera Demberg},
editor = {Svetlana Stoyanchev and Shafiq Joty and David Schlangen and Ondrej Dusek and Casey Kennington and Malihe Alikhani},
url = {https://aclanthology.org/2023.sigdial-1.2},
doi = {10.18653/v1/2023.sigdial-1.2},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)},
pages = {21-30},
publisher = {Association for Computational Linguistics},
address = {Prague, Czechia},
abstract = {Discourse relations have different patterns of marking across different languages. As a result, discourse connectives are often added, omitted, or rephrased in translation. Prior work has shown a tendency for explicitation of discourse connectives, but such work was conducted using restricted sample sizes due to difficulty of connective identification and alignment. The current study exploits automatic methods to facilitate a large-scale study of connectives in English and German parallel texts. Our results based on over 300 types and 18000 instances of aligned connectives and an empirical approach to compare the cross-lingual specificity gap provide strong evidence of the Explicitation Hypothesis. We conclude that discourse relations are indeed more explicit in translation than texts written originally in the same language. Automatic annotations allow us to carry out translation studies of discourse relations on a large scale. Our methodology using relative entropy to study the specificity of connectives also provides more fine-grained insights into translation patterns.},
pubstate = {published},
type = {inproceedings}
}

Projects:   B2 B7

Pyatkin, Valentina; Yung, Frances Pik Yu; Scholman, Merel; Tsarfaty, Reut; Dagan, Ido; Demberg, Vera

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases introduced by Task Design Journal Article

Transactions of the Association for Computational Linguistics (TACL), 2023.

Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks. Here, we propose to analyze another source of bias: task design bias, which has a particularly strong impact on crowdsourced linguistic annotations where natural language is used to elicit the interpretation of laymen annotators. For this purpose we look at implicit discourse relation annotation, a task that has repeatedly been shown to be difficult due to the relations’ ambiguity. We compare the annotations of 1,200 discourse relations obtained using two distinct annotation tasks and quantify the biases of both methods across four different domains. Both methods are natural language annotation tasks designed for crowdsourcing. We show that the task design can push annotators towards certain relations and that some discourse relations senses can be better elicited with one or the other annotation approach. We also conclude that this type of bias should be taken into account when training and testing models.

@article{Pyatkinetal,
title = {Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases introduced by Task Design},
author = {Valentina Pyatkin and Frances Pik Yu Yung and Merel Scholman and Reut Tsarfaty and Ido Dagan and Vera Demberg},
url = {https://arxiv.org/abs/2304.00815},
year = {2023},
date = {2023},
journal = {Transactions of the Association for Computational Linguistics (TACL)},
abstract = {Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks. Here, we propose to analyze another source of bias: task design bias, which has a particularly strong impact on crowdsourced linguistic annotations where natural language is used to elicit the interpretation of laymen annotators. For this purpose we look at implicit discourse relation annotation, a task that has repeatedly been shown to be difficult due to the relations' ambiguity. We compare the annotations of 1,200 discourse relations obtained using two distinct annotation tasks and quantify the biases of both methods across four different domains. Both methods are natural language annotation tasks designed for crowdsourcing. We show that the task design can push annotators towards certain relations and that some discourse relations senses can be better elicited with one or the other annotation approach. We also conclude that this type of bias should be taken into account when training and testing models.},
pubstate = {published},
type = {article}
}

Project:   B2

Scholman, Merel; Blything, Liam; Cain, Kate; Evers-Vermeul, Jacqueline

Discourse Rules: The Effects of Clause Order Principles on the Reading Process Journal Article

Language, Cognition and Neuroscience, 37(10), pp. 1277-1291, 2022, ISSN 2327-3798.

In an eye-tracking-while-reading study, we investigated adult monolinguals’ (N=80) processing of two-clause sentences embedded in short narratives. Three principles theorized to guide comprehension of complex sentences were contrasted: one operating at the clause level, namely clause structure (main clause – subordinate clause or vice versa), and two operating at the discourse-level, namely givenness (given-new vs. new-given) and event order (chronological vs. reverse order). The results indicate that clause structure mainly affects early stages of processing, whereas the two principles operating at the discourse level are more important during later stages and for reading times of the entire sentence. Event order was found to operate relatively independently of the other principles. Givenness was found to overrule clause structure, a phenomenon that can be related to the grounding function of preposed subordinate clauses. We propose a new principle to reflect this interaction effect: the grounding principle.

@article{Merel_Rules_2022,
title = {Discourse Rules: The Effects of Clause Order Principles on the Reading Process},
author = {Merel Scholman and Liam Blything and Kate Cain and Jacqueline Evers-Vermeul},
url = {https://www.tandfonline.com/doi/full/10.1080/23273798.2022.2077971},
doi = {10.1080/23273798.2022.2077971},
year = {2022},
date = {2022},
journal = {Language, Cognition and Neuroscience},
pages = {1277-1291},
volume = {37},
number = {10},
abstract = {In an eye-tracking-while-reading study, we investigated adult monolinguals’ (N=80) processing of two-clause sentences embedded in short narratives. Three principles theorized to guide comprehension of complex sentences were contrasted: one operating at the clause level, namely clause structure (main clause - subordinate clause or vice versa), and two operating at the discourse-level, namely givenness (given-new vs. new-given) and event order (chronological vs. reverse order). The results indicate that clause structure mainly affects early stages of processing, whereas the two principles operating at the discourse level are more important during later stages and for reading times of the entire sentence. Event order was found to operate relatively independently of the other principles. Givenness was found to overrule clause structure, a phenomenon that can be related to the grounding function of preposed subordinate clauses. We propose a new principle to reflect this interaction effect: the grounding principle.},
pubstate = {published},
type = {article}
}

Project:   B2

Mayn, Alexandra; Demberg, Vera

Pragmatics of Metaphor Revisited: Modeling the Role of Degree and Salience in Metaphor Understanding Inproceedings

Proceedings of the Annual Meeting of the Cognitive Science Society, 43(43), CogSci2022, pp. 3178ff., 2022.

Experimental pragmatics tells us that a metaphor conveys salient features of a vehicle and that highly typical features tend to be salient. But can highly atypical features also be salient? When asking if John is loyal and hearing “John is a fox”, will the hearer conclude that John is disloyal because loyalty is saliently atypical for a fox? This prediction follows from our RSA-based model of metaphor understanding, which relies on gradient salience. Our behavioral experiments corroborate the model’s predictions, providing evidence that high and low typicality are salient and result in high interpretation confidence and agreement, while average typicality is not salient and makes a metaphor confusing. Our model implements the idea that other features of a vehicle, along with possible alternative vehicles, influence metaphor interpretation. It produces a significantly better fit compared to an existing RSA model of metaphor understanding, supporting our predictions about the factors at play.

@inproceedings{Mayn_2022_of,
title = {Pragmatics of Metaphor Revisited: Modeling the Role of Degree and Salience in Metaphor Understanding},
author = {Alexandra Mayn and Vera Demberg},
url = {https://escholarship.org/uc/item/7kq207zs},
year = {2022},
date = {2022},
booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society, 43(43)},
pages = {3178ff.},
publisher = {CogSci2022},
abstract = {Experimental pragmatics tells us that a metaphor conveys salient features of a vehicle and that highly typical features tend to be salient. But can highly atypical features also be salient? When asking if John is loyal and hearing “John is a fox”, will the hearer conclude that John is disloyal because loyalty is saliently atypical for a fox? This prediction follows from our RSA-based model of metaphor understanding, which relies on gradient salience. Our behavioral experiments corroborate the model’s predictions, providing evidence that high and low typicality are salient and result in high interpretation confidence and agreement, while average typicality is not salient and makes a metaphor confusing. Our model implements the idea that other features of a vehicle, along with possible alternative vehicles, influence metaphor interpretation. It produces a significantly better fit compared to an existing RSA model of metaphor understanding, supporting our predictions about the factors at play.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Scholman, Merel; Pyatkin, Valentina; Yung, Frances Pik Yu; Dagan, Ido; Tsarfaty, Reut; Demberg, Vera

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training Inproceedings

Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France, European Language Resources Association, pp. 2148–2156, 2022.

Obtaining linguistic annotation from novice crowdworkers is far from trivial. A case in point is the annotation of discourse relations, which is a complicated task. Recent methods have obtained promising results by extracting relation labels from either discourse connectives (DCs) or question-answer (QA) pairs that participants provide. The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method. In Study 1, workers were not specifically selected or trained, and the results show that there is much room for improvement. Study 2 shows that a combination of selection and training does lead to improved results, but the method is cost- and time-intensive. Study 3 shows that a selection-only approach is a viable alternative; it results in annotations of comparable quality compared to annotations from trained participants. The results generalized over both the DC and QA method and therefore indicate that a selection-only approach could also be effective for other crowdsourced discourse annotation tasks.

@inproceedings{Scholmanet-al22-3,
title = {Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training},
author = {Merel Scholman and Valentina Pyatkin and Frances Pik Yu Yung and Ido Dagan and Reut Tsarfaty and Vera Demberg},
url = {https://aclanthology.org/2022.lrec-1.231/},
year = {2022},
date = {2022},
booktitle = {Proceedings of the Thirteenth Language Resources and Evaluation Conference},
address = {Marseille, France},
pages = {2148--2156},
publisher = {European Language Resources Association},
abstract = {Obtaining linguistic annotation from novice crowdworkers is far from trivial. A case in point is the annotation of discourse relations, which is a complicated task. Recent methods have obtained promising results by extracting relation labels from either discourse connectives (DCs) or question-answer (QA) pairs that participants provide. The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method. In Study 1, workers were not specifically selected or trained, and the results show that there is much room for improvement. Study 2 shows that a combination of selection and training does lead to improved results, but the method is cost- and time-intensive. Study 3 shows that a selection-only approach is a viable alternative; it results in annotations of comparable quality compared to annotations from trained participants. The results generalized over both the DC and QA method and therefore indicate that a selection-only approach could also be effective for other crowdsourced discourse annotation tasks.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Scholman, Merel; Dong, Tianai; Yung, Frances Pik Yu; Demberg, Vera

DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations Inproceedings

Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC 22), Marseille, France, pp. 3281–3290, 2022.

We present DiscoGeM, a crowdsourced corpus of 6,505 implicit discourse relations from three genres: political speech, literature, and encyclopedic texts. Each instance was annotated by 10 crowd workers. Various label aggregation methods were explored to evaluate how to obtain a label that best captures the meaning inferred by the crowd annotators. The results show that a significant proportion of discourse relations in DiscoGeM are ambiguous and can express multiple relation senses. Probability distribution labels better capture these interpretations than single labels. Further, the results emphasize that text genre crucially affects the distribution of discourse relations, suggesting that genre should be included as a factor in automatic relation classification. We make available the newly created DiscoGeM corpus, as well as the dataset with all annotator-level labels. Both the corpus and the dataset can facilitate a multitude of applications and research purposes, for example to function as training data to improve the performance of automatic discourse relation parsers, as well as facilitate research into non-connective signals of discourse relations.
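The aggregation step described above can be illustrated with a minimal sketch (hypothetical worker labels, not taken from the corpus): one instance's ten crowd labels are collapsed either into a single majority-vote label or into a probability distribution over relation senses.

```python
from collections import Counter

def aggregate(labels):
    """Collapse one instance's crowd labels into (a) a single
    majority-vote label and (b) a probability distribution over senses."""
    counts = Counter(labels)
    majority = counts.most_common(1)[0][0]
    distribution = {sense: n / len(labels) for sense, n in counts.items()}
    return majority, distribution

# Hypothetical annotations from 10 workers for one implicit relation
votes = ["cause"] * 5 + ["conjunction"] * 3 + ["concession"] * 2
majority, dist = aggregate(votes)
print(majority)             # cause
print(dist["conjunction"])  # 0.3
```

The distribution keeps the minority readings ("conjunction", "concession") that the majority label discards, which is exactly the information the corpus argues is worth preserving for ambiguous relations.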

@inproceedings{Scholman_et-al22.2,
title = {DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations},
author = {Merel Scholman and Tianai Dong and Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/2022.lrec-1.351/},
year = {2022},
date = {2022},
booktitle = {Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC 22)},
address = {Marseille, France},
pages = {3281--3290},
abstract = {We present DiscoGeM, a crowdsourced corpus of 6,505 implicit discourse relations from three genres: political speech, literature, and encyclopedic texts. Each instance was annotated by 10 crowd workers. Various label aggregation methods were explored to evaluate how to obtain a label that best captures the meaning inferred by the crowd annotators. The results show that a significant proportion of discourse relations in DiscoGeM are ambiguous and can express multiple relation senses. Probability distribution labels better capture these interpretations than single labels. Further, the results emphasize that text genre crucially affects the distribution of discourse relations, suggesting that genre should be included as a factor in automatic relation classification. We make available the newly created DiscoGeM corpus, as well as the dataset with all annotator-level labels. Both the corpus and the dataset can facilitate a multitude of applications and research purposes, for example to function as training data to improve the performance of automatic discourse relation parsers, as well as facilitate research into non-connective signals of discourse relations.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Scholman, Merel; Demberg, Vera; Sanders, Ted J. M.

Descriptively adequate and cognitively plausible? Validating distinctions between types of coherence relations Journal Article

Discours, 30, pp. 1-30a, 2022.

A central issue in linguistics concerns the relationship between theories and evidence in data. We investigate this issue in the field of discourse coherence, and particularly the study of coherence relations such as causal and contrastive. Proposed inventories of coherence relations differ greatly in the type and number of proposed relations. Such proposals are often validated by focusing on either the descriptive adequacy (researcher’s intuitions on textual interpretations) or the cognitive plausibility of distinctions (empirical research on cognition). We argue that both are important, and note that the concept of cognitive plausibility is in need of a concrete definition and quantifiable operationalization. This contribution focuses on how the criterion of cognitive plausibility can be operationalized and presents a systematic validation approach to evaluate discourse frameworks. This is done by detailing how various sources of evidence can be used to support or falsify distinctions between coherence relational labels. Finally, we present methodological issues regarding verification and falsification that are of importance to all discourse researchers studying the relationship between theory and data.

@article{Scholman_etal22,
title = {Descriptively adequate and cognitively plausible? Validating distinctions between types of coherence relations},
author = {Merel Scholman and Vera Demberg and Ted J. M. Sanders},
url = {https://journals.openedition.org/discours/12075},
year = {2022},
date = {2022},
journal = {Discours},
pages = {1--30a},
volume = {30},
abstract = {A central issue in linguistics concerns the relationship between theories and evidence in data. We investigate this issue in the field of discourse coherence, and particularly the study of coherence relations such as causal and contrastive. Proposed inventories of coherence relations differ greatly in the type and number of proposed relations. Such proposals are often validated by focusing on either the descriptive adequacy (researcher’s intuitions on textual interpretations) or the cognitive plausibility of distinctions (empirical research on cognition). We argue that both are important, and note that the concept of cognitive plausibility is in need of a concrete definition and quantifiable operationalization. This contribution focuses on how the criterion of cognitive plausibility can be operationalized and presents a systematic validation approach to evaluate discourse frameworks. This is done by detailing how various sources of evidence can be used to support or falsify distinctions between coherence relational labels. Finally, we present methodological issues regarding verification and falsification that are of importance to all discourse researchers studying the relationship between theory and data.},
pubstate = {published},
type = {article}
}


Project:   B2

Marchal, Marian; Scholman, Merel; Yung, Frances Pik Yu; Demberg, Vera

Establishing annotation quality in multi-label annotations Inproceedings

Proceedings of the 29th International Conference on Computational Linguistics (COLING), pp. 3659–3668, 2022.

In many linguistic fields requiring annotated data, multiple interpretations of a single item are possible. Multi-label annotations more accurately reflect this possibility. However, allowing for multi-label annotations also affects the chance that two coders agree with each other. Calculating inter-coder agreement for multi-label datasets is therefore not trivial. In the current contribution, we evaluate different metrics for calculating agreement on multi-label annotations: agreement on the intersection of annotated labels, an augmented version of Cohen’s Kappa, and precision, recall and F1. We propose a bootstrapping method to obtain chance agreement for each measure, which allows us to obtain an adjusted agreement coefficient that is more interpretable. We demonstrate how various measures affect estimates of agreement on simulated datasets and present a case study of discourse relation annotations. We also show how the proportion of double labels, and the entropy of the label distribution, influences the measures outlined above and how a bootstrapped adjusted agreement can make agreement measures more comparable across datasets in multi-label scenarios.
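As an illustration of the general idea (not the authors' implementation), agreement on multi-label annotations can be measured as the mean intersection-over-union of two coders' label sets, with chance agreement estimated by bootstrapping random pairings and an adjusted coefficient computed in the familiar (observed − chance) / (1 − chance) form. All annotations below are hypothetical.

```python
import random

def jaccard(a, b):
    """Agreement on one item: size of the label intersection over the union."""
    return len(a & b) / len(a | b)

def observed_agreement(coder1, coder2):
    """Mean per-item agreement between two coders' label sets."""
    return sum(jaccard(a, b) for a, b in zip(coder1, coder2)) / len(coder1)

def chance_agreement(coder1, coder2, n_boot=1000, seed=0):
    """Bootstrap estimate of chance agreement: repeatedly pair coder1's
    items with label sets resampled at random from coder2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_boot):
        resampled = [rng.choice(coder2) for _ in coder1]
        total += observed_agreement(coder1, resampled)
    return total / n_boot

def adjusted_agreement(coder1, coder2):
    """Chance-corrected coefficient: (observed - chance) / (1 - chance)."""
    obs = observed_agreement(coder1, coder2)
    chance = chance_agreement(coder1, coder2)
    return (obs - chance) / (1 - chance)

# Hypothetical multi-label annotations (sets of senses) for four items
c1 = [{"cause"}, {"cause", "result"}, {"contrast"}, {"conjunction"}]
c2 = [{"cause"}, {"result"}, {"contrast", "concession"}, {"conjunction"}]
print(round(observed_agreement(c1, c2), 3))  # 0.75
```

Because the bootstrap adapts to each dataset's label distribution, the adjusted coefficient stays comparable across datasets with different proportions of double labels, which is the point the paper makes about raw agreement scores.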

@inproceedings{Marchaletal22-2,
title = {Establishing annotation quality in multi-label annotations},
author = {Marian Marchal and Merel Scholman and Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/2022.coling-1.322/},
year = {2022},
date = {2022},
booktitle = {Proceedings of the 29th International Conference on Computational Linguistics (COLING)},
pages = {3659--3668},
abstract = {In many linguistic fields requiring annotated data, multiple interpretations of a single item are possible. Multi-label annotations more accurately reflect this possibility. However, allowing for multi-label annotations also affects the chance that two coders agree with each other. Calculating inter-coder agreement for multi-label datasets is therefore not trivial. In the current contribution, we evaluate different metrics for calculating agreement on multi-label annotations: agreement on the intersection of annotated labels, an augmented version of Cohen’s Kappa, and precision, recall and F1. We propose a bootstrapping method to obtain chance agreement for each measure, which allows us to obtain an adjusted agreement coefficient that is more interpretable. We demonstrate how various measures affect estimates of agreement on simulated datasets and present a case study of discourse relation annotations. We also show how the proportion of double labels, and the entropy of the label distribution, influences the measures outlined above and how a bootstrapped adjusted agreement can make agreement measures more comparable across datasets in multi-label scenarios.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Marchal, Marian; Scholman, Merel; Demberg, Vera

The effect of domain knowledge on discourse relation inferences: Relation marking and interpretation strategies Journal Article

Dialogue & Discourse, 13 (2), pp. 49–78, 2022.

It is generally assumed that readers draw on their background knowledge to make inferences about information that is left implicit in the text. However, readers may differ in how much background knowledge they have, which may impact their text understanding. The present study investigates the role of domain knowledge in discourse relation interpretation, in order to examine how readers with high vs. low domain knowledge differ in their discourse relation inferences. We compare interpretations of experts from the field of economics and biomedical sciences in scientific biomedical texts as well as more easily accessible economic texts. The results show that high-knowledge readers from the biomedical domain are better at inferring the correct relation interpretation in biomedical texts compared to low-knowledge readers, but such an effect was not found for the economic domain. The results also suggest that, in the absence of domain knowledge, readers exploit linguistic signals other than connectives to infer the discourse relation, but domain knowledge is sometimes required to exploit these cues. The study provides insight into the impact of domain knowledge on discourse relation inferencing and how readers interpret discourse relations when they lack the required domain knowledge.

@article{Marchaletal22,
title = {The effect of domain knowledge on discourse relation inferences: Relation marking and interpretation strategies},
author = {Marian Marchal and Merel Scholman and Vera Demberg},
url = {https://journals.uic.edu/ojs/index.php/dad/article/view/12343/10711},
year = {2022},
date = {2022},
journal = {Dialogue \& Discourse},
pages = {49--78},
volume = {13},
number = {2},
abstract = {It is generally assumed that readers draw on their background knowledge to make inferences about information that is left implicit in the text. However, readers may differ in how much background knowledge they have, which may impact their text understanding. The present study investigates the role of domain knowledge in discourse relation interpretation, in order to examine how readers with high vs. low domain knowledge differ in their discourse relation inferences. We compare interpretations of experts from the field of economics and biomedical sciences in scientific biomedical texts as well as more easily accessible economic texts. The results show that high-knowledge readers from the biomedical domain are better at inferring the correct relation interpretation in biomedical texts compared to low-knowledge readers, but such an effect was not found for the economic domain. The results also suggest that, in the absence of domain knowledge, readers exploit linguistic signals other than connectives to infer the discourse relation, but domain knowledge is sometimes required to exploit these cues. The study provides insight into the impact of domain knowledge on discourse relation inferencing and how readers interpret discourse relations when they lack the required domain knowledge.},
pubstate = {published},
type = {article}
}


Project:   B2

Yung, Frances Pik Yu; Anuranjana, Kaveri; Scholman, Merel; Demberg, Vera

Label distributions help implicit discourse relation classification Inproceedings

Proceedings of the 3rd Workshop on Computational Approaches to Discourse (October 2022, Gyeongju, Republic of Korea and Online), International Conference on Computational Linguistics, pp. 48–53, 2022.

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses. Recently, DiscoGeM, a novel multi-domain corpus, which contains 10 crowd-sourced labels per relational instance, has become available. In this paper, we analyse the co-occurrences of relations in DiscoGem and show that they are systematic and characteristic of text genre. We then test whether information on multi-label distributions in the data can help implicit relation classifiers. Our results show that incorporating multiple labels in parser training can improve its performance, and yield label distributions which are more similar to human label distributions, compared to a parser that is trained on just a single most frequent label per instance.
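The contrast between single-label and distribution-aware training can be sketched with the cross-entropy loss: a one-hot target only rewards the single most frequent label, whereas a soft target spreads probability mass the way the crowd did. The distributions below are hypothetical.

```python
import math

def cross_entropy(target_dist, predicted_dist):
    """Cross-entropy of a predicted distribution against a (possibly soft)
    target distribution over relation senses."""
    return -sum(p * math.log(predicted_dist[sense])
                for sense, p in target_dist.items() if p > 0)

# A classifier's predicted distribution for one instance (hypothetical)
predicted = {"cause": 0.6, "conjunction": 0.3, "concession": 0.1}

# One-hot target: only the single most frequent crowd label counts
hard = {"cause": 1.0}
# Soft target: the full crowd label distribution
soft = {"cause": 0.5, "conjunction": 0.3, "concession": 0.2}

print(round(cross_entropy(hard, predicted), 3))  # 0.511
print(round(cross_entropy(soft, predicted), 3))  # 1.077
```

Under the soft target the loss still penalizes probability mass placed on senses the crowd never chose, but it no longer treats the minority readings as errors, which is how training can pull the model's output distribution toward the human one.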

@inproceedings{Yungetal2022,
title = {Label distributions help implicit discourse relation classification},
author = {Frances Pik Yu Yung and Kaveri Anuranjana and Merel Scholman and Vera Demberg},
url = {https://aclanthology.org/2022.codi-1.7},
year = {2022},
date = {2022},
booktitle = {Proceedings of the 3rd Workshop on Computational Approaches to Discourse (October 2022, Gyeongju, Republic of Korea and Online)},
pages = {48--53},
publisher = {International Conference on Computational Linguistics},
abstract = {Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses. Recently, DiscoGeM, a novel multi-domain corpus, which contains 10 crowd-sourced labels per relational instance, has become available. In this paper, we analyse the co-occurrences of relations in DiscoGem and show that they are systematic and characteristic of text genre. We then test whether information on multi-label distributions in the data can help implicit relation classifiers. Our results show that incorporating multiple labels in parser training can improve its performance, and yield label distributions which are more similar to human label distributions, compared to a parser that is trained on just a single most frequent label per instance.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Shi, Wei; Demberg, Vera

Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain Inproceedings

Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), 2021.

Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s). We here tackle the task of implicit discourse relation classification on the biomedical domain, for which the Biomedical Discourse Relation Bank (BioDRB; Prasad et al., 2011) is available. We show that entity information can be used to improve discourse relational argument representation. In a first step, we show that explicitly marked instances that are content-wise similar to the target relations can be used to achieve good performance in the cross-domain setting using a simple unsupervised voting pipeline. As a further step, we show that with the linked entity information from the first step, a transformer which is augmented with entity-related information (KBERT; Liu et al., 2020) sets the new state of the art performance on the dataset, outperforming the large pre-trained BioBERT (Lee et al., 2020) model by 2% points.
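A minimal sketch of similarity-based voting in this spirit (hypothetical embeddings and labels; the paper's actual pipeline retrieves explicitly marked instances from a corpus): the k explicit instances most similar to the target argument pair vote on its relation label.

```python
from collections import Counter

def vote_label(target_vec, marked, k=3):
    """Classify an implicit relation by letting the k most similar
    explicitly marked instances vote on the label (cosine similarity)."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = lambda x: sum(a * a for a in x) ** 0.5
        return dot / (norm(u) * norm(v))

    ranked = sorted(marked, key=lambda m: cosine(target_vec, m[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical argument-pair embeddings paired with their explicit labels
marked = [
    ((1.0, 0.1), "cause"),
    ((0.9, 0.2), "cause"),
    ((0.1, 1.0), "contrast"),
    ((0.2, 0.9), "contrast"),
]
print(vote_label((0.95, 0.15), marked))  # cause
```

The appeal of this kind of pipeline is that it needs no labeled implicit data at all: the "supervision" comes entirely from relations that happen to be explicitly signaled in unannotated text.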

@inproceedings{shi2021entity,
title = {Entity Enhancement for Implicit Discourse Relation Classification in the Biomedical Domain},
author = {Wei Shi and Vera Demberg},
url = {https://aclanthology.org/2021.acl-short.116.pdf},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)},
abstract = {Implicit discourse relation classification is a challenging task, in particular when the text domain is different from the standard Penn Discourse Treebank (PDTB; Prasad et al., 2008) training corpus domain (Wall Street Journal in 1990s). We here tackle the task of implicit discourse relation classification on the biomedical domain, for which the Biomedical Discourse Relation Bank (BioDRB; Prasad et al., 2011) is available. We show that entity information can be used to improve discourse relational argument representation. In a first step, we show that explicitly marked instances that are content-wise similar to the target relations can be used to achieve good performance in the cross-domain setting using a simple unsupervised voting pipeline. As a further step, we show that with the linked entity information from the first step, a transformer which is augmented with entity-related information (KBERT; Liu et al., 2020) sets the new state of the art performance on the dataset, outperforming the large pre-trained BioBERT (Lee et al., 2020) model by 2% points.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2
