Yung, Frances Pik Yu; Anuranjana, Kaveri; Scholman, Merel; Demberg, Vera

Label distributions help implicit discourse relation classification

Proceedings of the 3rd Workshop on Computational Approaches to Discourse (October 2022, Gyeongju, Republic of Korea and Online), International Conference on Computational Linguistics, pp. 48–53, 2022.

Implicit discourse relations can convey more than one relation sense, but much of the research on discourse relations has focused on single relation senses. DiscoGeM, a novel multi-domain corpus containing 10 crowd-sourced labels per relational instance, has recently become available. In this paper, we analyse the co-occurrences of relations in DiscoGeM and show that they are systematic and characteristic of text genre. We then test whether information on multi-label distributions in the data can help implicit relation classifiers. Our results show that incorporating multiple labels in parser training can improve the parser's performance and yield label distributions that are more similar to human label distributions, compared to a parser trained on only the single most frequent label per instance.
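The contrast between training on a label distribution and training on the single most frequent label can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-sense inventory, the example annotations, the cross-entropy training target, and the use of Jensen-Shannon divergence as the similarity measure are not taken from the paper, whose abstract does not specify the loss or the metric.

```python
import math
from collections import Counter

# Hypothetical sense inventory, for illustration only.
SENSES = ["cause", "conjunction", "contrast"]

def label_distribution(annotations, senses=SENSES):
    """Turn crowd-sourced labels for one instance into a probability distribution."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[s] / total for s in senses]

def majority_one_hot(annotations, senses=SENSES):
    """Single-label target: all probability mass on the most frequent label."""
    top = Counter(annotations).most_common(1)[0][0]
    return [1.0 if s == top else 0.0 for s in senses]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of a predicted distribution against a (soft or hard) target."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Made-up example: 10 crowd-sourced labels for one relational instance.
annotations = ["cause"] * 6 + ["conjunction"] * 3 + ["contrast"]
soft = label_distribution(annotations)   # distribution over the three senses
hard = majority_one_hot(annotations)     # one-hot on the majority label

print("soft target:", soft)
print("hard target:", hard)
print("JS divergence between the two targets:", js_divergence(soft, hard))
```

A classifier trained against `soft` is penalised for ignoring the minority senses, whereas one trained against `hard` is rewarded for placing all mass on the majority label; comparing predicted and human distributions with a divergence such as `js_divergence` is one possible way to quantify how closely the two match.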