Bourgonje, Peter; Lin, Pin-Jie

Projecting Annotations for Discourse Relations: Connective Identification for Low-Resource Languages

Strube, Michael; Braud, Chloe; Hardmeier, Christian; Li, Junyi Jessy; Loaiciga, Sharid; Zeldes, Amir; Li, Chuyuan (Eds.): Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024), Association for Computational Linguistics, pp. 39-49, St. Julians, Malta, 2024.

We present a pipeline for multilingual Shallow Discourse Parsing. The pipeline exploits Machine Translation and Word Alignment: it translates any non-English input text into English, applies an English discourse parser, and projects the relations it finds back onto the original input text through word alignments. The pipeline is intended to provide rudimentary discourse relation annotations for low-resource languages; to gauge its performance, we evaluate it on the sub-task of discourse connective identification for several languages for which gold data are available. We experiment with different setups of our modular pipeline architecture and analyze intermediate results. Our code is made available on GitHub.
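
The projection step described above can be illustrated with a minimal sketch. It is not taken from the authors' code: the function name `project_span`, the toy German/English sentence pair, and the hand-written alignment are all hypothetical, and the translation, parsing, and alignment stages are assumed to have already run. The sketch only shows how a connective found in the English translation might be mapped back to source-language tokens through word alignment pairs.

```python
from typing import Iterable, List, Tuple


def project_span(
    english_indices: Iterable[int],
    alignment: Iterable[Tuple[int, int]],
) -> List[int]:
    """Map English token indices to source token indices.

    `alignment` holds (source_index, english_index) pairs, as a word
    aligner might produce them (assumed format). English tokens without
    an alignment link simply project to nothing.
    """
    wanted = set(english_indices)
    return sorted({src for src, en in alignment if en in wanted})


# Hypothetical toy input: a German source sentence and its English translation.
source = ["Er", "blieb", "zu", "Hause", ",", "weil", "es", "regnete", "."]
english = ["He", "stayed", "at", "home", "because", "it", "rained", "."]

# Hand-written alignment for illustration; the German comma is left unaligned.
alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (5, 4), (6, 5), (7, 6), (8, 7)]

# Assume the English parser marked token 4 ("because") as a connective.
connective_en = [4]

projected = project_span(connective_en, alignment)
connective_tokens = [source[i] for i in projected]
print(connective_tokens)
```

In this toy case the English connective "because" is linked to a single source token. With real aligners a connective may project to several, non-contiguous, or no source tokens, so a practical setup would need a policy for those cases.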