Publications

Ortmann, Katrin

Automatic Phrase Recognition in Historical German Inproceedings

Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021), KONVENS 2021 Organizers, pp. 127–136, Düsseldorf, Germany, 2021.

Due to a lack of annotated data, theories of historical syntax are often based on very small, manually compiled data sets. To enable the empirical evaluation of existing hypotheses, the present study explores the automatic recognition of phrases in historical German. Using modern and historical treebanks, training data for a neural sequence labeling tool and a probabilistic parser is created, and both methods are compared on a variety of data sets. The evaluation shows that the unlexicalized parser outperforms the sequence labeling approach, achieving F1-scores of 87%–91% on modern German and between 73% and 85% on different historical corpora. An error analysis indicates that accuracy decreases especially for longer phrases, but most of the errors concern incorrect phrase boundaries, suggesting further potential for improvement.
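
To make the evaluation metric concrete, here is a minimal sketch (not the author's code) of how labeled-span F1 over predicted phrases can be computed, assuming gold and predicted phrases are represented as (start, end, label) triples; the example also illustrates the boundary errors highlighted in the error analysis.

# Minimal sketch, not the author's implementation: labeled-span F1
# over phrases represented as (start, end, label) triples.
def phrase_f1(gold, pred):
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                      # exact span + label matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = [(0, 2, "NP"), (3, 5, "PP")]
pred = [(0, 2, "NP"), (3, 6, "PP")]            # PP has a boundary error
print(phrase_f1(gold, pred))                   # 0.5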

@inproceedings{ortmann-2021b,
title = {Automatic Phrase Recognition in Historical German},
author = {Katrin Ortmann},
url = {https://aclanthology.org/2021.konvens-1.11},
year = {2021},
date = {2021-09-06},
booktitle = {Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)},
pages = {127–136},
publisher = {KONVENS 2021 Organizers},
address = {D{\"u}sseldorf, Germany},
abstract = {Due to a lack of annotated data, theories of historical syntax are often based on very small, manually compiled data sets. To enable the empirical evaluation of existing hypotheses, the present study explores the automatic recognition of phrases in historical German. Using modern and historical treebanks, training data for a neural sequence labeling tool and a probabilistic parser is created, and both methods are compared on a variety of data sets. The evaluation shows that the unlexicalized parser outperforms the sequence labeling approach, achieving F1-scores of 87%–91% on modern German and between 73% and 85% on different historical corpora. An error analysis indicates that accuracy decreases especially for longer phrases, but most of the errors concern incorrect phrase boundaries, suggesting further potential for improvement.},
pubstate = {published},
type = {inproceedings}
}

Project:   C6

Mosbach, Marius; Stenger, Irina; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

incom.py 2.0 - Calculating Linguistic Distances and Asymmetries in Auditory Perception of Closely Related Languages Inproceedings

Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), INCOMA Ltd., pp. 968-977, Held Online, 2021.

We present an extended version of a tool developed for calculating linguistic distances and asymmetries in auditory perception of closely related languages. Along with evaluating the metrics available in the initial version of the tool, we introduce word adaptation entropy as an additional metric of linguistic asymmetry. Potential predictors of speech intelligibility are validated with human performance in spoken cognate recognition experiments for Bulgarian and Russian. Special attention is paid to the possibly different contributions of vowels and consonants in oral intercomprehension. Using incom.py 2.0 it is possible to calculate, visualize, and validate three measurement methods of linguistic distances and asymmetries, as well as to carry out regression analyses of speech intelligibility between related languages.
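
As an illustration of the entropy-based view of asymmetry, the following sketch computes Shannon entropy over the distribution of adaptations that listeners produce for one stimulus. The paper's exact definition of word adaptation entropy may differ, and the responses below are invented.

# Illustrative sketch only: Shannon entropy over listeners' adaptations
# of a single stimulus; higher entropy = less agreement among listeners,
# i.e. a harder, more asymmetric correspondence. Responses are invented.
from collections import Counter
from math import log2

def adaptation_entropy(responses):
    counts = Counter(responses)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

# Hypothetical Russian adaptations of one Bulgarian stimulus:
print(adaptation_entropy(["ночь", "ночь", "ноша", "нож"]))  # 1.5 bits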

@inproceedings{mosbach-etal-2021-incom,
title = {incom.py 2.0 - Calculating Linguistic Distances and Asymmetries in Auditory Perception of Closely Related Languages},
author = {Marius Mosbach and Irina Stenger and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://aclanthology.org/2021.ranlp-1.110/},
year = {2021},
date = {2021-09-01},
booktitle = {Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)},
pages = {968-977},
publisher = {INCOMA Ltd.},
address = {Held Online},
abstract = {We present an extended version of a tool developed for calculating linguistic distances and asymmetries in auditory perception of closely related languages. Along with evaluating the metrics available in the initial version of the tool, we introduce word adaptation entropy as an additional metric of linguistic asymmetry. Potential predictors of speech intelligibility are validated with human performance in spoken cognate recognition experiments for Bulgarian and Russian. Special attention is paid to the possibly different contributions of vowels and consonants in oral intercomprehension. Using incom.py 2.0 it is possible to calculate, visualize, and validate three measurement methods of linguistic distances and asymmetries, as well as to carry out regression analyses of speech intelligibility between related languages.},
pubstate = {published},
type = {inproceedings}
}

Projects:   B4 C4

Pylypenko, Daria; Amponsah-Kaakyire, Kwabena; Dutta Chowdhury, Koel; van Genabith, Josef; España-Bonet, Cristina

Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification Inproceedings

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 8596–8611, Online and Punta Cana, Dominican Republic, 2021.

Traditional hand-crafted linguistically-informed features have often been used for distinguishing between translated and original non-translated texts. By contrast, to date, neural architectures without manual feature engineering have been less explored for this task. In this work, we (i) compare the traditional feature-engineering-based approach to the feature-learning-based one and (ii) analyse the neural architectures in order to investigate how well the hand-crafted features explain the variance in the neural models’ predictions. We use pre-trained neural word embeddings, as well as several end-to-end neural architectures in both monolingual and multilingual settings and compare them to feature-engineering-based SVM classifiers. We show that (i) neural architectures outperform other approaches by more than 20 accuracy points, with the BERT-based model performing the best in both the monolingual and multilingual settings; (ii) while many individual hand-crafted translationese features correlate with neural model predictions, feature importance analysis shows that the most important features for neural and classical architectures differ; and (iii) our multilingual experiments provide empirical evidence for translationese universals across languages.
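
For readers unfamiliar with the feature-engineering side of such a comparison, a minimal sketch of an SVM classifier over precomputed hand-crafted feature vectors follows; the feature matrix here is random stand-in data, not the authors' features.

# Minimal sketch of the feature-engineering baseline, assuming feature
# vectors have already been extracted; data is a random stand-in.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.random((200, 30))            # hand-crafted translationese features
y = rng.integers(0, 2, 200)          # 1 = translated, 0 = original
print(cross_val_score(LinearSVC(dual=False), X, y, cv=5).mean())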

@inproceedings{pylypenko-etal-2021-comparing,
title = {Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification},
author = {Daria Pylypenko and Kwabena Amponsah-Kaakyire and Koel Dutta Chowdhury and Josef van Genabith and Cristina Espa{\~n}a-Bonet},
url = {https://aclanthology.org/2021.emnlp-main.676/},
doi = {https://doi.org/10.18653/v1/2021.emnlp-main.676},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
pages = {8596–8611},
publisher = {Association for Computational Linguistics},
address = {Online and Punta Cana, Dominican Republic},
abstract = {Traditional hand-crafted linguistically-informed features have often been used for distinguishing between translated and original non-translated texts. By contrast, to date, neural architectures without manual feature engineering have been less explored for this task. In this work, we (i) compare the traditional feature-engineering-based approach to the feature-learning-based one and (ii) analyse the neural architectures in order to investigate how well the hand-crafted features explain the variance in the neural models’ predictions. We use pre-trained neural word embeddings, as well as several end-to-end neural architectures in both monolingual and multilingual settings and compare them to feature-engineering-based SVM classifiers. We show that (i) neural architectures outperform other approaches by more than 20 accuracy points, with the BERT-based model performing the best in both the monolingual and multilingual settings; (ii) while many individual hand-crafted translationese features correlate with neural model predictions, feature importance analysis shows that the most important features for neural and classical architectures differ; and (iii) our multilingual experiments provide empirical evidence for translationese universals across languages.},
pubstate = {published},
type = {inproceedings}
}

Project:   B6

Dutta Chowdhury, Koel; España-Bonet, Cristina; van Genabith, Josef

Tracing Source Language Interference in Translation with Graph-Isomorphism Measures Inproceedings

Proceedings of Recent Advances in Natural Language Processing (RANLP 2021), pp. 380-390, Online, 2021, ISSN 2603-2813.

Previous research has used linguistic features to show that translations exhibit traces of source language interference and that phylogenetic trees between languages can be reconstructed from the results of translations into the same language. Recent research has shown that instances of translationese (source language interference) can even be detected in embedding spaces, comparing embedding spaces of original language data with embedding spaces resulting from translations into the same language, using a simple Eigenvector-based divergence from isomorphism measure. To date, it remains an open question whether alternative graph-isomorphism measures can produce better results. In this paper, we (i) explore Gromov-Hausdorff distance, (ii) present a novel spectral version of the Eigenvector-based method, and (iii) evaluate all approaches against a broad linguistic typological database (URIEL). We show that language distances resulting from our spectral isomorphism approaches can reproduce genetic trees on a par with previous work without requiring any explicit linguistic information and that the results can be extended to non-Indo-European languages. Finally, we show that the methods are robust under a variety of modeling conditions.
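
To give a flavour of eigenvector-based divergence from isomorphism, here is a simplified sketch: build nearest-neighbour graphs over two embedding spaces and compare the spectra of their graph Laplacians. This follows the general idea only; the measures used in the paper differ in detail, and the embeddings below are random.

# Simplified sketch of an eigenvector-based divergence between two
# embedding spaces: compare Laplacian spectra of their kNN graphs.
import numpy as np

def knn_graph(emb, k=5):
    # symmetric k-nearest-neighbour graph over cosine similarity
    norms = np.linalg.norm(emb, axis=1)
    sim = emb @ emb.T / np.outer(norms, norms)
    np.fill_diagonal(sim, -np.inf)             # exclude self-neighbours
    adj = np.zeros_like(sim)
    rows = np.arange(len(sim))[:, None]
    adj[rows, np.argsort(sim, axis=1)[:, -k:]] = 1.0
    return np.maximum(adj, adj.T)              # symmetrize

def ev_divergence(adj_a, adj_b, k=10):
    # sum of squared differences of the k smallest Laplacian eigenvalues
    eig = lambda a: np.sort(np.linalg.eigvalsh(np.diag(a.sum(axis=1)) - a))
    return float(np.sum((eig(adj_a)[:k] - eig(adj_b)[:k]) ** 2))

rng = np.random.default_rng(0)
print(ev_divergence(knn_graph(rng.normal(size=(50, 16))),
                    knn_graph(rng.normal(size=(50, 16)))))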

@inproceedings{Chowdhury2021tracing,
title = {Tracing Source Language Interference in Translation with Graph-Isomorphism Measures},
author = {Koel Dutta Chowdhury and Cristina Espa{\~n}a-Bonet and Josef van Genabith},
url = {https://aclanthology.org/2021.ranlp-1.43/},
year = {2021},
date = {2021},
booktitle = {Proceedings of Recent Advances in Natural Language Processing (RANLP 2021)},
issn = {2603-2813},
pages = {380-390},
address = {Online},
abstract = {Previous research has used linguistic features to show that translations exhibit traces of source language interference and that phylogenetic trees between languages can be reconstructed from the results of translations into the same language. Recent research has shown that instances of translationese (source language interference) can even be detected in embedding spaces, comparing embedding spaces of original language data with embedding spaces resulting from translations into the same language, using a simple Eigenvector-based divergence from isomorphism measure. To date, it remains an open question whether alternative graph-isomorphism measures can produce better results. In this paper, we (i) explore Gromov-Hausdorff distance, (ii) present a novel spectral version of the Eigenvector-based method, and (iii) evaluate all approaches against a broad linguistic typological database (URIEL). We show that language distances resulting from our spectral isomorphism approaches can reproduce genetic trees on a par with previous work without requiring any explicit linguistic information and that the results can be extended to non-Indo-European languages. Finally, we show that the methods are robust under a variety of modeling conditions.},
pubstate = {published},
type = {inproceedings}
}

Project:   B6

Mosbach, Marius; Andriushchenko, Maksym; Klakow, Dietrich

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines Inproceedings

International Conference on Learning Representations, 2021.

Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks. Despite the strong empirical performance of fine-tuned models, fine-tuning is an unstable process: training the same model with multiple random seeds can result in a large variance of the task performance. Previous literature (Devlin et al., 2019; Lee et al., 2020; Dodge et al., 2020) identified two potential reasons for the observed instability: catastrophic forgetting and small size of the fine-tuning datasets. In this paper, we show that both hypotheses fail to explain the fine-tuning instability. We analyze BERT, RoBERTa, and ALBERT, fine-tuned on commonly used datasets from the GLUE benchmark, and show that the observed instability is caused by optimization difficulties that lead to vanishing gradients. Additionally, we show that the remaining variance of the downstream task performance can be attributed to differences in generalization where fine-tuned models with the same training loss exhibit noticeably different test performance. Based on our analysis, we present a simple but strong baseline that makes fine-tuning BERT-based models significantly more stable than the previously proposed approaches.
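
The "simple but strong baseline" comes down to optimizer and schedule choices. Below is a hedged sketch of the recipe as the paper describes it (small learning rate, ADAM with bias correction, substantially more training iterations), using a generic PyTorch stand-in for the model.

# Sketch of the recommended fine-tuning recipe: a small learning rate
# with bias-corrected ADAM and longer training. The Linear layer is a
# stand-in for a BERT-based classifier.
import torch

def make_optimizer(model, lr=2e-5):
    # torch.optim.AdamW applies ADAM bias correction by default,
    # which the paper identifies as important for stability.
    return torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=0.01)

model = torch.nn.Linear(768, 2)
optimizer = make_optimizer(model)
# Then fine-tune for many epochs (e.g., 20) rather than the usual 2-3,
# and select among random seeds on the development set.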

@inproceedings{mosbach2021on,
title = {On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines},
author = {Marius Mosbach and Maksym Andriushchenko and Dietrich Klakow},
url = {https://arxiv.org/abs/2006.04884},
year = {2021},
date = {2021},
booktitle = {International Conference on Learning Representations},
abstract = {Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks. Despite the strong empirical performance of fine-tuned models, fine-tuning is an unstable process: training the same model with multiple random seeds can result in a large variance of the task performance. Previous literature (Devlin et al., 2019; Lee et al., 2020; Dodge et al., 2020) identified two potential reasons for the observed instability: catastrophic forgetting and small size of the fine-tuning datasets. In this paper, we show that both hypotheses fail to explain the fine-tuning instability. We analyze BERT, RoBERTa, and ALBERT, fine-tuned on commonly used datasets from the GLUE benchmark, and show that the observed instability is caused by optimization difficulties that lead to vanishing gradients. Additionally, we show that the remaining variance of the downstream task performance can be attributed to differences in generalization where fine-tuned models with the same training loss exhibit noticeably different test performance. Based on our analysis, we present a simple but strong baseline that makes fine-tuning BERT-based models significantly more stable than the previously proposed approaches.},
pubstate = {published},
type = {inproceedings}
}

Project:   B4

Crible, Ludivine; Demberg, Vera

The role of non-connective discourse cues and their interaction with connectives Journal Article

Pragmatics & Cognition, 27, pp. 313 - 338, 2021, ISSN 0929-0907.

The disambiguation and processing of coherence relations is often investigated with a focus on explicit connectives, such as but or so. Other, non-connective cues from the context also facilitate discourse inferences, although their precise disambiguating role and interaction with connectives have been largely overlooked in the psycholinguistic literature so far. This study reports on two crowdsourcing experiments that test the role of contextual cues (parallelism, antonyms, resultative verbs) in the disambiguation of contrast and consequence relations. We compare the effect of contextual cues in conceptually different relations, and with connectives that differ in their semantic precision. Using offline tasks, our results show that contextual cues significantly help disambiguating contrast and consequence relations in the absence of connectives. However, when connectives are present in the context, the effect of cues only holds if the connective is acceptable in the target relation. Overall, our study suggests that cues are decisive on their own, but only secondary in the presence of connectives. These results call for further investigation of the complex interplay between connective types, contextual cues, relation types and other linguistic and cognitive factors.

@article{Crible2021,
title = {The role of non-connective discourse cues and their interaction with connectives},
author = {Ludivine Crible and Vera Demberg},
url = {https://www.jbe-platform.com/content/journals/10.1075/pc.20003.cri},
doi = {https://doi.org/10.1075/pc.20003.cri},
year = {2021},
date = {2021},
journal = {Pragmatics & Cognition},
pages = {313 - 338},
volume = {27},
number = {2},
abstract = {The disambiguation and processing of coherence relations is often investigated with a focus on explicit connectives, such as but or so. Other, non-connective cues from the context also facilitate discourse inferences, although their precise disambiguating role and interaction with connectives have been largely overlooked in the psycholinguistic literature so far. This study reports on two crowdsourcing experiments that test the role of contextual cues (parallelism, antonyms, resultative verbs) in the disambiguation of contrast and consequence relations. We compare the effect of contextual cues in conceptually different relations, and with connectives that differ in their semantic precision. Using offline tasks, our results show that contextual cues significantly help disambiguating contrast and consequence relations in the absence of connectives. However, when connectives are present in the context, the effect of cues only holds if the connective is acceptable in the target relation. Overall, our study suggests that cues are decisive on their own, but only secondary in the presence of connectives. These results call for further investigation of the complex interplay between connective types, contextual cues, relation types and other linguistic and cognitive factors.},
pubstate = {published},
type = {article}
}

Project:   B2

Ibrahim, Omnia; Yuen, Ivan; van Os, Marjolein; Andreeva, Bistra; Möbius, Bernd

The effect of Lombard speech modifications in different information density contexts Inproceedings

Elektronische Sprachsignalverarbeitung 2021, Tagungsband der 32. Konferenz (Berlin), TUDpress, pp. 185-191, Dresden, 2021.

Speakers adapt their speech to increase clarity in the presence of background noise (Lombard speech) [1, 2]. However, they also modify their speech to be efficient by shortening word duration in more predictable contexts [3]. To meet these two communicative functions, speakers will attempt to resolve any conflicting communicative demands. The present study focuses on how this can be resolved in the acoustic domain. A total of 1520 target CV syllables were annotated and analysed from 38 German speakers in 2 white-noise (no noise vs. -10 dB SNR) and 2 surprisal (H vs. L) contexts. Median fundamental frequency (F0), intensity range, and syllable duration were extracted. Our results revealed effects of both noise and surprisal on syllable duration and intensity range, but only an effect of noise on F0. This might suggest redundant (multi-dimensional) acoustic coding in Lombard speech modification, but not so in surprisal modification.
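
By way of illustration, the 2x2 analysis amounts to aggregating the acoustic measures per noise and surprisal condition; a toy version with invented numbers:

# Toy aggregation with invented values: per-condition medians for two
# of the acoustic measures analysed in the paper (F0, duration).
import pandas as pd

df = pd.DataFrame({
    "noise":     ["quiet", "quiet", "-10dB", "-10dB"],
    "surprisal": ["H", "L", "H", "L"],
    "f0":        [118.0, 117.5, 131.0, 129.0],   # Hz, invented
    "duration":  [0.21, 0.18, 0.24, 0.20],       # s, invented
})
print(df.groupby(["noise", "surprisal"])[["f0", "duration"]].median())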

@inproceedings{Ibrahim2021,
title = {The effect of Lombard speech modifications in different information density contexts},
author = {Omnia Ibrahim and Ivan Yuen and Marjolein van Os and Bistra Andreeva and Bernd M{\"o}bius},
url = {https://www.essv.de/paper.php?id=1117},
year = {2021},
date = {2021},
booktitle = {Elektronische Sprachsignalverarbeitung 2021, Tagungsband der 32. Konferenz (Berlin)},
pages = {185-191},
publisher = {TUDpress},
address = {Dresden},
abstract = {Speakers adapt their speech to increase clarity in the presence of background noise (Lombard speech) [1, 2]. However, they also modify their speech to be efficient by shortening word duration in more predictable contexts [3]. To meet these two communicative functions, speakers will attempt to resolve any conflicting communicative demands. The present study focuses on how this can be resolved in the acoustic domain. A total of 1520 target CV syllables were annotated and analysed from 38 German speakers in 2 white-noise (no noise vs. -10 dB SNR) and 2 surprisal (H vs. L) contexts. Median fundamental frequency (F0), intensity range, and syllable duration were extracted. Our results revealed effects of both noise and surprisal on syllable duration and intensity range, but only an effect of noise on F0. This might suggest redundant (multi-dimensional) acoustic coding in Lombard speech modification, but not so in surprisal modification.},
pubstate = {published},
type = {inproceedings}
}

Project:   C1

Kudera, Jacek; van Os, Marjolein; Möbius, Bernd

Natural and synthetic speech comprehension in simulated tonal and pulsatile tinnitus: A pilot study Inproceedings

Elektronische Sprachsignalverarbeitung 2021, Tagungsband der 32. Konferenz (Berlin), TUDpress, pp. 273-280, Dresden, 2021.

This paper summarizes the results of a Modified Rhyme Test conducted with masked stimuli to simulate two common types of hearing impairment: bilateral pulsatile and pure tone tinnitus. Two types of stimuli, meaningful German words (natural read speech and TTS output) differing in initial- or final-positioned minimal pairs, were modified to correspond to six listening conditions. Results showed higher recognition scores for natural speech compared to synthetic speech and better intelligibility for pulsatile tinnitus noise over pure tone tinnitus. These insights are of relevance given the alarming rates of tinnitus in epidemiological reports.

@inproceedings{Kudera2021,
title = {Natural and synthetic speech comprehension in simulated tonal and pulsatile tinnitus: A pilot study},
author = {Jacek Kudera and Marjolein van Os and Bernd M{\"o}bius},
url = {https://www.essv.de/paper.php?id=1129},
year = {2021},
date = {2021},
booktitle = {Elektronische Sprachsignalverarbeitung 2021, Tagungsband der 32. Konferenz (Berlin)},
pages = {273-280},
publisher = {TUDpress},
address = {Dresden},
abstract = {This paper summarizes the results of a Modified Rhyme Test conducted with masked stimuli to simulate two common types of hearing impairment: bilateral pulsatile and pure tone tinnitus. Two types of stimuli, meaningful German words (natural read speech and TTS output) differing in initial- or final-positioned minimal pairs, were modified to correspond to six listening conditions. Results showed higher recognition scores for natural speech compared to synthetic speech and better intelligibility for pulsatile tinnitus noise over pure tone tinnitus. These insights are of relevance given the alarming rates of tinnitus in epidemiological reports.},
pubstate = {published},
type = {inproceedings}
}

Project:   C1

Stenger, Irina; Avgustinova, Tania

On Slavic cognate recognition in context Inproceedings

Selegej, Vladimir P. et al. (Ed.): Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference ‘Dialogue’, pp. 660-668, Moscow, Russia, 2021.

This study contributes to a better understanding of reading intercomprehension as manifested in the intelligibility of East and South Slavic languages to Russian native speakers in contextualized cognate recognition experiments using Belarusian, Ukrainian, and Bulgarian stimuli. While the results mostly confirm the expected mutual intelligibility effects, we also register apparent processing difficulties in some of the cases. In search of an explanation, we examine the correlation of the experimentally obtained intercomprehension scores with various linguistic factors, which contribute to cognate intelligibility in a context, considering common predictors of intercomprehension associated with (i) morphology and orthography, (ii) lexis, and (iii) syntax.
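
The correlational part of such an analysis can be pictured with a small sketch: intercomprehension scores against one linguistic distance predictor (all values invented).

# Sketch of the correlation analysis; scores and distances are invented.
from scipy.stats import pearsonr

scores    = [0.82, 0.74, 0.61, 0.55, 0.48]   # cognate recognition rates
distances = [0.10, 0.18, 0.30, 0.35, 0.44]   # e.g., orthographic distance
r, p = pearsonr(scores, distances)
print(r, p)                                  # expect a negative r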

@inproceedings{Stenger-dialog2021,
title = {On Slavic cognate recognition in context},
author = {Irina Stenger and Tania Avgustinova},
editor = {Vladimir P. Selegej and others},
url = {https://www.dialog-21.ru/media/5547/stengeriplusavgustinovat027.pdf},
year = {2021},
date = {2021},
booktitle = {Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference ‘Dialogue’},
pages = {660-668},
address = {Moscow, Russia},
abstract = {This study contributes to a better understanding of reading intercomprehension as manifested in the intelligibility of East and South Slavic languages to Russian native speakers in contextualized cognate recognition experiments using Belarusian, Ukrainian, and Bulgarian stimuli. While the results mostly confirm the expected mutual intelligibility effects, we also register apparent processing difficulties in some of the cases. In search of an explanation, we examine the correlation of the experimentally obtained intercomprehension scores with various linguistic factors, which contribute to cognate intelligibility in a context, considering common predictors of intercomprehension associated with (i) morphology and orthography, (ii) lexis, and (iii) syntax.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Stenger, Irina; Avgustinova, Tania

Multilingual learnability and reaction time in online Slavic intercomprehension experiments Inproceedings

Koeva, Svetla; Stamenov, Maksim (Ed.): Proceedings of the International Annual Conference of the Institute for Bulgarian Language, 2, Marin Drinov Academic Publishers, pp. 191-200, Sofia, Bulgaria, 2021.

Receptive multilingualism is a multidimensional and multifactorial phenomenon that crucially depends on the mutual intelligibility of closely related languages. As a strategy, it predominantly capitalizes upon a dynamic integration of linguistic, communicative, contextual, and socio-cognitive aspects. Relevant linguistic determinants (especially linguistic distances) along with recognizable extra-linguistic influences (such as attitude and exposure) have recently enjoyed increased attention in the research community. In our online (web-based) intercomprehension experiments, we have observed learning effects that appear to be empirically associated with individual cognitive skills. For this study, we tested 185 Russian subjects in a written word recognition task which essentially involved cognate guessing in Belarusian, Bulgarian, Macedonian, Serbian, and Ukrainian. The subjects had to translate the stimuli presented online into their native language, i.e. Russian. To reveal implicit multilingual learnability, we correlate the obtained intercomprehension scores with the detected reaction times, taking into consideration the potential influence of the experiment rank on the reaction time too.

@inproceedings{Stenger-CONFIBL2021,
title = {Multilingual learnability and reaction time in online Slavic intercomprehension experiments},
author = {Irina Stenger and Tania Avgustinova},
editor = {Svetla Koeva and Maksim Stamenov},
url = {https://ibl.bas.bg/wp-content/uploads/2021/06/Sbornik_s_dokladi_CONFIBL2021_tom_2_FINAL.pdf},
year = {2021},
date = {2021},
booktitle = {Proceedings of the International Annual Conference of the Institute for Bulgarian Language},
pages = {191-200},
publisher = {Marin Drinov Academic Publishers},
address = {Sofia, Bulgaria},
abstract = {Receptive multilingualism is a multidimensional and multifactorial phenomenon that crucially depends on the mutual intelligibility of closely related languages. As a strategy, it predominantly capitalizes upon a dynamic integration of linguistic, communicative, contextual, and socio-cognitive aspects. Relevant linguistic determinants (especially linguistic distances) along with recognizable extra-linguistic influences (such as attitude and exposure) have recently enjoyed increased attention in the research community. In our online (web-based) intercomprehension experiments, we have observed learning effects that appear to be empirically associated with individual cognitive skills. For this study, we tested 185 Russian subjects in a written word recognition task which essentially involved cognate guessing in Belarusian, Bulgarian, Macedonian, Serbian, and Ukrainian. The subjects had to translate the stimuli presented online into their native language, i.e. Russian. To reveal implicit multilingual learnability, we correlate the obtained intercomprehension scores with the detected reaction times, taking into consideration the potential influence of the experiment rank on the reaction time too.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Höller, Daniel; Behnke, Gregor; Bercher, Pascal; Biundo, Susanne

The PANDA Framework for Hierarchical Planning Journal Article

Künstliche Intelligenz, 2021.

In recent years, much progress has been made in hierarchical planning towards domain-independent systems that come with sophisticated techniques to solve planning problems instead of relying on advice in the input model. Several of these novel methods have been integrated into the PANDA framework, which is a software system to reason about hierarchical planning tasks. Besides solvers for planning problems based on plan space search, progression search, and translation to propositional logic, it also includes techniques for related problems such as plan repair, plan and goal recognition, and plan verification. These various techniques share a common infrastructure, such as a standard input language and components for grounding and reachability analysis. This article gives an overview of the PANDA framework, introduces the basic techniques from a high-level perspective, and surveys the literature describing the diverse components in detail.

@article{hoeller-etal-21-PANDA,
title = {The PANDA Framework for Hierarchical Planning},
author = {Daniel H{\"o}ller and Gregor Behnke and Pascal Bercher and Susanne Biundo},
url = {https://link.springer.com/article/10.1007/s13218-020-00699-y},
doi = {https://doi.org/10.1007/s13218-020-00699-y},
year = {2021},
date = {2021},
journal = {K{\"u}nstliche Intelligenz},
abstract = {In recent years, much progress has been made in hierarchical planning towards domain-independent systems that come with sophisticated techniques to solve planning problems instead of relying on advice in the input model. Several of these novel methods have been integrated into the PANDA framework, which is a software system to reason about hierarchical planning tasks. Besides solvers for planning problems based on plan space search, progression search, and translation to propositional logic, it also includes techniques for related problems such as plan repair, plan and goal recognition, and plan verification. These various techniques share a common infrastructure, such as a standard input language and components for grounding and reachability analysis. This article gives an overview of the PANDA framework, introduces the basic techniques from a high-level perspective, and surveys the literature describing the diverse components in detail.},
pubstate = {published},
type = {article}
}

Project:   A7

Höller, Daniel; Bercher, Pascal

Landmark Generation in HTN Planning Inproceedings

Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI), 35, AAAI Press, 2021.

Landmarks (LMs) are state features that need to be made true or tasks that need to be contained in every solution of a planning problem. They are a valuable source of information in planning and can be exploited in various ways. LMs have been used both in classical and hierarchical planning, but while there is much work in classical planning, the techniques in hierarchical planning are less evolved. We introduce a novel LM generation method for Hierarchical Task Network (HTN) planning and show that it is sound but incomplete. We show that every complete approach is as hard as the co-class of the underlying HTN problem, i.e. coNP-hard for our setting (while our approach is in P). On a widely used benchmark set, our approach finds more than twice as many landmarks as the approach from the literature. Though our focus is on LM generation, we show that the newly discovered landmarks bear information beneficial for solvers.

@inproceedings{Höller_Bercher_2021,
title = {Landmark Generation in HTN Planning},
author = {Daniel H{\"o}ller and Pascal Bercher},
url = {https://ojs.aaai.org/index.php/AAAI/article/view/17405},
doi = {https://doi.org/10.1609/aaai.v35i13.17405},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)},
publisher = {AAAI Press},
abstract = {Landmarks (LMs) are state features that need to be made true or tasks that need to be contained in every solution of a planning problem. They are a valuable source of information in planning and can be exploited in various ways. LMs have been used both in classical and hierarchical planning, but while there is much work in classical planning, the techniques in hierarchical planning are less evolved. We introduce a novel LM generation method for Hierarchical Task Network (HTN) planning and show that it is sound but incomplete. We show that every complete approach is as hard as the co-class of the underlying HTN problem, i.e. coNP-hard for our setting (while our approach is in P). On a widely used benchmark set, our approach finds more than twice as many landmarks as the approach from the literature. Though our focus is on LM generation, we show that the newly discovered landmarks bear information beneficial for solvers.},
pubstate = {published},
type = {inproceedings}
}

Project:   A7

Zarcone, Alessandra; Demberg, Vera

Interaction of script knowledge and temporal discourse cues in a visual world study Journal Article

Discourse Processes, Routledge, pp. 1-16, 2021.

There is now a well-established literature showing that people anticipate upcoming concepts and words during language processing. Commonsense knowledge about typical event sequences and verbal selectional preferences can contribute to anticipating what will be mentioned next. We here investigate how temporal discourse connectives (before, after), which signal event ordering along a temporal dimension, modulate predictions for upcoming discourse referents. Our study analyses anticipatory gaze in the visual world and supports the idea that script knowledge, temporal connectives (before eating → menu, appetizer), and the verb’s selectional preferences (order → appetizer) jointly contribute to shaping rapid prediction of event participants.

@article{zarcone2021script,
title = {Interaction of script knowledge and temporal discourse cues in a visual world study},
author = {Alessandra Zarcone and Vera Demberg},
url = {https://doi.org/10.1080/0163853X.2021.1930807},
doi = {https://doi.org/10.1080/0163853X.2021.1930807},
year = {2021},
date = {2021-07-26},
journal = {Discourse Processes},
pages = {1-16},
publisher = {Routledge},
abstract = {There is now a well-established literature showing that people anticipate upcoming concepts and words during language processing. Commonsense knowledge about typical event sequences and verbal selectional preferences can contribute to anticipating what will be mentioned next. We here investigate how temporal discourse connectives (before, after), which signal event ordering along a temporal dimension, modulate predictions for upcoming discourse referents. Our study analyses anticipatory gaze in the visual world and supports the idea that script knowledge, temporal connectives (before eating → menu, appetizer), and the verb’s selectional preferences (order → appetizer) jointly contribute to shaping rapid prediction of event participants.},
pubstate = {published},
type = {article}
}

Project:   A3

Delogu, Francesca; Brouwer, Harm; Crocker, Matthew W.

When components collide: Spatiotemporal overlap of the N400 and P600 in language comprehension Journal Article

Brain Research, 1766, pp. 147514, 2021, ISSN 0006-8993.

The problem of spatiotemporal overlap between event-related potential (ERP) components is generally acknowledged in language research. However, its implications for the interpretation of experimental results are often overlooked. In a previous experiment on the functional interpretation of the N400 and P600, it was argued that a P600 effect to implausible words was largely obscured – in one of two implausible conditions – by an overlapping N400 effect of semantic association. In the present ERP study, we show that the P600 effect of implausibility is uncovered when the critical condition is tested against a proper baseline condition which elicits a similar N400 amplitude, while it is obscured when tested against a baseline condition producing an N400 effect. Our findings reveal that component overlap can result in the apparent absence or presence of an effect in the surface signal and should therefore be carefully considered when interpreting ERP patterns. Importantly, we show that, by factoring in the effects of spatiotemporal overlap between the N400 and P600 on the surface signal, which we reveal using rERP analysis, apparent inconsistencies in previous findings are easily reconciled, enabling us to draw unambiguous conclusions about the functional interpretation of the N400 and P600 components. Overall, our results provide compelling evidence that the N400 reflects lexical retrieval processes, while the P600 indexes compositional integration of word meaning into the unfolding utterance interpretation.
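
The rERP analysis mentioned above regresses single-trial EEG amplitudes on experimental predictors at each time point, yielding regression-based waveforms instead of condition averages. A minimal sketch with invented data:

# Minimal rERP sketch: ordinary least squares at every time point.
# eeg: trials x timepoints; X: trials x predictors (incl. intercept).
import numpy as np

def rerp(eeg, X):
    beta, *_ = np.linalg.lstsq(X, eeg, rcond=None)
    return beta                        # predictors x timepoints

rng = np.random.default_rng(1)
eeg = rng.normal(size=(40, 100))                          # invented EEG
X = np.column_stack([np.ones(40), rng.normal(size=40)])   # e.g., association
print(rerp(eeg, X).shape)                                 # (2, 100)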

@article{DELOGU2021147514,
title = {When components collide: Spatiotemporal overlap of the N400 and P600 in language comprehension},
author = {Francesca Delogu and Harm Brouwer and Matthew W. Crocker},
url = {https://www.sciencedirect.com/science/article/pii/S0006899321003711},
doi = {https://doi.org/10.1016/j.brainres.2021.147514},
year = {2021},
date = {2021},
journal = {Brain Research},
pages = {147514},
volume = {1766},
abstract = {The problem of spatiotemporal overlap between event-related potential (ERP) components is generally acknowledged in language research. However, its implications for the interpretation of experimental results are often overlooked. In a previous experiment on the functional interpretation of the N400 and P600, it was argued that a P600 effect to implausible words was largely obscured – in one of two implausible conditions – by an overlapping N400 effect of semantic association. In the present ERP study, we show that the P600 effect of implausibility is uncovered when the critical condition is tested against a proper baseline condition which elicits a similar N400 amplitude, while it is obscured when tested against a baseline condition producing an N400 effect. Our findings reveal that component overlap can result in the apparent absence or presence of an effect in the surface signal and should therefore be carefully considered when interpreting ERP patterns. Importantly, we show that, by factoring in the effects of spatiotemporal overlap between the N400 and P600 on the surface signal, which we reveal using rERP analysis, apparent inconsistencies in previous findings are easily reconciled, enabling us to draw unambiguous conclusions about the functional interpretation of the N400 and P600 components. Overall, our results provide compelling evidence that the N400 reflects lexical retrieval processes, while the P600 indexes compositional integration of word meaning into the unfolding utterance interpretation.},
pubstate = {published},
type = {article}
}

Project:   A1

Lemke, Tyll Robin; Reich, Ingo; Schäfer, Lisa; Drenhaus, Heiner

Predictable words are more likely to be omitted in fragments – Evidence from production data Journal Article

Frontiers in Psychology, 12, pp. 662125, 2021.

Instead of a full sentence like Bring me to the university (uttered by the passenger to a taxi driver) speakers often use fragments like To the university to get their message across. So far there is no comprehensive and empirically supported account of why and under which circumstances speakers sometimes prefer a fragment over the corresponding full sentence. We propose an information-theoretic account to model this choice: A speaker chooses the encoding that distributes information most uniformly across the utterance in order to make the most efficient use of the hearer’s processing resources (Uniform Information Density, Levy and Jaeger, 2007). Since processing effort is related to the predictability of words (Hale, 2001) our account predicts two effects of word probability on omissions: First, omitting predictable words (which are more easily processed), avoids underutilizing processing resources. Second, inserting words before very unpredictable words distributes otherwise excessively high processing effort more uniformly. We test these predictions with a production study that supports both of these predictions. Our study makes two main contributions: First we develop an empirically motivated and supported account of fragment usage. Second, we extend previous evidence for information-theoretic processing constraints on language in two ways: We find predictability effects on omissions driven by extralinguistic context, whereas previous research mostly focused on effects of local linguistic context. Furthermore, we show that omissions of content words are also subject to information-theoretic well-formedness considerations. Previously, this has been shown mostly for the omission of function words.
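
The UID reasoning can be made concrete with a toy calculation: given per-word probabilities in context, compare how uniformly surprisal is distributed across a full sentence and a fragment (all probabilities invented).

# Toy UID calculation with invented probabilities: lower variance of
# per-word surprisal = more uniform information density.
from math import log2
from statistics import variance

def surprisals(probs):
    return [-log2(p) for p in probs]

full = [0.9, 0.6, 0.5, 0.05, 0.4]   # p(word | context), full sentence
frag = [0.30, 0.20, 0.25]           # corresponding fragment
for name, probs in (("full", full), ("fragment", frag)):
    print(name, round(variance(surprisals(probs)), 2))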

@article{lemke.etal2021.frontiers,
title = {Predictable words are more likely to be omitted in fragments – Evidence from production data},
author = {Tyll Robin Lemke and Ingo Reich and Lisa Sch{\"a}fer and Heiner Drenhaus},
url = {https://www.frontiersin.org/articles/10.3389/fpsyg.2021.662125/full},
doi = {https://doi.org/10.3389/fpsyg.2021.662125},
year = {2021},
date = {2021-07-22},
journal = {Frontiers in Psychology},
pages = {662125},
volume = {12},
abstract = {Instead of a full sentence like Bring me to the university (uttered by the passenger to a taxi driver) speakers often use fragments like To the university to get their message across. So far there is no comprehensive and empirically supported account of why and under which circumstances speakers sometimes prefer a fragment over the corresponding full sentence. We propose an information-theoretic account to model this choice: A speaker chooses the encoding that distributes information most uniformly across the utterance in order to make the most efficient use of the hearer's processing resources (Uniform Information Density, Levy and Jaeger, 2007). Since processing effort is related to the predictability of words (Hale, 2001) our account predicts two effects of word probability on omissions: First, omitting predictable words (which are more easily processed), avoids underutilizing processing resources. Second, inserting words before very unpredictable words distributes otherwise excessively high processing effort more uniformly. We test these predictions with a production study that supports both of these predictions. Our study makes two main contributions: First we develop an empirically motivated and supported account of fragment usage. Second, we extend previous evidence for information-theoretic processing constraints on language in two ways: We find predictability effects on omissions driven by extralinguistic context, whereas previous research mostly focused on effects of local linguistic context. Furthermore, we show that omissions of content words are also subject to information-theoretic well-formedness considerations. Previously, this has been shown mostly for the omission of function words.},
pubstate = {published},
type = {article}
}

Project:   B3

Abdullah, Badr M.; Mosbach, Marius; Zaitova, Iuliia; Möbius, Bernd; Klakow, Dietrich

Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study Inproceedings

Proceedings of Interspeech 2021, 2021.

Several variants of deep neural networks have been successfully employed for building parametric models that project variable-duration spoken word segments onto fixed-size vector representations, or acoustic word embeddings (AWEs). However, it remains unclear to what degree we can rely on the distance in the emerging AWE space as an estimate of word-form similarity. In this paper, we ask: does the distance in the acoustic embedding space correlate with phonological dissimilarity? To answer this question, we empirically investigate the performance of supervised approaches for AWEs with different neural architectures and learning objectives. We train AWE models in controlled settings for two languages (German and Czech) and evaluate the embeddings on two tasks: word discrimination and phonological similarity. Our experiments show that (1) the distance in the embedding space in the best cases only moderately correlates with phonological distance, and (2) improving the performance on the word discrimination task does not necessarily yield models that better reflect word phonological similarity. Our findings highlight the necessity to rethink the current intrinsic evaluations for AWEs.
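
The evaluation idea, correlating distances in the AWE space with phonological distances, can be sketched as follows; the embeddings below are random stand-ins (so the resulting correlation is meaningless), and the point is only the pipeline.

# Sketch of the evaluation pipeline: cosine distances between (random,
# stand-in) acoustic word embeddings vs. phonological edit distances.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import spearmanr

def edit_distance(a, b):
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0], d[0, :] = np.arange(len(a) + 1), np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                          d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))
    return d[-1, -1]

pairs = [("katze", "tatze"), ("katze", "tasse"), ("wein", "wind")]
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=32) for pair in pairs for w in pair}
awe = [cosine(emb[a], emb[b]) for a, b in pairs]
pho = [edit_distance(a, b) for a, b in pairs]
print(spearmanr(awe, pho))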

@inproceedings{Abdullah2021DoAW,
title = {Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study},
author = {Badr M. Abdullah and Marius Mosbach and Iuliia Zaitova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://arxiv.org/abs/2106.08686},
year = {2021},
date = {2021},
booktitle = {Proceedings of Interspeech 2021},
abstract = {Several variants of deep neural networks have been successfully employed for building parametric models that project variable-duration spoken word segments onto fixed-size vector representations, or acoustic word embeddings (AWEs). However, it remains unclear to what degree we can rely on the distance in the emerging AWE space as an estimate of word-form similarity. In this paper, we ask: does the distance in the acoustic embedding space correlate with phonological dissimilarity? To answer this question, we empirically investigate the performance of supervised approaches for AWEs with different neural architectures and learning objectives. We train AWE models in controlled settings for two languages (German and Czech) and evaluate the embeddings on two tasks: word discrimination and phonological similarity. Our experiments show that (1) the distance in the embedding space in the best cases only moderately correlates with phonological distance, and (2) improving the performance on the word discrimination task does not necessarily yield models that better reflect word phonological similarity. Our findings highlight the necessity to rethink the current intrinsic evaluations for AWEs.},
pubstate = {published},
type = {inproceedings}
}

Projects:   C4 B4

Mayn, Alexandra; Abdullah, Badr M.; Klakow, Dietrich

Familiar words but strange voices: Modelling the influence of speech variability on word recognition Inproceedings

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, Association for Computational Linguistics, pp. 96-102, Online, 2021.

We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener. Furthermore, we investigate the influence of variability in speech signals on the model’s performance. To this end, we conduct a set of controlled experiments using word-aligned read speech data in German. Our experiments show that (1) the model is more sensitive to dialectal variation than to gender variation, and (2) the recognition performance of word cognates from related languages reflects the degree of relatedness between the languages in our study. Our work highlights the feasibility of modeling human speech perception using deep neural networks.

@inproceedings{mayn-etal-2021-familiar,
title = {Familiar words but strange voices: Modelling the influence of speech variability on word recognition},
author = {Alexandra Mayn and Badr M. Abdullah and Dietrich Klakow},
url = {https://aclanthology.org/2021.eacl-srw.14},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop},
pages = {96-102},
publisher = {Association for Computational Linguistics},
address = {Online},
abstract = {We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener. Furthermore, we investigate the influence of variability in speech signals on the model’s performance. To this end, we conduct a set of controlled experiments using word-aligned read speech data in German. Our experiments show that (1) the model is more sensitive to dialectal variation than to gender variation, and (2) the recognition performance of word cognates from related languages reflects the degree of relatedness between the languages in our study. Our work highlights the feasibility of modeling human speech perception using deep neural networks.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Macher, Nicole; Abdullah, Badr M.; Brouwer, Harm; Klakow, Dietrich

Do we read what we hear? Modeling orthographic influences on spoken word recognition Inproceedings

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, Association for Computational Linguistics, pp. 16-22, Online, 2021.

Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form. There is consensus that phonological and semantic information is crucial for this process. However, there is accumulating evidence that orthographic information could also have an impact on auditory word recognition. This paper presents two models of spoken word recognition that instantiate different hypotheses regarding the influence of orthography on this process. We show that these models reproduce human-like behavior in different ways and provide testable hypotheses for future research on the source of orthographic effects in spoken word recognition.

@inproceedings{macher-etal-2021-read,
title = {Do we read what we hear? Modeling orthographic influences on spoken word recognition},
author = {Nicole Macher and Badr M. Abdullah and Harm Brouwer and Dietrich Klakow},
url = {https://aclanthology.org/2021.eacl-srw.3},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop},
pages = {16-22},
publisher = {Association for Computational Linguistics},
address = {Online},
abstract = {Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form. There is consensus that phonological and semantic information is crucial for this process. However, there is accumulating evidence that orthographic information could also have an impact on auditory word recognition. This paper presents two models of spoken word recognition that instantiate different hypotheses regarding the influence of orthography on this process. We show that these models reproduce human-like behavior in different ways and provide testable hypotheses for future research on the source of orthographic effects in spoken word recognition.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A1 C4

Chingacham, Anupama; Demberg, Vera; Klakow, Dietrich

Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors Inproceedings

Proceedings of Interspeech 2021, pp. 1713–1717, 2021.

Listening in noisy environments can be difficult even for individuals with normal hearing thresholds. The speech signal can be masked by noise, which may lead to word misperceptions on the side of the listener and overall difficulty in understanding the message. To mitigate hearing difficulties for listeners, a co-operative speaker utilizes voice modulation strategies like Lombard speech to generate noise-robust utterances, and similar solutions have been developed for speech synthesis systems. In this work, we propose an alternative solution of choosing noise-robust lexical paraphrases to represent an intended meaning. Our results show that lexical paraphrases differ in their intelligibility in noise. We evaluate the intelligibility of synonyms in context and find that choosing a lexical unit that is less likely to be misheard than its synonym introduced an average gain in comprehension of 37% at SNR -5 dB and 21% at SNR 0 dB for babble noise.
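
The selection idea reduces to picking, among synonyms, the one with the lower estimated misperception risk in noise; a toy version with invented probabilities:

# Toy paraphrase selection: choose the synonym with the lowest estimated
# probability of being misheard at a given SNR. Probabilities invented.
def safer_paraphrase(candidates):
    return min(candidates, key=candidates.get)

candidates = {"sofa": 0.12, "couch": 0.31}   # p(misperceived | SNR -5 dB)
print(safer_paraphrase(candidates))          # sofa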

@inproceedings{Chingacham2021,
title = {Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors},
author = {Anupama Chingacham and Vera Demberg and Dietrich Klakow},
url = {https://arxiv.org/abs/2107.08337},
year = {2021},
date = {2021},
booktitle = {Proceedings of Interspeech 2021},
pages = {1713–1717},
abstract = {Listening in noisy environments can be difficult even for individuals with normal hearing thresholds. The speech signal can be masked by noise, which may lead to word misperceptions on the side of the listener and overall difficulty in understanding the message. To mitigate hearing difficulties for listeners, a co-operative speaker utilizes voice modulation strategies like Lombard speech to generate noise-robust utterances, and similar solutions have been developed for speech synthesis systems. In this work, we propose an alternative solution of choosing noise-robust lexical paraphrases to represent an intended meaning. Our results show that lexical paraphrases differ in their intelligibility in noise. We evaluate the intelligibility of synonyms in context and find that choosing a lexical unit that is less likely to be misheard than its synonym introduced an average gain in comprehension of 37% at SNR -5 dB and 21% at SNR 0 dB for babble noise.},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Voigtmann, Sophia; Speyer, Augustin

Information density as a factor for syntactic variation in Early New High German Inproceedings

Proceedings of Linguistic Evidence 2020, Tübingen, Germany, 2021.

In contrast to other languages like English, German has certain liberties in its word order. Different word orders do not influence the proposition of a sentence. The frame of the German clause is formed by the sentence brackets (the left (LSB) and the right (RSB)), over which the parts of the predicate are distributed in main clauses, whereas in subordinate clauses the left one can host subordinate conjunctions. Apart from the sentence brackets, however, the order of constituents is fairly variable, though a default word order (subject, indirect object, direct object for nouns; subject, direct object, indirect object for pronouns) exists. A deviation from this order can be caused by factors like focus, given-/newness, topicality, definiteness and animacy (Zubin & Köpcke, 1985; Reis, 1987; Müller, 1999; Lenerz, 2001 among others).

@inproceedings{voigtmannspeyerinprint,
title = {Information density as a factor for syntactic variation in Early New High German},
author = {Sophia Voigtmann and Augustin Speyer},
url = {https://ub01.uni-tuebingen.de/xmlui/handle/10900/134561},
year = {2021},
date = {2021},
booktitle = {Proceedings of Linguistic Evidence 2020},
address = {T{\"u}bingen, Germany},
abstract = {In contrast to other languages like English, German has certain liberties in its word order. Different word orders do not influence the proposition of a sentence. The frame of the German clause is formed by the sentence brackets (the left (LSB) and the right (RSB)), over which the parts of the predicate are distributed in main clauses, whereas in subordinate clauses the left one can host subordinate conjunctions. Apart from the sentence brackets, however, the order of constituents is fairly variable, though a default word order (subject, indirect object, direct object for nouns; subject, direct object, indirect object for pronouns) exists. A deviation from this order can be caused by factors like focus, given-/newness, topicality, definiteness and animacy (Zubin & K{\"o}pcke, 1985; Reis, 1987; M{\"u}ller, 1999; Lenerz, 2001 among others).},
pubstate = {published},
type = {inproceedings}
}

Project:   C6
