Publications

Mayn, Alexandra; Abdullah, Badr M.; Klakow, Dietrich

Familiar words but strange voices: Modelling the influence of speech variability on word recognition Inproceedings

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, Association for Computational Linguistics, pp. 96-102, Online, 2021.

We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener. Furthermore, we investigate the influence of variability in speech signals on the model’s performance. To this end, we conduct a set of controlled experiments using word-aligned read speech data in German. Our experiments show that (1) the model is more sensitive to dialectal variation than gender variation, and (2) recognition performance of word cognates from related languages reflects the degree of relatedness between languages in our study. Our work highlights the feasibility of modeling human speech perception using deep neural networks.

@inproceedings{mayn-etal-2021-familiar,
title = {Familiar words but strange voices: Modelling the influence of speech variability on word recognition},
author = {Alexandra Mayn and Badr M. Abdullah and Dietrich Klakow},
url = {https://aclanthology.org/2021.eacl-srw.14},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop},
pages = {96-102},
publisher = {Association for Computational Linguistics},
address = {Online},
abstract = {We present a deep neural model of spoken word recognition which is trained to retrieve the meaning of a word (in the form of a word embedding) given its spoken form, a task which resembles that faced by a human listener. Furthermore, we investigate the influence of variability in speech signals on the model’s performance. To this end, we conduct a set of controlled experiments using word-aligned read speech data in German. Our experiments show that (1) the model is more sensitive to dialectal variation than gender variation, and (2) recognition performance of word cognates from related languages reflects the degree of relatedness between languages in our study. Our work highlights the feasibility of modeling human speech perception using deep neural networks.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Macher, Nicole; Abdullah, Badr M.; Brouwer, Harm; Klakow, Dietrich

Do we read what we hear? Modeling orthographic influences on spoken word recognition Inproceedings

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, Association for Computational Linguistics, pp. 16-22, Online, 2021.

Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form. There is consensus that phonological and semantic information is crucial for this process. However, there is accumulating evidence that orthographic information could also have an impact on auditory word recognition. This paper presents two models of spoken word recognition that instantiate different hypotheses regarding the influence of orthography on this process. We show that these models reproduce human-like behavior in different ways and provide testable hypotheses for future research on the source of orthographic effects in spoken word recognition.

@inproceedings{macher-etal-2021-read,
title = {Do we read what we hear? Modeling orthographic influences on spoken word recognition},
author = {Nicole Macher and Badr M. Abdullah and Harm Brouwer and Dietrich Klakow},
url = {https://aclanthology.org/2021.eacl-srw.3},
year = {2021},
date = {2021},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop},
pages = {16-22},
publisher = {Association for Computational Linguistics},
address = {Online},
abstract = {Theories and models of spoken word recognition aim to explain the process of accessing lexical knowledge given an acoustic realization of a word form. There is consensus that phonological and semantic information is crucial for this process. However, there is accumulating evidence that orthographic information could also have an impact on auditory word recognition. This paper presents two models of spoken word recognition that instantiate different hypotheses regarding the influence of orthography on this process. We show that these models reproduce human-like behavior in different ways and provide testable hypotheses for future research on the source of orthographic effects in spoken word recognition.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A1 C4

Jágrová, Klára; Hedderich, Michael; Mosbach, Marius; Avgustinova, Tania; Klakow, Dietrich

On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers Journal Article

Frontiers in Psychology, 12, pp. 2296, 2021, ISSN 1664-1078.

This contribution seeks to provide a rational probabilistic explanation for the intelligibility of words in a genetically related language that is unknown to the reader, a phenomenon referred to as intercomprehension. In this research domain, linguistic distance, among other factors, has been shown to correlate well with the mutual intelligibility of individual words. However, the role of context in the intelligibility of target words in sentences has been the subject of very few studies. To address this, we analyze data from web-based experiments in which Czech (CS) respondents were asked to translate highly predictable target words at the final position of Polish sentences. We compare correlations of target word intelligibility with data from 3-gram language models (LMs) to their correlations with data obtained from context-aware LMs. More specifically, we evaluate two context-aware LM architectures: Long Short-Term Memory networks (LSTMs), which can, theoretically, take infinitely long-distance dependencies into account, and Transformer-based LMs, which can access the whole input sequence at the same time. We investigate how their use of context affects surprisal and its correlation with intelligibility.

@article{10.3389/fpsyg.2021.662277,
title = {On the Correlation of Context-Aware Language Models With the Intelligibility of Polish Target Words to Czech Readers},
author = {Kl{\'a}ra J{\'a}grov{\'a} and Michael Hedderich and Marius Mosbach and Tania Avgustinova and Dietrich Klakow},
url = {https://www.frontiersin.org/articles/10.3389/fpsyg.2021.662277/full},
doi = {https://doi.org/10.3389/fpsyg.2021.662277},
year = {2021},
date = {2021},
journal = {Frontiers in Psychology},
pages = {2296},
volume = {12},
abstract = {This contribution seeks to provide a rational probabilistic explanation for the intelligibility of words in a genetically related language that is unknown to the reader, a phenomenon referred to as intercomprehension. In this research domain, linguistic distance, among other factors, has been shown to correlate well with the mutual intelligibility of individual words. However, the role of context in the intelligibility of target words in sentences has been the subject of very few studies. To address this, we analyze data from web-based experiments in which Czech (CS) respondents were asked to translate highly predictable target words at the final position of Polish sentences. We compare correlations of target word intelligibility with data from 3-gram language models (LMs) to their correlations with data obtained from context-aware LMs. More specifically, we evaluate two context-aware LM architectures: Long Short-Term Memory networks (LSTMs), which can, theoretically, take infinitely long-distance dependencies into account, and Transformer-based LMs, which can access the whole input sequence at the same time. We investigate how their use of context affects surprisal and its correlation with intelligibility.},
pubstate = {published},
type = {article}
}

Projects:   B4 C4

Kudera, Jacek; Tavi, Lauri; Möbius, Bernd; Avgustinova, Tania; Klakow, Dietrich

The effect of surprisal on articulatory gestures in Polish consonant-to-vowel transitions: A pilot EMA study Inproceedings

14. ITG-Konferenz, ITG-Fachbericht 298: Speech Communication, pp. 179-183, Kiel, Germany, 2021, ISBN 978-3-8007-5627-8.

This study is concerned with the relation between the information-theoretic notion of surprisal and articulatory gesture in Polish consonant-to-vowel transitions. It addresses the question of the influence of diphone predictability on spectral trajectories and articulatory gestures by relating the effect of surprisal with motor fluency. The study combines the computation of locus equations (LE) with kinematic data obtained from an electromagnetic articulograph (EMA). The kinematic and acoustic data showed that a small coarticulation effect was present in the high- and low-surprisal clusters. Regardless of some small discrepancies across the measures, a high degree of overlap of adjacent segments is reported for the mid-surprisal group in both domains. Two explanations of the observed effect are proposed. The first refers to low-surprisal coarticulation resistance and suggests the need to disambiguate predictable sequences. The second, observed in high-surprisal clusters, refers to the prominence given to emphasize the unexpected concatenation.

@inproceedings{Kudera/etal:2021c,
title = {The effect of surprisal on articulatory gestures in Polish consonant-to-vowel transitions: A pilot EMA study},
author = {Jacek Kudera and Lauri Tavi and Bernd M{\"o}bius and Tania Avgustinova and Dietrich Klakow},
url = {https://ieeexplore.ieee.org/document/9657527},
year = {2021},
date = {2021},
booktitle = {14. ITG-Konferenz, ITG-Fachbericht 298: Speech Communication},
isbn = {978-3-8007-5627-8},
pages = {179-183},
address = {Kiel, Germany},
abstract = {This study is concerned with the relation between the information-theoretic notion of surprisal and articulatory gesture in Polish consonant-to-vowel transitions. It addresses the question of the influence of diphone predictability on spectral trajectories and articulatory gestures by relating the effect of surprisal with motor fluency. The study combines the computation of locus equations (LE) with kinematic data obtained from an electromagnetic articulograph (EMA). The kinematic and acoustic data showed that a small coarticulation effect was present in the high- and low-surprisal clusters. Regardless of some small discrepancies across the measures, a high degree of overlap of adjacent segments is reported for the mid-surprisal group in both domains. Two explanations of the observed effect are proposed. The first refers to low-surprisal coarticulation resistance and suggests the need to disambiguate predictable sequences. The second, observed in high-surprisal clusters, refers to the prominence given to emphasize the unexpected concatenation.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Kudera, Jacek; Georgis, Philip; Möbius, Bernd; Avgustinova, Tania; Klakow, Dietrich

Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic Inproceedings

Proc. Interspeech, pp. 3944-3948, 2021.

This study reveals the relation between surprisal, phonetic distance, and latency based on a multilingual, short-term priming framework. Four Slavic languages (Bulgarian, Czech, Polish, and Russian) are investigated across two priming conditions: associative and phonetic priming, involving true cognates and near-homophones, respectively. This research is grounded in the methodology of information theory and proposes new methods for quantifying differences between meaningful lexical primes and targets for closely related languages. It also outlines the influence of phonetic distance between cognate and noncognate pairs of primes and targets on response times in a cross-lingual lexical decision task. The experimental results show that phonetic distance moderates response times only in Polish and Czech, whereas the surprisal-based correspondence effect is an accurate predictor of latency for all tested languages. The information-theoretic approach of quantifying feature-based alternations between Slavic cognates and near-homophones appears to be a valid method for latency moderation in the auditory modality. The outcomes of this study suggest that the surprisal-based (un)expectedness of spoken stimuli is an accurate predictor of human performance in multilingual lexical decision tasks.

@inproceedings{kudera21_interspeech,
title = {Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic},
author = {Jacek Kudera and Philip Georgis and Bernd M{\"o}bius and Tania Avgustinova and Dietrich Klakow},
url = {https://www.isca-speech.org/archive/interspeech_2021/kudera21_interspeech.html},
doi = {https://doi.org/10.21437/Interspeech.2021-1003},
year = {2021},
date = {2021},
booktitle = {Proc. Interspeech},
pages = {3944-3948},
abstract = {This study reveals the relation between surprisal, phonetic distance, and latency based on a multilingual, short-term priming framework. Four Slavic languages (Bulgarian, Czech, Polish, and Russian) are investigated across two priming conditions: associative and phonetic priming, involving true cognates and near-homophones, respectively. This research is grounded in the methodology of information theory and proposes new methods for quantifying differences between meaningful lexical primes and targets for closely related languages. It also outlines the influence of phonetic distance between cognate and noncognate pairs of primes and targets on response times in a cross-lingual lexical decision task. The experimental results show that phonetic distance moderates response times only in Polish and Czech, whereas the surprisal-based correspondence effect is an accurate predictor of latency for all tested languages. The information-theoretic approach of quantifying feature-based alternations between Slavic cognates and near-homophones appears to be a valid method for latency moderation in the auditory modality. The outcomes of this study suggest that the surprisal-based (un)expectedness of spoken stimuli is an accurate predictor of human performance in multilingual lexical decision tasks.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Abdullah, Badr M.; Zaitova, Iuliia; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings Inproceedings

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, pp. 407-419, 2021.

How do neural networks “perceive” speech sounds from unknown languages? Does the typological similarity between the model’s training language (L1) and an unknown language (L2) have an impact on the model representations of L2 speech signals? To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWEs)—vector representations of variable-duration spoken-word segments. First, we train monolingual AWE models on seven Indo-European languages with various degrees of typological similarity. We then employ RSA to quantify the cross-lingual similarity by simulating native and non-native spoken-word processing using AWEs. Our experiments show that typological similarity indeed affects the representational similarity of the models in our study. We further discuss the implications of our work on modeling speech processing and language similarity with neural networks.

@inproceedings{abdullah-etal-2021-familiar,
title = {How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings},
author = {Badr M. Abdullah and Iuliia Zaitova and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://aclanthology.org/2021.blackboxnlp-1.32/},
doi = {https://doi.org/10.18653/v1/2021.blackboxnlp-1.32},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP},
pages = {407-419},
publisher = {Association for Computational Linguistics},
abstract = {How do neural networks “perceive” speech sounds from unknown languages? Does the typological similarity between the model’s training language (L1) and an unknown language (L2) have an impact on the model representations of L2 speech signals? To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWEs)—vector representations of variable-duration spoken-word segments. First, we train monolingual AWE models on seven Indo-European languages with various degrees of typological similarity. We then employ RSA to quantify the cross-lingual similarity by simulating native and non-native spoken-word processing using AWEs. Our experiments show that typological similarity indeed affects the representational similarity of the models in our study. We further discuss the implications of our work on modeling speech processing and language similarity with neural networks.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Mosbach, Marius; Degaetano-Ortlieb, Stefania; Krielke, Marie-Pauline; Abdullah, Badr M.; Klakow, Dietrich

A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English Inproceedings

Proceedings of the 28th International Conference on Computational Linguistics, pp. 771-787, 2020.

Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon needing contextual information and antecedent identification to be resolved. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks considering fine-grained linguistic knowledge, however, shows pronounced model-specific weaknesses, especially on semantic knowledge, strongly impacting models’ performance. Our results highlight the importance of (a) model comparison in evaluation tasks and (b) building up claims of model performance and the linguistic knowledge they capture beyond purely probing-based evaluations.

@inproceedings{Mosbach2020,
title = {A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English},
author = {Marius Mosbach and Stefania Degaetano-Ortlieb and Marie-Pauline Krielke and Badr M. Abdullah and Dietrich Klakow},
url = {https://aclanthology.org/2020.coling-main.67/},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
pages = {771-787},
abstract = {Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon needing contextual information and antecedent identification to be resolved. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks considering fine-grained linguistic knowledge, however, shows pronounced model-specific weaknesses, especially on semantic knowledge, strongly impacting models’ performance. Our results highlight the importance of (a) model comparison in evaluation tasks and (b) building up claims of model performance and the linguistic knowledge they capture beyond purely probing-based evaluations.},
pubstate = {published},
type = {inproceedings}
}

Projects:   B1 B4 C4

Stenger, Irina; Avgustinova, Tania

How intelligible is spoken Bulgarian for Russian native speakers in an intercomprehension scenario? Inproceedings

Micheva, Vanya et al. (Ed.): Proceedings of the International Annual Conference of the Institute for Bulgarian Language, 2, pp. 142-151, Sofia, Bulgaria, 2020.

In a web-based experiment, Bulgarian audio stimuli in the form of recorded isolated words are presented to Russian native speakers who are required to write a suitable Russian translation. The degree of intelligibility, as revealed by the cognate guessing task, is relatively high for this pair of languages. We correlate the obtained intercomprehension scores with established linguistic factors in order to determine their influence on the cross-linguistic spoken word recognition. A detailed error analysis focuses on sound correspondences that cause translation problems in such an intercomprehension scenario.

@inproceedings{Stenger2020b,
title = {How intelligible is spoken Bulgarian for Russian native speakers in an intercomprehension scenario?},
author = {Irina Stenger and Tania Avgustinova},
editor = {Vanya Micheva and others},
year = {2020},
date = {2020},
booktitle = {Proceedings of the International Annual Conference of the Institute for Bulgarian Language},
pages = {142-151},
address = {Sofia, Bulgaria},
abstract = {In a web-based experiment, Bulgarian audio stimuli in the form of recorded isolated words are presented to Russian native speakers who are required to write a suitable Russian translation. The degree of intelligibility, as revealed by the cognate guessing task, is relatively high for this pair of languages. We correlate the obtained intercomprehension scores with established linguistic factors in order to determine their influence on the cross-linguistic spoken word recognition. A detailed error analysis focuses on sound correspondences that cause translation problems in such an intercomprehension scenario.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Avgustinova, Tania; Stenger, Irina

Russian-Bulgarian mutual intelligibility in light of linguistic and statistical models of Slavic receptive multilingualism [Russko-bolgarskaja vzaimoponjatnost’ v svete lingvističeskich i statističeskich modelej slavjanskoj receptivnoj mnogojazyčnosti] Book Chapter

Marti, Roland; Pognan, Patrice; Schlamberger Brezar, Mojca (Ed.): University Press, Faculty of Arts, pp. 85-99, Ljubljana, Slovenia, 2020.

Computational modelling of the observed mutual intelligibility of Slavic languages unavoidably requires systematic integration of classical Slavistics knowledge from comparative historical grammar and traditional contrastive description of language pairs. The phenomenon of intercomprehension is quite intuitive: speakers of a given language L1 understand another closely related language (variety) L2 without being able to use the latter productively, i.e. for speaking or writing.

This specific mode of using the human linguistic competence manifests itself as receptive multilingualism. The degree of mutual understanding of genetically closely related languages, such as Bulgarian and Russian, corresponds to objectively measurable distances at different linguistic levels. The common Slavic basis and the comparative-synchronous perspective allow us to reveal Bulgarian-Russian linguistic affinity with regard to spelling, vocabulary and grammar.

@inbook{Avgustinova2020,
title = {Russian-Bulgarian mutual intelligibility in light of linguistic and statistical models of Slavic receptive multilingualism [Russko-bolgarskaja vzaimoponjatnost’ v svete lingvisti{\v{c}}eskich i statisti{\v{c}}eskich modelej slavjanskoj receptivnoj mnogojazy{\v{c}}nosti]},
author = {Tania Avgustinova and Irina Stenger},
editor = {Roland Marti and Patrice Pognan and Mojca Schlamberger Brezar},
url = {https://e-knjige.ff.uni-lj.si/znanstvena-zalozba/catalog/view/226/326/5284-1},
year = {2020},
date = {2020},
pages = {85-99},
publisher = {University Press, Faculty of Arts},
address = {Ljubljana, Slovenia},
abstract = {Computational modelling of the observed mutual intelligibility of Slavic languages unavoidably requires systematic integration of classical Slavistics knowledge from comparative historical grammar and traditional contrastive description of language pairs. The phenomenon of intercomprehension is quite intuitive: speakers of a given language L1 understand another closely related language (variety) L2 without being able to use the latter productively, i.e. for speaking or writing. This specific mode of using the human linguistic competence manifests itself as receptive multilingualism. The degree of mutual understanding of genetically closely related languages, such as Bulgarian and Russian, corresponds to objectively measurable distances at different linguistic levels. The common Slavic basis and the comparative-synchronous perspective allow us to reveal Bulgarian-Russian linguistic affinity with regard to spelling, vocabulary and grammar.},
pubstate = {published},
type = {inbook}
}

Project:   C4

Stenger, Irina; Avgustinova, Tania

Visual vs. auditory perception of Bulgarian stimuli by Russian native speakers Inproceedings

Selegej, Vladimir P. et al. (Ed.): Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference ‘Dialogue’, pp. 684-695, 2020.

This study contributes to a better understanding of receptive multilingualism by determining similarities and differences in successful processing of written and spoken cognate words in an unknown but (closely) related language. We investigate two Slavic languages with regard to their mutual intelligibility. The current focus is on the recognition of isolated Bulgarian words by Russian native speakers in a cognate guessing task, considering both written and audio stimuli.

The experimentally obtained intercomprehension scores show a generally high degree of intelligibility of Bulgarian cognates to Russian subjects, as well as processing difficulties in case of visual vs. auditory perception. In search of an explanation, we examine the linguistic factors that can contribute to various degrees of written and spoken word intelligibility. The intercomprehension scores obtained in the online word translation experiments are correlated with (i) the identical and mismatched correspondences on the orthographic and phonetic level, (ii) the word length of the stimuli, and (iii) the frequency of Russian cognates. Additionally we validate two measuring methods: the Levenshtein distance and the word adaptation surprisal as potential predictors of the word intelligibility in reading and oral intercomprehension.

@inproceedings{Stenger2020c,
title = {Visual vs. auditory perception of Bulgarian stimuli by Russian native speakers},
author = {Irina Stenger and Tania Avgustinova},
editor = {Vladimir P. Selegej and others},
url = {http://www.dialog-21.ru/media/4962/stengeriplusavgustinovat-045.pdf},
year = {2020},
date = {2020},
booktitle = {Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference ‘Dialogue’},
pages = {684-695},
abstract = {This study contributes to a better understanding of receptive multilingualism by determining similarities and differences in successful processing of written and spoken cognate words in an unknown but (closely) related language. We investigate two Slavic languages with regard to their mutual intelligibility. The current focus is on the recognition of isolated Bulgarian words by Russian native speakers in a cognate guessing task, considering both written and audio stimuli. The experimentally obtained intercomprehension scores show a generally high degree of intelligibility of Bulgarian cognates to Russian subjects, as well as processing difficulties in case of visual vs. auditory perception. In search of an explanation, we examine the linguistic factors that can contribute to various degrees of written and spoken word intelligibility. The intercomprehension scores obtained in the online word translation experiments are correlated with (i) the identical and mismatched correspondences on the orthographic and phonetic level, (ii) the word length of the stimuli, and (iii) the frequency of Russian cognates. Additionally we validate two measuring methods: the Levenshtein distance and the word adaptation surprisal as potential predictors of the word intelligibility in reading and oral intercomprehension.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Avgustinova, Tania; Jágrová, Klára; Stenger, Irina

The INCOMSLAV Platform: Experimental Website with Integrated Methods for Measuring Linguistic Distances and Asymmetries in Receptive Multilingualism Inproceedings

Fiumara, James; Cieri, Christopher; Liberman, Mark; Callison-Burch, Chris (Ed.): LREC 2020 Workshop Language Resources and Evaluation Conference 11-16 May 2020, Citizen Linguistics in Language Resource Development (CLLRD 2020), Peter Lang, pp. 483-500, 2020.

We report on a web-based resource for conducting intercomprehension experiments with native speakers of Slavic languages and present our methods for measuring linguistic distances and asymmetries in receptive multilingualism. Through a website which serves as a platform for online testing, a large number of participants with different linguistic backgrounds can be targeted. A statistical language model is used to measure information density and to gauge how language users master various degrees of (un)intelligibility. The key idea is that intercomprehension should be better when the model adapted for understanding the unknown language exhibits relatively low average distance and surprisal. All obtained intelligibility scores together with distance and asymmetry measures for the different language pairs and processing directions are made available as an integrated online resource in the form of a Slavic intercomprehension matrix (SlavMatrix).

@inproceedings{Avgustinova2020b,
title = {The INCOMSLAV Platform: Experimental Website with Integrated Methods for Measuring Linguistic Distances and Asymmetries in Receptive Multilingualism},
author = {Tania Avgustinova and Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger},
editor = {James Fiumara and Christopher Cieri and Mark Liberman and Chris Callison-Burch},
url = {https://aclanthology.org/2020.cllrd-1.6/},
doi = {https://doi.org/10.3726/978-3-653-07147-4},
year = {2020},
date = {2020},
booktitle = {LREC 2020 Workshop Language Resources and Evaluation Conference 11-16 May 2020, Citizen Linguistics in Language Resource Development (CLLRD 2020)},
pages = {483-500},
publisher = {Peter Lang},
abstract = {We report on a web-based resource for conducting intercomprehension experiments with native speakers of Slavic languages and present our methods for measuring linguistic distances and asymmetries in receptive multilingualism. Through a website which serves as a platform for online testing, a large number of participants with different linguistic backgrounds can be targeted. A statistical language model is used to measure information density and to gauge how language users master various degrees of (un)intelligibility. The key idea is that intercomprehension should be better when the model adapted for understanding the unknown language exhibits relatively low average distance and surprisal. All obtained intelligibility scores together with distance and asymmetry measures for the different language pairs and processing directions are made available as an integrated online resource in the form of a Slavic intercomprehension matrix (SlavMatrix).},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Stenger, Irina; Jágrová, Klára; Fischer, Andrea; Avgustinova, Tania

“Reading Polish with Czech Eyes” or “How Russian Can a Bulgarian Text Be?”: Orthographic Differences as an Experimental Variable in Slavic Intercomprehension Incollection

Radeva-Bork, Teodora; Kosta, Peter (Ed.): Current Developments in Slavic Linguistics. Twenty Years After (based on selected papers from FDSL 11), Peter Lang, pp. 483-500, 2020.

@incollection{Stenger2020,
title = {“Reading Polish with Czech Eyes” or “How Russian Can a Bulgarian Text Be?”: Orthographic Differences as an Experimental Variable in Slavic Intercomprehension},
author = {Irina Stenger and Kl{\'a}ra J{\'a}grov{\'a} and Andrea Fischer and Tania Avgustinova},
editor = {Teodora Radeva-Bork and Peter Kosta},
url = {https://www.peterlang.com/view/title/19540},
doi = {https://doi.org/10.3726/978-3-653-07147-4},
year = {2020},
date = {2020},
booktitle = {Current Developments in Slavic Linguistics. Twenty Years After (based on selected papers from FDSL 11)},
pages = {483-500},
publisher = {Peter Lang},
pubstate = {published},
type = {incollection}
}

Project:   C4

Avgustinova, Tania

Surprisal in Intercomprehension Book Chapter

Slavcheva, Milena; Simov, Kiril; Osenova, Petya; Boytcheva, Svetla (Ed.): Knowledge, Language, Models, INCOMA Ltd., pp. 6-19, Shoumen, Bulgaria, 2020, ISBN 978-954-452-062-5.

A large-scale interdisciplinary research collaboration at Saarland University (Crocker et al. 2016) investigates the hypothesis that language use may be driven by the optimal utilization of the communication channel. The information-theoretic concepts of entropy (Shannon, 1949) and surprisal (Hale 2001; Levy 2008) have gained in popularity due to their potential to predict human linguistic behavior. The underlying assumption is that there is a certain total amount of information contained in a message, which is distributed over the individual units constituting it. Capturing this distribution of information is the goal of surprisal-based modeling with the intention of predicting the processing effort experienced by humans upon encountering these units. The ease of processing linguistic material is thus correlated with its contextually determined predictability, which may be appropriately indexed by Shannon’s notion of information. Multilingualism pervasiveness suggests that human language competence is used quite robustly, taking on various types of information and employing multi-source compensatory and guessing strategies. While it is not realistic to require from every single person to master several languages, it is certainly beneficial to strive and promote a significantly higher degree of receptive skills facilitating the access to other languages. Taking advantage of linguistic similarity – genetic, typological or areal – is the key to acquiring such abilities as efficiently as possible. Awareness that linguistic structures known of a specific language apply to other varieties in which similar phenomena are detectable is indeed essential

@inbook{TAfestGA,
title = {Surprisal in Intercomprehension},
author = {Tania Avgustinova},
editor = {Milena Slavcheva and Kiril Simov and Petya Osenova and Svetla Boytcheva},
url = {https://www.coli.uni-saarland.de/~tania/ta-pub/Avgustinova2020.Festschrift.pdf},
year = {2020},
date = {2020},
booktitle = {Knowledge, Language, Models},
isbn = {978-954-452-062-5},
pages = {6-19},
publisher = {INCOMA Ltd.},
address = {Shoumen, Bulgaria},
abstract = {A large-scale interdisciplinary research collaboration at Saarland University (Crocker et al. 2016) investigates the hypothesis that language use may be driven by the optimal utilization of the communication channel. The information-theoretic concepts of entropy (Shannon, 1949) and surprisal (Hale 2001; Levy 2008) have gained in popularity due to their potential to predict human linguistic behavior. The underlying assumption is that there is a certain total amount of information contained in a message, which is distributed over the individual units constituting it. Capturing this distribution of information is the goal of surprisal-based modeling with the intention of predicting the processing effort experienced by humans upon encountering these units. The ease of processing linguistic material is thus correlated with its contextually determined predictability, which may be appropriately indexed by Shannon’s notion of information. Multilingualism pervasiveness suggests that human language competence is used quite robustly, taking on various types of information and employing multi-source compensatory and guessing strategies. While it is not realistic to require from every single person to master several languages, it is certainly beneficial to strive and promote a significantly higher degree of receptive skills facilitating the access to other languages. Taking advantage of linguistic similarity – genetic, typological or areal – is the key to acquiring such abilities as efficiently as possible. Awareness that linguistic structures known of a specific language apply to other varieties in which similar phenomena are detectable is indeed essential},
pubstate = {published},
type = {inbook}
}

Project:   C4

Abdullah, Badr M.; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages Inproceedings

Proceedings of Interspeech 2020, pp. 477-481, 2020.

State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain.

@inproceedings{abdullah_etal_is2020,
title = {Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages},
author = {Badr M. Abdullah and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://arxiv.org/abs/2008.00545},
doi = {https://doi.org/10.21437/Interspeech.2020-2930},
year = {2020},
date = {2020},
booktitle = {Proceedings of Interspeech 2020},
pages = {477-481},
abstract = {State-of-the-art spoken language identification (LID) systems, which are based on end-to-end deep neural networks, have shown remarkable success not only in discriminating between distant languages but also between closely-related languages or even different spoken varieties of the same language. However, it is still unclear to what extent neural LID models generalize to speech samples with different acoustic conditions due to domain shift. In this paper, we present a set of experiments to investigate the impact of domain mismatch on the performance of neural LID systems for a subset of six Slavic languages across two domains (read speech and radio broadcast) and examine two low-level signal descriptors (spectral and cepstral features) for this task. Our experiments show that (1) out-of-domain speech samples severely hinder the performance of neural LID models, and (2) while both spectral and cepstral features show comparable performance within-domain, spectral features show more robustness under domain mismatch. Moreover, we apply unsupervised domain adaptation to minimize the discrepancy between the two domains in our study. We achieve relative accuracy improvements that range from 9% to 77% depending on the diversity of acoustic conditions in the source domain.},
pubstate = {published},
type = {inproceedings}
}

Projects:   C1 C4

Abdullah, Badr M.; Kudera, Jacek; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification Inproceedings

Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020), International Committee on Computational Linguistics (ICCL), pp. 128-139, Barcelona, Spain (Online), 2020.

Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification (LID). In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness or non-linguists’ perception of language similarity. While our analysis shows that the language representation space indeed captures language relatedness to a great extent, we find perceptual confusability to be the best predictor of the language representation similarity.

@inproceedings{abdullah_etal_vardial2020,
title = {Rediscovering the Slavic Continuum in Representations Emerging from Neural Models of Spoken Language Identification},
author = {Badr M. Abdullah and Jacek Kudera and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://www.aclweb.org/anthology/2020.vardial-1.12},
year = {2020},
date = {2020},
booktitle = {Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2020)},
pages = {128-139},
publisher = {International Committee on Computational Linguistics (ICCL)},
address = {Barcelona, Spain (Online)},
abstract = {Deep neural networks have been employed for various spoken language recognition tasks, including tasks that are multilingual by definition such as spoken language identification (LID). In this paper, we present a neural model for Slavic language identification in speech signals and analyze its emergent representations to investigate whether they reflect objective measures of language relatedness or non-linguists’ perception of language similarity. While our analysis shows that the language representation space indeed captures language relatedness to a great extent, we find perceptual confusability to be the best predictor of the language representation similarity.},
pubstate = {published},
type = {inproceedings}
}

Projects:   C1 C4

Chen, Yu; Avgustinova, Tania

Machine Translation from an Intercomprehension Perspective Inproceedings

Proc. Fourth Conference on Machine Translation (WMT), Volume 3: Shared Task Papers, pp. 192-196, Florence, Italy, 2019.

Within the first shared task on machine translation between similar languages, we present our first attempts on Czech to Polish machine translation from an intercomprehension perspective. We propose methods based on the mutual intelligibility of the two languages, taking advantage of their orthographic and phonological similarity, in the hope to improve over our baselines. The translation results are evaluated using BLEU. On this metric, none of our proposals could outperform the baselines on the final test set. The current setups are rather preliminary, and there are several potential improvements we can try in the future.

@inproceedings{csplMT,
title = {Machine Translation from an Intercomprehension Perspective},
author = {Yu Chen and Tania Avgustinova},
url = {https://aclanthology.org/W19-5425},
doi = {https://doi.org/10.18653/v1/W19-5425},
year = {2019},
date = {2019},
booktitle = {Proc. Fourth Conference on Machine Translation (WMT), Volume 3: Shared Task Papers},
pages = {192-196},
address = {Florence, Italy},
abstract = {Within the first shared task on machine translation between similar languages, we present our first attempts on Czech to Polish machine translation from an intercomprehension perspective. We propose methods based on the mutual intelligibility of the two languages, taking advantage of their orthographic and phonological similarity, in the hope to improve over our baselines. The translation results are evaluated using BLEU. On this metric, none of our proposals could outperform the baselines on the final test set. The current setups are rather preliminary, and there are several potential improvements we can try in the future.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Jágrová, Klára

Reading Polish with Czech Eyes: Distance and Surprisal in Quantitative, Qualitative, and Error Analyses of Intelligibility PhD Thesis

Saarland University, Saarbruecken, Germany, 2019.

In CHAPTER I, I first introduce the thesis in the context of the project workflow in section 1. I then summarise the methods and findings from the project publications about the languages in focus. There I also introduce the relevant concepts and terminology viewed in the literature as possible predictors of intercomprehension and processing difficulty. CHAPTER II presents a quantitative (section 4) and a qualitative (section 5) analysis of the results of the cooperative translation experiments. The focus of this thesis – the language pair PL-CS – is explained and the hypotheses are introduced in section 6. The experiment website is introduced in section 7 with an overview over participants, the different experiments conducted and in which section they are discussed. In CHAPTER IV, free translation experiments are discussed in which two different sets of individual word stimuli were presented to Czech readers: (i) Cognates that are transformable with regular PL-CS correspondences (section 12) and (ii) the 100 most frequent PL nouns (section 13). CHAPTER V presents the findings of experiments in which PL NPs in two different linearisation conditions were presented to Czech readers (section 14.1-14.6). A short digression is made when I turn to experiments with PL internationalisms which were presented to German readers (14.7). CHAPTER VI discusses the methods and results of cloze translation experiments with highly predictable target words in sentential context (section 15) and random context with sentences from the cooperative translation experiments (section 16). A final synthesis of the findings, together with an outlook, is provided in CHAPTER VII.



@phdthesis{Jagrova_Diss_2019,
title = {Reading Polish with Czech Eyes: Distance and Surprisal in Quantitative, Qualitative, and Error Analyses of Intelligibility},
author = {Kl{\'a}ra J{\'a}grov{\'a}},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/32995},
doi = {https://doi.org/10.22028/D291-32708},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {In CHAPTER I, I first introduce the thesis in the context of the project workflow in section 1. I then summarise the methods and findings from the project publications about the languages in focus. There I also introduce the relevant concepts and terminology viewed in the literature as possible predictors of intercomprehension and processing difficulty. CHAPTER II presents a quantitative (section 4) and a qualitative (section 5) analysis of the results of the cooperative translation experiments. The focus of this thesis – the language pair PL-CS – is explained and the hypotheses are introduced in section 6. The experiment website is introduced in section 7 with an overview over participants, the different experiments conducted and in which section they are discussed. In CHAPTER IV, free translation experiments are discussed in which two different sets of individual word stimuli were presented to Czech readers: (i) Cognates that are transformable with regular PL-CS correspondences (section 12) and (ii) the 100 most frequent PL nouns (section 13). CHAPTER V presents the findings of experiments in which PL NPs in two different linearisation conditions were presented to Czech readers (section 14.1-14.6). A short digression is made when I turn to experiments with PL internationalisms which were presented to German readers (14.7). CHAPTER VI discusses the methods and results of cloze translation experiments with highly predictable target words in sentential context (section 15) and random context with sentences from the cooperative translation experiments (section 16). A final synthesis of the findings, together with an outlook, is provided in CHAPTER VII.


},
pubstate = {published},
type = {phdthesis}
}

Project:   C4

Jágrová, Klára; Stenger, Irina; Telus, Magdalena

Slavische Interkomprehension im 5-Sprachen-Kurs – Dokumentation eines Semesters Journal Article

Polnisch in Deutschland. Zeitschrift der Bundesvereinigung der Polnischlehrkräfte. Sondernummer: Emil Krebs und die Mehrsprachigkeit in Europa, pp. 122–133, 2019.

@article{Jágrová2019,
title = {Slavische Interkomprehension im 5-Sprachen-Kurs – Dokumentation eines Semesters},
author = {Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger and Magdalena Telus},
year = {2019},
date = {2019},
journal = {Polnisch in Deutschland. Zeitschrift der Bundesvereinigung der Polnischlehrkr{\"a}fte. Sondernummer: Emil Krebs und die Mehrsprachigkeit in Europa},
pages = {122-133},
pubstate = {published},
type = {article}
}

Project:   C4

Stenger, Irina

Zur Rolle der Orthographie in der slavischen Interkomprehension mit besonderem Fokus auf die kyrillische Schrift PhD Thesis

Saarland University, Saarbrücken, Germany, 2019, ISBN 978-3-86223-283-3.

The Slavic languages constitute a major branch of the Indo-European language family. This raises the question of how far speakers of different Slavic languages can communicate through intercomprehension. Intercomprehension is understood as the ability of speakers of related languages to communicate with each other, with each speaker using their own language. This thesis investigates the orthographic intelligibility of Slavic languages written in Cyrillic script in intercomprehensive reading. Six East and South Slavic languages (Bulgarian, Macedonian, Russian, Serbian, Ukrainian and Belarusian) are compared with regard to orthographic similarities and differences and analysed statistically. The empirical study focuses on the recognition of individual cognates with diachronically motivated orthographic correspondences in East and South Slavic languages, taking Russian as the starting point. The methods presented and the results obtained in this thesis are an empirical contribution to research on Slavic intercomprehension and to intercomprehension didactics.

@phdthesis{Stenger_diss_2019,
title = {Zur Rolle der Orthographie in der slavischen Interkomprehension mit besonderem Fokus auf die kyrillische Schrift},
author = {Irina Stenger},
year = {2019},
date = {2019},
school = {Saarland University},
address = {Saarbr{\"u}cken, Germany},
abstract = {The Slavic languages constitute a major branch of the Indo-European language family. This raises the question of how far speakers of different Slavic languages can communicate through intercomprehension. Intercomprehension is understood as the ability of speakers of related languages to communicate with each other, with each speaker using their own language. This thesis investigates the orthographic intelligibility of Slavic languages written in Cyrillic script in intercomprehensive reading. Six East and South Slavic languages (Bulgarian, Macedonian, Russian, Serbian, Ukrainian and Belarusian) are compared with regard to orthographic similarities and differences and analysed statistically. The empirical study focuses on the recognition of individual cognates with diachronically motivated orthographic correspondences in East and South Slavic languages, taking Russian as the starting point. The methods presented and the results obtained in this thesis are an empirical contribution to research on Slavic intercomprehension and to intercomprehension didactics.},
pubstate = {published},
type = {phdthesis}
}

Project:   C4

Stenger, Irina; Avgustinova, Tania; Belousov, Konstantin I.; Baranov, Dmitrij A.; Erofeeva, Elena V.

Interaction of linguistic and socio-cognitive factors in receptive multilingualism [Vzaimodejstvie lingvističeskich i sociokognitivnych parametrov pri receptivnom mul’tilingvisme] Inproceedings

25th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue 2019), Moscow, Russia, 2019.

@inproceedings{Stenger2019,
title = {Interaction of linguistic and socio-cognitive factors in receptive multilingualism [Vzaimodejstvie lingvisti{\v{c}}eskich i sociokognitivnych parametrov pri receptivnom mul’tilingvisme]},
author = {Irina Stenger and Tania Avgustinova and Konstantin I. Belousov and Dmitrij A. Baranov and Elena V. Erofeeva},
url = {http://www.dialog-21.ru/digest/2019/online/},
year = {2019},
date = {2019},
booktitle = {25th International Conference on Computational Linguistics and Intellectual Technologies (Dialogue 2019)},
address = {Moscow, Russia},
pubstate = {published},
type = {inproceedings}
}

Project:   C4
