Publications

Yung, Frances Pik Yu; Demberg, Vera

Do speakers produce discourse connectives rationally? Inproceedings

Proceedings of the Eight Workshop on Cognitive Aspects of Computational Language Learning and Processing, Association for Computational Linguistics, pp. 6-16, Melbourne, Australia, 2018.

A number of different discourse connectives can be used to mark the same discourse relation, but it is unclear what factors affect connective choice. One recent account is the Rational Speech Acts theory, which predicts that speakers try to maximize the informativeness of an utterance such that the listener can interpret the intended meaning correctly. Existing prior work uses referential language games to test the rational account of speakers' production of concrete meanings, such as identification of objects within a picture. Building on the same paradigm, we design a novel Discourse Continuation Game to investigate speakers' production of abstract discourse relations. Experimental results reveal that speakers significantly prefer a more informative connective, in line with predictions of the RSA model.

@inproceedings{Yung2019b,
title = {Do speakers produce discourse connectives rationally?},
author = {Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/W18-2802},
doi = {https://doi.org/10.18653/v1/W18-2802},
year = {2018},
date = {2018},
booktitle = {Proceedings of the Eight Workshop on Cognitive Aspects of Computational Language Learning and Processing},
pages = {6-16},
publisher = {Association for Computational Linguistics},
address = {Melbourne, Australia},
abstract = {A number of different discourse connectives can be used to mark the same discourse relation, but it is unclear what factors affect connective choice. One recent account is the Rational Speech Acts theory, which predicts that speakers try to maximize the informativeness of an utterance such that the listener can interpret the intended meaning correctly. Existing prior work uses referential language games to test the rational account of speakers{'} production of concrete meanings, such as identification of objects within a picture. Building on the same paradigm, we design a novel Discourse Continuation Game to investigate speakers{'} production of abstract discourse relations. Experimental results reveal that speakers significantly prefer a more informative connective, in line with predictions of the RSA model.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Gessinger, Iona; Schweitzer, Antje; Andreeva, Bistra; Raveh, Eran; Möbius, Bernd; Steiner, Ingmar

Convergence of Pitch Accents in a Shadowing Task Inproceedings

Proceedings of the 9th International Conference on Speech Prosody, Speech Prosody Special Interest Group, pp. 225-229, Poznań, Poland, 2018.

In the present study, a corpus of short German sentences collected in a shadowing task was examined with respect to pitch accent realization. The pitch accents were parameterized with the PaIntE model, which describes the f0 contour of intonation events concerning their height, slope, and temporal alignment. Convergence was quantified as decrease in Euclidean distance, and hence increase in similarity, between the PaIntE parameter vectors. This was assessed for three stimulus types: natural speech, diphone based speech synthesis, or HMM based speech synthesis. The factors tested in the analysis were experimental phase – was the sentence uttered before or while shadowing the model, accent type – a distinction was made between prenuclear and nuclear pitch accents, and sex of speaker and shadowed model. For the natural and HMM stimuli, Euclidean distance decreased in the shadowing task. This convergence effect did not depend on the accent type. However, prenuclear pitch accents showed generally lower values in Euclidean distance than nuclear pitch accents. Whether the sex of the speaker and the shadowed model matched did not explain any variance in the data. For the diphone stimuli, no convergence of pitch accents was observed.

@inproceedings{Gessinger2018SP,
title = {Convergence of Pitch Accents in a Shadowing Task},
author = {Iona Gessinger and Antje Schweitzer and Bistra Andreeva and Eran Raveh and Bernd M{\"o}bius and Ingmar Steiner},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/29618},
doi = {https://doi.org/10.21437/SpeechProsody.2018-46},
year = {2018},
date = {2018},
booktitle = {Proceedings of the 9th International Conference on Speech Prosody},
pages = {225-229},
publisher = {Speech Prosody Special Interest Group},
address = {Pozna{\'n}, Poland},
abstract = {In the present study, a corpus of short German sentences collected in a shadowing task was examined with respect to pitch accent realization. The pitch accents were parameterized with the PaIntE model, which describes the f0 contour of intonation events concerning their height, slope, and temporal alignment. Convergence was quantified as decrease in Euclidean distance, and hence increase in similarity, between the PaIntE parameter vectors. This was assessed for three stimulus types: natural speech, diphone based speech synthesis, or HMM based speech synthesis. The factors tested in the analysis were experimental phase - was the sentence uttered before or while shadowing the model, accent type - a distinction was made between prenuclear and nuclear pitch accents, and sex of speaker and shadowed model. For the natural and HMM stimuli, Euclidean distance decreased in the shadowing task. This convergence effect did not depend on the accent type. However, prenuclear pitch accents showed generally lower values in Euclidean distance than nuclear pitch accents. Whether the sex of the speaker and the shadowed model matched did not explain any variance in the data. For the diphone stimuli, no convergence of pitch accents was observed.},
pubstate = {published},
type = {inproceedings}
}

Project:   C5

Sanders, Ted J. M.; Demberg, Vera; Hoek, Jet; Scholman, Merel; Torabi Asr, Fatemeh; Zufferey, Sandrine; Evers-Vermeul, Jacqueline

Unifying dimensions in coherence relations: How various annotation frameworks are related Journal Article

Corpus Linguistics and Linguistic Theory, 2018.

In this paper, we show how three often used and seemingly different discourse annotation frameworks – Penn Discourse Treebank (PDTB), Rhetorical Structure Theory (RST), and Segmented Discourse Representation Theory – can be related by using a set of unifying dimensions. These dimensions are taken from the Cognitive approach to Coherence Relations and combined with more fine-grained additional features from the frameworks themselves to yield a posited set of dimensions that can successfully map three frameworks. The resulting interface will allow researchers to find identical or at least closely related relations within sets of annotated corpora, even if they are annotated within different frameworks. Furthermore, we tested our unified dimension (UniDim) approach by comparing PDTB and RST annotations of identical newspaper texts and converting their original end label annotations of relations into the accompanying values per dimension. Subsequently, rates of overlap in the attributed values per dimension were analyzed. Results indicate that the proposed dimensions indeed create an interface that makes existing annotation systems “talk to each other.”

@article{Sanders2018,
title = {Unifying dimensions in coherence relations: How various annotation frameworks are related},
author = {Ted J. M. Sanders and Vera Demberg and Jet Hoek and Merel Scholman and Fatemeh Torabi Asr and Sandrine Zufferey and Jacqueline Evers-Vermeul},
url = {https://www.degruyter.com/document/doi/10.1515/cllt-2016-0078/html},
doi = {https://doi.org/10.1515/cllt-2016-0078},
year = {2018},
date = {2018-05-22},
journal = {Corpus Linguistics and Linguistic Theory},
abstract = {In this paper, we show how three often used and seemingly different discourse annotation frameworks – Penn Discourse Treebank (PDTB), Rhetorical Structure Theory (RST), and Segmented Discourse Representation Theory – can be related by using a set of unifying dimensions. These dimensions are taken from the Cognitive approach to Coherence Relations and combined with more fine-grained additional features from the frameworks themselves to yield a posited set of dimensions that can successfully map three frameworks. The resulting interface will allow researchers to find identical or at least closely related relations within sets of annotated corpora, even if they are annotated within different frameworks. Furthermore, we tested our unified dimension (UniDim) approach by comparing PDTB and RST annotations of identical newspaper texts and converting their original end label annotations of relations into the accompanying values per dimension. Subsequently, rates of overlap in the attributed values per dimension were analyzed. Results indicate that the proposed dimensions indeed create an interface that makes existing annotation systems “talk to each other.”},
pubstate = {published},
type = {article}
}

Project:   B2

Steiner, Ingmar; Le Maguer, Sébastien

Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform Inproceedings

11th Language Resources and Evaluation Conference (LREC), pp. 3171-3175, Miyazaki, Japan, 2018.

We present a new workflow to create components for the MaryTTS text-to-speech synthesis platform, which is popular with researchers and developers, extending it to support new languages and custom synthetic voices. This workflow replaces the previous toolkit with an efficient, flexible process that leverages modern build automation and cloud-hosted infrastructure. Moreover, it is compatible with the updated MaryTTS architecture, enabling new features and state-of-the-art paradigms such as synthesis based on deep neural networks (DNNs). Like MaryTTS itself, the new tools are free, open source software (FOSS), and promote the use of open data.

@inproceedings{Steiner2018LREC,
title = {Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform},
author = {Ingmar Steiner and S{\'e}bastien Le Maguer},
url = {https://arxiv.org/abs/1712.04787},
year = {2018},
date = {2018-05-10},
booktitle = {11th Language Resources and Evaluation Conference (LREC)},
pages = {3171-3175},
address = {Miyazaki, Japan},
abstract = {We present a new workflow to create components for the MaryTTS text-to-speech synthesis platform, which is popular with researchers and developers, extending it to support new languages and custom synthetic voices. This workflow replaces the previous toolkit with an efficient, flexible process that leverages modern build automation and cloud-hosted infrastructure. Moreover, it is compatible with the updated MaryTTS architecture, enabling new features and state-of-the-art paradigms such as synthesis based on deep neural networks (DNNs). Like MaryTTS itself, the new tools are free, open source software (FOSS), and promote the use of open data.},
pubstate = {published},
type = {inproceedings}
}

Project:   C5

Zimmerer, Frank; Brandt, Erika; Andreeva, Bistra; Möbius, Bernd

Idiomatic or literal? Production of collocations in German read speech Inproceedings

Proc. Speech Prosody 2018, pp. 428-432, Poznan, 2018.

Collocations have been identified as an interesting field to study the effects of frequency of occurrence in language and speech. We report results of a production experiment including a duration analysis based on the production of German collocations. The collocations occurred in a condition where the phrase was produced with a literal meaning and in another condition where it was idiomatic. A durational difference was found for the collocations, which were reduced in the idiomatic condition. This difference was also observed for the function word und (‘and’) in collocations like Mord und Totschlag (‘murder and manslaughter’). However, an analysis of the vowel /U/ of the function word did not show a durational difference. Some explanations as to why speakers showed different patterns of reduction (not all collocations were produced with a shorter duration in the idiomatic condition by all speakers) and why not all speakers use the durational cue (one out of eight speakers produced the conditions identically) are proposed.

@inproceedings{Zimmerer2018SpPro,
title = {Idiomatic or literal? Production of collocations in German read speech},
author = {Frank Zimmerer and Erika Brandt and Bistra Andreeva and Bernd M{\"o}bius},
url = {https://www.isca-speech.org/archive/speechprosody_2018/zimmerer18_speechprosody.html},
doi = {https://doi.org/10.21437/SpeechProsody.2018-87},
year = {2018},
date = {2018},
booktitle = {Proc. Speech Prosody 2018},
pages = {428-432},
address = {Poznan},
abstract = {Collocations have been identified as an interesting field to study the effects of frequency of occurrence in language and speech. We report results of a production experiment including a duration analysis based on the production of German collocations. The collocations occurred in a condition where the phrase was produced with a literal meaning and in another condition where it was idiomatic. A durational difference was found for the collocations, which were reduced in the idiomatic condition. This difference was also observed for the function word und (‘and’) in collocations like Mord und Totschlag (‘murder and manslaughter’). However, an analysis of the vowel /U/ of the function word did not show a durational difference. Some explanations as to why speakers showed different patterns of reduction (not all collocations were produced with a shorter duration in the idiomatic condition by all speakers) and why not all speakers use the durational cue (one out of eight speakers produced the conditions identically) are proposed.},
pubstate = {published},
type = {inproceedings}
}

Project:   C1

Brandt, Erika; Zimmerer, Frank; Andreeva, Bistra; Möbius, Bernd

Impact of prosodic structure and information density on dynamic formant trajectories in German Inproceedings

Klessa, Katarzyna; Bachan, Jolanta; Wagner, Agnieszka; Karpiński, Maciej; Śledziński, Daniel (Ed.): Speech Prosody 2018, Speech Prosody Special Interest Group, pp. 119-123, Urbana, 2018, ISSN 2333-2042.

This study investigated the influence of prosodic structure and information density (ID), defined as contextual predictability, on vowel-inherent spectral change (VISC). We extracted formant measurements from the onset and offset of the vowels of a large German corpus of newspaper read speech. Vector length (VL), the Euclidean distance between F1 and F2 trajectory, and F1 and F2 slope, formant deltas of onset and offset relative to vowel duration, were calculated as measures of formant change. ID factors were word frequency and phoneme-based surprisal measures, while the prosodic factors contained global and local articulation rate, primary lexical stress, and prosodic boundary. We expected that vowels increased in spectral change when they were difficult to predict from the context, or stood in low-frequency words while controlling for known effects of prosodic structure. The ID effects were assumed to be modulated by prosodic factors to a certain extent. We confirmed our hypotheses for VL, and found expected independent effects of prosody and ID on F1 slope and F2 slope.

@inproceedings{Brandt2018SpPro,
title = {Impact of prosodic structure and information density on dynamic formant trajectories in German},
author = {Erika Brandt and Frank Zimmerer and Bistra Andreeva and Bernd M{\"o}bius},
editor = {Katarzyna Klessa and Jolanta Bachan and Agnieszka Wagner and Maciej Karpiński and Daniel Śledziński},
url = {https://www.researchgate.net/publication/325744530_Impact_of_prosodic_structure_and_information_density_on_dynamic_formant_trajectories_in_German},
doi = {https://doi.org/10.22028/D291-32050},
year = {2018},
date = {2018},
booktitle = {Speech Prosody 2018},
issn = {2333-2042},
pages = {119-123},
publisher = {Speech Prosody Special Interest Group},
address = {Urbana},
abstract = {This study investigated the influence of prosodic structure and information density (ID), defined as contextual predictability, on vowel-inherent spectral change (VISC). We extracted formant measurements from the onset and offset of the vowels of a large German corpus of newspaper read speech. Vector length (VL), the Euclidean distance between F1 and F2 trajectory, and F1 and F2 slope, formant deltas of onset and offset relative to vowel duration, were calculated as measures of formant change. ID factors were word frequency and phoneme-based surprisal measures, while the prosodic factors contained global and local articulation rate, primary lexical stress, and prosodic boundary. We expected that vowels increased in spectral change when they were difficult to predict from the context, or stood in low-frequency words while controlling for known effects of prosodic structure. The ID effects were assumed to be modulated by prosodic factors to a certain extent. We confirmed our hypotheses for VL, and found expected independent effects of prosody and ID on F1 slope and F2 slope.},
pubstate = {published},
type = {inproceedings}
}

Project:   C1

Shen, Xiaoyu; Su, Hui; Niu, Shuzi; Demberg, Vera

Improving Variational Encoder-Decoders in Dialogue Generation Inproceedings

32nd AAAI Conference on Artificial Intelligence (AAAI-18), New Orleans, USA, 2018.

Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, yielding the KL-vanishing problem and inconsistent training objective. In this paper, we separate the training step into two phases: The first phase learns to autoencode discrete texts into continuous embeddings, from which the second phase learns to generalize latent representations by reconstructing the encoded embedding. In this case, latent variables are sampled by transforming Gaussian noise through multi-layer perceptrons and are trained with a separate VED model, which has the potential of realizing a much more flexible distribution. We compare our model with current popular models and the experiment demonstrates substantial improvement in both metric-based and human evaluations.

@inproceedings{Shen2018,
title = {Improving Variational Encoder-Decoders in Dialogue Generation},
author = {Xiaoyu Shen and Hui Su and Shuzi Niu and Vera Demberg},
url = {https://arxiv.org/abs/1802.02032},
year = {2018},
date = {2018-02-02},
booktitle = {32nd AAAI Conference on Artificial Intelligence (AAAI-18)},
address = {New Orleans, USA},
abstract = {Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, yielding the KL-vanishing problem and inconsistent training objective. In this paper, we separate the training step into two phases: The first phase learns to autoencode discrete texts into continuous embeddings, from which the second phase learns to generalize latent representations by reconstructing the encoded embedding. In this case, latent variables are sampled by transforming Gaussian noise through multi-layer perceptrons and are trained with a separate VED model, which has the potential of realizing a much more flexible distribution. We compare our model with current popular models and the experiment demonstrates substantial improvement in both metric-based and human evaluations.},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Steiner, Ingmar; Le Maguer, Sébastien; Hewer, Alexander

Synthesis of Tongue Motion and Acoustics from Text using a Multimodal Articulatory Database Journal Article

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25, pp. 2351-2361, 2017.

We present an end-to-end text-to-speech (TTS) synthesis system that generates audio and synchronized tongue motion directly from text. This is achieved by adapting a 3D model of the tongue surface to an articulatory dataset and training a statistical parametric speech synthesis system directly on the tongue model parameters. We evaluate the model at every step by comparing the spatial coordinates of predicted articulatory movements against the reference data. The results indicate a global mean Euclidean distance of less than 2.8 mm, and our approach can be adapted to add an articulatory modality to conventional TTS applications without the need for extra data.

@article{Steiner2017TASLP,
title = {Synthesis of Tongue Motion and Acoustics from Text using a Multimodal Articulatory Database},
author = {Ingmar Steiner and S{\'e}bastien Le Maguer and Alexander Hewer},
url = {https://arxiv.org/abs/1612.09352},
year = {2017},
date = {2017},
journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
pages = {2351-2361},
volume = {25},
number = {12},
abstract = {We present an end-to-end text-to-speech (TTS) synthesis system that generates audio and synchronized tongue motion directly from text. This is achieved by adapting a 3D model of the tongue surface to an articulatory dataset and training a statistical parametric speech synthesis system directly on the tongue model parameters. We evaluate the model at every step by comparing the spatial coordinates of predicted articulatory movements against the reference data. The results indicate a global mean Euclidean distance of less than 2.8 mm, and our approach can be adapted to add an articulatory modality to conventional TTS applications without the need for extra data.},
pubstate = {published},
type = {article}
}

Project:   C5

Fischer, Andrea; Vreeken, Jilles; Klakow, Dietrich

Beyond Pairwise Similarity: Quantifying and Characterizing Linguistic Similarity between Groups of Languages by MDL Journal Article

Computación y Sistemas, 21, pp. 829-839, 2017.

We present a minimum description length based algorithm for finding the regular correspondences between related languages and show how it can be used to quantify the similarity between not only pairs, but whole groups of languages directly from cognate sets. We employ a two-part code, which allows to use the data and model complexity of the discovered correspondences as information-theoretic quantifications of the degree of regularity of cognate realizations in these languages. Unlike previous work, our approach is not limited to pairs of languages, does not limit the size of discovered correspondences, does not make assumptions about the shape or distribution of correspondences, and requires no expert knowledge or fine-tuning of parameters. We here test our approach on the Slavic languages. In a pairwise analysis of 13 Slavic languages, we show that our algorithm replicates their linguistic classification exactly. In a four-language experiment, we demonstrate how our algorithm efficiently quantifies similarity between all subsets of the analyzed four languages and find that it is excellently suited to quantifying the orthographic regularity of closely-related languages.

@article{Fischer2017,
title = {Beyond Pairwise Similarity: Quantifying and Characterizing Linguistic Similarity between Groups of Languages by MDL},
author = {Andrea Fischer and Jilles Vreeken and Dietrich Klakow},
url = {http://www.cys.cic.ipn.mx/ojs/index.php/CyS/article/view/2865},
year = {2017},
date = {2017},
journal = {Computación y Sistemas},
pages = {829-839},
volume = {21},
number = {4},
abstract = {We present a minimum description length based algorithm for finding the regular correspondences between related languages and show how it can be used to quantify the similarity between not only pairs, but whole groups of languages directly from cognate sets. We employ a two-part code, which allows to use the data and model complexity of the discovered correspondences as information-theoretic quantifications of the degree of regularity of cognate realizations in these languages. Unlike previous work, our approach is not limited to pairs of languages, does not limit the size of discovered correspondences, does not make assumptions about the shape or distribution of correspondences, and requires no expert knowledge or fine-tuning of parameters. We here test our approach on the Slavic languages. In a pairwise analysis of 13 Slavic languages, we show that our algorithm replicates their linguistic classification exactly. In a four-language experiment, we demonstrate how our algorithm efficiently quantifies similarity between all subsets of the analyzed four languages and find that it is excellently suited to quantifying the orthographic regularity of closely-related languages.},
pubstate = {published},
type = {article}
}

Project:   C4

Jágrová, Klára; Stenger, Irina; Marti, Roland; Avgustinova, Tania

Lexical and orthographic distances between Bulgarian, Czech, Polish, and Russian: A comparative analysis of the most frequent nouns Inproceedings

Joseph Emonds & Markéta Janebová (eds.), Language Use and Linguistic Structure. Proceedings of the Olomouc Linguistics Colloquium 2016, pp. 401–416, Olomouc: Palacký University, 2017.

@inproceedings{Klára2017,
title = {Lexical and orthographic distances between Bulgarian, Czech, Polish, and Russian: A comparative analysis of the most frequent nouns},
author = {Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger and Roland Marti and Tania Avgustinova},
year = {2017},
date = {2017},
booktitle = {Joseph Emonds \& Mark{\'e}ta Janebov{\'a} (eds.), Language Use and Linguistic Structure. Proceedings of the Olomouc Linguistics Colloquium 2016},
pages = {401-416},
address = {Olomouc: Palacký University},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Stenger, Irina; Jágrová, Klára; Fischer, Andrea; Avgustinova, Tania; Klakow, Dietrich; Marti, Roland

Modeling the Impact of Orthographic Coding on Czech-Polish and Bulgarian-Russian Reading Intercomprehension Journal Article

Nordic Journal of Linguistics, 40, pp. 175-199, 2017.

Focusing on orthography as a primary linguistic interface in every reading activity, the central research question we address here is how orthographic intelligibility can be measured and predicted between closely related languages. This paper presents methods and findings of modeling orthographic intelligibility in a reading intercomprehension scenario from the information-theoretic perspective. The focus of the study is on two Slavic language pairs: Czech–Polish (West Slavic, using the Latin script) and Bulgarian–Russian (South Slavic and East Slavic, respectively, using the Cyrillic script). In this article, we present computational methods for measuring orthographic distance and orthographic asymmetry by means of the Levenshtein algorithm, conditional entropy and adaptation surprisal method that are expected to predict the influence of orthography on mutual intelligibility in reading.

@article{Stenger2017b,
title = {Modeling the Impact of Orthographic Coding on Czech-Polish and Bulgarian-Russian Reading Intercomprehension},
author = {Irina Stenger and Kl{\'a}ra J{\'a}grov{\'a} and Andrea Fischer and Tania Avgustinova and Dietrich Klakow and Roland Marti},
url = {https://www.cambridge.org/core/journals/nordic-journal-of-linguistics/article/modeling-the-impact-of-orthographic-coding-on-czechpolish-and-bulgarianrussian-reading-intercomprehension/363BEB5C556DFBDAC7FEED0AE06B06AA},
year = {2017},
date = {2017},
journal = {Nordic Journal of Linguistics},
pages = {175-199},
volume = {40},
number = {2},
abstract = {Focusing on orthography as a primary linguistic interface in every reading activity, the central research question we address here is how orthographic intelligibility can be measured and predicted between closely related languages. This paper presents methods and findings of modeling orthographic intelligibility in a reading intercomprehension scenario from the information-theoretic perspective. The focus of the study is on two Slavic language pairs: Czech–Polish (West Slavic, using the Latin script) and Bulgarian–Russian (South Slavic and East Slavic, respectively, using the Cyrillic script). In this article, we present computational methods for measuring orthographic distance and orthographic asymmetry by means of the Levenshtein algorithm, conditional entropy and adaptation surprisal method that are expected to predict the influence of orthography on mutual intelligibility in reading.},
pubstate = {published},
type = {article}
}

Project:   C4

Stenger, Irina; Avgustinova, Tania; Marti, Roland

Levenshtein distance and word adaptation surprisal as methods of measuring mutual intelligibility in reading comprehension of Slavic languages Inproceedings

Computational Linguistics and Intellectual Technologies: International Conference "Dialogue 2017", 1, pp. 304-317, 2017.

In this article we validate two measuring methods: Levenshtein distance and word adaptation surprisal as potential predictors of success in reading intercomprehension. We investigate to what extent orthographic distances between Russian and other East Slavic (Ukrainian, Belarusian) and South Slavic (Bulgarian, Macedonian, Serbian) languages found by means of the Levenshtein algorithm and word adaptation surprisal correlate with comprehension of unknown Slavic languages on the basis of data obtained from Russian native speakers in online free translation task experiments. We try to find an answer to the following question: Can measuring methods such as Levenshtein distance and word adaptation surprisal be considered as a good approximation of orthographic intelligibility of unknown Slavic languages using the Cyrillic script?

@inproceedings{Stenger2017,
title = {Levenshtein distance and word adaptation surprisal as methods of measuring mutual intelligibility in reading comprehension of Slavic languages},
author = {Irina Stenger and Tania Avgustinova and Roland Marti},
url = {https://www.semanticscholar.org/paper/Levenshtein-Distance-anD-WorD-aDaptation-surprisaL-Distance/6103d388cb0398b89dec8ca36ec0be025bb6dea2},
year = {2017},
date = {2017},
booktitle = {Computational Linguistics and Intellectual Technologies: International Conference "Dialogue 2017"},
pages = {304-317},
abstract = {In this article we validate two measuring methods: Levenshtein distance and word adaptation surprisal as potential predictors of success in reading intercomprehension. We investigate to what extent orthographic distances between Russian and other East Slavic (Ukrainian, Belarusian) and South Slavic (Bulgarian, Macedonian, Serbian) languages found by means of the Levenshtein algorithm and word adaptation surprisal correlate with comprehension of unknown Slavic languages on the basis of data obtained from Russian native speakers in online free translation task experiments. We try to find an answer to the following question: Can measuring methods such as Levenshtein distance and word adaptation surprisal be considered as a good approximation of orthographic intelligibility of unknown Slavic languages using the Cyrillic script?},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Jágrová, Klára; Stenger, Irina; Avgustinova, Tania; Marti, Roland

POLSKI TO JEZYK NIESKOMPLIKOWANY? Theoretische und praktische Interkomprehension der 100 häufigsten polnischen Substantive Journal Article

Polnisch in Deutschland. Zeitschrift der Bundesvereinigung der Polnischlehrkräfte, 4/2016, pp. 5-19, 2017.

@article{Jágrová2017,
title = {POLSKI TO JEZYK NIESKOMPLIKOWANY? Theoretische und praktische Interkomprehension der 100 h{\"a}ufigsten polnischen Substantive},
author = {Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger and Tania Avgustinova and Roland Marti},
year = {2017},
date = {2017},
journal = {Polnisch in Deutschland. Zeitschrift der Bundesvereinigung der Polnischlehrkr{\"a}fte},
pages = {5-19},
volume = {4/2016},
pubstate = {published},
type = {article}
}

Project:   C4

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

The interplay of specificity and referential entropy reduction in situated communication Inproceedings

10th Annual Embodied and Situated Language (ESLP) Conference, Higher School of Economics, Moscow, Russia, 2017.

In situated communication, reference can be established with expressions conveying either precise (Minimally-Specified, MS) or redundant (Over-Specified, OS) information. For example, while in Figure 1, “Find the blue ball” identifies exactly one object in all panels, only in the top displays is the adjective required. There is no consensus, however, concerning whether OS hinders processing (e.g., Engelhardt et al., 2011) or not (e.g., Tourtouri et al., 2015). Additionally, as incoming words incrementally restrict the referential domain, they contribute to the reduction of uncertainty regarding the target (i.e., referential entropy). Depending on the distribution of objects, the same utterance results in different entropy reduction profiles: “blue” reduces entropy by 1.58 bits in the right panels, and by .58 bits in the left ones, while “ball” reduces entropy by 1 and 2 bits, respectively. Thus, the adjective modulates the distribution of entropy reduction, resulting in uniform (UR) or non-uniform (NR) reduction profiles. This study seeks to establish whether referential processing is facilitated: a) by the use of redundant pre-nominal modification (OS), b) by the uniform reduction of entropy (cf. Jaeger, 2010), and c) when these two factors interact. Results from inspection probabilities and the Index of Cognitive Activity — a pupillometric measure of cognitive workload (Demberg & Sayeed, 2016) — indicate that processing was facilitated for both OS and UR, while fixation probabilities show a greater advantage for OS-UR. In conclusion, efficient processing is determined by both informativity of the reference and the rate of entropy reduction.

@inproceedings{Tourtourietal2017d,
title = {The interplay of specificity and referential entropy reduction in situated communication},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/322556329_The_interplay_of_specificity_and_referential_entropy_reduction_in_situated_communication},
year = {2017},
date = {2017},
booktitle = {10th Annual Embodied and Situated Language (ESLP) Conference},
publisher = {Higher School of Economics},
address = {Moscow, Russia},
abstract = {In situated communication, reference can be established with expressions conveying either precise (Minimally-Specified, MS) or redundant (Over-Specified, OS) information. For example, while in Figure 1, “Find the blue ball” identifies exactly one object in all panels, only in the top displays is the adjective required. There is no consensus, however, concerning whether OS hinders processing (e.g., Engelhardt et al., 2011) or not (e.g., Tourtouri et al., 2015). Additionally, as incoming words incrementally restrict the referential domain, they contribute to the reduction of uncertainty regarding the target (i.e., referential entropy). Depending on the distribution of objects, the same utterance results in different entropy reduction profiles: “blue” reduces entropy by 1.58 bits in the right panels, and by .58 bits in the left ones, while “ball” reduces entropy by 1 and 2 bits, respectively. Thus, the adjective modulates the distribution of entropy reduction, resulting in uniform (UR) or non-uniform (NR) reduction profiles. This study seeks to establish whether referential processing is facilitated: a) by the use of redundant pre-nominal modification (OS), b) by the uniform reduction of entropy (cf. Jaeger, 2010), and c) when these two factors interact. Results from inspection probabilities and the Index of Cognitive Activity — a pupillometric measure of cognitive workload (Demberg & Sayeed, 2016) — indicate that processing was facilitated for both OS and UR, while fixation probabilities show a greater advantage for OS-UR. In conclusion, efficient processing is determined by both informativity of the reference and the rate of entropy reduction.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A1 C3

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

Overspecification and uniform reduction of visual entropy facilitate referential processing Inproceedings

XPrag2017, Cologne, Germany, 2017.

Over-specifications (OS) are expressions that provide more information than minimally required for the identification of a referent, thereby violating Grice’s 2nd Quantity Maxim [1]. In Figure 1, for example, the expression “Find the blue ball” identifies exactly one object in all panels, but only in the top displays is the adjective required to disambiguate the target. In recent years, psycholinguistic research has tried to test the empirical validity of Grice’s Maxim, resulting in conflicting findings. That is, there is evidence both that OS hinders [2,3] and that it facilitates [4,5] referential processing. The current study investigates the influence of OS on visually-situated processing, when the context allows both a minimally-specified (MS) and an OS interpretation of pre-nominal adjectives (cf. Fig.1). Additionally, as the utterance unfolds over time, incoming words incrementally restrict the search space. In this sense, information on “blue” and “ball” is determined not only by their probability to occur in this context, but also by the amount of uncertainty about the target they reduce — in information theoretic terms [6]. A greater reduction of the referential set size on the adjective (A&C) results in a more uniform reduction profile (Uniform Reduction, UR), as the adjective reduces entropy by 1.58 bits and the noun by 1 bit. On the other hand, a moderate reduction of the set size on the adjective (B&D) results in a less uniform reduction profile (Nonuniform Reduction, NR): the adjective reduces entropy by .58 bits and the noun by 2 bits. This study examines whether, above and beyond any effects of specificity, the rate at which incoming words reduce visual entropy also affects referential processing.

@inproceedings{Tourtourietal2017b,
title = {Overspecification and uniform reduction of visual entropy facilitate referential processing},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/322571202_Over-specification_Uniform_Reduction_of_Visual_Entropy_Facilitate_Referential_Processing},
year = {2017},
date = {2017},
booktitle = {XPrag2017},
address = {Cologne, Germany},
abstract = {Over-specifications (OS) are expressions that provide more information than minimally required for the identification of a referent, thereby violating Grice’s 2nd Quantity Maxim [1]. In Figure 1, for example, the expression “Find the blue ball” identifies exactly one object in all panels, but only in the top displays is the adjective required to disambiguate the target. In recent years, psycholinguistic research has tried to test the empirical validity of Grice’s Maxim, resulting in conflicting findings. That is, there is evidence both that OS hinders [2,3] and that it facilitates [4,5] referential processing. The current study investigates the influence of OS on visually-situated processing, when the context allows both a minimally-specified (MS) and an OS interpretation of pre-nominal adjectives (cf. Fig.1). Additionally, as the utterance unfolds over time, incoming words incrementally restrict the search space. In this sense, information on “blue” and “ball” is determined not only by their probability to occur in this context, but also by the amount of uncertainty about the target they reduce — in information theoretic terms [6]. A greater reduction of the referential set size on the adjective (A&C) results in a more uniform reduction profile (Uniform Reduction, UR), as the adjective reduces entropy by 1.58 bits and the noun by 1 bit. On the other hand, a moderate reduction of the set size on the adjective (B&D) results in a less uniform reduction profile (Nonuniform Reduction, NR): the adjective reduces entropy by .58 bits and the noun by 2 bits. This study examines whether, above and beyond any effects of specificity, the rate at which incoming words reduce visual entropy also affects referential processing.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A1 C3

Jachmann, Torsten; Drenhaus, Heiner; Staudte, Maria; Crocker, Matthew W.

The Influence of Speaker's Gaze on Sentence Comprehension: An ERP Investigation Inproceedings

Proceedings of the 39th Annual Conference of the Cognitive Science Society, pp. 2261-2266, 2017.

Behavioral studies demonstrate the influence of speaker gaze in visually-situated spoken language comprehension. We present an ERP experiment examining the influence of speaker’s gaze congruency on listeners’ comprehension of referential expressions related to a shared visual scene. We demonstrate that listeners exploit speakers’ gaze toward objects in order to form sentence continuation expectations: Compared to a congruent gaze condition, we observe an increased N400 when (a) the lack of gaze (neutral) does not allow for upcoming noun prediction, and (b) when the noun violates gaze-driven expectations (incongruent). The latter also results in a late (sustained) positivity, reflecting the need to update the assumed situation model. We take the combination of the N400 and late positivity as evidence that speaker gaze influences both lexical retrieval and integration processes, respectively (Brouwer et al., in press). Moreover, speaker gaze is interpreted as reflecting referential intentions (Staudte & Crocker, 2011).

@inproceedings{Jachmann2017,
title = {The Influence of Speaker's Gaze on Sentence Comprehension: An ERP Investigation},
author = {Torsten Jachmann and Heiner Drenhaus and Maria Staudte and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/325969989_The_Influence_of_Speaker%27s_Gaze_on_Sentence_Comprehension_An_ERP_Investigation},
year = {2017},
date = {2017},
booktitle = {Proceedings of the 39th Annual Conference of the Cognitive Science Society},
pages = {2261-2266},
abstract = {Behavioral studies demonstrate the influence of speaker gaze in visually-situated spoken language comprehension. We present an ERP experiment examining the influence of speaker’s gaze congruency on listeners’ comprehension of referential expressions related to a shared visual scene. We demonstrate that listeners exploit speakers’ gaze toward objects in order to form sentence continuation expectations: Compared to a congruent gaze condition, we observe an increased N400 when (a) the lack of gaze (neutral) does not allow for upcoming noun prediction, and (b) when the noun violates gaze-driven expectations (incongruent). The latter also results in a late (sustained) positivity, reflecting the need to update the assumed situation model. We take the combination of the N400 and late positivity as evidence that speaker gaze influences both lexical retrieval and integration processes, respectively (Brouwer et al., in press). Moreover, speaker gaze is interpreted as reflecting referential intentions (Staudte & Crocker, 2011).},
pubstate = {published},
type = {inproceedings}
}

Project:   C3

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

Specificity and entropy reduction in situated referential processing Inproceedings

39th Annual Conference of the Cognitive Science Society, Austin, Texas, USA, 2017.

In situated communication, reference to an entity in the shared visual context can be established using either an expression that conveys precise (minimally specified) or redundant (over-specified) information. There is, however, a long-lasting debate in psycholinguistics concerning whether the latter hinders referential processing. We present evidence from an eyetracking experiment recording fixations as well as the Index of Cognitive Activity – a novel measure of cognitive workload – supporting the view that over-specifications facilitate processing. We further present original evidence that, above and beyond the effect of specificity, referring expressions that uniformly reduce referential entropy also benefit processing.

@inproceedings{Tourtouri2017,
title = {Specificity and entropy reduction in situated referential processing},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
url = {https://www.mpi.nl/publications/item3309545/specificity-and-entropy-reduction-situated-referential-processing},
year = {2017},
date = {2017},
booktitle = {39th Annual Conference of the Cognitive Science Society},
address = {Austin, Texas, USA},
abstract = {In situated communication, reference to an entity in the shared visual context can be established using either an expression that conveys precise (minimally specified) or redundant (over-specified) information. There is, however, a long-lasting debate in psycholinguistics concerning whether the latter hinders referential processing. We present evidence from an eyetracking experiment recording fixations as well as the Index of Cognitive Activity – a novel measure of cognitive workload – supporting the view that over-specifications facilitate processing. We further present original evidence that, above and beyond the effect of specificity, referring expressions that uniformly reduce referential entropy also benefit processing.},
pubstate = {published},
type = {inproceedings}
}

Project:   C3

Sikos, Les; Greenberg, Clayton; Drenhaus, Heiner; Crocker, Matthew W.

Information density of encodings: The role of syntactic variation in comprehension Inproceedings

Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017), pp. 3168-3173, Austin, Texas, USA, 2017.

The Uniform Information Density (UID) hypothesis links production strategies with comprehension processes, predicting that speakers will utilize flexibility in encoding in order to increase uniformity in the rate of information transmission, as measured by surprisal (Jaeger, 2010). Evidence in support of UID comes primarily from studies focusing on word-level effects, e.g. demonstrating that surprisal predicts the omission/inclusion of optional words. Here we investigate whether comprehenders are sensitive to the information density of alternative encodings that are more syntactically complex. We manipulated the syntactic encoding of complex noun phrases in German via meaning-preserving pre-nominal and post-nominal modification in contexts that were either predictive or non-predictive. We then used the G-maze reading task to measure online comprehension during self-paced reading. The results are consistent with the UID hypothesis. Length-adjusted reading times were facilitated for pre-nominally modified head nouns, and this effect was larger in non-predictive contexts.

@inproceedings{Sikos2017,
title = {Information density of encodings: The role of syntactic variation in comprehension},
author = {Les Sikos and Clayton Greenberg and Heiner Drenhaus and Matthew W. Crocker},
url = {https://www.semanticscholar.org/paper/Information-density-of-encodings%3A-The-role-of-in-Sikos-Greenberg/06a47324b53bc53e0e4762fd1547091d8b2392f1},
year = {2017},
date = {2017},
booktitle = {Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017)},
pages = {3168-3173},
address = {Austin, Texas, USA},
abstract = {The Uniform Information Density (UID) hypothesis links production strategies with comprehension processes, predicting that speakers will utilize flexibility in encoding in order to increase uniformity in the rate of information transmission, as measured by surprisal (Jaeger, 2010). Evidence in support of UID comes primarily from studies focusing on word-level effects, e.g. demonstrating that surprisal predicts the omission/inclusion of optional words. Here we investigate whether comprehenders are sensitive to the information density of alternative encodings that are more syntactically complex. We manipulated the syntactic encoding of complex noun phrases in German via meaning-preserving pre-nominal and post-nominal modification in contexts that were either predictive or non-predictive. We then used the G-maze reading task to measure online comprehension during self-paced reading. The results are consistent with the UID hypothesis. Length-adjusted reading times were facilitated for pre-nominally modified head nouns, and this effect was larger in non-predictive contexts.},
pubstate = {published},
type = {inproceedings}
}

Project:   C3

Calvillo, Jesús

Fast and Easy: Approximating uniform information density in language production Inproceedings

39th Annual Conference of the Cognitive Science Society, Austin, Texas, USA, 2017.

A model of sentence production is presented, which implements a strategy that produces sentences with more uniform surprisal profiles, as compared to other strategies, and in accordance to the Uniform Information Density Hypothesis (Jaeger, 2006; Levy & Jaeger, 2007). The model operates at the algorithmic level combining information concerning word probabilities and sentence lengths, representing a first attempt to model UID as resulting from underlying factors during language production. The sentences produced by this model showed indeed the expected tendency, having more uniform surprisal profiles and lower average word surprisal, in comparison to other production strategies.

@inproceedings{Calvillo2017,
title = {Fast and Easy: Approximating uniform information density in language production},
author = {Jes{\'u}s Calvillo},
url = {https://cogsci.mindmodeling.org/2017/papers/0333/paper0333.pdf},
year = {2017},
date = {2017},
booktitle = {39th Annual Conference of the Cognitive Science Society},
address = {Austin, Texas, USA},
abstract = {A model of sentence production is presented, which implements a strategy that produces sentences with more uniform surprisal profiles, as compared to other strategies, and in accordance to the Uniform Information Density Hypothesis (Jaeger, 2006; Levy & Jaeger, 2007). The model operates at the algorithmic level combining information concerning word probabilities and sentence lengths, representing a first attempt to model UID as resulting from underlying factors during language production. The sentences produced by this model showed indeed the expected tendency, having more uniform surprisal profiles and lower average word surprisal, in comparison to other production strategies.},
pubstate = {published},
type = {inproceedings}
}

Project:   C3

Oualil, Youssef

Sequential estimation techniques and application to multiple speaker tracking and language modeling PhD Thesis

Saarland University, Saarbruecken, Germany, 2017.

For many real-world applications, the considered data is given as a time sequence that becomes available in an orderly fashion, where the order incorporates important information about the entities of interest. The work presented in this thesis deals with two such cases by introducing new sequential estimation solutions. More precisely, we introduce a: I. Sequential Bayesian estimation framework to solve the multiple speaker localization, detection and tracking problem. This framework is a complete pipeline that includes 1) new observation estimators, which extract a fixed number of potential locations per time frame; 2) new unsupervised Bayesian detectors, which classify these estimates into noise/speaker classes and 3) new Bayesian filters, which use the speaker class estimates to track multiple speakers.

This framework was developed to tackle the low overlap detection rate of multiple speakers and to reduce the number of constraints generally imposed in standard solutions. II. Sequential neural estimation framework for language modeling, which overcomes some of the shortcomings of standard approaches through merging of different models in a hybrid architecture. That is, we introduce two solutions that tightly merge particular models and then show how a generalization can be achieved through a new mixture model. In order to speed up the training of large vocabulary language models, we introduce a new extension of the noise contrastive estimation approach to batch training.

@phdthesis{Oualil2017b,
title = {Sequential estimation techniques and application to multiple speaker tracking and language modeling},
author = {Youssef Oualil},
url = {http://nbn-resolving.de/urn:nbn:de:bsz:291-scidok-ds-272280},
doi = {https://doi.org/10.22028/D291-27228},
year = {2017},
date = {2017},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {For many real-world applications, the considered data is given as a time sequence that becomes available in an orderly fashion, where the order incorporates important information about the entities of interest. The work presented in this thesis deals with two such cases by introducing new sequential estimation solutions. More precisely, we introduce a: I. Sequential Bayesian estimation framework to solve the multiple speaker localization, detection and tracking problem. This framework is a complete pipeline that includes 1) new observation estimators, which extract a fixed number of potential locations per time frame; 2) new unsupervised Bayesian detectors, which classify these estimates into noise/speaker classes and 3) new Bayesian filters, which use the speaker class estimates to track multiple speakers. This framework was developed to tackle the low overlap detection rate of multiple speakers and to reduce the number of constraints generally imposed in standard solutions. II. Sequential neural estimation framework for language modeling, which overcomes some of the shortcomings of standard approaches through merging of different models in a hybrid architecture. That is, we introduce two solutions that tightly merge particular models and then show how a generalization can be achieved through a new mixture model. In order to speed up the training of large vocabulary language models, we introduce a new extension of the noise contrastive estimation approach to batch training.},
pubstate = {published},
type = {phdthesis}
}

Project:   B4
