Publications - SFB 1102

Hoek, Jet; Scholman, Merel

Expressing non-volitional causality in English Book Chapter

Jędrzejowski, Łukasz; Fleczoreck, Constanze (Ed.): Micro- and Macro-variation of Causal Clauses: Synchronic and Diachronic Insights, John Benjamins Publishing Company, pp. 167–183, Amsterdam, 2023.

Abstract
|
Links
|
BibTeX

English because is assumed to be polysemous in that it can be used to mark causal relations in all domains. The current study examines this claim and explores the suitability of because to mark non-volitional content relations. In a parallel corpus study, we investigate how causal relations translated into Dutch using doordat (prototypically marking non-volitional causal relations), omdat (marking content relations), and want (marking epistemic and speech act relations) were originally expressed in English. The results show that while omdat and want are indeed typically translations of because in English, non-volitional doordat is not. A qualitative analysis reveals that non-volitional causality is more often expressed in English in a single discourse unit or using a connective restricted to the content domain. These findings have important consequences for the presumed domain generality of English because and call for a reconsideration of English translation recommendations for doordat.

https://benjamins.com/catalog/slcs.231.06hoe

@inbook{hoek-scholman-2023,
title = {Expressing non-volitional causality in English},
author = {Jet Hoek and Merel Scholman},
editor = {Łukasz Jędrzejowski and Constanze Fleczoreck},
url = {https://benjamins.com/catalog/slcs.231.06hoe},
year = {2023},
date = {2023},
booktitle = {Micro- and Macro-variation of Causal Clauses: Synchronic and Diachronic Insights},
pages = {167–183},
publisher = {John Benjamins Publishing Company},
address = {Amsterdam},
abstract = {

English because is assumed to be polysemous in that it can be used to mark causal relations in all domains. The current study examines this claim and explores the suitability of because to mark non-volitional content relations. In a parallel corpus study, we investigate how causal relations translated into Dutch using doordat (prototypically marking non-volitional causal relations), omdat (marking content relations), and want (marking epistemic and speech act relations) were originally expressed in English. The results show that while omdat and want are indeed typically translations of because in English, non-volitional doordat is not. A qualitative analysis reveals that non-volitional causality is more often expressed in English in a single discourse unit or using a connective restricted to the content domain. These findings have important consequences for the presumed domain generality of English because and call for a reconsideration of English translation recommendations for doordat.

},
pubstate = {published},
type = {inbook}
}

Copy BibTeX to Clipboard

Project: B2

Marchal, Marian; Scholman, Merel; Demberg, Vera

How Statistical Correlations Influence Discourse-Level Processing: Clause Type as a Cue for Discourse Relations Journal Article

Journal of Experimental Psychology: Learning, Memory, and Cognition, Advance online publication, 2023.

Abstract
|
Links
|
BibTeX

Linguistic phenomena (e.g., words and syntactic structure) co-occur with a wide variety of meanings. These systematic correlations can help readers to interpret a text and create predictions about upcoming material. However, to what extent these correlations influence discourse processing is still unknown. We address this question by examining whether clause type serves as a cue for discourse relations. We found that the co-occurrence of gerund-free adjuncts and specific discourse relations found in natural language is also reflected in readers’ offline expectations for discourse relations. However, we also found that clause structure did not facilitate the online processing of these discourse relations, nor that readers have a preference for these relations in a paraphrase selection task. The present research extends previous research on discourse relation processing, which mostly focused on lexical cues, by examining the role of non-semantic cues. We show that readers are aware of correlations between clause structure and discourse relations in natural language, but that, unlike what has been found for lexical cues, this information does not seem to influence online processing and discourse interpretation.

https://doi.org/10.1037/xlm0001270

@article{marchal-etal-2023,
title = {How Statistical Correlations Influence Discourse-Level Processing: Clause Type as a Cue for Discourse Relations},
author = {Marian Marchal and Merel Scholman and Vera Demberg},
url = {https://doi.org/10.1037/xlm0001270},
year = {2023},
date = {2023},
journal = {Journal of Experimental Psychology: Learning, Memory, and Cognition},
publisher = {Advance online publication},
abstract = {

Linguistic phenomena (e.g., words and syntactic structure) co-occur with a wide variety of meanings. These systematic correlations can help readers to interpret a text and create predictions about upcoming material. However, to what extent these correlations influence discourse processing is still unknown. We address this question by examining whether clause type serves as a cue for discourse relations. We found that the co-occurrence of gerund-free adjuncts and specific discourse relations found in natural language is also reflected in readers’ offline expectations for discourse relations. However, we also found that clause structure did not facilitate the online processing of these discourse relations, nor that readers have a preference for these relations in a paraphrase selection task. The present research extends previous research on discourse relation processing, which mostly focused on lexical cues, by examining the role of non-semantic cues. We show that readers are aware of correlations between clause structure and discourse relations in natural language, but that, unlike what has been found for lexical cues, this information does not seem to influence online processing and discourse interpretation.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: B2

Sommerfeld, Linda; Staudte, Maria; Mani, Nivedita; Kray, Jutta

Even young children make multiple predictions in the complex visual world Journal Article

Journal of Experimental Child Psychology , 235, 2023.

Abstract
|
Links
|
BibTeX

Children can anticipate upcoming input in sentences with semantically constraining verbs. In the visual world, the sentence context is used to anticipatorily fixate the only object matching potential sentence continuations. Adults can process even multiple visual objects in parallel when predicting language. This study examined whether young children can also maintain multiple prediction options in parallel during language processing. In addition, we aimed at replicating the finding that children’s receptive vocabulary size modulates their prediction. German children (5–6 years, n = 26) and adults (19–40 years, n = 37) listened to 32 subject–verb–object sentences with semantically constraining verbs (e.g., “The father eats the waffle”) while looking at visual scenes of four objects. The number of objects being consistent with the verb constraints (e.g., being edible) varied among 0, 1, 3, and 4. A linear mixed effects model on the proportion of target fixations with the effect coded factors condition (i.e., the number of consistent objects), time window, and age group revealed that upon hearing the verb, children and adults anticipatorily fixated the single visual object, or even multiple visual objects, being consistent with the verb constraints, whereas inconsistent objects were fixated less. This provides first evidence that, comparable to adults, young children maintain multiple prediction options in parallel. Moreover, children with larger receptive vocabulary sizes (Peabody Picture Vocabulary Test) anticipatorily fixated potential targets more often than those with smaller ones, showing that verbal abilities affect children’s prediction in the complex visual world.

@article{Sommerfeld_etal_children_2023,
title = {Even young children make multiple predictions in the complex visual world},
author = {Linda Sommerfeld and Maria Staudte and Nivedita Mani and Jutta Kray},
url = {https://www.sciencedirect.com/science/article/pii/S0022096523000668},
doi = {https://doi.org/10.1016/j.jecp.2023.105690},
year = {2023},
date = {2023},
journal = {Journal of Experimental Child Psychology},
volume = {235},
number = {105690},
abstract = {

Children can anticipate upcoming input in sentences with semantically constraining verbs. In the visual world, the sentence context is used to anticipatorily fixate the only object matching potential sentence continuations. Adults can process even multiple visual objects in parallel when predicting language. This study examined whether young children can also maintain multiple prediction options in parallel during language processing. In addition, we aimed at replicating the finding that children’s receptive vocabulary size modulates their prediction. German children (5–6 years, n = 26) and adults (19–40 years, n = 37) listened to 32 subject–verb–object sentences with semantically constraining verbs (e.g., “The father eats the waffle”) while looking at visual scenes of four objects. The number of objects being consistent with the verb constraints (e.g., being edible) varied among 0, 1, 3, and 4. A linear mixed effects model on the proportion of target fixations with the effect coded factors condition (i.e., the number of consistent objects), time window, and age group revealed that upon hearing the verb, children and adults anticipatorily fixated the single visual object, or even multiple visual objects, being consistent with the verb constraints, whereas inconsistent objects were fixated less. This provides first evidence that, comparable to adults, young children maintain multiple prediction options in parallel. Moreover, children with larger receptive vocabulary sizes (Peabody Picture Vocabulary Test) anticipatorily fixated potential targets more often than those with smaller ones, showing that verbal abilities affect children’s prediction in the complex visual world.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: A5

Werner, Raphael

The phonetics of speech breathing : pauses, physiology, acoustics, and perception PhD Thesis

Saarland University, Saarbruecken, Germany, 2023.

Abstract
|
Links
|
BibTeX

Speech is made up of a continuous stream of speech sounds that is interrupted by pauses and breathing. As phoneticians are primarily interested in describing the segments of the speech stream, pauses and breathing are often neglected in phonetic studies, even though they are vital for speech. The present work adds to a more detailed view of both pausing and speech breathing with a special focus on the latter and the resulting breath noises, investigating their acoustic, physiological, and perceptual aspects. We present an overview of how a selection of corpora annotate pauses and pause-internal particles, as well as a recording setup that can be used for further studies on speech breathing. For pauses, this work emphasized their optionality and variability under different tempos, as well as the temporal composition of silence and breath noise in breath pauses. For breath noises, we first focused on acoustic and physiological characteristics: We explored alignment between the onsets and offsets of audible breath noises with the start and end of expansion of both rib cage and abdomen. Further, we found similarities between speech breath noises and aspiration phases of /k/, as well as that breath noises may be produced with a more open and slightly more front place of articulation than realizations of schwa. We found positive correlations between acoustic and physiological parameters, suggesting that when speakers inhale faster, the resulting breath noises were more intense and produced more anterior in the mouth. Inspecting the entire spectrum of speech breath noises, we showed relatively flat spectra and several weak peaks. These peaks largely overlapped with resonances reported for inhalations produced with a central vocal tract configuration. We used 3D-printed vocal tract models representing four vowels and four fricatives to simulate in- and exhalations by reversing airflow direction. We found the direction to not have a general effect for all models, but only for those with high-tongue configurations, as opposed to those that were more open. Then, we compared inhalations produced with the schwa-model to human inhalations in an attempt to approach the vocal tract configuration in speech breathing. There were some similarities, however, several complexities of human speech breathing not captured in the models complicated comparisons. In two perception studies, we investigated how much information listeners could auditorily extract from breath noises. First, we tested categorizing different breath noises into six different types, based on airflow direction and airway usage, e.g. oral inhalation. Around two thirds of all answers were correct. Second, we investigated how well breath noises could be used to discriminate between speakers and to extract coarse information on speaker characteristics, such as age (old/young) and sex (female/male). We found that listeners were able to distinguish between two breath noises coming from the same or different speakers in around two thirds of all cases. Hearing one breath noise, classification of sex was successful in around 64%, while for age it was 50%, suggesting that sex was more perceivable than age in breath noises.

@phdthesis{Werner_Diss_2023,
title = {The phonetics of speech breathing : pauses, physiology, acoustics, and perception},
author = {Raphael Werner},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/36987},
doi = {https://doi.org/10.22028/D291-41147},
year = {2023},
date = {2023},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {Speech is made up of a continuous stream of speech sounds that is interrupted by pauses and breathing. As phoneticians are primarily interested in describing the segments of the speech stream, pauses and breathing are often neglected in phonetic studies, even though they are vital for speech. The present work adds to a more detailed view of both pausing and speech breathing with a special focus on the latter and the resulting breath noises, investigating their acoustic, physiological, and perceptual aspects. We present an overview of how a selection of corpora annotate pauses and pause-internal particles, as well as a recording setup that can be used for further studies on speech breathing. For pauses, this work emphasized their optionality and variability under different tempos, as well as the temporal composition of silence and breath noise in breath pauses. For breath noises, we first focused on acoustic and physiological characteristics: We explored alignment between the onsets and offsets of audible breath noises with the start and end of expansion of both rib cage and abdomen. Further, we found similarities between speech breath noises and aspiration phases of /k/, as well as that breath noises may be produced with a more open and slightly more front place of articulation than realizations of schwa. We found positive correlations between acoustic and physiological parameters, suggesting that when speakers inhale faster, the resulting breath noises were more intense and produced more anterior in the mouth. Inspecting the entire spectrum of speech breath noises, we showed relatively flat spectra and several weak peaks. These peaks largely overlapped with resonances reported for inhalations produced with a central vocal tract configuration. We used 3D-printed vocal tract models representing four vowels and four fricatives to simulate in- and exhalations by reversing airflow direction. We found the direction to not have a general effect for all models, but only for those with high-tongue configurations, as opposed to those that were more open. Then, we compared inhalations produced with the schwa-model to human inhalations in an attempt to approach the vocal tract configuration in speech breathing. There were some similarities, however, several complexities of human speech breathing not captured in the models complicated comparisons. In two perception studies, we investigated how much information listeners could auditorily extract from breath noises. First, we tested categorizing different breath noises into six different types, based on airflow direction and airway usage, e.g. oral inhalation. Around two thirds of all answers were correct. Second, we investigated how well breath noises could be used to discriminate between speakers and to extract coarse information on speaker characteristics, such as age (old/young) and sex (female/male). We found that listeners were able to distinguish between two breath noises coming from the same or different speakers in around two thirds of all cases. Hearing one breath noise, classification of sex was successful in around 64%, while for age it was 50%, suggesting that sex was more perceivable than age in breath noises.},
pubstate = {published},
type = {phdthesis}
}

Copy BibTeX to Clipboard

Project: C1

Voigtmann, Sophia; Speyer, Augustin

Where to place a phrase? Journal Article

Journal of Historical Syntax, 7, Proceedings of the 22nd Diachronic Generative Syntax (DiGS) Conference, 2023.

Abstract
|
Links
|
BibTeX

In the following paper, we aim to cast light on the placement of prepositional phrases (PPs) in the so-called postfield, the position behind the right sentence bracket. Our focus is on the period of early New High German from 1650 to 1900. In a first step, extraposition will be correlated with Information Density (’ID’, Shannon 1948). ID is defined as “amount of information per unit comprising the utterance” (Levy & Jaeger 2007: 1). It can be calculated as surprisal. The higher the surprisal values the higher the impact on working memory and the more likely perceiving di?iculties become (e.g. Hale 2001). We expect PP with such high surprisal values to be more likely to be placed in the postfield where more memory capacities are available than in the middle field. We test this hypothesis on a corpus of scientific articles and monographs dealing with medicine and theology and taken from the Deutsches Textarchiv (DTA, BBAW 2019). We only find evidence for the hypothesis in the timespan from 1650 to 1700 and for the rare case that attributive PPs are placed in the postfield. Since this has already been shown for attributive relative clauses (Voigtmann & Speyer 2021), we want to take this up and argue for a similar generative analysis for attributive PP and relative clauses in a second step.

https://doi.org/10.18148/HS/2023.V7I6-19.151

@article{voigtmann_speyer_2023,
title = {Where to place a phrase?},
author = {Sophia Voigtmann and Augustin Speyer},
url = {https://doi.org/10.18148/HS/2023.V7I6-19.151},
year = {2023},
date = {2023},
journal = {Journal of Historical Syntax},
publisher = {Proceedings of the 22nd Diachronic Generative Syntax (DiGS) Conference},
volume = {7},
number = {6-19},
abstract = {

In the following paper, we aim to cast light on the placement of prepositional phrases (PPs) in the so-called postfield, the position behind the right sentence bracket. Our focus is on the period of early New High German from 1650 to 1900. In a first step, extraposition will be correlated with Information Density (’ID’, Shannon 1948). ID is defined as “amount of information per unit comprising the utterance” (Levy & Jaeger 2007: 1). It can be calculated as surprisal. The higher the surprisal values the higher the impact on working memory and the more likely perceiving di?iculties become (e.g. Hale 2001). We expect PP with such high surprisal values to be more likely to be placed in the postfield where more memory capacities are available than in the middle field. We test this hypothesis on a corpus of scientific articles and monographs dealing with medicine and theology and taken from the Deutsches Textarchiv (DTA, BBAW 2019). We only find evidence for the hypothesis in the timespan from 1650 to 1700 and for the rare case that attributive PPs are placed in the postfield. Since this has already been shown for attributive relative clauses (Voigtmann & Speyer 2021), we want to take this up and argue for a similar generative analysis for attributive PP and relative clauses in a second step.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: C6

Kunilovskaya, Maria; Przybyl, Heike; Lapshinova-Koltunski, Ekaterina; Teich, Elke

Simultaneous Interpreting as a Noisy Channel: How Much Information Gets Through Inproceedings

Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, INCOMA Ltd., Shoumen, Bulgaria, pp. 608–618, Varna, Bulgaria, 2023.

Abstract
|
Links
|
BibTeX

We explore the relationship between information density/surprisal of source and target texts in translation and interpreting in the language pair English-German, looking at the specific properties of translation (“translationese”). Our data comes from two bidirectional English-German subcorpora representing written and spoken mediation modes collected from European Parliament proceedings. Within each language, we (a) compare original speeches to their translated or interpreted counterparts, and (b) explore the association between segment-aligned sources and targets in each translation direction. As additional variables, we consider source delivery mode (read-out, impromptu) and source speech rate in interpreting. We use language modelling to measure the information rendered by words in a segment and to characterise the cross-lingual transfer of information under various conditions. Our approach is based on statistical analyses of surprisal values, extracted from ngram models of our dataset. The analysis reveals that while there is a considerable positive correlation between the average surprisal of source and target segments in both modes, information output in interpreting is lower than in translation, given the same amount of input. Significantly lower information density in spoken mediated production compared to nonmediated speech in the same language can indicate a possible simplification effect in interpreting.

2023.ranlp-1.66 (0.62MB)
https://aclanthology.org/2023.ranlp-1.66/

@inproceedings{kunilovskaya-etal-2023,
title = {Simultaneous Interpreting as a Noisy Channel: How Much Information Gets Through},
author = {Maria Kunilovskaya and Heike Przybyl and Ekaterina Lapshinova-Koltunski and Elke Teich},
url = {https://aclanthology.org/2023.ranlp-1.66/},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing},
pages = {608–618},
publisher = {INCOMA Ltd., Shoumen, Bulgaria},
address = {Varna, Bulgaria},
abstract = {We explore the relationship between information density/surprisal of source and target texts in translation and interpreting in the language pair English-German, looking at the specific properties of translation (“translationese”). Our data comes from two bidirectional English-German subcorpora representing written and spoken mediation modes collected from European Parliament proceedings. Within each language, we (a) compare original speeches to their translated or interpreted counterparts, and (b) explore the association between segment-aligned sources and targets in each translation direction. As additional variables, we consider source delivery mode (read-out, impromptu) and source speech rate in interpreting. We use language modelling to measure the information rendered by words in a segment and to characterise the cross-lingual transfer of information under various conditions. Our approach is based on statistical analyses of surprisal values, extracted from ngram models of our dataset. The analysis reveals that while there is a considerable positive correlation between the average surprisal of source and target segments in both modes, information output in interpreting is lower than in translation, given the same amount of input. Significantly lower information density in spoken mediated production compared to nonmediated speech in the same language can indicate a possible simplification effect in interpreting.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: B7

Yung, Frances Pik Yu; Scholman, Merel; Lapshinova-Koltunski, Ekaterina; Pollkläsener, Christina; Demberg, Vera

Investigating Explicitation of Discourse Connectives in Translation Using Automatic Annotations Inproceedings

Stoyanchev, Svetlana; Joty, Shafiq; Schlangen, David; Dusek, Ondrej; Kennington, Casey; Alikhani, Malihe (Ed.): Proceedings of the 24th Meeting of Special Interest Group on Discourse and Dialogue (SIGDIAL), Association for Computational Linguistics, pp. 21-30, Prague, Czechia, 2023.

Abstract
|
Links
|
BibTeX

Discourse relations have different patterns of marking across different languages. As a result, discourse connectives are often added, omitted, or rephrased in translation. Prior work has shown a tendency for explicitation of discourse connectives, but such work was conducted using restricted sample sizes due to difficulty of connective identification and alignment. The current study exploits automatic methods to facilitate a large-scale study of connectives in English and German parallel texts. Our results based on over 300 types and 18000 instances of aligned connectives and an empirical approach to compare the cross-lingual specificity gap provide strong evidence of the Explicitation Hypothesis. We conclude that discourse relations are indeed more explicit in translation than texts written originally in the same language. Automatic annotations allow us to carry out translation studies of discourse relations on a large scale. Our methodology using relative entropy to study the specificity of connectives also provides more fine-grained insights into translation patterns.

@inproceedings{yung-etal-2023-investigating,
title = {Investigating Explicitation of Discourse Connectives in Translation Using Automatic Annotations},
author = {Frances Pik Yu Yung and Merel Scholman and Ekaterina Lapshinova-Koltunski and Christina Pollkl{\"a}sener and Vera Demberg},
editor = {Svetlana Stoyanchev and Shafiq Joty and David Schlangen and Ondrej Dusek and Casey Kennington and Malihe Alikhani},
url = {https://aclanthology.org/2023.sigdial-1.2},
doi = {https://doi.org/10.18653/v1/2023.sigdial-1.2},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 24th Meeting of Special Interest Group on Discourse and Dialogue (SIGDIAL)},
pages = {21-30},
publisher = {Association for Computational Linguistics},
address = {Prague, Czechia},
abstract = {Discourse relations have different patterns of marking across different languages. As a result, discourse connectives are often added, omitted, or rephrased in translation. Prior work has shown a tendency for explicitation of discourse connectives, but such work was conducted using restricted sample sizes due to difficulty of connective identification and alignment. The current study exploits automatic methods to facilitate a large-scale study of connectives in English and German parallel texts. Our results based on over 300 types and 18000 instances of aligned connectives and an empirical approach to compare the cross-lingual specificity gap provide strong evidence of the Explicitation Hypothesis. We conclude that discourse relations are indeed more explicit in translation than texts written originally in the same language. Automatic annotations allow us to carry out translation studies of discourse relations on a large scale. Our methodology using relative entropy to study the specificity of connectives also provides more fine-grained insights into translation patterns.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Projects: B2 B7

Ryzhova, Margarita; Demberg, Vera

Processing cost effects of atypicality inferences in a dual-task setup Journal Article

Journal of Pragmatics, 211, pp. 47-80, 2023.

Abstract
|
Links
|
BibTeX

Whether pragmatic inferences are cognitively more effortful than processing literal language has been a longstanding question in pragmatics. So far, experimental studies have exclusively tested generalized (scalar) implicatures. Current theories would predict that particularized implicatures should be cognitively effortful – however, this prediction has to date not been tested empirically. The present article contributes to the debate by investigating a specific type of particularized implicature, atypicality inferences, in a dual-task paradigm. In three experiments, we used either a non-linguistic (Experiment 1) or a linguistic (Experiments 2 and 3) secondary task, to modulate the amount of available cognitive resources. Our results show that the strength of pragmatic inferences is largely unaffected by the secondary task, which contrasts with prior predictions. We discuss the implications for traditional and modern accounts of pragmatic processing.

@article{ryzhova-demberg-2023,
title = {Processing cost effects of atypicality inferences in a dual-task setup},
author = {Margarita Ryzhova and Vera Demberg},
url = {https://www.sciencedirect.com/science/article/pii/S037821662300098X},
doi = {https://doi.org/10.1016/j.pragma.2023.04.005},
year = {2023},
date = {2023},
journal = {Journal of Pragmatics},
pages = {47-80},
volume = {211},
abstract = {

Whether pragmatic inferences are cognitively more effortful than processing literal language has been a longstanding question in pragmatics. So far, experimental studies have exclusively tested generalized (scalar) implicatures. Current theories would predict that particularized implicatures should be cognitively effortful – however, this prediction has to date not been tested empirically. The present article contributes to the debate by investigating a specific type of particularized implicature, atypicality inferences, in a dual-task paradigm. In three experiments, we used either a non-linguistic (Experiment 1) or a linguistic (Experiments 2 and 3) secondary task, to modulate the amount of available cognitive resources. Our results show that the strength of pragmatic inferences is largely unaffected by the secondary task, which contrasts with prior predictions. We discuss the implications for traditional and modern accounts of pragmatic processing.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: A3

Borah, Angana; Pylypenko, Daria; España-Bonet, Cristina; van Genabith, Josef

Measuring Spurious Correlation in Classification: "Clever Hans" in Translationese Inproceedings

Mitkov, Ruslan; Angelova, Galia (Ed.): Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, INCOMA Ltd., Shoumen, Bulgaria, pp. 196-206, Varna, Bulgaria, 2023.

Abstract
|
Links
|
BibTeX

Recent work has shown evidence of „Clever Hans“ behavior in high-performance neural translationese classifiers, where BERT-based classifiers capitalize on spurious correlations, in particular topic information, between data and target classification labels, rather than genuine translationese signals. Translationese signals are subtle (especially for professional translation) and compete with many other signals in the data such as genre, style, author, and, in particular, topic. This raises the general question of how much of the performance of a classifier is really due to spurious correlations in the data versus the signals actually targeted for by the classifier, especially for subtle target signals and in challenging (low resource) data settings. We focus on topic-based spurious correlation and approach the question from two directions: (i) where we have no knowledge about spurious topic information and its distribution in the data, (ii) where we have some indication about the nature of spurious topic correlations. For (i) we develop a measure from first principles capturing alignment of unsupervised topics with target classification labels as an indication of spurious topic information in the data. We show that our measure is the same as purity in clustering and propose a „topic floor“ (as in a „noise floor“) for classification. For (ii) we investigate masking of known spurious topic carriers in classification. Both (i) and (ii) contribute to quantifying and (ii) to mitigating spurious correlations.

https://aclanthology.org/2023.ranlp-1.22

@inproceedings{borah-etal-2023-measuring,
title = {Measuring Spurious Correlation in Classification: "Clever Hans" in Translationese},
author = {Angana Borah and Daria Pylypenko and Cristina Espa{\~n}a-Bonet and Josef van Genabith},
editor = {Ruslan Mitkov and Galia Angelova},
url = {https://aclanthology.org/2023.ranlp-1.22},
year = {2023},
date = {2023},
booktitle = {Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing},
pages = {196-206},
publisher = {INCOMA Ltd., Shoumen, Bulgaria},
address = {Varna, Bulgaria},
abstract = {

Recent work has shown evidence of "Clever Hans" behavior in high-performance neural translationese classifiers, where BERT-based classifiers capitalize on spurious correlations, in particular topic information, between data and target classification labels, rather than genuine translationese signals. Translationese signals are subtle (especially for professional translation) and compete with many other signals in the data such as genre, style, author, and, in particular, topic. This raises the general question of how much of the performance of a classifier is really due to spurious correlations in the data versus the signals actually targeted for by the classifier, especially for subtle target signals and in challenging (low resource) data settings. We focus on topic-based spurious correlation and approach the question from two directions: (i) where we have no knowledge about spurious topic information and its distribution in the data, (ii) where we have some indication about the nature of spurious topic correlations. For (i) we develop a measure from first principles capturing alignment of unsupervised topics with target classification labels as an indication of spurious topic information in the data. We show that our measure is the same as purity in clustering and propose a "topic floor" (as in a "noise floor") for classification. For (ii) we investigate masking of known spurious topic carriers in classification. Both (i) and (ii) contribute to quantifying and (ii) to mitigating spurious correlations.

},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: B6

Zhu, Dawei; Shen, Xiaoyu; Mosbach, Marius; Stephan, Andreas; Klakow, Dietrich

Weaker Than You Think: A Critical Look at Weakly Supervised Learning Inproceedings

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, pp. 14229-14253, Toronto, Canada, 2023.

Abstract
|
Links
|
BibTeX

Weakly supervised learning is a popular approach for training machine learning models in low-resource settings. Instead of requesting high-quality yet costly human annotations, it allows training models with noisy annotations obtained from various weak sources. Recently, many sophisticated approaches have been proposed for robust training under label noise, reporting impressive results. In this paper, we revisit the setup of these approaches and find that the benefits brought by these approaches are significantly overestimated. Specifically, we find that the success of existing weakly supervised learning approaches heavily relies on the availability of clean validation samples which, as we show, can be leveraged much more efficiently by simply training on them. After using these clean labels in training, the advantages of using these sophisticated approaches are mostly wiped out. This remains true even when reducing the size of the available clean data to just five samples per class, making these approaches impractical. To understand the true value of weakly supervised learning, we thoroughly analyze diverse NLP datasets and tasks to ascertain when and why weakly supervised approaches work. Based on our findings, we provide recommendations for future research.

@inproceedings{zhu-etal-2023-weaker,
title = {Weaker Than You Think: A Critical Look at Weakly Supervised Learning},
author = {Dawei Zhu and Xiaoyu Shen and Marius Mosbach and Andreas Stephan and Dietrich Klakow},
url = {https://aclanthology.org/2023.acl-long.796},
doi = {https://doi.org/10.18653/v1/2023.acl-long.796},
year = {2023},
date = {2023-09-21},
booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
pages = {14229-14253},
publisher = {Association for Computational Linguistics},
address = {Toronto, Canada},
abstract = {Weakly supervised learning is a popular approach for training machine learning models in low-resource settings. Instead of requesting high-quality yet costly human annotations, it allows training models with noisy annotations obtained from various weak sources. Recently, many sophisticated approaches have been proposed for robust training under label noise, reporting impressive results. In this paper, we revisit the setup of these approaches and find that the benefits brought by these approaches are significantly overestimated. Specifically, we find that the success of existing weakly supervised learning approaches heavily relies on the availability of clean validation samples which, as we show, can be leveraged much more efficiently by simply training on them. After using these clean labels in training, the advantages of using these sophisticated approaches are mostly wiped out. This remains true even when reducing the size of the available clean data to just five samples per class, making these approaches impractical. To understand the true value of weakly supervised learning, we thoroughly analyze diverse NLP datasets and tasks to ascertain when and why weakly supervised approaches work. Based on our findings, we provide recommendations for future research.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: B4

Mosbach, Marius; Pimentel, Tiago; Ravfogel, Shauli; Klakow, Dietrich; Elazar, Yanai

Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation Inproceedings

Findings of the Association for Computational Linguistics: ACL 2023, Association for Computational Linguistics, pp. 12284-12314, Toronto, Canada, 2023.

Abstract
|
Links
|
BibTeX

Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning due to its simplicity and improved out-of-domain generalization, and because extensive evidence shows that fine-tuned models pick up on spurious correlations.Unfortunately, previous comparisons of the two approaches were done using models of different sizes. This raises the question of whether the observed weaker out-of-domain generalization of fine-tuned models is an inherent property of fine-tuning or a limitation of the experimental setup. In this paper, we compare the generalization of few-shot fine-tuning and in-context learning to challenge datasets, while controlling for the models used, the number of examples, and the number of parameters, ranging from 125M to 30B. Our results show that fine-tuned language models can in fact generalize well out-of-domain. We find that both approaches generalize similarly; they exhibit large variation and depend on properties such as model size and the number of examples, highlighting that robust task adaptation remains a challenge.

@inproceedings{mosbach-etal-2023-shot,
title = {Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation},
author = {Marius Mosbach and Tiago Pimentel and Shauli Ravfogel and Dietrich Klakow and Yanai Elazar},
url = {https://aclanthology.org/2023.findings-acl.779},
doi = {https://doi.org/10.18653/v1/2023.findings-acl.779},
year = {2023},
date = {2023},
booktitle = {Findings of the Association for Computational Linguistics: ACL 2023},
pages = {12284-12314},
publisher = {Association for Computational Linguistics},
address = {Toronto, Canada},
abstract = {Few-shot fine-tuning and in-context learning are two alternative strategies for task adaptation of pre-trained language models. Recently, in-context learning has gained popularity over fine-tuning due to its simplicity and improved out-of-domain generalization, and because extensive evidence shows that fine-tuned models pick up on spurious correlations.Unfortunately, previous comparisons of the two approaches were done using models of different sizes. This raises the question of whether the observed weaker out-of-domain generalization of fine-tuned models is an inherent property of fine-tuning or a limitation of the experimental setup. In this paper, we compare the generalization of few-shot fine-tuning and in-context learning to challenge datasets, while controlling for the models used, the number of examples, and the number of parameters, ranging from 125M to 30B. Our results show that fine-tuned language models can in fact generalize well out-of-domain. We find that both approaches generalize similarly; they exhibit large variation and depend on properties such as model size and the number of examples, highlighting that robust task adaptation remains a challenge.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: B4

Ryzhova, Margarita; Skrjanec, Iza; Quach, Nina; Chase, Alice Virginia ; Ellsiepen, Emilia; Demberg, Vera

Word Familiarity Classification From a Single Trial Based on Eye-Movements. A Study in German and English Inproceedings

ETRA '23: Proceedings of the 2023 Symposium on Eye Tracking Research and Applications, 2023.

Abstract
|
Links
|
BibTeX

Identifying processing difficulty during reading due to unfamiliar words has promising applications in automatic text adaptation. We present a classification model that predicts whether a word is (un)known to the reader based on eye-movement measures. We examine German and English data and validate our model on unseen subjects and items achieving a high accuracy in both languages.

@inproceedings{ryzhova-etal-2023,
title = {Word Familiarity Classification From a Single Trial Based on Eye-Movements. A Study in German and English},
author = {Margarita Ryzhova and Iza Skrjanec and Nina Quach and Alice Virginia Chase and Emilia Ellsiepen and Vera Demberg},
url = {https://dl.acm.org/doi/abs/10.1145/3588015.3590118},
doi = {https://doi.org/10.1145/3588015.3590118},
year = {2023},
date = {2023},
booktitle = {ETRA '23: Proceedings of the 2023 Symposium on Eye Tracking Research and Applications},
abstract = {

Identifying processing difficulty during reading due to unfamiliar words has promising applications in automatic text adaptation. We present a classification model that predicts whether a word is (un)known to the reader based on eye-movement measures. We examine German and English data and validate our model on unseen subjects and items achieving a high accuracy in both languages.

},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: A8

Skrjanec, Iza; Broy, Frederik Yannick; Demberg, Vera

Expert-adapted language models improve the fit to reading times Inproceedings

Procedia Computer Science, PsyArXiv, 2023.

Abstract
|
Links
|
BibTeX

The concept of surprisal refers to the predictability of a word based on its context. Surprisal is known to be predictive of human processing difficulty and is usually estimated by language models. However, because humans differ in their linguistic experience, they also differ in the actual processing difficulty they experience with a given word or sentence. We investigate whether models that are similar to the linguistic experience and background knowledge of a specific group of humans are better at predicting their reading times than a generic language model. We analyze reading times from the PoTeC corpus (Jäger et al. 2021) of eye movements from biology and physics experts reading biology and physics texts. We find experts read in-domain texts faster than novices, especially domain-specific terms. Next, we train language models adapted to the biology and physics domains and show that surprisal obtained from these specialized models improves the fit to expert reading times above and beyond a generic language model.

@inproceedings{skrjanec_broy_demberg_2023,
title = {Expert-adapted language models improve the fit to reading times},
author = {Iza Skrjanec and Frederik Yannick Broy and Vera Demberg},
url = {https://psyarxiv.com/dc8y6},
doi = {https://doi.org/10.31234/osf.io/dc8y6},
year = {2023},
date = {2023},
booktitle = {Procedia Computer Science},
publisher = {PsyArXiv},
abstract = {

The concept of surprisal refers to the predictability of a word based on its context. Surprisal is known to be predictive of human processing difficulty and is usually estimated by language models. However, because humans differ in their linguistic experience, they also differ in the actual processing difficulty they experience with a given word or sentence. We investigate whether models that are similar to the linguistic experience and background knowledge of a specific group of humans are better at predicting their reading times than a generic language model. We analyze reading times from the PoTeC corpus (J{\"a}ger et al. 2021) of eye movements from biology and physics experts reading biology and physics texts. We find experts read in-domain texts faster than novices, especially domain-specific terms. Next, we train language models adapted to the biology and physics domains and show that surprisal obtained from these specialized models improves the fit to expert reading times above and beyond a generic language model.

},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: A8

Mecklinger, Axel; Kamp, Siri-Maria

Observing memory encoding while it unfolds: Functional interpretation and current debates regarding ERP subsequent memory effects Journal Article

Neuroscience & Biobehavioral Reviews, 153, 2023.

Abstract
|
Links
|
BibTeX

Our ability to remember the past depends on neural processes set in train in the moment an event is experienced. These processes can be studied by segregating brain activity according to whether an event is later remembered or forgotten. The present review integrates a large number of studies examining this differential brain activity, labeled subsequent memory effect (SME), with the ERP technique, into a functional organization and discusses routes for further research. Based on the reviewed literature, we suggest that memory encoding is implemented by multiple processes, typically reflected in three functionally different subcomponents of the ERP SME elicited by study stimuli, which presumably interact with preparatory SME activity preceding the to be encoded event. We argue that ERPs are a valuable method in the SME paradigm because they have a sufficiently high temporal resolution to disclose the subcomponents of encoding-related brain activity. Implications of the proposed functional organization for future studies using the SME procedure in basic and applied settings will be discussed.

https://www.sciencedirect.com/science/article/abs/pii/S0149763423003160

@article{Mecklinger-etal-2023,
title = {Observing memory encoding while it unfolds: Functional interpretation and current debates regarding ERP subsequent memory effects},
author = {Axel Mecklinger and Siri-Maria Kamp},
url = {https://www.sciencedirect.com/science/article/abs/pii/S0149763423003160},
year = {2023},
date = {2023},
journal = {Neuroscience & Biobehavioral Reviews},
volume = {153},
abstract = {

Our ability to remember the past depends on neural processes set in train in the moment an event is experienced. These processes can be studied by segregating brain activity according to whether an event is later remembered or forgotten. The present review integrates a large number of studies examining this differential brain activity, labeled subsequent memory effect (SME), with the ERP technique, into a functional organization and discusses routes for further research. Based on the reviewed literature, we suggest that memory encoding is implemented by multiple processes, typically reflected in three functionally different subcomponents of the ERP SME elicited by study stimuli, which presumably interact with preparatory SME activity preceding the to be encoded event. We argue that ERPs are a valuable method in the SME paradigm because they have a sufficiently high temporal resolution to disclose the subcomponents of encoding-related brain activity. Implications of the proposed functional organization for future studies using the SME procedure in basic and applied settings will be discussed.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: A6

Bader, Regine; Tarantini, Luca; Mecklinger, Axel

Task context dissociates the FN400 and the N400 Journal Article

Psychophysiology, 60, 2023.

Abstract
|
Links
|
BibTeX

In event-related potential studies, familiarity-based recognition has been associated with the FN400, that is, more positive-going waveforms for old items than new items 300–500 ms post-stimulus onset, maximal at frontal electrodes. We tested the proposition that the FN400 reflects the attribution of unexpected processing fluency to familiarity. This implies that the FN400 is greater when fluency is less expected, that is, for less familiar stimuli. Moreover, the FN400 should be modulated by the goal of remembering and only elicited when fluency is correctly attributed to the past, that is, by correct old responses in recognition memory tests. In the absence of a retrieval task, enhanced fluency for repeated items should be associated with an N400 attenuation as no episodic attribution takes place. In an incidental study-test design with words of low and high life-time familiarity, participants made pleasantness judgments for half of the studied words. The other half re-appeared in a recognition test. Only in the latter task, participants had the goal of remembering. As both tasks included also new words, we could compare old/new effects under conditions in which both effects are driven by increased fluency for repeated words. We did not find the expected differences in the FN400 for low vs. high life-time familiarity items. However, as expected, we found a frontally distributed FN400 in the recognition test whereas the old/new effect in the pleasantness task resembled an N400 effect. This supports the view that the FN400 occurs when fluency is attributed to familiarity during a recognition decision.

@article{Bader_etal_2023,
title = {Task context dissociates the FN400 and the N400},
author = {Regine Bader and Luca Tarantini and Axel Mecklinger},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/psyp.14258},
doi = {https://doi.org/10.1111/psyp.14258},
year = {2023},
date = {2023},
journal = {Psychophysiology},
volume = {60},
number = {7},
abstract = {

In event-related potential studies, familiarity-based recognition has been associated with the FN400, that is, more positive-going waveforms for old items than new items 300–500 ms post-stimulus onset, maximal at frontal electrodes. We tested the proposition that the FN400 reflects the attribution of unexpected processing fluency to familiarity. This implies that the FN400 is greater when fluency is less expected, that is, for less familiar stimuli. Moreover, the FN400 should be modulated by the goal of remembering and only elicited when fluency is correctly attributed to the past, that is, by correct old responses in recognition memory tests. In the absence of a retrieval task, enhanced fluency for repeated items should be associated with an N400 attenuation as no episodic attribution takes place. In an incidental study-test design with words of low and high life-time familiarity, participants made pleasantness judgments for half of the studied words. The other half re-appeared in a recognition test. Only in the latter task, participants had the goal of remembering. As both tasks included also new words, we could compare old/new effects under conditions in which both effects are driven by increased fluency for repeated words. We did not find the expected differences in the FN400 for low vs. high life-time familiarity items. However, as expected, we found a frontally distributed FN400 in the recognition test whereas the old/new effect in the pleasantness task resembled an N400 effect. This supports the view that the FN400 occurs when fluency is attributed to familiarity during a recognition decision.

},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project: A6

Li, Muqing; Venhuizen, Noortje; Jachmann, Torsten; Drenhaus, Heiner; Crocker, Matthew W.

Does informativity modulate linearization preferences in reference production? Inproceedings

Proceedings of the Annual Meeting of the Cognitive Science Society, 45, pp. 3028-3054, 2023.

Abstract
|
Links
|
BibTeX

During referential communication, speaker choices regarding the syntactic encoding of their expressions can modulate the linear ordering of the properties necessary to identify the referent. We investigated whether such syntactic choices are influenced by the informativity of these properties in a given visual context, as quantified by Referential Entropy Reduction (RER). In two experiments, a maze-based sentence completion task was used to examine whether informativity of a particular property (animal or action) influenced the decision to produce pre- versus post-nominal modifications when describing animal-performing-action referents in a visual scene. While many participants used a fixed strategy, informativity did significantly influence linearization for the remaining participants, consistent with a maximal informativity strategy in which the high RER property is be encoded first. This suggests that speakers who vary their encodings are indeed sensitive to the informativity of properties in a visual scene, preferring syntactic linearization in which informative properties appear early.

https://escholarship.org/uc/item/95v6j0sx

@inproceedings{Muqing-etal-2023,
title = {Does informativity modulate linearization preferences in reference production? },
author = {Muqing Li and Noortje Venhuizen and Torsten Jachmann and Heiner Drenhaus and Matthew W. Crocker},
url = {https://escholarship.org/uc/item/95v6j0sx},
year = {2023},
date = {2023},
booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society},
pages = {3028-3054},
abstract = {During referential communication, speaker choices regarding the syntactic encoding of their expressions can modulate the linear ordering of the properties necessary to identify the referent. We investigated whether such syntactic choices are influenced by the informativity of these properties in a given visual context, as quantified by Referential Entropy Reduction (RER). In two experiments, a maze-based sentence completion task was used to examine whether informativity of a particular property (animal or action) influenced the decision to produce pre- versus post-nominal modifications when describing animal-performing-action referents in a visual scene. While many participants used a fixed strategy, informativity did significantly influence linearization for the remaining participants, consistent with a maximal informativity strategy in which the high RER property is be encoded first. This suggests that speakers who vary their encodings are indeed sensitive to the informativity of properties in a visual scene, preferring syntactic linearization in which informative properties appear early.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: C3

Zaitova, Iuliia; Stenger, Irina; Avgustinova, Tania

Microsyntactic Unit Detection Using Word Embedding Models: Experiments on Slavic Languages Inproceedings

Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2023), pp. 1251-1259, 2023.

BibTeX

@inproceedings{Zaitova/etal:2023a,
title = {Microsyntactic Unit Detection Using Word Embedding Models: Experiments on Slavic Languages},
author = {Iuliia Zaitova and Irina Stenger and Tania Avgustinova},
year = {2023},
date = {2023},
booktitle = {Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2023)},
pages = {1251-1259},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: C4

Gessinger, Iona; Cohn, Michelle; Cowan, Benjamin R.; Zellou, Georgia; Möbius, Bernd

Cross-linguistic emotion perception in human and TTS voices Inproceedings

Proceedings of Interspeech 2023, pp. 5222-5226, Dublin, Ireland, 2023.

Abstract
|
Links
|
BibTeX

This study investigates how German listeners perceive changes in the emotional expression of German and American English human voices and Amazon Alexa text-to-speech (TTS) voices, respectively. Participants rated sentences containing emotionally neutral lexico-semantic information that were resynthesized to vary in prosodic emotional expressiveness. Starting from an emotionally neutral production, three levels of increasing ‚happiness‘ were created. Results show that ‚happiness‘ manipulations lead to higher ratings of emotional valence (i.e., more positive) and arousal (i.e., more excited) for German and English voices, with stronger effects for the German voices. In particular, changes in valence were perceived more prominently in German TTS compared to English TTS. Additionally, both TTS voices were rated lower than the respective human voices on scales that reflect anthropomorphism (e.g., human-likeness). We discuss these findings in the context of cross-linguistic emotion accounts.

@inproceedings{Gessinger/etal:2023,
title = {Cross-linguistic emotion perception in human and TTS voices},
author = {Iona Gessinger and Michelle Cohn and Benjamin R. Cowan and Georgia Zellou and Bernd M{\"o}bius},
url = {https://www.isca-speech.org/archive/interspeech_2023/gessinger23_interspeech.html},
doi = {https://doi.org/10.21437/Interspeech.2023-711},
year = {2023},
date = {2023},
booktitle = {Proceedings of Interspeech 2023},
pages = {5222-5226},
address = {Dublin, Ireland},
abstract = {This study investigates how German listeners perceive changes in the emotional expression of German and American English human voices and Amazon Alexa text-to-speech (TTS) voices, respectively. Participants rated sentences containing emotionally neutral lexico-semantic information that were resynthesized to vary in prosodic emotional expressiveness. Starting from an emotionally neutral production, three levels of increasing 'happiness' were created. Results show that 'happiness' manipulations lead to higher ratings of emotional valence (i.e., more positive) and arousal (i.e., more excited) for German and English voices, with stronger effects for the German voices. In particular, changes in valence were perceived more prominently in German TTS compared to English TTS. Additionally, both TTS voices were rated lower than the respective human voices on scales that reflect anthropomorphism (e.g., human-likeness). We discuss these findings in the context of cross-linguistic emotion accounts.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: C1

Kudera, Jacek; Stenger, Irina; Georgis, Philip; Möbius, Bernd; Avgustinova, Tania; Klakow, Dietrich

Cross-linguistic intelligibility of idiomatic phrases in Polish-Russian translation tasks Incollection

Phraseology, constructions and translation: Corpus-based, computational and cultural aspects, Presses universitaires de Louvain, pp. 237-249, 2023.

Abstract
|
Links
|
BibTeX

This paper presents the results of a translation task involving idiomatic phrases in closely related languages. The goal is to test auditory comprehension of idioms. The experiment was conducted with native speakers of either Polish or Russian, who were not professional translators. The translation equivalents were categorized according to three conditions: (1) semantic equivalent, found in a phraseological dictionary; (2) lemma-based referent, sharing a cognate component; and (3) literal translation of the source phrase. It is hypothesized that information-theoretic measures of surprisal in combination with lexical and syntactic distances between idioms can predict lay translators’ preferences. The results suggest that the proposed measures are valid predictors for the type of translation native speakers will select. The outcomes reveal an asymmetry in preference for equivalent selection across the groups of lay translators.

@incollection{Kudera/etal:2023a,
title = {Cross-linguistic intelligibility of idiomatic phrases in Polish-Russian translation tasks},
author = {Jacek Kudera and Irina Stenger and Philip Georgis and Bernd M{\"o}bius and Tania Avgustinova and Dietrich Klakow},
url = {https://pul.uclouvain.be/book/?GCOI=29303100163350&utm_source=rss&utm_medium=rss&utm_campaign=newreleases#h2tabFormats},
year = {2023},
date = {2023},
booktitle = {Phraseology, constructions and translation: Corpus-based, computational and cultural aspects},
pages = {237-249},
publisher = {Presses universitaires de Louvain},
abstract = {This paper presents the results of a translation task involving idiomatic phrases in closely related languages. The goal is to test auditory comprehension of idioms. The experiment was conducted with native speakers of either Polish or Russian, who were not professional translators. The translation equivalents were categorized according to three conditions: (1) semantic equivalent, found in a phraseological dictionary; (2) lemma-based referent, sharing a cognate component; and (3) literal translation of the source phrase. It is hypothesized that information-theoretic measures of surprisal in combination with lexical and syntactic distances between idioms can predict lay translators’ preferences. The results suggest that the proposed measures are valid predictors for the type of translation native speakers will select. The outcomes reveal an asymmetry in preference for equivalent selection across the groups of lay translators.},
pubstate = {published},
type = {incollection}
}

Copy BibTeX to Clipboard

Project: C4

Abdullah, Badr M.; Shaik, Mohammed Maqsood ; Möbius, Bernd; Klakow, Dietrich

An information-theoretic analysis of self-supervised discrete representations of speech Inproceedings

Proceedings of Interspeech 2023, pp. 2883-2887, Dublin, Ireland, 2023.

Abstract
|
Links
|
BibTeX

Self-supervised representation learning for speech often involves a quantization step that transforms the acoustic input into discrete units. However, it remains unclear how to characterize the relationship between these discrete units and abstract phonetic categories such as phonemes. In this paper, we develop an information-theoretic framework whereby we represent each phonetic category as a distribution over discrete units. We then apply our framework to two different self-supervised models (namely wav2vec 2.0 and XLSR) and use American English speech as a case study. Our study demonstrates that the entropy of phonetic distributions reflects the variability of the underlying speech sounds, with phonetically similar sounds exhibiting similar distributions. While our study confirms the lack of direct, one-to-one correspondence, we find an intriguing, indirect relationship between phonetic categories and discrete units.

@inproceedings{Abdullah/etal:2023a,
title = {An information-theoretic analysis of self-supervised discrete representations of speech},
author = {Badr M. Abdullah and Mohammed Maqsood Shaik and Bernd M{\"o}bius and Dietrich Klakow},
doi = {https://doi.org/10.21437/Interspeech.2023--2131},
year = {2023},
date = {2023},
booktitle = {Proceedings of Interspeech 2023},
pages = {2883-2887},
address = {Dublin, Ireland},
abstract = {Self-supervised representation learning for speech often involves a quantization step that transforms the acoustic input into discrete units. However, it remains unclear how to characterize the relationship between these discrete units and abstract phonetic categories such as phonemes. In this paper, we develop an information-theoretic framework whereby we represent each phonetic category as a distribution over discrete units. We then apply our framework to two different self-supervised models (namely wav2vec 2.0 and XLSR) and use American English speech as a case study. Our study demonstrates that the entropy of phonetic distributions reflects the variability of the underlying speech sounds, with phonetically similar sounds exhibiting similar distributions. While our study confirms the lack of direct, one-to-one correspondence, we find an intriguing, indirect relationship between phonetic categories and discrete units.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project: C4

«
1
2
3
…
5
6
7
…
28
29
30
»

Successfully