Publications

Zarcone, Alessandra; Demberg, Vera

A bathtub by any other name: the reduction of German compounds in predictive contexts Inproceedings

Proceedings of the Annual Meeting of the Cognitive Science Society, 43, 2021.

The Uniform Information Density hypothesis (UID) predicts that lexical choice between long and short word forms depends on the predictability of the referent in context, and recent studies have shown such an effect of predictability on lexical choice during online production. We here set out to test whether the UID predictions hold up in a related setting but with a different language (German) and a different phenomenon, namely the choice between compounds (e.g. Badewanne / bathtub) and their base forms (Wanne / tub). Our study is consistent with the UID: we find that participants choose the shorter base form more often in predictive contexts, showing an active tendency to be information-theoretically efficient.
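
The UID logic at issue can be made concrete with a toy decision rule (an illustration of the hypothesis only, not the authors' experimental model; the probability threshold is our assumption): a speaker aiming at uniform information density can afford the short base form exactly when context already makes the referent predictable.

import math

def surprisal(p: float) -> float:
    # Information carried by an outcome with probability p, in bits.
    return -math.log2(p)

def preferred_form(p_referent: float, short: str, long_form: str,
                   threshold_bits: float = 2.0) -> str:
    # Toy UID-style choice rule (the threshold is a made-up illustration):
    # a predictable referent (low surprisal) licenses the shorter form.
    return short if surprisal(p_referent) < threshold_bits else long_form

print(preferred_form(0.60, "Wanne", "Badewanne"))  # predictive context -> Wanne
print(preferred_form(0.05, "Wanne", "Badewanne"))  # neutral context -> Badewanne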

@inproceedings{Zarcone2021,
title = {A bathtub by any other name: the reduction of German compounds in predictive contexts},
author = {Alessandra Zarcone and Vera Demberg},
url = {https://escholarship.org/uc/item/3w6451rz},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society},
abstract = {The Uniform Information Density hypothesis (UID) predicts that lexical choice between long and short word forms depends on the predictability of the referent in context, and recent studies have shown such an effect of predictability on lexical choice during online production. We here set out to test whether the UID predictions hold up in a related setting but with a different language (German) and a different phenomenon, namely the choice between compounds (e.g. Badewanne / bathtub) and their base forms (Wanne / tub). Our study is consistent with the UID: we find that participants choose the shorter base form more often in predictive contexts, showing an active tendency to be information-theoretically efficient.},
pubstate = {published},
type = {inproceedings}
}


Project:   A3

Lapshinova-Koltunski, Ekaterina

Analysing the Dimension of Mode in Translation Book Chapter

Bisiada, Mario (Ed.): Empirical Studies in Translation and Discourse. Translation and Multilingual Natural Language Processing, Language Science Press, pp. 223-243, Berlin, 2021, ISBN 978-3-96110-300-3, ISSN 2364-8899.

The present chapter applies text classification to test how well we can distinguish between texts along two dimensions: a text-production dimension that distinguishes between translations and non-translations (where translations also include interpreted texts); and a mode dimension that distinguishes between spoken and written texts. The chapter also aims to investigate the relationship between these two dimensions. Moreover, it investigates whether the same linguistic features that are derived from variational linguistics contribute to the prediction of mode in both translations and non-translations. The distributional information about these features was used to statistically model variation along the two dimensions. The results show that the same feature set can be used to automatically differentiate translations from non-translations, as well as spoken texts from written texts. However, language variation along the dimension of mode is stronger than that along the dimension of text production, as classification into spoken and written texts delivers better results. In addition, the linguistic features that contribute to the distinction between spoken and written mode are similar in both translated and non-translated language.
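
A rough sketch of the chapter's classification setup (toy data; the feature names and values are invented stand-ins for the variational-linguistics features the study actually uses):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Pretend per-text rates of two features, e.g. pronouns vs. nominalizations:
spoken = rng.normal(loc=[0.08, 0.02], scale=0.01, size=(50, 2))
written = rng.normal(loc=[0.03, 0.06], scale=0.01, size=(50, 2))

X = np.vstack([spoken, written])
y = np.array([0] * 50 + [1] * 50)  # 0 = spoken, 1 = written

clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))  # how recoverable the mode dimension is from the features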

@inbook{Lapshinova2021dimension,
title = {Analysing the Dimension of Mode in Translation},
author = {Ekaterina Lapshinova-Koltunski},
editor = {Mario Bisiada},
url = {https://doi.org/10.5281/zenodo.4450014},
doi = {https://doi.org/10.5281/zenodo.4450014},
year = {2021},
date = {2021},
booktitle = {Empirical Studies in Translation and Discourse. Translation and Multilingual Natural Language Processing},
isbn = {978-3-96110-300-3},
issn = {2364-8899},
pages = {223-243},
publisher = {Language Science Press},
address = {Berlin},
abstract = {The present chapter applies text classification to test how well we can distinguish between texts along two dimensions: a text-production dimension that distinguishes between translations and non-translations (where translations also include interpreted texts); and a mode dimension that distinguishes between spoken and written texts. The chapter also aims to investigate the relationship between these two dimensions. Moreover, it investigates whether the same linguistic features that are derived from variational linguistics contribute to the prediction of mode in both translations and non-translations. The distributional information about these features was used to statistically model variation along the two dimensions. The results show that the same feature set can be used to automatically differentiate translations from non-translations, as well as spoken texts from written texts. However, language variation along the dimension of mode is stronger than that along the dimension of text production, as classification into spoken and written texts delivers better results. In addition, the linguistic features that contribute to the distinction between spoken and written mode are similar in both translated and non-translated language.},
pubstate = {published},
type = {inbook}
}


Project:   B7

Sikos, Les; Venhuizen, Noortje; Drenhaus, Heiner; Crocker, Matthew W.

Reevaluating pragmatic reasoning in language games Journal Article

PLOS ONE, 2021.

The results of a highly influential study that tested the predictions of the Rational Speech Act (RSA) model suggest that (a) listeners use pragmatic reasoning in one-shot web-based referential communication games despite the artificial, highly constrained, and minimally interactive nature of the task, and (b) RSA accurately captures this behavior. In this work, we reevaluate the contribution of the pragmatic reasoning formalized by RSA in explaining listener behavior by comparing RSA to a baseline literal listener model that is driven only by literal word meaning and the prior probability of referring to an object. Across three experiments we observe only modest evidence of pragmatic behavior in one-shot web-based language games, and only under very limited circumstances. We find that although RSA provides a strong fit to listener responses, it does not perform better than the baseline literal listener model. Our results suggest that while participants playing the role of the Speaker are informative in these one-shot web-based reference games, participants playing the role of the Listener only rarely take this Speaker behavior into account to reason about the intended referent. In addition, we show that RSA’s fit is primarily due to a combination of non-pragmatic factors, perhaps the most surprising of which is that in the majority of conditions that are amenable to pragmatic reasoning, RSA (accurately) predicts that listeners will behave non-pragmatically. This leads us to conclude that RSA’s strong overall correlation with human behavior in one-shot web-based language games does not reflect listeners’ pragmatic reasoning about informative speakers.
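
For readers unfamiliar with the model under evaluation, here is a minimal sketch of the standard RSA recursion and of the literal-listener baseline it is compared against (a generic textbook formulation, not the paper's code):

import numpy as np

# Rows = utterances, columns = referents.
lexicon = np.array([[1., 1.],   # "glasses" is true of both referents
                    [0., 1.]])  # "hat" is true only of referent 2
prior = np.array([0.5, 0.5])    # prior probability of each referent

def normalize(m, axis):
    return m / m.sum(axis=axis, keepdims=True)

def literal_listener(lex, prior):
    # L0(referent | utterance): truth-conditional meaning times the prior.
    return normalize(lex * prior, axis=1)

def pragmatic_speaker(lex, prior, alpha=1.0):
    # S1(utterance | referent): rational choice over utterances, informativity-driven.
    return normalize(literal_listener(lex, prior).T ** alpha, axis=1)

def pragmatic_listener(lex, prior, alpha=1.0):
    # L1(referent | utterance): Bayesian inversion of the speaker.
    return normalize(pragmatic_speaker(lex, prior, alpha).T * prior, axis=1)

print(literal_listener(lexicon, prior))    # the baseline model in the comparison
print(pragmatic_listener(lexicon, prior))  # RSA: "glasses" now favors referent 1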

@article{Sikos2021,
title = {Reevaluating pragmatic reasoning in language games},
author = {Les Sikos and Noortje Venhuizen and Heiner Drenhaus and Matthew W. Crocker},
url = {https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0248388},
doi = {https://doi.org/10.1371/journal.pone.0248388},
year = {2021},
date = {2021-03-17},
journal = {PLOS ONE},
abstract = {The results of a highly influential study that tested the predictions of the Rational Speech Act (RSA) model suggest that (a) listeners use pragmatic reasoning in one-shot web-based referential communication games despite the artificial, highly constrained, and minimally interactive nature of the task, and (b) RSA accurately captures this behavior. In this work, we reevaluate the contribution of the pragmatic reasoning formalized by RSA in explaining listener behavior by comparing RSA to a baseline literal listener model that is driven only by literal word meaning and the prior probability of referring to an object. Across three experiments we observe only modest evidence of pragmatic behavior in one-shot web-based language games, and only under very limited circumstances. We find that although RSA provides a strong fit to listener responses, it does not perform better than the baseline literal listener model. Our results suggest that while participants playing the role of the Speaker are informative in these one-shot web-based reference games, participants playing the role of the Listener only rarely take this Speaker behavior into account to reason about the intended referent. In addition, we show that RSA’s fit is primarily due to a combination of non-pragmatic factors, perhaps the most surprising of which is that in the majority of conditions that are amenable to pragmatic reasoning, RSA (accurately) predicts that listeners will behave non-pragmatically. This leads us to conclude that RSA’s strong overall correlation with human behavior in one-shot web-based language games does not reflect listeners’ pragmatic reasoning about informative speakers.},
pubstate = {published},
type = {article}
}


Project:   C3

Köhne-Fuetterer, Judith; Drenhaus, Heiner; Delogu, Francesca; Demberg, Vera

The online processing of causal and concessive discourse connectives Journal Article

Linguistics, 59, pp. 417-448, 2021.

While there is a substantial amount of evidence for language processing being a highly incremental and predictive process, we still know relatively little about how top-down, discourse-based expectations are combined with bottom-up information such as discourse connectives. The present article reports on three experiments investigating this question using different methodologies (visual world paradigm and ERPs) in two languages (German and English). We find support for highly incremental processing of causal and concessive discourse connectives, causing anticipation of upcoming material. Our visual world study shows that anticipatory looks depend on the discourse connective; furthermore, the German ERP study revealed an N400 effect on a gender-marked adjective preceding the target noun, when the target noun was inconsistent with the expectations elicited by the combination of context and discourse connective. Moreover, our experiments reveal that the facilitation of downstream material based on earlier connectives comes at the cost of reversing original expectations, as evidenced by a P600 effect on the concessive relative to the causal connective.

@article{koehne2021online,
title = {The online processing of causal and concessive discourse connectives},
author = {Judith K{\"o}hne-Fuetterer and Heiner Drenhaus and Francesca Delogu and Vera Demberg},
url = {https://doi.org/10.1515/ling-2021-0011},
doi = {https://doi.org/10.1515/ling-2021-0011},
year = {2021},
date = {2021-03-04},
journal = {Linguistics},
pages = {417-448},
volume = {59},
number = {2},
abstract = {While there is a substantial amount of evidence for language processing being a highly incremental and predictive process, we still know relatively little about how top-down, discourse-based expectations are combined with bottom-up information such as discourse connectives. The present article reports on three experiments investigating this question using different methodologies (visual world paradigm and ERPs) in two languages (German and English). We find support for highly incremental processing of causal and concessive discourse connectives, causing anticipation of upcoming material. Our visual world study shows that anticipatory looks depend on the discourse connective; furthermore, the German ERP study revealed an N400 effect on a gender-marked adjective preceding the target noun, when the target noun was inconsistent with the expectations elicited by the combination of context and discourse connective. Moreover, our experiments reveal that the facilitation of downstream material based on earlier connectives comes at the cost of reversing original expectations, as evidenced by a P600 effect on the concessive relative to the causal connective.},
pubstate = {published},
type = {article}
}


Projects:   A1 B2 B3

Lemke, Tyll Robin; Schäfer, Lisa; Reich, Ingo

Modeling the predictive potential of extralinguistic context with script knowledge: The case of fragments Journal Article

PLOS ONE, 16, pp. e0246255, 2021.

We describe a novel approach to estimating the predictability of utterances given extralinguistic context in psycholinguistic research. Predictability effects on language production and comprehension are widely attested, but so far predictability has mostly been manipulated through local linguistic context, which is captured with n-gram language models. However, this method does not allow us to investigate predictability effects driven by extralinguistic context. Modeling effects of extralinguistic context is particularly relevant to discourse-initial expressions, which can be predictable even in the absence of any linguistic context. We propose to use script knowledge as an approximation to extralinguistic context. Since the application of script knowledge involves the generation of predictions about upcoming events, we expect that scripts can be used to manipulate the likelihood of linguistic expressions referring to these events. Previous research has shown that script-based discourse expectations modulate the likelihood of linguistic expressions, but script knowledge has often been operationalized with stimuli which were based on researchers’ intuitions and/or expensive production and norming studies. We propose to quantify the likelihood of an utterance based on the probability of the event to which it refers. This probability is calculated with event language models trained on a script knowledge corpus and modulated with probabilistic event chains extracted from the corpus. We use the DeScript corpus of script knowledge to obtain empirically founded estimates of the likelihood of an event to occur in context without having to resort to expensive pre-tests of the stimuli. We exemplify our method with a case study on the usage of nonsentential expressions (fragments), which shows that utterances that are predictable given script-based extralinguistic context are more likely to be reduced.
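
A stripped-down version of the event-language-model idea (toy event sequences standing in for the DeScript corpus; the paper's models and probabilistic event chains are far richer):

from collections import defaultdict

script_runs = [  # invented "buying bread" script instantiations
    ["enter_bakery", "order_bread", "pay", "leave"],
    ["enter_bakery", "order_bread", "pay", "take_change", "leave"],
    ["enter_bakery", "greet_clerk", "order_bread", "pay", "leave"],
]

counts = defaultdict(lambda: defaultdict(int))
for run in script_runs:
    for prev, nxt in zip(run, run[1:]):
        counts[prev][nxt] += 1

def p_event(nxt: str, prev: str) -> float:
    # Maximum-likelihood bigram estimate P(next event | previous event).
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# An utterance referring to a highly probable event may be reduced to a fragment:
print(p_event("pay", "order_bread"))  # 1.0 in this toy corpus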

@article{Lemke2021,
title = {Modeling the predictive potential of extralinguistic context with script knowledge: The case of fragments},
author = {Tyll Robin Lemke and Lisa Sch{\"a}fer and Ingo Reich},
url = {https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0246255},
doi = {https://doi.org/10.1371/journal.pone.0246255},
year = {2021},
date = {2021-02-11},
journal = {PLOS ONE},
pages = {e0246255},
volume = {16},
number = {2},
abstract = {We describe a novel approach to estimating the predictability of utterances given extralinguistic context in psycholinguistic research. Predictability effects on language production and comprehension are widely attested, but so far predictability has mostly been manipulated through local linguistic context, which is captured with n-gram language models. However, this method does not allow us to investigate predictability effects driven by extralinguistic context. Modeling effects of extralinguistic context is particularly relevant to discourse-initial expressions, which can be predictable even in the absence of any linguistic context. We propose to use script knowledge as an approximation to extralinguistic context. Since the application of script knowledge involves the generation of predictions about upcoming events, we expect that scripts can be used to manipulate the likelihood of linguistic expressions referring to these events. Previous research has shown that script-based discourse expectations modulate the likelihood of linguistic expressions, but script knowledge has often been operationalized with stimuli which were based on researchers’ intuitions and/or expensive production and norming studies. We propose to quantify the likelihood of an utterance based on the probability of the event to which it refers. This probability is calculated with event language models trained on a script knowledge corpus and modulated with probabilistic event chains extracted from the corpus. We use the DeScript corpus of script knowledge to obtain empirically founded estimates of the likelihood of an event to occur in context without having to resort to expensive pre-tests of the stimuli. We exemplify our method with a case study on the usage of nonsentential expressions (fragments), which shows that utterances that are predictable given script-based extralinguistic context are more likely to be reduced.},
pubstate = {published},
type = {article}
}


Project:   B3

Brouwer, Harm; Delogu, Francesca; Venhuizen, Noortje; Crocker, Matthew W.

Neurobehavioral Correlates of Surprisal in Language Comprehension: A Neurocomputational Model Journal Article

Frontiers in Psychology, 2021.

Expectation-based theories of language comprehension, in particular Surprisal Theory, go a long way in accounting for the behavioral correlates of word-by-word processing difficulty, such as reading times. An open question, however, is in which component(s) of the Event-Related brain Potential (ERP) signal Surprisal is reflected, and how these electrophysiological correlates relate to behavioral processing indices. Here, we address this question by instantiating an explicit neurocomputational model of incremental, word-by-word language comprehension that produces estimates of the N400 and the P600 – the two most salient ERP components for language processing – as well as estimates of ‘comprehension-centric’ Surprisal for each word in a sentence. We derive model predictions for a recent experimental design that directly investigates ‘world-knowledge’-induced Surprisal. By relating these predictions to both empirical electrophysiological and behavioral results, we establish a close link between Surprisal, as indexed by reading times, and the P600 component of the ERP signal. The resultant model thus offers an integrated neurobehavioral account of processing difficulty in language comprehension.
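
The Surprisal notion linked here to reading times and the P600 is standardly defined word by word; a minimal sketch of that quantity (the model's N400 and P600 estimates are derived from its internal representations and are not reproduced here):

import math

def surprisal_profile(word_probs):
    # S(w_t) = -log2 P(w_t | w_1 .. w_t-1), in bits, per word.
    return [-math.log2(p) for p in word_probs]

# Toy per-word probabilities from an incremental language model:
print([round(s, 2) for s in surprisal_profile([0.2, 0.5, 0.05, 0.6, 0.9])])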

@article{Brouwer2021,
title = {Neurobehavioral Correlates of Surprisal in Language Comprehension: A Neurocomputational Model},
author = {Harm Brouwer and Francesca Delogu and Noortje Venhuizen and Matthew W. Crocker},
url = {https://www.frontiersin.org/articles/10.3389/fpsyg.2021.615538/full},
doi = {https://doi.org/10.3389/fpsyg.2021.615538},
year = {2021},
date = {2021-02-11},
journal = {Frontiers in Psychology},
abstract = {Expectation-based theories of language comprehension, in particular Surprisal Theory, go a long way in accounting for the behavioral correlates of word-by-word processing difficulty, such as reading times. An open question, however, is in which component(s) of the Event-Related brain Potential (ERP) signal Surprisal is reflected, and how these electrophysiological correlates relate to behavioral processing indices. Here, we address this question by instantiating an explicit neurocomputational model of incremental, word-by-word language comprehension that produces estimates of the N400 and the P600 - the two most salient ERP components for language processing - as well as estimates of `comprehension-centric' Surprisal for each word in a sentence. We derive model predictions for a recent experimental design that directly investigates `world-knowledge'-induced Surprisal. By relating these predictions to both empirical electrophysiological and behavioral results, we establish a close link between Surprisal, as indexed by reading times, and the P600 component of the ERP signal. The resultant model thus offers an integrated neurobehavioral account of processing difficulty in language comprehension.},
pubstate = {published},
type = {article}
}


Project:   A1

Teich, Elke; Fankhauser, Peter; Degaetano-Ortlieb, Stefania; Bizzoni, Yuri

Less is More/More Diverse: On The Communicative Utility of Linguistic Conventionalization Journal Article

Benítez-Burraco, Antonio (Ed.): Frontiers in Communication, section Language Sciences, 2021.

We present empirical evidence of the communicative utility of CONVENTIONALIZATION, i.e., convergence in linguistic usage over time, and DIVERSIFICATION, i.e., linguistic items acquiring different, more specific usages/meanings. From a diachronic perspective, conventionalization plays a crucial role in language change as a condition for innovation and grammaticalization (Bybee, 2010; Schmid, 2015) and diversification is a cornerstone in the formation of sublanguages/registers, i.e., functional linguistic varieties (Halliday, 1988; Harris, 1991). While it is widely acknowledged that change in language use is primarily socio-culturally determined, pushing towards greater linguistic expressivity, we here highlight the limiting function of communicative factors on diachronic linguistic variation, showing that conventionalization and diversification are associated with a reduction of linguistic variability. To be able to observe effects of linguistic variability reduction, we first need a well-defined notion of choice in context. Linguistically, this implies the paradigmatic axis of linguistic organization, i.e., the sets of linguistic options available in a given or similar syntagmatic context. Here, we draw on word embeddings, weakly neural distributional language models that have recently been employed to model lexical-semantic change and allow us to approximate the notion of paradigm by neighbourhood in vector space. Second, we need to capture changes in paradigmatic variability, i.e. the reduction/expansion of linguistic options in a given context. As a formal index of paradigmatic variability we use entropy, which measures the contribution of linguistic units (e.g., words) in predicting linguistic choice in bits of information. Using entropy provides us with a link to a communicative interpretation, as it is a well-established measure of communicative efficiency with implications for cognitive processing (Linzen and Jaeger, 2016; Venhuizen et al., 2019); also, entropy is negatively correlated with distance in (word embedding) spaces, which in turn shows cognitive reflexes in certain language processing tasks (Mitchell et al., 2008; Auguste et al., 2017). In terms of domain, we focus on science, looking at the diachronic development of scientific English from the 17th century to modern time. This provides us with a fairly constrained yet dynamic domain of discourse that has witnessed a powerful systematization throughout the centuries and developed specific linguistic conventions geared towards efficient communication. Overall, our study confirms the assumed trends of conventionalization and diversification, shown by diachronically decreasing entropy, interspersed with local, temporary entropy highs pointing to phases of linguistic expansion pertaining primarily to the introduction of new technical terminology.
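
One way to operationalize the entropy-over-paradigms measure sketched in the abstract (our reconstruction, not the authors' published code) is to turn an embedding-space neighborhood into a probability distribution and take its Shannon entropy:

import numpy as np

def entropy(p):
    # Shannon entropy in bits of a discrete distribution.
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def paradigmatic_variability(target_vec, neighbor_vecs):
    # Cosine similarities between a word and its neighbors, softmax-normalized;
    # lower entropy = fewer viable paradigmatic options = more conventionalized.
    sims = neighbor_vecs @ target_vec / (
        np.linalg.norm(neighbor_vecs, axis=1) * np.linalg.norm(target_vec))
    weights = np.exp(sims)
    return entropy(weights / weights.sum())

rng = np.random.default_rng(0)
print(paradigmatic_variability(rng.normal(size=50), rng.normal(size=(10, 50))))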

@article{Teich2021,
title = {Less is More/More Diverse: On The Communicative Utility of Linguistic Conventionalization},
author = {Elke Teich and Peter Fankhauser and Stefania Degaetano-Ortlieb and Yuri Bizzoni},
editor = {Antonio Benítez-Burraco},
url = {https://www.frontiersin.org/articles/10.3389/fcomm.2020.620275/full},
doi = {https://doi.org/10.3389/fcomm.2020.620275},
year = {2021},
date = {2021-01-26},
journal = {Frontiers in Communication, section Language Sciences},
abstract = {We present empirical evidence of the communicative utility of CONVENTIONALIZATION, i.e., convergence in linguistic usage over time, and DIVERSIFICATION, i.e., linguistic items acquiring different, more specific usages/meanings. From a diachronic perspective, conventionalization plays a crucial role in language change as a condition for innovation and grammaticalization (Bybee, 2010; Schmid, 2015) and diversification is a cornerstone in the formation of sublanguages/registers, i.e., functional linguistic varieties (Halliday, 1988; Harris, 1991). While it is widely acknowledged that change in language use is primarily socio-culturally determined, pushing towards greater linguistic expressivity, we here highlight the limiting function of communicative factors on diachronic linguistic variation, showing that conventionalization and diversification are associated with a reduction of linguistic variability. To be able to observe effects of linguistic variability reduction, we first need a well-defined notion of choice in context. Linguistically, this implies the paradigmatic axis of linguistic organization, i.e., the sets of linguistic options available in a given or similar syntagmatic context. Here, we draw on word embeddings, weakly neural distributional language models that have recently been employed to model lexical-semantic change and allow us to approximate the notion of paradigm by neighbourhood in vector space. Second, we need to capture changes in paradigmatic variability, i.e. the reduction/expansion of linguistic options in a given context. As a formal index of paradigmatic variability we use entropy, which measures the contribution of linguistic units (e.g., words) in predicting linguistic choice in bits of information. Using entropy provides us with a link to a communicative interpretation, as it is a well-established measure of communicative efficiency with implications for cognitive processing (Linzen and Jaeger, 2016; Venhuizen et al., 2019); also, entropy is negatively correlated with distance in (word embedding) spaces, which in turn shows cognitive reflexes in certain language processing tasks (Mitchell et al., 2008; Auguste et al., 2017). In terms of domain, we focus on science, looking at the diachronic development of scientific English from the 17th century to modern time. This provides us with a fairly constrained yet dynamic domain of discourse that has witnessed a powerful systematization throughout the centuries and developed specific linguistic conventions geared towards efficient communication. Overall, our study confirms the assumed trends of conventionalization and diversification, shown by diachronically decreasing entropy, interspersed with local, temporary entropy highs pointing to phases of linguistic expansion pertaining primarily to the introduction of new technical terminology.},
pubstate = {published},
type = {article}
}


Project:   B1

Kudera, Jacek; Tavi, Lauri; Möbius, Bernd; Avgustinova, Tania; Klakow, Dietrich

The effect of surprisal on articulatory gestures in Polish consonant-to-vowel transitions: A pilot EMA study Inproceedings

14. ITG-Konferenz, ITG-Fachbericht 298: Speech Communication, pp. 179-183, Kiel, Germany, 2021, ISBN 978-3-8007-5627-8.

This study is concerned with the relation between the information-theoretic notion of surprisal and articulatory gesture in Polish consonant-to-vowel transitions. It addresses the question of the influence of diphone predictability on spectral trajectories and articulatory gestures by relating the effect of surprisal to motor fluency. The study combines the computation of locus equations (LE) with kinematic data obtained with an electromagnetic articulograph (EMA). The kinematic and acoustic data showed that a small coarticulation effect was present in the high- and low-surprisal clusters. Regardless of some small discrepancies across the measures, a high degree of overlap of adjacent segments is reported for the mid-surprisal group in both domains. Two explanations of the observed effect are proposed. The first refers to low-surprisal coarticulation resistance and suggests the need to disambiguate predictable sequences. The second, observed in high-surprisal clusters, refers to the prominence given to emphasize the unexpected concatenation.
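
Locus equations, one of the two measures combined here, are linear fits of F2 at the consonant-vowel transition onset against F2 at the vowel midpoint; a minimal sketch with made-up formant values (the slope indexes the degree of coarticulation):

import numpy as np

f2_mid = np.array([1200., 1500., 1800., 2100.])    # F2 at vowel midpoint (Hz)
f2_onset = np.array([1400., 1550., 1700., 1850.])  # F2 at CV-transition onset (Hz)

slope, intercept = np.polyfit(f2_mid, f2_onset, deg=1)
print(f"F2_onset = {slope:.2f} * F2_mid + {intercept:.0f}")  # steeper slope = more coarticulation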

@inproceedings{Kudera/etal:2021c,
title = {The effect of surprisal on articulatory gestures in Polish consonant-to-vowel transitions: A pilot EMA study},
author = {Jacek Kudera and Lauri Tavi and Bernd M{\"o}bius and Tania Avgustinova and Dietrich Klakow},
url = {https://ieeexplore.ieee.org/document/9657527},
year = {2021},
date = {2021},
booktitle = {14. ITG-Konferenz, ITG-Fachbericht 298: Speech Communication},
isbn = {978-3-8007-5627-8},
pages = {179-183},
address = {Kiel, Germany},
abstract = {This study is concerned with the relation between the information-theoretic notion of surprisal and articulatory gesture in Polish consonant-to-vowel transitions. It addresses the question of the influence of diphone predictability on spectral trajectories and articulatory gestures by relating the effect of surprisal to motor fluency. The study combines the computation of locus equations (LE) with kinematic data obtained with an electromagnetic articulograph (EMA). The kinematic and acoustic data showed that a small coarticulation effect was present in the high- and low-surprisal clusters. Regardless of some small discrepancies across the measures, a high degree of overlap of adjacent segments is reported for the mid-surprisal group in both domains. Two explanations of the observed effect are proposed. The first refers to low-surprisal coarticulation resistance and suggests the need to disambiguate predictable sequences. The second, observed in high-surprisal clusters, refers to the prominence given to emphasize the unexpected concatenation.},
pubstate = {published},
type = {inproceedings}
}


Project:   C4

Kudera, Jacek; Georgis, Philip; Möbius, Bernd; Avgustinova, Tania; Klakow, Dietrich

Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic Inproceedings

Proc. Interspeech, pp. 3944-3948, 2021.

This study reveals the relation between surprisal, phonetic distance, and latency based on a multilingual, short-term priming framework. Four Slavic languages (Bulgarian, Czech, Polish, and Russian) are investigated across two priming conditions: associative and phonetic priming, involving true cognates and near-homophones, respectively. This research is grounded in the methodology of information theory and proposes new methods for quantifying differences between meaningful lexical primes and targets for closely related languages. It also outlines the influence of phonetic distance between cognate and noncognate pairs of primes and targets on response times in a cross-lingual lexical decision task. The experimental results show that phonetic distance moderates response times only in Polish and Czech, whereas the surprisal-based correspondence effect is an accurate predictor of latency for all tested languages. The information-theoretic approach of quantifying feature-based alternations between Slavic cognates and near-homophones appears to be a valid method for latency moderation in the auditory modality. The outcomes of this study suggest that the surprisal-based (un)expectedness of spoken stimuli is an accurate predictor of human performance in multilingual lexical decision tasks.
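
The paper quantifies feature-based alternations between cognates; as a much cruder stand-in, plain edit distance over phone strings already illustrates the kind of prime-target distance at issue (illustrative only, not the paper's measure):

def levenshtein(a: str, b: str) -> int:
    # Unweighted edit distance; a feature-weighted version would replace
    # the fixed substitution cost with a phonological feature distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Polish vs. Russian 'milk' as rough phone sequences:
print(levenshtein("mlɛkɔ", "məlako"))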

@inproceedings{kudera21_interspeech,
title = {Phonetic Distance and Surprisal in Multilingual Priming: Evidence from Slavic},
author = {Jacek Kudera and Philip Georgis and Bernd M{\"o}bius and Tania Avgustinova and Dietrich Klakow},
url = {https://www.isca-speech.org/archive/interspeech_2021/kudera21_interspeech.html},
doi = {https://doi.org/10.21437/Interspeech.2021-1003},
year = {2021},
date = {2021},
booktitle = {Proc. Interspeech},
pages = {3944-3948},
abstract = {This study reveals the relation between surprisal, phonetic distance, and latency based on a multilingual, short-term priming framework. Four Slavic languages (Bulgarian, Czech, Polish, and Russian) are investigated across two priming conditions: associative and phonetic priming, involving true cognates and near-homophones, respectively. This research is grounded in the methodology of information theory and proposes new methods for quantifying differences between meaningful lexical primes and targets for closely related languages. It also outlines the influence of phonetic distance between cognate and noncognate pairs of primes and targets on response times in a cross-lingual lexical decision task. The experimental results show that phonetic distance moderates response times only in Polish and Czech, whereas the surprisal-based correspondence effect is an accurate predictor of latency for all tested languages. The information-theoretic approach of quantifying feature-based alternations between Slavic cognates and near-homophones appears to be a valid method for latency moderation in the auditory modality. The outcomes of this study suggest that the surprisal-based (un)expectedness of spoken stimuli is an accurate predictor of human performance in multilingual lexical decision tasks.},
pubstate = {published},
type = {inproceedings}
}


Project:   C4

Abdullah, Badr M.; Zaitova, Iuliia; Avgustinova, Tania; Möbius, Bernd; Klakow, Dietrich

How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings Inproceedings

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, pp. 407-419, 2021.

How do neural networks “perceive” speech sounds from unknown languages? Does the typological similarity between the model’s training language (L1) and an unknown language (L2) have an impact on the model representations of L2 speech signals? To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWEs)—vector representations of variable-duration spoken-word segments. First, we train monolingual AWE models on seven Indo-European languages with various degrees of typological similarity. We then employ RSA to quantify the cross-lingual similarity by simulating native and non-native spoken-word processing using AWEs. Our experiments show that typological similarity indeed affects the representational similarity of the models in our study. We further discuss the implications of our work on modeling speech processing and language similarity with neural networks.
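
Representational similarity analysis itself is compact enough to sketch: build pairwise-dissimilarity matrices for the same word list under two models and correlate their condensed triangles (random vectors below stand in for the paper's trained AWE models):

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
emb_l1 = rng.normal(size=(20, 64))                      # AWEs of 20 words, model 1
emb_l2 = emb_l1 + rng.normal(scale=0.5, size=(20, 64))  # a "typologically close" model 2

rdm_l1 = pdist(emb_l1, metric="cosine")  # representational dissimilarity matrices
rdm_l2 = pdist(emb_l2, metric="cosine")
rho, _ = spearmanr(rdm_l1, rdm_l2)
print(f"cross-model representational similarity: rho = {rho:.2f}")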

@inproceedings{abdullah-etal-2021-familiar,
title = {How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings},
author = {Badr M. Abdullah and Iuliia Zaitova and Tania Avgustinova and Bernd M{\"o}bius and Dietrich Klakow},
url = {https://aclanthology.org/2021.blackboxnlp-1.32/},
doi = {https://doi.org/10.18653/v1/2021.blackboxnlp-1.32},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP},
pages = {407-419},
publisher = {Association for Computational Linguistics},
abstract = {How do neural networks “perceive” speech sounds from unknown languages? Does the typological similarity between the model’s training language (L1) and an unknown language (L2) have an impact on the model representations of L2 speech signals? To answer these questions, we present a novel experimental design based on representational similarity analysis (RSA) to analyze acoustic word embeddings (AWEs)—vector representations of variable-duration spoken-word segments. First, we train monolingual AWE models on seven Indo-European languages with various degrees of typological similarity. We then employ RSA to quantify the cross-lingual similarity by simulating native and non-native spoken-word processing using AWEs. Our experiments show that typological similarity indeed affects the representational similarity of the models in our study. We further discuss the implications of our work on modeling speech processing and language similarity with neural networks.},
pubstate = {published},
type = {inproceedings}
}


Project:   C4

Zouhar, Vilém; Mosbach, Marius; Biswas, Debanjali; Klakow, Dietrich

Artefact Retrieval: Overview of NLP Models with Knowledge Base Access Inproceedings

Workshop on Commonsense Reasoning and Knowledge Bases, 2021.

Many NLP models gain performance by having access to a knowledge base. A lot of research has been devoted to devising and improving the way the knowledge base is accessed and incorporated into the model, resulting in a number of mechanisms and pipelines. Despite the diversity of proposed mechanisms, there are patterns in the designs of such systems. In this paper, we systematically describe the typology of *artefacts* (items retrieved from a knowledge base), retrieval mechanisms and the way these artefacts are *fused* into the model. This further allows us to uncover combinations of design decisions that had not yet been tried. Most of the focus is given to language models, though we also show how question answering, fact-checking and knowledgeable dialogue models fit into this system as well. Having an abstract model which can describe the architecture of specific models also helps with transferring these architectures between multiple NLP tasks.
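
In the paper's terms, a system is characterized by which artefact is retrieved and where it is fused into the model; a schematic retrieve-then-fuse pipeline (toy knowledge base; the function names and scoring are ours):

knowledge_base = {
    "Saarbrücken": "Saarbrücken is the capital of Saarland.",
    "Berlin": "Berlin is the capital of Germany.",
}

def retrieve(query: str) -> str:
    # Retrieval mechanism: return the entry whose key overlaps the query most.
    return max(knowledge_base, key=lambda k: sum(w in query for w in k.split()))

def fuse(query: str, artefact: str) -> str:
    # Fusion step: plain input concatenation; other designs in the typology
    # fuse at the embedding or hidden-state level instead.
    return f"context: {artefact} question: {query}"

query = "Which state is Saarbrücken the capital of?"
print(fuse(query, knowledge_base[retrieve(query)]))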

@inproceedings{zouhar2021artefact,
title = {Artefact Retrieval: Overview of NLP Models with Knowledge Base Access},
author = {Vil{\'e}m Zouhar and Marius Mosbach and Debanjali Biswas and Dietrich Klakow},
url = {https://arxiv.org/abs/2201.09651},
year = {2021},
date = {2021},
booktitle = {Workshop on Commonsense Reasoning and Knowledge Bases},
abstract = {Many NLP models gain performance by having access to a knowledge base. A lot of research has been devoted to devising and improving the way the knowledge base is accessed and incorporated into the model, resulting in a number of mechanisms and pipelines. Despite the diversity of proposed mechanisms, there are patterns in the designs of such systems. In this paper, we systematically describe the typology of *artefacts* (items retrieved from a knowledge base), retrieval mechanisms and the way these artefacts are *fused* into the model. This further allows us to uncover combinations of design decisions that had not yet been tried. Most of the focus is given to language models, though we also show how question answering, fact-checking and knowledgeable dialogue models fit into this system as well. Having an abstract model which can describe the architecture of specific models also helps with transferring these architectures between multiple NLP tasks.},
pubstate = {published},
type = {inproceedings}
}


Project:   B4

Hoek, Jet; Scholman, Merel; Sanders, Ted J. M.

Is there less agreement when the discourse is underspecified? Inproceedings

Proceedings of the Integrating Perspectives on Discourse Annotation (DiscAnn) Workshop, University of Tübingen, Germany, 2021.

When annotating coherence relations, inter-annotator agreement tends to be lower on implicit relations than on relations that are explicitly marked by means of a connective or a cue phrase. This paper explores one possible explanation for this: the additional inferencing involved in interpreting implicit relations compared to explicit relations. If this is the main source of disagreements, agreement should be highly related to the specificity of the connective. Using the CCR framework, we annotated relations from TED talks that were marked by a very specific marker, marked by a highly ambiguous connective, or not marked by means of a connective at all. We indeed reached higher inter-annotator agreement on explicit than on implicit relations. However, agreement on underspecified relations was not necessarily in between, which is what would be expected if agreement on implicit relations mainly suffers because annotators have less specific instructions for inferring the relation.
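
Agreement claims like these rest on a chance-corrected statistic; Cohen's kappa is the usual choice for two annotators (standard formula, shown for concreteness; the paper's exact measure is not named in the abstract):

from collections import Counter

def cohens_kappa(ann1, ann2) -> float:
    n = len(ann1)
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n  # raw agreement
    c1, c2 = Counter(ann1), Counter(ann2)
    expected = sum(c1[k] * c2[k] for k in c1) / n ** 2      # chance agreement
    return (observed - expected) / (1 - expected)

# Toy labels for five relations (CAU = causal, CON = concessive):
print(cohens_kappa(["CAU", "CAU", "CON", "CAU", "CON"],
                   ["CAU", "CON", "CON", "CAU", "CON"]))  # ~0.62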

@inproceedings{hoek-etal-2021-discann,
title = {Is there less agreement when the discourse is underspecified?},
author = {Jet Hoek and Merel Scholman and Ted J. M. Sanders},
url = {https://aclanthology.org/2021.discann-1.1/},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Integrating Perspectives on Discourse Annotation (DiscAnn) Workshop},
address = {University of T{\"u}bingen, Germany},
abstract = {When annotating coherence relations, inter-annotator agreement tends to be lower on implicit relations than on relations that are explicitly marked by means of a connective or a cue phrase. This paper explores one possible explanation for this: the additional inferencing involved in interpreting implicit relations compared to explicit relations. If this is the main source of disagreements, agreement should be highly related to the specificity of the connective. Using the CCR framework, we annotated relations from TED talks that were marked by a very specific marker, marked by a highly ambiguous connective, or not marked by means of a connective at all. We indeed reached higher inter-annotator agreement on explicit than on implicit relations. However, agreement on underspecified relations was not necessarily in between, which is what would be expected if agreement on implicit relations mainly suffers because annotators have less specific instructions for inferring the relation.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Yung, Frances Pik Yu; Scholman, Merel; Demberg, Vera

A practical perspective on connective generation Inproceedings

Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI), Association for Computational Linguistics, pp. 72-83, Punta Cana, Dominican Republic and Online, 2021.

In data-driven natural language generation, we typically know what relation should be expressed and need to select a connective to lexicalize it. In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model). Comparing these methods to the distributions of connective choices from a human connective insertion task, we find mixed results: for some relations, it is acceptable to lexicalize them using any of the connectives that mark this relation. However, for other relations (temporals, concessives) either a more detailed relation distinction needs to be introduced, or a more sophisticated connective choice module would be necessary.
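
The simplest of the compared methods is directly sketchable (the relation-to-connective inventory below is abridged and ours):

import random

CONNECTIVES = {
    "cause": ["because", "since", "as"],
    "concession": ["although", "even though", "but"],
    "temporal": ["then", "afterwards", "when"],
}

def generate_connective(relation: str) -> str:
    # Random choice among connectives known to express the relation.
    return random.choice(CONNECTIVES[relation])

random.seed(0)
print(generate_connective("cause"))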

@inproceedings{yung-etal-2021-practical,
title = {A practical perspective on connective generation},
author = {Frances Pik Yu Yung and Merel Scholman and Vera Demberg},
url = {https://aclanthology.org/2021.codi-main.7},
doi = {https://doi.org/10.18653/v1/2021.codi-main.7},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI)},
pages = {72-83},
publisher = {Association for Computational Linguistics},
address = {Punta Cana, Dominican Republic and Online},
abstract = {In data-driven natural language generation, we typically know what relation should be expressed and need to select a connective to lexicalize it. In the current contribution, we analyse whether a sophisticated connective generation module is necessary to select a connective, or whether this can be solved with simple methods (such as random choice between connectives that are known to express a given relation, or usage of a generic language model). Comparing these methods to the distributions of connective choices from a human connective insertion task, we find mixed results: for some relations, it is acceptable to lexicalize them using any of the connectives that mark this relation. However, for other relations (temporals, concessives) either a more detailed relation distinction needs to be introduced, or a more sophisticated connective choice module would be necessary.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Scholman, Merel; Dong, Tianai; Yung, Frances Pik Yu; Demberg, Vera

Comparison of methods for explicit discourse connective identification across various domains Inproceedings

Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI), Association for Computational Linguistics, pp. 95-106, Punta Cana, Dominican Republic and Online, 2021.

Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles. We here assess the performance on explicit connective identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize to different datasets, namely written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB) and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parse methods in all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and connectives that prove difficult to parse, in order to highlight the areas where gains can be made.
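
The "simple heuristic" evaluated alongside the parsers can be approximated by longest-match lookup against a connective lexicon (lexicon abridged; the paper's heuristic may differ in detail):

CONNECTIVE_LEXICON = sorted(
    ["because", "even though", "but", "however", "as a result", "then"],
    key=len, reverse=True)  # try longer connectives first

def find_connectives(tokens):
    text, found, i = [t.lower() for t in tokens], [], 0
    while i < len(text):
        for conn in CONNECTIVE_LEXICON:
            parts = conn.split()
            if text[i:i + len(parts)] == parts:
                found.append((i, conn))
                i += len(parts) - 1  # skip the rest of a multiword match
                break
        i += 1
    return found

print(find_connectives("He stayed home because it rained".split()))  # [(3, 'because')]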

@inproceedings{scholman-etal-2021-comparison,
title = {Comparison of methods for explicit discourse connective identification across various domains},
author = {Merel Scholman and Tianai Dong and Frances Pik Yu Yung and Vera Demberg},
url = {https://aclanthology.org/2021.codi-main.9},
doi = {https://doi.org/10.18653/v1/2021.codi-main.9},
year = {2021},
date = {2021},
booktitle = {Proceedings of the Second Workshop on Computational Approaches to Discourse (CODI)},
pages = {95-106},
publisher = {Association for Computational Linguistics},
address = {Punta Cana, Dominican Republic and Online},
abstract = {Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently evaluated in comparison to each other, nor have they been evaluated consistently on text other than newspaper articles. We here assess the performance on explicit connective identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize to different datasets, namely written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB) and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parse methods in all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and connectives that prove difficult to parse, in order to highlight the areas where gains can be made.},
pubstate = {published},
type = {inproceedings}
}


Project:   B2

Lemke, Tyll Robin

Satzäquivalente — Syntax oder Pragmatik? Incollection

Külpmann, Robert; Finkbeiner, Rita (Ed.): Neues zur Selbstständigkeit von Sätzen, Linguistische Berichte, Sonderheft, Buske, pp. 81-104, Hamburg, 2021, ISBN 978-3-96769-170-2.

"Sentence equivalents" (Satzäquivalente) appear to present a contradiction between syntax and pragmatics, since despite their non-sentential form they fulfill the same functions as sentences. We present two experiments that test predictions of two theoretical perspectives on these expressions. On the one hand, elliptical accounts (Morgan, 1973; Merchant, 2004; Reich, 2007) generate sentence equivalents from complete sentences via ellipsis; on the other hand, non-sentential accounts (Barton & Progovac, 2005; Stainton, 2006) propose that the syntax can generate subsentential expressions directly.

@incollection{Lemke2021a,
title = {Satz{\"a}quivalente — Syntax oder Pragmatik?},
author = {Tyll Robin Lemke},
editor = {Robert K{\"u}lpmann and Rita Finkbeiner},
url = {https://buske.de/zeitschriften-bei-sonderhefte/linguistische-berichte-sonderhefte/neues-zur-selbststandigkeit-von-satzen-16620.html},
doi = {https://doi.org/10.46771/978-3-96769-170-2},
year = {2021},
date = {2021},
booktitle = {Neues zur Selbstst{\"a}ndigkeit von S{\"a}tzen},
isbn = {978-3-96769-170-2},
pages = {81-104},
publisher = {Buske},
address = {Hamburg},
abstract = {"Satz{\"a}quivalente" scheinen einen Widerspruch zwischen Syntax und Pragmatik darzustellen, da sie trotz nichtsententialer Form die selben Funktionen wie S{\"a}tze erf{\"u}llen. Wir stellen zwei Experimente vor, die Vorhersagen zweier theoretischer Perspektiven auf diese Ausdr{\"u}cke untersuchen. Einerseits generieren elliptische Ans{\"a}tze (Morgan, 1973; Merchant, 2004; Reich, 2007) Satz{\"a}quivalente mittels Ellipse aus vollst{\"a}ndigen S{\"a}tzen, andererseits schlagen nichtsententiale Ans{\"a}tze (Barton & Progovac, 2005; Stainton, 2006) vor, dass die Syntax subsententiale Ausdr{\"u}cke generieren kann.},
pubstate = {published},
type = {incollection}
}


Project:   B3

Lemke, Tyll Robin

Experimental investigations on the syntax and usage of fragments Miscellaneous

Experimental investigations on the syntax and usage of fragments, Open Germanic Linguistics, Language Science Press, Berlin, 2021.

This book investigates the syntax and usage of fragments (Morgan 1973), apparently subsentential utterances like “A coffee, please!” which fulfill the same communicative function as the corresponding full sentence “I’d like to have a coffee, please!”. Even though such utterances are frequently used, they challenge the central role that has been attributed to the notion of sentence in linguistic theory, particularly from a semantic perspective.

The first part of the book is dedicated to the syntactic analysis of fragments, which is investigated with experimental methods. Currently there are several competing theoretical analyses of fragments, which rely almost exclusively on introspective data. The experiments presented in this book constitute a first systematic evaluation of some of their crucial predictions and, taken together, support an in situ ellipsis account of fragments, as has been suggested by Reich (2007).

The second part of the book addresses the questions of why fragments are used at all, and under which circumstances they are preferred over complete sentences. Syntactic accounts impose licensing conditions on fragments, but they do not explain why fragments are sometimes (dis)preferred provided that their usage is licensed. This book proposes an information-theoretic account of fragments, which predicts that the usage of fragments is constrained by a general tendency to distribute processing effort uniformly across the utterance. With respect to fragments, this leads to two predictions, which are empirically confirmed: Speakers tend towards omitting predictable words and they insert additional redundancy before unpredictable words.

@miscellaneous{Lemke2021,
title = {Experimental investigations on the syntax and usage of fragments},
author = {Tyll Robin Lemke},
url = {https://langsci-press.org/catalog/book/321},
doi = {https://doi.org/10.5281/zenodo.5596236},
year = {2021},
date = {2021},
booktitle = {Experimental investigations on the syntax and usage of fragments},
publisher = {Language Science Press},
address = {Berlin},
abstract = {This book investigates the syntax and usage of fragments (Morgan 1973), apparently subsentential utterances like "A coffee, please!" which fulfill the same communicative function as the corresponding full sentence "I'd like to have a coffee, please!". Even though such utterances are frequently used, they challenge the central role that has been attributed to the notion of sentence in linguistic theory, particularly from a semantic perspective. The first part of the book is dedicated to the syntactic analysis of fragments, which is investigated with experimental methods. Currently there are several competing theoretical analyses of fragments, which rely almost exclusively on introspective data. The experiments presented in this book constitute a first systematic evaluation of some of their crucial predictions and, taken together, support an in situ ellipsis account of fragments, as has been suggested by Reich (2007). The second part of the book addresses the questions of why fragments are used at all, and under which circumstances they are preferred over complete sentences. Syntactic accounts impose licensing conditions on fragments, but they do not explain why fragments are sometimes (dis)preferred provided that their usage is licensed. This book proposes an information-theoretic account of fragments, which predicts that the usage of fragments is constrained by a general tendency to distribute processing effort uniformly across the utterance. With respect to fragments, this leads to two predictions, which are empirically confirmed: Speakers tend towards omitting predictable words and they insert additional redundancy before unpredictable words.},
pubstate = {published},
type = {miscellaneous}
}


Project:   B3

Kalimuthu, Marimuthu; Mogadala, Aditya; Mosbach, Marius; Klakow, Dietrich

Fusion Models for Improved Image Captioning Inproceedings

Pattern Recognition. ICPR International Workshops and Challenges, pp. 381-395, Cham, 2020.

Visual captioning aims to generate textual descriptions given images or videos. Traditionally, image captioning models are trained on human-annotated datasets such as Flickr30k and MS-COCO, which are limited in size and diversity. This limitation hinders the generalization capabilities of these models while also rendering them liable to making mistakes. Language models can, however, be trained on vast amounts of freely available unlabelled data and have recently emerged as successful language encoders and coherent text generators. Meanwhile, several unimodal and multimodal fusion techniques have been proven to work well for natural language generation and automatic speech recognition. Building on these recent developments, and with the aim of improving the quality of generated captions, the contribution of our work in this paper is two-fold: First, we propose a generic multimodal model fusion framework for caption generation as well as emendation where we utilize different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) within the traditional encoder-decoder visual captioning frameworks. Next, we employ the same fusion strategies to integrate a pretrained Masked Language Model (MLM), namely BERT, with a visual captioning model, viz. Show, Attend, and Tell, for emending both syntactic and semantic errors in captions. Our caption emendation experiments on three benchmark image captioning datasets, viz. Flickr8k, Flickr30k, and MS-COCO, show improvements over the baseline, indicating the usefulness of our proposed multimodal fusion strategies. Further, we perform a preliminary qualitative analysis on the emended captions and identify error categories based on the type of corrections.
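
A gated fusion of a captioning-decoder hidden state with an AuxLM hidden state, in the spirit of the strategies compared here (dimensions, gating, and names are illustrative; the paper evaluates several concrete variants):

import torch
import torch.nn as nn

class FusionLayer(nn.Module):
    def __init__(self, dec_dim, lm_dim, hidden_dim, vocab_size):
        super().__init__()
        self.gate = nn.Linear(dec_dim + lm_dim, lm_dim)
        self.proj = nn.Linear(dec_dim + lm_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, h_dec, h_lm):
        # Let the decoder state decide how much of the language model to let in.
        g = torch.sigmoid(self.gate(torch.cat([h_dec, h_lm], dim=-1)))
        fused = torch.cat([h_dec, g * h_lm], dim=-1)
        return self.out(torch.relu(self.proj(fused)))

layer = FusionLayer(dec_dim=512, lm_dim=768, hidden_dim=256, vocab_size=10000)
logits = layer(torch.randn(1, 512), torch.randn(1, 768))
print(logits.shape)  # torch.Size([1, 10000]) -> next-word distribution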

@inproceedings{Kalimuthu2021fusion,
title = {Fusion Models for Improved Image Captioning},
author = {Marimuthu Kalimuthu and Aditya Mogadala and Marius Mosbach and Dietrich Klakow},
url = {https://arxiv.org/abs/2010.15251},
doi = {https://doi.org/10.1007/978-3-030-68780-9_32},
year = {2020},
date = {2020},
booktitle = {Pattern Recognition. ICPR International Workshops and Challenges},
pages = {381-395},
address = {Cham},
abstract = {Visual captioning aims to generate textual descriptions given images or videos. Traditionally, image captioning models are trained on human-annotated datasets such as Flickr30k and MS-COCO, which are limited in size and diversity. This limitation hinders the generalization capabilities of these models while also rendering them liable to making mistakes. Language models can, however, be trained on vast amounts of freely available unlabelled data and have recently emerged as successful language encoders and coherent text generators. Meanwhile, several unimodal and multimodal fusion techniques have been proven to work well for natural language generation and automatic speech recognition. Building on these recent developments, and with the aim of improving the quality of generated captions, the contribution of our work in this paper is two-fold: First, we propose a generic multimodal model fusion framework for caption generation as well as emendation where we utilize different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) within the traditional encoder-decoder visual captioning frameworks. Next, we employ the same fusion strategies to integrate a pretrained Masked Language Model (MLM), namely BERT, with a visual captioning model, viz. Show, Attend, and Tell, for emending both syntactic and semantic errors in captions. Our caption emendation experiments on three benchmark image captioning datasets, viz. Flickr8k, Flickr30k, and MS-COCO, show improvements over the baseline, indicating the usefulness of our proposed multimodal fusion strategies. Further, we perform a preliminary qualitative analysis on the emended captions and identify error categories based on the type of corrections.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   B4

Mogadala, Aditya; Mosbach, Marius; Klakow, Dietrich

Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation Inproceedings

Bridge Between Perception and Reasoning: Graph Neural Networks & Beyond, Workshop at ICML, 2020.

Generating longer textual sequences conditioned on visual information is an interesting problem to explore. The challenge goes beyond standard vision-conditioned sentence-level generation (e.g., image or video captioning), as it requires producing a brief and coherent story describing the visual content. In this paper, we cast this Vision-to-Sequence task as a Graph-to-Sequence learning problem and approach it with the Transformer architecture. Specifically, we introduce the Sparse Graph-to-Sequence Transformer (SGST) for encoding the graph and decoding a sequence. The encoder aims to directly encode graph-level semantics, while the decoder is used to generate longer sequences. Experiments conducted with the benchmark image paragraph dataset show that our proposed approach achieves a 13.3% improvement on the CIDEr evaluation measure compared to the previous state-of-the-art approach.
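
For illustration, a minimal sketch of the core mechanism the abstract describes: Transformer-style self-attention restricted to graph edges ("sparse" attention), so the encoder attends only over neighbouring nodes. The masking scheme, dimensions, and toy graph are illustrative assumptions; SGST's actual design may differ.

# Hypothetical single-head attention masked by a graph adjacency matrix.
import torch
import torch.nn.functional as F

def sparse_graph_attention(x, adj):
    """Self-attention over node features x, restricted to edges in adj.

    x:   (num_nodes, dim) node embeddings
    adj: (num_nodes, num_nodes) 0/1 adjacency; 1 means "may attend"
    """
    scores = x @ x.T / x.shape[-1] ** 0.5                 # scaled dot products
    scores = scores.masked_fill(adj == 0, float("-inf"))  # drop non-edges
    weights = F.softmax(scores, dim=-1)
    return weights @ x                                    # neighbour aggregation

# Toy graph: 4 nodes with self-loops plus a chain 0-1-2-3.
adj = torch.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
out = sparse_graph_attention(torch.randn(4, 16), adj)
print(out.shape)  # torch.Size([4, 16])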

@inproceedings{mogadala2020sparse,
title = {Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation},
author = {Aditya Mogadala and Marius Mosbach and Dietrich Klakow},
url = {https://arxiv.org/abs/2007.06077},
year = {2020},
date = {2020},
booktitle = {Bridge Between Perception and Reasoning: Graph Neural Networks \& Beyond, Workshop at ICML},
abstract = {Generating longer textual sequences conditioned on visual information is an interesting problem to explore. The challenge goes beyond standard vision-conditioned sentence-level generation (e.g., image or video captioning), as it requires producing a brief and coherent story describing the visual content. In this paper, we cast this Vision-to-Sequence task as a Graph-to-Sequence learning problem and approach it with the Transformer architecture. Specifically, we introduce the Sparse Graph-to-Sequence Transformer (SGST) for encoding the graph and decoding a sequence. The encoder aims to directly encode graph-level semantics, while the decoder is used to generate longer sequences. Experiments conducted with the benchmark image paragraph dataset show that our proposed approach achieves a 13.3% improvement on the CIDEr evaluation measure compared to the previous state-of-the-art approach.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   B4

Ferber, Patrick; Hoffmann, Jörg; Helmert, Malte

Neural network heuristics for classical planning: A study of hyperparameter space Inproceedings

24th European Conference on Artificial Intelligence (ECAI’20), 2020.

Neural networks (NN) have been shown to be powerful state-value predictors in several complex games. Can similar successes be achieved in classical planning? Towards a systematic exploration of that question, we contribute a study of hyperparameter space in the most canonical setup: input = state, feed-forward NN, supervised learning, generalization only over initial state. We investigate a broad range of hyperparameters pertaining to NN design and training. We evaluate these techniques through their use as heuristic functions in Fast Downward. The results on IPC benchmarks show that highly competitive heuristics can be learned, yielding substantially smaller search spaces than standard techniques on some domains. But the heuristic functions are costly to evaluate, and the range of domains where useful heuristics are learned is limited. Our study provides the basis for further research improving on current weaknesses.
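
For illustration, a minimal sketch of the canonical setup studied here: a feed-forward network regressing from a binary-encoded planning state to an estimated cost-to-go, trained by supervised learning. Layer sizes, the state encoding, and the random training pairs are stand-in assumptions; in practice the targets would come from solved instances, and the learned function would be plugged into a planner such as Fast Downward as a heuristic.

# Hypothetical supervised training of a feed-forward heuristic h(s).
import torch
import torch.nn as nn

state_dim = 64  # assumed size of the propositional state encoding
model = nn.Sequential(
    nn.Linear(state_dim, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),  # scalar cost-to-go estimate
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Random stand-ins for (state, cost-to-go) pairs from solved instances.
states = torch.randint(0, 2, (256, state_dim)).float()
costs = torch.rand(256, 1) * 50

for _ in range(100):  # brief regression loop
    opt.zero_grad()
    loss = loss_fn(model(states), costs)
    loss.backward()
    opt.step()

print(f"h(s) = {model(states[:1]).item():.2f}")  # heuristic for one state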

@inproceedings{Ferber2020network,
title = {Neural network heuristics for classical planning: A study of hyperparameter space},
author = {Patrick Ferber and J{\"o}rg Hoffmann and Malte Helmert},
url = {https://ecai2020.eu/papers/433_paper.pdf},
year = {2020},
date = {2020},
booktitle = {24th European Conference on Artificial Intelligence (ECAI’20)},
abstract = {Neural networks (NN) have been shown to be powerful state-value predictors in several complex games. Can similar successes be achieved in classical planning? Towards a systematic exploration of that question, we contribute a study of hyperparameter space in the most canonical setup: input = state, feed-forward NN, supervised learning, generalization only over initial state. We investigate a broad range of hyperparameters pertaining to NN design and training. We evaluate these techniques through their use as heuristic functions in Fast Downward. The results on IPC benchmarks show that highly competitive heuristics can be learned, yielding substantially smaller search spaces than standard techniques on some domains. But the heuristic functions are costly to evaluate, and the range of domains where useful heuristics are learned is limited. Our study provides the basis for further research improving on current weaknesses.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   A7

Mecklinger, Axel; Bader, Regine

From fluency to recognition decisions: A broader view of familiarity-based remembering Journal Article

Neuropsychologia, 146, pp. 107527, 2020.

The goal of this article is to critically examine current claims and assumptions about the FN400, an event-related potential (ERP) component that has been related to familiarity memory, though some uncertainty exists regarding the cognitive processes captured by the FN400. It is proposed that familiarity can be multiply determined and that an important distinction has to be made between a recent-exposure, relative familiarity mechanism indexed by the FN400 and an absolute/baseline familiarity mechanism being reflected by a coincidental but topographically distinct ERP effect. We suggest a broader conceptualization of the memory processes reflected by the FN400 and propose an unexpected-fluency attribution account of familiarity, according to which familiarity results from a fast assessment of ongoing processing fluency relative to previous events or current expectations. The computations underlying fluency attribution may be closely related to those characterizing the relative familiarity mechanism underlying the FN400. We also argue that concerted activation of the perirhinal cortex (PrC) and the lateral prefrontal cortex (PFC) plays a pivotal role for fluency attributions and the generation of the FN400.

@article{MecklingerBader2020,
title = {From fluency to recognition decisions: A broader view of familiarity-based remembering},
author = {Axel Mecklinger and Regine Bader},
url = {https://www.sciencedirect.com/science/article/abs/pii/S0028393220302001},
doi = {https://doi.org/10.1016/j.neuropsychologia.2020.107527},
year = {2020},
date = {2020},
journal = {Neuropsychologia},
pages = {107527},
volume = {146},
abstract = {The goal of this article is to critically examine current claims and assumptions about the FN400, an event-related potential (ERP) component that has been related to familiarity memory, though some uncertainty exists regarding the cognitive processes captured by the FN400. It is proposed that familiarity can be multiply determined and that an important distinction has to be made between a recent-exposure, relative familiarity mechanism indexed by the FN400 and an absolute/baseline familiarity mechanism being reflected by a coincidental but topographically distinct ERP effect. We suggest a broader conceptualization of the memory processes reflected by the FN400 and propose an unexpected-fluency attribution account of familiarity, according to which familiarity results from a fast assessment of ongoing processing fluency relative to previous events or current expectations. The computations underlying fluency attribution may be closely related to those characterizing the relative familiarity mechanism underlying the FN400. We also argue that concerted activation of the perirhinal cortex (PrC) and the lateral prefrontal cortex (PFC) plays a pivotal role for fluency attributions and the generation of the FN400.},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project:   A6
