Publications

Ahrendt, Simon; Demberg, Vera

Improving event prediction by representing script participants Inproceedings

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 546-551, San Diego, California, 2016.

Automatically learning script knowledge has proved difficult, with previous work not or just barely beating a most-frequent baseline. Script knowledge is a type of world knowledge which can however be useful for various tasks in NLP and psycholinguistic modelling. We here propose a model that includes participant information (i.e., knowledge about which participants are relevant for a script) and show, on the Dinners from Hell corpus as well as the InScript corpus, that this knowledge helps us to significantly improve prediction performance on the narrative cloze task.
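
For readers unfamiliar with the evaluation mentioned above, the following minimal Python sketch illustrates the narrative cloze idea: one event is held out from a chain and candidate events are ranked against the remaining context. The toy event chains, the (verb, participant) encoding, and the co-occurrence scoring are invented for illustration and are not the model proposed in the paper.

from collections import Counter
from itertools import permutations

# Toy "script" corpus: each chain is a sequence of (verb, participant) events.
chains = [
    [("order", "customer"), ("bring", "waiter"), ("eat", "customer"), ("pay", "customer")],
    [("order", "customer"), ("spill", "waiter"), ("complain", "customer"), ("pay", "customer")],
]

# Count ordered co-occurrences of events within a chain (a crude stand-in
# for a learned script model with participant information).
pair_counts = Counter()
for chain in chains:
    for a, b in permutations(chain, 2):
        pair_counts[(a, b)] += 1

def cloze_rank(context, held_out, candidates):
    # Rank candidates by total co-occurrence with the context events and
    # return the rank of the held-out event (1 = correctly predicted first).
    scored = sorted(candidates,
                    key=lambda c: sum(pair_counts[(ctx, c)] for ctx in context),
                    reverse=True)
    return scored.index(held_out) + 1

context = [("order", "customer"), ("bring", "waiter"), ("eat", "customer")]
candidates = [("pay", "customer"), ("spill", "waiter"), ("complain", "customer")]
print(cloze_rank(context, ("pay", "customer"), candidates))  # prints 1 on this toy data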

@inproceedings{ahrendt-demberg:2016:N16-1,
title = {Improving event prediction by representing script participants},
author = {Simon Ahrendt and Vera Demberg},
url = {http://www.aclweb.org/anthology/N16-1067},
year = {2016},
date = {2016-06-01},
booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages = {546-551},
publisher = {Association for Computational Linguistics},
address = {San Diego, California},
abstract = {Automatically learning script knowledge has proved difficult, with previous work not or just barely beating a most-frequent baseline. Script knowledge is a type of world knowledge which can however be useful for various tasks in NLP and psycholinguistic modelling. We here propose a model that includes participant information (i.e., knowledge about which participants are relevant for a script) and show, on the Dinners from Hell corpus as well as the InScript corpus, that this knowledge helps us to significantly improve prediction performance on the narrative cloze task.},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Pusse, Florian; Sayeed, Asad; Demberg, Vera

LingoTurk: managing crowdsourced tasks for psycholinguistics Inproceedings

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, Association for Computational Linguistics, pp. 57-61, San Diego, California, 2016.

LingoTurk is an open-source, freely available crowdsourcing client/server system aimed primarily at psycholinguistic experimentation where custom and specialized user interfaces are required but not supported by popular crowdsourcing task management platforms. LingoTurk enables user-friendly local hosting of experiments as well as condition management and participant exclusion. It is compatible with Amazon Mechanical Turk and Prolific Academic. New experiments can easily be set up via the Play Framework and the LingoTurk API, while multiple experiments can be managed from a single system.

@inproceedings{pusse-sayeed-demberg:2016:N16-3,
title = {LingoTurk: managing crowdsourced tasks for psycholinguistics},
author = {Florian Pusse and Asad Sayeed and Vera Demberg},
url = {http://www.aclweb.org/anthology/N16-3012},
year = {2016},
date = {2016-06-01},
booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations},
pages = {57-61},
publisher = {Association for Computational Linguistics},
address = {San Diego, California},
abstract = {LingoTurk is an open-source, freely available crowdsourcing client/server system aimed primarily at psycholinguistic experimentation where custom and specialized user interfaces are required but not supported by popular crowdsourcing task management platforms. LingoTurk enables user-friendly local hosting of experiments as well as condition management and participant exclusion. It is compatible with Amazon Mechanical Turk and Prolific Academic. New experiments can easily be set up via the Play Framework and the LingoTurk API, while multiple experiments can be managed from a single system.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Wanzare, Lilian Diana Awuor; Zarcone, Alessandra; Thater, Stefan; Pinkal, Manfred

A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge Inproceedings

Calzolari, Nicoletta; Choukri, Khalid; Declerck, Thierry; Grobelnik, Marko; Maegaard, Bente; Mariani, Joseph; Moreno, Asuncion; Odijk, Jan; Piperidis, Stelios (Ed.): Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), European Language Resources Association (ELRA), Portorož, Slovenia, 2016, ISBN 978-2-9517408-9-1.

Scripts are standardized event sequences describing typical everyday activities, which play an important role in the computational modeling of cognitive abilities (in particular for natural language processing). We present a large-scale crowdsourced collection of explicit linguistic descriptions of script-specific event sequences (40 scenarios with 100 sequences each). The corpus is enriched with crowdsourced alignment annotation on a subset of the event descriptions, to be used in future work as seed data for automatic alignment of event descriptions (for example via clustering). The event descriptions to be aligned were chosen among those expected to have the strongest corrective effect on the clustering algorithm. The alignment annotation was evaluated against a gold standard of expert annotators. The resulting database of partially-aligned script-event descriptions provides a sound empirical basis for inducing high-quality script knowledge, as well as for any task involving alignment and paraphrase detection of events.

@inproceedings{WANZARE16.913,
title = {A Crowdsourced Database of Event Sequence Descriptions for the Acquisition of High-quality Script Knowledge},
author = {Lilian Diana Awuor Wanzare and Alessandra Zarcone and Stefan Thater and Manfred Pinkal},
editor = {Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
url = {https://aclanthology.org/L16-1556/},
year = {2016},
date = {2016},
booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
isbn = {978-2-9517408-9-1},
publisher = {European Language Resources Association (ELRA)},
address = {Portoro{\v{z}}, Slovenia},
abstract = {Scripts are standardized event sequences describing typical everyday activities, which play an important role in the computational modeling of cognitive abilities (in particular for natural language processing). We present a large-scale crowdsourced collection of explicit linguistic descriptions of script-specific event sequences (40 scenarios with 100 sequences each). The corpus is enriched with crowdsourced alignment annotation on a subset of the event descriptions, to be used in future work as seed data for automatic alignment of event descriptions (for example via clustering). The event descriptions to be aligned were chosen among those expected to have the strongest corrective effect on the clustering algorithm. The alignment annotation was evaluated against a gold standard of expert annotators. The resulting database of partially-aligned script-event descriptions provides a sound empirical basis for inducing high-quality script knowledge, as well as for any task involving alignment and paraphrase detection of events.},
pubstate = {published},
type = {inproceedings}
}

Project:   A2

Le Maguer, Sébastien; Steiner, Ingmar; Möbius, Bernd

Toward a Speech Synthesis Guided by the Modeling of Unexpected Events Inproceedings

Schweitzer, Antje; Dogil, Grzegorz (Ed.): Workshop on Modeling Variability in Speech, Stuttgart, Germany, 2015.

@inproceedings{LeMaguer2015Variability,
title = {Toward a Speech Synthesis Guided by the Modeling of Unexpected Events},
author = {S{\'e}bastien Le Maguer and Ingmar Steiner and Bernd M{\"o}bius},
editor = {Antje Schweitzer and Grzegorz Dogil},
url = {https://www.bibsonomy.org/bibtex/217fb65d2ef291a8a10df15db8a8cf5c7/sfb1102},
year = {2015},
date = {2015},
booktitle = {Workshop on Modeling Variability in Speech},
address = {Stuttgart, Germany},
pubstate = {published},
type = {inproceedings}
}

Project:   C5

Fischer, Andrea; Jágrová, Klára; Stenger, Irina; Avgustinova, Tania; Klakow, Dietrich; Marti, Roland

Models for Mutual Intelligibility Inproceedings

Data Mining and its Use and Usability for Linguistic Analysis, Universität des Saarlandes, Saarbrücken, Germany, 2015.

@inproceedings{andrea2015models,
title = {Models for Mutual Intelligibility},
author = {Andrea Fischer and Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger and Tania Avgustinova and Dietrich Klakow and Roland Marti},
year = {2015},
date = {2015},
booktitle = {Data Mining and its Use and Usability for Linguistic Analysis},
publisher = {Universit{\"a}t des Saarlandes},
address = {Saarbr{\"u}cken, Germany},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Fischer, Andrea; Jágrová, Klára; Stenger, Irina; Avgustinova, Tania; Klakow, Dietrich; Marti, Roland

An Orthography Transformation Experiment with Czech-Polish and Bulgarian-Russian Parallel Word Sets Inproceedings

Sharp, Bernadette; Lubaszewski, Wiesław; Delmonte, Rodolfo (Ed.): Natural Language Processing and Cognitive Science, Ca Foscarina Editrice, Venezia, pp. 115-126, 2015.

This article presents the methods and findings of a computational transformation of orthography within two Slavic language pairs (Czech-Polish and Bulgarian-Russian) on different word sets. The experiment aimed at investigating to what extent these closely related languages are mutually intelligible, concentrating on their orthographies as linguistic interfaces to the written text. Besides analyzing orthographic similarity, the aim was to gain insights into the applicability of rules based on traditional linguistic assumptions for the purposes of language modelling.
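
The following minimal Python sketch illustrates the general shape of such a rule-based orthography transformation. The Czech-to-Polish correspondences listed are a few rough, well-known examples chosen for illustration; they are not the rule set evaluated in the paper.

# Illustrative Czech-to-Polish grapheme correspondences (NOT the paper's rules).
CS_TO_PL_RULES = [
    ("č", "cz"),
    ("š", "sz"),
    ("ž", "ż"),
    ("v", "w"),
    ("h", "g"),
]

def transform(word, rules):
    # Apply character-level substitution rules left to right.
    for source, target in rules:
        word = word.replace(source, target)
    return word

# "čas" -> "czas" matches Polish exactly; "žena" -> "żena" only approximates
# Polish "żona", showing where purely orthographic rules stop working.
for czech_word in ["čas", "žena", "škola"]:
    print(czech_word, "->", transform(czech_word, CS_TO_PL_RULES))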

@inproceedings{klara2015orthography,
title = {An Orthography Transformation Experiment with Czech-Polish and Bulgarian-Russian Parallel Word Sets},
author = {Andrea Fischer and Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger and Tania Avgustinova and Dietrich Klakow and Roland Marti},
editor = {Bernadette Sharp and Wiesław Lubaszewski and Rodolfo Delmonte},
url = {https://www.bibsonomy.org/bibtex/231c7c8a9b94a872a7396d5b1a1ef7962/sfb1102},
year = {2015},
date = {2015},
booktitle = {Natural Language Processing and Cognitive Science},
pages = {115-126},
publisher = {Ca Foscarina Editrice},
address = {Venezia},
abstract = {This article presents the methods and findings of a computational transformation of orthography within two Slavic language pairs (Czech-Polish and Bulgarian-Russian) on different word sets. The experiment aimed at investigating to what extent these closely related languages are mutually intelligible, concentrating on their orthographies as linguistic interfaces to the written text. Besides analyzing orthographic similarity, the aim was to gain insights into the applicability of rules based on traditional linguistic assumptions for the purposes of language modelling.},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Avgustinova, Tania; Fischer, Andrea; Jágrová, Klára; Stenger, Irina

The Empirical Basis of Slavic Intercomprehension Inproceedings

REMU, Joensuu, Finland, 2015.

The possibility of intercomprehension between related languages is a generally accepted fact suggesting that mutual intelligibility is systematic. Of particular interest are the Slavic languages, which are “sufficiently similar and sufficiently different to provide an attractive research laboratory” (Corbett 1998). They exhibit practically all typologically attested means of encoding grammatical information, ranging from extremely dense to highly redundant constructions, and their development is the result of various language contact scenarios (Balkansprachbund, German influence on West Slavic languages, Finno-Ugric substratum in East Slavic languages etc.).

@inproceedings{tania2015empirical,
title = {The Empirical Basis of Slavic Intercomprehension},
author = {Tania Avgustinova and Andrea Fischer and Kl{\'a}ra J{\'a}grov{\'a} and Irina Stenger},
url = {https://www.bibsonomy.org/bibtex/187b1c53b1bad76027e0a305d2a6e2cce/sfb1102},
year = {2015},
date = {2015},
booktitle = {REMU},
address = {Joensuu, Finland},
abstract = {The possibility of intercomprehension between related languages is a generally accepted fact suggesting that mutual intelligibility is systematic. Of particular interest are the Slavic languages, which are “sufficiently similar and sufficiently different to provide an attractive research laboratory” (Corbett 1998). They exhibit practically all typologically attested means of encoding grammatical information, ranging from extremely dense to highly redundant constructions, and their development is the result of various language contact scenarios (Balkansprachbund, German influence on West Slavic languages, Finno-Ugric substratum in East Slavic languages etc.).},
pubstate = {published},
type = {inproceedings}
}

Project:   C4

Fischer, Andrea; Demberg, Vera; Klakow, Dietrich

Towards Flexible, Small-Domain Surface Generation: Combining Data-Driven and Grammatical Approaches Inproceedings

Proceedings of the 15th European Workshop on Natural Language Generation (ENLG), Association for Computational Linguistics, pp. 105-108, Brighton, England, UK, 2015.

As dialog systems are getting more and more ubiquitous, there is an increasing number of application domains for natural language generation, and generation objectives are getting more diverse (e.g., generating informationally dense vs. less complex utterances, as a function of target user and usage situation). Flexible generation is difficult and labour-intensive with traditional template-based generation systems, while fully data-driven approaches may lead to less grammatical output, particularly if the measures used for generation objectives are correlated with measures of grammaticality. We here explore the combination of a data-driven approach with two very simple automatic grammar induction methods, basing its implementation on OpenCCG.

@inproceedings{fischer:demberg:klakow,
title = {Towards Flexible, Small-Domain Surface Generation: Combining Data-Driven and Grammatical Approaches},
author = {Andrea Fischer and Vera Demberg and Dietrich Klakow},
url = {https://www.aclweb.org/anthology/W15-4718/},
year = {2015},
date = {2015},
booktitle = {Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)},
pages = {105-108},
publisher = {Association for Computational Linguistics},
address = {Brighton, England, UK},
abstract = {As dialog systems are getting more and more ubiquitous, there is an increasing number of application domains for natural language generation, and generation objectives are getting more diverse (e.g., generating informationally dense vs. less complex utterances, as a function of target user and usage situation). Flexible generation is difficult and labour-intensive with traditional template-based generation systems, while fully data-driven approaches may lead to less grammatical output, particularly if the measures used for generation objectives are correlated with measures of grammaticality. We here explore the combination of a data-driven approach with two very simple automatic grammar induction methods, basing its implementation on OpenCCG.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A4 C4

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

ERP Indices of situated reference in visual contexts Journal Article

37th Annual Conference of the Cognitive Science Society, Austin, Texas, USA, 2015.

Violations of the maxims of Quantity occur when utterances provide more (over-specified) or less (under-specified) information than strictly required for referent identification. While behavioural data suggest that under-specified expressions lead to comprehension difficulty and communicative failure, there is no consensus as to whether over-specified expressions are also detrimental to comprehension. In this study we shed light on this debate, providing neurophysiological evidence supporting the view that extra information facilitates comprehension. We further present novel evidence that referential failure due to underspecification is qualitatively different from explicit cases of referential failure, when no matching referential candidate is available in the context.

@article{Tourtouri2015,
title = {ERP Indices of situated reference in visual contexts},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/312296322_ERP_indices_of_situated_reference_in_visual_contexts},
year = {2015},
date = {2015},
publisher = {37th Annual Conference of the Cognitive Science Society},
address = {Austin, Texas, USA},
abstract = {Violations of the maxims of Quantity occur when utterances provide more (over-specified) or less (under-specified) information than strictly required for referent identification. While behavioural data suggest that under-specified expressions lead to comprehension difficulty and communicative failure, there is no consensus as to whether over-specified expressions are also detrimental to comprehension. In this study we shed light on this debate, providing neurophysiological evidence supporting the view that extra information facilitates comprehension. We further present novel evidence that referential failure due to underspecification is qualitatively different from explicit cases of referential failure, when no matching referential candidate is available in the context.},
pubstate = {published},
type = {article}
}

Project:   C3

Schulz, Erika; Malisz, Zofia; Andreeva, Bistra; Möbius, Bernd

Einfluss von Informationsdichte und prosodischer Struktur auf Vokalraumausdehnung Inproceedings

Phonetik und Phonologie 11, Marburg, 2015.

Vowel space expansion is determined by several factors, for example gender (Simpson and Ericsdotter 2007), speaking style (Bradlow, Kraus and Hayes 2003), prosody (Bergem 1993), speech rate (Weirich and Simpson 2014), or phonological neighbourhood density (Munson and Solomon 2004). Language redundancy can also serve as a predictor of the spectral realisation of vowels (Aylett and Turk 2006). This study investigates the influence of information density and prosodic structure on vowel space expansion in French, German, American English and Finnish.

@inproceedings{pundp11,
title = {Einfluss von Informationsdichte und prosodischer Struktur auf Vokalraumausdehnung},
author = {Erika Schulz and Zofia Malisz and Bistra Andreeva and Bernd M{\"o}bius},
url = {https://www.online.uni-marburg.de/pundp11/talks/Schulz_etal.pdf},
year = {2015},
date = {2015},
booktitle = {Phonetik und Phonologie 11},
address = {Marburg},
abstract = {Vokalraumausdehnung wird von mehreren Faktoren bestimmt, z. B. von Geschlecht (Simpson und Ericsdotter 2007), Sprechstil (Bradlow, Kraus und Hayes 2003), Prosodie (Bergem 1993), Sprechgeschwindigkeit (Weirich und Simpson 2014) oder phonologischer Nachbarschaftsdichte (Munson und Solomon 2004). Auch Sprachredundanz kann als Pr{\"a}diktor spektraler Auspr{\"a}gung von Vokalen dienen (Aylett und Turk 2006). Diese Studie untersucht den Einfluss von Informationsdichte und prosodischen Strukturen auf Vokalraumausdehnung in Franz{\"o}sisch, Deutsch, Amerikanischem Englisch und Finnisch.},
pubstate = {published},
type = {inproceedings}
}

Project:   C1

Oualil, Youssef; Schulder, Marc; Helmke, Hartmut; Schmidt, Anna; Klakow, Dietrich

Real-Time Integration of Dynamic Context Information for Improving Automatic Speech Recognition Inproceedings

INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association, Dresden, Germany, 2015.

The use of prior situational/contextual knowledge about a given task can significantly improve automatic speech recognition (ASR) performance. This is typically done through adaptation of acoustic or language models if data is available or using knowledge-based rescoring. The main adaptation techniques, however, are either domain-specific, which makes them inadequate for other tasks, or static and offline, and therefore cannot deal with dynamic knowledge. To circumvent this problem, we propose a real-time system which dynamically integrates situational context into ASR. The context integration is done either post-recognition, in which case a weighted Levenshtein distance between the ASR hypotheses and the context information based on the ASR confidence scores is proposed to extract the most likely sequence of spoken words, or pre-recognition, where the search space is adjusted to the new situational knowledge through adaptation of the finite state machine modeling the spoken language. Experiments conducted on 3 hours of Air Traffic Control (ATC) data achieved a 51% reduction of the Command Error Rate (CmdER) which is used as evaluation metric in the ATC domain.
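
The post-recognition variant described above can be illustrated with a minimal Python sketch of a confidence-weighted Levenshtein distance between an ASR hypothesis and candidate context commands. The weighting scheme (penalising edits on high-confidence hypothesis words more) and the toy data are assumptions for illustration; the paper's exact formulation may differ.

def weighted_levenshtein(hyp, conf, ref, ins_cost=1.0, del_cost=1.0):
    # hyp: hypothesis words, conf: per-word ASR confidences in [0, 1],
    # ref: reference/context words. Editing a high-confidence hypothesis
    # word costs more than editing a low-confidence one.
    n, m = len(hyp), len(ref)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost * conf[i - 1]
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if hyp[i - 1] == ref[j - 1] else conf[i - 1]
            d[i][j] = min(d[i - 1][j] + del_cost * conf[i - 1],   # drop hypothesis word
                          d[i][j - 1] + ins_cost,                 # insert reference word
                          d[i - 1][j - 1] + sub)                  # match or substitute
    return d[n][m]

# Pick the context command closest to the recognised word sequence.
hyp = ["lufthansa", "one", "two", "tree", "descend", "flight", "level", "eight", "zero"]
conf = [0.9, 0.8, 0.7, 0.4, 0.9, 0.6, 0.9, 0.8, 0.9]
commands = [
    ["lufthansa", "one", "two", "three", "descend", "flight", "level", "eight", "zero"],
    ["speedbird", "four", "five", "six", "climb", "flight", "level", "one", "zero", "zero"],
]
best = min(commands, key=lambda cmd: weighted_levenshtein(hyp, conf, cmd))
print(" ".join(best))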

@inproceedings{youalil_interspeech_2015,
title = {Real-Time Integration of Dynamic Context Information for Improving Automatic Speech Recognition},
author = {Youssef Oualil and Marc Schulder and Hartmut Helmke and Anna Schmidt and Dietrich Klakow},
url = {https://core.ac.uk/display/31018097},
year = {2015},
date = {2015},
booktitle = {INTERSPEECH 2015, 16th Annual Conference of the International Speech Communication Association},
address = {Dresden, Germany},
abstract = {The use of prior situational/contextual knowledge about a given task can significantly improve automatic speech recognition (ASR) performance. This is typically done through adaptation of acoustic or language models if data is available or using knowledge-based rescoring. The main adaptation techniques, however, are either domain-specific, which makes them inadequate for other tasks, or static and offline, and therefore cannot deal with dynamic knowledge. To circumvent this problem, we propose a real-time system which dynamically integrates situational context into ASR. The context integration is done either post-recognition, in which case a weighted Levenshtein distance between the ASR hypotheses and the context information based on the ASR confidence scores is proposed to extract the most likely sequence of spoken words, or pre-recognition, where the search space is adjusted to the new situational knowledge through adaptation of the finite state machine modeling the spoken language. Experiments conducted on 3 hours of Air Traffic Control (ATC) data achieved a 51% reduction of the Command Error Rate (CmdER) which is used as evaluation metric in the ATC domain.},
pubstate = {published},
type = {inproceedings}
}

Project:   B4

Greenberg, Clayton; Sayeed, Asad; Demberg, Vera

Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering Inproceedings

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 21-31, Denver, Colorado, 2015.

Most recent unsupervised methods in vector space semantics for assessing thematic fit (e.g. Erk, 2007; Baroni and Lenci, 2010; Sayeed and Demberg, 2014) create prototypical role-fillers without performing word sense disambiguation. This leads to a kind of sparsity problem: candidate role-fillers for different senses of the verb end up being measured by the same “yardstick”, the single prototypical role-filler.

In this work, we use three different feature spaces to construct robust unsupervised models of distributional semantics. We show that correlation with human judgements on thematic fit estimates can be improved consistently by clustering typical role-fillers and then calculating similarities of candidate role-fillers with these cluster centroids. The suggested methods can be used in any vector space model that constructs a prototype vector from a non-trivial set of typical vectors.
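
A minimal Python sketch of the clustering idea, using toy random vectors rather than the paper's feature spaces: typical role-fillers are clustered with k-means, and a candidate is scored against the nearest centroid instead of against a single averaged prototype (NumPy and scikit-learn assumed available).

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors for typical instrument fillers of "cut", drawn from two latent
# senses (e.g. concrete "knife"-like vs. abstract "budget"-like fillers).
sense_a = rng.normal(loc=1.0, size=(20, 50))
sense_b = rng.normal(loc=-1.0, size=(20, 50))
typical_fillers = np.vstack([sense_a, sense_b])

# Single-prototype baseline: one average over all typical fillers.
prototype = typical_fillers.mean(axis=0)

# Clustered prototypes: k-means centroids over the typical fillers.
centroids = KMeans(n_clusters=2, n_init=10, random_state=0).fit(typical_fillers).cluster_centers_

candidate = rng.normal(loc=1.0, size=50)  # a candidate filler from sense A

print("single prototype:", round(cosine(candidate, prototype), 3))
print("nearest centroid:", round(max(cosine(candidate, c) for c in centroids), 3))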

@inproceedings{greenberg-sayeed-demberg:2015:NAACL-HLT,
title = {Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering},
author = {Clayton Greenberg and Asad Sayeed and Vera Demberg},
url = {http://www.aclweb.org/anthology/N15-1003},
year = {2015},
date = {2015},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages = {21-31},
publisher = {Association for Computational Linguistics},
address = {Denver, Colorado},
abstract = {Most recent unsupervised methods in vector space semantics for assessing thematic fit (e.g. Erk, 2007; Baroni and Lenci, 2010; Sayeed and Demberg, 2014) create prototypical role-fillers without performing word sense disambiguation. This leads to a kind of sparsity problem: candidate role-fillers for different senses of the verb end up being measured by the same “yardstick”, the single prototypical role-filler. In this work, we use three different feature spaces to construct robust unsupervised models of distributional semantics. We show that correlation with human judgements on thematic fit estimates can be improved consistently by clustering typical role-fillers and then calculating similarities of candidate role-fillers with these cluster centroids. The suggested methods can be used in any vector space model that constructs a prototype vector from a non-trivial set of typical vectors.},
pubstate = {published},
type = {inproceedings}
}

Projects:   B2 B4

Sayeed, Asad; Demberg, Vera; Shkadzko, Pavel

An Exploration of Semantic Features in an Unsupervised Thematic Fit Evaluation Framework Conference

IJCoL: Emerging Topics at the First Italian Conference on Computational Linguistics, vol. 1, 2015.

Thematic fit is the extent to which an entity fits a thematic role in the semantic frame of an event, e.g., how well humans would rate “knife” as an instrument of an event of cutting. We explore the use of the SENNA semantic role-labeller in defining a distributional space in order to build an unsupervised model of event-entity thematic fit judgements. We test a number of ways of extracting features from SENNA-labelled versions of the ukWaC and BNC corpora and identify tradeoffs. Some of our Distributional Memory models outperform an existing syntax-based model (TypeDM) that uses hand-crafted rules for role inference on a previously tested data set. We combine the results of a selected SENNA-based model with TypeDM’s results and find that there is some amount of complementarity in what a syntactic and a semantic model will cover. In the process, we create a broad-coverage semantically-labelled corpus.
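
The following minimal Python sketch illustrates a Distributional-Memory-style thematic fit score over role-labelled triples. The toy (filler, verb, role) counts and the simple prototype-cosine scoring are invented for illustration and do not reproduce the SENNA-based models evaluated in the paper.

from collections import defaultdict
import math

# (filler, verb, role, count) tuples as they might come out of a role-labelled corpus.
triples = [
    ("knife", "cut", "instrument", 50), ("knife", "sharpen", "patient", 20),
    ("scissors", "cut", "instrument", 30), ("scissors", "buy", "patient", 5),
    ("spoon", "stir", "instrument", 40), ("spoon", "buy", "patient", 10),
    ("cake", "cut", "patient", 25), ("cake", "eat", "patient", 60),
]

vectors = defaultdict(lambda: defaultdict(float))   # word -> {(verb, role): count}
slot_fillers = defaultdict(list)                    # (verb, role) -> attested fillers
for filler, verb, role, count in triples:
    vectors[filler][(verb, role)] += count
    slot_fillers[(verb, role)].append(filler)

def cosine(u, v):
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def thematic_fit(candidate, verb, role):
    # Cosine between the candidate's vector and the average vector of
    # words attested in the (verb, role) slot.
    proto = defaultdict(float)
    fillers = slot_fillers[(verb, role)]
    for f in fillers:
        for context, count in vectors[f].items():
            proto[context] += count / len(fillers)
    return cosine(vectors[candidate], proto)

print(thematic_fit("spoon", "cut", "instrument"))     # low: spoons rarely cut
print(thematic_fit("scissors", "cut", "instrument"))  # high: attested in the slot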

@conference{sayeed:demberg:shkadzko:2015:IJCOL,
title = {An Exploration of Semantic Features in an Unsupervised Thematic Fit Evaluation Framework},
author = {Asad Sayeed and Vera Demberg and Pavel Shkadzko},
url = {https://journals.openedition.org/ijcol/298},
year = {2015},
date = {2015},
booktitle = {IJCoL: Emerging Topics at the First Italian Conference on Computational Linguistics},
abstract = {Thematic fit is the extent to which an entity fits a thematic role in the semantic frame of an event, e.g., how well humans would rate “knife” as an instrument of an event of cutting. We explore the use of the SENNA semantic role-labeller in defining a distributional space in order to build an unsupervised model of event-entity thematic fit judgements. We test a number of ways of extracting features from SENNA-labelled versions of the ukWaC and BNC corpora and identify tradeoffs. Some of our Distributional Memory models outperform an existing syntax-based model (TypeDM) that uses hand-crafted rules for role inference on a previously tested data set. We combine the results of a selected SENNA-based model with TypeDM’s results and find that there is some amount of complementarity in what a syntactic and a semantic model will cover. In the process, we create a broad-coverage semantically-labelled corpus.},
pubstate = {published},
type = {conference}
}

Project:   B2

Demberg, Vera; Torabi Asr, Fatemeh; Rohde, Hannah

Discourse Expectations Raised by Contrastive Connectives Inproceedings

Conference on Discourse Expectations: Theoretical, Experimental, and Computational Perspectives (DETEC), 2015.

Markers of negative polarity discourse relations, such as but, although and on the one hand… on the other hand have been shown to induce more processing difficulty than additive or causal markers (e.g., Murray, 1995), and to facilitate the processing of upcoming content (e.g., Köhne & Demberg, 2013). These markers have been argued to shape comprehenders’ discourse expectations in a way that differs from what comprehenders would expect in the absence of such markers (Murray, 1995; Köhne & Demberg, 2013; Xiang & Kuperberg, 2014). Here, we present two studies on the nature of the expectations elicited by negative polarity connectives, addressing three primary questions: (i) How specific are the expectations elicited by ambiguous connectors such as but and although? (ii) Do the discourse expectations raised by a connective like on the one hand target any contrast or specifically on the other hand? (iii) Are expectations sensitive to discourse structure?

@inproceedings{demberg2015contrastive,
title = {Discourse Expectations Raised by Contrastive Connectives},
author = {Vera Demberg and Fatemeh Torabi Asr and Hannah Rohde},
url = {https://detec2015.files.wordpress.com/2015/05/demberg.pdf},
year = {2015},
date = {2015},
booktitle = {Conference on Discourse Expectations: Theoretical, Experimental, and Computational Perspectives (DETEC)},
abstract = {Markers of negative polarity discourse relations, such as but, although and on the one hand... on the other hand have been shown to induce more processing difficulty than additive or causal markers (e.g., Murray, 1995), and to facilitate the processing of upcoming content (e.g., K{\"o}hne & Demberg, 2013). These markers have been argued to shape comprehenders' discourse expectations in a way that differs from what comprehenders would expect in the absence of such markers (Murray, 1995; K{\"o}hne & Demberg, 2013; Xiang & Kuperberg, 2014). Here, we present two studies on the nature of the expectations elicited by negative polarity connectives, addressing three primary questions: (i) How specific are the expectations elicited by ambiguous connectors such as but and although? (ii) Do the discourse expectations raised by a connective like on the one hand target any contrast or specifically on the other hand? (iii) Are expectations sensitive to discourse structure?},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh; Demberg, Vera

A Discourse Connector's Distribution Determines Its Interpretation Inproceedings

The 28th CUNY Conference on Human Sentence Processing 2015, 2015.

Many connectives, such as but and although, can be used to mark very similar sets of relations, see Table 1. Fraser 1999 proposes that each connective has a core meaning and that a more specific discourse relation will be inferred from the content of the involved clauses. This implies that connectives which can mark the same relations have the same core meaning, and that alternating between two such connectors should not change the meaning of the discourse. A fully distributional account (Asr & Demberg 2013), on the other hand, describes the information content of a connective based on its usage patterns. This means that a connective may even have different meanings in different sentence positions (i.e. when used sentence-initially vs. between its arguments). This study shows how the fine-grained differences in the distribution of but vs. although vs. sentence-initial although affect text coherence. We created stories consisting of three sentences (see below) and normed them such that the first two sentences were equally acceptable in all conditions. The design was fully counter-balanced.

@inproceedings{asr2015interpretation,
title = {A Discourse Connector's Distribution Determines Its Interpretation},
author = {Fatemeh Torabi Asr and Vera Demberg},
url = {https://www.coli.uni-saarland.de/~fatemeh/CUNY2015_abstract.pdf},
year = {2015},
date = {2015},
booktitle = {The 28th CUNY Conference on Human Sentence Processing 2015},
abstract = {Many connectives, such as but and although, can be used to mark very similar sets of relations, see Table 1. Fraser 1999 proposes that each connective has a core meaning and that a more specific discourse relation will be inferred from the content of the involved clauses. This implies that connectives which can mark the same relations have the same core meaning, and that alternating between two such connectors should not change the meaning of the discourse. A fully distributional account (Asr & Demberg 2013), on the other hand, describes the information content of a connective based on its usage patterns. This means that a connective may even have different meanings in different sentence positions (i.e. when used sentence-initially vs. between its arguments). This study shows how the fine-grained differences in the distribution of but vs. although vs. sentence-initial although affect text coherence. We created stories consisting of three sentences (see below) and normed them such that the first two sentences were equally acceptable in all conditions. The design was fully counter-balanced.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh

An Information Theoretic Approach to Production and Comprehension of Discourse Markers PhD Thesis

Saarland University, Saarbrücken, Germany, 2015.

Discourse relations are the building blocks of a coherent text. The most important linguistic elements for constructing these relations are discourse markers. The presence of a discourse marker between two discourse segments provides information on the inferences that need to be made for interpretation of the two segments as a whole (e.g., because marks a reason).

This thesis presents a new framework for studying human communication at the level of discourse by adapting ideas from information theory. A discourse marker is viewed as a symbol with a measurable amount of relational information. This information is communicated by the writer of a text to guide the reader towards the right semantic decoding. To examine the information theoretic account of discourse markers, we conduct empirical corpus-based investigations, offline crowd-sourced studies and online laboratory experiments. The thesis contributes to computational linguistics by proposing a quantitative meaning representation for discourse markers and showing its advantages over the classic descriptive approaches. For the first time, we show that readers are very sensitive to the fine-grained information encoded in a discourse marker obtained from its natural usage and that writers use explicit marking for less expected relations in terms of linguistic and cognitive predictability. These findings open new directions for implementation of advanced natural language processing systems.
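
A minimal Python sketch of the kind of quantity the thesis builds on: the surprisal of a discourse relation given a connective, estimated here from invented (connective, relation) counts rather than the corpus data used in the thesis.

import math
from collections import defaultdict

# Hypothetical (connective, relation) counts, loosely PDTB-style labels.
counts = {
    ("but", "contrast"): 700, ("but", "concession"): 250, ("but", "expansion"): 50,
    ("although", "concession"): 800, ("although", "contrast"): 200,
}

totals = defaultdict(int)
for (connective, _), c in counts.items():
    totals[connective] += c

def surprisal(relation, connective):
    # -log2 P(relation | connective), estimated from the toy counts.
    p = counts.get((connective, relation), 0) / totals[connective]
    return float("inf") if p == 0 else -math.log2(p)

for conn in ("but", "although"):
    for rel in ("contrast", "concession"):
        print(f"{rel:10s} | {conn:8s}: {surprisal(rel, conn):.2f} bits")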

@phdthesis{BentPhd05,
title = {An Information Theoretic Approach to Production and Comprehension of Discourse Markers},
author = {Fatemeh Torabi Asr},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/26688},
year = {2015},
date = {2015},
school = {Saarland University},
address = {Saarbr{\"u}cken, Germany},
abstract = {Discourse relations are the building blocks of a coherent text. The most important linguistic elements for constructing these relations are discourse markers. The presence of a discourse marker between two discourse segments provides information on the inferences that need to be made for interpretation of the two segments as a whole (e.g., because marks a reason). This thesis presents a new framework for studying human communication at the level of discourse by adapting ideas from information theory. A discourse marker is viewed as a symbol with a measurable amount of relational information. This information is communicated by the writer of a text to guide the reader towards the right semantic decoding. To examine the information theoretic account of discourse markers, we conduct empirical corpus-based investigations, offline crowd-sourced studies and online laboratory experiments. The thesis contributes to computational linguistics by proposing a quantitative meaning representation for discourse markers and showing its advantages over the classic descriptive approaches. For the first time, we show that readers are very sensitive to the fine-grained information encoded in a discourse marker obtained from its natural usage and that writers use explicit marking for less expected relations in terms of linguistic and cognitive predictability. These findings open new directions for implementation of advanced natural language processing systems.},
pubstate = {published},
type = {phdthesis}
}

Project:   B2

Sayeed, Asad

Representing the Effort in Resolving Ambiguous Scope Inproceedings

Sinn und Bedeutung 20, Tübingen, Germany, 2015.

This work proposes a way to formally model online scope interpretation in terms of recent experimental results. Specifically, it attempts to reconcile underspecified representations of semantic processing with results that show that there are higher-order dependencies between relative quantifier scope orderings that the processor may assert. It proposes a constrained data structure and movement operator that provides just enough specification to allow these higher-order dependencies to be represented. The operation reflects regression probabilities in one of the cited experiments.

@inproceedings{SuB2015,
title = {Representing the Effort in Resolving Ambiguous Scope},
author = {Asad Sayeed},
url = {https://ojs.ub.uni-konstanz.de/sub/index.php/sub/article/view/284},
year = {2015},
date = {2015},
booktitle = {Sinn und Bedeutung 20},
address = {T{\"u}bingen, Germany},
abstract = {This work proposes a way to formally model online scope interpretation in terms of recent experimental results. Specifically, it attempts to reconcile underspecified representations of semantic processing with results that show that there are higher-order dependencies between relative quantifier scope orderings that the processor may assert. It proposes a constrained data structure and movement operator that provides just enough specification to allow these higher-order dependencies to be represented. The operation reflects regression probabilities in one of the cited experiments.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh; Demberg, Vera

A Distributional Account of Discourse Connectives and its Effect on Fine-grained Inferences Inproceedings

Text-link Conference, Louvain, Belgium, 2015.

@inproceedings{asr2015distributionalApproach,
title = {A Distributional Account of Discourse Connectives and its Effect on Fine-grained Inferences},
author = {Fatemeh Torabi Asr and Vera Demberg},
year = {2015},
date = {2015},
booktitle = {Text-link Conference},
address = {Louvain, Belgium},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Howcroft, David M.; White, Michael

Inducing Clause-Combining Operations for Natural Language Generation Inproceedings

Proc. of the 1st International Workshop on Data-to-Text Generation, Edinburgh, Scotland, UK, 2015.

@inproceedings{howcroft:white:d2t-2015,
title = {Inducing Clause-Combining Operations for Natural Language Generation},
author = {David M. Howcroft and Michael White},
url = {http://www.macs.hw.ac.uk/InteractionLab/d2t/papers/d2t_HowcroftWhite},
year = {2015},
date = {2015},
booktitle = {Proc. of the 1st International Workshop on Data-to-Text Generation},
address = {Edinburgh, Scotland, UK},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Vogels, Jorrig; Demberg, Vera; Kray, Jutta

Cognitive Load and Individual Differences in Multitasking Abilities Conference

Workshop on Individual differences in language processing across the adult life span, 2015.

@conference{vogelsjorrig2015cognitive,
title = {Cognitive Load and Individual Differences in Multitasking Abilities},
author = {Jorrig Vogels and Vera Demberg and Jutta Kray},
url = {https://www.bibsonomy.org/bibtex/222f7284011f7023bd8095b6b554278d3/sfb1102},
year = {2015},
date = {2015},
booktitle = {Workshop on Individual differences in language processing across the adult life span},
pubstate = {published},
type = {conference}
}

Project:   A4
