Publications

Greenberg, Clayton; Sayeed, Asad; Demberg, Vera

Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering Inproceedings

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, pp. 21-31, Denver, Colorado, 2015.

Most recent unsupervised methods in vector space semantics for assessing thematic fit (e.g. Erk, 2007; Baroni and Lenci, 2010; Sayeed and Demberg, 2014) create prototypical role-fillers without performing word sense disambiguation. This leads to a kind of sparsity problem: candidate role-fillers for different senses of the verb end up being measured by the same “yardstick”, the single prototypical role-filler.

In this work, we use three different feature spaces to construct robust unsupervised models of distributional semantics. We show that correlation with human judgements on thematic fit estimates can be improved consistently by clustering typical role-fillers and then calculating similarities of candidate role-fillers with these cluster centroids. The suggested methods can be used in any vector space model that constructs a prototype vector from a non-trivial set of typical vectors.

@inproceedings{greenberg-sayeed-demberg:2015:NAACL-HLT,
title = {Improving Unsupervised Vector-Space Thematic Fit Evaluation via Role-Filler Prototype Clustering},
author = {Clayton Greenberg and Asad Sayeed and Vera Demberg},
url = {http://www.aclweb.org/anthology/N15-1003},
year = {2015},
date = {2015},
booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
pages = {21-31},
publisher = {Association for Computational Linguistics},
address = {Denver, Colorado},
abstract = {Most recent unsupervised methods in vector space semantics for assessing thematic fit (e.g. Erk, 2007; Baroni and Lenci, 2010; Sayeed and Demberg, 2014) create prototypical role-fillers without performing word sense disambiguation. This leads to a kind of sparsity problem: candidate role-fillers for different senses of the verb end up being measured by the same “yardstick”, the single prototypical role-filler. In this work, we use three different feature spaces to construct robust unsupervised models of distributional semantics. We show that correlation with human judgements on thematic fit estimates can be improved consistently by clustering typical role-fillers and then calculating similarities of candidate role-fillers with these cluster centroids. The suggested methods can be used in any vector space model that constructs a prototype vector from a non-trivial set of typical vectors.},
pubstate = {published},
type = {inproceedings}
}

Projects:   B2 B4

Sayeed, Asad; Demberg, Vera; Shkadzko, Pavel

An Exploration of Semantic Features in an Unsupervised Thematic Fit Evaluation Framework Conference

IJCoL: Emerging Topics at the First Italian Conference on Computational Linguistics, 1, 2015.

Thematic fit is the extent to which an entity fits a thematic role in the semantic frame of an event, e.g., how well humans would rate “knife” as an instrument of an event of cutting. We explore the use of the SENNA semantic role-labeller in defining a distributional space in order to build an unsupervised model of event-entity thematic fit judgements. We test a number of ways of extracting features from SENNA-labelled versions of the ukWaC and BNC corpora and identify tradeoffs. Some of our Distributional Memory models outperform an existing syntax-based model (TypeDM) that uses hand-crafted rules for role inference on a previously tested data set. We combine the results of a selected SENNA-based model with TypeDM’s results and find that there is some amount of complementarity in what a syntactic and a semantic model will cover. In the process, we create a broad-coverage semantically-labelled corpus.

@conference{sayeed:demberg:shkadzko:2015:IJCOL,
title = {An Exploration of Semantic Features in an Unsupervised Thematic Fit Evaluation Framework},
author = {Asad Sayeed and Vera Demberg and Pavel Shkadzko},
url = {https://journals.openedition.org/ijcol/298},
year = {2015},
date = {2015},
booktitle = {IJCoL: Emerging Topics at the First Italian Conference on Computational Linguistics},
abstract = {

Thematic fit is the extent to which an entity fits a thematic role in the semantic frame of an event, e.g., how well humans would rate “knife” as an instrument of an event of cutting. We explore the use of the SENNA semantic role-labeller in defining a distributional space in order to build an unsupervised model of event-entity thematic fit judgements. We test a number of ways of extracting features from SENNA-labelled versions of the ukWaC and BNC corpora and identify tradeoffs. Some of our Distributional Memory models outperform an existing syntax-based model (TypeDM) that uses hand-crafted rules for role inference on a previously tested data set. We combine the results of a selected SENNA-based model with TypeDM’s results and find that there is some amount of complementarity in what a syntactic and a semantic model will cover. In the process, we create a broad-coverage semantically-labelled corpus.

},
pubstate = {published},
type = {conference}
}

Project:   B2

Demberg, Vera; Torabi Asr, Fatemeh; Rohde, Hannah

Discourse Expectations Raised by Contrastive Connectives Inproceedings

Conference on Discourse Expectations: Theoretical, Experimental, and Computational Perspectives (DETEC), 2015.

Markers of negative polarity discourse relations, such as but, although and on the one hand… on the other hand have been shown to induce more processing difficulty than additive or causal markers (e.g., Murray, 1995), and to facilitate the processing of upcoming content (e.g., Köhne & Demberg, 2013). These markers have been argued to shape comprehenders' discourse expectations in a way that differs from what comprehenders would expect in the absence of such markers (Murray, 1995; Köhne & Demberg, 2013; Xiang & Kuperberg, 2014). Here, we present two studies on the nature of the expectations elicited by negative polarity connectives, addressing three primary questions: (i) How specific are the expectations elicited by ambiguous connectors such as but and although? (ii) Do the discourse expectations raised by a connective like on the one hand target any contrast or specifically on the other hand? (iii) Are expectations sensitive to discourse structure?

@inproceedings{demberg2015contrastive,
title = {Discourse Expectations Raised by Contrastive Connectives},
author = {Vera Demberg and Fatemeh Torabi Asr and Hannah Rohde},
url = {https://detec2015.files.wordpress.com/2015/05/demberg.pdf},
year = {2015},
date = {2015},
booktitle = {Conference on Discourse Expectations: Theoretical, Experimental, and Computational Perspectives (DETEC)},
abstract = {Markers of negative polarity discourse relations, such as but, although and on the one hand... on the other hand have been shown to induce more processing difficulty than additive or causal markers (e.g., Murray, 1995), and to facilitate the processing of upcoming content (e.g., K{\"o}hne & Demberg, 2013). These markers have been argued to shape comprehenders' discourse expectations in a way that differs from what comprehenders would expect in the absence of such markers (Murray, 1995; K{\"o}hne & Demberg, 2013; Xiang & Kuperberg, 2014). Here, we present two studies on the nature of the expectations elicited by negative polarity connectives, addressing three primary questions: (i) How specific are the expectations elicited by ambiguous connectors such as but and although? (ii) Do the discourse expectations raised by a connective like on the one hand target any contrast or specifically on the other hand? (iii) Are expectations sensitive to discourse structure?},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh; Demberg, Vera

A Discourse Connector's Distribution Determines Its Interpretation Inproceedings

The 28th CUNY Conference on Human Sentence Processing 2015, 2015.

Many connectives, such as but and although, can be used to mark very similar sets of relations, see Table 1. Fraser 1999 proposes that each connective has a core meaning and that a more specific discourse relation will be inferred from the content of the involved clauses. This implies that connectives which can mark the same relations have the same core meaning, and that alternating between two such connectors should not change the meaning of the discourse. A fully distributional account (Asr & Demberg 2013), on the other hand, describes the information content of a connective based on its usage patterns. This means that a connective may even have different meanings in different sentence positions (i.e. when used sentence-initially vs. between its arguments). This study shows how the fine-grained differences in the distribution of but vs. although vs. sentence-initial although affect text coherence. We created stories consisting of three sentences (see below) and normed them such that the first two sentences were equally acceptable in all conditions. The design was fully counter-balanced.

@inproceedings{asr2015interpretation,
title = {A Discourse Connector's Distribution Determines Its Interpretation},
author = {Fatemeh Torabi Asr and Vera Demberg},
url = {https://www.coli.uni-saarland.de/~fatemeh/CUNY2015_abstract.pdf},
year = {2015},
date = {2015},
booktitle = {The 28th CUNY Conference on Human Sentence Processing 2015},
abstract = {Many connectives, such as but and although, can be used to mark very similar sets of relations, see Table 1. Fraser 1999 proposes that each connective has a core meaning and that a more specific discourse relation will be inferred from the content of the involved clauses. This implies that connectives which can mark the same relations have the same core meaning, and that alternating between two such connectors should not change the meaning of the discourse. A fully distributional account (Asr & Demberg 2013), on the other hand, describes the information content of a connective based on its usage patterns. This means that a connective may even have different meanings in different sentence positions (i.e. when used sentence-initially vs. between its arguments). This study shows how the fine-grained differences in the distribution of but vs. although vs. sentence-initial although affect text coherence. We created stories consisting of three sentences (see below) and normed them such that the first two sentences were equally acceptable in all conditions. The design was fully counter-balanced.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh

An Information Theoretic Approach to Production and Comprehension of Discourse Markers PhD Thesis

Saarland University, Saarbruecken, Germany, 2015.

Discourse relations are the building blocks of a coherent text. The most important linguistic elements for constructing these relations are discourse markers. The presence of a discourse marker between two discourse segments provides information on the inferences that need to be made for interpretation of the two segments as a whole (e.g., because marks a reason).

This thesis presents a new framework for studying human communication at the level of discourse by adapting ideas from information theory. A discourse marker is viewed as a symbol with a measurable amount of relational information. This information is communicated by the writer of a text to guide the reader towards the right semantic decoding. To examine the information theoretic account of discourse markers, we conduct empirical corpus-based investigations, offline crowd-sourced studies and online laboratory experiments. The thesis contributes to computational linguistics by proposing a quantitative meaning representation for discourse markers and showing its advantages over the classic descriptive approaches. For the first time, we show that readers are very sensitive to the fine-grained information encoded in a discourse marker obtained from its natural usage and that writers use explicit marking for less expected relations in terms of linguistic and cognitive predictability. These findings open new directions for implementation of advanced natural language processing systems.

@phdthesis{BentPhd05,
title = {An Information Theoretic Approach to Production and Comprehension of Discourse Markers},
author = {Fatemeh Torabi Asr},
url = {https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/26688},
year = {2015},
date = {2015},
school = {Saarland University},
address = {Saarbruecken, Germany},
abstract = {Discourse relations are the building blocks of a coherent text. The most important linguistic elements for constructing these relations are discourse markers. The presence of a discourse marker between two discourse segments provides information on the inferences that need to be made for interpretation of the two segments as a whole (e.g., because marks a reason). This thesis presents a new framework for studying human communication at the level of discourse by adapting ideas from information theory. A discourse marker is viewed as a symbol with a measurable amount of relational information. This information is communicated by the writer of a text to guide the reader towards the right semantic decoding. To examine the information theoretic account of discourse markers, we conduct empirical corpus-based investigations, offline crowd-sourced studies and online laboratory experiments. The thesis contributes to computational linguistics by proposing a quantitative meaning representation for discourse markers and showing its advantages over the classic descriptive approaches. For the first time, we show that readers are very sensitive to the fine-grained information encoded in a discourse marker obtained from its natural usage and that writers use explicit marking for less expected relations in terms of linguistic and cognitive predictability. These findings open new directions for implementation of advanced natural language processing systems.},
pubstate = {published},
type = {phdthesis}
}

Project:   B2

Sayeed, Asad

Representing the Effort in Resolving Ambiguous Scope Inproceedings

Sinn und Bedeutung 20, Tübingen, Germany, 2015.

This work proposes a way to formally model online scope interpretation in terms of recent experimental results. Specifically, it attempts to reconcile underspecified representations of semantic processing with results that show that there are higher-order dependencies between relative quantifier scope orderings that the processor may assert. It proposes a constrained data structure and movement operator that provides just enough specification to allow these higher-order dependencies to be represented. The operation reflects regression probabilities in one of the cited experiments.

@inproceedings{SuB2015,
title = {Representing the Effort in Resolving Ambiguous Scope},
author = {Asad Sayeed},
url = {https://ojs.ub.uni-konstanz.de/sub/index.php/sub/article/view/284},
year = {2015},
date = {2015},
booktitle = {Sinn und Bedeutung 20},
address = {T{\"u}bingen, Germany},
abstract = {

This work proposes a way to formally model online scope interpretation in terms of recent experimental results. Specifically, it attempts to reconcile underspecified representations of semantic processing with results that show that there are higher-order dependencies between relative quantifier scope orderings that the processor may assert. It proposes a constrained data structure and movement operator that provides just enough specification to allow these higher-order dependencies to be represented. The operation reflects regression probabilities in one of the cited experiments.
},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Torabi Asr, Fatemeh; Demberg, Vera

A Distributional Account of Discourse Connectives and its Effect on Fine-grained Inferences Inproceedings

Text-link Conference, Louvain, Belgium, 2015.

@inproceedings{asr2015distributionalApproach,
title = {A Distributional Account of Discourse Connectives and its Effect on Fine-grained Inferences},
author = {Fatemeh Torabi Asr and Vera Demberg},
year = {2015},
date = {2015},
booktitle = {Text-link Conference},
address = {Louvain, Belgium},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Howcroft, David M.; White, Michael

Inducing Clause-Combining Operations for Natural Language Generation Inproceedings

Proc. of the 1st International Workshop on Data-to-Text Generation, Edinburgh, Scotland, UK, 2015.

@inproceedings{howcroft:white:d2t-2015,
title = {Inducing Clause-Combining Operations for Natural Language Generation},
author = {David M. Howcroft and Michael White},
url = {http://www.macs.hw.ac.uk/InteractionLab/d2t/papers/d2t_HowcroftWhite},
year = {2015},
date = {2015},
booktitle = {Proc. of the 1st International Workshop on Data-to-Text Generation},
address = {Edinburgh, Scotland, UK},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Vogels, Jorrig; Demberg, Vera; Kray, Jutta

Cognitive Load and Individual Differences in Multitasking Abilities Conference

Workshop on Individual differences in language processing across the adult life span, 2015.

@conference{vogelsjorrig2015cognitive,
title = {Cognitive Load and Individual Differences in Multitasking Abilities},
author = {Jorrig Vogels and Vera Demberg and Jutta Kray},
url = {https://www.bibsonomy.org/bibtex/222f7284011f7023bd8095b6b554278d3/sfb1102},
year = {2015},
date = {2015},
booktitle = {Workshop on Individual differences in language processing across the adult life span},
pubstate = {published},
type = {conference}
}

Project:   A4

Kravtchenko, Ekaterina; Demberg, Vera

Underinformative Event Mentions Trigger Context-Dependent Implicatures Inproceedings

Talk presented at Formal and Experimental Pragmatics: Methodological Issues of a Nascent Liaison (MXPRAG), Zentrum für Allgemeine Sprachwissenschaft (ZAS), Berlin, June 2015, 2015.

@inproceedings{Kravtchenko2015b,
title = {Underinformative Event Mentions Trigger Context-Dependent Implicatures},
author = {Ekaterina Kravtchenko and Vera Demberg},
year = {2015},
date = {2015-10-17},
booktitle = {Talk presented at Formal and Experimental Pragmatics: Methodological Issues of a Nascent Liaison (MXPRAG)},
publisher = {Zentrum f{\"u}r Allgemeine Sprachwissenschaft (ZAS), Berlin, June 2015},
pubstate = {published},
type = {inproceedings}
}

Project:   A4

Zarcone, Alessandra; Padó, Sebastian; Lenci, Alessandro

Same Same but Different: Type and Typicality in a Distributional Model of Complement Coercion Inproceedings

Word Structure and Word Usage. Proceedings of the NetWordS Final Conference. Pisa, March 30-April 1, 2015, pp. 91-94, Pisa, Italy, 2015.

We aim to model the results from a self-paced reading experiment, which tested the effect of semantic type clash and typicality on the processing of German complement coercion. We present two distributional semantic models to test if they can model the effect of both type and typicality in the psycholinguistic study. We show that one of the models, without explicitly representing type information, can account both for the effect of type and typicality in complement coercion.

@inproceedings{zarcone2015same,
title = {Same Same but Different: Type and Typicality in a Distributional Model of Complement Coercion},
author = {Alessandra Zarcone and Sebastian Padó and Alessandro Lenci},
url = {https://www.researchgate.net/publication/282740292_Same_same_but_different_Type_and_typicality_in_a_distributional_model_of_complement_coercion},
year = {2015},
date = {2015},
booktitle = {Word Structure and Word Usage. Proceedings of the NetWordS Final Conference. Pisa, March 30-April 1, 2015},
pages = {91-94},
address = {Pisa, Italy},
abstract = {

We aim to model the results from a self-paced reading experiment, which tested the effect of semantic type clash and typicality on the processing of German complement coercion. We present two distributional semantic models to test if they can model the effect of both type and typicality in the psycholinguistic study. We show that one of the models, without explicitly representing type information, can account both for the effect of type and typicality in complement coercion.
},
pubstate = {published},
type = {inproceedings}
}

Projects:   A2 A3

Demberg, Vera; Hoffmann, Jörg; Howcroft, David M.; Klakow, Dietrich; Torralba, Álvaro

Search Challenges in Natural Language Generation with Complex Optimization Objectives Journal Article

KI - Künstliche Intelligenz, Special Issue on Companion Technologies, Springer Berlin Heidelberg, 2015, ISSN 1610-1987.

Automatic natural language generation (NLG) is a difficult problem already when merely trying to come up with natural-sounding utterances. Ubiquitous applications, in particular companion technologies, pose the additional challenge of flexible adaptation to a user or a situation. This requires optimizing complex objectives such as information density, in combinatorial search spaces described using declarative input languages. We believe that AI search and planning is a natural match for these problems, and could substantially contribute to solving them effectively. We illustrate this using a concrete example NLG framework, give a summary of the relevant optimization objectives, and provide an initial list of research challenges.

@article{demberg:hoffmann:ki-2015,
title = {Search Challenges in Natural Language Generation with Complex Optimization Objectives},
author = {Vera Demberg and J{\"o}rg Hoffmann and David M. Howcroft and Dietrich Klakow and {\'A}lvaro Torralba},
url = {https://link.springer.com/article/10.1007/s13218-015-0409-5},
year = {2015},
date = {2015},
journal = {KI - K{\"u}nstliche Intelligenz, Special Issue on Companion Technologies},
publisher = {Springer Berlin Heidelberg},
abstract = {

Automatic natural language generation (NLG) is a difficult problem already when merely trying to come up with natural-sounding utterances. Ubiquitous applications, in particular companion technologies, pose the additional challenge of flexible adaptation to a user or a situation. This requires optimizing complex objectives such as information density, in combinatorial search spaces described using declarative input languages. We believe that AI search and planning is a natural match for these problems, and could substantially contribute to solving them effectively. We illustrate this using a concrete example NLG framework, give a summary of the relevant optimization objectives, and provide an initial list of research challenges.
},
pubstate = {published},
type = {article}
}

Project:   A3

Kravtchenko, Ekaterina; Demberg, Vera

Semantically Underinformative Utterances Trigger Pragmatic Inferences Proceeding

Annual Conference of the Cognitive Science Society (CogSci), Mind, Technology, and Society Pasadena Convention Center, 2015.

Most theories of pragmatics and language processing predict that speakers avoid informationally redundant utterances. From a processing standpoint, it remains unclear what happens when listeners encounter such utterances, and how they interpret them. We argue that uninformative utterances can trigger pragmatic inferences, which increase utterance utility in line with listener expectations. In this study, we look at utterances that refer to stereotyped event sequences describing common activities (scripts). Literature on processing of event sequences shows that people automatically infer component actions, once a script is ‘invoked.’ We demonstrate that when comprehenders encounter utterances describing events that can be easily inferred from prior context, they interpret them as signifying that the event conveys new, unstated information. We also suggest that formal models of language comprehension would have difficulty in accurately estimating the predictability or potential processing cost incurred by such utterances.

@proceeding{kravtchenko:demberg,
title = {Semantically Underinformative Utterances Trigger Pragmatic Inferences},
author = {Ekaterina Kravtchenko and Vera Demberg},
url = {https://www.semanticscholar.org/paper/Semantically-underinformative-utterances-trigger-Kravtchenko-Demberg/33256a5fca918eef5de8998db5a695d9bced5975},
year = {2015},
date = {2015-10-17},
booktitle = {Annual Conference of the Cognitive Science Society (CogSci)},
address = {Mind, Technology, and Society Pasadena Convention Center},
abstract = {

Most theories of pragmatics and language processing predict that speakers avoid informationally redundant utterances. From a processing standpoint, it remains unclear what happens when listeners encounter such utterances, and how they interpret them. We argue that uninformative utterances can trigger pragmatic inferences, which increase utterance utility in line with listener expectations. In this study, we look at utterances that refer to stereotyped event sequences describing common activities (scripts). Literature on processing of event sequences shows that people automatically infer component actions, once a script is ‘invoked.’ We demonstrate that when comprehenders encounter utterances describing events that can be easily inferred from prior context, they interpret them as signifying that the event conveys new, unstated information. We also suggest that formal models of language comprehension would have difficulty in accurately estimating the predictability or potential processing cost incurred by such utterances.
},
pubstate = {published},
type = {proceeding}
}

Project:   A3

Batiukova, Olga; Bertinetto, Pier Marco; Lenci, Alessandro; Zarcone, Alessandra

Identifying Actional Features Through Semantic Priming: Cross-Romance Comparison Incollection

Taming the TAME systems. Cahiers Chronos 27, Rodopi, pp. 161-187, Amsterdam/Philadelphia, 2015.

This paper reports four semantic priming experiments in Italian and Spanish, whose goal was to verify the psychological reality of two aspectual features, resultativity and durativity. In the durativity task, the participants were asked whether the verb referred to a durable situation, in the resultativity task if it denoted a situation with a clear outcome. The results prove that both features are involved in online processing of the verb meaning: achievements ([+resultative, -durative]) and activities ([-resultative, +durative]) were processed faster in certain priming contexts. The priming patterns in the Romance languages present some striking similarities (only achievements were primed in the resultativity task) alongside some intriguing differences, and interestingly contrast with the behaviour of another language tested, Russian, whose aspectual system differs in significant ways.

@incollection{batiukova2015identifying,
title = {Identifying Actional Features Through Semantic Priming: Cross-Romance Comparison},
author = {Olga Batiukova and Pier Marco Bertinetto and Alessandro Lenci and Alessandra Zarcone},
url = {https://brill.com/display/book/edcoll/9789004292772/B9789004292772-s010.xml},
year = {2015},
date = {2015},
booktitle = {Taming the TAME systems. Cahiers Chronos 27},
pages = {161-187},
publisher = {Rodopi},
address = {Amsterdam/Philadelphia},
abstract = {

This paper reports four semantic priming experiments in Italian and Spanish, whose goal was to verify the psychological reality of two aspectual features, resultativity and durativity. In the durativity task, the participants were asked whether the verb referred to a durable situation, in the resultativity task if it denoted a situation with a clear outcome. The results prove that both features are involved in online processing of the verb meaning: achievements ([+resultative, -durative]) and activities ([-resultative, +durative]) were processed faster in certain priming contexts. The priming patterns in the Romance languages present some striking similarities (only achievements were primed in the resultativity task) alongside some intriguing differences, and interestingly contrast with the behaviour of another language tested, Russian, whose aspectual system differs in significant ways.
},
pubstate = {published},
type = {incollection}
}

Projects:   A2 A3

Rudinger, Rachel; Demberg, Vera; Modi, Ashutosh; Van Durme, Benjamin; Pinkal, Manfred

Learning to Predict Script Events from Domain-Specific Text Journal Article

Lexical and Computational Semantics (*SEM 2015), pp. 205-210, 2015.

The automatic induction of scripts (Schank and Abelson, 1977) has been the focus of many recent works. In this paper, we employ a variety of these methods to learn Schank and Abelson’s canonical restaurant script, using a novel dataset of restaurant narratives we have compiled from a website called “Dinners from Hell.” Our models learn narrative chains, script-like structures that we evaluate with the “narrative cloze” task (Chambers and Jurafsky, 2008).

@article{rudinger2015learning,
title = {Learning to Predict Script Events from Domain-Specific Text},
author = {Rachel Rudinger and Vera Demberg and Ashutosh Modi and Benjamin Van Durme and Manfred Pinkal},
url = {http://www.aclweb.org/anthology/S15-1024},
year = {2015},
date = {2015},
journal = {Lexical and Computational Semantics (*SEM 2015)},
pages = {205-210},
abstract = {The automatic induction of scripts (Schank and Abelson, 1977) has been the focus of many recent works. In this paper, we employ a variety of these methods to learn Schank and Abelson’s canonical restaurant script, using a novel dataset of restaurant narratives we have compiled from a website called “Dinners from Hell.” Our models learn narrative chains, script-like structures that we evaluate with the “narrative cloze” task (Chambers and Jurafsky, 2008).},
pubstate = {published},
type = {article}
}

Project:   A3

White, Michael; Howcroft, David M.

Inducing Clause-Combining Rules: A Case Study with the SPaRKy Restaurant Corpus Inproceedings

Proc. of the 15th European Workshop on Natural Language Generation, Association for Computational Linguistics, Brighton, England, UK, 2015.

We describe an algorithm for inducing clause-combining rules for use in a traditional natural language generation architecture. An experiment pairing lexicalized text plans from the SPaRKy Restaurant Corpus with logical forms obtained by parsing the corresponding sentences demonstrates that the approach is able to learn clause-combining operations which have essentially the same coverage as those used in the SPaRKy Restaurant Corpus. This paper fills a gap in the literature, showing that it is possible to learn microplanning rules for both aggregation and discourse connective insertion, an important step towards ameliorating the knowledge acquisition bottleneck for NLG systems that produce texts with rich discourse structures using traditional architectures.

@inproceedings{white:howcroft:enlg-2015,
title = {Inducing Clause-Combining Rules: A Case Study with the SPaRKy Restaurant Corpus},
author = {Michael White and David M. Howcroft},
url = {http://www.aclweb.org/anthology/W15-4704},
year = {2015},
date = {2015},
booktitle = {Proc. of the 15th European Workshop on Natural Language Generation},
publisher = {Association for Computational Linguistics},
address = {Brighton, England, UK},
abstract = {We describe an algorithm for inducing clause-combining rules for use in a traditional natural language generation architecture. An experiment pairing lexicalized text plans from the SPaRKy Restaurant Corpus with logical forms obtained by parsing the corresponding sentences demonstrates that the approach is able to learn clause-combining operations which have essentially the same coverage as those used in the SPaRKy Restaurant Corpus. This paper fills a gap in the literature, showing that it is possible to learn microplanning rules for both aggregation and discourse connective insertion, an important step towards ameliorating the knowledge acquisition bottleneck for NLG systems that produce texts with rich discourse structures using traditional architectures.},
pubstate = {published},
type = {inproceedings}
}

Project:   A3

Tourtouri, Elli; Delogu, Francesca; Crocker, Matthew W.

ERP indices of referential informativity in visual contexts Inproceedings

Paper presented at the 28th CUNY Conference on Human Sentence Processing, University of Southern California, Los Angeles, USA, 2015.

Violations of the Maxims of Quantity occur when utterances provide more (over-specified) or less (under-specified) information than strictly required for referent identification. While behavioural data suggest that under-specified (US) expressions lead to comprehension difficulty and communicative failure, there is no consensus as to whether over-specified (OS) expressions are also detrimental to comprehension. In this study we shed light on this debate, providing neurophysiological evidence supporting the view that extra information facilitates comprehension. We further present novel evidence that referential failure due to underspecification is qualitatively different from explicit cases of referential failure, when no matching referential candidate is available in the context.

@inproceedings{Tourtourietal2015a,
title = {ERP indices of referential informativity in visual contexts},
author = {Elli Tourtouri and Francesca Delogu and Matthew W. Crocker},
url = {https://www.researchgate.net/publication/322570166_ERP_indices_of_referential_informativity_in_visual_contexts},
year = {2015},
date = {2015},
booktitle = {Paper presented at the 28th CUNY Conference on Human Sentence Processing},
publisher = {University of Southern California},
address = {Los Angeles, USA},
abstract = {Violations of the Maxims of Quantity occur when utterances provide more (over-specified) or less (under-specified) information than strictly required for referent identification. While behavioural data suggest that under-specified (US) expressions lead to comprehension difficulty and communicative failure, there is no consensus as to whether over-specified (OS) expressions are also detrimental to comprehension. In this study we shed light on this debate, providing neurophysiological evidence supporting the view that extra information facilitates comprehension. We further present novel evidence that referential failure due to underspecification is qualitatively different from explicit cases of referential failure, when no matching referential candidate is available in the context.},
pubstate = {published},
type = {inproceedings}
}

Projects:   A1 C3

Degaetano-Ortlieb, Stefania; Kermes, Hannah; Khamis, Ashraf; Ordan, Noam; Teich, Elke

The taming of the data: Using text mining in building a corpus for diachronic analysis Inproceedings

Varieng - From Data to Evidence (d2e), University of Helsinki, 2015.

Social and historical linguistic studies benefit from corpora encoding contextual metadata (e.g. time, register, genre) and relevant structural information (e.g. document structure). While small, handcrafted corpora control for selected contextual variables (e.g. the Brown/LOB corpora encoding variety, register, and time) and are readily usable for analysis, big data (e.g. Google or Microsoft n-grams) are typically poorly contextualized and considered of limited value for linguistic analysis (see, however, Lieberman et al. 2007). Similarly, when we compile new corpora, sources may not contain all relevant metadata and structural data (e.g. the Old Bailey sources vs. the richly annotated corpus in Huber 2007).

@inproceedings{Degaetano-etal2015,
title = {The taming of the data: Using text mining in building a corpus for diachronic analysis},
author = {Stefania Degaetano-Ortlieb and Hannah Kermes and Ashraf Khamis and Noam Ordan and Elke Teich},
url = {https://www.ashrafkhamis.com/d2e2015.pdf},
year = {2015},
date = {2015-10-01},
booktitle = {Varieng - From Data to Evidence (d2e)},
address = {University of Helsinki},
abstract = {Social and historical linguistic studies benefit from corpora encoding contextual metadata (e.g. time, register, genre) and relevant structural information (e.g. document structure). While small, handcrafted corpora control for selected contextual variables (e.g. the Brown/LOB corpora encoding variety, register, and time) and are readily usable for analysis, big data (e.g. Google or Microsoft n-grams) are typically poorly contextualized and considered of limited value for linguistic analysis (see, however, Lieberman et al. 2007). Similarly, when we compile new corpora, sources may not contain all relevant metadata and structural data (e.g. the Old Bailey sources vs. the richly annotated corpus in Huber 2007).},
pubstate = {published},
type = {inproceedings}
}

Project:   B1

Torabi Asr, Fatemeh; Demberg, Vera

Uniform Information Density at the Level of Discourse Relations: Negation Markers and Discourse Connective Omission Inproceedings

IWCS 2015, pp. 118, 2015.

About half of the discourse relations annotated in Penn Discourse Treebank (Prasad et al., 2008) are not explicitly marked using a discourse connective. But we do not have extensive theories of when or why a discourse relation is marked explicitly or when the connective is omitted. Asr and Demberg (2012a) have suggested an information-theoretic perspective according to which discourse connectives are more likely to be omitted when they are marking a relation that is expected or predictable. This account is based on the Uniform Information Density theory (Levy and Jaeger, 2007), which suggests that speakers choose among alternative formulations that are allowed in their language the ones that achieve a roughly uniform rate of information transmission. Optional discourse markers should thus be omitted if they would lead to a trough in information density, and be inserted in order to avoid peaks in information density. We here test this hypothesis by observing how far a specific cue, negation in any form, affects the discourse relations that can be predicted to hold in a text, and how the presence of this cue in turn affects the use of explicit discourse connectives.

@inproceedings{asr2015uniform,
title = {Uniform Information Density at the Level of Discourse Relations: Negation Markers and Discourse Connective Omission},
author = {Fatemeh Torabi Asr and Vera Demberg},
url = {https://www.semanticscholar.org/paper/Uniform-Information-Density-at-the-Level-of-Markers-Asr-Demberg/cee6437e3aba3e772ef8cc7e9aaf3d7ba1114d8b},
year = {2015},
date = {2015},
booktitle = {IWCS 2015},
pages = {118},
abstract = {About half of the discourse relations annotated in Penn Discourse Treebank (Prasad et al., 2008) are not explicitly marked using a discourse connective. But we do not have extensive theories of when or why a discourse relation is marked explicitly or when the connective is omitted. Asr and Demberg (2012a) have suggested an information-theoretic perspective according to which discourse connectives are more likely to be omitted when they are marking a relation that is expected or predictable. This account is based on the Uniform Information Density theory (Levy and Jaeger, 2007), which suggests that speakers choose among alternative formulations that are allowed in their language the ones that achieve a roughly uniform rate of information transmission. Optional discourse markers should thus be omitted if they would lead to a trough in information density, and be inserted in order to avoid peaks in information density. We here test this hypothesis by observing how far a specific cue, negation in any form, affects the discourse relations that can be predicted to hold in a text, and how the presence of this cue in turn affects the use of explicit discourse connectives.},
pubstate = {published},
type = {inproceedings}
}

Project:   B2

Sayeed, Asad; Fischer, Stefan; Demberg, Vera

To What Extent Do We Adapt Spoken Word Durations to a Domain? Inproceedings

Architectures and mechanisms for language processing (AMLaP), Malta, 2015.

@inproceedings{AMLaP2015a,
title = {To What Extent Do We Adapt Spoken Word Durations to a Domain?},
author = {Asad Sayeed and Stefan Fischer and Vera Demberg},
url = {https://www.bibsonomy.org/bibtex/ddebcecc8adb8f40a0abf87294b11a02},
year = {2015},
date = {2015},
booktitle = {Architectures and mechanisms for language processing (AMLaP)},
address = {Malta},
pubstate = {published},
type = {inproceedings}
}

Project:   B2
