Publications

Greenberg, Clayton; Demberg, Vera; Sayeed, Asad

Verb Polysemy and Frequency Effects in Thematic Fit Modeling Inproceedings

Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics, Association for Computational Linguistics, pp. 48-57, Denver, Colorado, 2015.

While several data sets for evaluating thematic fit of verb-role-filler triples exist, they do not control for verb polysemy. Thus, it is unclear how verb polysemy affects human ratings of thematic fit and how best to model that. We present a new dataset of human ratings on high vs. low-polysemy verbs matched for verb frequency, together with high vs. low-frequency and well-fitting vs. poorly-fitting patient rolefillers. Our analyses show that low-polysemy verbs produce stronger thematic fit judgements than verbs with higher polysemy. Rolefiller frequency, on the other hand, had little effect on ratings. We show that these results can best be modeled in a vector space using a clustering technique to create multiple prototype vectors representing different “senses” of the verb.

@inproceedings{greenberg-demberg-sayeed:2015:CMCL,
title = {Verb Polysemy and Frequency Effects in Thematic Fit Modeling},
author = {Clayton Greenberg and Vera Demberg and Asad Sayeed},
url = {http://www.aclweb.org/anthology/W15-1106},
year = {2015},
date = {2015-06-01},
booktitle = {Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics},
pages = {48-57},
publisher = {Association for Computational Linguistics},
address = {Denver, Colorado},
abstract = {While several data sets for evaluating thematic fit of verb-role-filler triples exist, they do not control for verb polysemy. Thus, it is unclear how verb polysemy affects human ratings of thematic fit and how best to model that. We present a new dataset of human ratings on high vs. low-polysemy verbs matched for verb frequency, together with high vs. low-frequency and well-fitting vs. poorly-fitting patient rolefillers. Our analyses show that low-polysemy verbs produce stronger thematic fit judgements than verbs with higher polysemy. Rolefiller frequency, on the other hand, had little effect on ratings. We show that these results can best be modeled in a vector space using a clustering technique to create multiple prototype vectors representing different “senses” of the verb.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Projects:   B2 B4

Crocker, Matthew W.; Demberg, Vera; Teich, Elke

Information Density and Linguistic Encoding (IDeaL) Journal Article

KI - Künstliche Intelligenz, 30, pp. 77-81, 2015.

We introduce IDEAL (Information Density and Linguistic Encoding), a collaborative research center that investigates the hypothesis that language use may be driven by the optimal use of the communication channel. From the point of view of linguistics, our approach promises to shed light on selected aspects of language variation that are hitherto not sufficiently explained. Applications of our research can be envisaged in various areas of natural language processing and AI, including machine translation, text generation, speech synthesis and multimodal interfaces.

@article{crocker:demberg:teich,
title = {Information Density and Linguistic Encoding (IDeaL)},
author = {Matthew W. Crocker and Vera Demberg and Elke Teich},
url = {http://link.springer.com/article/10.1007/s13218-015-0391-y/fulltext.html},
doi = {https://doi.org/10.1007/s13218-015-0391-y},
year = {2015},
date = {2015},
journal = {KI - K{\"u}nstliche Intelligenz},
pages = {77-81},
volume = {30},
number = {1},
abstract = {

We introduce IDEAL (Information Density and Linguistic Encoding), a collaborative research center that investigates the hypothesis that language use may be driven by the optimal use of the communication channel. From the point of view of linguistics, our approach promises to shed light on selected aspects of language variation that are hitherto not sufficiently explained. Applications of our research can be envisaged in various areas of natural language processing and AI, including machine translation, text generation, speech synthesis and multimodal interfaces.
},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Projects:   A1 A3 B1

Kampmann, Alexander; Thater, Stefan; Pinkal, Manfred

A Case-Study of Automatic Participant Labeling Inproceedings

Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL 2015), 2015.

Knowlegde about stereotypical activities like visiting a restaurant or checking in at the airport is an important component to model text-understanding. We report on a case study of automatically relating texts to scripts representing such stereotypical knowledge. We focus on the subtask of mapping noun phrases in a text to participants in the script. We analyse the effect of various similarity measures and show that substantial positive results can be achieved on this complex task, indicating that the general problem is principally solvable.

@inproceedings{kampmann2015case,
title = {A Case-Study of Automatic Participant Labeling},
author = {Alexander Kampmann and Stefan Thater and Manfred Pinkal},
url = {https://www.bibsonomy.org/bibtex/256c2839962cccb21f7a2d41b3a83267?postOwner=sfb1102&intraHash=132779a64f2563005c65ee9cc14beb5f},
year = {2015},
date = {2015},
booktitle = {Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL 2015)},
abstract = {Knowlegde about stereotypical activities like visiting a restaurant or checking in at the airport is an important component to model text-understanding. We report on a case study of automatically relating texts to scripts representing such stereotypical knowledge. We focus on the subtask of mapping noun phrases in a text to participants in the script. We analyse the effect of various similarity measures and show that substantial positive results can be achieved on this complex task, indicating that the general problem is principally solvable.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   A2

Rohrbach, Marcus; Rohrbach, Anna; Regneri, Michaela; Amin, Sikandar; Andriluka, Mykhaylo; Pinkal, Manfred; Schiele, Bernt

Recognizing Fine-Grained and Composite Activities using Hand-Centric Features and Script Data Journal Article

International Journal of Computer Vision, pp. 1-28, 2015.

Activity recognition has shown impressive progress in recent years. However, the challenges of detecting fine-grained activities and understanding how they are combined into composite activities have been largely overlooked. In this work we approach both tasks and present a dataset which provides detailed annotations to address them. The first challenge is to detect fine-grained activities, which are defined by low inter-class variability and are typically characterized by fine-grained body motions. We explore how human pose and hands can help to approach this challenge by comparing two pose-based and two hand-centric features with state-of-the-art holistic features. To attack the second challenge, recognizing composite activities, we leverage the fact that these activities are compositional and that the essential components of the activities can be obtained from textual descriptions or scripts. We show the benefits of our hand-centric approach for fine-grained activity classification and detection. For composite activity recognition we find that decomposition into attributes allows sharing information across composites and is essential to attack this hard task. Using script data we can recognize novel composites without having training data for them.

@article{rohrbach2015recognizing,
title = {Recognizing Fine-Grained and Composite Activities using Hand-Centric Features and Script Data},
author = {Marcus Rohrbach and Anna Rohrbach and Michaela Regneri and Sikandar Amin and Mykhaylo Andriluka and Manfred Pinkal and Bernt Schiele},
url = {https://link.springer.com/article/10.1007/s11263-015-0851-8},
year = {2015},
date = {2015},
journal = {International Journal of Computer Vision},
pages = {1-28},
abstract = {

Activity recognition has shown impressive progress in recent years. However, the challenges of detecting fine-grained activities and understanding how they are combined into composite activities have been largely overlooked. In this work we approach both tasks and present a dataset which provides detailed annotations to address them. The first challenge is to detect fine-grained activities, which are defined by low inter-class variability and are typically characterized by fine-grained body motions. We explore how human pose and hands can help to approach this challenge by comparing two pose-based and two hand-centric features with state-of-the-art holistic features. To attack the second challenge, recognizing composite activities, we leverage the fact that these activities are compositional and that the essential components of the activities can be obtained from textual descriptions or scripts. We show the benefits of our hand-centric approach for fine-grained activity classification and detection. For composite activity recognition we find that decomposition into attributes allows sharing information across composites and is essential to attack this hard task. Using script data we can recognize novel composites without having training data for them.
},
pubstate = {published},
type = {article}
}

Copy BibTeX to Clipboard

Project:   A2

Klakow, Dietrich; Avgustinova, Tania; Stenger, Irina; Fischer, Andrea; Jágrová, Klára

The INCOMSLAV Project Inproceedings

Seminar in formal linguistics at ÚFAL, Charles University, Prague, 2014.

The human language processing mechanism shows a remarkable robustness with different kinds of imperfect linguistic signal. The INCOMSLAV project aims at gaining insights about human retrieval of information in the mode of intercomprehension, i.e. from texts in genetically related languages not acquired through language learning. Furthermore it adds to this synchronic approach a diachronic perspective which provides the vital common denominator in establishing the extent of linguistic proximity. The languages to be analysed are chosen from the group of Slavic languages (CZ, PL, RU, BG). Whereas the possibility of intercomprehension between related languages is a generally accepted fact and the ways it functions have been studied for certain language groups, such analyses have not yet been undertaken from a systematic point of view focusing on information en- and decoding at different linguistic levels. The research programme will bring together results from the analysis of parallel corpora and from a variety of experiments with native speakers of Slavic languages and will compare them with insights of comparative historical linguistics on the relationship between Slavic languages. The results should add a cross-linguistic perspective to the question of how language users master high degrees of surprisal (due to partial incomprehensibility) and extract information from “noisy” code.

@inproceedings{dietrich2014incomslav,
title = {The INCOMSLAV Project},
author = {Dietrich Klakow and Tania Avgustinova and Irina Stenger and Andrea Fischer and Kl{\'a}ra J{\'a}grov{\'a}},
url = {https://ufal.mff.cuni.cz/events/incomslav-project},
year = {2014},
date = {2014},
booktitle = {Seminar in formal linguistics at ÚFAL},
publisher = {Charles University},
address = {Prague},
abstract = {The human language processing mechanism shows a remarkable robustness with different kinds of imperfect linguistic signal. The INCOMSLAV project aims at gaining insights about human retrieval of information in the mode of intercomprehension, i.e. from texts in genetically related languages not acquired through language learning. Furthermore it adds to this synchronic approach a diachronic perspective which provides the vital common denominator in establishing the extent of linguistic proximity. The languages to be analysed are chosen from the group of Slavic languages (CZ, PL, RU, BG). Whereas the possibility of intercomprehension between related languages is a generally accepted fact and the ways it functions have been studied for certain language groups, such analyses have not yet been undertaken from a systematic point of view focusing on information en- and decoding at different linguistic levels. The research programme will bring together results from the analysis of parallel corpora and from a variety of experiments with native speakers of Slavic languages and will compare them with insights of comparative historical linguistics on the relationship between Slavic languages. The results should add a cross-linguistic perspective to the question of how language users master high degrees of surprisal (due to partial incomprehensibility) and extract information from “noisy” code.},
pubstate = {published},
type = {inproceedings}
}

Copy BibTeX to Clipboard

Project:   C4

Successfully