Locality Effects and Ambiguity Avoidance in Natural Language Generation - Speaker: Michael White
The Ohio State University, Department of Linguistics
Joint work with Rajakrishnan Rajkumar, Manjuan Duan, Ethan Hill, Marten van Schijndel and William Schuler
Comprehension and corpus studies have found that the tendency to minimize dependency length, or dependency locality (Gibson 2000, inter alia), has a strong influence on constituent ordering choices. In this talk, I’ll begin by examining dependency locality in the context of discriminative realization ranking, showing that adding a global feature capturing a dependency length minimization preference to an otherwise comprehensive realization ranking model yields statistically significant improvements in BLEU scores, significantly reduces the number of heavy/light ordering errors, and better matches the distributional characteristics of sentence orderings in English news text. Next, complementing this realization ranking study, I’ll present the results of a recent corpus study that goes beyond Temperley’s (2007) study of locality effects in written English by taking lexical and syntactic surprisal into account as competing control factors. There we find that dependency length remains a significant predictor of the attested corpus ordering across a wide variety of syntactic constructions, and moreover that embedding depth and embedding difference (Wu et al., 2010) together help to improve prediction accuracy in cases of anti-locality. After that, I’ll turn to the question of whether statistical parsers can be used for self-monitoring in order to avoid the remaining ordering errors in surface realization, in particular those involving “vicious” ambiguities (van Deemter, 2004), i.e., those where the intended interpretation fails to be considerably more likely than the alternative ones.
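As a rough illustration of the quantity behind the locality preference discussed above, the sketch below computes total dependency length as the sum of linear distances between each word and its head. This is a minimal assumption-laden sketch, not the talk's actual feature implementation; the head-index representation is hypothetical.

```python
# Sketch: total dependency length of a sentence, i.e. the sum of linear
# distances between each word and its syntactic head. Shorter totals
# correspond to orderings preferred under dependency locality.
def total_dependency_length(heads):
    """heads[i] is the index of word i's head, or None for the root."""
    return sum(abs(i - h) for i, h in enumerate(heads) if h is not None)

# "the dog barked loudly": the->dog(1), dog->barked(2), barked=root, loudly->barked(2)
print(total_dependency_length([1, 2, None, 2]))  # -> 3
```

A global ranking feature of this kind lets a realization ranker trade off locality against its other (e.g. n-gram and syntactic) features.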
Using parse accuracy in a simple reranking strategy for self-monitoring, we find that BLEU scores cannot be improved with any of the parsers we tested, since these parsers too often make errors that human readers would be unlikely to make. However, by using an SVM ranker to combine the realizer’s model score with features from multiple parsers, including features designed to make the ranker more robust to parsing mistakes, we show that significant increases in BLEU scores can be achieved and that vicious ambiguities can frequently be avoided. Finally, I’ll briefly present work in progress suggesting that self-monitoring can also be used effectively to generate disambiguating paraphrases for crowd-sourcing judgments of meaning.
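To make the reranking idea concrete, the following sketch scores each candidate realization by a linear combination of the realizer’s model score and parser-derived features, standing in for the learned SVM ranker described above. The feature names, weights, and example sentences are all illustrative assumptions, not the actual model.

```python
# Sketch: rerank candidate realizations with a linear scoring function
# over the realizer's model score plus parser-based features (a stand-in
# for a trained SVM ranker's learned weights).
def rerank(candidates, weights):
    """candidates: list of (sentence, feature_dict); returns the top sentence."""
    def score(features):
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return max(candidates, key=lambda c: score(c[1]))[0]

# Hypothetical candidates: parse_match reflects how well a parser's
# analysis of the candidate recovers the intended interpretation.
candidates = [
    ("the dog saw the man with the telescope",
     {"realizer_score": -2.1, "parse_match": 0.4}),
    ("with the telescope the dog saw the man",
     {"realizer_score": -2.4, "parse_match": 0.9}),
]
weights = {"realizer_score": 1.0, "parse_match": 1.0}
print(rerank(candidates, weights))  # -> "with the telescope the dog saw the man"
```

In this toy setup the parser-based feature outweighs the realizer’s slight preference for the ambiguity-prone order, illustrating how combining the two signals can steer the ranker away from vicious ambiguities.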
If you would like to meet with the speaker please contact Vera Demberg.