Detecting Intellectual Influence from Dynamic Word Embeddings: Applications to Historical Newspapers and Contemporary Research Articles - Speaker: Jacob Eisenstein

The talk will describe two papers with a similar goal: using diachronic corpora to detect intellectual influence. We want to know who is leading the development of ideas over time, and we hope to gain some insight into this by examining changes in the meanings of individual words, as quantified by dynamic word embeddings. The talk presents methods for (1) detecting changes in word meaning, (2) identifying the leaders and followers of these changes, and (3) aggregating many such changes into an overall picture of linguistic influence, using techniques from social network analysis. As the first part of the talk will show, this aggregate measure has predictive power: in the ACL Anthology, papers that have a high level of linguistic influence in the short term tend to accrue more citations in the long term. This finding holds even when controlling for short-term citations: if two papers have the same number of citations after two years, the one with more linguistic influence will receive significantly more citations in the following three years. Next, we quantify linguistic influence in a corpus of historical newspapers from the abolitionist movement in the pre-Civil War United States. Here, a measure of linguistic influence helps to shed light on the unique roles played by newspapers edited by women and formerly enslaved people. This research was led by Sandeep Soni, in collaboration with Lauren F. Klein and David Bamman.
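The three steps named in the abstract can be illustrated with a toy sketch. This is not the method from the papers: the function names, the cosine-distance change measure, the lead-lag heuristic, and the use of summed positive lead scores (a weighted in-degree) as a stand-in for the social network analysis step are all assumptions made for illustration. It also assumes each source already has one embedding vector per word per time step, in a shared vector space.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)

def semantic_change(series):
    """Step 1 (assumed measure): distance between first and last embedding."""
    return cosine_distance(series[0], series[-1])

def lead_score(series_a, series_b):
    """Step 2 (toy heuristic): positive if source A appears to lead source B.

    A is treated as leading when B's embedding at t+1 is closer to A's
    embedding at t than A's embedding at t+1 is to B's embedding at t.
    """
    total = 0.0
    for t in range(len(series_a) - 1):
        b_then_a = cosine_distance(series_b[t], series_a[t + 1])
        a_then_b = cosine_distance(series_a[t], series_b[t + 1])
        total += b_then_a - a_then_b
    return total

def influence_scores(embeddings):
    """Step 3 (stand-in aggregate): sum each source's positive lead scores.

    `embeddings` maps word -> source -> list of vectors, one per time step.
    """
    scores = {}
    for by_source in embeddings.values():
        for source in by_source:
            scores.setdefault(source, 0.0)
        for a, series_a in by_source.items():
            for b, series_b in by_source.items():
                if a == b:
                    continue
                s = lead_score(series_a, series_b)
                if s > 0:
                    scores[a] += s
    return scores

# Hypothetical data: source "A" shifts its usage one step before source "B".
toy = {
    "word": {
        "A": [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
        "B": [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    }
}
print(influence_scores(toy))
```

In this toy data, "A" receives the higher score because "B" adopts the new usage one time step later. A graph centrality measure could replace the simple sum in the last step.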