Variation in language use across social variables: a data-driven approach
Proceedings of the Corpus and Language Variation in English Research Conference (CLAVIER), Bari, Italy, 2017.
We present a data-driven approach to studying language use over time according to social variables (henceforth SVs), which also considers interactions between different variables. Alongside sociolinguistic studies of language variation according to SVs (e.g., Weinreich et al. 1968, Bernstein 1971, Eckert 1989, Milroy and Milroy 1985), computational approaches have recently gained prominence (see, e.g., Eisenstein 2015, Danescu-Niculescu-Mizil et al. 2013, and Nguyen et al. 2017 for an overview). This is due not least to the growing availability of social media data and to an increasing awareness in the NLP community of the importance of linguistic variation according to SVs.