Talamo, Luigi; Verkerk, Annemarie

A new methodology for an old problem: A corpus-based typology of adnominal word order in European languages

Italian Journal of Linguistics, 34, pp. 171-226, 2022.

Linguistic typology is generally characterized by strong data reduction, stemming from the use of binary or categorical classifications. One example is the set of categories commonly used to describe word order: adjective-noun vs noun-adjective; genitive-noun vs noun-genitive; etc. Token-based typology is part of the answer to the need for more fine-grained and appropriate measurement in typology. We discuss an implementation of this methodology and provide a case study of adnominal word order in a sample of eleven European languages, using a parallel corpus automatically parsed with models from the Universal Dependencies project. By quantifying adnominal word order variability in terms of Shannon’s entropy, we find that the placement of certain nominal modifiers in relation to their head noun is more variable than reported by typological databases, both within and across language genera. Whereas the low variability in the placement of articles, adpositions and relative clauses is generally confirmed by our findings, the adnominal ordering of demonstratives and adjectives is more variable than previously reported.
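
To illustrate the entropy measure named in the abstract, here is a minimal sketch of how word order variability could be quantified from token counts. It is not the authors' implementation: the abstract does not specify the exact computation, so the sketch assumes that each modifier token is labelled with one of two orders (the hypothetical labels "AN" for adjective-noun and "NA" for noun-adjective) and that entropy is measured in bits. The function name `order_entropy` and the toy data are likewise invented for illustration.

```python
from collections import Counter
from math import log2


def order_entropy(orders):
    """Return the Shannon entropy (in bits) of a sequence of order labels.

    Assumption: each element is a categorical label for one corpus token,
    e.g. "AN" (modifier before noun) or "NA" (modifier after noun).
    """
    counts = Counter(orders)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one token is required")
    # H = -sum(p * log2(p)) over the observed orders
    return sum(-(n / total) * log2(n / total) for n in counts.values())


# Hypothetical toy data, not taken from the paper's corpus.
rigid = ["AN"] * 10                    # a single order only
mixed = ["AN"] * 3 + ["NA"] * 1        # a 3:1 preference
balanced = ["AN"] * 5 + ["NA"] * 5     # both orders equally frequent

for name, sample in [("rigid", rigid), ("mixed", mixed), ("balanced", balanced)]:
    print(name, round(order_entropy(sample), 3))
```

With two possible orders, the value ranges from 0 bits (one order only, as a categorical database entry would imply) to 1 bit (both orders equally frequent), which is what lets a token-based measure register variability that a binary classification hides.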