Mosbach, Marius; Stenger, Irina; Avgustinova, Tania; Klakow, Dietrich
incom.py – A Toolbox for Calculating Linguistic Distances and Asymmetries between Related Languages
Angelova, Galia; Mitkov, Ruslan; Nikolova, Ivelina; Temnikova, Irina (Eds.): Proceedings of Recent Advances in Natural Language Processing, RANLP 2019, Varna, Bulgaria, 2-4 September 2019, pp. 811-819.
Languages differ in how distant they are from one another, and their mutual intelligibility may be asymmetric. In this paper we introduce incom.py, a toolbox for calculating linguistic distances and asymmetries between related languages. incom.py allows linguists to perform statistical analyses quickly and easily and to compare them with experimental results. We demonstrate the efficacy of incom.py in an intercomprehension experiment on two Slavic languages: Bulgarian and Russian. Using incom.py, we were able to validate three measures of linguistic distance and asymmetry (Levenshtein distance, word adaptation surprisal, and conditional entropy) as predictors of success in a reading intercomprehension experiment.
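To make the three measures named in the abstract concrete, here is a minimal Python sketch. It is not incom.py's API: the function names, the normalization by the longer word's length, and the representation of character alignments as (source, target) pairs are all assumptions made for illustration. It shows why Levenshtein distance is symmetric while conditional entropy and surprisal depend on the reading direction.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Assumption: normalize by the longer word's length (other schemes exist)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def conditional_entropy(pairs) -> float:
    """H(target | source) in bits, estimated from aligned (source, target) characters."""
    pairs = list(pairs)
    joint = Counter(pairs)
    source = Counter(s for s, _ in pairs)
    total = len(pairs)
    h = 0.0
    for (s, _t), n in joint.items():
        h -= (n / total) * math.log2(n / source[s])
    return h


def adaptation_surprisal(alignment, training_pairs) -> float:
    """Sum of -log2 P(target | source) over one word's aligned characters.

    Probabilities are relative frequencies from training_pairs; no smoothing
    is applied, so every pair in the alignment must occur in the training data.
    """
    training_pairs = list(training_pairs)
    joint = Counter(training_pairs)
    source = Counter(s for s, _ in training_pairs)
    return sum(-math.log2(joint[(s, t)] / source[s]) for s, t in alignment)
```

With toy (hypothetical) alignment data such as `[("a", "a"), ("a", "o"), ("b", "b"), ("b", "b")]`, the source character "a" maps to two different targets, so `conditional_entropy` is positive in that direction. Swapping each pair gives a deterministic mapping and an entropy of zero, which is the kind of asymmetry the abstract refers to.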