Karakanta, Alina; Menzel, Katrin; Przybyl, Heike; Teich, Elke

Detecting linguistic variation in translated vs. interpreted texts using relative entropy

Empirical Investigations in the Forms of Mediated Discourse at the European Parliament, Thematic Session at the 49th Poznan Linguistic Meeting (PLM2019), Poznan, 2019.

Our aim is to identify the features that distinguish simultaneously interpreted texts from translations (beyond their more oral character), as well as the characteristics the two share and that set them apart from originals (translationese features). Empirical research on the features of interpreted language, and cross-modal analysis as opposed to research on translated language alone, has attracted wider interest only recently. Previous interpreting studies are typically based on relatively small datasets of naturally occurring or experimental data for specific language pairs (e.g. Shlesinger/Ordan 2012; Chmiel et al. forthcoming; Dragsted/Hansen 2009). We propose a corpus-based, exploratory approach to detecting typical linguistic features of interpreting vs. translation, based on a well-structured multilingual European Parliament translation and interpreting corpus. We use the Europarl-UdS corpus (Karakanta et al. 2018), which contains originals and translations for English, German and Spanish, together with selected material from existing interpreting and combined interpreting-translation corpora (EPIC: Sandrelli/Bendazzoli 2005; TIC: Kajzer-Wietrzny 2012; EPICG: Defrancq 2015), complemented with additional interpreting data for German. The data were transcribed or revised according to our transcription guidelines to ensure comparability across datasets, and all data were enriched with relevant metadata. We aim to contribute to a more nuanced understanding of the characteristics of translated and interpreted texts and to a more adequate empirical theory of mediated discourse.
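
The title names relative entropy as the measure of variation, but the abstract does not spell out how it is computed. As a rough illustration only, relative entropy (Kullback-Leibler divergence) between two word distributions P and Q is D(P || Q) = sum over words w of P(w) * log2(P(w) / Q(w)). The sketch below assumes unigram distributions with add-k smoothing; the function name, the smoothing choice and the toy token lists are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def relative_entropy(tokens_a, tokens_b, smoothing=1.0):
    """Return D(A || B) in bits and the per-word contributions.

    Assumption: unigram distributions with add-k smoothing over the
    joint vocabulary. The abstract does not specify these choices.
    """
    counts_a = Counter(tokens_a)
    counts_b = Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)

    # Smoothed totals so that every word has non-zero probability in both sets.
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)

    contributions = {}
    for word in vocab:
        p = (counts_a[word] + smoothing) / total_a
        q = (counts_b[word] + smoothing) / total_b
        contributions[word] = p * math.log2(p / q)

    return sum(contributions.values()), contributions


if __name__ == "__main__":
    # Hypothetical toy data standing in for interpreted vs. translated text.
    interpreted = "well we we think that uh this is important".split()
    translated = "we consider that this matter is of importance".split()

    divergence, per_word = relative_entropy(interpreted, translated)
    print(f"D(interpreted || translated) = {divergence:.3f} bits")

    # Words with the largest positive contribution are the most typical
    # of the first text set relative to the second.
    for word, value in sorted(per_word.items(), key=lambda kv: -kv[1])[:3]:
        print(word, round(value, 3))
```

Because the measure is asymmetric, comparing interpreting against translation and translation against interpreting gives two different rankings of distinctive words, which is what makes the per-word contributions useful for exploratory feature detection.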