Information Density Aware Text-to-Speech Synthesis
Project C5 Completed
Project C5 investigates how text-to-speech (TTS) synthesis techniques can be enhanced to take knowledge about information and encoding density into account. The project explores methods to connect and align the processing of high-level information with its encoding into low-level phonetic parameters in TTS synthesis. The approach is to encode information density in two stages: first, directly as high-level parameters during TTS voice building (offline) and, second, during runtime synthesis (online).
Quantifying information density can also support a model of listeners’ susceptibility to synthesis artifacts. Such a model would make it possible to predict the perceived output quality automatically and to improve it pre-emptively, by selecting a sequence of acoustic units whose variation and encoding density match a given degree of information density.
Keywords: text-to-speech synthesis, voice building, acoustic correlates of information density
Publications C5
Other Area-C Projects
- Information Density and the Predictability of Phonetic Structure C1
- Rational Encoding and Decoding of Referring Expressions C3
- Mutual Intelligibility and Surprisal in Slavic Intercomprehension (INCOMSLAV-3) C4
- Information Management as a Factor for Syntactic Variation in the History of German C6
- Cross-linguistic Information-Theoretic Modelling of Communicative Efficiency C7
- Information density and linguistic encoding in “Leichte Sprache” (IDeaLite) T1