Abdullah, Badr M.; Klakow, Dietrich

Analyzing the Representational Geometry of Acoustic Word Embeddings

Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Association for Computational Linguistics, pp. 178-191, Abu Dhabi, United Arab Emirates (Hybrid), 2022.

Acoustic word embeddings (AWEs) are fixed-dimensionality vector representations in a vector space such that different acoustic exemplars of the same word are projected nearby in the embedding space. In addition to their use in speech technology applications such as spoken term discovery and keyword spotting, AWE models have been adopted as models of spoken-word processing in several cognitively motivated studies and they have shown to exhibit a human-like performance in some auditory processing tasks. Nevertheless, the representation geometry of AWEs remains an under-explored topic that has not been studied in the literature. In this paper, we take a closer analytical look at AWEs and study how the choice of the learning objective and the architecture shapes their representational profile. Our main findings highlight the prominent role of the learning objective on the representational geometry over the architecture.