A cross-linguistic study of speech modulation spectra

Léo Varnet; Maria Clemencia Ortiz-Barajas; Ramón Guevara Erra; Judit Gervain; Christian Lorenzi

doi:10.1121/1.5006179

J Acoust Soc Am. 2017 Oct;142(4):1976. doi: 10.1121/1.5006179.

Authors

Léo Varnet¹, Maria Clemencia Ortiz-Barajas², Ramón Guevara Erra², Judit Gervain², Christian Lorenzi¹

Affiliations

¹ Laboratoire des Systèmes Perceptifs, Département d'Études Cognitives, École normale supérieure, PSL Research University, CNRS, 29 rue d'Ulm, 75005 Paris, France.
² Laboratoire Psychologie de la Perception, Centre National de la Recherche Scientifique, UMR 8242, Université Paris-Descartes, 45 rue des Saints Pères, 75006 Paris, France.

PMID: 29092595

Abstract

Languages show systematic variation in their sound patterns and grammars. Accordingly, they have been classified into typological categories such as stress-timed vs syllable-timed, or Head-Complement (HC) vs Complement-Head (CH). How these linguistic properties are reflected in the acoustic characteristics of speech in different languages remains incompletely understood. In the present study, the amplitude-modulation (AM) and frequency-modulation (FM) spectra of 1797 utterances in ten languages were analyzed. Overall, the spectra were found to be similar in shape across languages. However, significant effects of linguistic factors were observed on the AM spectra. These differences were magnified with a perceptually plausible representation based on the modulation index (a measure of the signal-to-noise ratio at the output of a logarithmic modulation filterbank): the maximum value distinguished between HC and CH languages, with the exception of Turkish, while the exact frequency of this maximum differed between stress-timed and syllable-timed languages. An additional study conducted on a semi-spontaneous speech corpus showed that these differences persist for a larger number of speakers but disappear for less constrained semi-spontaneous speech. These findings reveal that broad linguistic categories are reflected in the temporal modulation features of different languages, although this may depend on speaking style.
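
As a rough illustration of what an AM spectrum is, the sketch below extracts the temporal envelope of a signal and takes its Fourier magnitude spectrum, then locates the dominant modulation frequency. It is a simplified broadband approximation under stated assumptions, not the analysis pipeline of the study: it does not model the modulation filterbank or the modulation index described in the abstract. The function names, the 0.5 to 32 Hz search band, and the synthetic test signal are hypothetical choices made for this example.

```python
import numpy as np

def envelope(x):
    """Temporal envelope as the magnitude of the FFT-based analytic signal."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Weights that zero negative frequencies and double positive ones.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * h))

def am_spectrum(x, fs):
    """Magnitude spectrum of the mean-removed envelope (broadband AM spectrum)."""
    env = envelope(x)
    env = env - env.mean()  # drop the DC component of the envelope
    mag = np.abs(np.fft.rfft(env)) / len(env)
    freqs = np.fft.rfftfreq(len(env), d=1.0 / fs)
    return freqs, mag

def peak_modulation_frequency(x, fs, fmin=0.5, fmax=32.0):
    """Frequency of the largest AM-spectrum component within [fmin, fmax] Hz.

    The band limits are assumptions for this sketch, not values from the study.
    """
    freqs, mag = am_spectrum(x, fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(mag[band])]

# Synthetic stand-in for speech: a 500 Hz carrier modulated at 4 Hz.
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
signal = (1.0 + 0.8 * np.sin(2 * np.pi * 4.0 * t)) * np.sin(2 * np.pi * 500.0 * t)
peak = peak_modulation_frequency(signal, fs)
```

For this synthetic signal the dominant modulation component falls at the imposed 4 Hz rate. A closer approximation of the representation described above would first split the signal into auditory frequency bands and then pass each envelope through a bank of logarithmically spaced modulation filters.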

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Humans
Language*
Linguistics*
Sound Spectrography
Speech Acoustics*
Speech Production Measurement