Source and Filter Acoustic Measures of Young, Middle-Aged and Elderly Adults for Application in Vowel Synthesis

Giovanna Castilho Davatz; Rosiane Yamasaki; Adriana Hachiya; Domingos Hiroshi Tsuji; Arlindo Neto Montagnoli

doi:10.1016/j.jvoice.2021.08.025

J Voice. 2024 Mar;38(2):253-263. doi: 10.1016/j.jvoice.2021.08.025. Epub 2021 Oct 28.

Authors

Giovanna Castilho Davatz¹, Rosiane Yamasaki², Adriana Hachiya³, Domingos Hiroshi Tsuji³, Arlindo Neto Montagnoli⁴

Affiliations

¹ Interunit Graduate Program in Bioengineering, Programa de Pós-Graduação Interunidades em Bioengenharia da EESC/IQSC/FMRP - USP - University of São Paulo - Av. Trabalhador São-carlense, 400, São Carlos/SP, Brazil, Zip Code: 13566-590.
² Federal University of São Paulo, Universidade Federal de São Paulo - UNIFESP - Department of Speech-Language Pathology - R. Botucatu, 802 - Vila Clementino - São Paulo/SP, Brazil, Zip Code: 04023-062. Electronic address: r.yamasaki@unifesp.br.
³ Department of Otolaryngology of Clinical Hospital of University of São Paulo - Faculdade de Medicina da Universidade de São Paulo (FMUSP) - Rua, Av. Dr. Enéas Carvalho de Aguiar, 255, São Paulo/SP, Brazil, Zip Code: 05403-000.
⁴ Federal University of São Carlos, Universidade Federal de São Carlos - UFSCar - Department of Electrical Engineering - Rodovia Washington Luís, km 235 - São Carlos/SP, Brazil, Zip Code: 13565-905.

PMID: 34756498

Abstract

Introduction: The voice changes substantially throughout life because of anatomical and physiological modifications in the larynx and vocal tract. Understanding the acoustic characteristics of speech from young adulthood to old age may assist in synthesizing voices that are representative of men and women of different age groups.

Objective: To obtain the fundamental frequency (f₀), formant frequencies (F₁, F₂, F₃, F₄), and formant bandwidths (B₁, B₂, B₃, B₄) from the sustained vowel /a/ produced by young, middle-aged, and elderly adult speakers of Brazilian Portuguese, and to present the application of these parameters in vowel synthesis.

Study design: Prospective study.

Methods: Acoustic analysis was performed on 162 tokens of the sustained vowel /a/ produced by vocally healthy men and women between 18 and 80 years old. The participants were divided into three groups: young adults (18 to 44 years old), middle-aged adults (45 to 59 years old), and elderly adults (60 to 80 years old). The f₀, F₁, F₂, F₃, F₄, B₁, B₂, B₃, and B₄ were extracted from the audio signals. Their average values were applied to a source-filter mathematical model to synthesize the vowel for each age group, for both men and women.
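The abstract does not specify which source-filter model was used. As a minimal sketch of the general idea, assuming an impulse-train source at f₀ and a cascade of Klatt-style second-order digital resonators (one per formant), the synthesis step could look like the following. All function names are hypothetical, and the f₀, formant, and bandwidth values are illustrative placeholders, not the averages measured in the study.

```python
import math


def resonator_coeffs(freq_hz, bw_hz, fs):
    """Coefficients of a second-order resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2].

    The pole radius is set by the bandwidth and the pole angle by the
    centre frequency; a is chosen so that the gain at 0 Hz equals 1.
    """
    t = 1.0 / fs
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def apply_resonator(x, coeffs):
    """Filter the sequence x through one resonator (zero initial state)."""
    a, b, c = coeffs
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, fs):
    """Simplified glottal source: one unit impulse per fundamental period."""
    n = int(duration_s * fs)
    period = fs / f0_hz  # samples between consecutive impulses
    out = [0.0] * n
    k = 0
    while int(round(k * period)) < n:
        out[int(round(k * period))] = 1.0
        k += 1
    return out


def synthesize_vowel(f0_hz, formants_hz, bandwidths_hz, duration_s=0.5, fs=16000):
    """Pass the source through one resonator per formant, then peak-normalize."""
    signal = impulse_train(f0_hz, duration_s, fs)
    for freq, bw in zip(formants_hz, bandwidths_hz):
        signal = apply_resonator(signal, resonator_coeffs(freq, bw, fs))
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]


# ASSUMPTION: placeholder parameters for illustration only; replace them
# with the group averages reported in the full paper.
samples = synthesize_vowel(
    f0_hz=200.0,
    formants_hz=[800.0, 1300.0, 2800.0, 3900.0],
    bandwidths_hz=[90.0, 110.0, 170.0, 250.0],
)
```

In this sketch, the spacing between impulses is the fundamental period (fs / f₀ samples), each resonator places a spectral peak at its formant frequency, and a larger bandwidth produces a broader, lower peak. An impulse train is a deliberately crude source; a more realistic glottal pulse shape would change the spectral tilt but not the structure of the cascade.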

Results: Young women had higher f₀ than middle-aged and elderly women. Elderly women had lower F₁ than middle-aged women. Young women had higher F₂ than elderly women. For men, the source and filter acoustic measures were statistically equivalent among the age groups. Average values of f₀, F₁, F₂, F₃, F₄, B₁, and B₂ were higher in women than in men. The spacing between wave cycles in the signals, the positions of the formant frequencies, and the widths of the bandwidths visible in the spectra of the synthesized sounds represent the average values extracted from the volunteers' productions of the sustained vowel /a/ in Brazilian Portuguese.

Conclusion: The sustained vowel /a/ produced by women showed different values of f₀, F₁, and F₂ between age groups, which was not observed for men. In addition to f₀ and the formant frequencies, the bandwidths also differed between women and men. The synthetic vowels made available represent the acoustic changes found for each sex as a function of age.

Keywords: Acoustical measurements; Formant frequency; Source-filter concept; Speech synthesis.

MeSH terms

Acoustics
Adolescent
Adult
Aged
Aged, 80 and over
Female
Humans
Male
Middle Aged
Phonetics
Prospective Studies
Sound
Speech Acoustics*
Voice*
Young Adult