Source and Filter Acoustic Measures of Young, Middle-Aged and Elderly Adults for Application in Vowel Synthesis

J Voice. 2024 Mar;38(2):253-263. doi: 10.1016/j.jvoice.2021.08.025. Epub 2021 Oct 28.

Abstract

Introduction: The output sound has important changes throughout life due to anatomical and physiological modifications in the larynx and vocal tract. Understanding the young adult to the elderly speech acoustic characteristics may assist in the synthesis of representative voices of men and women of different age groups.

Objective: To obtain the fundamental frequency (f0), formant frequencies (F1, F2, F3, F4), and bandwidth (B1, B2, B3, B4) values extracted from the sustained vowel /a/ of young, middle-aged, and elderly adults who are Brazilian Portuguese speakers; to present the application of these parameters in vowel synthesis.

Study design: Prospective study.

Methods: The acoustic analysis of tokens of the 162 sustained vowel /a/ produced by vocally healthy adults, men, and women, between 18 and 80 years old, was performed. The adults were divided into three groups: young adults (18 to 44 years old); middle-aged adults (45 to 59 years old) and, elderly adults (60 to 80 years old). The f0, F1, F2, F3, F4, B1, B2, B3, B4 were extracted from the audio signals. Their average values were applied to a source-filter mathematical model to perform vowel synthesis in each age group both men and woman.

Results: Young women had higher f0 than middle-aged and elderly women. Elderly women had lower F1 than middle-aged women. Young women had higher F2 than elderly women. For the men's output sound, the source-filter acoustic measures were statistically equivalent among the age groups. Average values of the f0, F1, F2, F3, F4, B1, and B2 were higher in women. The sound waves distance in signals, the position of formant frequencies and the dimension of the bandwidths visible in spectra of the synthesized sounds represent the average values extracted from the volunteers' emissions for the sustained vowel /a/ in Brazilian Portuguese.

Conclusion: Sustained vowel /a/ produced by women presented different values of f0,F1 and F2 between age groups, which was not observed for men. In addition to the f0 and the formant frequencies, the bandwidths were also different between women and men. The synthetic vowels available represent the acoustic changes found for each sex as a function of age.

Keywords: Acoustical measurements; Formant frequency; Source-filter concept; Speech synthesis.

MeSH terms

  • Acoustics
  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Female
  • Humans
  • Male
  • Middle Aged
  • Phonetics
  • Prospective Studies
  • Sound
  • Speech Acoustics*
  • Voice*
  • Young Adult