A perceptual model of vowel recognition based on the auditory representation of American English vowels

J Acoust Soc Am. 1986 Apr;79(4):1086-100. doi: 10.1121/1.393381.

Abstract

A quantitative perceptual model of human vowel recognition based on psychoacoustic and speech perception data is described. At an intermediate auditory stage of processing, the specific bark difference level of the model represents the pattern of peripheral auditory excitation as the distance in critical bands (barks) between neighboring formants and between the fundamental frequency (F0) and first formant (F1). At a higher, phonetic stage of processing, represented by the critical bark difference level of the model, the transformed vowels may be dichotomously classified based on whether the difference between formants in each dimension falls within or exceeds the critical distance of 3 bark for the spectral center of gravity effect [Chistovich et al., Hear. Res. 1, 185-195 (1979)]. Vowel transformations and classifications correspond well to several major phonetic dimensions and features by which vowels are perceived and traditionally classified. The F1-F0 dimension represents vowel height, and high vowels have F1-F0 differences within 3 bark. The F3-F2 dimension corresponds to vowel place of articulation, and front vowels have F3-F2 differences of less than 3 bark. As an inherent, speaker-independent normalization procedure, the model provides excellent vowel clustering while greatly reducing between-speaker variability. It offers robust normalization through feature classification because its gross binary categorization tolerates considerable acoustic variability. There was generally less formant and bark difference variability for closely spaced formants than for widely spaced formants. These findings agree with independently observed perceptual results and support both Stevens' quantal theory of vowel production and the perceptual constraints on production predicted from the critical bark difference level of the model.
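As a rough illustration of the classification scheme described above, the following minimal Python sketch converts F0 and the first three formants to the bark scale and applies the 3-bark criterion to the F1-F0 (height) and F3-F2 (frontness) dimensions. It is not the authors' implementation: the Hz-to-bark conversion here is Traunmüller's (1990) approximation, which may differ from the conversion used in the paper, and the function names and example formant values are hypothetical.

    import math

    def hz_to_bark(f_hz: float) -> float:
        """Convert frequency in Hz to the bark (critical-band) scale.

        Uses Traunmueller's (1990) approximation; the paper's exact
        conversion formula may differ.
        """
        return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

    # Critical distance (in bark) for the spectral center of gravity
    # effect, per the abstract's 3-bark criterion.
    CRITICAL_DISTANCE = 3.0

    def classify_vowel(f0: float, f1: float, f2: float, f3: float) -> dict:
        """Dichotomously classify a vowel from F0 and formants (in Hz).

        F1-F0 difference within 3 bark  -> high vowel;
        F3-F2 difference within 3 bark  -> front vowel.
        """
        b0, b1, b2, b3 = (hz_to_bark(f) for f in (f0, f1, f2, f3))
        return {
            "F1-F0 (bark)": b1 - b0,
            "F3-F2 (bark)": b3 - b2,
            "high": (b1 - b0) < CRITICAL_DISTANCE,   # vowel height dimension
            "front": (b3 - b2) < CRITICAL_DISTANCE,  # place of articulation
        }

    # Illustrative (hypothetical) values roughly typical of an adult
    # male /i/; both differences fall within 3 bark, so the vowel is
    # classified as high and front.
    print(classify_vowel(f0=130, f1=280, f2=2250, f3=2900))

Because each dimension is reduced to a binary within/beyond-3-bark decision, the classification tolerates substantial between-speaker variation in the raw formant frequencies, which is the normalization property the abstract describes.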

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Child
  • Female
  • Humans
  • Male
  • Phonetics*
  • Psychoacoustics
  • Sound Spectrography
  • Speech Perception*