Effects of disfluencies, predictability, and utterance position on word form variation in English conversation

J Acoust Soc Am. 2003 Feb;113(2):1001-24. doi: 10.1121/1.1534836.

Abstract

Function words, especially frequently occurring ones such as (the, that, and, and of), vary widely in pronunciation. Understanding this variation is essential both for cognitive modeling of lexical production and for computer speech recognition and synthesis. This study investigates which factors affect the forms of function words, especially whether they have a fuller pronunciation (e.g., thi, thaet, aend, inverted-v v) or a more reduced or lenited pronunciation (e.g., thax, thixt, n, ax). It is based on over 8000 occurrences of the ten most frequent English function words in a 4-h sample from conversations from the Switchboard corpus. Ordinary linear and logistic regression models were used to examine variation in the length of the words, in the form of their vowel (basic, full, or reduced), and whether final obstruents were present or not. For all these measures, after controlling for segmental context, rate of speech, and other important factors, there are strong independent effects that made high-frequency monosyllabic function words more likely to be longer or have a fuller form (1) when neighboring disfluencies (such as filled pauses uh and um) indicate that the speaker was encountering problems in planning the utterance; (2) when the word is unexpected, i.e., less predictable in context; (3) when the word is either utterance initial or utterance final. Looking at the phenomenon in a different way, frequent function words are more likely to be shorter and to have less-full forms in fluent speech, in predictable positions or multiword collocations, and utterance internally. Also considered are other factors such as sex (women are more likely to use fuller forms, even after controlling for rate of speech, for example), and some of the differences among the ten function words in their response to the factors.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adult
  • Age Factors
  • Aged
  • Attention*
  • Data Interpretation, Statistical
  • Female
  • Humans
  • Male
  • Middle Aged
  • Phonetics*
  • Regression Analysis
  • Semantics
  • Sex Factors
  • Sound Spectrography
  • Speech Acoustics
  • Speech Disorders / diagnosis*
  • Speech Production Measurement / statistics & numerical data
  • Verbal Behavior*