An acoustic feature-based similarity scoring system for speech rehabilitation assistance

Disabil Rehabil Assist Technol. 2016 Aug;11(6):501-15. doi: 10.3109/17483107.2015.1027297. Epub 2015 Apr 7.

Abstract

The purpose of this study was to develop a tool to assist speech therapy and rehabilitation. The tool scores a patient's speech automatically by comparing it with a reference recording of normal speech on several aspects: pitch, vowels, voiced-unvoiced segments, strident fricatives and sound intensity. Pitch estimation used a cepstrum-based algorithm, chosen for its robustness; vowel classification used a multilayer perceptron (MLP) to classify vowels from pitch and formants; and strident fricative detection was based on the intensity and location of the major spectral peak and on whether pitch was present in the segment. To evaluate the performance of the system, this study analyzed speech recordings from eight patients (four males, four females; aged 4-58 years), which had been recorded in a previous study in cooperation with Taipei Veterans General Hospital and Taoyuan General Hospital. In the pitch experiment, the cepstrum method produced a gross pitch error of 5.3% over a total of 2086 frames. For vowel classification, the MLP method achieved 93% accuracy for men, 87% for women and 84% for children. Overall, 156 of the tool's grading results (81%) were consistent with the 192 audio and visual observations made by four experienced respondents.

Implications for Rehabilitation

  • Difficulties in communication may limit a person's ability to transfer and exchange information. Because speech is one of the primary means of communication, there is a need for speech diagnosis and rehabilitation. Advances in computer-assisted speech therapy (CAST) technology improve the quality and time efficiency of the diagnosis and treatment of speech disorders.
  • The present study attempted to develop a tool to assist speech therapy and rehabilitation that provides a simple interface, so that the assessment can be done even by patients themselves without particular knowledge of speech processing, while also providing deeper analysis of the speech that can be useful to the speech therapist.
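The abstract names a cepstrum-based pitch algorithm and reports a gross pitch error rate, but gives no implementation details. The Python sketch below illustrates the general technique only and is not the authors' code. The function names, the Hamming window, the 60-400 Hz search range and the 20% deviation threshold for a "gross" error are all assumptions made for illustration.

```python
import numpy as np


def cepstrum_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of one voiced frame from its real cepstrum.

    fmin/fmax bound the search range; they are illustrative assumptions.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    log_magnitude = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_magnitude)
    # A periodic voice source shows up as a peak at the quefrency
    # (in samples) equal to the pitch period.
    q_min = int(sample_rate / fmax)
    q_max = min(int(sample_rate / fmin), len(cepstrum) // 2 - 1)
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    return sample_rate / peak


def gross_pitch_error(reference_hz, estimated_hz, threshold=0.2):
    """Fraction of frames whose estimate deviates from the reference
    by more than `threshold` (relative). The 20% default is an assumption;
    the abstract does not state the criterion used."""
    reference = np.asarray(reference_hz, dtype=float)
    estimated = np.asarray(estimated_hz, dtype=float)
    deviation = np.abs(estimated - reference) / reference
    return float(np.mean(deviation > threshold))


# Usage on a synthetic harmonic-rich tone (not patient data).
sample_rate = 16000
t = np.arange(1024) / sample_rate
tone = sum(np.sin(2 * np.pi * 200.0 * k * t) / k for k in range(1, 40))
print(cepstrum_pitch(tone, sample_rate))
```

A real system would also need a voiced-unvoiced decision before calling the pitch estimator, since the cepstral peak is only meaningful for voiced frames.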

Keywords: Computer-assisted speech therapy; multilayer perceptron; speech disorder; speech processing; strident fricative detection.

Publication types

  • Clinical Trial

MeSH terms

  • Adolescent
  • Adult
  • Algorithms*
  • Child
  • Child, Preschool
  • Female
  • Humans
  • Male
  • Middle Aged
  • Pitch Perception
  • Speech Therapy / instrumentation
  • Speech Therapy / methods*
  • Therapy, Computer-Assisted / instrumentation
  • Therapy, Computer-Assisted / methods*
  • Young Adult