Deep learning architectures for estimating breathing signal and respiratory parameters from speech recordings

Venkata Srikanth Nallanthighal; Zohreh Mostaani; Aki Härmä; Helmer Strik; Mathew Magimai-Doss

doi:10.1016/j.neunet.2021.03.029

Neural Netw. 2021 Sep:141:211-224. doi: 10.1016/j.neunet.2021.03.029. Epub 2021 Apr 5.

Authors

Venkata Srikanth Nallanthighal¹, Zohreh Mostaani², Aki Härmä³, Helmer Strik⁴, Mathew Magimai-Doss⁵

Affiliations

¹ Philips Research, Eindhoven, The Netherlands; Centre for Language Studies (CLS), Radboud University Nijmegen, The Netherlands. Electronic address: srikanth.nallanthighal@philips.com.
² Idiap Research Institute, Martigny, Switzerland; Ecole polytechnique fédérale de Lausanne, Lausanne, Switzerland.
³ Philips Research, Eindhoven, The Netherlands.
⁴ Centre for Language Studies (CLS), Radboud University Nijmegen, The Netherlands.
⁵ Idiap Research Institute, Martigny, Switzerland.

PMID: 33915446
DOI: 10.1016/j.neunet.2021.03.029

Abstract

Respiration is an essential and primary mechanism for speech production. We first inhale and then produce speech while exhaling. When we run out of breath, we stop speaking and inhale. Though this process is involuntary, speech production involves a systematic outflow of air during exhalation, characterized by the linguistic content and prosodic factors of the utterance. Speech and respiration are thus closely related, and modeling this relationship makes it plausible to sense respiratory dynamics directly from speech; however, this has not been well explored. In this article, we conduct a comprehensive study of techniques for sensing the breathing signal and breathing parameters from speech using deep learning architectures, and we address the challenges involved in establishing the practical purpose of this technology. Estimating the breathing pattern from speech would give us information about respiratory parameters, thus enabling us to understand respiratory health from a person's speech.
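
To illustrate the last point, the following minimal sketch shows one way a respiratory parameter (breathing rate) could be read off a breathing signal once such a signal is available. It is not the authors' method: the function name `breathing_rate_bpm`, the peak-counting rule, the `min_gap_s` parameter, and the synthetic sine-wave signal are all assumptions made for illustration, and the deep learning model that would estimate the signal from speech is not shown.

```python
import math


def breathing_rate_bpm(signal, sample_rate_hz, min_gap_s=1.5):
    """Estimate breaths per minute by counting peaks in a breathing signal.

    Hypothetical helper: a peak is a local maximum above the signal mean,
    and peaks closer together than min_gap_s seconds are ignored.
    """
    mean = sum(signal) / len(signal)
    min_gap = int(min_gap_s * sample_rate_hz)
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > mean:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    if len(peaks) < 2:
        return 0.0  # not enough breaths to measure an interval
    # Number of peak-to-peak intervals divided by the time they span.
    span_s = (peaks[-1] - peaks[0]) / sample_rate_hz
    return 60.0 * (len(peaks) - 1) / span_s


# Synthetic stand-in for an estimated breathing signal (assumption):
# 60 seconds sampled at 10 Hz, one breath every 4 seconds.
sample_rate_hz = 10
breath_hz = 0.25
signal = [
    math.sin(2 * math.pi * breath_hz * n / sample_rate_hz)
    for n in range(60 * sample_rate_hz)
]

rate = breathing_rate_bpm(signal, sample_rate_hz)
print(f"Estimated breathing rate: {rate:.1f} breaths/min")
```

Real breathing signals are noisier and less regular than this sine wave, so a practical version would likely need smoothing before peak detection.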

Keywords: Deep neural networks; Respiratory parameters; Signal processing; Speech breathing; Speech technology.

MeSH terms

Adult
Deep Learning*
Female
Humans
Linguistics
Male
Respiration*
Speech*
Young Adult