Eur Respir J. 2019 Feb 14. pii: 1801660. doi: 10.1183/13993003.01660-2018. [Epub ahead of print]

Artificial intelligence outperforms pulmonologists in the interpretation of pulmonary function tests.

Author information

Respiratory Medicine, University Hospital Leuven, Chronic Diseases, Metabolism and Ageing, KU Leuven, Belgium.
Cochin Hospital, Assistance Publique Hôpitaux de Paris, Université Paris Descartes, Sorbonne Paris Cité, Paris, France.
Department of Respiratory Medicine, Hospital Oost-Limburg, Genk, Belgium.
Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium.
Department of Respiratory Medicine, AZ Sint-Jan Hospital, Bruges, Belgium.
Department of Pulmonary Medicine, Canisius Wilhelmina Hospital, Nijmegen, The Netherlands.
Department of Pulmonary Medicine and Tuberculosis, University of Groningen, and University Medical Center Groningen, Groningen, The Netherlands.
Université Catholique de Louvain (UCL), Department of Pneumology, Cliniques universitaires St-Luc, Brussels, Belgium.
Department of Respiratory Medicine, University Hospital, Liege, Belgium.
Department of Respiratory Medicine, Saint-Pierre Hospital, Université Libre de Bruxelles, Brussels, Belgium.
Service hospitalier universitaire de Pneumologie et Physiologie, Centre Hospitalier Universitaire Grenoble Alpes, Université Grenoble Alpes, France.
Department of Pulmonary Medicine, Centre Hospitalier de Luxembourg, Luxembourg.
Department of Respiratory Medicine, Onze-Lieve-Vrouw Hospital, Aalst, Belgium.
Department of Medicine, Pulmonary and Critical Care Medicine, University Medical Center Giessen and Marburg, Marburg, Germany, member of the German Center for Lung Research (DZL).
Department of Respiratory Medicine, Maastricht University Medical Center, Maastricht, The Netherlands.
Department of Pneumology, Jessa Hospital, Hasselt, Belgium.


The interpretation of pulmonary function tests (PFTs) to diagnose respiratory diseases is built on expert opinion, which relies on the recognition of patterns and clinical context for the detection of specific diseases. In this study, we aimed to explore the accuracy and inter-rater variability of pulmonologists when interpreting PFTs and compared it against that of artificial intelligence (AI)-based software which was developed and validated on more than 1500 historical patient cases. 120 pulmonologists from 16 European hospitals evaluated 50 cases, each with PFT and clinical information, resulting in 6000 independent interpretations. AI software examined the same data. ATS/ERS guidelines were used as the gold standard for PFT pattern interpretation. The gold standard for diagnosis was derived from clinical history, PFT and all additional tests. The pattern recognition of PFTs by pulmonologists (senior 73%, junior 27%) matched the guidelines in 74.4% (±5.9) of the cases (range: 56-88%). The inter-rater variability (kappa=0.67) pointed to a common agreement. Pulmonologists made correct diagnoses in 44.6% (±8.7) of the cases (range: 24-62%) with a large inter-rater variability (kappa=0.35). The AI-based software perfectly matched the PFT pattern interpretations (100%) and assigned a correct diagnosis in 82% of all cases (p<0.0001 for both measures). The interpretation of PFTs by pulmonologists leads to marked variations and errors. AI-based software provides more accurate interpretations and may serve as a powerful decision support tool to improve clinical practice.
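
The abstract reports inter-rater variability as kappa values but does not say which kappa variant was used. As an illustration only, the sketch below computes Fleiss' kappa, a common choice when many raters label the same cases; that choice is an assumption, not the authors' stated method. The function name fleiss_kappa and the toy counts are hypothetical and are not study data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned case i to category j.
    Every case must be rated by the same number of raters (at least 2).
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per case")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must have the same number of raters")
    n_categories = len(counts[0])

    # Observed agreement per case: fraction of rater pairs that agree.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of each category.
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(s * s for s in shares)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")

    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy example (hypothetical): 2 cases, 3 raters, 2 categories.
full_agreement = [[3, 0], [0, 3]]
split_votes = [[2, 1], [1, 2]]
print(fleiss_kappa(full_agreement))
print(fleiss_kappa(split_votes))
```

In the study design described above, such a table would have 50 rows (cases) with 120 ratings per row, which is where the 6000 interpretations come from (120 × 50 = 6000).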
