Ann Emerg Med. 2016 Dec;68(6):729-735. doi: 10.1016/j.annemergmed.2016.02.018. Epub 2016 Mar 29.

Examining Reliability and Validity of an Online Score (ALiEM AIR) for Rating Free Open Access Medical Education Resources.

Author information

Division of Emergency Medicine, McMaster University, Hamilton, ON, Canada; Academic Life in Emergency Medicine and the MedEdLIFE Research Collaborative. Electronic address:
Department of Emergency Medicine, University of California, Olive View, CA, and Academic Life in Emergency Medicine.
Regions Hospital and the Department of Emergency Medicine, University of Minnesota.
Department of Family and Community Medicine, Faculty of Medicine, University of Toronto, and the Wilson Centre for Health Professions Education, University Health Network, Toronto, ON, Canada.
Department of Emergency Medicine, Oregon Health & Science University, Portland, OR.
Academic Life in Emergency Medicine and the MedEdLIFE Research Collaborative; Department of Emergency Medicine, University of California, San Francisco, CA.



Since 2014, Academic Life in Emergency Medicine (ALiEM) has used the Approved Instructional Resources (AIR) score to critically appraise online content. The primary goals of this study are to determine the interrater reliability (IRR) of the ALiEM AIR rating score and determine its correlation with expert educator gestalt. We also determine the minimum number of educator-raters needed to achieve acceptable reliability.


Eight educators each rated 83 online educational posts with the ALiEM AIR scale. Items include accuracy, use of evidence-based medicine, referencing, utility, and the Best Evidence in Emergency Medicine rating score. A generalizability study was conducted to determine IRR and the rating variance contributed by facets such as rater, blog, post, and topic. A randomized selection of 40 blog posts previously rated through ALiEM AIR was then rated again by a blinded group of expert medical educators according to their gestalt. Their gestalt impression was subsequently correlated with the ALiEM AIR score.


The IRR for the ALiEM AIR rating scale was 0.81 during the 6-month pilot period. Decision studies showed that at least 9 raters were required to achieve this reliability. Spearman correlations between mean AIR score and the mean expert gestalt ratings were 0.40 for recommendation for learners and 0.35 for their colleagues.
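As an illustration of how a decision study projects reliability for different numbers of raters, the following is a minimal sketch under simplifying assumptions. It assumes a single-facet design (posts crossed with raters) and the standard absolute-decision coefficient, in which the absolute error variance is divided by the number of raters averaged over. The function names and the variance components are hypothetical placeholders; they are not values or methods reported by the study, which modeled additional facets.

```python
def projected_phi(var_object: float, var_abs_error: float, n_raters: int) -> float:
    """Projected absolute-decision reliability when scores are averaged over n_raters.

    var_object:    variance attributable to the rated objects (here, posts)
    var_abs_error: absolute error variance for a single rater
                   (rater variance plus residual variance in a one-facet design)
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    return var_object / (var_object + var_abs_error / n_raters)


def min_raters_for(target: float, var_object: float, var_abs_error: float,
                   max_raters: int = 1000):
    """Smallest number of raters whose projected reliability reaches the target.

    Returns None if the target is not reached within max_raters.
    """
    for n in range(1, max_raters + 1):
        if projected_phi(var_object, var_abs_error, n) >= target:
            return n
    return None


# Hypothetical variance components, chosen only for illustration.
VAR_POST = 1.0
VAR_ERROR = 1.5

if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(n, round(projected_phi(VAR_POST, VAR_ERROR, n), 3))
    print("raters needed for 0.9:", min_raters_for(0.9, VAR_POST, VAR_ERROR))
```

In this kind of projection, reliability rises with each added rater but with diminishing returns, which is why a decision study can identify a minimum panel size for a chosen reliability threshold.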


The ALiEM AIR scale is a moderately to highly reliable, 5-question tool when used by medical educators for rating online resources. The score displays a fair correlation with expert educator gestalt in regard to the quality of the resources.

[Indexed for MEDLINE]