Int J Emerg Med. 2018 Sep 28;11(1):39. doi: 10.1186/s12245-018-0198-3.

Challenges in measuring ACGME competencies: considerations for milestones.

Author information

Department of Educational Psychology, University of North Texas, 1155 Union Circle #311335, Denton, TX, 76203, USA.
Department of Emergency Medicine, American University of Beirut Medical Center, Hamra, Beirut, Lebanon.
Department of Emergency Medicine, American University of Beirut Medical Center, Hamra, Beirut, Lebanon.



Measuring milestones, competencies, and sub-competencies as residents progress through a training program is an essential strategy in the Accreditation Council for Graduate Medical Education's (ACGME) attempts to ensure that graduates meet expected professional standards. Previous studies have found, however, that physicians often make global ratings using a single criterion.


We extended these studies by using advanced statistical analysis to examine the validity of ACGME International competency measures in an international setting, across emergency medicine (EM) and neurology, and across evaluators. Confirmatory factor analysis (CFA) models were fitted to both the EM and the neurology data. A single-factor CFA was hypothesized to fit each dataset, and this model was then modified based on model fit indices. Differences in how individual EM physicians perceived the core competencies were tested using a series of measurement invariance tests.


Extremely high alpha reliability coefficients, factor coefficients (> .93), and item correlations indicated multicollinearity, that is, most items being evaluated could essentially replace the underlying construct itself. This was true for both EM and neurology data, as well as all six EM faculty.
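
To make the reliability argument concrete, the following is a minimal sketch of how an alpha (Cronbach's alpha) coefficient and inter-item correlations could be computed for a six-item evaluation form. The ratings are hypothetical and invented purely for illustration; they are not the study's data, and the function names are assumptions rather than anything used by the authors.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for rows of item scores (one row per evaluation)."""
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two evaluations")
    # Sum of the sample variances of each item (column).
    item_var_sum = sum(variance(col) for col in zip(*ratings))
    # Sample variance of the total score per evaluation (row sum).
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def pearson(x, y):
    """Pearson correlation between two equally long score sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def min_inter_item_correlation(ratings):
    """Smallest pairwise correlation among the items (columns)."""
    columns = list(zip(*ratings))
    return min(pearson(a, b) for a, b in combinations(columns, 2))


# Hypothetical ratings: six evaluations, each scoring six competencies (1-5).
# The items move almost in lockstep, as they would if one overall impression
# were driving every score.
hypothetical_ratings = [
    [3, 3, 3, 3, 3, 3],
    [4, 4, 4, 4, 4, 3],
    [2, 2, 2, 2, 2, 2],
    [5, 5, 5, 5, 4, 5],
    [4, 4, 4, 4, 4, 4],
    [1, 1, 2, 1, 1, 1],
]

alpha = cronbach_alpha(hypothetical_ratings)
lowest_r = min_inter_item_correlation(hypothetical_ratings)
print(f"alpha = {alpha:.3f}, lowest inter-item r = {lowest_r:.3f}")
```

When every item tracks the same overall impression, alpha approaches 1 and the pairwise correlations are all very high, which is the pattern the abstract describes as multicollinearity: the items are too redundant to be treated as separate measurements.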


Evaluation forms measuring the six core ACGME competencies did not possess adequate validity. Severe multicollinearity exists for the six competencies in this study. ACGME is introducing milestones with 24 sub-competencies. Attempting to measure these as discrete elements, without recognizing the inherent weaknesses in the tools used, will likely serve to exacerbate an already flawed strategy. Physicians likely use their "gut feelings" to judge a resident's overall performance. A better process could be conceived in which this subjectivity is acknowledged, contributing to more meaningful evaluation and feedback.
