Psychiatr Serv. 2019 Dec 18:appips201900295. doi: 10.1176/ [Epub ahead of print]

Defining Success in Measurement-Based Care for Depression: A Comparison of Common Metrics.

Author information

Kaiser Permanente Washington Health Research Institute, Seattle (Coley, Simon); Department of Biostatistics (Coley) and Department of Biomedical Informatics and Medical Education (Hartzler), University of Washington, Seattle; Institute for Health Research, Kaiser Permanente Colorado, Denver (Boggs, Beck).



The National Committee for Quality Assurance recommends response and remission as indicators of successful depression treatment for the Healthcare Effectiveness Data and Information Set (HEDIS). Effect size and severity-adjusted effect size (SAES) offer alternative metrics. This study compared these measures and examined the relationship between baseline symptom severity and treatment success.


Electronic records from two large integrated health systems (Kaiser Permanente Colorado and Washington) were used to identify 5,554 new psychotherapy episodes with a baseline Patient Health Questionnaire (PHQ-9) score of ≥10 and a PHQ-9 follow-up score from 14-180 days after treatment initiation. Treatment success was defined for four measures: response (≥50% reduction in PHQ-9 score), remission (PHQ-9 score <5), effect size ≥0.8, and SAES ≥0.8. Descriptive analyses examined agreement of measures. Logistic regression estimated the association between baseline severity and success on each measure. Sensitivity analyses evaluated the impact of various outcome definitions and loss to follow-up.
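The success definitions above can be sketched in a few lines of Python. Response and remission follow the thresholds stated in the abstract. The effect-size check is an assumption: the abstract does not give the formula, so this sketch uses the common convention of dividing the change in PHQ-9 score by a reference standard deviation that the caller supplies. SAES is omitted because its adjustment is not described here. The names `Episode`, `is_response`, `is_remission`, and `is_large_effect` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    baseline: int   # PHQ-9 score at treatment initiation (study required >= 10)
    follow_up: int  # PHQ-9 score at follow-up


def is_response(ep: Episode) -> bool:
    """Response: at least a 50% reduction from the baseline PHQ-9 score."""
    return (ep.baseline - ep.follow_up) >= 0.5 * ep.baseline


def is_remission(ep: Episode) -> bool:
    """Remission: follow-up PHQ-9 score below 5."""
    return ep.follow_up < 5


def is_large_effect(ep: Episode, reference_sd: float, threshold: float = 0.8) -> bool:
    """ASSUMED definition: change in score divided by a reference SD.

    The abstract states only the 0.8 cutoff; the divisor is an assumption.
    """
    return (ep.baseline - ep.follow_up) / reference_sd >= threshold


# A severe episode that improves by 10 points but not by half:
severe = Episode(baseline=24, follow_up=14)
# A moderate episode that improves by 6 points and drops below 5:
moderate = Episode(baseline=10, follow_up=4)
```

With an illustrative reference SD of 5 (an assumed value, not one from the study), the severe episode clears the effect-size cutoff without meeting response or remission, while the moderate episode meets all three. This is one way to see how the measures can disagree on the same data.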


Effect size ≥0.8 was most frequently attained (72% across sites), followed by SAES ≥0.8 (66%), response (46%), and remission (22%). Response was the only measure not associated with baseline PHQ-9 score. Effect size ≥0.8 favored episodes with a higher baseline PHQ-9 score (odds ratio [OR]=2.3, p<0.001, for 10-point difference in baseline PHQ-9 score), whereas SAES ≥0.8 (OR=0.61, p<0.001) and remission (OR=0.43, p<0.001) favored episodes with lower baseline scores.
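The odds ratios above are reported per 10-point difference in baseline PHQ-9 score. Assuming the usual logistic model in which log-odds are linear in the baseline score (the abstract does not state the model form), an odds ratio can be rescaled to a different step size by scaling its logarithm. The helper name `rescale_odds_ratio` is hypothetical.

```python
import math


def rescale_odds_ratio(odds_ratio: float, from_points: float, to_points: float) -> float:
    """Convert an odds ratio for a `from_points` difference in a predictor
    into the odds ratio for a `to_points` difference.

    Assumes log-odds are linear in the predictor, so log(OR) scales
    proportionally with the size of the difference.
    """
    return math.exp(math.log(odds_ratio) * to_points / from_points)


# Reported OR for effect size >= 0.8, per 10-point baseline difference:
per_point = rescale_odds_ratio(2.3, from_points=10, to_points=1)
```

Under that assumption, the reported OR of 2.3 per 10 points corresponds to roughly 1.09 per single PHQ-9 point.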


Response is preferable for comparing treatment outcomes, because it does not favor more or less baseline symptom severity, indicates clinically meaningful improvement, and is transparent and easy to calculate.


Depression; Measurement-based care; Performance measures; Psychotherapy outcomes; Quality of care; Treatment response

