The Veterans Health Administration (VA) uses quality improvement strategies including clinical practice guidelines, clinical reminders in the electronic medical record, and performance measurement to improve care processes. For veterans with depression and other mental illnesses managed in primary care settings, the VA has recently made major investments in integrated primary care-mental health programs. This project was nominated by Ira Katz, Deputy Chief, Patient Care Services for Mental Health, and by Carla Cassidy and Joe Francis, Office of Quality and Performance, with input from a technical expert panel, and was assigned to the Durham VA Evidence Synthesis Team. The overall goal was to synthesize data on two key issues – the responsiveness of depression severity instruments and the minimum duration of treatment with antidepressants – to inform future quality improvement efforts.

The final key questions (KQ) are:

KQ1: In patients with major depressive disorder treated in primary care settings, what assessment tools are responsive to change? This review should specifically address instruments that are feasible for the primary care setting.

KQ2: In primary care patients with major depressive disorder who remit with antidepressant medication, what is the minimum treatment duration to decrease the risk of relapse or recurrence? This review will focus on patients without comorbid substance abuse, post-traumatic stress disorder, psychosis or other conditions where guidelines would recommend specialty based care.


We searched Medline and PsycINFO for literature published from 1950 through February 2009. For key question one (KQ1), we searched for relevant primary literature. For key question two (KQ2), our search strategy was designed to identify recent high-quality systematic reviews and any relevant randomized controlled trials (RCTs) published since the most recent such review. We identified a high-quality review that included articles published through March 2007; our search for RCTs therefore covered articles published from January 2007 through February 2009. Appendix A provides the search strategy in detail. We reviewed reference lists of pertinent studies for additional citations. All citations were imported into an electronic database (EndNote X1).


Two trained researchers reviewed the titles and/or abstracts of citations identified from literature searches. Full-text articles of potentially relevant citations were retrieved for further review. Each article was reviewed with a brief screening form (see Appendix B) to determine eligibility and record reasons for exclusion. When the two reviewers disagreed, they met to identify and resolve the discrepancy. Eligible articles had English-language abstracts and provided primary data relevant to the key questions. Eligibility criteria varied depending on the question of interest, as described below.
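As an illustration only, the step of identifying disagreements between the two reviewers could be sketched as follows. This is a minimal sketch that assumes each reviewer's screening decisions are recorded as a mapping from citation identifier to an include/exclude flag. The function name, data layout, and citation identifiers are hypothetical and are not taken from the screening form in Appendix B.

```python
def find_disagreements(reviewer_a, reviewer_b):
    """Return citation IDs on which the two reviewers' decisions differ.

    Each argument is a dict mapping a citation ID to True (eligible)
    or False (excluded). This layout is an assumption for illustration.
    Only citations screened by both reviewers are compared.
    """
    shared = reviewer_a.keys() & reviewer_b.keys()
    return sorted(cid for cid in shared if reviewer_a[cid] != reviewer_b[cid])


# Hypothetical decisions for three citations
reviewer_a = {"cit-001": True, "cit-002": False, "cit-003": True}
reviewer_b = {"cit-001": True, "cit-002": True, "cit-003": True}

to_discuss = find_disagreements(reviewer_a, reviewer_b)
```

Citations returned by such a check would be the ones the two reviewers meet to discuss.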

To be included in our evidence report for KQ1, a study had to:

  • Evaluate the Beck Depression Fast Screen [24], Center for Epidemiologic Studies Depression Scale 10-item version [25], DEPS scale [26], Geriatric Depression Scale 15-item version [27], Patient Health Questionnaire-9 [28], or Symptom Driven Diagnostic System-PC [29]
  • Compare the depression questionnaire to an interview-based depression severity assessment such as the Hamilton Depression Rating Scale or Clinical Global Impression
  • Use a longitudinal study design so that response to change could be assessed
  • Be conducted in adult patients with depressive disorder followed in the outpatient setting and
  • Be published in English

We restricted the depression questionnaires to those that had been identified in a previous systematic review [14, 15] as having adequate performance characteristics to identify patients with major depression in primary care settings, that had a range of scores sufficient to show change, and that were feasible for use as self- or interviewer-administered instruments. Thus, questionnaires with a very limited scoring range (e.g., Yale, PRIME-MD) or with more than 10 items (e.g., 21-item Beck Depression Inventory, 21-item Center for Epidemiologic Studies Depression Scale, Hopkins Symptom Checklist) were not considered. Although the Geriatric Depression Scale has 15 items, we included this measure because it is specifically cited as an option in the VA/DOD Major Depression Guideline.

To be included in our evidence report for KQ2, a study had to:

We then applied quality criteria (see below) and retained the most recent high-quality systematic review. We included newly identified studies if they were randomized controlled trials rather than reviews and if they met all other criteria described for systematic reviews.


We abstracted the following data from included studies: study design/setting, eligibility criteria/method for assembling cohort, exclusion criteria, sample size, duration of follow-up, demographics, clinical category/baseline depression, results, and conclusions. For KQ1, we also abstracted information on the method of administration and version of the depression questionnaire and on the interview-based depression evaluation. For KQ2, we also abstracted information on the intervention and comparator and on the follow-up rate. Data abstractions were completed by a single reviewer, then over-read for accuracy by 1–2 additional reviewers. Any disagreements were resolved by discussion and consensus.


To assess internal validity of studies, we used criteria appropriate to the study design (see Appendix C). For KQ1, we abstracted data on whether the interview-based assessment was performed blind to the depression questionnaire results; whether the depression questionnaire was performed blind to the interview-based assessment; whether the interview-based assessment was adequate; the completeness of follow-up; whether the analytic methods were appropriate; study funding; and whether a conflict of interest statement was given.

For KQ2, we abstracted data separately for systematic reviews and for randomized controlled trials. For systematic reviews, we abstracted the search methods and strategy; whether inclusion/exclusion criteria were clearly defined and appropriate; whether primary studies were appropriately evaluated for quality; whether the assessments were reproducible; whether variability was analyzed; whether results were combined appropriately; whether publication bias was assessed; and whether clinically important outcomes, including harms and benefits, were reported. For randomized trials, we determined whether the method of randomization and allocation concealment was adequate; whether intervention and control groups were similar at baseline regarding the most important prognostic indicators; whether the outcome was assessed using a valid methodology and by a blinded assessor; whether the care provider was blinded; whether the patient was blinded; whether loss to follow-up was < 20% and differential loss between groups was < 10%; whether missing outcome data were addressed adequately; and whether there was a conflict of interest.
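The attrition criterion for randomized trials (loss to follow-up < 20% and differential loss between groups < 10%) can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: overall loss is computed on participants pooled across arms, and differential loss is taken as the difference in percentage points between the arms with the highest and lowest loss. The report does not specify these computational details, and the function name and the counts below are hypothetical.

```python
def assess_attrition(randomized, completed,
                     max_overall=0.20, max_differential=0.10):
    """Check the two attrition thresholds for one trial.

    randomized, completed: dicts mapping arm name -> participant count.
    Assumptions (not from the report): overall loss is pooled across
    arms; differential loss is the gap between the highest and lowest
    per-arm loss proportions.
    """
    per_arm_loss = {
        arm: (randomized[arm] - completed[arm]) / randomized[arm]
        for arm in randomized
    }
    total_randomized = sum(randomized.values())
    total_completed = sum(completed[arm] for arm in randomized)
    overall_loss = (total_randomized - total_completed) / total_randomized
    differential_loss = max(per_arm_loss.values()) - min(per_arm_loss.values())
    return {
        "overall_loss": overall_loss,
        "differential_loss": differential_loss,
        "meets_criterion": (overall_loss < max_overall
                            and differential_loss < max_differential),
    }


# Hypothetical trial: 100 participants randomized to each arm
result = assess_attrition(
    randomized={"antidepressant": 100, "placebo": 100},
    completed={"antidepressant": 90, "placebo": 85},
)
```

In this hypothetical trial, 25 of 200 participants are lost overall (12.5%) and the arms differ by 5 percentage points, so both thresholds are met.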


We constructed evidence tables showing the study characteristics and results for all included studies, organized by key question. We critically analyzed studies to compare their characteristics, methods, and findings. We compiled a summary of findings for each key question or clinical topic, and drew conclusions based on qualitative synthesis of the findings. We assigned an overall quality of evidence using the GRADE criteria.[30]


A draft version of this report was sent to four peer reviewers. Their comments and our responses are presented in Appendix D.

