Figure 1. Summary of literature search and review process for primary literature (number of articles)
The Agency for Healthcare Research and Quality (AHRQ), through its Evidence-Based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHRQ and conduct additional analyses when appropriate prior to developing their reports and assessments.
To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHRQ encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release.
AHRQ expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality.
We welcome comments on this evidence report. They may be sent by mail to the Task Order Officer named below at: Agency for Healthcare Research and Quality, 540 Gaither Road, Rockville, MD 20850, or by e-mail to epc@ahrq.gov.
Carolyn M. Clancy, M.D.
Director
Agency for Healthcare Research and Quality
Beth A. Collins-Sharp, Ph.D., R.N.
Director, EPC Program
Agency for Healthcare Research and Quality
Jean Slutsky, P.A., M.S.P.H.
Director, Center for Outcomes and Evidence
Agency for Healthcare Research and Quality
Ernestine Murray, B.S.N., R.N., M.A.S.
EPC Program Task Order Officer
Agency for Healthcare Research and Quality
The Evidence-based Practice Center thanks Karen Robinson for her assistance in developing the search strategies; Lori Bash, Gabriel Lai, Rachel Millstein, and Chidinma Ibe for their assistance with article reviewing and data entry; Renee Wilson, Brenda Zacharko, and Laura Barnes for their assistance with final preparations of the report.
Objective: Despite the broad range of continuing medical education (CME) offerings aimed at educating practicing physicians through the provision of up-to-date clinical information, physicians commonly overuse, under-use, and misuse therapeutic and diagnostic interventions. It has been suggested that the ineffective nature of CME either accounts for the discrepancy between evidence and practice or at a minimum contributes to this gap. Understanding what CME tools and techniques are most effective in disseminating and retaining medical knowledge is critical to improving CME and thus diminishing the gap between evidence and practice. The purpose of this review was to comprehensively and systematically synthesize evidence regarding the effectiveness of CME and differing instructional designs in terms of knowledge, attitudes, skills, practice behavior, and clinical practice outcomes.
Methods: We formulated specific questions with input from external experts and representatives of the Agency for Healthcare Research and Quality (AHRQ) and the American College of Chest Physicians (ACCP), which nominated this topic. We systematically searched the literature using specific eligibility criteria, hand searching of selected journals, and electronic databases including: MEDLINE®, EMBASE®, the Cochrane Database of Systematic Reviews, The Cochrane Central Register of Controlled Trials (CENTRAL), the Cochrane Database of Abstracts of Reviews of Effects (DARE), PsycINFO, and the Educational Resource Information Center (ERIC®). Two independent reviewers conducted title scans, abstract reviews, and then full article reviews to identify eligible articles. Each eligible article underwent double review for data abstraction and assessment of study quality.
Results: Of the 68,000 citations identified by literature searching, 136 articles and 9 systematic reviews ultimately met our eligibility criteria. The overall quality of the literature was low and consequently firm conclusions were not possible. Despite this, the literature overall supported the concept that CME was effective, at least to some degree, in achieving and maintaining the objectives studied, including knowledge (22 of 28 studies), attitudes (22 of 26), skills (12 of 15), practice behavior (61 of 105), and clinical practice outcomes (14 of 33). Common themes included that live media was more effective than print, multimedia was more effective than single media interventions, and multiple exposures were more effective than a single exposure. The number of articles that addressed internal and/or external characteristics of CME activities was too small and the studies too heterogeneous to determine if any of these are crucial for CME success. Evidence was limited on the reliability and validity of the tools that have been used to assess CME effectiveness. Based on previous reviews, the evidence indicates that simulation methods in medical education are effective in the dissemination of psychomotor and procedural skills.
Conclusion: Despite the low quality of the evidence, CME appears to be effective at the acquisition and retention of knowledge, attitudes, skills, behaviors and clinical outcomes. More research is needed to determine with any degree of certainty which types of media, techniques, and exposure volumes as well as what internal and external audience characteristics are associated with improvements in outcomes.
Continuing medical education (CME) is defined as educational activities that serve to maintain, develop, or increase the knowledge, skills, performance, and relationships a physician uses to provide services for patients, the public, or the profession. Despite the broad range of CME aimed at educating practicing physicians, researchers have found that physicians commonly overuse, underuse, and misuse therapeutic and diagnostic interventions. It has been suggested that CME may not be effective enough to significantly narrow the gap between what is done in clinical practice and what should be done based on current evidence. Understanding what CME tools and techniques are most effective in disseminating and retaining medical knowledge is critical to improving the effectiveness of CME and thus diminishing the gap between evidence and practice.
To date, relatively little has been done to comprehensively and systematically synthesize evidence regarding the effectiveness of CME and the comparative effectiveness of differing instructional designs for CME in terms of impact on knowledge, attitudes, skills, practice behavior, and clinical practice outcomes. Review of evidence elucidating the value of CME (and ways the activities could be improved, if appropriate) could yield tremendous value to policy makers and professional organizations seeking to make recommendations regarding the optimal delivery of medical care.
The American College of Chest Physicians (ACCP) recognized the potential value of identifying and synthesizing the evidence in this area, and nominated this topic to the Evidence-based Practice Center (EPC) Program of the Agency for Healthcare Research and Quality (AHRQ). In response to this request by the ACCP, the Johns Hopkins EPC performed a systematic review to address the following key questions (KQ) pertaining to the effectiveness of CME:
KQ1 Is there evidence that particular methods of delivering CME are more effective in: a) imparting knowledge to physicians, b) changing physician attitudes, c) acquiring skills, d) changing physician practice behavior, or e) changing clinical practice outcomes?
KQ2 Do changes in knowledge, attitudes, skills, practice behavior, or clinical practice outcomes produced by CME persist over time (greater than or equal to 30 days)?
KQ3 What is the evidence from systematic reviews about the effectiveness of simulation methods in medical education outside of CME?
KQ4 Which characteristics of the audience by themselves or in combination with other characteristics influence the effectiveness of certain educational techniques?
KQ5 Which external factors by themselves or in combination with other factors reinforce the effects of CME in changing behavior?
KQ6 What is the reported validity and reliability of the methods that have been used for measuring the effects of CME in terms of: a) imparting knowledge, b) changing attitudes, c) acquiring skills, d) changing practice behavior, or e) changing clinical practice outcomes?
To answer these questions, we identified primary literature on the effectiveness of CME and systematic reviews on the effectiveness of simulation techniques in medical education by running searches through February 2006 of the following databases: MEDLINE®, EMBASE®, the Cochrane Database of Systematic Reviews, The Cochrane Central Register of Controlled Trials (CENTRAL), the Cochrane Database of Abstracts of Reviews of Effects (DARE), PsycINFO©, and the Educational Resource Information Center (ERIC®). Additionally, we searched by hand the references of included articles and the table of contents of selected journals from February 2005 through February 2006.
Two independent reviewers conducted title scans in a parallel fashion. If either reviewer felt that a title was potentially eligible, it was promoted to abstract review. The abstract review phase was designed to identify studies reporting on the effects of CME or simulation on clinical practice in terms of knowledge, attitudes, skills, practice behaviors, or clinical outcomes. Abstracts were promoted to full article review if both reviewers agreed the abstract met our specific inclusion criteria.
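The two promotion rules described above differ in an important way: a title advances on either reviewer's vote, while an abstract advances only when both reviewers agree. The following is a minimal sketch of that logic in Python; the `Votes` class and function names are illustrative assumptions, not part of the review's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Votes:
    """Hypothetical record of two independent reviewers' eligibility votes."""
    reviewer_a: bool
    reviewer_b: bool


def promote_title(votes: Votes) -> bool:
    # Title scan: promoted if EITHER reviewer felt it was potentially eligible.
    return votes.reviewer_a or votes.reviewer_b


def promote_abstract(votes: Votes) -> bool:
    # Abstract review: promoted only if BOTH reviewers agreed it met criteria.
    return votes.reviewer_a and votes.reviewer_b


split = Votes(reviewer_a=True, reviewer_b=False)
print(promote_title(split))     # True: one vote suffices at the title stage
print(promote_abstract(split))  # False: agreement is required at the abstract stage
```

The inclusive rule at the title stage favors sensitivity (few eligible articles are lost early), while the stricter rule at the abstract stage favors specificity.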
| Media method | Definition |
|---|---|
| Live | Any CME activity that is conducted in-person |
| Computer-based, off-line | Any CME activity that is conducted on the computer, but is not conveyed through the Internet (e.g., CD-ROM) |
| Internet, real-time (e.g., streaming) | Any CME activity that is conducted real-time via the Internet |
| Internet, not real-time | Any CME activity that is conducted via the Internet, but is not conducted in real-time |
| Video | Any CME activity that uses a videotape to convey its message |
| Audio | Any CME activity that uses an audiotape to convey its message |
| Handheld | Any CME activity that involves handheld materials (e.g., laminated card) |
| Print | Any CME activity that is conducted via educational printed materials or readings |
We graded the quantity, quality, and consistency of the available evidence addressing KQs 1, 2, and 3 by adapting an evidence grading scheme recommended by the GRADE Working Group. We applied evidence grades to bodies of evidence on each type of objective (i.e., knowledge, attitudes, skills, practice behaviors, and clinical outcomes).
A total of 39 studies addressed 41 knowledge objectives. Only 28 of those studies had a control group.
Seventy-eight percent of the 28 studies with an adequate control group demonstrated that CME activities were effective at improving knowledge with the majority (68 percent) of these studies demonstrating long-term improvements in knowledge.
The studies were heterogeneous, making it difficult to determine how results differed according to media type, educational technique, or number of exposures.
The only recognized trends regarding differences by media type were that multimedia interventions (e.g., combined use of live and print media) were better than a single media intervention and that print interventions were either not beneficial or very weak in their ability to improve knowledge. We defined “media” as the method through which the CME activity is delivered.
When these studies were reviewed according to educational technique, it appeared that multiple techniques that most commonly included case-based learning were more likely to improve knowledge when compared to a single technique. Case-based learning is an educational technique where actual or authored clinical cases are created to highlight learning objectives; clinical material is presented and followed with questions usually determined by the instructor.
The evidence also suggested that multiple exposures produced better knowledge gains than a single exposure to content. Exposure was defined as one session versus more than one. An additional session could have used print media, computer media, a repeat live performance, or audio tape.
A total of 35 studies addressed 45 attitude objectives. Thirty-one of the studies had a control group.
Seventy-one percent of the 31 studies with an adequate control group demonstrated that CME activities were effective at improving attitudes, such as attitudes regarding use of screening tests or clinical management options. The majority (68 percent) of these studies demonstrated long-term improvements in attitudes.
The studies were heterogeneous, again making it difficult to determine how results differed according to media type, educational technique, or number of exposures.
The only recognized trends regarding differences by media type were that multimedia interventions were better than a single media intervention and that print interventions were either not beneficial or very weak in their ability to improve attitudes.
The only recognized trend regarding differences by educational technique was that use of multiple techniques that most commonly include case-based learning seemed to be more likely to improve attitudes than use of a single technique.
The evidence suggested a trend toward multiple exposures being of greater benefit for attitudinal change than a single exposure, although it must be pointed out that all seven studies that evaluated a single exposure did demonstrate improvements in attitudes.
Twelve (80 percent) of the 15 studies that reported skill outcomes involved cognitive skills (i.e., ability to apply knowledge), with the remaining three involving psychomotor skills (i.e., procedures or physical examination techniques). Little can be said about the effectiveness of CME for psychomotor skills given the paucity of data in this area.
Seven (47 percent) of the 15 studies reporting skill outcomes had an evaluation beyond 30 days after the CME activity. Six of the seven studies addressed the long-term effect of CME on cognitive skills, and five of the six demonstrated a positive effect.
Given the dominance of live methods (seven live, four print, two video, two audio, three Internet/computer) among the studies that met their skill objectives, the data suggested that live methods had the greatest impact on the effectiveness of CME regarding skill-related outcomes. Given the paucity of data and the varied results, little can be said about the relative effectiveness of other CME media methods on skills. Based on the limited data, it is difficult to draw conclusions about which media methods have a long-term impact on skills.
Given the limited number of studies, the wide variety of techniques described, and the conflicting results, it is difficult to draw conclusions about the educational techniques that have the greatest short- and long-term effects on skills.
Most of the studies that met their skill objectives had multiple exposures to the CME activity as did most of the studies that evaluated the long-term effect on skills.
A total of 105 studies evaluated the impact of CME on short- and long-term physician practice behavior objectives, and 61 (58 percent) of the studies met practice objectives. Fifty studies with evaluation duration greater than 30 days met objectives, suggesting not only short-term, but also long-term CME effectiveness.
Most studies that evaluated the impact of different types of CME media found that use of single live media had both a short- and long-term effect on practice behavior objectives, and that single print media was ineffective. Most of the studies suggested that multimedia-based CME has both a short- and long-term effect on practice behavior objectives.
Of the studies evaluating the short- and long-term impact of different types of CME techniques on practice behavior objectives, most reported mixed results for a single technique, and overall effectiveness for use of multiple techniques. The use of multiple techniques may be advantageous over the use of a single technique.
Most studies evaluating the short- and long-term impact of different volumes of exposure to CME on practice behavior objectives suggested that both single and multiple exposures are effective overall.
Thirty-seven studies evaluated the impact of CME on clinical outcomes.
Only one of the 37 studies measured short-term clinical outcomes and it suggested an inconclusive effect. Three of the studies did not report the time at which clinical outcomes were measured.
Of the 33 studies that measured long-term clinical outcomes, 14 (42 percent) were successful in demonstrating a beneficial effect of CME.
When evaluating the impact of different types of CME media on long-term clinical outcomes, the use of multiple media was more effective than use of single media in six of the seven studies that made this comparison.
Of the five studies that compared clinical outcomes with use of single versus multiple educational techniques in CME, three showed that multiple techniques were superior to a single technique.
In four (57 percent) of the seven studies that evaluated the impact of a single CME exposure on clinical outcomes, the CME objective was met. However, insufficient data were available on whether multiple CME exposures produce better clinical outcomes than single exposure CME.
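The percentages quoted for clinical outcomes follow directly from the study counts given above. As a minimal sketch, the arithmetic can be reproduced as follows; the `percent` helper and the dictionary labels are illustrative assumptions, and only the counts come from the text.

```python
def percent(successes: int, total: int) -> int:
    """Return successes/total as a whole-number percentage (rounding half up)."""
    return int(successes / total * 100 + 0.5)


# Counts taken from the clinical outcome findings reported above.
counts = {
    "long-term studies showing benefit": (14, 33),
    "comparisons favoring multiple media": (6, 7),
    "comparisons favoring multiple techniques": (3, 5),
    "single-exposure studies meeting objective": (4, 7),
}

for label, (k, n) in counts.items():
    print(f"{label}: {k} of {n} = {percent(k, n)} percent")
```

With denominators as small as five or seven studies, a single study changes the percentage by 14 to 20 points, which is one reason the findings are described as trends rather than firm conclusions.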
Eight reviews evaluated the role of simulation in skill acquisition, while two reviews evaluated the role of simulators in knowledge acquisition. Simulation methods used in these reviews included computer simulations (e.g., learning cardiovascular physiology), virtual reality (e.g., learning laparoscopic procedures), standardized patients (e.g., learning effective communication), and manikins (e.g., learning physical diagnosis).
Overall, the direction of evidence pointed to the effectiveness of simulation training, especially in psychomotor skills (i.e., procedures or physical examination techniques) and communication skills.
The strength of the evidence was considered low, due to the small number of appropriate studies, the scarcity of quantitative data, and a number of study limitations.
Thirteen studies examined the influence of audience characteristics on the educational intervention. These included such characteristics as age, gender, practice setting, and years in practice, among others.
The small and heterogeneous studies available did not allow us to reach definitive conclusions regarding the influence of audience characteristics on the effectiveness of CME.
Five studies examined the influence of external factors on the educational intervention. The small and heterogeneous studies available did not allow us to reach definitive conclusions regarding the influence of external factors on the effectiveness of CME.
Forty-five of 136 articles (33 percent) reported the validity and/or reliability within the study population of at least one evaluation method for assessing the effectiveness of CME, for a total of 61 methods. Validity refers to the degree to which a method truly measures what it is intended to measure. Reliability refers to the consistency or reproducibility of measurements.
Of the 61 evaluation methods with validity or reliability reported, 29 evaluation methods were drawn from previous studies, and 24 were created for the current studies. For eight methods, the source was unclear. Authors frequently did not report reliability testing within the new study population for methods found to be reliable in other populations.
Of 61 evaluation methods with validity or reliability reported, 16 (26 percent) included descriptions of validity alone, 28 (46 percent) included descriptions of reliability alone, and nine (15 percent) had descriptions of both validity and reliability. For six methods (10 percent), the methods were described as valid and/or reliable, but the specific type of validity or reliability was not reported.
The most common type of outcome evaluated by valid and/or reliable evaluation methods involved practice behaviors, for 31 out of 61 methods (51 percent). Knowledge or cognitive skills were evaluated by 15 methods (24 percent). Attitudes were evaluated by seven methods (12 percent). Skills were evaluated by 11 methods (18 percent). Clinical outcomes (with or without practice behaviors) were evaluated by eight methods (13 percent).
Among these 61 methods, content validity (i.e., the degree to which an instrument accurately represents the skill or characteristic it was designed to measure) was the most commonly reported type of validity (27 percent).
Among these 61 methods, inter-rater reliability (28 percent) and internal consistency (23 percent) were the most common types of reliability reported. Inter-rater reliability is the degree to which measurements are the same when obtained by different persons. Internal consistency is a measure of how well items reflecting the same construct yield similar results.
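To make the two reliability concepts concrete, the sketch below computes one commonly used statistic for each: Cohen's kappa for inter-rater reliability and Cronbach's alpha for internal consistency. These particular statistics are an assumption chosen for illustration; the included studies may have used other measures, and the function names and sample data are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical ratings.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


def cronbach_alpha(items):
    """Internal consistency; items is a list of per-item score lists,
    each covering the same respondents in the same order."""
    k = len(items)
    sum_item_var = sum(pvariance(scores) for scores in items)
    totals = [sum(resp) for resp in zip(*items)]  # total score per respondent
    return k / (k - 1) * (1 - sum_item_var / pvariance(totals))


# Hypothetical ratings: two raters judging four charts as adherent (1) or not (0).
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5

# Hypothetical two-item questionnaire answered by three respondents.
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))
```

In the first example the raters agree on three of four charts (75 percent), but half of that agreement would be expected by chance, so kappa is 0.5. In the second, the two items rank respondents identically, so alpha is at its maximum of 1 (up to floating-point rounding).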
Overall, despite the generally low quality of the evidence, most of the studies reviewed suggested that CME is effective, at least to some degree, in not only achieving, but also in maintaining the objectives studied. Despite the wide variety of CME techniques, media, and exposures used, and the heterogeneity of the studies, we found common themes among studies which applied across objectives. For example, when assessing the effectiveness of CME across domains, print media seemed to be less effective than live media, and multimedia activities generally seemed more effective than single media. In addition, interactive techniques seem to be more effective than non-interactive ones, and multiple exposures to the CME activity seem to be more effective than a single exposure. Thus, the evidence supports consideration of these attributes of effective educational interventions when designing a CME course.
We evaluated the effect of simulation methods in medical education by conducting a review of systematic reviews. Although we found that simulation training generally was effective, especially in the dissemination of psychomotor skills (e.g., procedures or physical examination techniques), studies which examined simulation did not review outcomes along the entire continuum of domains (i.e., knowledge through clinical outcomes), and were heterogeneous enough that few other conclusions could be drawn.
We also studied whether certain internal (audience) and external characteristics or factors may affect the effectiveness of CME. We found that the small and heterogeneous studies available did not allow us to reach definitive conclusions regarding the influence of audience characteristics or external factors on the effectiveness of CME.
This review has several important limitations. The heterogeneous nature of the studies precludes a quantitative summary of the effectiveness of CME. The educational interventions studied targeted different types of audiences, using multiple types of objectives across diverse content areas. This makes it difficult to generalize results from one field of CME to another.
Furthermore, we cannot draw firm conclusions about the effectiveness of CME because of the generally low quality of study designs, the variable quality of reporting in studies, and the lack of valid and reliable CME evaluation tools. Although we used a comprehensive search strategy, we cannot rule out some degree of publication bias. The review does point out a lack of standardization of approaches to CME research in general, including the lack of standardization for definitions of controls. The CME literature in general lacks standardization of terminology related to media type, educational techniques, and exposure volume, which makes it difficult to determine the impact of these factors on the effectiveness of CME.
Educators should develop strategies for identifying and prioritizing the gaps in our knowledge about CME that should be the focus of further research.
More randomized controlled studies of CME should be performed with clear definition of intervention and control groups and measurement of effectiveness at multiple points post-intervention. Such studies should focus on high priority areas given the resource limitations that educators face in conducting research on CME.
To advance such research on CME, leaders in medical education could develop a national agenda on what is needed most to improve the effectiveness of CME.
In developing a national agenda for research on CME, educational leaders should establish a clear definition of what constitutes CME. For example, does quality improvement or practice improvement alone constitute CME?
Future research on CME should include development of more standardized approaches to describe CME interventions, media, techniques, and exposure volumes.
Further studies should examine emerging methods of CME such as Internet-based CME that could be available to clinicians at the point of care.
Future research on CME should be based on a sound conceptual model of what influences the effectiveness of CME.
Continuing medical education (CME) was initiated by the American Academy of General Practice, which has required CME for membership since 1947.1 The American Medical Association (AMA) presently defines continuing medical education as “educational activities that serve to maintain, develop, or increase the knowledge, skills, and professional performance and relationships a physician uses to provide services for patients, the public, or the profession.”2 The AMA defines the content of CME as “that body of knowledge and skills generally recognized and accepted by the profession as within the basic medical sciences, the discipline of clinical medicine, and the provision of health care to the public.”2 In 1971 New Mexico became the first state to require CME credit for relicensure. Physicians commonly spend an average of 50 hours per year in CME activities geared toward improving their performance and optimizing their care of patients. Participation in CME is not voluntary as, in 41 states, it is a requirement for continued physician licensure and is often a requirement for hospital credentials and participation in many managed care plans. This has led to the proliferation of CME offerings aimed at providing physicians with the CME credits they require.
Despite the broad range of CME offerings aimed at educating the nation's practicing physicians through the provision of up-to-date clinical information, researchers have found that physicians commonly overuse, underuse, and misuse therapeutic and diagnostic interventions.3 Some medical educators have suggested that CME may not be effective enough to significantly narrow the gap between what is done in clinical practice and what should be done based on current evidence. Understanding what CME tools and techniques are most effective in disseminating and retaining medical knowledge is critical to improving the effectiveness of CME and thus diminishing the gap between evidence and practice.
To date, relatively little has been done to comprehensively and systematically synthesize evidence regarding the effectiveness of CME and the comparative effectiveness of differing instructional designs for CME and their impact on knowledge, attitudes, skills, practice behavior, and clinical practice outcomes. Review of evidence elucidating the value of CME (and ways the activities could be improved, if appropriate) could yield tremendous value to policy makers and professional organizations seeking to make policy recommendations regarding the optimal delivery of medical care. Such a review could have particular significance in light of the robust Maintenance of Certification (MOC) program recently initiated by the American Board of Medical Specialties (ABMS), which requires evidence of self-directed learning that is commonly accomplished through attendance at CME activities.4 In addition, as the National Institutes of Health (NIH) and the Agency for Healthcare Research and Quality (AHRQ) place more emphasis on translating scientific knowledge into clinical practice,5 educators will need to apply evidence-based approaches to CME that will support translating knowledge into practice. Furthermore, the published literature demonstrates a large gap between recommended processes of care and those delivered.6 Effective CME is required to help fill that gap.
The American College of Chest Physicians (ACCP) recognized the potential value of identifying and synthesizing the evidence in this area, and nominated this topic to the Evidence-based Practice Center Program (EPC) of AHRQ. In response to this request by the ACCP, the Johns Hopkins EPC performed a systematic review to address the following key questions pertaining to the effectiveness of CME:
KQ1 Is there evidence that particular methods of delivering CME are more effective in: a) imparting knowledge to physicians, b) changing physician attitudes, c) acquiring skills, d) changing physician practice behavior, or e) changing clinical practice outcomes?
KQ2 Do changes in knowledge, attitudes, skills, practice behavior, or clinical practice outcomes produced by CME persist over time (greater than or equal to 30 days)?
KQ3 What is the evidence from systematic reviews about the effectiveness of simulation methods in medical education outside of CME?
KQ4 Which characteristics of the audience by themselves or in combination with other characteristics influence the effectiveness of certain educational techniques?
KQ5 Which external factors by themselves or in combination with other factors reinforce the effects of CME in changing behavior?
KQ6 What is the reported validity and reliability of the methods that have been used for measuring the effects of CME in terms of: a) imparting knowledge, b) changing attitudes, c) acquiring skills, d) changing practice behavior, or e) changing clinical practice outcomes?
The ACCP requested an evidence report to review and synthesize published literature regarding the effectiveness of CME, the comparative effectiveness of instructional designs for CME, and their impact on imparting knowledge, changing attitudes, acquiring skills, changing practice behavior, and changing clinical practice outcomes. Our EPC established a team and a work plan to develop the evidence report. The project consisted of recruiting technical experts, formulating and refining the specific questions, performing a comprehensive literature search, summarizing the state of the literature, constructing evidence tables, synthesizing the evidence, and submitting the report for peer review.
The topic for this report was nominated in a public process. At the beginning of the project, we recruited a panel of external technical experts to give input on key steps including the selection and refinement of the questions to be examined. The panel included external experts who have strong expertise in CME (see Appendix A).
We worked with the technical experts and representatives of AHRQ and ACCP to develop the Key Questions that are presented in the Scope and Key Questions section of Chapter 1 (Introduction). Based on the feedback from the technical experts, AHRQ, ACCP, and our team members, we expanded the preliminary questions to include knowledge, attitude, skills, practice behavior, and clinical outcomes and to address potential synergies between learning methods. We refined Key Question 3 to focus on the effectiveness of simulation methods used in medical education. Additionally, we added Key Question 6, which assesses the validity and reliability of tools used to evaluate the effectiveness of CME. The Key Questions focus on the effectiveness of CME in (1) imparting knowledge, (2) changing attitudes, (3) acquiring skills, (4) changing practice behaviors, or (5) changing clinical practice outcomes. We considered any test of physician or CME participant knowledge as knowledge. Attitudes were limited to physician or CME participant attitudes; attitudes could include physician attitudes toward a medical topic, physician comfort level, or satisfaction with the course. Skills were divided into cognitive skills (ability to apply knowledge) and psychomotor skills (e.g., procedures or physical examination techniques). Practice behavior referred to any type of physician behavior. We defined clinical outcomes as any change in patient health status, health-related behavior of patients, or attitudes of the patients about the physicians toward whom the CME intervention was directed. Thus, in addition to direct measures of health status such as blood pressure and fasting blood glucose, we also included indirect measures such as patient satisfaction, medication adherence, and smoking cessation.
Searching the literature included the steps of identifying reference sources, formulating a search strategy for each source, and executing and documenting each search. Additionally, we searched for medical subject heading (MeSH) terms that were relevant to CME. We used a systematic approach for searching the literature, with specific eligibility criteria, to minimize the risk of bias in selecting articles for inclusion in the review. The systematic approach was intended to help identify gaps in the published literature.
Our comprehensive search plan included electronic and hand searching. Beginning in February of 2006 we ran searches of the following databases: MEDLINE®, EMBASE®, the Cochrane Database of Systematic Reviews, The Cochrane Central Register of Controlled Trials (CENTRAL), the Cochrane Database of Abstracts of Reviews of Effects (DARE), PsycINFO©, and the Educational Resource Information Center (ERIC®), to identify primary literature on the effectiveness of CME and systematic reviews on the effectiveness of simulation techniques in medical education.
Hand searching for possibly relevant citations took several forms. From our electronic search, we identified the 13 journals (see Appendix Ba) that were most likely to publish articles on this topic (i.e., these journals had the highest number of abstracts and articles included in the review). We scanned the table of contents of each issue of these journals for relevant citations from February 2005 through February 2006. For the second form of hand searching, reviewers received eligible articles and flagged references of interest for the team to compare to the existing database.
Search strategies, specific to each database, were designed to enable the team to focus available resources on articles most likely to be relevant to the Key Questions. Initially, we developed a core strategy for MEDLINE®, accessed via PubMed®, based on an analysis of the MeSH terms and text words of key articles identified a priori. The PubMed® strategy formed the basis for the strategies developed for the other electronic databases (see Appendix Ca).
The results of the searches were downloaded and imported into ProCite® version 5 (ISI ResearchSoft, Carlsbad, CA). We used the duplication scan feature in ProCite® to delete citations already retrieved. From ProCite®, the articles were uploaded to SRS 3.0 (TrialStat! Corporation, Ottawa, Ontario, Canada), a Web-based software package developed for systematic review data management. Additionally, this database was used to store citations in portable document format (PDF) and to track the search results at title review, abstract review, article inclusion/exclusion, and data abstraction levels. A list of excluded articles is presented in Appendix Da.
After the electronic databases were searched, citations were downloaded into ProCite®, and uploaded to the SRS 3.0 tracking system. The study team scanned all titles. Two independent reviewers conducted title scans in a parallel fashion. For a title to be eliminated at this level, both reviewers had to indicate that it was ineligible. If the two reviewers did not agree on the eligibility of an article, it was promoted to the next level (see Appendix Ea, Title Review Form). The title review phase was designed to capture as many studies as possible reporting on the effectiveness of CME or as many systematic reviews reporting on the effectiveness of simulation in medical education. All titles that were thought to address the above effectiveness issues were promoted to the abstract review phase.
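The title-level decision rule described above can be sketched in a few lines. This is only an illustration of the stated rule (a title is dropped only when both reviewers mark it ineligible); the names `Vote` and `promote_title` are hypothetical and are not part of the SRS 3.0 software.

```python
from enum import Enum


class Vote(Enum):
    """A single reviewer's judgment of a title (hypothetical labels)."""
    ELIGIBLE = "eligible"
    INELIGIBLE = "ineligible"


def promote_title(vote_a: Vote, vote_b: Vote) -> bool:
    """Return True if the title moves on to abstract review.

    A title is eliminated only when BOTH reviewers mark it ineligible;
    any disagreement promotes it to the next level.
    """
    both_ineligible = vote_a is Vote.INELIGIBLE and vote_b is Vote.INELIGIBLE
    return not both_ineligible
```

Under this rule, disagreement between reviewers never removes a title, which is consistent with the stated aim of capturing as many potentially relevant studies as possible at this stage.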
The abstract review phase was designed to identify studies reporting on the effects of CME or simulation in medical education on clinical practice in terms of knowledge, attitudes, skills, practice behaviors, or clinical outcomes. All articles with abstracts meeting these criteria were kept for further review. Abstracts were reviewed independently by two investigators. Abstracts concerning the effectiveness of CME were excluded if both investigators agreed that the article met one or more of the following exclusion criteria: (1) not written in English; (2) contained no human data; (3) no original data and did not apply to Key Question 3; (4) was a meeting abstract, editorial, commentary, or letter; (5) did not include at least 15 fully trained physicians or fewer than 50 percent of the CME participants were fully trained physicians and there was not a separate analysis for fully trained physicians; (6) did not include training or education; (7) did not evaluate an educational activity; (8) published prior to 1981; (9) not conducted in the United States or Canada; (10) did not apply to a Key Question; (11) did not include data from a concurrent or historical comparison group; or (12) involved quality improvement without an educational activity. Since CME accreditation from the Accreditation Council for Continuing Medical Education began in 1981, we decided to limit our review to studies published after that date. We decided to exclude studies not conducted in the United States or Canada because we felt that the medical education systems in other countries could be very different from and not relevant to CME in the United States. To qualify for Key Question 6, an abstract also had to address at least one of the other Key Questions.
Abstracts concerning the effectiveness of simulation in medical education were excluded if both investigators agreed that the article met one or more of the following exclusion criteria: (1) not written in English; (2) contained no human data; (3) was not a systematic review (i.e., identified a question, described a search strategy, described eligibility criteria, and synthesized results either quantitatively or qualitatively); (4) was a meeting abstract, editorial, commentary, or letter; (5) did not include medical students or physicians-in-training; (6) did not include medical training or education; (7) did not evaluate an educational activity; (8) did not involve simulation, virtual reality, manikins, or standardized patients; (9) published prior to 1990; (10) did not apply to Key Question 3; (11) included only fully trained physicians or CME; or (12) did not report separately on the effects of simulation. The decision to limit our review to reviews published after 1990 was based partly on when simulation began to be used in medical education and partly on a desire to focus on reviews that are not too out-of-date. The cut-off date for inclusion in this review was February 2006. Differences in opinions regarding abstract inclusion or exclusion were resolved through consensus adjudication. At this level of inclusion/exclusion, the reviewers were also asked to identify which Key Questions the article might apply to if the article was eligible.
Because of the broad array of potentially eligible articles obtained at the abstract review phase, full articles initially selected for review underwent another independent parallel review by investigators to determine if they should be included for full data abstraction. At this phase of the review, investigators determined which of the Key Questions each article addressed (see Appendix E, Article Inclusion/Exclusion Form). If articles were still deemed to have applicable information, they were included in the final article review. Differences in opinions regarding article inclusion or exclusion were resolved through consensus adjudication.
The purpose of the article review was to confirm the relevance of each article to the research questions, to determine methodological characteristics pertaining to study quality, and to collect evidence that addressed the research questions. Articles eligible for full review could address one or more of the Key Questions. We used standardized forms for data extraction to minimize the risk of bias in how data were extracted from eligible studies and to maximize consistency in identifying all pertinent data available for synthesis. Additionally, we developed definitions and created examples, which were reviewed by the study team, to enhance the consistency of data extraction.
Each article underwent double review by study investigators for full data abstraction and assessment of study quality. For all data abstracted from studies, we used a sequential review process. In this process, the primary reviewer completed all data abstraction forms. The second reviewer confirmed the first reviewer's data abstraction forms for completeness and accuracy. Reviewer pairs were formed to include personnel with both clinical and methodological expertise. A third reviewer re-reviewed a random sample of articles by the first two reviewers to ensure consistency in the classification of the articles. Reviewers were not masked to the articles' authors, institution, or journal.7 In most instances, data were directly abstracted from the article. If possible, relevant data were also abstracted from figures. Differences in opinion were resolved through consensus adjudication. For assessments of study quality and the reporting of adult learning principles, each reviewer independently judged study quality and reporting and rated items on quality assessment and adult learning principles forms (see Appendix E, Data Abstraction Review Forms). The second reviewer provided the assessment of study quality and reporting of adult learning principles used in this report.
All information from the article review process was entered into the SRS 3.0 database by the individual completing the review. Reviewers entered comments into the system whenever applicable. The SRS 3.0 database was used to maintain and clean the data, as well as to create detailed evidence tables and summary tables (see Appendix F and Summary Tables).
| Technique/Educational method | Definition |
|---|---|
| Academic detailing | Detailing provided by an institution or hospital |
| Audience response systems | Addresses knowledge objectives. Used in combination with live lectures or discussion groups, these are computerized feedback tools that allow the teacher/instructor to pose a question to a large group and receive immediate feedback from each learner which is collated and presented on a screen. Instructor may choose to alter content based on audience response |
| Case-based learning | Addresses higher order knowledge and skill objectives. Actual or authored clinical cases are created to highlight learning objectives; clinical material is presented and followed with questions usually determined by the instructor |
| Clinical experiences | Addresses skill, knowledge and attitudinal objectives. Generally refers to a preceptorship or observership with an expert, as in attending a specialty clinic or operating room |
| Demonstration | Addresses skill and or knowledge (knows how) objectives; can be presented live, or with video or audio. Teacher determines amount and pace of content |
| Discussion group | Addresses knowledge, especially application or higher order knowledge, or affective objectives; usually requires preparation with readings, or another experience, such as viewing a videotape, or a role play. Can be facilitated by instructor, but group often determines content |
| Feedback | The provision of information about an individual's performance to learners |
| Lecture | Presentation of knowledge content; live, video, audio or slide presentation available online. Teacher/instructor determines amount and pace of content |
| Mentor/Preceptor | Addresses higher order cognitive, skill and affective objectives. Learner is paired with a mentor who may observe, review documentation of performance, advise, coach, and facilitate learning |
| Point of care | Addresses knowledge and higher order cognitive objectives (decision-making). Information which is provided at the time of clinical need, integrated into chart or electronic medical record |
| Problem-based learning or team-based learning | Addresses higher order knowledge objectives, metacognition and some skill (group work) objectives. A clinical scenario is presented to a team, who identify the learning objectives, assign information-seeking tasks, and return to share information and answer questions about the case. Can be facilitated or non-facilitated |
| Programmed learning | Addresses knowledge objectives. Content is delivered in sequential steps, which are tested with the learner, before moving to the next, usually more complicated step. Pace is determined by the learner, but objectives are set by the program (teacher). Can be delivered in text or online |
| Readings | Presentation of knowledge content or background for attitudinal objectives. Requires learner to complete; can be done at learner's pace. Teacher/instructor directed or self-directed (e.g., journals, newsletters, searching online) |
| Role play | Addresses skill, knowledge and affective objectives. Learners assume role of patients and/ or clinicians in practicing focused encounters around training problems, usually when standardized patients are unavailable. Encounter may be recorded and reviewed or followed with a discussion group. Rarely used as sole method of education |
| Simulation (other than standardized patient or role-play) | Addresses knowledge and skill objectives; ability to simulate potentially addresses higher order integrative objectives, such as responding to an emerging clinical situation, understanding the unfolding of a protein structure, working in teams. Technology can be used for simulation training of procedures, as in endoscopy virtual reality trainers or anesthesia simulators. Includes also models, such as joint injection and suture. Requires active participation of learner; can use multiple learners in some scenarios |
| Standardized patient | Addresses skill and some knowledge and affective objectives. Usually used for communication skills training and assessment, the standardized patient or simulated patient is trained in a specific patient scenario and presentation of a clinical problem. Encounter may be audio or videotaped and timed. Review offers opportunity for reflection and “replay” of the scenario |
| Writing/Authoring | Addresses knowledge and affective objectives. Can include authoring test items and participation in test development. Journaling is used frequently for affective objectives, and may be followed with discussion groups or review with a mentor |
Additionally, each reviewer independently completed a study quality form and a reporting of adult learning principles form. The study quality form was based on the Jadad criteria8 to assess randomization, blinding, and withdrawals, and included additional questions regarding power calculations. On the reporting of adult learning principles form, reviewers assessed how well the article reported the extent to which the CME activity incorporated adult learning principles, such as enabling learners to be active contributors to their learning, relating the curriculum to learners' current experiences, and tailoring the curriculum to learners' past experiences (Romsai Boonyasai, personal communication). The questions on the reporting of adult learning principles form were derived from a review of adult education.9
| Simulation type | Definition |
|---|---|
| Full simulation | Whole room or whole patient simulations |
| Partial task simulation | The use of products to learn or practice a specific skill, such as intubation heads, central venous line chests, intraosseous line legs, or umbilical artery cannulation trainers |
| Computer simulation | The use of computer programs that allow the student to practice decision making skills, specific knowledge sets such as Advanced Cardiac Life Support (ACLS) trainers and trauma management trainers |
| Virtual reality | The use of advanced computerized technology to allow students to learn or practice how to perform cardiac catheterizations, colonoscopies, bronchoscopies, ureteroscopies, laparoscopic surgery, hysteroscopy, arthroscopy, ocular surgery, intravenous line placement, etc. |
| Standardized patient | The use of individuals trained to play the roles of patients, family members, or others to allow students to practice physical exam skills, history taking skills, communication skills, etc. |
| Role play | Participants play roles of patients, family members, or others to allow practice of communication skills, etc. |
Each reviewer also independently completed a study quality form. The study quality form was based on the QUORUM statement10 and assessed the reporting of the study question, search methods, inclusion/exclusion criteria, analysis, quality assessment, and conclusions. Additional questions regarding assessment of publication bias were included.
For studies addressing the influence of audience characteristics and/or external factors (Key Questions 4 and 5), an audience characteristics/external factors form (see Appendix E, Data Abstraction Review Forms) was completed. Data abstracted to this form included the audience characteristic or external factor that was being analyzed, whether a primary goal of the study was to assess the effects of this audience characteristic or external factor, any covariates used in the analysis, and a qualitative summary of the results. Additionally, reviewers abstracted data regarding general study characteristics, CME activity characteristics, outcomes, study quality, and the reporting of adult learning principles.
Data regarding validity and reliability of methods used to assess the effectiveness of CME (Key Question 6) were abstracted to a validity/reliability of tools form (see Appendix E, Data Abstraction Review Forms). Articles need not have used the specific terms “validity” or “reliability” to be included in Key Question 6. If authors did not label the specific type of validity or reliability reported, we classified the type based on the definitions from Reed et al.11 Articles that used a previously validated/reliable method were included if the authors described the method as valid/reliable or described a process or statistic used for psychometric testing. Reviewers also abstracted data regarding general study characteristics, CME activity characteristics, outcomes, study quality, and the reporting of adult learning principles.
Article quality was assessed differently for clinical trials and systematic reviews. The dual, independent review of article quality judged articles on several aspects of each study type's internal validity. Quality assessment of trials was based on the Jadad criteria8 and included: (1) appropriateness of the randomization scheme, (2) appropriateness of the blinding, and (3) description of withdrawals and drop-outs. For each trial, we created a score between 5 (high quality) and 0 (low quality). Two questions regarding power calculations were added to this form; however, the answers to these questions did not factor into the quality score.
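As a rough illustration of how a 0 to 5 trial score of this kind can be computed, the sketch below follows the commonly published form of the Jadad scale (one point each for reported randomization, reported double blinding, and a description of withdrawals, with one point added or removed depending on whether the randomization or blinding method is appropriate). This is an assumption: the exact item wording on the report's own quality form is not given here, and the names `TrialReport` and `jadad_score` are hypothetical. The power calculation questions are deliberately left out because they did not count toward the score.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrialReport:
    """Quality-related facts abstracted from one trial (hypothetical fields)."""
    randomized: bool
    randomization_appropriate: Optional[bool]  # None = method not described
    double_blind: bool
    blinding_appropriate: Optional[bool]       # None = method not described
    withdrawals_described: bool


def _component(reported: bool, appropriate: Optional[bool]) -> int:
    """Score one two-point item: 1 if reported, +1 if appropriate, -1 if not."""
    if not reported:
        return 0
    if appropriate is None:
        return 1
    return 2 if appropriate else 0


def jadad_score(trial: TrialReport) -> int:
    """Return a score from 0 (low quality) to 5 (high quality)."""
    return (
        _component(trial.randomized, trial.randomization_appropriate)
        + _component(trial.double_blind, trial.blinding_appropriate)
        + (1 if trial.withdrawals_described else 0)
    )
```

For example, a trial described as randomized with an appropriate method, not blinded, and with withdrawals described would score 3 under this sketch.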
The quality of each systematic review was assessed using criteria based on the QUORUM statement10: (1) whether the question being addressed by the review was clearly stated; (2) comprehensiveness of search methods used and described in the report; (3) whether inclusion/exclusion criteria were clearly defined and appropriate; (4) whether analyses were conducted to measure variability in efficacy; (5) whether study quality was assessed and done appropriately (using validated instruments); (6) whether differences in how outcomes were reported and analyzed across studies were taken into consideration; (7) whether the study methodology was reproducible; and (8) whether conclusions were supported by the data presented. Additional questions regarding assessment of publication bias were included.
For each Key Question, we created a set of detailed evidence tables containing all information extracted from eligible studies. The investigators reviewed the tables and eliminated items that were rarely reported. For Key Questions 1 and 2, the results were categorized and sorted based on the media method used, educational technique used, and the amount of exposure. Media methods were categorized into single print media (i.e., the CME activity used only print methods), single live media, single Internet media, other single media, multiple media (i.e., the CME activity used more than 1 media method), and single vs. multiple media (i.e., the CME activity for one group used only 1 media method and the CME activity for the other group used more than 1 media method). Educational techniques were categorized into single educational techniques (i.e., the CME activity used only 1 educational technique), multiple educational techniques (i.e., the CME activity used more than 1 educational technique), single vs. multiple, and other/not reported. The amount of exposure was categorized into single exposure (i.e., the CME participants were exposed to the activity on only 1 occasion), multiple exposures (i.e., the CME participants were exposed to the activity on multiple occasions), single vs. multiple exposures, and other/not reported. Investigators used the resulting versions of the evidence tables to prepare the text of the report and selected summary tables.
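The media categories above amount to a small classification rule, sketched here for illustration. The function names, the lowercase media labels, and the fallback value `"unclassified"` are assumptions introduced for this example; the report does not say how studies comparing two different single-media activities were handled, so the sketch leaves that case unclassified rather than guessing.

```python
from typing import List, Set

# Single-media labels named in the text; any other lone medium is "other".
_SINGLE_LABELS = {
    "print": "single print media",
    "live": "single live media",
    "internet": "single Internet media",
}


def activity_media(media: Set[str]) -> str:
    """Classify one CME activity by the set of media it used."""
    if not media:
        return "not reported"
    if len(media) > 1:
        return "multiple media"
    (only,) = media
    return _SINGLE_LABELS.get(only, "other single media")


def study_media(groups: List[Set[str]]) -> str:
    """Classify a study from the media used in each of its CME groups."""
    labels = {activity_media(m) for m in groups}
    kinds = {
        "multiple" if label == "multiple media" else "single"
        for label in labels
        if label != "not reported"
    }
    if kinds == {"single", "multiple"}:
        return "single vs. multiple media"
    if len(labels) == 1:
        return next(iter(labels))
    # Not specified in the source (e.g., print-only vs. live-only groups).
    return "unclassified"
```

The same pattern would apply to educational techniques and amount of exposure, with the category names swapped accordingly.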
For Key Question 6, the data were grouped according to similar evaluation methods to facilitate evaluating validity and reliability of these methods.
Initial data were abstracted by investigators and entered directly into Web-based data collection forms using SRS® 3.0 (TrialStat! Corporation, Ottawa, Ontario, Canada). After a second reviewer reviewed data, adjudicated data were re-entered into the Web-based data collection forms by the second reviewer. Second reviewers were generally more experienced members of the research team, and one of their main priorities was to check the quality and consistency of the first reviewers' answers. In addition to the second reviewers checking the consistency and accuracy of the first reviewers, a lead investigator examined a random sample of the reviews to identify problems with the data abstraction. If problems were recognized in a reviewer's data abstraction, the problems were discussed at a meeting with the reviewers. In addition, research assistants used a system of random data checks to assure data abstraction accuracy.
At the completion of our review, we graded the quantity, quality and consistency of the best available evidence addressing Key Questions 1, 2, and 3 by adapting an evidence grading scheme recommended by the GRADE Working Group.12 We applied evidence grades to bodies of evidence on each type of objective (i.e., knowledge, attitudes, skills, practice behaviors, and clinical outcomes). We assessed the strength of the study designs with randomized controlled trials considered best, followed by non-randomized controlled trials, and observational studies. To assess the quantity of evidence, we focused on the number of studies with the strongest design. We also assessed the quality and consistency of the best available evidence, including assessment of limitations to individual study quality (using individual quality scores), certainty regarding the directness of the observed effects in studies, precision and strength of findings, and availability (or lack thereof) of data to answer the Key Question. We classified evidence bodies pertaining to Key Questions 1, 2, and 3 into four basic categories: (1) “high” grade (indicating confidence that further research is very unlikely to change our confidence in the estimated effect in the abstracted literature); (2) “moderate” grade (indicating that further research is likely to have an important impact on our confidence in the estimates of effects and may change the estimates in the abstracted literature); (3) “low” grade (indicating further research is very likely to have an important impact on confidence in the estimates of effects and is likely to change the estimates in the abstracted literature); and (4) “very low” grade (indicating any estimate of effect is very uncertain). We did not grade the body of evidence for Key Questions 4 and 5 since this is a subset of Key Questions 1 and 2. 
Also, we did not grade the body of evidence for Key Question 6 since the grading criteria do not apply to our questions about the validity and reliability of educational assessment methods.
A draft of the completed report was sent to the technical experts and peer reviewers, as well as to the representatives of AHRQ and the Scientific Resource Center. In response to the comments of the technical experts, peer reviewers, and AHRQ, revisions were made to the evidence report, and a summary of the comments and their disposition was submitted to AHRQ.
A summary of the search results for the primary literature review is presented in Figure 1.
A summary of the search results for the review of systematic reviews is presented in Figure 2.
Participant questionnaire was the most frequently used evaluation method (59 percent of the studies). Of those that used a participant questionnaire, over two-thirds (69 percent) used a written questionnaire. A few studies administered the questionnaire via computer (5 studies), orally (5), or over the phone (7). About half (47 percent) of the studies used a performance audit to evaluate the CME program. Performance audits were usually conducted through chart review (39 studies) and health plan databases (18). Patient questionnaires were used in 39 studies. Twenty studies included a qualitative evaluation. Seventeen studies evaluated the CME program through observer assessment. Most of these studies used a live standardized patient to assess the observer (11 studies).
One-third of the studies did not report the setting of the CME activity. In 52 studies, the CME activity occurred in the practice setting. The CME activity was not linked to a physical setting in 27 studies. Government agencies sponsored about 40 percent of the studies. Pharmaceutical agencies, professional societies, and insurance/health plan companies sponsored about 10 percent of the studies each. In about two-thirds of the studies, some type of physician, the majority of whom were academic, taught the CME activity. The type of educator was not mentioned in about one-quarter of the studies.
| Type of CME activity | Number (%) of studies |
|---|---|
| Media method | |
| Single media used in CME activity | 50 (37) |
| Live only media | 29 (21) |
| Print only media | 14 (10) |
| Internet only media | 6 (4) |
| Other type of single media | 4 (3) |
| Multiple media used in CME activity | 67 (49) |
| Single vs. multiple media used in CME activity | 18 (13) |
| Type of media not reported | 1 (1) |
| Educational technique | |
| Single technique used in CME activity | 13 (10) |
| Multiple techniques used in CME activity | 95 (70) |
| Single vs. multiple techniques used in CME activity | 25 (18) |
| Type of technique not reported | 3 (2) |
| Amount of exposure | |
| Exposed to CME activity once | 44 (32) |
| Exposed to CME activity multiple times | 69 (51) |
| One vs. multiple exposures to CME activity | 12 (9) |
| Other amount of exposure | 7 (5) |
| Amount of exposure not reported | 4 (3) |
Most (70 percent) of the studies used multiple educational techniques in the CME activity. Twenty-five studies compared using a single educational technique to using multiple educational techniques. Thirteen studies evaluated a single technique. Six articles evaluated reading only,13–18 two evaluated only academic detailing,19,20 and one evaluated each of the following: problem-based learning,21 conference calls,22 feedback,23 lecture,24 and lecture versus case-based learning.25
The participants were exposed to the CME activity only once in about a third of the studies and multiple times in about half. Twelve studies compared participants who were exposed once to a CME activity to participants who were exposed multiple times.
The CME activities were designed for individuals in 90 studies and practice settings/teams in 24 studies. For 17 studies, the CME activities were designed for both individuals and practice settings/teams. The CME activity was reported as being accredited in about a quarter (31 studies) of the studies.
In 41 studies, the CME activity was a part of a quality improvement project. CME activities that were a part of quality improvement projects used multiple media methods significantly more often than CME activities with no quality improvement project (63% versus 43%; p=0.03). CME activities with quality improvement projects were also significantly less likely to use single live media (10% versus 26%; p=0.03). CME activities with and without quality improvement projects were similar to each other in terms of educational techniques and amount of exposure (data not shown).
Based on our quality scoring system described in the Methods chapter, over three-quarters of the trials were rated at two points or less:
8 studies achieved a score of 5
11 studies achieved a score of 4
12 studies achieved a score of 3
38 studies achieved a score of 2
29 studies achieved a score of 1
36 studies achieved a score of 0.
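The proportion quoted for low-scoring trials can be checked directly from the counts listed above. The following is a small arithmetic sketch; the variable names are illustrative only.

```python
# Number of trials at each quality score, as listed above.
score_counts = {5: 8, 4: 11, 3: 12, 2: 38, 1: 29, 0: 36}

total = sum(score_counts.values())
low = sum(n for score, n in score_counts.items() if score <= 2)
share_low = low / total

print(f"{low} of {total} trials scored 2 or less ({share_low:.0%})")
```

With these counts, 103 of the 134 scored trials fall at two points or less, which is a little over three-quarters.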
Only two reviews performed quality assessment of the studies included in the review using a validated quality scale.27,30 In addition, one review31 provided a descriptive assessment of study quality. Evaluation for publication bias was not reported by any review. All reviews, except one, synthesized evidence qualitatively. The one review26 that combined individual study results quantitatively to generate a summary effect size used both fixed-effects and random-effects models.
Four reviews26,27,30,31 discussed the variation in the results of the original studies in a qualitative manner, one additional review29 discussed it only partially, and four reviews did not discuss the variations in the results of the original studies. Quantitative evaluation of heterogeneity, either by subgroup analyses or meta-regression, was not performed by any of the reviews included in our study.
A notable example of a trial that received high ratings for all adult learning principles was Gerrity et al.32
A total of 22 studies addressing 23 objectives demonstrated improvements in knowledge.19,33–53 This represents 79% of the 28 studies with an adequate control group. Of these, 6 studies33–38 addressing 6 objectives did not clearly report the duration of evaluation, 1 study39 addressing 1 objective demonstrated short-term improvement in knowledge, and 15 studies addressing 16 objectives demonstrated long-term improvements in knowledge. The studies were too heterogeneous to identify any global similarities, but specific information related to the type of media, educational technique, or exposure volume is addressed in the subsequent sections.
Four studies addressing four objectives failed to show improvements in knowledge.32,54–56 No study demonstrated a regression in knowledge. Of these four studies, one study32 did not clearly report on the duration of evaluation while the three remaining studies evaluated long-term knowledge changes. No study was identified that evaluated short-term knowledge and failed to show a change in knowledge. Of the three studies that considered long-term outcomes, the study by Elliott et al demonstrated trends toward improved knowledge but these slight improvements lacked statistical significance.55 In the remaining two studies, one failed to show improvements in knowledge regarding bioterrorist attacks through the voluntary participation in a web-based educational program and the other failed to show an improvement in knowledge regarding blood pressure control after a mailed CME program.
Two studies addressing two objectives demonstrated mixed results.57,62 One study did not clearly report on the duration of evaluation and one reported on long-term effects. The study that did not clearly report duration evaluated the impact of an educational intervention aimed at medical care evaluation committees and demonstrated a statistically significant improvement in only one of the three committees.57 The study by Chodosh et al evaluated knowledge nine months after the intervention. In this study, more intervention group physicians correctly answered knowledge-related questions on capacity determination for patients with possible dementia, but there were no differences between the intervention group and the control group on questions regarding dementia evaluation, patient safety, or depression treatment.
In summary, 79% of the studies with an adequate control group demonstrated that CME activities were effective at improving knowledge, with the majority (68%) of these studies demonstrating long-term improvements in knowledge.
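As an arithmetic check on these proportions, the following minimal sketch recomputes them from the study counts reported in this section (22 studies showing improvement, 4 showing none, 2 with mixed results, and 15 of the 22 showing long-term improvement). The function and variable names are illustrative assumptions and are not part of the review's methods.

```python
def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Study counts taken from the text of this section.
improved = 22            # studies demonstrating improved knowledge
no_improvement = 4       # studies failing to show improvement
mixed = 2                # studies with mixed results
long_term_improved = 15  # improved studies with long-term evaluation

# Studies with an adequate control group.
controlled_total = improved + no_improvement + mixed

pct_improved = share(improved, controlled_total)
pct_long_term = share(long_term_improved, improved)

print(f"Improved knowledge: {improved}/{controlled_total} = {pct_improved}%")
print(f"Long-term among improved: {long_term_improved}/{improved} = {pct_long_term}%")
```

Under these counts, 22 of 28 is 78.6%, which rounds to the 79% given at the start of this section, and 15 of 22 rounds to 68%.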
Short-term and long-term effects of CME media methods on knowledge. Studies were classified by objectives met and evaluation duration as described above and by media utilized for the educational intervention. The media classifications included multi-media, single media and single versus multi-media.
As stated in the above section, 22 studies addressing 23 objectives demonstrated improvements in knowledge. Seven studies addressing eight objectives evaluated a single media intervention.3940434548505356 Three of the studies utilized the internet and the remaining four utilized live media. The three studies that utilized the internet demonstrated improved short-term knowledge in one study39 and long-term knowledge in two studies.4350 All four of the studies that utilized live media demonstrated long-term improvements in knowledge. Twelve studies addressing 12 objectives utilized multimedia interventions to demonstrate knowledge benefits.1933–37414246495152 Five studies did not clearly report on the duration and seven studies demonstrated long-term improvements in knowledge. Three studies compared a single media-based intervention to a multimedia intervention.384447 All three studies included at least one concurrent control group that only utilized print material and in all three cases multi-media outperformed the print-based single media intervention group.
The four studies that did not demonstrate improvements in knowledge utilized single media (print) in one study56 that evaluated long-term knowledge, multimedia (live, audio and print) in one study32 in which evaluation duration was not clearly defined, multimedia (live and print) in one study55 that considered long-term knowledge changes, and one study54 that compared single to multi-media for long-term benefits. In this study the comparison was between a live intervention and a live internet with non-real time reading material.
Of the two studies that demonstrated mixed results, one study57 utilized a single media intervention (live) and the other study62 utilized multimedia intervention (live internet with not real time print).
When grouped solely by media classification, 9 studies addressing 10 objectives used a single media with 7 demonstrating benefits, 15 studies addressing 15 objectives considered multi-media interventions with 12 demonstrating benefits, and 4 studies addressing 4 objectives compared single to multi-media with 3 of these studies demonstrating that multi-media interventions had greater benefit.
Given the heterogeneity of the studies the only recognized trends were that multimedia seems better than a single media intervention and that print interventions are either not beneficial or very weak in their ability to lead to improved knowledge.
Short-term and long-term effects of CME educational techniques on knowledge. Studies were classified by objectives met and evaluation duration as described above and by technique utilized for the educational intervention. The technique classifications included multiple techniques, single technique and single versus multiple techniques. Two of the studies that have not been previously included because of a lack of a control group can be utilized in this section because the concurrent comparison group did indeed utilize a different technique, permitting some head-to-head comparisons.2566 This means that for this section there are 30 studies addressing 31 objectives.
Only one study that met its objectives utilized a single technique, academic detailing, and this study demonstrated improvements in long-term knowledge.19 Eighteen studies addressing 18 objectives utilized multiple techniques to improve knowledge.33–373941–4648–53 Five studies33–37 that demonstrated improved knowledge did not clearly report the duration, 1 study39 demonstrated short-term knowledge improvement and 12 studies demonstrated long-term knowledge improvements. The majority of the studies that demonstrated improvement but did not clearly report the duration included case-based learning as a technique in combination with techniques ranging from discussion groups to independent reading. Of the 12 studies that demonstrated long-term improvements in knowledge, the majority integrated multiple techniques and most commonly combined case-based learning with discussion groups and independent reading; several utilized standard lectures with readings, and some technique combinations ranged from lecture with a standardized patient to lecture with team-based training. Three studies addressing four objectives that compared a single technique versus multiple techniques demonstrated improvements in knowledge.384047 One study did not clearly report duration and two studies addressing three objectives demonstrated long-term knowledge improvements. The study that did not clearly report duration demonstrated a greater benefit to the combination of case-based learning with readings when compared to readings alone.38 One study that addressed two objectives compared problem-based learning to the combination of lecture and discussion groups and demonstrated that the problem-based learning group was more effective regarding knowledge of diagnosis and management of headache. The other study compared discussion groups with readings to readings alone, and the combination was more effective at increasing knowledge regarding compliance adherence.
Four studies of four objectives did not demonstrate improved knowledge.3254–56 One study that utilized multiple techniques that included lecture, discussion groups, role playing and feedback failed to demonstrate improved knowledge and did not clearly report the duration.32 Two studies failed to demonstrate improvements in knowledge despite utilizing multiple techniques.5556 One combined lecture with case-based learning, discussion groups, and readings and failed to show improvements in pain knowledge, while the other combined readings with chart cue materials and failed to show improvements in knowledge of hypertension. One study compared a single technique, lecture, to a combination of techniques that included lecture with case-based learning and readings and failed to show an improvement in knowledge of bioterrorism.
The two studies that were added to this section deserve a greater explanation. Greenberg et al compared lecture to case-based learning and the case-based learners demonstrated improvements in post-test questions regarding common pediatric problems as compared to 29% improvement in the lecture group.25 Unfortunately, these improvements seen in the case-based group were not significant at six and nine months after the intervention. Heale et al compared three groups, one receiving lectures, one receiving case-based learning with discussion groups, and one receiving problem-based learning.66 This study also failed to demonstrate benefits of any technique in either short-term or long-term knowledge. Thus, adding these two studies to the four studies above that did not demonstrate improvements gives a total of six studies that did not show improved knowledge.
One of the two studies that demonstrated mixed results did not clearly report duration while the other considered long-term knowledge improvements.5762 The study57 that did not report duration compared a combination of case-based learning with discussion groups to control while the other compared lecture with discussion group to control.
When grouped solely by technique classification, two studies addressing two objectives utilized a single technique, with one study showing improvements through the use of academic detailing and one showing no knowledge improvement through the use of readings alone. Twenty-three studies addressing 23 objectives utilized multiple techniques, with 18 demonstrating benefits, 3 no benefits, and 2 mixed results. Five studies addressing six objectives compared single versus multiple techniques. Two studies addressing two objectives demonstrated benefits of multiple techniques as compared to single, one study addressing two objectives demonstrated benefits to a single technique (problem-based learning) compared to multiple, and two studies addressing two objectives did not demonstrate any benefit of single as compared to multiple techniques.
The outcomes from this section are also heterogeneous, but multiple techniques, most commonly including case-based learning, appear more likely to be associated with improvements in knowledge.
Short-term and long-term effects of the amount of exposure on knowledge. Studies were classified by objectives met and evaluation duration as described above and by exposure volume for the educational intervention. The exposure volume classifications included multiple exposures, single exposure and single versus multiple exposures. The two studies added in the previous section do not apply to this section given the lack of control groups. In addition, one additional study38 did not adequately describe the exposure and thus is excluded from analysis in this section leaving 27 studies addressing 28 objectives.
There were 21 studies addressing 22 objectives which demonstrated an improvement in knowledge.1933–3739–53 Five studies addressing five objectives evaluated a single exposure volume with one study36 not clearly reporting the duration and four studies44454853 demonstrating long-term improvements. Three of these considered knowledge at 6 months after intervention and one at 24 months after intervention. Twelve studies addressing 12 objectives utilized multiple exposure volumes. Three studies did not clearly report duration,333537 one study demonstrated short-term knowledge gains,39 and the final eight studies all demonstrated long-term knowledge gains.1941434649–52 The shortest time interval to evaluation in these long-term beneficial studies was 3 months and the longest was 15 months. Four studies addressing five objectives compared a single exposure to multiple exposures and demonstrated knowledge gains. One study that did not clearly report duration demonstrated that multiple exposures were better than a single exposure at improving knowledge of office-based dermatologic procedures.34 The other three studies addressing four objectives all demonstrated that multiple exposures were better than a single exposure at improving knowledge.404247
Four studies addressing four objectives did not demonstrate knowledge improvements. No study was identified that evaluated a single exposure only. Three studies evaluated multiple exposures with one32 not clearly reporting the duration and the other two5556 failing to demonstrate long-term knowledge improvements. One study that compared single versus multiple exposures did not show any difference from baseline knowledge at one and six months after intervention in either group.54
No studies were identified that utilized a single exposure only or compared a single versus multiple that demonstrated mixed results. Two studies demonstrated mixed results and both utilized multiple exposures.5762 One study57 did not clearly report on duration and the other study demonstrated mixed results at the nine month evaluation.
When grouped solely by exposure volume then all five studies that evaluated a single exposure demonstrated improved knowledge. Twelve of the 17 studies that utilized multiple exposures demonstrated knowledge improvements with an additional 2 demonstrating mixed results. The majority (67%) of these were able to demonstrate long-term knowledge benefits. Of the five studies that compared a single exposure to multiple exposures, four (80%) demonstrated a greater benefit to multiple exposures as compared to a single exposure.
In summary, despite the heterogeneity of these studies and the fact that all five studies that utilized a single exposure demonstrated benefit, the head-to-head comparison studies imply that, when possible, multiple exposures produce better knowledge gains.
Summary of the effects of CME on knowledge. The heterogeneity of the studies precludes firm conclusions, but the trends demonstrated that CME is effective at producing both short-term and long-term knowledge gains and that when possible, multimedia, multiple techniques, and multiple exposures should be used.
A total of 22 studies addressing 26 objectives demonstrated improvements in attitude.1334–363940424750525368–78 This represents 71% of the 31 studies with an adequate control group. Of these, six studies addressing seven objectives did not clearly report the duration of evaluation,1334–366869 one study addressing two objectives demonstrated short-term improvement in attitudes,39 and 15 studies addressing 17 objectives demonstrated long-term improvements in attitude.40424750525370–78 The studies were too heterogeneous to identify any global similarities, but specific information related to the type of media, educational technique, or exposure volume is addressed in the sections that follow.
Four studies addressing four objectives failed to show improvements in attitude.37525562 No study demonstrated a regression in attitude. Of these four studies, one study37 did not clearly report on the duration of evaluation while the three remaining studies evaluated long-term attitudinal changes. No study was identified that evaluated short-term attitudes and failed to show a change in attitudes. Of the three studies that considered long-term outcomes, the study by Elliott et al demonstrated trends toward improved attitudes but these slight improvements lacked statistical significance.55 In the remaining two studies, one failed to show improvements in attitudes regarding providers' perceptions of quality of care nine months after intervention.62 The other failed to show an improvement in attitude regarding efficacy of cholesterol lowering practices.52
Two studies addressing two objectives demonstrated mixed results.6281 Both studies reported on long-term effects. The study by Chodosh et al considered an evaluation nine months after the intervention. In this study, more intervention group physicians endorsed the statement, “Older patients with dementia are difficult to manage in primary care,” but no other differences in attitudes regarding dementia were identified.62 The study by Norris et al utilized an evaluation six months after intervention and noted that intervention group providers had significant improvements in three of ten tested attitudes, including a greater self-reported likelihood of counseling patients regarding physical activity than control providers.81
In summary, 85 percent of the studies with an adequate control group demonstrated that CME activities were effective at improving attitudes with the majority (68%) of these studies demonstrating long-term improvements in attitudes.
Short-term and long-term effects of CME media methods on physician attitudes. Studies were classified by objectives met and evaluation duration as described above and by media utilized for the educational intervention. The media classifications included multi-media, single media and single versus multi-media.
As stated in the above section, 22 studies addressing 26 objectives demonstrated improvements in attitudes. Seven studies addressing eight objectives evaluated a single media intervention.13394050537172 Two of the studies3950 utilized the internet, one13 utilized a computer-based program and the remaining four utilized print media.40537172 The two studies addressing three objectives that utilized the internet demonstrated improved short-term attitudes in one study and long-term attitudes in the other. All four of the studies that utilized print media demonstrated long-term improvements in attitudes. The one study that utilized computer-based education for its intervention did not clearly report the duration. Twelve studies addressing 15 objectives utilized multi-media interventions to demonstrate attitudinal benefits.34–36425268697375–78 Five studies addressing six objectives did not clearly report on the duration and seven studies addressing nine objectives demonstrated long-term improvements in attitudes. Three studies compared a single media-based intervention to a multi-media intervention.477074 All three studies included at least one concurrent control group that only utilized print material and in all three cases multi-media outperformed the print-based single media intervention group. All three studies showed long-term improvements in attitudes.
The four studies that did not demonstrate improvements in attitude all utilized multimedia.37525562 One study combined live with print and did not clearly report on duration.37 The other 3 studies considered long-term attitudinal changes at 9 months in one study and 15 months in the other 2 studies. The study that considered outcomes at nine months utilized a combination of live internet with print material while the other two studies combined live with print in one and live with video and print in the other.
Both of the two studies that demonstrated mixed results utilized multimedia approaches.6279 One study utilized live internet with print material and the other study utilized live in conjunction with print and a followup phone call.
When grouped solely by media classification, then 7 studies addressing 8 objectives utilized a single media with all 7 demonstrating attitudinal benefits, 18 studies addressing 26 objectives considered multimedia interventions with 12 demonstrating benefits, and 3 studies addressing 3 objectives compared single to multi-media with all 3 of these studies demonstrating that multimedia interventions had greater benefit.
Given the heterogeneity of the studies, the only recognized trends were that multimedia appears better than a single media intervention and that print interventions are either not beneficial or very weak in their ability to lead to improved attitudes.
Short-term and long-term effects of CME educational techniques on physician attitudes. Studies were classified by objectives met and evaluation duration as described above and by technique utilized for the educational intervention. The technique classifications included multiple techniques, single technique and single versus multiple techniques. One study that had not been previously included because of a lack of a control group can be utilized in this section because the concurrent comparison group did indeed utilize a different technique permitting some head to head comparisons.66 This means that for this section there are 23 studies addressing 27 objectives.
Only one study that met its objectives utilized a single technique, reading, and this study did not clearly report duration.13 Seventeen studies addressing 21 objectives utilized multiple techniques to improve attitudes. Five studies34–366869 addressing 6 objectives that demonstrated improved attitudes did not clearly report the duration, 1 study39 addressing 2 objectives demonstrated short-term attitudinal improvement and 11 studies4250525371–7375–78 addressing 13 objectives demonstrated long-term attitudinal improvements.
The majority of the studies that demonstrated improvement but did not clearly report the duration included case-based learning as a technique in combination with techniques ranging from discussion groups to independent reading. Of the 11 studies that demonstrated long-term improvements in attitude, the majority integrated multiple techniques and most commonly combined case-based learning with discussion groups and independent reading; several utilized standard lectures with readings, and some technique combinations ranged from lecture with a standardized patient to lecture with team-based training. Four studies addressing four objectives that compared a single technique versus multiple techniques demonstrated improvements in attitudes.40477074 All four studies demonstrated long-term attitudinal improvements. Three of these four studies utilized readings as the sole technique and one utilized a discussion group. All four studies demonstrated that multiple techniques were better than single techniques.
Four studies of four objectives did not demonstrate improved attitudes.37525562 One study that utilized multiple techniques that included live and print failed to demonstrate improved attitudes and did not clearly report the duration.37 Two studies failed to demonstrate improvements in attitudes despite utilizing multiple techniques.5262 One combined lecture with discussion group and failed to show improvements in provider perceptions about quality of care for patients with dementia, while the other combined case-based learning with discussion with readings, with standardized patients and failed to show improvements in attitudes regarding cholesterol lowering practices. Only one study compared a single technique versus multiple and failed to show an improvement in provider attitudes regarding pain.55
The one study that was added to this section deserves a greater explanation. Heale et al66 compared three groups, one receiving lectures, one case-based learning with discussion groups, and the final group received problem-based learning. This study demonstrated improvements in attitudes that were greatest in the problem-based learning participants.
Both studies that demonstrated mixed results assessed long-term attitudinal change6281 and both utilized multiple techniques. These included lecture with a point-of-care opinion leader in one study and lecture with discussion groups in the other.
When grouped solely by technique classification then one study addressing one objective utilized a single technique and it demonstrated attitudinal improvement. Twenty-two studies addressing 26 objectives utilized multiple techniques with 17 demonstrating benefits, 3 no benefits, and 2 mixed results. Five studies addressing five objectives compared single versus multiple techniques. Four studies demonstrated greater attitudinal change with the utilization of multiple techniques as compared to a single technique and one study showed no improvement.
The outcomes from this section are also heterogeneous, but multiple techniques, most commonly including case-based learning, appear more likely to be associated with improvements in attitudes.
Short-term and long-term effects of the amount of exposure on physician attitudes. Studies were classified by objectives met and evaluation duration as described above and by exposure volume for the educational intervention. The exposure volume classifications included multiple exposures, single exposure and single versus multiple exposures. The one study added in the previous section does not apply to this section given the lack of control group. In addition, one additional study58 did not adequately describe the exposure and thus is excluded from analysis in this section leaving 24 studies addressing 25 objectives.
There were 22 studies addressing 25 objectives which demonstrated an improvement in attitude.1334–363940424750525368–78 Seven studies addressing eight objectives evaluated a single exposure volume with two studies3668 not clearly reporting the duration and five studies5370717477 addressing six objectives demonstrating long-term improvements. Eleven studies addressing 12 objectives utilized multiple exposure volumes. Three studies133569 did not clearly report duration, one study39 demonstrated short-term attitudinal gains, and the final seven studies50527273757678 addressing eight objectives all demonstrated long-term attitudinal gains. Four studies addressing five objectives compared a single exposure to multiple exposures and demonstrated attitudinal improvements.34404247 One study34 addressing two objectives did not clearly report duration, while the remaining three studies all demonstrated long-term improvements in attitudes.
Four studies addressing four objectives did not demonstrate attitude improvements. No study was identified that evaluated a single exposure only or that performed a comparison between single and multiple exposures. Three studies evaluated multiple exposures with one37 not clearly reporting the duration and the other two5562 failing to demonstrate long-term attitudinal improvements.
No studies were identified that utilized a single exposure only or compared a single versus multiple that demonstrated mixed results. Two studies demonstrated mixed results and both utilized multiple exposures.6281 Both studies assessed long-term attitudes.
When grouped solely by exposure volume then all seven studies that evaluated a single exposure demonstrated improved attitudes. Eleven of the 17 studies that utilized multiple exposures demonstrated attitudinal improvements with an additional two demonstrating mixed results. The majority (64%) of these were able to demonstrate long-term attitudinal benefits. Of the four studies that compared a single exposure to multiple exposures all demonstrated a greater benefit to multiple exposures as compared to a single exposure.
In summary, despite the heterogeneity of these studies, there appears to be a trend toward multiple exposures being of greater benefit for attitudinal change than a single exposure, although it must be pointed out that all seven studies that evaluated a single exposure did demonstrate improvements in attitudes.
Summary of the effects of CME on physician attitudes. The heterogeneity of the studies precludes firm conclusions, but the trends demonstrated that CME is effective at producing both short-term and long-term attitudinal gains and that when possible, use of multimedia, multiple techniques, and multiple exposures should produce better attitudinal outcomes.
Of those 15 studies, 10 had skill outcomes that met the objectives of the study.32333640497284–87 One study had three skill outcomes that all met the objectives,84 and another study had two reported skill outcomes that met the objectives.33 The remaining studies each had one skill outcome. Two studies had skill outcomes that did not achieve the study objectives,1377 and one study had mixed results that did not clearly meet the study objectives.88 Two studies compared different methods and techniques of CME without a separate control group that did not receive CME;6080 therefore, the overall effectiveness of CME cannot be discerned from these studies. Given that 10 out of 13 studies that included control groups and reported on skills (13 out of 16 outcomes) met the objectives and given the varied nature of the studies, the literature does indicate that CME is effective in this area, particularly at developing cognitive skills. Little can be said about the effectiveness of CME for psychomotor skills given the paucity of data in this area.
Short-term effects of CME media methods on skills. The media methods of CME that were included in the studies that met the study objectives regarding skills included seven that used live media,32333640497284 four that included print materials,32334985 two that included video methods,3236 two that included audio methods,3249 two that used the Internet (not real time),8687 and one that used computers (off-line).33 However, in several cases the same media methods were used in all experimental arms and therefore no conclusions can be drawn from the study outcomes applying to specific CME methods. This situation applied to live methods in one study,40 audio methods in one study,49 and print materials in two studies.4985 In addition, one study was not clear about which groups received particular methods including live, print, and computer (off-line) media.33 The studies that did not clearly meet skills objectives included live,77 88 video,77 88 and computer (off-line) methods.13 Two studies directly compared different methods of CME but did not include a control group without CME.6080 One of these studies showed no difference between print, computer (off-line), and live methods.80 However, the other study showed that live methods were superior to video and print combined.60 Given this result and the dominance of live methods among the studies that met their skills objectives, the data suggest that live methods have the greatest impact on the effectiveness of CME regarding skills outcomes. Given the paucity of data and the varied results, little can be said about the relative effectiveness of other CME media methods on affecting skills.
Long-term effects of CME media methods on skills. The six studies that addressed long-term skill outcomes beyond 30 days and met their skills objectives used a variety of media methods including four using live methods,40497291 two using print materials,4985 one using audio methods,49 and one using the Internet (not real time).86 However, the one study that used audio methods had audio in all experimental arms,49 the two that used print media did so in all groups,4985 and one of the four that used live methods did so in all groups.40 Therefore, the experimental effect on long-term outcomes in studies that met their skills objectives was seen only for three studies using live media,497284 and one study that used the Internet (not real time).86 The one study that did not meet its objectives regarding long-term skill outcomes used live and video as the methods.77 The studies that directly compared different methods of CME did not address long-term skill outcomes.60 80 Based on the limited data, it is difficult to draw conclusions on particular media methods of CME that have a greater or lesser impact on long-term skill outcomes.
Short-term effects of CME educational techniques on skills. Varied educational techniques were used in the studies that met their objectives regarding short-term skill outcomes. These included lectures in six studies,323640497284 discussion groups in five studies,3249728485 readings in five studies,3233498586 case-based learning in four studies,36498687 feedback in three studies,324986 role play in three studies,327284 and clinical experiences in two studies.3272 Listserv,85 programmed learning,86 problem-based learning,40 audio-taped encounters,72 and standardized patients84 were each seen in one of the studies that met their skills objectives. Of note, readings were used as a technique in both experimental and control groups in one study.85 In the studies where the skills objectives were not clearly met by the outcomes, the following techniques were used: demonstration and lecture in two studies,7788 and readings,13 discussion groups,88 feedback,88 programmed learning,88 and role play88 in one study each. One of the two studies that did not include a control group compared the techniques of readings versus readings with demonstration versus mentor/preceptor with simulation.80 No difference was seen among these techniques. In the other study without a control group, the techniques of demonstration and simulation were compared with demonstration and simulation plus feedback.60 The group that included feedback was significantly better when skills acquisition was assessed. Finally, two of the studies that met their objectives regarding skills and included control groups compared individual techniques of delivering CME. One study compared problem-based learning with lecture, and problem-based learning was significantly better.40 Another study compared discussion groups and readings to discussion groups and readings with feedback.85 In this study, feedback had no additional effect.
Given the limited number of studies, the wide variety of techniques described, and the conflicting results, it is difficult to draw conclusions about the educational techniques that have the greatest and least effect on skills.
Long-term effects of CME educational techniques on skills. The educational techniques used in the studies that met the study objectives regarding long-term skill outcomes included discussion groups in four studies,49728485 lectures in four studies,40497284 readings in three studies,498586 role play in two studies,7284 case-based learning in two studies,4986 and feedback in two studies.4986 The techniques of clinical experiences,72 listserv,85 programmed learning,86 problem-based learning,40 audio-taped encounters,72 and standardized patients84 were each seen in one study that met skills retention objectives. The one study that did not meet its objectives regarding long-term skill outcomes used the techniques of demonstration, lecture, and simulation.77 The two studies that compared different techniques in CME without a control group did not address long-term skill outcomes.6080 However, in two studies that did have a control group, one showed that problem-based learning is superior to lectures in long-term skill outcomes,40 but the other showed no advantage of feedback over demonstration and readings alone.85 Given the limited number of studies and the varied techniques, it is difficult to draw conclusions about the educational techniques that have a greater or lesser effect on long-term skill outcomes.
Short-term effects of the amount of exposure on skills. The majority of the studies that met their skills objectives had multiple exposures to the CME activity. Seven of these studies used multiple exposures,3240497284–86 while one used a single exposure.36 It was unclear how many exposures there were in two of the studies whose skills outcomes met the study objectives.3387 The studies that did not meet the skills objectives included two that used a single exposure,7788 and one that used multiple exposures.13 These results suggest that multiple exposures to CME are superior to a single exposure for meeting skills objectives.
Long-term effects of the amount of exposure on skills. All six of the studies that addressed long-term skill outcomes and met the study objectives used multiple exposures to CME.40497284–86 The one study that did not meet its study objectives regarding long-term skill outcomes used a single exposure.77 These results support multiple exposures as having a greater impact on long-term skill outcomes.
Fifty studies with evaluation duration greater than 30 days met 58 objectives, suggesting long-term retention of CME effectiveness.1944–49525362707275–7881828295–125 Among these, evaluation duration ranged from 6 months or less after the educational intervention (17 studies) to 1 year or greater (30 studies).
A wide mix of objectives was studied. For example, 12 objectives were related to medication prescribing, eight were related to screening standards, 14 were related to physician counseling behaviors (mostly smoking cessation, but also dietary counseling, sexual practices counseling, etc.), 11 were related to guideline adherence, and the remainder were related to physician behaviors pertaining to other topics.
Twenty-nine studies in total, reporting on 38 objectives, did not meet objectives. Of these, twenty-four studies reporting on 33 objectives were evaluated at greater than 30 days.141643567073749699101119122126–137 Two studies with evaluation duration of 30 days or less did not meet objectives.18138 Three studies did not report evaluation duration and did not meet objectives.1357139
Nine studies, evaluating 9 objectives, showed mixed results in terms of their objectives being met.2042717388137140–142 One study was unclear as to whether it met objectives.101 Fourteen studies, evaluating 17 objectives, lacked a control group and therefore did not allow us to assess effectiveness.23–255964–6783143–147
Overall, CME interventions were effective in the short- and long-term achievement of practice behavior objectives.
Short-term and long-term effects of CME media methods on practice behavior. The different types of media evaluated included single-media live presentations (20 studies), single-media print materials (9 studies), internet (1 study), other single media (2 studies), multimedia (57 studies), and also single versus multimedia comparisons (15 studies).
Of 20 studies using single live media, 10 studies met 11 objectives, three studies did not meet objectives, three studies showed mixed results and four studies did not have a control group. Of these 20 studies, nine studies with evaluation duration of greater than 30 days met 10 objectives,4548537282102107121125 suggesting that the use of single live media had a favorable long-term effect on practice behavior objectives. One study had an evaluation duration of 30 days or less and met objectives using single live media.94 Three studies using single live media reported evaluation duration greater than 30 days but showed mixed results.2071140 Four studies lacked a control group, and no meaningful conclusions could be drawn with regard to the comparative effectiveness of single live media.24256466 Three studies did not meet practice behavior objectives using single live media. Of these, one did not report evaluation duration,57 one reported evaluation duration of 30 days or less,138 and one reported evaluation duration greater than 30 days.133
Out of nine studies with ten objectives that examined the impact of single print media, only one met objectives,15 but it did not report evaluation duration. One study did not meet objectives using single print media and did not report evaluation duration.139 One study with evaluation duration of 30 days or less did not meet objectives using single print media.18 Four studies (five objectives) with evaluation duration greater than 30 days did not meet objectives using single print media.1456130134 Two studies using single print media reported evaluation duration greater than 30 days, but lacked a control group and no meaningful conclusions could be drawn with regard to the comparative effectiveness of single print media.23143 The evidence suggests that single print media is not effective in the short- or long-term achievement of practice behavior objectives.
The only study using single internet media reported an evaluation duration greater than 30 days and it was unclear whether it met objectives.43 One study did not meet objectives using other single media and did not report evaluation duration.13 One study using other single media did not report evaluation duration and lacked a control group.147 One study with evaluation duration greater than 30 days did not report the media used and lacked a control group.145
Out of 57 studies (78 total objectives) using multimedia, 40 studies met 47 objectives, 14 studies did not meet 19 objectives, 4 studies showed mixed results, and 4 studies lacked a control group. Of the 40 studies which met objectives, 31 studies with 37 objectives were evaluated at greater than 30 days suggesting that multimedia-based CME has a favorable long-term effect on practice behaviors.194649526275–7881959899101103104106108–110112113116–120122–124148 Nine studies met 10 objectives using multimedia but did not report evaluation duration.323536686979879293 One study using multimedia did not report evaluation duration; it also lacked a control group.146 Four studies using multimedia reported an evaluation duration greater than 30 days, and showed mixed results.427388142 Three studies (six objectives) with evaluation duration greater than 30 days lacked a control group and precluded meaningful conclusions.6583144 Fourteen studies with 19 objectives with evaluation duration greater than 30 days did not meet objectives using multimedia.167399101119122126–129131132135136 The evidence suggests that multimedia may have a positive short- and long-term effect on practice behavior objectives.
Out of 15 studies comparing single media and multimedia, 10 studies met 11 objectives and all were evaluated at an interval of greater than 30 days after the educational intervention,4447709697100105111114115 suggesting that both single media and multimedia have a positive short- and long-term effect on practice behavior objectives, and that multimedia has an advantageous effect. One study comparing single media and multimedia did not report evaluation duration and lacked a control group.59 Two studies comparing single media and multimedia reported evaluation duration greater than 30 days and showed mixed results.137141 One study reported evaluation duration greater than 30 days, but lacked a control group.67 Four studies with evaluation duration greater than 30 days did not meet seven objectives.707496137
One study did not report the type of media used, reported evaluation duration greater than 30 days, lacked a control group, and was inconclusive about meeting objectives.145
The evidence suggests that both single media and multimedia may have positive short- and long-term effects on practice behavior, with use of multimedia being advantageous.
Short-term and long-term effects of CME educational techniques on practice behavior. A total of 11 studies reporting on 12 objectives evaluated the impact of a single technique on practice behavior objectives. One study met objectives using a single technique but did not report evaluation duration.15 One study with evaluation duration of 30 days or less met objectives using a single technique.94 One study with evaluation duration greater than 30 days met objectives using a single technique.19 One study did not report evaluation duration and did not meet objectives using a single technique.13 One study with evaluation duration of 30 days or less did not meet objectives using a single technique.18 Two studies with evaluation duration of greater than 30 days did not meet objectives using a single technique.1416 One study was judged inconclusive because of mixed results.20 Three studies using a single technique reported evaluation duration of greater than 30 days, but lacked a control group.23–25 Two of these three studies did not reach statistical significance. This suggests that using a single technique may not have a short- or long-term positive effect on practice behavior objectives.
A total of 76 studies with 98 objectives evaluated the short- and long-term impact of multiple techniques on practice behavior objectives. Eight studies met 11 objectives using multiple techniques but did not report evaluation duration3235366869799293 and thus did not allow us to distinguish between short- and long-term effects of CME. Thirty-nine studies with evaluation duration greater than 30 days met 45 objectives using multiple techniques.44–4648495253627276–78818282959899101–113116–120122–124
Two studies did not report evaluation duration and did not meet objectives using multiple techniques.57139 One study with evaluation duration of 30 days or less did not meet objectives using multiple techniques.138 Sixteen studies with evaluation duration of greater than 30 days reporting on 22 objectives did not meet objectives using multiple techniques.435673101119122126–129131–136
One study using multiple techniques did not report evaluation duration and lacked a control group.146 One study using multiple techniques did not report evaluation duration and showed mixed results.149 Seven studies reporting on 10 objectives using multiple techniques reported an evaluation duration of greater than 30 days but lacked a control group.64656783143–145 Six studies showed mixed results.42717388140142 One of the six showed a positive effect which was lost 6 months after the intervention.73
The evidence suggests that the use of multiple techniques in CME may have an overall positive short- and/or long-term effect on practice behavior objectives.
A total of 18 studies reporting on 24 objectives compared the use of single and multiple educational techniques in CME.4759667074879697100114115121125130137141147148 Ten studies with evaluation duration greater than 30 days met 12 objectives comparing single and multiple techniques,47709697100114115121125148 and indicated that multiple techniques may have an advantageous short- and long-term effect on practice behavior objectives. Two studies using single versus multiple techniques did not report evaluation duration and neither had a control group, precluding further meaningful conclusions.59147 Two studies using single versus multiple techniques reported an evaluation duration of greater than 30 days and showed mixed results.137141 One study with evaluation duration greater than 30 days lacked a control group. Five studies with evaluation duration of greater than 30 days reporting on eight objectives did not meet objectives and thus were unable to identify a difference when using single versus multiple techniques.707496130137
Short-term and long-term effects of the amount of exposure on practice behavior. A total of 37 studies reporting on 41 objectives evaluated the impact of single exposure to the CME activity. Two studies met objectives using single exposure, but did not report evaluation duration.3668 One study with evaluation duration of 30 days or less met objectives using a single exposure to the CME activity.94 Sixteen studies with evaluation duration greater than 30 days met 18 objectives using single exposure to the CME activity.4445485370778298100102–104106107111114 This suggests that single CME exposure may have a positive short- and long-term effect on practice behavior objectives. Two studies with evaluation duration of 30 days or less did not meet objectives using single exposure to the CME activity.18138 Only six studies with evaluation duration greater than 30 days did not meet objectives using single exposure to the CME activity.14167074128129 Five studies with evaluation duration greater than 30 days lacked a control group,242564–66 although one of the five did not meet objectives for either intervention group.24 Six studies with evaluation duration greater than 30 days showed mixed results.207188140–142
A total of 55 studies (72 objectives) evaluated the impact of multiple exposures to the CME activity. Five studies met objectives using multiple exposures to the CME activity, but did not report evaluation duration.1532356993 Thirty studies with evaluation duration greater than 30 days met 36 objectives using multiple exposures to the CME activity.19464952627275767881959799101105108–110112113115–121123125148 One study using multiple exposures to the CME activity did not report evaluation duration and lacked a control group.147 Five studies of evaluation duration greater than 30 days lacked a control group242564–66 whereas two studies showed mixed results.73137 No meaningful conclusions could be drawn from these studies. Two studies did not meet objectives using multiple exposures to the CME activity, and did not report evaluation duration.1357 Sixteen studies with evaluation duration greater than 30 days did not meet 21 objectives using multiple exposures to the CME activity.43567399101119126127130132–137150 Overall, we conclude that multiple exposures to the CME activity may have positive short- and long-term effects on practice behaviors.
A total of eight studies reporting on 17 objectives did a head-to-head comparison between single and multiple exposures.424759798396122144 Whereas four of these studies indicated that multiple exposures may be better than single exposure, six showed mixed or negative results, thus not allowing us to draw any strong conclusions. One study met objectives using single versus multiple exposures to the CME activity, but did not report evaluation duration.79 Three studies with evaluation duration greater than 30 days met objectives using single versus multiple exposures to the CME activity.4796122
One study using single versus multiple exposures to CME activity did not report evaluation duration and had no control group.59 Two studies using single versus multiple exposures to CME activity reported evaluation duration greater than 30 days but did not have a control group.83144 No meaningful conclusions could be drawn from these studies. One study using single versus multiple exposures to CME activity reported evaluation duration greater than 30 days and showed mixed results.42
Two studies with evaluation duration greater than 30 days did not meet objectives using single versus multiple exposures to the CME activity.96122
One study did not meet objectives and did not report exposure to CME activity or evaluation duration.139
Two studies met objectives using other exposures to the CME activity, but did not report evaluation duration.8792 One study with evaluation duration greater than 30 days met objectives using other exposures to the CME activity.124 One study using other exposure to CME activity did not report evaluation duration and lacked a control group.146
Long-term effects of CME on clinical outcomes. Thirty-three studies, reporting on 42 clinical outcomes, measured the long-term effect of CME, i.e., more than 30 days following the educational intervention.22434755566572–75777881849599109111–115117120131132135137152–156 Fourteen of these studies were successful in achieving the desired effect of the CME intervention on clinical outcomes.22434774788495109111115152–155 One study showed mixed results, impacting frequency of office visits but not emergency room visits or hospitalizations.120 In the remaining 23 studies, either no effect of CME was observed, or the effect was uncertain due to ambiguous results or problems in study design. Of the 14 studies that did show a long-term effect, six reported on direct measures of health status of the target patient population. These outcomes were arthritis pain and disability,95 depression,22152 general health and function,109 emotional distress,84 and lost work due to back pain.153 Eight studies reported on health-related behaviors or attitudes. These outcomes were: percent of patients taking medication,152 patient adherence with antibiotics,47 patient satisfaction with care,74 frequency of physician visits,111 hospitalizations,78 hospital length of stay,155 and smoking cessation rates.115154 One study reported a mixed outcome, “quality of practice,” which combined direct measures of patient health status, such as whether the blood pressure was below 130/80 mm Hg, with behavioral or physician-related outcomes, such as whether the physician had recorded a family history of diabetes.43
Short-term effects of CME media methods on clinical outcomes. Only one study was available to assess the relative effectiveness of different types of media (live, print, internet, or multiple) on short-term clinical outcomes (less than 30 days). In this study, a print intervention improved adherence with beta-blocker use.18 Five studies did not report on duration of clinical outcome.32 33 146 151 157 Thus, no conclusions could be drawn about the differential effectiveness of CME media in the short term.
Long-term effects of CME media methods on clinical outcomes. Of the studies that had information about the effectiveness of different single media forms of CME on long-term clinical outcomes, five used a live CME intervention;7284152154155 four of these five achieved the stated goal of the study. Two studies used print media and neither achieved its objective.56157 Another used Internet-based CME and did achieve its objective.43 Most of the studies, however, used multiple CME media. Twenty-two studies used multiple media CME in comparison to a control.5565737375757778819599109112113117120120131131132135156 In four of these, the study achieved its stated aim. Seven studies compared multiple media CME to single media CME.4774111114115137153 Six of these achieved the stated aim; each found multiple media CME to be more effective than single media in improving clinical outcomes.
Short-term effects of CME educational techniques on clinical outcomes. A total of 15 different educational techniques were identified in the studies that reported clinical outcomes: readings, conference calls, academic detailing, discussion groups, lectures, point of care CME, feedback, physician visits, case-based learning, role-play, standardized patients, demonstrations, clinical experiences, simulation, and problem-based learning. Only one CME technique was evaluated singly in comparison to control for short-term clinical outcomes: provision of educational readings was associated with increased use of beta-blockers.18
Long-term effects of CME educational techniques on clinical outcomes. Only one study evaluated the effect of a single CME technique in comparison to control for long-term clinical outcomes. In this study, the use of conference calls was associated with improvements in depression.22 With only one such study, no conclusions could be drawn regarding the comparative effectiveness of single CME techniques on long-term clinical outcomes. Most of the studies evaluated used multiple CME techniques. Thirty-eight studies reported on the use of multiple simultaneous CME techniques in comparison to control. Twelve of these reported that the desired clinical outcome of the CME intervention was achieved.434774788495109111115152153155 Two studies yielded mixed results, i.e., some of the outcomes showed improvement while others did not.120151 No individual CME techniques were common to the studies that did or did not achieve their stated objective; thus, one cannot draw any conclusions regarding the differential effectiveness of specific educational techniques. Five studies compared single to multiple CME interventions.4774115137152 In three of the five studies, the use of multiple simultaneous CME techniques was superior to the use of a single CME technique (readings).
Short-term effects of the amount of exposure on clinical outcomes. Only one study assessed the short-term effect of CME on clinical outcomes (less than 30 days). In this study, a one-time print intervention improved adherence with beta-blocker use.18 Five studies did not report on duration of clinical outcome,3233146151157 and all other studies reported long-term outcomes. Thus, no conclusions could be drawn about the differential effectiveness of amount of CME exposure (one-time vs. multiple exposures) on short term outcomes.
Long-term effects of the amount of exposure on clinical outcomes. Seven studies evaluated the long-term effect of a single CME exposure on clinical outcomes.657477111114152154 Four of these studies reported that the CME objective had been met.74111152154 In one study, the objective was not met.114 Of the remaining two studies, one lacked a well-defined control group65 and one yielded unclear results.77 Most studies employed multiple CME exposures. In 24 studies, the multiple CME exposures were compared to a control with no CME.22435556727375757881849599109112113115117120131132135137156 In seven of the studies, the objectives were met.2243788495109115 In 16 studies, the objectives were not met or it was unclear if they were met.5556727375758199112113117131132135137156 One study produced mixed results, as described above.120 In one study, a single printed CME intervention and a combination of printed material plus tutorial were compared with control.47 While the CME intervention was deemed successful in changing clinical outcomes, the use of multiple CME exposures (tutorial plus reading) outperformed a single CME exposure (reading alone) in only 2 of 5 outcomes studied. In summary, both one-time and multiple exposure CME interventions have produced changes in clinical outcomes in about half of the studies, but it is unclear whether multiple exposure CME produces better results than one-time CME.
Types of simulation studied. A wide variety of simulation-based methods were identified in these reviews. Two reviews2630 included studies that had evaluated virtual reality, two reviews31159 included studies with full simulation, three reviews28158160 included studies with standardized patients or role play, five reviews29–31158159 included studies with partial task simulation, and six reviews28–31158159 included studies with computer simulation. All reviews had studies that compared simulation-based training with another type of simulation-based training, other educational intervention, standard training, no education, or no training.
Study populations and study designs. Three reviews26158160 restricted inclusion criteria to studies that had enrolled only medical students or physicians-in-training. Other reviews also included studies that enrolled, in addition to medical students and physicians-in-training, fully-trained physicians, nurses, allied health professionals, and non-medical personnel. One review31 included only randomized controlled trials, while other reviews included non-randomized controlled trials and prospective trials in addition to randomized controlled trials.
| Learning objectives | Number of reviews which addressed learning objective | Number of studies which addressed effectiveness of simulation as an educational method | Direction of evidence |
|---|---|---|---|
| Psychomotor Skills | 6 (references 26, 29–31, 158, 159) | 63 | Favors simulation |
| Communication Skills | 2 (references 158, 160) | 14 | Favors simulation |
| Cognitive Skills | 2 (references 28, 159) | 37 | Mixed results |
Three reviews evaluated the effectiveness of virtual reality in teaching surgical skills. In virtual reality, the surgical field is represented in three dimensions, which may help trainees learn more accurate surgical planning and procedures. One meta-analysis26 found that training in a virtual reality environment significantly decreases the total amount of time required for task completion. There was also a trend toward a decreased error rate, which did not reach statistical significance. Another systematic review29 found four studies that evaluated the role of virtual reality in training surgical techniques. Two studies in this review found improvement in surgical skills after training with a virtual reality simulator, while two other studies found no significant improvement after training with virtual reality simulators. A third systematic review31 found that trainees trained on computer simulation perform better than those who received no training; however, studies found an inconsistent benefit of computer simulation when simulation-trained students were compared with those who received standard training. This systematic review found only one study in which computer simulation was found to be superior to a physical training model. On the other hand, this review found that physical or model simulation may be superior to no training and to standard training, such as instruction from mentors or manuals.
Video simulation was studied in one review.31 Video simulation was not superior to standard training or no training, and there was insufficient evidence to support the superiority of computer simulation to video simulation.
One review30 evaluated the effectiveness of simulators for training in gastrointestinal endoscopy and concluded that flexible sigmoidoscopy simulators can be applied in the clinical training of residents and fellows only for the purpose of improving patient comfort. This review did not find enough evidence to support the use of simulators for clinical training in gastrointestinal endoscopy to improve clinical outcomes.
Another review159 found one study with a cross-over design that reported a better post-test score (on a 22-object written test) of the anesthesia residents who were trained on a simulator as compared to those who were not trained.
Two systematic reviews evaluated the effectiveness of simulation in teaching physical examination. One review158 found that use of standardized patients to teach breast examination to medical students was associated with better performance in a clinical skills examination. This review further found that use of standardized patients or breast examination models was associated with improved ability of the students to detect breast lumps. The second review159 found that use of a patient simulator was associated with improved practical skills as measured on a post-test examination.
Two reviews evaluated the effectiveness of simulation in teaching communication skills. One review158 found that when students were taught communication skills by patients with cancer, including training for giving bad news, students were more likely to respond empathetically to patients and better able to communicate bad news than students who were not taught communication skills by patients with cancer. This review further found that role-playing may be important in teaching oncology-specific communication skills as well as in communicating bad news. The second review160 found that use of standardized patients and role-play was effective in teaching medical students smoking cessation counseling skills. This review further found that the use of standardized patients in teaching tobacco cessation skills to medical students was associated with increased confidence of students in their smoking cessation counseling skills.
We found two reviews28159 that addressed the effectiveness of simulation in knowledge acquisition. Hmelo28 found that computer assisted models are effective in teaching pathophysiologic principles to medical students. The pooled effect size, which measures the combined magnitude of the effect of intervention across 33 studies, was 0.63 for use of computer assisted learning (in favor of simulation). One study included in Ravert159 looked at computer-based trauma simulation to teach trauma management; individual-study effect sizes ranged from -0.04 to 0.35 (did not favor simulation or were neutral). This review159 did not report on the pooled effect size.
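As a point of reference for the pooled effect size quoted above, one common way to combine study-level effect sizes is an inverse-variance weighted average. The reviews summarized here do not state which pooling method was used, so the formula below is an illustrative sketch only. The symbols $d_i$ (effect size in study $i$), $v_i$ (its sampling variance), $w_i$ (its weight), and $k$ (the number of studies) are notation introduced here for illustration and are not taken from the source.

$$
\bar{d} = \frac{\sum_{i=1}^{k} w_i\, d_i}{\sum_{i=1}^{k} w_i},
\qquad w_i = \frac{1}{v_i},
\qquad \operatorname{SE}(\bar{d}) = \frac{1}{\sqrt{\sum_{i=1}^{k} w_i}}
$$

Under this reading, and assuming the effect sizes are standardized mean differences, a pooled value of 0.63 with $k = 33$ would mean that the weighted average difference favored computer-assisted learning by roughly six-tenths of a standard deviation. The individual-study range of -0.04 to 0.35 reported in the second review was not pooled, so no comparable $\bar{d}$ is available for it.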
One review27 systematically evaluated the features of high-fidelity simulators essential for effective learning. This review found that the following features of a high-fidelity simulator were important for effective learning: providing feedback during the learning experience, allowing repetitive practice, integrating into the overall curriculum, offering increasing levels of difficulty, being adaptable to multiple learning strategies, allowing clinical variation in a simulated environment, providing a controlled environment in which to make and detect mistakes without consequences, providing individualized and standardized learning, defining outcomes clearly, and having face validity (realism of the simulator).
Overall the direction of evidence points to the effectiveness of simulation training, especially in psychomotor skills (e.g., procedures or physical examination techniques) and communication skills, but the strength of the evidence was considered low, due to the small number of appropriate studies, the scarcity of quantitative data, and other limitations. Several factors may be responsible for the inadequate quality of evidence in support of this method. In our view the most important factor is the lack of widely-accepted and standardized methods to quantify the competency in procedural or communication skills. In addition, the high cost of simulation methods and difficulty in introducing clinical realism in a simulated environment are other factors that may be responsible for inadequate quality of evidence in this field.
| Internal Audience Characteristics | External Factors |
|---|---|
Six studies examined the effect of years in practice on the educational intervention.3353798092140 Beaulieu, et al. suggested that physicians with less than 11 years of experience ordered fewer unnecessary screening tests than those with more experience; however, these results were not stratified by the educational technique actually employed by individual physicians.140 Two other studies suggested that physicians with greater experience who underwent educational interventions had improvements in attitudes33 and self-reported practice behavior.53 However, none of the studies examined revealed a relationship between years in practice and acquiring knowledge, acquiring skills, or changing practice outcomes. Similarly, age had no influence on the outcomes of educational interventions in the 6 studies that reported on the effects of age.397980101105161
Of the five studies39537980161 that analyzed the effect of gender, only one showed a significant association. Leopold, et al. showed that women improved more than men in confidence with, and objective performance of, knee joint injections after an educational intervention consisting of printed material, hands-on instruction, or video instruction.80 Only one paper described the influence of race on the effectiveness of the educational intervention.79 Grady, et al. suggested that non-whites who underwent a presentation on mammography screening followed by cue enhancement (i.e., chart stickers and clinic posters) improved their screening rates more than whites.79 This study also suggested that the intervention had a greater effect on solo practitioners compared to those in other practice settings. Three studies that examined the influence of board certification on educational outcomes primarily focused on internists and family practitioners and failed to show an association between certification and the desired outcome.5379105
Conclusions and Limitations. We cannot reach definitive conclusions regarding the influence of audience characteristics on the effectiveness of specific educational techniques due to the heterogeneity of the educational interventions and characteristics examined. Furthermore, there are very limited data regarding any specific characteristic, and the overall quality of the existing data on these questions is suboptimal.
The literature is limited about the role of external factors, by themselves or in combination with other factors, in reinforcing the effects of CME on changing behavior. Very few studies explicitly stated that such factors were examined independently or collected data regarding these factors. Only one study rigorously examined external factors as a primary outcome.79 Grady et al. studied whether token monetary rewards in addition to an educational intervention and chart cues increased the rate of mammography referral. While chart cues were effective in increasing mammogram referral and completion rates, the addition of a token monetary incentive provided no added benefit.
The offering of CME credit would not intuitively appear to be an external motivating factor for behavior change. Yet, two studies specifically examined the role of offering CME credit for this purpose, so it was included in our analysis. Both of these studies looked at results in association with earning CME credit for the educational activity.106,156 Chassin and colleagues examined in a subgroup analysis the effect of offering CME credit for attendance at educational programs designed to decrease inappropriate x-ray pelvimetry rates in 64 study hospitals.162 They found no significant difference among intervention participants with respect to the offering of CME credit; both groups had a comparable decrease in pelvimetry use. Messina et al. found a potential association of offering CME credit for a physician educational program with an increase in the use of screening mammography in women who had never undergone mammography, but it did not reach statistical significance. The trend did not hold true for previous mammogram users.156 One possible explanation for these findings is that CME credit may be an inducement to attend a CME activity but may not be sufficient to engage the participant in active learning.
Some CME courses utilize the signature of a “commitment-to-change” statement as an external motivating factor to improve clinical outcomes. Two studies examined the effectiveness of such a practice.161,163 Mazmanian and colleagues randomized 110 physicians to signature versus non-signature groups. While they found that those expressing an intent to change were more likely to change practice behavior (as documented by self-report on a follow-up survey), the act of signing such a commitment-to-change statement had no effect.161 In a much smaller study of 16 physicians attending a geriatrics course, Pereles et al. found that the physicians who were asked to make written commitments for practice changes (n=7) made more changes than controls at both one- and three-month follow-up. The results are of unclear statistical significance given the small study numbers.163
Conclusions and Limitations. There are several barriers to collecting data on external factors. First, it is methodologically difficult to offer incentives (such as CME credit or financial reward) in a controlled fashion. Second, most evaluation of external factors is based on self-report. Finally, small study sizes often preclude a valid analysis of external factors in subgroup analyses due to lack of adequate power. Consequently, it is difficult to draw conclusions regarding the effectiveness of external factors in enhancing CME effects on behavior.
Valid and reliable evaluation tools are necessary to demonstrate the effectiveness of CME interventions. The validity of the evaluation method is “the degree to which the method truly measures what it is intended to measure.”11 A valid evaluation method accurately measures achievement of the stated objective of the educational intervention, whether it involves knowledge, attitudes, skills, practice behaviors, or clinical outcomes. The reliability of the evaluation method is “the consistency or reproducibility of measurements.”11 A reliable evaluation method allows educators to have confidence in their assessments of learning across multiple measurements. As one measurement expert emphasizes, “small amounts of unreliability may cause misclassification errors and large score differences on retesting.”164
An evaluation method may be statistically reliable without being valid for the objective intended by the investigators. However, a method cannot be valid without being reasonably reliable. Thus, Downing argues that “reliability is a necessary but not sufficient condition for validity, and reliability is a major source of validity for all assessments.”164
An educational study may employ a newly created evaluation method or one that previously has been shown to be valid and/or reliable in another study population. The creation of a new evaluation method consumes time and resources for pilot testing, cognitive testing, and psychometric analyses to determine the validity and reliability of the new method. However, a previously used method may not be a valid measure for a new educational intervention if it does not map appropriately to the stated objective. Also, the reliability of a method changes as it is applied in different populations and ideally should be re-measured each time.
We found reports of the validity and/or reliability of at least one evaluation method in 46 out of 136 total articles (33.8 percent). Among these 46 articles, 11 reported on the validity and/or reliability of more than one method: eight studies described two methods;42,43,48,52,60,72,137,144 two studies described three methods;55,84 and one study described four methods.32 Thus, 61 evaluation methods were accompanied by validity or reliability data. For the results below, percentages are based on the total number of methods, rather than articles, since some articles reported on multiple methods.
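The tally of 61 methods can be checked with a short calculation. The following is a minimal sketch, assuming that the 35 articles not listed as multi-method (46 minus 11) each contributed exactly one method; the variable names are illustrative and do not come from the report.

```python
# Number of articles, keyed by how many evaluation methods each article described.
# The counts for 2, 3, and 4 methods come from the paragraph above; the count
# for 1 method is an assumption derived as 46 - (8 + 2 + 1) = 35.
articles_by_method_count = {1: 35, 2: 8, 3: 2, 4: 1}

total_articles = sum(articles_by_method_count.values())
total_methods = sum(
    n_methods * n_articles
    for n_methods, n_articles in articles_by_method_count.items()
)

# Share of all 136 reviewed articles that reported validity and/or reliability.
share_of_articles = round(100 * total_articles / 136, 1)

print(total_articles, total_methods, share_of_articles)
```

Counting methods rather than articles matters here because the 11 multi-method articles account for 26 of the 61 methods.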
Among these 61 evaluation methods, 30 (49.2 percent) were drawn from previous studies and 28 (45.9 percent) were created for the current studies. For 3 methods (4.9 percent), it was not clearly reported whether the method was newly created or previously used. For 22 of the 30 previously used methods, the authors reported that reliability had been assessed: 13 within the current study population, 8 within previous study populations, and 1 within current and previous study populations. However, only 14 methods were presented with specific statistical data to support this reliability. For 12 of the 28 newly created methods, the authors reported that pilot and/or cognitive testing was performed.
Knowledge or cognitive skills were evaluated by 15 methods (24.6 percent).
Attitudes were evaluated by seven methods (11.5 percent). Two methods focused exclusively on attitudes, while five methods evaluated a combination of attitudes and knowledge / cognitive skills.
Skills (communication, psychomotor, or procedural) were evaluated by 11 methods (18.0 percent). One method evaluated physical exam skills in an educational setting. A combination of skills (communication, psychomotor, or procedural) and practice behaviors was measured by 10 methods, using standardized patients to visit physicians at their practice setting or analyzing interactions with real patients.
Practice behaviors (without clinical outcomes) were evaluated by 20 methods (32.8 percent). Seven methods used self-report by physicians of their practice behaviors. Three methods used patients' report of their physicians' behaviors in their medical care. Ten used chart review of medical records and/or claims data.
Clinical outcomes (with or without practice behaviors) were evaluated by 8 methods (13.1 percent). Two studies used chart review. One used reports by patients or families of their attitudes. One study used patient satisfaction. Two studied patient reports of their own behavior — including medication adherence and participation in preventive screening — as the outcome. One study used patients' reports of preventative services provided by their physicians. Three used measures of the patient's health.
The following articles provide notable examples for reporting of the validity or reliability of educational outcome measures:
Knowledge or cognitive skills: Fordis et al.42 describe the development of a knowledge test for cholesterol management, including: content validation by experts; description of the test and response options; piloting and item number reduction; and high internal consistency reliability (Cronbach alpha = 0.79).
Attitudes: Mann et al.52 describe the development of a knowledge and attitudes test about cholesterol management, including: content validation by experts; need for >90 percent agreement on question inclusion and consistency; description of the test and response options; pilot testing with internal consistency testing in the pilot population (KR20 = 0.60); and test-retest reliability analysis using control group test scores.
Skills (communication, psychomotor, or procedural): Roter et al.84 describe a coding system to rate physicians' proficiency in managing standardized patients' emotional distress, including: blinding of coders; dichotomous coding system; and internal consistency reliability (Cronbach alpha = 0.20–0.62 for shorter scales, 0.76–0.81 for longer scales, and 0.62 for overall score).
Practice behaviors: Sibley et al.136 describe chart abstraction for quality of care, including: content validity of pre-determined criteria by experienced clinicians; rating system; training of blinded nurse-abstractors; and high interrater and intrarater reliability (kappa >0.8).
Clinical outcomes: Roter et al.84 describe use of the General Health Questionnaire-28 to detect emotional distress in patients, including: citation of original source for this previously validated questionnaire; and repeated reliability testing with the study population, yielding high internal consistency (Cronbach alpha = 0.90–0.92).
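For readers less familiar with the internal consistency statistics cited in these examples, the sketch below shows one common way to compute Cronbach's alpha. It is illustrative only: the response data are hypothetical and are not drawn from any reviewed study, and the function name is an assumption. For dichotomous (0/1) items scored this way, the same formula yields the Kuder-Richardson 20 (KR20) coefficient.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores)
    Population variances are used for both terms, so the ratio is consistent.
    """
    n_items = len(scores[0])
    item_variances = [
        pvariance([row[i] for row in scores]) for i in range(n_items)
    ]
    total_variance = pvariance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 4 right/wrong (1/0) test items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.8 for this hypothetical data set
```

Because alpha depends on the respondents as well as the items, the same instrument can yield different values in different study populations.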
| Type of Validity | Definition* | # of Methods | Knowledge or cognitive skills† | Attitudes† | Skills (communication or psychomotor)† | Practice Behaviors† | Clinical Outcomes† |
|---|---|---|---|---|---|---|---|
| Content | Degree to which an instrument accurately represents the skill or characteristic it is designed to measure, based on people's experience and available knowledge | 16 | 15 (refs 21, 32, 36, 38, 40, 42, 46, 48, 51–54, 57, 85, 142) | 3 (refs 51–53) | 0 | 2 (refs 53, 136) | 0 |
| Concurrent criterion | Degree to which an instrument produces the same results as another accepted or proven instrument that measures the same variable | 8 | 1 (ref 142) | 1 (ref 82) | 1 (ref 25) | 6 (refs 16, 25, 72, 82, 137, 144) | 2 (refs 47, 137) |
| Predictive criterion | Degree to which a measure accurately predicts expected outcomes | 1 | 0 | 0 | 0 | 1 (ref 78) | 0 |
| Construct | Degree to which an instrument measures the theoretical construct it intends to measure | 5 | 3 (refs 36, 43, 85) | 0 | 0 | 2 (refs 53, 104) | 0 |
| Type of Reliability | Definition | # of Methods | Knowledge or cognitive skills† | Attitudes† | Skills (communication or psychomotor)† | Practice Behaviors† | Clinical Outcomes† |
| Internal consistency | How well items reflecting the same construct yield similar results | 19 | 12 (refs 36, 40, 42, 45, 48, 49, 52, 53, 55, 63, 64, 85) | 8 (refs 45, 49, 52, 53, 55, 62–64) | 2 (refs 32, 84) | 6 (refs 32, 45, 49, 53, 72, 84) | 3 (refs 55‡, 84) |
| Inter-rater | Degree to which measurements are the same when obtained by different persons | 16 | 1 (ref 85) | 0 | 6 (refs 32, 60, 84, 88, 138, 140) | 13 (refs 32, 42, 76, 84, 88, 108, 118, 132, 134, 136, 138, 140, 144) | 2 (refs 132, 137) |
| Intra-rater | Degree to which measurements are the same when repeated by the same person | 2 | 0 | 0 | 0 | 2 (refs 48, 136) | 0 |
| Equivalence | Degree to which alternate forms of the same measurement instrument produce the same results | 4 | 3 (refs 17, 36, 54) | 0 | 0 | 1 (ref 53) | 0 |
| Test-retest | Degree to which the same test produces the same results when repeated under the same conditions | 5 | 3 (refs 52‡, 53) | 2 (refs 52, 53) | 0 | 3 (refs 22, 53, 134) | 0 |
| Validity: not specifically reported | | 4 | 0 | 1 (ref 72) | 3 (refs 32, 43, 72) | 3 (refs 32, 72, 121) | 1 (ref 81) |
| Reliability: not specifically reported | | 5 | 1 (ref 60) | 0 | 1 (ref 32) | 2 (refs 32, 121) | 1 (ref 81) |
* Definitions were obtained from Reed, et al.11
† Methods referenced may target more than one type of outcome and are listed under each applicable column.
‡ Two assessment methods within the same article.
Validity was reported for 31 of 61 evaluation methods (50.8 percent). Content validity was reported for 16 methods. The specific “experts” who reviewed the assessment were reported for 11 of these 15 methods. Concurrent criterion validity was reported for 8 methods. Predictive criterion validity was reported for only 1 method, which involved a comparison between physicians' reports of asthma management behaviors and patients' reports of physician behaviors. Construct validity was reported for 5 methods, usually through “known-group validity” (establishing construct validity by demonstrating better scores among those with higher levels of training or clinical experience). High statistical validity was demonstrated for only two methods.72,137 Five methods were described as valid without specific details. Thus, the vast majority of CME studies offered no or limited psychometric data for the validity of their evaluation methods.
Reliability was reported for 43 of 61 evaluation methods (70.5 percent). Internal consistency reliability was reported for 19 methods, including 13 learner instruments, 1 observer instrument for audio-taped interactions, 1 standardized patient instrument, and 3 clinical patient instruments. Inter-rater reliability was reported for 16 methods, including 9 medical data abstractions and 6 skills assessments. Intra-rater reliability was assessed for 2 medical data abstraction studies. Equivalence reliability was reported for 4 methods, and test-retest reliability for 5 methods. Four methods were described as reliable without specific details. When reported, statistical tests yielded primarily modest evidence of reliability based on Cronbach-alpha, Kappa, or correlation statistics.
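As an illustration of the kappa statistic mentioned above, the following sketch computes Cohen's kappa for two raters. This is an assumption made for illustration: the reviewed studies may have used other kappa variants, and the ratings shown are hypothetical rather than taken from any study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance.

    kappa = (observed agreement - expected agreement) / (1 - expected agreement)
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical data: two chart abstractors judging whether 10 charts met a criterion.
abstractor_1 = ["met"] * 6 + ["not met"] * 4
abstractor_2 = ["met"] * 5 + ["not met"] * 4 + ["met"]

kappa = cohen_kappa(abstractor_1, abstractor_2)
print(round(kappa, 2))  # 0.58: raw agreement is 80 percent, chance agreement 52 percent
```

The example shows why raw percent agreement overstates reliability: the two hypothetical abstractors agree on 8 of 10 charts, yet kappa is well below the 0.8 level reported in the strongest studies.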
Forty-six of 136 articles (34 percent) reported the validity and/or reliability of at least one evaluation method for assessing the effectiveness of CME.
Thirty methods were drawn from previous studies, and 28 were created for the current studies. For 3 methods, the source was unclear. Authors did not commonly report reliability testing within the new study population for methods found to be reliable in other populations.
The most common type of outcome evaluated by valid and/or reliable evaluation methods involved practice behaviors, for 20 out of 61 methods (33 percent).
Of 61 evaluation methods with validity or reliability reported, 16 (26 percent) included descriptions of validity alone, 29 (48 percent) included descriptions of reliability alone, and ten (16 percent) had descriptions of both validity and reliability. For six methods (10 percent), the methods were described as valid and/or reliable, but the specific type of validity or reliability was not reported.
Among these 61 methods, content validity was the most commonly reported type of validity (26 percent).
Among these 61 methods, internal consistency (31 percent) and inter-rater (26 percent) were the most common types of reliability reported.
Although many studies of the effectiveness of CME have considered the validity or reliability of their evaluation methods, relatively few studies have used methods that have strong evidence of both construct and criterion validity. In addition, relatively few studies have used evaluation methods that have strong evidence of each of the specific types of reliability (internal consistency, inter-rater, intra-rater, equivalence, and test-retest). We therefore conclude that the overall strength of evidence on the effectiveness of CME is limited by weaknesses in the evaluation methods that have been used. To strengthen the evidence base on the effectiveness of CME, it will be necessary to commit additional resources to the development of valid and reliable evaluation methods. This may be quite challenging because of the limited resources that generally are available to clinician-educators. Where appropriate, educators may save time and resources by using previously validated and reliable methods, but they must demonstrate the validity of these methods for their specific educational outcomes and the reliability of these methods for their particular study populations.
We conducted a systematic review of the medical literature to evaluate the effectiveness of CME in improving knowledge, attitudes, skills, physician behavior, and clinical outcomes. Overall, despite the generally low quality of the evidence, most of the studies reviewed suggest that CME is effective, at least to some degree, not only in achieving but also in maintaining the objectives studied. Despite the wide variety of CME techniques, media, and exposures used, and despite the heterogeneity of the studies reviewed, we found common themes that applied across objectives. For example, when assessing the effectiveness of CME across domains, print media seem to be less effective than live media, and multimedia generally seem to be more effective than single media. In addition, interactive techniques seem to be more effective than non-interactive ones, and multiple exposures to the CME activity seem to be more effective than a single exposure. Thus, the evidence supports consideration of these attributes of effective educational interventions when designing a CME course.
To ascertain whether broader lessons could be drawn from the non-CME medical education realm, we evaluated the effect of simulation methods in medical education by conducting a review of systematic reviews. Although we found that simulation training generally was effective, especially in the dissemination of psychomotor skills (e.g., procedures or physical examination techniques), studies which examined simulation did not review outcomes along the entire continuum of domains (i.e., knowledge through clinical outcomes), and were heterogeneous enough that few other conclusions could be drawn.
We also studied whether certain internal (audience) and external characteristics or factors, special to the environment, the participants or the course, may affect the effectiveness of CME. We found that the small and heterogeneous studies available did not allow us to reach definitive conclusions regarding the influence of audience characteristics or external factors on the effectiveness of CME. This is an area where further study might yield useful results in asking whether it might be important to marry the CME activity offered with those particular characteristics which might enhance its effectiveness and value.
This evidence report has a number of limitations. First, the heterogeneous nature of the studies inhibits a quantitative summary of the effectiveness of CME. There is a lack of standardization of the definition of CME or associated performance improvement. The educational interventions studied targeted different types of audiences, using multiple types of objectives, across diverse content areas. Thus, comparing the effectiveness of educational methods and techniques across studies is challenging. Even when multiple studies shared comparable objectives, we found that authors did not use standardized reporting of results (such as effect sizes), which precluded a quantitative meta-analysis of the results. Given these limitations, we had to pursue a qualitative synthesis of the available data.
Second, the generally low quality of study designs limits our ability to draw firm conclusions about the effectiveness of CME. Although we limited the review to studies with comparison groups, including a large number of randomized controlled trials, many studies lacked adequate descriptions of randomization methods or techniques for adjusting for baseline group differences. Moreover, too many of the articles we studied were published with comparison groups but did not have a control group, which did not allow us to evaluate effectiveness. In addition, only one-fifth of the studies described blinding of those evaluating the outcomes, leaving open the potential for biased assessment.
Third, the quality of reporting was variable. Authors rarely described the study design and the interventions in enough detail to allow reproducibility. In particular, studies rarely described specific learning objectives, prohibiting assessment of whether objectives matched appropriately to the evaluation methods/outcomes.
Fourth, the lack of valid and reliable CME evaluation tools leaves open the possibility of overestimation or underestimation of the effectiveness of CME. Most studies lacked psychometric data regarding their evaluation methods. Thus, the evaluation methods may not have truly assessed the outcomes targeted by the educational interventions. In addition, evaluation methods with poor reliability may fail to detect actual improvements in outcomes.
Fifth, our search strategy may be subject to some publication bias. Our search was limited to published English-language articles about educational studies within the United States and Canada. Our review does not include studies from other countries where high quality CME studies have been conducted. As indicated in the Methods chapter, this methodological choice was made because the medical education systems in other countries are very different from the system in the United States, thereby limiting the applicability of such studies to CME in the United States. Also, educational studies with negative findings are less likely to be published, potentially leading to overestimation of the effectiveness of CME.
Sixth, there is a lack of standardization of approaches to CME research in general, i.e., no Phase 1, Phase 2, Phase 3-like process that crafts an organized approach to how aims are set up and how comparative groups are organized. This includes the lack of standardization for definitions of controls.
Seventh, there is a general lack of standardization of terminology related to media, techniques, exposure volume, etc., which makes it difficult to study the impact of different methods, techniques, and exposures on the effectiveness of CME.
Finally, several limitations were specific to particular key questions:
For Key Questions 1 and 2, this report does not systematically review the effectiveness of quality improvement interventions. Although we included quality improvement studies if they included a physician education component, our search strategy did not systematically target all quality improvement studies. Thus, we cannot draw definitive conclusions comparing the effectiveness of physician education in quality improvement interventions versus quality improvement interventions more generally.
Moreover, our conclusions are limited secondary to the heterogeneity of studies included along multiple domains. In addition, many of the studies lacked a clear control group, which did not allow effectiveness of CME to be determined, but only different effectiveness across different interventions.
For Key Question 3, conclusions are limited due to the weaknesses of the systematic reviews available, the poor quality of many of the included studies, the heterogeneity of included studies, and the rapidly evolving nature of computerized simulation.
For Key Questions 4 and 5, there was lack of standard definitions of internal and/or external factors that might impact CME. In addition, conclusions are limited due to small sample sizes prohibiting analysis based on within-group characteristics and infrequent collection of data on external and internal motivating factors. In addition, lack of standardization of tools to assess the efficacy of CME inhibited our ability to draw firm conclusions.
For Key Question 6, conclusions are limited due to inconsistent reporting of validity and reliability for evaluation methods drawn from previous studies; we may have missed some valid or reliable evaluation methods that were not described as “valid” or “reliable” and for which psychometric data were not reported.
Additional limitations for Key Question 3 include:
With the exception of Hmelo,28 these reviews were quite recent (2002-2006), and point to evolving educational methods. Virtual reality in 2001 may be difficult to compare to virtual reality in 2006. The computer assisted instruction described in the review by Hmelo28 is already dated.
No review included tests for publication bias, which would be highly anticipated with any new technology.
Simulation can include a variety of tasks and procedures, with varying lengths and complexities. Some studies included partial task simulators with complex surgical and endoscopic procedures; pooling such disparate skills may be inappropriate.
One of the major advantages of simulation over standard medical education training for procedures should be the opportunity to practice and receive feedback in a shorter period of time. No study explored this important contributor to effectiveness, i.e., the frequency and intensity of the simulation method and whether there is a “dose-response” effect with the use of simulation and the outcome of clinical skills competence. This aspect was assessed by a systematic review published after our literature search which found an association between hours of practice on high-fidelity simulators and standardized learning outcomes.165
There is no consensus on the appropriate outcome measures for effectiveness of simulation. Haque,26 Sutherland,31 Gerson,30 Aucar,29 and Issenberg27 all included validity studies within their reviews, but heterogeneity of tasks and simulators again makes it difficult to pool results.
Although nearly every review included a careful description of search methodology, most fell short in nearly every quality measure of a systematic review. The exception, Issenberg,27 is an example of a high quality systematic review of an educational topic; the authors unfortunately did not address the outcome of interest to Key Question 3.
For Key Question 3, systematic reviews that addressed the use of simulation in CME educational activities were excluded as this aspect was covered by our other Key Questions. It is possible that systematic reviews of the efficacy of simulation in CME activities may reach a different conclusion.
We believe that assessing those factors that make CME more or less effective will be important for the planning of effective CME activities in the future. Although the overall quality of the studies was low, there were a few important trends. CME appears to be generally effective not only in the acquisition or achievement of knowledge, attitudes, skills, behaviors, and clinical practice outcomes, but also in their retention, and there are certain techniques, methods or exposures which seemed to be better than others. Unfortunately, most studies did not describe multiple evaluation points after the intervention, which did not allow us to determine at what point the CME effect, when persistent, became extinguishable and might have needed reinforcement. To enable future systematic reviews of CME, study researchers should refer to an excellent review by Reed et al, which summarizes guidelines for standardization in the conduct and reporting of educational interventions.11
Simulation, as a teaching tool, has the potential to affect patient safety and clinical outcomes, but no study included in this review used a patient-based clinical outcome as a measure of effectiveness. Future research should seek to determine the impact of simulation in improving clinical outcomes.
We believe that educators should develop strategies for identifying and prioritizing the gaps in our knowledge about CME that should be the focus of additional research. Future research should include high quality randomized controlled studies of CME with clear intervention and control groups and measurement of effectiveness at multiple points post-intervention. Such studies should focus on high priority areas given the resource constraints that educators typically face in conducting research on CME. Educators will need to use a variety of study designs, including qualitative research methods, because it will not be feasible to perform randomized controlled trials on many of the issues. Indeed, it will be difficult to rely too heavily on randomized controlled trials given the difficulty of creating and maintaining effective control groups.
To advance research on CME, leaders in medical education could develop a national agenda on what is needed to improve the effectiveness of CME. Such an agenda should include a clear definition of what constitutes CME. For example, whether quality improvement or practice improvement alone should have been included in our evaluation presented a dilemma. We decided that there needed to be a well-defined educational intervention for us to include quality improvement or practice improvement studies in our review of the effectiveness of CME. The agenda for future research should include development of more standardized approaches to the description of CME interventions, media, techniques, and exposure volumes. Ideally, the agenda would be based on a sound conceptual model of what influences the effectiveness of CME, including participating physician perspectives. Given the large amount of time, effort and money invested in CME, it seems reasonable to invest in a national consensus conference that could help to lay the foundation for a comprehensive research agenda for CME. In addition, greater resources should be devoted to funding educational researchers to design higher quality CME studies as well as the tools to evaluate CME outcomes.
| Abbreviation | Definition |
|---|---|
| ABMS | American Board of Medical Specialties |
| ACCP | American College of Chest Physicians |
| AHRQ | Agency for Healthcare Research and Quality |
| AMA | American Medical Association |
| CI | Confidence interval |
| CME | Continuing medical education |
| EPC | Evidence-based Practice Center |
| KQ | Key questions |
| KR20 | Kuder-Richardson 20 |
| MOC | Maintenance of Certification |
| NIH | National Institutes of Health |
| SD | Standard deviation |
| US | United States |
Alejandro Aparicio, MD, FACP
Director, Division of Continuing Physician Professional Development,
American Medical Association
Vice President for Medical Affairs,
Ballard Health Care
Chicago, IL
Michael H. Baumann, MD, MS
Professor of Medicine
Division of Pulmonary and Critical Care Medicine
University of Mississippi Medical Center
Jackson, MS
Frank C. Berry
Continuing Medical Education Director
MedChi, The Maryland State Medical Society
Baltimore, Maryland
Nancy L. Davis, PhD
Director, Division of Continuing Medical Education
American Academy of Family Physicians
Leawood, KS
Robert Galbraith, MD, MBA, FACP
Executive Director of the Center for Innovation,
National Board of Medical Examiners
Philadelphia, PA
James C. Hebert, MD, FACS
Chair, Committee on Continuous Professional Development
Associate Dean for Graduate Medical Education
Vice Chair for Education, Department of Surgery
American College of Surgeons
University of Vermont, College of Medicine
Burlington, Vermont
S. Barry Issenberg, MD, FACP
Associate Professor of Medicine
Assistant Dean, Research in Medical Education
Director, Division of Research and Technology
Assistant Director, Center for Research in Medical Education
University of Miami Miller School of Medicine
Miami, Florida
Jocelyn Lockyer, PhD
Director, Continuing Medical Education and Professional Development
Associate Professor, Department of Community Health Services
University of Calgary
Calgary, Alberta Canada
Mary Martin Lowe, MA
Director, Education and Improvement
Accreditation Council for Continuing Medical Education
Chicago, Illinois
Don Moore, Jr., PhD
Professor of Medical Education and Administration
Director, Division of Continuing Medical Education
Vanderbilt University School of Medicine
Nashville, TN
LTC Lisa K. Moores, MC, USA
Former Chair, Council of NetWorks
Vice Chair, Continuing Education Committee
Member, Task Force on Performance Measurement
Walter Reed Army Medical Center
Gaithersburg, MD
Charles Willis, MBA
Former Director, Department of AMA PRA Standards & Policy Liaison Activities
American Medical Association
Administrative Director, Division of Continuing Physician Professional Development
Chicago, IL
February 2005–February 2006
Academic Medicine
The American Journal of Managed Care
American Journal of Preventive Medicine
Annals of Internal Medicine
Chest
Canadian Medical Association Journal
The Journal of Continuing Education in the Health Professions
Journal of General Internal Medicine
Journal of Medical Education
The Journal of the American Medical Association
Medical Care
Medical Education
| Terms | Returns |
|---|---|
| MEDLINE Strategy | |
| ((((“Education, Continuing”[MeSH] OR “Education, Medical”[MeSH]) NOT (“Education, Dental, Continuing”[MeSH] OR “Education, Nursing, Continuing”[MeSH] OR “Education, Pharmacy, Continuing”[MeSH] OR “Education, Medical, Undergraduate”[MeSH] OR “Internship and Residency”[MeSH]) OR (“continuing medical education”[tiab] OR CME[tiab]) OR ((educat*[tiab] OR train*[tiab] OR curriculum[tiab]) AND (physician* OR Family practi*[tiab] OR Family medicine[tiab] OR General practi*[tiab] OR internist*[tiab] OR Surgeon*[tiab] OR Primary care[tiab] OR Allergist*[tiab] OR Immunologist*[tiab] OR Anesthesiology*[tiab] OR Dematolog*[tiab] OR Emergency medicine[tiab] OR Forensic medicine[tiab] OR Hospitalist*[tiab] OR Internal medicine[tiab] OR Cardiolog*[tiab] OR Endocrinolog*[tiab] OR Gastroenterolog*[tiab] OR Hematolog*[tiab] OR Oncolog*[tiab] OR Nephrolog*[tiab] OR Pulmonolog*[tiab] OR Rhematolog*[tiab] OR Neurolog*[tiab] OR Patholog*[tiab] OR Pediatric*[tiab] OR Psychiatr*[tiab] OR Radiolog*[tiab] OR Obstetrician*[tiab] OR Gynecolog*[tiab]))) AND (behav*[tiab] OR practice*[tiab] OR evaluat*[tiab] OR assess*[tiab] OR learn*[tiab] OR skill*[tiab] OR outcome*[tiab] OR effective*[tiab] OR analy*[tiab] OR intervention*[tiab] OR examin*[tiab])) NOT (dental*[tiab] OR dentist*[tiab] OR student*[tiab] OR undergraduate*[tiab] OR athlet*[tiab])) AND English[lang] NOT (animal[mh] NOT human [mh]) AND (1981:2006[dp]) NOT (review[pt] OR meta-analysis[pt] OR editorial[pt] OR comment[pt] OR letter[pt]) | 38174 |
| EMBASE Strategy | |
| ((((‘medical education’:de) NOT (‘clinical supervision’:de OR ‘dental education’:de OR ‘medical school’:de OR ‘physician assistant education’:de OR ‘residency education’:de)) OR (‘continuing medical education’:ti,ab) OR cme:ti,ab OR ((educat*:ti,ab OR train*:ti,ab OR curriculum:ti,ab) AND (physician*:ti,ab OR (family:ti,ab AND practi*:ti,ab) OR (family:ti,ab AND medicine:ti,ab) OR (general:ti,ab AND practi*:ti,ab) OR internist*:ti,ab OR surgeon*:ti,ab OR (primary:ti,ab AND care:ti,ab) OR allergist*:ti,ab OR immunologist*:ti,ab OR anesthesiolog*:ti,ab OR dermatolog*:ti,ab OR (emergency:ti,ab AND medicine:ti,ab) OR (forensic:ti,ab AND medicine:ti,ab) OR hospitalist*:ti,ab OR (internal:ti,ab AND medicine:ti,ab) OR cardiolog*:ti,ab OR endocrinolog*:ti,ab OR gastroenterolog*:ti,ab OR hematolog*:ti,ab OR oncolog*:ti,ab OR nephrolog*:ti,ab OR pulmonolog*:ti,ab OR rhemaolog*:ti,ab OR neurolog*:ti,ab OR patholog*:ti,ab OR pediatric*:ti,ab OR psychiatr*:ti,ab OR radiolog*:ti,ab OR obstetric*:ti,ab OR gynecolog*:ti,ab))) AND (behav*:ti,ab OR practice*:ti,ab OR evaluat*:ti,ab OR assess*:ti,ab OR learn*:ti,ab OR skill*:ti,ab OR outcome*:ti,ab OR effective*:ti,ab OR analy*:ti,ab OR intervention*:ti,ab OR examin*:ti,ab)) NOT (dental*:ti,ab OR dentist*:ti,ab OR student*:ti,ab OR undergraduate*:ti,ab OR athlet*:ti,ab) AND [English]/lim NOT ([animals]/lim NOT [humans]/lim) AND [1981-2006]/py NOT ([conference paper]/lim OR [editorial]/lim OR [erratum]/lim OR [letter]/lim OR [note]/lim OR [review]/lim) | 44765 |
| The Cochrane Central Register of Controlled Trials (CENTRAL) | |
| (((((Continuing medical education):ti,ab,kw OR (CME):ti,ab,kw) OR ((educat* OR train* OR curriculum):ti,ab,kw NEAR (physician* OR Family practi* OR Family medicine OR General practice OR internist* OR Surgeon* OR Primary care OR Allergist OR Immunologist OR Anesthesiolog* OR Dematolog* OR Emergency medicine OR Forensic medicine OR Hospitalist* OR Internal medicine OR Cardiolog* OR Endocrinolog* OR Gastroenterolog* OR Hematolog* OR Oncolog* OR Nephrolog* OR Pulmon* OR Rhematolog* OR Neurolog* OR Patholog* OR Pediatric* OR Psychiatr* OR Radiolog* OR Obstetrician* OR Gynecolog*):ti,ab,kw)) AND (behav* OR evaluat* OR assess* OR learn* OR skill* OR outcome* OR effective* OR analy* OR examin* OR intervention*):ti,ab,kw) NOT (dental* OR dentist* OR student* OR undergraduate* OR athlet*):ti,ab,kw), LIMIT DATE RANGE from 1981 to 2006 | 1843 |
| PsycINFO | |
| (((((MM “Continuing Education” OR MM “Inservice Training” OR MM “Medical Education”) NOT (MM “Medical Internship” OR MM “Medical Residency”)) OR (TI “continuing medical education” OR TI “CME” OR AB “continuing medical education” OR AB “CME”) OR ((AB educat* OR AB train* OR AB curriculum) AND (AB physician* OR AB Family practi* OR AB Family medicine OR AB General practi* OR AB internist* OR AB Surgeon* OR AB Primary care OR AB Allergist OR AB Immunologist OR AB Anesthesiology* OR AB Dematolog* OR AB Emergency medicine OR AB Forensic medicine OR AB Hospitalist* OR AB Internal medicine OR AB Cardiolog* OR AB Endocrinolog* OR AB Gastroenterolog* OR AB Hematolog* OR AB Oncolog* OR AB Nephrolog* OR AB Pulmonolog* OR AB Rhematolog* OR AB Neurology* OR AB Patholog* OR AB Pediatric* OR AB Psychiatr* OR AB Radiolog* OR AB Obstetrician* OR AB Gynecolog*))) AND (AB behav* OR AB evaluat* OR AB assess* OR AB learn* OR AB skill* OR AB outcome* OR AB effective* OR AB analy* OR AB examin*)) NOT (AB dental* OR AB dentist* OR AB student* OR AB undergraduate* OR AB athlet*)) AND (LA English NOT (PO animal NOT PO human) AND DT 198101–200602 NOT (PZ abstract collection OR PZ bibliography OR PZ column/opinion OR PZ comment/reply OR PZ editorial OR PZ erratum/correction OR PZ letter OR PZ obituary OR PZ all chapters OR PZ original chapter OR PZ reprinted chapter OR PZ reprinted journal article OR PZ publication information OR PZ review)) [FURTHER LIMITED TO ALL JOURNALS] | 8738 |
| ERIC | |
| (((((MM “Continuing Education” OR MM “Inservice Training” OR MM “Medical Education”) NOT (MM “Medical Internship” OR MM “Medical Residency”)) OR (TI “continuing medical education” OR TI “CME” OR AB “continuing medical education” OR AB “CME”) OR ((AB educat* OR AB train* OR AB curriculum) AND (AB physician* OR AB Family practi* OR AB Family medicine OR AB General practi* OR AB internist* OR AB Surgeon* OR AB Primary care OR AB Allergist OR AB Immunologist OR AB Anesthesiology* OR AB Dematolog* OR AB Emergency medicine OR AB Forensic medicine OR AB Hospitalist* OR AB Internal medicine OR AB Cardiolog* OR AB Endocrinolog* OR AB Gastroenterolog* OR AB Hematolog* OR AB Oncolog* OR AB Nephrolog* OR AB Pulmonolog* OR AB Rhematolog* OR AB Neurology* OR AB Patholog* OR AB Pediatric* OR AB Psychiatr* OR AB Radiolog* OR AB Obstetrician* OR AB Gynecolog*))) AND (AB behav* OR AB evaluat* OR AB assess* OR AB learn* OR AB skill* OR AB outcome* OR AB effective* OR AB analy* OR AB examin*)) NOT (EL “Early Childhood Education” OR EL “Preschool Education” OR EL “Elementary Secondary Education” OR EL “Elementary Education” OR EL “Primary Education” OR EL “Adult Basic Education” OR EL “Intermediate Grades” OR EL “Secondary Education” OR EL “Middle Schools” OR EL “Junior High Schools” OR EL “High Schools” OR EL “High School Equivalency Programs” OR EL “Postsecondary Education” OR EL “Two Year Colleges”)) AND (LA English AND DT 198101–200602 AND PT Journal Article NOT (PO animal NOT PO human)) | 2002 |

| Terms | Returns |
|---|---|
| MEDLINE Strategy | |
| (((“Education, Medical”[MeSH] OR ((educat*[tiab] OR train*[tiab] OR curriculum[tiab]) AND (medical*[tiab] OR resident*[tiab] OR residenc*[tiab] OR physician*[tiab] OR surgery[tiab] OR surgeon*[tiab] OR surgical*[tiab])))) AND (“Patient Simulation”[MeSH] OR Computer Simulation[MeSH] OR Manikins[MeSH] OR simulation*[tiab] OR simulat*[tiab] OR mannikin[tiab] OR manikin[tiab] OR mannequin*[tiab] OR virtual[tiab] OR computer-based[tiab] OR “standardized patient”[tiab] OR “standardized patients”[tiab])) AND ((review[tiab] or review[pt] or meta-analys*[tiab] or meta-analysis[pt]) AND English[lang] NOT (letter[pt] or comment[pt] or editorial[pt])) NOT (animal[mh] NOT human [mh]) AND (“1990/01/01”[pdat] : “2006/02/28”[pdat]) | 466 |
| EMBASE Strategy | |
| ((((‘medical education’/exp) NOT (‘clinical supervision’/exp OR ‘dental education’/exp OR ‘physician assistant education’/exp)) OR ((educat*:ti,ab OR train*:ti,ab OR curriculum:ti,ab) AND (medical*:ti,ab OR resident*:ti,ab OR residenc*:ti,ab OR surgery:ti,ab OR surgical*:ti,ab OR physician*:ti,ab OR (family:ti,ab AND practi*:ti,ab) OR (family:ti,ab AND medicine:ti,ab) OR (general:ti,ab AND practi*:ti,ab) OR internist*:ti,ab OR surgeon*:ti,ab OR (primary:ti,ab AND care:ti,ab) OR allergist*:ti,ab OR immunologist*:ti,ab OR anesthe*:ti,ab OR anaesthe*:ti,ab OR dermatolog*:ti,ab OR (emergency:ti,ab AND medicine:ti,ab) OR (forensic:ti,ab AND medicine:ti,ab) OR hospitalist*:ti,ab OR (internal:ti,ab AND medicine:ti,ab) OR cardiolog*:ti,ab OR endocrinolog*:ti,ab OR gastroenterolog*:ti,ab OR hematolog*:ti,ab OR oncolog*:ti,ab OR nephrolog*:ti,ab OR pulmonolog*:ti,ab OR rhemaolog*:ti,ab OR neurolog*:ti,ab OR patholog*:ti,ab OR pediatric*:ti,ab OR psychiatr*:ti,ab OR radiolog*:ti,ab OR obstetric*:ti,ab OR gynecolog*:ti,ab))) AND (‘skill’/exp OR ‘simulator’/exp OR ‘simulation’/exp OR ‘virtual reality’/exp OR simulation*:ti,ab OR simulat*:ti,ab OR mannikin:ti,ab OR manikin:ti,ab OR mannequin*:ti,ab OR virtual:ti,ab OR computer-based:ti,ab OR (standardized:ti,ab AND patient*:ti,ab))) AND (review:ti,ab,it OR ‘meta analysis’:ti,ab,it OR metaanalysis:ti,ab,it) NOT (letter:it OR comment:it OR editorial:it OR ‘conference paper’:it OR erratum:it OR note:it) AND [English]/lim NOT ([animals]/lim NOT [humans]/lim) AND [1981-2006]/py | 2359 |
| The Cochrane Database of Systematic Reviews and the Cochrane Database of Abstracts of Reviews of Effects (DARE) | |
| (Simulation or simulator or manikin or mannikin or mannequin or virtual or computer-based or “standardized patient” or “standardized patients”) AND education in title, abstract or keywords restricted to reviews | 3 |
| PsycINFO | |
| ((((MM “Medical Education”) OR ((AB educat* OR AB train* OR AB curriculum) AND (AB physician* OR AB Family practi* OR AB “Family medicine” OR AB General practi* OR AB internist* OR AB Surgeon* OR AB “Primary care” OR AB Allergist OR AB Immunologist OR AB Anesthesiolog* OR AB Dematolog* OR AB “Emergency medicine” OR AB “Forensic medicine” OR AB Hospitalist* OR AB “Internal medicine” OR AB Cardiolog* OR AB Endocrinolog* OR AB Gastroenterolog* OR AB Hematolog* OR AB Oncolog* OR AB Nephrolog* OR AB Pulmonolog* OR AB Rhematolog* OR AB Neurolog* OR AB Patholog* OR AB Pediatric* OR AB Psychiatr* OR AB Radiolog* OR AB Obstetrician* OR AB Gynecolog* OR AB Medical OR AB resident* OR AB residenc* OR AB surgery OR AB surgical*))) AND (MM “Simulation” or MM “Virtual Reality” or MM “Human Machine Systems” or AB “simulation” or AB simulat* or AB manikin or AB mannikin or AB mannequin or AB virtual or AB “standardized patient” or AB “standardized patients”)) AND (LA English NOT (PO animal NOT PO human) AND DT 199001–200602 NOT (PZ abstract collection OR PZ bibliography OR PZ column/opinion OR PZ comment/reply OR PZ editorial OR PZ erratum/correction OR PZ letter OR PZ obituary OR PZ all chapters OR PZ original chapter OR PZ reprinted chapter OR PZ reprinted journal article OR PZ publication information))) AND (PZ review OR AB review OR AB meta-analys*) | 34 |
| ERIC | |
| ((((DE “Medical Education”) OR ((AB educat* OR AB train* OR AB curriculum) AND (AB physician* OR AB “Family practice” OR AB “Family practitioner” OR AB “Family medicine” OR AB “General practice” OR AB “General practitioner” OR AB internist* OR AB Surgeon* OR AB “Primary care” OR AB Allergist OR AB Immunologist OR AB Anesthesiolog* OR AB Dematolog* OR AB “Emergency medicine” OR AB “Forensic medicine” OR AB Hospitalist* OR AB “Internal medicine” OR AB Cardiolog* OR AB Endocrinolog* OR AB Gastroenterolog* OR AB Hematolog* OR AB Oncolog* OR AB Nephrolog* OR AB Pulmonolog* OR AB Rhematolog* OR AB Neurolog* OR AB Patholog* OR AB Pediatric* OR AB Psychiatr* OR AB Radiolog* OR AB Obstetrician* OR AB Gynecolog* OR AB Medical OR AB resident* OR AB residenc* OR AB surgery OR AB surgical*))) AND (DE “Simulation” or DE “Virtual Reality” or AB “simulation” or AB simulat* or AB manikin or AB mannikin or AB mannequin or AB virtual or AB “standardized patient” or AB “standardized patients”)) AND (LA English AND DT 198101–200602 AND PT Journal Article NOT (PO animal NOT PO human))) AND (AB review* OR AB meta-analys* OR TI review* OR TI meta-analys*) | 14 |
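
The strategies tabulated above share a common shape: blocks of synonyms joined with OR, the blocks combined with AND, and an exclusion block removed with NOT. The following is a minimal sketch of that structure, not the tool the reviewers used. The helper names `or_block` and `build_query` are hypothetical, the term lists are abbreviated from the MEDLINE strategy, and the `[tiab]` tag follows the title/abstract convention shown in the table.

```python
def or_block(terms, tag="tiab"):
    """Join terms with OR, tag each with a PubMed-style field tag, and parenthesize."""
    return "(" + " OR ".join(f"{t}[{tag}]" for t in terms) + ")"


def build_query(education, population, outcomes, exclusions):
    """Combine concept blocks as ((education AND population) AND outcomes) NOT exclusions."""
    core = f"({or_block(education)} AND {or_block(population)})"
    return f"(({core} AND {or_block(outcomes)}) NOT {or_block(exclusions)})"


# Abbreviated, illustrative term lists; the full lists appear in the tables above.
education = ["educat*", "train*", "curriculum"]
population = ["internist*", "Surgeon*", "Cardiolog*"]
outcomes = ["behav*", "skill*", "outcome*"]
exclusions = ["dental*", "student*"]

query = build_query(education, population, outcomes, exclusions)
print(query)
```

Keeping each concept in its own list makes it easier to see which block a given term belongs to and to check that parentheses balance before a long query is pasted into a database interface. The language, date, and publication-type limits in the tabulated strategies are omitted here.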
To see the Data Abstraction Forms, please select the link below. This link will take you to a PDF version of the forms.
Appendixes cited in this report are provided electronically at http://www.ahrq.gov/clinic/tp/cmetp.htm