Kuntz K, Sainfort F, Butler M, et al. Decision and Simulation Modeling in Systematic Reviews [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Feb.
Introduction
In this chapter, we review past Evidence-based Practice Center (EPC) reports that have incorporated models and outline the specific reasons for incorporating models, the outcomes examined, and model contributions to the conclusions of the report. To complement the review of EPC reports, we also interviewed relevant EPC members about lessons learned from incorporating decision models in EPC reports.
Review Methods
Search Strategy
We searched each of the 193 evidence reports available on the Agency for Healthcare Research and Quality (AHRQ) EPC Web page (www.ahrq.gov/clinic/epcix.htm) using the keyword “model.” We read the surrounding text to distinguish statistical models, which are excluded from this review, from decision-analytic (simulation) models. We also queried EPC staff participating in interviews about modeling performed in conjunction with EPC or other AHRQ projects. (See the section below for reports in which models were used.) Our search targeted models developed by EPC members in conjunction with a systematic review.
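As a rough illustration of this screening step, the sketch below scans plain-text copies of the reports for the keyword and pulls the surrounding text for a reviewer to judge whether each occurrence refers to a statistical or a decision-analytic model. The directory layout, file format, and context-window size are hypothetical assumptions, not details drawn from the actual search.

```python
# Minimal sketch of the keyword screening step, assuming each evidence report
# has been saved as a plain-text file under reports/ (a hypothetical layout).
from pathlib import Path

KEYWORD = "model"
WINDOW = 300  # characters of surrounding text to capture for manual review (assumed)


def keyword_hits(report_dir: str = "reports"):
    """Yield (report name, excerpt) for every occurrence of the keyword."""
    for path in sorted(Path(report_dir).glob("*.txt")):
        text = path.read_text(errors="ignore")
        lower = text.lower()
        start = lower.find(KEYWORD)
        while start != -1:
            # A reviewer reads the excerpt to classify the hit as a statistical
            # model (excluded) or a decision-analytic/simulation model (included).
            yield path.name, text[max(0, start - WINDOW): start + WINDOW]
            start = lower.find(KEYWORD, start + 1)


if __name__ == "__main__":
    for report, excerpt in keyword_hits():
        print(f"--- {report} ---\n{excerpt}\n")
```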
Abstraction
Report title and identifiers, date published, EPC, model type, reported reason for incorporating the models, outcomes examined, and model contributions to report conclusions were abstracted into a summary table by one reviewer. The table was quality checked by a second reviewer. Any disagreements were resolved through consensus discussion by the reviewers.
Review Results
Out of 193 evidence reports, 10 reports and 1 supplement to a technology assessment were identified through the search process. Details of the 11 reports are provided in Table 13. Tufts Medical Center was the most frequent modeling EPC, with four reports covering the period of 1999 to 2007.29-32 The Duke University EPC was the next most frequent, with three reports during 2006 to 2007,33-35 followed by one each from Southern California RAND (2003),36 University of Alberta (2004),37 and Stanford-UCSF (2009).38 All but two developed new models as part of the study leading to the evidence report. One evidence report adapted a previously published model,34 and later refined the model further for a second evidence report.35 Seven reports modeled diagnostic tests or screening strategies along with subsequent treatments,29-34 while three reports modeled treatments only.36-38
Only three reports used models as the primary methodology to answer key questions36,37 or address the main research aim.32 The remaining seven reports used models to augment systematic review results in cases where the preliminary literature search suggested the literature would be unable to address the key question directly. The report language often did not clearly state the purpose of incorporating decision-analytic models into the systematic review. One of the main stated purposes of incorporating models into the evidence reports was to provide a link between intermediate outcomes and clinical, or patient-centered, outcomes. Other stated reasons included: simulating head-to-head comparisons otherwise unavailable in the literature, examining cost-effectiveness, and modeling a novel hypothesis for disease progression not previously described in the literature to determine its impact on the effectiveness of screening.
Models contributed to conclusions through a few main paths. Seven evidence reports drew conclusions from model results, identifying either preferred practices or no clinically important differences among alternatives.29,31,33-36,38 One analysis concluded that the evidence was insufficient to determine the optimal strategy.37 Models that relied on evidence considered to be of low quality were reported as exploratory.38 Two modeling exercises were performed to promote understanding of the interactions among the variables of an analytic framework, rather than to provide a basis for clinical recommendations.31,32
One evidence report summarized an attempt to develop a decision model to evaluate which diagnostic modalities were useful in differentiating seizure types commonly mistaken for epilepsy from true epileptic seizures.39 Diagnostic performance data from multiple sources were to be combined to model the clinical differential diagnosis accurately. However, the model was not developed because of a stated lack of available evidence with which to build it.
Interview Methods
We contacted all EPC directors and arranged telephone interviews to (1) discuss whether EPC activities have involved any decision modeling, whether “successful” (i.e., incorporated in reports) or not, and (2) identify key informants (name, current affiliation, and contact information) who were instrumental in considering, developing, and completing or abandoning modeling activities, including whether those individuals are in the same institution, are past or existing partners or collaborators, and/or have moved elsewhere.
The rationale for interviewing members from all EPCs (as opposed to focusing only on EPCs that have incorporated models) was that lessons learned would be more complete and informative if we also interviewed EPC members who had considered and attempted to incorporate modeling but decided (for reasons we wanted to discover) not to complete such tasks; EPC members who had not considered developing or incorporating models at all; and those not familiar with modeling at all.
We developed a semistructured interview guide to be used in conducting all interviews, whether by phone or face to face. The final interview guide was developed after review of the EPC reports that incorporated models with iterative participation of the Technical Expert Panel and Task Order Officers (TOOs). The interview guide is listed as Appendix B of this report. Twenty potential respondents were identified as shown in Table 2. Telephone interviews were conducted from December 15, 2009 through March 2010. In several cases, as shown in Table 14, EPC directors included additional EPC staff in the interview, requested that we speak to other EPC staff members in addition to themselves, or referred us to EPC staff members who were better able to represent that EPC's experience with the topic.
After three interviews were completed, the interview guide was shortened into a discussion guide with four main questions that focused on four themes as an organizing principle. This revised discussion guide is provided in Appendix C and summarized below.
- What research questions are most appropriate for inclusion of a decision model?
- What model outputs deliver the greatest utility to stakeholders?
- What is your working definition of a model?
- How do you determine the value of a model?
Tables of verbatim quotes for key themes were created to organize the material. The tables are provided in Appendix D.
Interview Results
Nineteen out of 20 (95 percent) individuals contacted agreed to be interviewed, representing 12 out of 13 EPCs (92.3 percent). Seven main themes emerged from the discussions with the EPC directors and designated staff.
These themes are:
- Attitudes Toward Modeling and Appropriateness of Modeling in Systematic Reviews
- Research Questions and Contexts Best Suited for Modeling
- Definitions of Decision and Simulation Models
- Evaluation of Models and Assessment of Model Outcomes
- Decision and Simulation Models Results as Evidence
- Impact of Decision and Simulation Models on Systematic Reviews
- Training Needs
Most, but not necessarily all, themes were addressed across all EPC discussions, depending on the EPC and respondents' experience with modeling. Overall, interviewees' opinions and responses tended to fall into one of two groups:
- Fifteen interviewees with experience, including individuals with personal modeling experience/expertise and those with no personal modeling experience but who are members of an EPC with modeling experience; and
- Four interviewees without experience, that is, interviewees with no personal modeling experience or expertise who also belong to an EPC with no modeling experience.
Thus, interviewees with any degree of familiarity with models (first group), whether firsthand or secondhand, tended to respond more similarly than those without experience or exposure (second group). By convention, we will refer to the former group as those “with experience” and the latter group as those “without experience” as we discuss these findings. Table 15 summarizes the key differences between interviewees with and without modeling experience with respect to the seven major themes that emerged from the interviews.
Attitudes Toward Modeling and Appropriateness of Modeling in Systematic Reviews
Those interviewees with modeling experience unanimously held positive attitudes towards modeling with respect to its benefit for systematic reviews. All reported that modeling was an important set of techniques and strategies that were applicable to the work they were engaged in, and were generally supportive of incorporating these techniques into systematic reviews. Among these interviewees, some mentioned a struggle in considering whether models were “appropriate” within a systematic review, as opposed to models being developed after a systematic review has been completed, and as a separate project. They seemed to struggle with two main issues concerning models within systematic reviews. First, is the development of a model within or beyond the scope of a systematic review of the literature? Second, should published models and related output constitute valid information that could be included in systematic reviews, and if so, how does one go about incorporating and evaluating the evidence provided by such modeling studies?
Regarding the first issue, development of a model as part of a systematic review, interviewees without experience felt that a systematic review should be limited to the synthesis and meta-analysis of all available empirical and observational evidence regarding the key questions set forth for the review. For those respondents, models are perceived to go beyond this scope of synthesizing the literature, and, as a result, they prefer to draw conclusions based solely on the empirical and observational evidence. While most felt modeling is a worthwhile endeavor and can play a unique role, sometimes even making recommendations possible, some felt that modeling is in fact beyond the scope of the “spirit” of a systematic review and should be a separate endeavor. Specifically, they pointed to the limited number of reviews with models, the infrequent inclusion of model requests, and the absence of a standard methodology for modeling in systematic reviews.
Interviewees with experience, however, were very supportive of including models in systematic reviews and felt modeling is a natural extension or augmentation of the purpose and intent of systematic reviews. In fact, many described situations in which modeling greatly improved the systematic review by addressing gaps in the literature, extending benefits and harms beyond intermediate outcomes to terminal outcomes, and offering comparisons of strategies. (This is discussed in more detail under the research questions that models are best prepared to address.) These interviewees discounted the “scope” issues identified above, attributing them to limited EPC experience with models and to the recent focus of the methods manual and methodology discussions on evidence review, grading, and meta-analysis. They all felt that, as attention shifts to modeling, reference materials and standards could be created; in fact, most interviewees looked to the results of this project as the first step toward that end. Thus, several interviewees indicated that models, when needed and appropriate, can be essential tools that belong within the realm of systematic reviews and that provisions should be made to enable the development and use of models in that context. They pointed to important hurdles that need to be addressed to make this possible and effective; those hurdles are addressed within other themes below.
With respect to the second issue, considering the output generated by published models as a potential source of evidence in systematic reviews, the consensus among the interviewees seemed to be that such information should probably be treated and graded differently. Respondents were unsure how to approach this situation. While model outputs cannot be used in meta-analyses, they can still be incorporated and discussed in the review.
Research Questions and Contexts Best Suited for Modeling
In addressing the research questions and contexts where modeling offered the most benefit, interviewees with experience again demonstrated high concordance, while those without experience, for the most part, did not offer suggestions regarding this theme. Models are well suited to address gaps in the literature and to synthesize literature from differing sources and contexts into a single representation of the empirical evidence. Research questions often involve harms or benefits that are measured with intermediate outcomes rather than the terminal outcome of interest, such as survival or disease prevention. In many cases, studies demonstrate quantitative findings for intermediate outcomes, but studies of the long-term or terminal outcomes are underway, inconclusive, or not feasible to conduct. These situations present opportunities for modeling to link intermediate outcomes with estimates of terminal benefits and harms, thus allowing systematic reviews to draw conclusions about the terminal outcomes of interest.

The comparison of testing, prevention, and diagnostic strategies was also noted as a primary area in which modeling can be of great benefit. Most remarked that the comparison of strategies and the establishment of net benefit, that is, benefit less harms, can only be determined through the use of a decision model. More generally, models are well suited for research questions in which there is a high degree of uncertainty in assumptions or input parameters, or in which there is substantial discordance among estimates from empirical studies.

In many cases, large randomized controlled trials or observational studies have not focused on specific subpopulations. In these situations, modeling can be used to simulate findings where subpopulation characteristics are believed to change the conclusions for a specific subpopulation. Modeling affords a timelier and less expensive option for addressing subpopulations than repeating an empirical study in the subpopulation. Lastly, there is great interest in the benefits modeling can bring to determining the value of information and to specifying research priorities and directions. In many cases, systematic reviews conclude with recommendations for further research, and models can be used to quantify the “value” that an additional research recommendation would contribute to the key questions of interest.
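As a stylized illustration of the linkage interviewees described, the sketch below maps an intermediate outcome (a relative risk reduction attributed to treatment) onto a terminal outcome (deaths averted) and nets the benefit against a harm rate. Every parameter value is an assumption chosen for illustration only and is not drawn from any EPC report.

```python
# Stylized sketch: linking an intermediate outcome (relative risk reduction)
# to a terminal outcome (deaths averted) and computing net benefit.
# All parameter values are illustrative assumptions.

def deaths_per_1000(baseline_risk: float, relative_risk: float) -> float:
    """Expected deaths per 1,000 people given a baseline risk and a relative risk."""
    return 1000 * baseline_risk * relative_risk

BASELINE_RISK = 0.05   # assumed 10-year mortality risk without treatment
RR_TREATMENT = 0.80    # assumed relative risk with treatment (the intermediate-outcome link)
HARMS_PER_1000 = 2.0   # assumed serious treatment-related harms per 1,000 treated

deaths_untreated = deaths_per_1000(BASELINE_RISK, 1.0)
deaths_treated = deaths_per_1000(BASELINE_RISK, RR_TREATMENT)
net_benefit = (deaths_untreated - deaths_treated) - HARMS_PER_1000  # benefit less harms

print(f"Deaths averted per 1,000 treated: {deaths_untreated - deaths_treated:.1f}")
print(f"Net benefit (events) per 1,000 treated: {net_benefit:.1f}")

# A one-way sensitivity analysis over the relative risk shows how a model can
# handle uncertainty in an input parameter.
for rr in (0.70, 0.80, 0.90):
    nb = deaths_per_1000(BASELINE_RISK, 1.0) - deaths_per_1000(BASELINE_RISK, rr) - HARMS_PER_1000
    print(f"RR={rr:.2f}: net benefit per 1,000 = {nb:.1f}")
```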
Definitions of Decision and Simulation Models
The definition of what constitutes a model also showed a high degree of similarity among interviewees with experience, although the specifics of where statistical inference ends and decision modeling begins were a source of some controversy. Interviewees without experience did not have a consistent view of model definitions and had difficulty distinguishing them from statistical techniques. Most converged on a general definition of decision modeling and simulation as the mathematical representation of a decision (or series of decisions) based on empirical input parameters, supported by a specified framework or mechanism (e.g., a particular representation of the natural history of a disease), and subject to a set of identifiable assumptions. While the majority reported a similar definition of a model, there was greater disparity in the ability to differentiate between modeling and the domain of traditional statistics; these responses exhibited some of the greatest variance within the “with experience” group. Some drew distinct lines between any statistics used for “inference” and the set of techniques used in decision analysis; this group believed that the distinction between modeling and other mathematical techniques was the intended use, that is, inferential statistics versus recommending decisions between options. Others made more specific comments, such as that decision models begin with Bayesian statistics and metaregression and extend to techniques more commonly employed in decision analysis, such as Markov modeling and simulation. There was high agreement on the general definition, but the distinctions among techniques seemed to reflect the interviewees' experience with particular techniques in particular situations.
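To make this general definition concrete, the following is a minimal sketch of a Markov cohort model, one of the techniques interviewees named. The three-state structure, transition probabilities, and utility weights are illustrative assumptions, not values from any particular review.

```python
import numpy as np

# Minimal three-state Markov cohort model: Well -> Sick -> Dead.
# Transition probabilities and utility weights are illustrative assumptions.
states = ["Well", "Sick", "Dead"]
P = np.array([
    [0.90, 0.07, 0.03],   # transitions from Well
    [0.00, 0.80, 0.20],   # transitions from Sick
    [0.00, 0.00, 1.00],   # Dead is absorbing
])
utilities = np.array([1.0, 0.7, 0.0])  # quality-of-life weight per state per one-year cycle

cohort = np.array([1.0, 0.0, 0.0])     # the entire cohort starts in the Well state
total_qalys = 0.0
for cycle in range(40):                # simulate 40 annual cycles
    total_qalys += float(cohort @ utilities)  # QALYs accrued this cycle
    cohort = cohort @ P                       # advance the cohort one cycle

print(f"Undiscounted QALYs per person over 40 years: {total_qalys:.2f}")
```

Comparing two such models that differ only in their treatment-dependent transition probabilities would yield the kind of head-to-head strategy comparison described above.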
Evaluation of Models and Assessment of Model Outcomes
The evaluation of a model, or the determination of its quality, showed high agreement within the group with experience. Although the assessment was qualitative, the majority reported that their opinion of the “quality and expertise” of the modeler who developed the decision model weighed heavily on their overall and initial assessment of the model. Most identified the lack of defined standards and methods as a major problem in the evaluation of models, and again hoped that this initiative would produce some initial draft evaluation standards. When pressed to describe the methodology used in their evaluation of a model, most reported that they routinely inspected the quality and reliability of the input parameters; the reasonableness of the assumptions; and, if available (usually in a technical appendix), the structure of the model, for example, the representation of the natural history of the disease.
The discussion of model evaluation, in most cases, transitioned to a discussion of model outputs and the methods used to assess them. All interviewees (with and without experience) described the need for standardization of model outputs as an important factor not only in accepting models but also in their practical use across research questions and policy issues. Quality-adjusted life years (QALYs) were the most frequently mentioned standard output, but interviewees were quick to critique their merits, especially the fact that QALYs represent a population-level output and are not immediately applicable at the individual level as a tool for practitioners and patients faced with important clinical decisions.

Interviewees with experience also discussed the need for standardization of the way model characteristics and model results are presented, independently of which specific outcome metrics are reported. They typically referred to the need for standard tables and graphs that would take a specific form and contain a standardized set of information regarding the model, sensitivity analyses, and reported outputs. It was clear from the comments that many believed some standardization was the best first step toward making models accessible to a greater audience, and again they saw this project as a critical step in creating such recommendations for EPCs to adopt.

With regard to specific outcomes, while many of the interviewees with experience were critical of QALYs as a measure, there was no immediate response or direction toward any other measure that would be of greater value across a wide range of research questions and decisions. The strength of QALYs is the ability to compare across diseases, treatments, and clinical issues. However, interviewees also suggested that additional outputs that are more actionable at the practitioner-clinician level need to be reported and discussed. In conclusion, most interviewees reported that output measures need to be “tuned” to the decision to be made, and the same parameters and model structure may need to produce outputs at varying “levels.”
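For reference, the population-level QALY figure interviewees referred to is conventionally computed by summing time spent in each health state weighted by a utility value and, in most applications, a discount rate. The formulation below is a standard textbook form with one-year cycles, not a calculation prescribed by the EPC program:

```latex
% Discounted QALYs with one-year cycles (standard textbook form)
\text{QALYs} = \sum_{t=0}^{T-1} \frac{u_t}{(1+r)^{t}}
```

Here u_t is the utility weight (between 0 and 1) for the health state occupied in year t, T is the time horizon in years, and r is the annual discount rate; setting r = 0 gives undiscounted QALYs.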
Decision and Simulation Models Results as Evidence
An interesting, unanticipated discussion point from these interviews, among both groups, was the consideration of evidence and where modeling fits, or does not fit, into the continuum of evidence. This feedback can be parsed into two general components: (1) the use of model and simulation results as evidence in systematic reviews of the literature, where models may or may not then be developed to address the key questions, and (2) whether the results from a model are evidence that can be evaluated and graded alongside more traditional sources such as observational studies and randomized controlled trials, or should be considered an orthogonal or even unrelated empirical finding. Although not included in the original interview guide, and thus not covered in all interviews, this was a rich topic within both groups and was mentioned in approximately two-thirds of the interviews.

Beginning with the latter point, whether models produced in systematic reviews should be considered evidence, those with experience generally stated that the outputs from a model included in a systematic review should be treated as evidence. The rigor of the systematic review methodology ensured high-quality parameter inputs to these models, as well as sensitivity analyses and model assumptions consistent with the state of the science. Further, because modeling offers specific benefits (as mentioned earlier in this section), such as addressing literature gaps, subpopulations, and the extension of intermediate outcomes to terminal outcomes of interest, such evidence would not be possible without the use of modeling and simulation. Many interviewees with experience made the distinction that this was “manufactured” or “model-produced” evidence, possibly indicating the need to categorize or otherwise identify it as a different type of evidence.
With respect to incorporating modeling and simulation results into systematic literature reviews as evidence, both groups noted the lack of standards and direction, from the methods manual or the literature, on how model and simulation results should be graded as evidence. Interviewees did not believe that current evidence-grading methodologies address the issues that model and simulation evidence present to a reviewer. Those with experience recommended that this issue be linked to the model evaluation and assessment-of-outputs topic and saw this project as an opportunity to draft an initial set of grading standards, or at least to initiate such a process as a next step. Interviewees without experience pointed to this lack of evidence standards as the principal reason to exclude modeling studies from systematic reviews. Additionally, many reported that even if standards existed, the incorporation of models as evidence was beyond the scope of a systematic review, which is charged with compiling all the available empirical evidence and thus by definition excludes “modeled” or “simulated” data. Those without experience explicitly stated that models and simulations were not on the same “continuum of evidence” as other studies and sources and in fact represented a very different data source, which, when merged with traditional evidence, created a number of issues with respect to the validity of the reviews, and thus should not be included.
Impact of Decision and Simulation Modeling on Systematic Reviews
Feedback and comments regarding how modeling and simulation potentially alter the process, scope, and conduct of systematic reviews were also addressed. The most frequent issue mentioned by interviewees with experience was whether the opportunity or need for a model and/or simulation can be determined before the project has started, and specifically before the question refinement phase has been completed and before the early-stage literature review has been conducted. If modeling is considered a part of the review, this could present a major barrier to EPCs in competing for and then conducting systematic reviews. While individuals in EPCs with modeling experience were most vocal on this issue, even interviewees with modeling experience who are in EPCs that have not conducted modeling studies reported this limitation. Often the ability to create a model depends on the availability of parameters and assumptions in the literature, which cannot be fully identified until the literature review is underway. This makes modeling difficult to include in a proposal without significant effort at the proposal stage and no guarantee of contract award. In other cases, the need for a model is not fully understood or identified until the question refinement phase has been completed. Interviewees reported that this is the natural phase in which to identify the actual question(s) of interest for the review and then to assess the best methods to address those questions.
Two general solutions were offered. The RTOP process could be augmented to include a more collaborative question refinement prior to proposal submission. Alternatively, many interviewees thought that the ability to amend a project if and when the opportunity or need for a model was identified would help mitigate these issues.
An essential issue is the resource intensiveness of models and modeling efforts. Most interviewees with experience with models in EPC reports responded that modeling efforts could easily consume 20 to 40 percent of the budget for a systematic review and thus could not be accomplished without either inclusion in the budget at project inception or an increased budget and timeline after the question refinement phase. If the opportunity or need for a model could be determined at the proposal stage, it would be included in the proposal. Because even the most experienced EPCs and interviewees have conducted models on only a few recent EPC reports and projects, it was difficult for them to estimate how frequently models could be included in proposals. For this reason, and to avoid harming their EPC's competitiveness in proposal competitions, they omit models from most proposals and hope to convince TOOs of the merits of including a model after the project has commenced and the rationale can be clearly established and communicated.
Training Needs
All interviewees reported the desire for training resources. Those with experience, and those in EPCs with experience, reported a number of possibilities for training, including seminars for those who conduct systematic reviews, the identification of resources to train other staff members assisting with reviews, and the availability of training grants to increase the capacity for model and simulation expertise through pre- and post-doctoral support. Interviewees without experience, and those in EPCs without experience, felt that the training issue needed to be subordinated to a decision or “edict” by AHRQ to include more modeling and simulation in EPC projects and systematic reviews (or, more favorably to these interviewees, as separate projects after systematic reviews are completed). Once AHRQ makes this determination, training in how to work with modelers and how to interpret models, along with other issues such as evidence grading and new methods manual chapters, would need to be addressed to support the EPCs in this work. EPCs without experience reported a reluctance to hire and develop model and simulation talent because of the lack of clarity about whether AHRQ requires these skills, how acquiring them might affect their competitiveness among EPCs, and whether acquiring these skills should be prioritized above others.
Summary
The interview responses showed a high degree of consistency among interviewees who had experience, either personally or within their respective EPCs. There was more variability in responses among interviewees without experience. Further, interviewees without experience responded quite differently from interviewees with experience, both in the content of their responses and in their ability to respond to some of the questions and themes presented. In the discussion above, much of the reporting of opinions and responses focuses on the group with experience, simply because that is where the majority of the responses on these topics came from. Overall, most interviewees had experience, either personally or within their EPC, with models and/or modeling techniques. Those with no experience, whether personally or within their EPC, expressed interest in modeling but as a separate activity beyond the scope of systematic reviews. Not surprisingly, they did not support the inclusion of modeling in systematic reviews and did not support the inclusion of model results as a potential source of evidence in the conduct of systematic reviews. The role of decision-analytic results on the continuum of evidence deserves further research.
Overall, the EPCs almost universally seem to be seeking guidance about how best to handle models and simulations. All report an awareness of the growing popularity and application of models and simulations, and many have ideas and opinions on how best to implement them within systematic reviews. There was general consensus on the need for guidelines and an extension of the methods training.