PubMed Health. A service of the National Library of Medicine, National Institutes of Health.

Adam SS, McDuffie JR, Ortel TL, et al. Comparative Effectiveness of Warfarin and Newer Oral Anticoagulants for the Long-Term Prevention and Treatment of Arterial and Venous Thromboembolism [Internet]. Washington (DC): Department of Veterans Affairs (US); 2012 Apr.



This review was commissioned by the VA’s Evidence-based Synthesis Program. The topic was nominated after a topic refinement process that included a preliminary review of published peer-reviewed literature, consultation with internal partners and investigators, and consultation with key stakeholders. We further developed and refined the key questions (KQs) based on a preliminary review of published peer-reviewed literature in consultation with VA and non-VA experts.

The final key questions (KQs) were:

  • Key Question 1. For patients with chronic nonvalvular atrial fibrillation (AF), what is the comparative effectiveness of long-term anticoagulation using newer oral anticoagulants versus warfarin on stroke incidence, mortality, health-related quality of life (HRQOL), and patient treatment experience?
  • Key Question 2. For patients with venous thromboembolism, are there differential effects of newer oral anticoagulants versus warfarin or low molecular weight heparins on recurrent thromboembolism, mortality, HRQOL, and patient treatment experience?
  • Key Question 3. For patients with mechanical heart valves, what is the comparative effectiveness of newer oral anticoagulants versus warfarin on the incidence of thromboembolic complications, mortality, HRQOL, and patient treatment experience?
  • Key Question 4. When used for long-term anticoagulation treatment, what is the nature and frequency of adverse effects for newer oral anticoagulants versus warfarin?


We followed a standard protocol for all steps of this review; certain methods map to the PRISMA checklist.56 Our approach was guided by the analytic framework shown in Figure 1.

Figure 1. Analytic framework for the comparative effectiveness of newer oral anticoagulants. Abbreviations: DTI = direct thrombin inhibitors; FXa = factor Xa inhibitors; HRQOL = health-related quality of life; KQ = key question


We searched MEDLINE® (via PubMed®), Embase®, and the Cochrane Database of Systematic Reviews for peer-reviewed publications comparing the newer oral anticoagulants to standard care (usually VKAs) from January 2001 (the year newer oral anticoagulants were introduced) through May 2011. Our search strategy used the National Library of Medicine’s medical subject headings (MeSH) keyword nomenclature and text words for newer oral anticoagulants, the conditions of interest, and validated search terms for randomized controlled trials.57 Our final search terms included new or novel oral anticoagulants; DTIs, including dabigatran and ximelagatran; FXa inhibitors, including edoxaban, rivaroxaban, apixaban, betrixaban, and YM150; and the names of the conditions of interest: atrial fibrillation, venous thromboembolism, and mechanical heart valves. We limited the search to articles published in the English language involving human subjects 18 years of age and older. The full search strategy is provided in Appendix A. Following peer review of the draft report, we conducted a supplemental search of PubMed to identify observational studies or systematic reviews that addressed adverse effects of the newer oral anticoagulants. We also examined the FDA Web site, Drugs@FDA, to identify safety concerns. The materials reviewed there included Drug Alerts and Statements and Drug Safety Communications, in addition to the Advisory Committee Briefing Documents, the Center for Drug Evaluation and Research Summary Review, and the medical and statistical summary reports on the two newer oral anticoagulants (dabigatran and rivaroxaban) that have been FDA-approved. These supplemental searches, along with an updated search for RCTs in PubMed, were conducted in February 2012. We developed our search strategy in consultation with an experienced search librarian.

We supplemented the electronic searches with a manual search of citations from a set of key primary and review articles.58–70 The reference lists of identified pivotal articles were hand-searched and cross-referenced against our library to retrieve additional manuscripts. All citations were imported into two electronic databases: EndNote® Version X5 (Thomson Reuters, Philadelphia, PA) for referencing and DistillerSR for data abstraction. As a mechanism to assess the risk of publication bias, we searched for completed but unpublished studies.


Using prespecified inclusion and exclusion criteria, two reviewers assessed titles and abstracts for relevance to the KQs. Full-text articles identified by either reviewer as potentially relevant were retrieved for further review. Each article retrieved was examined by two reviewers against the eligibility criteria (Appendix B). Disagreements on inclusion, exclusion, or major reason for exclusion were resolved by discussion or by a third reviewer.
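The two-stage screening rule described above can be sketched in Python. This is an illustration only, not the software used for the review; the function names and the string labels for decisions are hypothetical.

```python
from typing import Optional


def needs_full_text(flag_a: bool, flag_b: bool) -> bool:
    """Title/abstract stage: retrieve the full text if EITHER reviewer
    judges the citation potentially relevant."""
    return flag_a or flag_b


def resolve_full_text(decision_a: str, decision_b: str,
                      adjudicated: Optional[str] = None) -> str:
    """Full-text stage: both reviewers record a decision (hypothetical labels
    such as "include" or an exclusion reason). Agreement stands as is; a
    disagreement needs an outcome from discussion or a third reviewer."""
    if decision_a == decision_b:
        return decision_a
    if adjudicated is None:
        raise ValueError("disagreement requires discussion or a third reviewer")
    return adjudicated
```

The asymmetry is deliberate: the first stage is inclusive (one vote is enough to retrieve an article), while the second stage requires agreement or adjudication.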

The criteria to screen articles for inclusion or exclusion at both the title-and-abstract and full-text screening stages are detailed in Table 2. We modified these criteria for observational studies of adverse effects to include noncomparative studies (i.e., case reports, case series), nonrandomized comparative studies (i.e., cohort studies, case-control studies, controlled pre–post studies), and studies of any treatment duration. Studies excluded at the full-text review stage are listed with the reasons for exclusion in Appendix C.

Table 2. Summary of inclusion and exclusion criteria.


Before general use, the abstraction form templates designed specifically for this report were pilot tested on a sample of included articles and revised to ensure that all relevant data elements were captured and that there was consistency and reproducibility between abstractors. Select data from published reports were then abstracted into the final abstraction form (sample form is in Appendix D) by one trained reviewer. All data abstractions were confirmed by a second reviewer. Disagreements were resolved by consensus or by obtaining a third reviewer’s opinion when consensus could not be reached. We abstracted the following key information for each included study:

  • age
  • sex
  • indication for anticoagulation
  • baseline bleeding risk or factors associated with increased risk (e.g., creatinine >1.5, history of gastrointestinal bleeding)
  • study drug and dosage
  • comparator and quality of INR control
  • length of treatment
  • study design
  • number of subjects and retention data
  • outcomes/adverse effects
  • for case studies, the sequence of clinical events

In addition, we examined included articles for subgroup analyses of particular relevance to the population served by VHA.


Data necessary for assessing quality and applicability, as described in the Agency for Healthcare Research and Quality’s (AHRQ’s) Methods Guide for Effectiveness and Comparative Effectiveness Reviews,71 also were abstracted. For RCTs, these key quality criteria consisted of (1) adequacy of randomization and allocation concealment, (2) comparability of groups at baseline, (3) blinding, (4) completeness of follow-up and differential loss to follow-up, (5) whether incomplete data were addressed appropriately, (6) validity of outcome measures, and (7) conflicts of interest. Using these quality criteria, we assigned a summary quality score (good, fair, poor) to individual RCTs as defined by the AHRQ Methods Guide.71 The criteria were applied for each study by the reviewer abstracting the article; this initial assessment was then over-read by a second reviewer. Disagreements were resolved between the two reviewers or, when needed, by arbitration from a third reviewer. Observational studies consisted only of case studies and were not quality rated.


We critically analyzed studies to compare their characteristics, methods, and findings. We then determined the feasibility of completing a quantitative synthesis (i.e., meta-analysis) by exploring the volume of relevant literature, the completeness of the results reporting, and the conceptual homogeneity of the studies. When a meta-analysis was appropriate, we used random-effects models to synthesize the available evidence quantitatively. For three-arm studies that included more than one dose of the newer oral anticoagulant, we used data from the treatment arm using the standard FDA-approved dose. We conducted sensitivity analyses by (1) including the studies that evaluated ximelagatran, a newer anticoagulant that is not available, (2) using the other dose of the newer anticoagulant in three-arm studies, and (3) using revised data on adverse effects from the trial by Eikelboom et al.72 When there were sufficient studies, we conducted a mixed-effects analysis to compare treatment effects by drug class. These latter analyses should be considered hypothesis-generating because they consist of indirect comparisons (across studies that may differ in ways other than the drug class) and thus are subject to confounding. Heterogeneity was examined among the studies using graphical displays and test statistics (Cochran’s Q and I²); the I² statistic describes the percentage of total variation across studies due to heterogeneity rather than to chance.73 Heterogeneity was categorized as low, moderate, or high based on I² values of 25 percent, 50 percent, and 75 percent, respectively.
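To make the random-effects pooling and heterogeneity statistics concrete, the following Python sketch pools log risk ratios and reports Cochran’s Q and I². It is an illustration under stated assumptions, not a reproduction of the RevMan analyses: it assumes inverse-variance weights with a DerSimonian-Laird estimate of between-study variance, and the function names and input layout are hypothetical.

```python
import math


def log_rr_and_var(e1, n1, e0, n0):
    """Log risk ratio and its approximate variance for one two-arm study
    (e1/n1 = events/total on the newer drug, e0/n0 = events/total on control)."""
    log_rr = math.log((e1 / n1) / (e0 / n0))
    var = 1 / e1 - 1 / n1 + 1 / e0 - 1 / n0
    return log_rr, var


def random_effects(studies):
    """Pool (e1, n1, e0, n0) tuples; assumes no arm has zero events."""
    ys, vs = zip(*(log_rr_and_var(*s) for s in studies))
    w = [1 / v for v in vs]                      # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(ys) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # percent
    w_re = [1 / (v + tau2) for v in vs]          # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "rr": math.exp(y_re),
        "ci": (math.exp(y_re - 1.96 * se), math.exp(y_re + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts for illustration only; not data from the included trials.
result = random_effects([(10, 100, 20, 100), (30, 100, 15, 100)])
```

When the studies agree closely, Q falls below its degrees of freedom, the between-study variance estimate is truncated at zero, and the random-effects result coincides with the fixed-effect one.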

The outcomes for this report were binary; we therefore summarized these outcomes by a weighted-effect measure for proportions (e.g., risk ratio). We present summary estimates and 95 percent confidence intervals (CIs). When differences were statistically significant, we estimated the absolute treatment effect by calculating the risk difference. Risk difference was calculated using the median event rate from the control treatments and the summary risk ratio.74 These results are presented in the strength of evidence tables (in the Summary and Discussion section).
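The risk difference calculation described above amounts to applying the summary risk ratio to a typical control-group risk: risk difference = control risk × (risk ratio − 1). A minimal Python sketch follows; the function name and the example numbers are hypothetical and are not taken from the included trials.

```python
from statistics import median


def risk_difference(control_event_rates, summary_rr):
    """Absolute effect implied by a summary risk ratio, anchored at the
    median control-group event rate across studies."""
    cer = median(control_event_rates)   # assumed control risk
    return cer * (summary_rr - 1.0)     # negative means fewer events


# Hypothetical inputs: control event rates of 2, 3, and 4 percent and a
# summary risk ratio of 0.8 imply roughly 6 fewer events per 1,000 patients.
rd = risk_difference([0.02, 0.04, 0.03], 0.8)
per_1000 = rd * 1000
```

Anchoring at the median control rate keeps the absolute estimate tied to a baseline risk actually observed in the trials rather than to a pooled risk difference, which tends to vary more across studies than the risk ratio does.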

Because AF, venous thromboembolism, and mechanical heart valve replacement are distinct clinical entities with distinct primary endpoints, we examined the groups of studies as they pertained to these diagnoses separately. For KQ 4 (adverse effects), we analyzed common outcomes (e.g., death, major bleeding) across treatment indications. All analyses were conducted using Review Manager (RevMan) 5.1.4 (Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2011).


In addition to rating the quality of individual studies, we evaluated the overall quality of the evidence for each KQ as described in the Methods Guide.71 In brief, this approach requires assessment of four domains: risk of bias, consistency, directness, and precision. Additional domains considered were strength of association (magnitude of effect) and publication bias. For risk of bias, we considered basic (e.g., RCT) and detailed study design (e.g., adequate randomization). We used results from meta-analyses when evaluating consistency (forest plots, tests for heterogeneity), precision (CIs), strength of association (odds ratio [OR]), and publication bias ( survey). Optimal information size and consideration of whether the CI crossed the clinical decision threshold using a therapy were also used when evaluating precision.75 These domains were considered qualitatively, and a summary rating of high, moderate, low, or insufficient strength of evidence was assigned after discussion by two reviewers. This four-level rating scale consists of the following definitions:

  • High—Further research is very unlikely to change our confidence in the estimate of effect.
  • Moderate—Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate.
  • Low—Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate.
  • Insufficient—Evidence on an outcome is absent or too weak, sparse, or inconsistent to estimate an effect.

When a rating of high, moderate, or low was not possible or was imprudent to make, a grade of insufficient was assigned.76 We also considered the risk of publication bias. Publication bias was addressed through a careful search of (March 2012) for identification of any study completed but unpublished or ongoing. We did not use graphical displays (e.g., funnel plots) or test statistics (e.g., Begg’s test) because these methods do not perform well with fewer than 10 studies.


A draft version of the report was reviewed by technical experts and clinical leadership. A transcript of their comments can be found in Appendix E, which elucidates how each comment was considered in the final report.
