NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Vesco KK, Whitlock EP, Eder M, et al. Screening for Cervical Cancer: A Systematic Evidence Review for the U.S. Preventive Services Task Force [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2011 May. (Evidence Syntheses, No. 86.)


4. Discussion

Summary of Review Findings

Cervical cancer screening’s impact on reducing cervical cancer rates has been well-established by epidemiological evidence.144 Evidence to evaluate the most efficient and effective screening approaches, however, has changed substantially since the 2003 USPSTF review and recommendation.145 At that time, there was insufficient evidence to evaluate newer technologies, including LBC and high-risk HPV DNA screening. Largely within the past 5 years, results from eight RCTs evaluating HPV-enhanced screening strategies have been reported, with ongoing results as additional screening rounds are completed.112–117,119,120 Another updated body of evidence addresses whether LBC and CC are generally equivalent. A large RCT compared LBC to CC,108 and another large RCT compared these two cytological approaches using data from an HPV-cytology co-testing trial.107 Data from trials for newer technologies are supplemented by well-done observational studies evaluating absolute test performance. When well-done, observational studies can be viewed as superior in some ways, since they compare test performance in the same women. However, since their results represent only cross-sectional histological findings, longitudinal followup with rescreening (as in trials) is needed to determine whether any differences in detected cervical lesions represent true (likely to progress) predisease.

The USPSTF began formulating its update in 2006 with a focus primarily on evidence for newer cervical cancer screening technologies. This report also focuses primarily on studies applicable to the United States or other countries with well-developed, population-based cervical cancer screening. Thus, while some promising trials and studies have been performed in India118,129 and China,130–132 their results have not been discussed, nor do they inform our discussion and conclusions.

Table 15 presents a summary of evidence for each KQ in order, which we briefly discuss next.

Table 15. Summary of Evidence By Key Question.


Initiation of Cervical Cancer Screening

The available evidence from five studies (four of fair quality and one of good quality) cumulatively suggests no benefit to cervical cancer screening for women before the age of 21 years. The goal of cervical cancer screening is detecting and treating preinvasive lesions, and incidence of CIN2 and CIN3 does not begin to peak until women reach their late 20s. The findings of Woodman and colleagues106 and Peto and colleagues32 confirm the findings of other studies146 indicating that the prevalence and incidence of HPV infections in women younger than age 20 years are high, but most infections and cytologic abnormalities are transient. Moreover, a study by Insinga and colleagues found that the risk of false-positive smears is higher for women younger than age 25 years than for women aged 25 to 29 years (3.1 to 3.5% vs. 2.1%, respectively).104 U.S. incidence data demonstrate that ICC is rare in women younger than age 20 years.17 Overall, between 2000 and 2008, the age-adjusted incidence rate of cervical cancer among women younger than age 20 years was 0.05 cases per 100,000 U.S. women.17 By comparison, the annual age-adjusted incidence rate for breast cancer in men of all ages was 1.1/100,000.147 The high prevalence of HPV, the transient nature of cytologic abnormalities, and the rare occurrence of cervical cancer in adolescents argue against cytologic screening for women younger than age 20 years, irrespective of timing of coitarche or presence of high-risk sexual practices. In fact, screening in this population may be harmful, as it could lead to unnecessary intervention. Since CIN1 and CIN2 are likely to regress, overtreatment could potentially occur.68 Colposcopy and biopsy, which are currently the gold standard for evaluation of cervical cytologic abnormalities, and treatment of CIN may be associated with anxiety, pain, and cervical bleeding.84,85,148,149 Furthermore, certain types of CIN treatment procedures may affect subsequent reproductive outcomes.
Two systematic evidence reviews of obstetric outcomes in women with a history of CKC to treat CIN demonstrate a significantly increased risk of preterm birth (at less than 30-, 34-, and 37-weeks’ gestation) and low birthweight in infants (less than 2,000 grams and less than 2,500 grams).86,87 The two reviews differed in the impact of LEEP on obstetrical outcomes. In one review, pooled estimates demonstrated a 1.7-fold increased risk of preterm birth prior to 37 weeks and a 1.8-fold increased risk of birthweight less than 2,500 grams.86 In the other, pooled estimates demonstrated no impact of LEEP on preterm birth prior to 34 weeks or birthweight less than 2,000 grams.87 Other harms to consider are the psychological impact of labeling a woman as HPV positive, especially in a population in which HPV infections are highly prevalent and likely to regress.150–152

Whether initiation of screening in the United States should begin later than age 21 is unclear. The UK NHS Cervical Screening Programme does not commence cervical cancer screening until age 25. The large case-control study by Sasieni and colleagues was designed to determine whether screening should begin prior to age 25 in the United Kingdom.23 While the authors concluded that screening women aged 20 to 24 years would have little or no impact on rates of ICC up to age 30, there was still some uncertainty regarding its impact on advanced stage tumors (IB+) in women younger than 30.23 In June 2009, the UK Advisory Committee on Cervical Screening reviewed the practice of initiation of screening at age 25 years, and there was unanimous agreement that there should be no change in their current policy.153 However, whether this practice should be adopted in the United States is uncertain. The Icelandic study by Sigurdsson and colleagues105 supports initiation of screening in women in their early 20s, whereas the UK study was limited in power to definitively determine whether screening among this group of women is beneficial.23 Neither study provided sufficient detail to allow determination of a specific age at which screening should be initiated. Furthermore, no studies were identified that provided information on age at which to initiate cervical cancer screening using U.S. data.

Liquid-Based Cytology Compared to Conventional Cytology for Primary Cervical Cancer Screening

The studies we reviewed demonstrated that LBC and CC do not differ in relative sensitivity or absolute sensitivity and specificity. False-positive rates varied among studies. They were not significantly different between LBC and CC in the nonrandomized trials. False-positive proportions in randomized trials were slightly lower in one study and slightly higher in the other, and both results bordered on statistical significance. The randomized trials included over 130,000 women combined and, thus, were well powered to detect significant differences. Our findings that LBC and CC do not differ in sensitivity and specificity are consistent with two recently completed systematic evidence reviews of LBC with more liberal inclusion criteria.154,155 However, the systematic evidence review by Davey and colleagues performed in 2006, prior to the release of data from the NTCC and NETHCON trials, found that LBC did not reduce the proportion of unsatisfactory slides compared to CC.155 Data from the NTCC and NETHCON trials, in which thousands of women were randomized to LBC or CC, have since been published and demonstrate that LBC yields fewer unsatisfactory slides than CC.107,108 We were unable to identify any studies that reported direct harms resulting from collecting the cervical sample for LBC.

Studies of clinical practice in the United States suggest that LBC has been widely adopted despite lack of available data to support greater accuracy with LBC testing, compared to CC.94 One potential reason for the adoption of LBC is the ability to add reflex HPV testing without requiring an additional examination and specimen collection. Currently, the FDA has approved HC2 for testing patients with ASC-US cytology to determine the need for referral to colposcopy, and for use in women aged 30 years or older in conjunction with cytology to assess the absence or presence of high-risk HPV types. Since specimens for HPV testing can be collected at the time of cytologic testing without the use of LBC, sophisticated decision analysis models would need to be developed to determine whether or not the use of LBC is preferable to CC when HPV testing is desired, as there appears to be no advantage in terms of test performance to the use of LBC over CC in the absence of HPV testing. An editorial commentary by Schiffman and Solomon noted that other factors now influence the choice between LBC and CC, including issues related to laboratory productivity (LBC specimen slides are easier and quicker to scan under the microscope), slide adequacy (impact of fewer unsatisfactory slides), relative cost (LBC is more expensive than CC), and ease of ancillary molecular testing.156

HPV-Enhanced Primary Cervical Cancer Screening

The most extensive new data for cervical cancer screening technologies evaluate four potential roles for HPV in primary cervical cancer screening. However, despite recent detailed reports from five large RCTs within national screening programs in Italy, England, Finland, Sweden, and the Netherlands, available data are not yet complete, consistent, or relevant enough to determine a clear role for HPV testing as a primary cervical cancer screening method in the United States. One trial (NTCC Phase II) compared HC2 screening alone to CC alone (49,196 women screened; 27.9% younger than age 35 years),113 four trials (NTCC Phase I, POBASCAM, Swedescreen, ARTISTIC) compared co-testing (with HC2 or PCR and CC or LBC) to CC or LBC alone (127,149 women; 13.4% younger than age 30 to 35 years),113–115,117 and one trial (Finnish trial) compared primary HPV screening with cytology triage to CC alone (71,337 women; 16.2% younger than age 35).120

While all but one120 of these trials of primary HPV-enhanced screening have reported results after two rounds of screening, data needed to determine benefit, harms, and net benefit remain incompletely reported. As shown in Table 5c, reported benefits (for CIN3+ detection, Tables 16a and b) as a cumulative or second-screening round outcome (Table 17) are considered possible surrogates for cancer; however, these data also represent incomplete followup of a significant proportion of study participants in three of four co-testing trials (POBASCAM, Swedescreen, ARTISTIC).114,115,117 In addition, a planned second screening round is not yet conducted or reported in one trial (Finnish trial),120 and recent reporting of a third round in ARTISTIC does not rectify data or other concerns affecting its validity.135

Table 16a. Relative Detection Ratio By Screening Round for RCTs of HPV Screening Strategies in Cervical Cancer Screening (Women ≥30 or 35 Years).


Table 16b. Relative Detection Ratio By Screening Round for RCTs of HPV Screening Strategies in Cervical Cancer Screening (Women <30 or 35 Years).


Table 17. European Perspective in Interpreting Comparative HPV Screening Trials.


Regarding potential burden or harms, four of six trials (NTCC Phase I and II, Swedescreen, Finnish trial) representing all types of HPV-enhanced primary screening do not include data for each screening round and cumulative data, as would be necessary to interpret screening burden and potential harms (Table 18).112,113,115,120 Missing data include: proportion referred and receiving colposcopy immediately or after retesting protocols, proportion referred for retesting, compliance with retesting referrals, proportion receiving treatment, and, ideally, proportion experiencing diagnostic and treatment-related harms. Because age-specific data are critical in HPV-enhanced screening, lack of complete age-specific reporting for important benefit and harm-related measures in two of three trials including women younger than age 30 or 35 years (ARTISTIC, Finnish trial) further limits their current interpretation.117,120 Reporting of these data will more fully inform the balance between potential benefits and harms from HPV-enhanced primary screening strategies, which will be particularly important since some available metrics (i.e., colposcopy) may appear “worse” after one round of HPV testing (compared with cytology), but may look better over time if the more sensitive HPV test detected and treated earlier disease. It will also be particularly important to consider these trials’ applicability, since none of their screening strategies mimics recommended U.S. practice.

Table 18. What Data Are Reported in RCTs of HPV Screening Strategies in Cervical Cancer Screening.


How can we have so much data and yet still not know enough? The answer lies in our inability to answer two critical questions: 1) how much benefit does incorporating the more sensitive HPV test into routine screening approaches for cervical cancer provide? and 2) what are the tradeoffs in order to achieve this benefit? These issues also must be framed in a programmatic screening perspective focused specifically on cervical cancer. We illustrate these considerations using one trial, NTCC Phase II (Appendix E).

The Rationale and Potential Pitfalls of HPV-Enhanced Screening

Fair- or good-quality test performance studies (without verification or other serious biases) of one-time screening test performance clearly indicate that HC2 testing is much more sensitive than cytology alone for detecting CIN2+ (and CIN3+, based on more limited data). These data come primarily from women aged 30 to 69 years, within countries with well-developed cervical cancer screening programs. In the case of one-time co-testing (combined HPV-cytology screening), sensitivity is also superior to cytology alone, but not clearly better than HPV alone. For co-testing, test performance studies are fewer and more variable, and each study reflects a somewhat different test combination for a positive result (Tables 9a and b). There is also a potential bias toward inflated sensitivity when an adjunctive test is added to a conventional test and this combination is compared to the conventional test in the same women.14 Therefore, based on test performance studies alone, some improvement in sensitivity compared with cytology is likely if HPV testing were substituted for (or added to) cytology in primary cervical cancer screening, but the magnitude of increase is uncertain.

While some improvement in sensitivity with primary HPV screening may be likely, the degree of benefit in preventing invasive cancer cannot be determined from test performance studies alone for a number of reasons.14 First, the cross-sectional data suffer from determining sensitivity, specificity, and related predictive values for a surrogate outcome (CIN2+) and not true disease (ICC). Cervical cancer has a long preclinical period with predisease (CIN) regression, as well as progression that cannot be easily or directly studied. Regression can happen in any preclinical lesion, but appears much more likely in CIN2 or milder abnormal histological findings than in CIN3.55 If a disease that is destined to regress is detected, it represents true overdiagnosis and potentially overtreatment. As a surrogate, we can be more confident in the detection of CIN3+, given that it includes carcinoma in situ, adenocarcinoma, or ICC, and is more likely to progress and less likely to regress than CIN2+.55 Nonetheless, all CIN3+ is not clearly destined to quickly progress, leaving some uncertainty about whether increased detection and treatment confers a clear benefit in preventing ICC.55 Since cervical cancer screening consists of a program of repeated screening over time, earlier detection of precancerous lesions that would not have progressed and could be detected at a subsequent screening is not a clear benefit. Thus, for many reasons, one-time comparative test performance studies cannot provide full information on benefit, and complete data from repeated screening over time are needed. On the other hand, very high sensitivity (and corresponding negative predictive value [NPV]) is informative when considering screening interval. This concept will be covered more thoroughly in the section titled “Potential Subgroup Considerations With HPV-Enhanced Cervical Cancer Screening.”
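To illustrate why high sensitivity translates into a high NPV when disease is uncommon, the following minimal Python sketch computes predictive values from sensitivity, specificity, and prevalence using the standard definitions. The two test profiles and the 1 percent prevalence are hypothetical values chosen only for illustration; they are not estimates from the trials or studies reviewed here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a one-time test, using the standard definitions."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs (assumptions for illustration, not study estimates).
PREVALENCE = 0.01
ppv_high_sens, npv_high_sens = predictive_values(0.95, 0.90, PREVALENCE)
ppv_low_sens, npv_low_sens = predictive_values(0.60, 0.95, PREVALENCE)

print(f"More sensitive test: PPV={ppv_high_sens:.3f}, NPV={npv_high_sens:.5f}")
print(f"Less sensitive test: PPV={ppv_low_sens:.3f}, NPV={npv_low_sens:.5f}")
```

Under these assumed inputs, the more sensitive test has the higher NPV but the lower PPV, which mirrors the tradeoff described in the text: a negative result is more reassuring, while a positive result is less likely to reflect true disease.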

While we are confident there is a meaningful potential benefit from HPV screening, we also recognize the potential for harms. The same test performance studies suggesting increased sensitivity also show specificity is generally reduced (between 2.8 and 4.5%). Given that screening test specificity is critically important when the prevalence of disease is low (as is the case with cervical cancer overall, but particularly in younger age groups),17 test performance studies suggesting any decrease in specificity demand further research.157 For example, even a 2 percent decrease in specificity for a one-time screening test in 10,000 U.S. women (with 0.8/1,000 CIN2, 0.7/1,000 CIN3, 0.1/1,000 cervical cancer) would result in 200 additional women receiving further unnecessary and even harmful testing and/or treatment, compared with cytology alone.104 No more than one case of cervical cancer could be detected (even with increased sensitivity), although more predisease would be detected and treated. Given that the reduced specificity with HPV testing is for a surrogate outcome (CIN2), it cannot be determined whether any (or how much) of the presumed false-positives actually represent predisease that was appropriately detected and prevented through ongoing enhanced surveillance stimulated by a positive screening test and negative colposcopy. This is particularly possible since colposcopy is also an imperfect test. Colposcopy is the accepted reference standard, but one that can generate a false-negative or false-positive result, leading to overtreatment. 
A study of 1,176 community histology CIN1 or CIN2+ diagnoses from the NTCC trial suggested a 15 percent estimate of overtreatment, since 15 percent of CIN2 or worse diagnoses were downgraded to CIN1 or better after blinded review of all surgical and histological samples available within 1 year of colposcopy referral.158 Similarly, in the United Kingdom, possible overtreatment occurred in 26 percent of histologically confirmed CIN1 and in 18 percent of women with biopsy showing less than CIN1 findings.159 Thus, additional cumulative disease detection results, along with more complete reporting of retesting, colposcopies, treatments, and related harms from RCTs could help answer important questions about the comparative impact on benefits and harms of different screening strategies for cervical cancer in a program of repeated screening.
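The arithmetic behind the 2 percent specificity example above can be sketched as follows. This is an illustrative calculation that uses only the prevalence figures quoted in the text; the function name and structure are assumptions made for this sketch, not part of any cited analysis.

```python
def extra_false_positives(n_screened, prevalence_per_1000, specificity_drop):
    """Additional false positives caused by an absolute drop in specificity."""
    diseased = n_screened * prevalence_per_1000 / 1000
    non_diseased = n_screened - diseased
    return non_diseased * specificity_drop


N_SCREENED = 10_000
# Prevalence figures quoted in the text, per 1,000 women.
CIN2, CIN3, CANCER = 0.8, 0.7, 0.1

extra = extra_false_positives(N_SCREENED, CIN2 + CIN3 + CANCER, 0.02)
cancer_cases = N_SCREENED * CANCER / 1000

print(f"Additional false positives: about {round(extra)}")
print(f"Expected cervical cancer cases among those screened: {cancer_cases:.0f}")
```

The result is roughly 200 additional women with false-positive results, set against about one expected case of cervical cancer in the same 10,000 women, which is the imbalance the paragraph describes.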

Interim Conclusions About HPV-Enhanced Screening From Available Data

While incomplete, trial results to date, in combination with results from rigorous test performance studies in applicable populations, allow us to draw a few conclusions and point out some important caveats to interpreting trial results as they are reported going forward.

First, HPV-enhanced primary screening strategies appear most promising when focused in women aged 30 or 35 years and older, but not younger women. Women older than age 30 or 35 years represent the primary age of study participants, and also show a better balance between improved test sensitivity and reduced test specificity than do younger women (Appendix E).

Second, some HPV-enhanced screening strategies look more promising and more relevant to U.S. practice than others. Although it is premature to determine which HPV-enhanced protocol(s) might be preferable, some trial designs are more directly relevant to U.S. practice (NTCC Phase I and II, Finnish trial), primarily due to the colposcopy referral thresholds employed. According to NTCC Phase II, HPV screening alone in women aged 35 years and older may provide a benefit relative to a cytology-only strategy, but this benefit would require some initial increase in colposcopy (Appendix E). Whether some of this increase will be offset by fewer tests in subsequent screening rounds, and determining what proportion of excess colposcopy is due to increased false positives (and their related harms), cannot be determined with available data (Table 18). Also, it remains to be determined whether proportional benefits and harms reported from this trial will be directly applicable to the United States, given this study tended to use a lower cytology threshold for immediate colposcopy referral and also referred all women to colposcopy for a single HPV positive test. Based on possibly reducing the degree of relative increase in colposcopy in HPV screening versus cytology, HPV testing followed by cytology triage appears promising given its superior specificity for CIN2+ or CIN3+ lesions, compared to cytology screening alone, in women of all ages. For women aged 35 years and older only, simulations suggest relative PPV for HPV with cytology triage was the same or significantly greater than with cytology alone, while HPV screening showed significantly reduced relative PPV.121,133,160 These simulated data are interesting but preliminary, since they reflect only baseline screening results and not a full screening round (with ongoing rescreening and colposcopy referral) or cumulative screening rounds. 
Also, data from the HPV-cytology triage trial come from cytology referral protocols that are similar but not identical to U.S. practice—that is, immediate colposcopy referral threshold for LSIL+ cytology with HPV+ (ASC-US or normal cytology) managed through repeat testing. Thus, more complete results from this trial could be relatively applicable to the United States.

Third, and in contrast to the other HPV-enhanced strategies, it is not clear if any co-testing strategy reviewed here offers a clear potential for additional benefit, particularly compared with primary HPV screening (alone or followed by cytology triage). Test performance data suggest no additional benefit above primary HPV screening alone for co-testing using cytology thresholds similar to U.S. practice, although increased cost would be expected for the additional test. European trials compared co-testing strategies to cytology alone (never to HPV screening), although indirect comparisons in the NTCC Phase I and II trials in women older than age 35 years suggest HPV-cytology co-testing did not detect more CIN than HPV testing alone, but did require twice the number of colposcopies at baseline. Finally, none of these trials employed strategies to directly evaluate the other main potential benefit from co-testing, which would be a prolongation in screening interval for cytology negative, HPV negative women (as recommended in U.S. practice). Unless co-testing is completely superior to HPV testing in appropriately determining those at lowest risk for prolongation in screening interval, it is difficult to see how administering both tests will ultimately be more valuable than other HPV-enhanced screening strategies. The issue of NPV is discussed more thoroughly below (“Potential Subgroup Considerations With HPV-Enhanced Cervical Cancer Screening”).

Fourth, there is no current consensus on how to interpret these comparative effectiveness trials of cervical cancer screening. Their interpretation is impacted by the many years and large sample sizes necessary to determine true disease outcomes (cancer). Thus, available data primarily represent surrogate outcomes (precancer or combined precancer and cancer). The European trials have offered considerable expertise and perspective on the acceptability and hierarchy of program outcomes—including surrogates—which is informative (Table 17). These experts suggest that reduced CIN3+ in Round 2 or beyond may be an acceptable surrogate measure for screening program benefit,161 while also clearly acknowledging the preference for demonstrating an impact on invasive cancer incidence or mortality.162 However, these perspectives may be most applicable in countries with uniform national screening policies. The degree of confidence that U.S. clinicians and policymakers are willing to place in surrogate outcomes is key.

Similarly, interpretation of round-specific and cumulative trial results is complex. As suggested by experts, Round 1 of screening detects prevalent disease and predisease, and increased detection of predisease in one strategy relative to another may represent early diagnosis and/or overdiagnosis of regressive predisease.117 In Round 2 of screening, incident, missed, or progressive disease and predisease are detected.117 Over at least two rounds, therefore, there is some way to compare the patterns of disease and predisease detection and infer overall program performance, as well as to compare round-specific patterns between trials to explain different results. Longer followup of Round 2 results (or additional screening rounds) may be necessary, particularly to allow for more complete ascertainment in both arms and to detect an impact on cancer. Some experts evaluate the pattern of screening results by round, suggesting that increased relative CIN2+ detection in Round 1 followed by decreased relative CIN3+ in Round 2 suggests prevention of disease progression.163 Others suggest that similar cumulative CIN2+ disease detection between arms after at least two screening rounds would indicate lack of overdiagnosis144 if the same screening test (ideally including HPV) was applied in both study arms at Round 2 and after.163 Only one trial applied HPV testing using PCR to both arms in the second screening round.114 Many other differences between trials (besides whether the second round applied HPV testing or not)—including type of HPV screening strategy, colposcopy referral and repeat screening protocols, and approaches to compiling and reporting outcomes—complicate applying these types of theoretical interpretations to the current body of evidence. 
Some commentators point out that the co-testing trials (with the exception of NTCC Phase I) actually test primary HPV screening with cytology triage, since all the trials use a cytological referral threshold only for immediate colposcopy.163 However, these trials actually have a safety-net in place for women with HPV-cytology-positive lesions, since all women receive both tests with referral for high-grade cytology alone.

Fifth, there are a number of important potential biases that will need to be carefully considered when interpreting more complete reporting from trials. Large comparative effectiveness trials of cervical cancer screening embedded in national screening programs use a pragmatic design that offers many advantages.164 However, there are several important biases to consider in their ultimate interpretation. As is well recognized, there is a potential for verification bias in any screening study that does not apply the gold standard to all who are screened, regardless of outcome.102 In these real-world trials, only those screening positive possibly receive the diagnostic colposcopic evaluation. Therefore, as with observational studies, their main value in terms of estimating test performance is limited to relative test performance results. Similarly, any outcome interpretation is affected by the proportion receiving the diagnostic test. Possible ascertainment bias could occur if there are between-arm differences in the proportion complying with the recommended diagnostic test. Sufficient time for followup is also critical, given that diagnostic tests can be recommended immediately after screening or after a year or so of retesting and confirmation of initially abnormal screening results. Finally, the comparison of two tests (HPV vs. cytology) or the use of adjunctive tests (HPV plus cytology vs. cytology) in a randomized design can still be complicated by asymmetry bias in ascertainment if results do not represent sufficient long-term followup.14 Between-arm differences in predisease detection can occur even if the new test performs at random, when more women are selected for colposcopy due to the detection of incipient lesions that would not otherwise have been found. Thus, sufficient long-term followup and use of outside registry data to get a better estimate of the rates of true disease and predisease are important.

Potential Subgroup Considerations With HPV-Enhanced Cervical Cancer Screening

Beyond the impact on disease detection, there may be other subgroup considerations for an HPV-enhanced screening strategy. HPV screening introduces potential individual patient-level as well as population-level benefits, such as using negative test results to stratify women into low-risk groups in which screening intervals may be safely lengthened. International experts have noted that the NPV of adding an HPV test to cytology (or substituting HPV for cytology) may be a major utility of HPV-enhanced primary screening.165 Thus, this is an important endpoint for ongoing European trials, which has been partially reported to date.166 On the other hand, issues of how to best manage women with mixed results—particularly those who are HPV positive but cytology negative—are equally critical. For all women with inconclusive testing results, safety of any tailored screening strategies along with data on psychological effects, including compliance with rescreening, will be critical.

HPV negative/cytology negative subgroup considerations (Table 19). Meta-analyses of cross-sectional results have confirmed the high NPV of negative results for combined HPV/cytology testing.165,166 Some European trials have reported longitudinal results for this subgroup. The POBASCAM trial estimated that after a combined negative high-risk HPV test result and negative cytology, the 5-year cumulative risk of CIN3+ lesions per woman screened was 0.1 percent (95% CI, 0.1% to 0.2%), which was lower than the risk for women who did not receive an HPV test at baseline but had negative cytology (0.8% [95% CI, 0.6 to 1.0]).114 Almost half of CIN3+ cases (3/8) detected in the subsequent screening round 5 years later in those initially HPV negative/cytology negative were in women who tested HPV+ in the second round.114 Post hoc analyses demonstrated little prognostic benefit for co-testing above HPV testing alone, since the 5-year cumulative risk of CIN3+ after a negative high-risk HPV test was 0.2 percent (95% CI, 0.1 to 0.3). Two other trials (ARTISTIC, Swedescreen) have reported interim data that are consistent with a very low risk of CIN3+ in those negative for HPV and cytology at rescreening after 2 to 3 years.115,117 However, a lower proportion of HPV negative/cytology negative women completed Round 2 screening in ARTISTIC (60%) than women with at least one positive test did. This affects assessment of true CIN3+ risk, but also raises questions about whether women who test double-negative might not comply with future screenings. 
Reporting from a third screening round in ARTISTIC confirms a longer-term (6-year) reduced risk of CIN3+ (0.28%) in those women who were HPV negative that is indistinguishable from those who were HPV negative/cytology negative.135 Since these data are reported only in those women undergoing three rounds of testing, however, they represent only 36.2 percent of the original cohort, and could represent selective ascertainment due to incompletely reported data.

Table 19. Cumulative Incidence of CIN3+ By Baseline Testing Status of RCTs and Cohort Studies With Long-Term Followup Data.

These short-term, trial-specific data are supplemented by large, longitudinal cohort studies and pooled data. A multinational European joint cohort study with pooled data on 24,295 women examined cumulative incidence of CIN3+ among women with adequate cytology and HPV testing at baseline and at least one followup cytological or histological test.167 During 6 years of followup, 1.6 percent of women developed histologically-confirmed CIN3+. The cumulative CIN3+ incidence rate among women that tested negative for HPV (generally HC2) and on cytology (less than ASC-US) was 0.28 percent (95% CI, 0.12 to 0.45). There was little difference in CIN3+ development between women with negative results on both tests and women negative for HPV only. The rate of CIN3+ development over 6 years in women who were HPV negative was significantly lower than among women who had negative cytology results (0.97% developed CIN3+ over 6 years). Results for CIN2+ were essentially the same, but with a higher number of cases. These data are limited by verification bias (only test positives according to initial and rescreening protocols were uniformly assessed for disease outcomes), with between-study differences in protocol, as seen in trials in this review. Nonetheless, CIN3+ detection rates were generally consistent and low across studies in HPV negative/cytology negative women, despite their likely participation in ongoing cervical cancer screening.

In a prospective study of 20,810 women (mean age, 35.9 years) in Kaiser Permanente Northwest, the risk of CIN3+ was 0.16 percent (95% CI, 0.08 to 0.24) after almost 4 years of followup in 17,592 women with negative cytology and high-risk HPV tests.168 In women who were HPV negative, the 10-year cumulative incidence of CIN3+ was 0.87 percent (95% CI, 0.62 to 1.12), lower than the cumulative incidence in women with ASC-US+ baseline cytology (1.38 percent [95% CI, 1.10 to 1.67]).

Among Danish women who tested negative for high-risk HPV, only 8 percent of those aged 22 to 32 years and 7 percent of those aged 40 to 50 years developed an abnormal Pap smear over 10 years, with each woman receiving a median of three tests. For both age groups, most abnormal smears were atypia only, with about one-third reflecting severe dysplasia.169 The absolute risk of CIN3+ in HPV negative/cytology negative women at 3 years was 0.2 percent in younger women and 0.08 percent in older women, at 5 years it was 0.8 percent in younger women and 0.4 percent in older women, and by 10 years it was 3.1 percent in younger women and 1.7 percent in older women. Compared with women with two negative tests, age cohorts that were cytology negative but HPV positive had markedly increased relative risk for CIN3+ at 3 years (younger women: RR, 11.0; older women: RR, 53.8), 5 years (younger women: RR, 6.9; older women: RR, 23.3), and 10 years (younger women: RR, 4.4; older women: RR, 12.5).
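
As a rough worked illustration of the Danish figures above, the reported relative risks can be multiplied by the absolute risks in double-negative women to approximate the absolute CIN3+ risk implied for cytology negative/HPV positive women. This calculation is not part of the cited analysis. The helper name `implied_risk_pct` is hypothetical, and the inputs are the rounded values quoted in the text, so the results should be read as approximations only.

```python
# Illustrative arithmetic only; values are the rounded figures quoted above.
# Keys are (age group, years of followup).
baseline_risk_pct = {  # absolute CIN3+ risk (%), HPV negative/cytology negative
    ("younger", 3): 0.2, ("older", 3): 0.08,
    ("younger", 5): 0.8, ("older", 5): 0.4,
    ("younger", 10): 3.1, ("older", 10): 1.7,
}
relative_risk = {  # cytology negative/HPV positive vs. double negative
    ("younger", 3): 11.0, ("older", 3): 53.8,
    ("younger", 5): 6.9, ("older", 5): 23.3,
    ("younger", 10): 4.4, ("older", 10): 12.5,
}


def implied_risk_pct(group, years):
    """Approximate absolute risk (%) as baseline risk x relative risk.

    Assumption: the published RR applies multiplicatively to the rounded
    baseline risk; this is a simplification, not the study's own estimate.
    """
    key = (group, years)
    return round(baseline_risk_pct[key] * relative_risk[key], 1)


if __name__ == "__main__":
    for key in sorted(baseline_risk_pct):
        group, years = key
        print(f"{group} women, {years} years: ~{implied_risk_pct(group, years)}%")
```

Under these assumptions, the implied 3-year risk is roughly 2 percent in younger women and roughly 4 percent in older women, which shows how a very low baseline risk can coexist with a large relative risk.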

Among 8,735 women aged 30 to 60 years participating in the United Kingdom HPV in Addition to Routine Test (HART) trial, a randomized evaluation of management strategies for women who tested positive after co-testing, the high NPV of a negative HPV test was confirmed.159 After a minimum of 5 years, cumulative CIN2+ was about half as common in women who were HPV negative at baseline compared with those who were cytology negative (0.23% and 0.48%, respectively). Since most differences between the two tests occurred in the first year, differences may reflect poorer sensitivity of cytology. The hazard ratio for cumulative CIN2+ increased dramatically with the HPV relative light unit (RLU) levels. Compared with a typical negative result (<1 pg/ml), the hazard ratio was 5.4 (95% CI, 1.7 to 18.2) for an HPV RLU of 1–10 pg/ml and 25.2 (95% CI, 13.6 to 47.9) for an HPV RLU ≥10 pg/ml (p<0.001 for trend). Data were not reported for CIN3+.

These trial and cohort data clearly indicate that one potential value of a screening program with initial HPV testing could be extended screening intervals for the majority of women who test negative. Among participants in co-testing trials, this group represents a very large proportion of those screened at baseline: from 78 percent in both arms (combined) in ARTISTIC, to 88 to 93 percent in the co-testing arms of Swedescreen, NTCC, and POBASCAM. Thus, if an extended interval for repeat screening is shown to be safe and effective (as well as workable within the clinical, social, and political realities of cervical cancer screening in the United States), it would be appropriate for the vast majority of women aged 35 years and older after a single round of screening that included HPV testing.24

HPV negative subgroup considerations. Based on the data discussed above, women screening negative on HPV testing have a reduced long-term risk of developing CIN3+ that is nearly identical to that of women who are HPV negative/cytology negative. The high NPV associated with HPV negative testing alone, particularly in older women, might inform extended screening intervals for such women, with no need for cytology testing at all in HPV negative women.

HPV positive/cytology negative subgroup considerations. A concern among programs that involve combined HPV-cytology screening initially or in sequence is how best to manage HPV positive/cytology negative individuals. Data from large cohort studies show that women with HPV positive/cytology negative results experience a continuously increasing cumulative incidence rate that reaches 10 percent (95% CI, 6.2 to 15.1) after 6 years.167 As indicated in Table 5b, trials varied in their approach to management of these individuals in terms of timing of repeat screening, rescreening tests utilized, and colposcopy referral thresholds. A detailed analysis and comparison of these differences and associated outcomes in this important subgroup would be important, but our review (consistent with others’ findings)170 suggests that additional details beyond those currently published would be needed to fairly compare different protocols. Any modeling of co-testing would need to carefully consider between-study details about rescreening protocols, compliance, and the impact on results. Furthermore, research continues to identify the role of specific HPV subtypes (particularly 16, but also 18, 31, and 33) and persistent infection by these types in further specifying high risk for CIN3+.171 More specific management of this subgroup could be informed by better risk prediction. Similarly, genotyping may also play an important role in the future for risk-stratification into tailored screening strategies.172

Cytology Screening With HPV Triage (Reflex HPV) for ASC-US or LSIL Cytology

Overall, results from observational studies suggest that HC2 is somewhat more sensitive than repeat cytology at a colposcopy referral threshold of ASC-US+ for the detection of CIN2+ (but not clearly CIN3+) lesions among women with ASC-US referral cytology, with no further advantage when CC is added to HPV triage, but a possible increase in false positives. Age-stratified results were generally not available, but many studies (besides ALTS) represent women primarily older than age 30 years. Our findings from a much more limited meta-analysis agree with previous meta-analysis results reported by Arbyn and colleagues.173,174 HPV testing was more sensitive and equally specific for detection of CIN2+ for the triage of ASC-US+ results, compared to repeat cytology, with no benefit for HPV triage of LSIL+ cytology.

Trial results suggest reduced specificity (more false positives and colposcopies) for CIN2+ or CIN3+ with HPV compared with CC triage, particularly (but not exclusively) in women younger than age 30 years. The higher prevalence of transient HPV infections in younger women may play a role here. In contrast, the use of an HPV triage test clearly provided no substantial advantage for referring women with LSIL to colposcopy. This may reflect a high prevalence of HPV among women with LSIL cytology results (58.9 to 94.8%). Other studies have suggested potential value for HPV triage of LSIL in women older than age 45 to 50 years, if they represent a group in whom the co-occurrence of HPV is lower and if the HPV negative group has a low risk of CIN3+ over the time period until the next screening.175 One small study suggests that an HPV triage strategy for ASC-US or LSIL would produce three false positives for every two with repeat cytology in women aged 30 years and older, and four false positives for every two with repeat cytology in younger women (Appendix C Table 3).119 These estimates are imprecise due to the small number of women in the study and because the authors include both ASC-US and LSIL triage in their calculation, which inflated the number of referrals in the HPV arm. Trials reviewed here reported simulations of various triage and repeat testing strategies following primary HPV screening or primary cytological screening.117,119,121 None of these had (or reported) cumulative screening round data to simulate different triage strategies within a program of screening.

The studies we included to evaluate HPV triage of abnormal cytology included women from a broad age range (range, 15 to 78 years; mean or median range, 27 to 35 years), but provided minimal age-stratified data. While these studies found that overall HPV testing was not useful for the triage of LSIL cytology due to the high prevalence of HPV among women with LSIL cytology, one might postulate that the low HPV prevalence among older women could potentially render HC2 useful for triage of LSIL cytology in the older age groups. However, the study by Peto and colleagues, included for KQ1, demonstrated that there was no trend of decreasing high-risk HPV prevalence with age among women with abnormal cytology.32 In an article from the ALTS trial, the authors concluded that HPV triage of LSIL cytology was not useful at any age range, since the proportion of HPV positivity among women with LSIL did not decline dramatically with age.176

Harms of HPV Testing

In addition to concerns about false-positive testing and related harms, we identified four studies that described potential psychological harm from HPV testing.139–142 In the short term (first few weeks after receiving test results), women who tested positive for HPV had higher levels of anxiety and distress and greater concerns about their health and health risks. In the long term, however, these effects did not persist. In fact, when considering triage of ASC-US cytology with an HPV test versus repeat cytology, long-term followup suggests greater satisfaction with care and less distress among women undergoing HPV testing. This may be because women who undergo repeat cytology have to wait for additional results before it is determined whether or not they need colposcopy, whereas women undergoing HPV testing are triaged much more quickly.

The evidence about harms of HPV testing is limited. Only two of the four included studies present long-term followup, there was a small number of women included in the followup, only one study administered questionnaires prior to cytology and HPV testing, and all studies had large proportions of women who did not return the study questionnaires. Larger studies with longer-term followup, assessment of psychological measures pre- and post-test, and adjustment for baseline psychological measures and appropriate confounders are needed to determine the psychological impact of HPV testing.

Are All HPV Tests the Same?

HPV testing has been approved by the FDA for cervical cancer screening in women older than age 30 years as co-testing and for triage of ASC-US cytology. Whether HPV testing was a more sensitive indicator than cytology for recurrent or residual CIN following treatment was not included in this review.166

The vast majority of data reviewed in this report, from both trials and observational studies, reflects a clinically validated commercial assay, HC2, with a much smaller body of evidence evaluating PCR testing using GP5+/6+ probes. Newer FDA-approved tests were not part of our original scope, although we did not exclude any trials for that reason. As evidence on HPV testing is translated into practice, particularly into screening programs, users should consider whether tests other than HC2 will produce results similar to those shown in research. In widespread screening, even small differences in test performance may have a large detrimental impact.177 HPV testing is a very complex molecular diagnostic assay whose analytic and clinical validity are affected by issues such as the number of HPV genotypes tested,177 number of viral copies required, and other factors.178 Users should be aware of potential differences in expected test performance between validated well-studied tests and other, less-well-studied tests. Those choosing to use a less-well-studied test should ensure that it meets the minimal performance standards discussed below.

Some data suggest that PCR may not be equivalent to HC2 in absolute test performance122 or have shown heterogeneous sensitivity and specificity estimates when pooled, perhaps due to use of different primers in detection of amplified sequences.144 Although differences may be amenable to better quality control, care should be taken to ensure expected test performance before substituting another HPV assay for proven tests in large-scale screening programs. Furthermore, as outlined in a recent article, FDA approval of newer HPV technologies may not always include a complete consideration of their comparative performance relative to HC2, or their overall clinical performance (both sensitivity and specificity) in a program of screening.179 Kinney provides a cogent argument, with examples taken from package insert data for one recently approved HPV test, illustrating that HPV tests with good analytic sensitivity should not be assumed to have clinical test performance equivalent to that of HC2, and that differences in clinical performance, particularly related to specificity, could have a large impact on cervical cancer screening programs in terms of costs and potential harms.179

An international group of experts has proposed minimum relative sensitivity (0.90) and specificity (0.98) thresholds to be determined in direct test performance comparisons with HC2 before clinical use of newer high-risk HPV tests in cervical cancer screening. Newer tests should also be highly reproducible (agreement >87%, minimum 500 samples).162,180 U.S. experts have made similar recommendations.163 Criteria have also been articulated to guide policymakers about when good clinical test performance data can allow substitution of a diagnostic test into proven clinical use without new RCTs.157,162 These same standards should apply to the substitution of different screening tests than those proven in RCTs or convincing epidemiological evidence.
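
The proposed minimums above can be read as a simple checklist. The following is a minimal sketch of such a check, not the expert group's formal validation procedure: the class and function names are hypothetical, the comparison uses point estimates only, and treating the sensitivity, specificity, and sample-size thresholds as inclusive is an assumption. A formal comparison with HC2 would also need to account for statistical uncertainty.

```python
from dataclasses import dataclass


@dataclass
class CandidateTestResults:
    """Hypothetical summary of a head-to-head comparison with HC2."""
    relative_sensitivity: float       # candidate sensitivity / HC2 sensitivity
    relative_specificity: float       # candidate specificity / HC2 specificity
    agreement: float                  # reproducibility, as a proportion (0 to 1)
    n_reproducibility_samples: int    # samples used to assess reproducibility


def meets_proposed_minimums(results: CandidateTestResults) -> bool:
    """Point-estimate check against the thresholds quoted in the text.

    Assumption: no confidence intervals or non-inferiority testing are
    applied here; this is a simplification for illustration.
    """
    return (
        results.relative_sensitivity >= 0.90
        and results.relative_specificity >= 0.98
        and results.agreement > 0.87
        and results.n_reproducibility_samples >= 500
    )


if __name__ == "__main__":
    # Made-up example values, not data from any real assay.
    example = CandidateTestResults(0.95, 0.99, 0.90, 600)
    print(meets_proposed_minimums(example))
```

A candidate test failing any single criterion (for example, a relative specificity of 0.97) would not meet the proposed minimums under this reading.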

Age at Which to Stop Cervical Cancer Screening

We did not systematically evaluate the literature regarding the age at which cervical cancer screening should be discontinued. This topic was systematically reviewed in the previous USPSTF evidence review.99 Based on fair-quality evidence obtained from 12 cohort studies, the review reported the following conclusions.

  1. The incidence and prevalence of high-grade cervical lesions and cancer decreased with age. The peak incidence or prevalence varied with type of lesion (e.g., CIN1 and CIN2 versus CIN3), but in general, women older than age 65 years had the lowest burden of disease.
  2. The age-related decrease in cervical disease was similar in previously unscreened women.
  3. There was no difference in the aggressiveness of invasive cancer in older women compared with younger women.
  4. Repeat screening after negative smears was associated with a reduced risk of high-grade cytologic abnormalities.

Evidence identified during the course of our review confirms the previous review’s findings of reduced rates of abnormal cytology and detection rates for CIN3+ as women age and with subsequent screenings.32,104,181 Data from two rounds of cervical cancer screening (750,591 cytology tests from the first round and 373,851 from the second) from the CDC’s National Breast and Cervical Cancer Early Detection Program (NBCCEDP) demonstrate that the percentage of abnormal cytology results decreases with age and with subsequent screenings.181 The percentage of cytology results that were classified as abnormal on first screening decreased fairly linearly with increasing age, from 33 percent of cytology tests in women aged 18 to 29 years to 14 percent in those aged 65 years and older. The percentage with HSIL or SCC also decreased with age from a high of 2.4 percent in 18- to 29-year-olds, but plateaued and was similar among those aged 40 years and older (0.4 to 0.6%). Age-specific detection rates for CIN3+ decreased linearly with age from 14.6 per 1,000 cytology tests in women aged 18 to 29 years to 2.0 per 1,000 in those aged 65 years or older. CIN3+ rates were fairly similar among all women aged 50 years and older. For all ages, rates of abnormal cytology or histology were reduced on second screening, but the age gradient was maintained, with relatively higher rates of cervical abnormalities in younger women than older women. Women aged 40 years and older, particularly those aged 65 years or older, experienced a smaller proportional reduction from first to second screening in rates of abnormal cytology and biopsy-confirmed CIN than younger women.

In a study from a UK cohort screened between 1988 and 1993 and less than 5 years after a normal screening smear, the annual incidence of CIN3+ decreased as women aged, from a high of 4.07 per 1,000 per year for ages 25 to 29 years to 0.19 per 1,000 per year for ages 60 to 64 years.32 Incidence of CIN3+ in those women aged 65 to 69 years was somewhat higher (1.39 per 1,000 per year), but was comparable to the incidence in young women aged 15 to 19 years (1.56 per 1,000 per year). Similarly, in the Kaiser Permanente Northwest population, the highest incidence of CIN3 (6 per 1,000 routine smears) was in women aged 25 to 29 years, with 0 to 1 CIN3 cases per 1,000 routine smears in women aged 60 to 79 years, which was lower than the 15- to 19-year-olds (2 per 1,000).104 In this study, there was a sharp decline in the yield of CIN2 and CIN3 with screening in women older than age 30 years, with only 2 cases of high-grade CIN identified in 5,488 routine smears in women aged 60 years and older.104 Incidence of cervical cancer after three consecutive negative screening tests was found to be the same after 10 years followup in 445,000 women aged 30 to 44 years compared to 219,000 women aged 45 to 54 years, suggesting that the risk among well-screened women is the same among middle-aged women (30 to 65 years).182

At present, there remains no consensus regarding the age at which to discontinue cervical cancer screening,183,184 and countries with screening policies recommend stopping after an adequate screening history at different ages: ages 59 to 60 years (Sweden, Finland, Japan), ages 64 to 65 years (England, Spain), and age 69 years (Australia, Canada, Norway).183 The United States may be the only country consistently screening some women older than age 65 years (an estimated 43 to 66% during one 3-year period), and one epidemiologist has recently noted that ecologic data from all of these countries suggest that the United States is also the only one of these countries that has achieved a relative downward trend in the incidence of cervical cancer in women older than age 65 years.183

However, reducing the burden of cervical cancer on older women is likely best achieved by focusing on screening those who have not been adequately screened. In a recent review on screening intervals and age limits, Sasieni and Castanon note that a Markov model for disease progression produced by Fahs and colleagues determined that screening women older than age 65 years with previously adequate screening history would be inefficient;185 in contrast, triennial screening of women who have not been adequately screened would reduce mortality by 74 percent.183 Sasieni and Castanon state that the inefficiency is primarily because more smears are required, less CIN is detected as women age, and there are other competing causes of death. In addition, disease progression from CIN to cancer is believed to be relatively slow, and only a proportion of CIN cases will progress to cancer (20 to 30% within 5 to 10 years).183 These authors point out that most guidelines around the world suggest that screening should cease by age 65 years, provided women have an adequate screening history.183

Defining an “adequate screening history” is not entirely clear-cut, except among those who have never been screened. Published reviews suggest that about half of all invasive cervical cancer cases are diagnosed in women who have never been screened or have not been screened within 5 years.18,186,187 Given this, the NBCCEDP program has shifted its focus to target women older than age 40 years who are at greater risk for never or rarely having been screened.181 Among the 465 cases of ICC detected between 1995 and 2001 in the NBCCEDP program, 31 percent reported no prior screening before entry into the program. Among women aged 18 to 29 years, 25 percent reported no previous screening, compared to 42 percent among those aged 65 years and older. Data from the UK Audit of Screening Histories also suggest that older women with cervical cancer are less likely to have ever been screened than younger women with cervical cancer or age-matched controls.183 According to the UK data, approximately 70 to 80 percent of women aged 20 to 49 years with cervical cancer had ever been screened, compared to fewer than 50 percent of women aged 60 to 69 years. The proportion ever screened among the young women with cervical cancer did not appear to differ from their age-matched controls, whereas the proportion of women aged 60 to 69 years with cancer who had ever been screened was 20 percent less than age-matched controls. Only about 25 percent of women aged 60 to 69 years with invasive cancer had a negative smear within 5 years, compared with 60 percent of age-matched controls.

The results of previous screening episodes may also be associated with risk. As already discussed, a large observational study in the Netherlands found the same cumulative incidence of ICC after three consecutive negative smears in women aged 45 to 54 years as in women aged 30 to 44 years.182 Another observational study in Italy found nearly an eight-fold lower cumulative risk of CIN2+ in women aged 50 to 64 years compared to those aged 25 to 49 years after three previous negative screens.188 The effect of a history of negative screening results on risk in older women is not clear from these studies, although differences between older and younger women may be less for cervical cancer than for precancerous lesions. Researchers are beginning to factor in considerations such as new sexual partners in increasing risk for HPV infection (or reinfection) in older adults.189

Women previously treated for CIN have a higher risk of later cervical cancer. A cohort study in Finland found increased risk of cervical cancer in women treated for any CIN, compared to a standard population (standardized incidence ratio, 2.8 [95% CI, 1.7 to 4.2]),83 although no increase in cervical cancer mortality was found in the same cohort.190 Another cohort study in Sweden found increased cervical cancer risk after CIN3 treatment (standardized incidence ratio, 2.34 [95% CI, 2.18 to 2.50]), with greater risk for women aged 50 years and older, compared to younger women.191

Older women are currently disproportionately represented in the unscreened and underscreened population—with 83.1 percent of those aged 60 to 64 years receiving recommended screening versus 87.6 percent overall, according to 2008 Behavioral Risk Factor Surveillance System data—as are some minority (American Indian/Alaska Native and Asian/Pacific Islander) and non-English speaking women.184,192 Black women, despite having slightly higher than average rates of compliance with recommended cervical screening,192 have increased age-specific cervical cancer incidence that does not peak but continues to increase with age193,194 to about 26 per 100,000 women at ages 85 years and older (Table 2 and Figure 1). Both black and Hispanic women have higher age-adjusted incidence rates for cervical cancer than nonHispanic whites,17,193,194 and these minority groups, along with American Indian/Alaskan Natives, also have higher age-adjusted cervical cancer death rates.17 Therefore, these groups remain important populations in which to ensure adequate screening, both for older and younger women. Age-adjusted incidence rates for Asian and Pacific Islander women were somewhat higher than for nonHispanic white women from 2000 to 2008, but mortality was similar between the two groups.17

In summary, newly available data do not contradict current USPSTF recommendations to discontinue routine cervical cancer screening for women older than age 65 years who have had adequate screening with negative results and who are not otherwise at high risk for cervical cancer. Older women with a history of treatment for CIN represent one high-risk group who could continue screening. In the future, factors such as the use of HPV testing, HPV genotyping, and sexual history might help further define a cohort of older HPV negative women for whom screening could be safely discontinued.32,195,196

Limitations

This review has several limitations. While our literature search was extensive and the included studies covered an international population of women, we only included studies that were written in the English language. We further focused our results and discussion to primarily consider studies most relevant to the United States, which excluded countries without well-developed population screening for cervical cancer in place. Most included studies addressed women aged 30 to 60 years, with almost no data in women older than age 65 years and limited data in younger women. Age-specific data were not always reported or did not always use the same thresholds when reported. Thus, women aged 30 to 34 years were variously grouped with older or younger women, depending on study reporting. We did not systematically review data related to screening intervals, age at which to stop screening, or automated cytologic screening technologies, of which the latter two were covered in the previous review by Hartmann and colleagues.99 Automated cytologic screening technologies were excluded from this review due to the limited audience for these data among primary care providers. Furthermore, HC2 was the only HPV test available in the United States when the scope of this review was determined, and thus we limited our review to use of HC2 and PCR only.

Two large studies, one evaluated for inclusion in KQ2 (Guanacaste study) and one for KQ3 (HART), did not meet eligibility criteria for this review. Appendix D delineates the rationale for their exclusion. Briefly, the final histologic diagnosis in the Guanacaste study included results of the screening tests. Additionally, the reference standard of colposcopy and biopsy was not systematically applied. The main limitation of the HART study is that it is a randomized trial of management options after co-testing or HPV with cytology triage rather than a test of an HPV-enhanced screening strategy compared with cytology. HART also has risk of verification bias, given that there was differential loss to followup for colposcopy referral among the study arms. Other issues in using results to estimate absolute test performance include uncertainty about the timeframe within which colposcopy and biopsy were provided and lack of blinding of colposcopists to cytology results (with perhaps the ability to guess HPV results). Longer-term followup with linkage to registries can overcome some of these limitations, particularly for examining NPV.

Our review made a dedicated effort to consistently analyze and report the most policy-relevant data from recent trials of screening programs involving HPV for the USPSTF’s recommendation process. However, there are many publications associated with each of these trials, with updated results coming out over time. Some of the data that we indicate as not reported might have been missed in an ancillary publication or could become available through author requests or soon-to-be-available publications. Thus, findings from this report will need frequent updating with more complete data from trials. We found little data on age at which to begin screening or risk factors that may modify when screening should begin, such as age at first intercourse. While the available studies did not present data with sufficient granularity to make a specific age recommendation at which to commence screening, they do suggest that screening women younger than age 20 years is of little value, given the low incidence of cervical cancer in this age group and the potential harms of unnecessary evaluation and treatment.

Providing data related to the cost-effectiveness of HPV in any screening strategy was beyond the scope of our review. ARTISTIC investigators have conducted an extensive economic evaluation associated with that trial.197 Results suggest it would not be cost-effective to screen with cytology plus HPV (co-testing) compared with cytology alone. In this analysis, however, simulated primary HPV screening with cytology triage (or HPV triage of cytology) was cheaper than cytology screening without any HPV. A head-to-head trial comparing these two strategies is currently under way in Canada, with results expected in 2014 (Appendix F).198

Studies of HPV triage of ASC-US and LSIL cytology were limited by the lack of age-stratified results, and only two studies provided data for the outcome of CIN3+. The results of the ALTS trial were limited by a study design that does not mirror current clinical practice. In the ALTS trial, women were referred for colposcopy if their cytologic diagnosis was HSIL, which is a higher threshold for referral than what is commonly used in clinical practice.68 In addition, the immediate colposcopy arm would represent the results of colposcopy after one abnormal cytology result. In the clinical setting, among women with no prior history of CIN, colposcopy is usually performed after two ASC-US cytology results have occurred.

Another potential limitation of this review is that most trials and studies used colposcopy and/or biopsy as the reference standard. In some included studies, the biopsy was taken at standard cervical positions, but in many studies only abnormalities visible on colposcopy were biopsied, with a negative colposcopy interpreted as absence of disease. Colposcopically-directed biopsy is not 100 percent sensitive for the detection of preinvasive disease. The Shanxi Province Cervical Cancer Screening Study, for example, found that colposcopically-directed biopsy was more accurate in detecting large lesions compared to small ones, and identified 62.5 percent of lesions covering zero to two quadrants of the cervix and 100 percent of lesions involving three to four quadrants.199 In addition, only 62 of 83 women with CIN2 were detected by colposcopically-directed biopsy: 19 were detected by random biopsy and 2 solely by endocervical curettage. Analysis of data from the placebo arms of Merck’s GARDASIL trials also showed low correlation between results of colposcopically-directed biopsy and excisional specimens. The trial included women who were referred based on concerning cytology, biopsy and/or endocervical curettage results for LEEP, or other definitive therapy, and who had a cervical biopsy taken within 6 months before treatment (about 7% of all those in the placebo arms). The biopsy and definitive diagnosis (negative, CIN1, CIN2, or CIN3/AIS) coincided for just 42 percent of these participants; biopsy underestimated disease for 21 percent and overestimated (or removed) disease for 36 percent.200
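
The Shanxi counts quoted above can be turned into proportions with simple arithmetic. The sketch below is illustrative only and uses the numbers as stated in the text (62, 19, and 2 of 83 women); the variable names are our own.

```python
# Counts as quoted in the paragraph above (83 women with CIN2).
cin2_cases_total = 83
found_by = {
    "colposcopically directed biopsy": 62,
    "random biopsy": 19,
    "endocervical curettage only": 2,
}

# Sanity check: the three detection routes account for all 83 women.
assert sum(found_by.values()) == cin2_cases_total

# Share of cases found by each route, as a percentage to one decimal place.
share_pct = {
    method: round(100 * count / cin2_cases_total, 1)
    for method, count in found_by.items()
}

if __name__ == "__main__":
    for method, pct in share_pct.items():
        print(f"{method}: {pct}%")
```

On these figures, colposcopically directed biopsy alone identified about three-quarters of the cases, with the remainder found only by random biopsy or endocervical curettage.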

Finally, the use of detected disease without full ascertainment of undetected disease does not accurately reflect sensitivity or true test performance. However, in the context of trials, it reflects real-world impact. Almost all trials reported results without using an intention-to-screen analysis, in which all women in the randomized arm are in the denominator for all calculations. Thus, for comparability, we used the number of women screened (or other comparable measures) for the denominator in our calculations. For trials nested within ongoing screening programs, either denominator has a rationale, although intention-to-screen would be most conservative. It is reassuring, however, that long-term disease detection was not substantially different using intention-to-screen analysis than when calculated using only women screened in one study reporting both.134
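To make the choice of denominator concrete, here is a minimal sketch using made-up numbers that are not drawn from any included trial. It computes a detection rate per 1,000 women using either all women randomized (intention-to-screen) or only women actually screened. The function name and every count are hypothetical.

```python
def detection_rate_per_1000(cases_detected, denominator):
    """Detected cases per 1,000 women in the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 1000 * cases_detected / denominator


# Hypothetical trial arm (illustrative values only, not from any study).
women_randomized = 10_000  # intention-to-screen denominator
women_screened = 8_000     # women who actually attended screening
cin3_plus_detected = 40

rate_its = detection_rate_per_1000(cin3_plus_detected, women_randomized)
rate_screened = detection_rate_per_1000(cin3_plus_detected, women_screened)

print(f"Intention-to-screen: {rate_its:.1f} per 1,000")
print(f"Women screened:      {rate_screened:.1f} per 1,000")
```

With the same number of detected cases, the randomized group is at least as large as the screened group, so the intention-to-screen rate cannot exceed the screened-only rate. This is the sense in which it is the more conservative denominator.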

Emerging Issues/Next Steps

An international effort to pool data from HPV-based primary screening trials has recently been announced, recognizing the need to provide complete, uniformly reported, age-stratified data to inform evidence-based guideline development.201 These efforts are critical and could provide the best simulations of various possible HPV-based screening strategies, taking into account between-trial differences in screening and rescreening protocols. When available, their results will greatly enhance the findings of our systematic review.

Studies under way could affect the findings of this review and perhaps necessitate an update. These include a Canadian RCT comparing HPV with cytology triage to cytology followed by HPV triage among women aged 25 to 65 years.198 Results from two rounds of screening in over 300,000 women, following implementation of the FDA-approved co-testing strategy at Kaiser Permanente Northern California, are also expected.202 Initial HMO experience suggests that co-testing every 3 years is acceptable to both patients and providers,24 and that the average interval between negative tests is appropriately lengthened.203 Data from a nationally representative sample, however, indicate that U.S. primary care providers are not likely to extend the screening interval to 3 years, as suggested.203

This review excluded several emerging HPV testing methods, including tests that detect HPV-16 and HPV-18 only, p16 immunostaining, in situ hybridization, tests of mRNA or protein expression, and tests of viral load, which we felt to be of less clinical significance to the primary care setting when our review started. Since that time, more data have emerged to suggest that these may be important strategies to evaluate in the future. Recently, additional new technologies beyond the scope of our review have been approved by the FDA. These include the Cervista HPV HR and Cervista HPV 16/18 tests and Roche Diagnostics’ Cobas 4800 HPV test. Cervista HPV HR tests for 14 high-risk HPV types, and Cervista HPV 16/18 individually identifies two high-risk HPV types. The Cobas 4800 HPV test simultaneously detects 14 high-risk HPV types (the same as those detected by Cervista HPV HR) and specifically identifies types 16 and 18.74 We found no studies on the Roche technologies that met our inclusion criteria. However, this test is in use abroad and was approved in early 2011 for as yet undocumented indications. Triage strategies that allow immediate colposcopy referral for the highest-risk women, such as HPV genotyping and/or p16 immunostaining in HPV positive women, could improve overall compliance with colposcopy and potentially improve HPV-based screening program performance.170 Emerging technologies such as Roche Diagnostics’ Amplicor HPV test and Linear Array HPV genotyping test and Gen-Probe’s APTIMA HPV test will require future consideration, if submitted to and approved by the FDA. Gen-Probe’s APTIMA HPV, which detects 14 high-risk HPV types and also mRNA from viral oncogenes E6 and E7, is approved for use in Europe and has been submitted for FDA approval.

Future Research

Future research and future reviews will need to address the long-term impact of the HPV vaccine on the incidence of CIN and cervical cancer and on cervical cancer screening strategies. Reports from trials of GARDASIL and CERVARIX include about 3 years of followup, but longer-term efficacy is unknown.92,204 Brisson and colleagues used a cohort model of the natural history of HPV infection to estimate the number needed to vaccinate to prevent HPV-related disease and death, and found that results were highly dependent on the vaccine’s duration of protection.205 As discussed earlier, none of the HPV screening studies included in this review included HPV-vaccinated women; therefore, the impact of HPV vaccines on the effectiveness of cervical cancer screening programs is also currently unknown. Similarly, whether screening strategies should be modified in the face of known (or uncertain) vaccination histories will need study.

Additional research on the appropriate age at which to start screening (with year-specific data reported for younger women rather than 5-year age groups) and exploration of risk-stratification tools for targeted, earlier screening would extend the limited findings from this report. Similarly, given the relatively high proportion of women aged 65 years and older who are unscreened or underscreened and the apparent downward trend in cervical cancer screening (as recommended) among this age group, continuing research to determine screening history and other characteristics of women who develop ICC before and after age 65 years will be informative.

Population screening program research is under way in Canada to directly compare the efficacy of primary HPV screening (with cytology triage using LBC) to primary LBC with HPV triage, using tests and protocols similar to those in current use in North America.198 Results could help inform screening policy in the United States and Canada, including the safety of an HPV primary screening approach and of prolonged intervals for HPV negative women. Other research confirming the long-term low risk of high-grade cervical lesions in screening-negative women, along with research and modeling studies that incorporate sociodemographic and medical factors, may help further risk stratify women for more or less aggressive cervical cancer screening regimens. Ongoing research evaluating type-specific high-risk HPV testing, mRNA, or p16INK4A and other molecular markers has the potential to further clarify future risk in women and to improve the specificity of targeted screening approaches.144 Additionally, future research should continue to address means to encourage screening in women who often ignore invitations to screening visits; one promising approach could be self-sampling for HPV testing, among other innovations.206

Conclusions

In summary, our systematic review supports the following conclusions:

  1. Due to the high prevalence of HPV, the regressive nature of prevalent cervical abnormalities, and the low prevalence of cervical cancer in women younger than age 21 years, cervical cancer screening in women younger than age 21 years does not appear to offer substantial benefit. No studies provided specific information on which risk factors beyond age should influence the decision of when to start screening, and we found insufficient data on screening interval specific to younger women.
  2. In terms of cervical cytology approaches, LBC did not differ from CC in absolute test performance (sensitivity, specificity) or improve relative CIN detection. Most data suggest that LBC yields a lower proportion of unsatisfactory slides compared to CC and also allows for several different screening strategies with one specimen (i.e., reflex HPV after an ASC-US cytology result, co-testing with both LBC and HPV, or reflex cytology after a positive HPV result). Cost and feasibility were not part of our review, but may be considerations, along with other local factors.
  3. The use of the HC2 HPV test as a primary cervical cancer screening tool appears very promising in women aged 30 years and older, particularly when coupled with cytology triage of HPV positive results. HC2 clearly is more sensitive for the detection of CIN2+ or CIN3+, compared with cytology alone, but somewhat less specific, with some uncertainty about overdiagnosis of regressive lesions. Use of cytology triage may reduce the increase in false positives (and their related harms) seen with HC2 testing alone. The net benefit of a primary HPV-screening strategy (with or without cytology triage) appears promising, but the net impact of such a program remains to be confirmed through more complete reporting of cumulative program results and requirements and modeling exercises.
  4. HPV testing in combination with cytology for women aged 30 years and older is also more sensitive than cytology alone for the detection of CIN2+ and CIN3+, but round-specific and cumulative impact on CIN3+ detection is still incompletely reported in RCTs, with mixed results at present. An acceptable measure of comparative benefit for a cervical cancer screening program has not been specified, although some European RCTs suggest decreased CIN3+ in a second screening round. However, available RCTs primarily test protocols that may not be very applicable to current U.S. practice. Also, through indirect comparisons and observational studies, HPV-cytology co-testing appears to be no more sensitive than HPV alone, and is possibly less specific; current RCTs do not completely report round-specific and cumulative colposcopy or related harms. Thus, from available data, there appears to be no additional advantage of HPV testing in combination with cytology compared to HPV testing alone, unless an advantage is conferred by assigning a subgroup of women who are negative on both tests to a program of less-intensive screening. Modeling would be needed to inform this possibility, also considering the similarly high NPV of HPV negativity alone.
  5. A single HC2 HPV test is more sensitive but equally or slightly less specific than repeat cytology for the detection of CIN2+ among women with ASC-US cytology. There is no benefit to combined cytology and HPV triage over HPV triage alone, and this strategy is associated with more false positives. Two trials (that actually tested HC2 plus CC triage) suggest non-significantly increased detection of CIN3+ with HC2 HPV triage; results apply particularly to women aged 30 or 35 years and older, with less data in younger women. HPV testing is not useful for the triage of LSIL or higher grade cytology, and HPV testing in women younger than age 21 years is clearly not advised.
  6. The best studied test for any HPV-enhanced screening program is HC2. Data reported here primarily refer to results with HC2 at a positive threshold of 1 pg/ml, and to a lesser extent, PCR GP5+/6+. Some trials simulate screening program results using a 2 pg/ml threshold for HC2 screening. In the absence of adequate RCT data, substitution of other types of HPV testing in cervical cancer screening programs based on these trials should be based on careful consideration of clinical test performance (test positivity, sensitivity, and specificity) when directly compared with HC2, on evidence of test-retest and inter-laboratory test reliability, other quality control issues, and cost.
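The conclusions above compare strategies in terms of sensitivity, specificity, false positives, and NPV. As a reminder of how these measures relate to one another, the following minimal sketch derives them from a 2x2 table of test result against reference-standard diagnosis. The function name and all counts are hypothetical and chosen only for illustration; they are not results from any study in this review.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute basic accuracy measures from 2x2 table counts.

    tp: test positive, disease present   fp: test positive, disease absent
    fn: test negative, disease present   tn: test negative, disease absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased women the test flags
        "specificity": tn / (tn + fp),  # share of disease-free women it clears
        "ppv": tp / (tp + fp),          # chance of disease given a positive test
        "npv": tn / (tn + fn),          # chance of no disease given a negative test
    }


# Hypothetical low-prevalence screening population (illustrative counts only).
metrics = screening_metrics(tp=90, fp=500, fn=10, tn=9400)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a low-prevalence population like this hypothetical one, a test can combine a very high NPV with a modest PPV, so even a small loss of specificity adds many false positives in absolute terms.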