NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

McCrory DC, Matchar DB, Bastian L, et al. Evaluation of Cervical Cytology. Rockville (MD): Agency for Health Care Policy and Research (US); 1999 Feb. (Evidence Reports/Technology Assessments, No. 5.)

  • This publication is provided for historical reference only and the information may be out of date.

3 Results

This section describes results on the diagnostic accuracy of cervical cytology screening tests, the meta-analysis of Pap test accuracy, and cost estimates of screening, diagnosis, and treatment of cervical cancer and its precursors. A description is also provided of the review of studies and models of effectiveness and cost-effectiveness of cervical cancer screening. Finally, results are provided from the cost-effectiveness analysis.

Results: Diagnostic Accuracy of Cervical Cytology Screening Tests

Of the 939 citations identified by the literature search, approximately 60 percent were eliminated at screening Step 1 either because they did not give results on Pap smear screening for cervical cancer or precursors, or because the Pap smears were not compared with a reference standard. In Step 2, another 31 percent of citations were excluded because cytological screening was compared with a reference standard other than histology or colposcopy (e.g., cytology), the screening test and reference standard were not reasonably concurrent (i.e., more than 3 months apart), or the study failed to report data on both sensitivity and specificity (i.e., a 2-by-2 table could not be completed) (Table 19).

Table 19. Distribution of Articles in Literature Screening and Abstraction Process.

New Technologies

Because very few studies of the new technologies met the original Step 1 and Step 2 screening criteria (no studies of AutoPap®, one study of Papnet®, and one study of ThinPrep®), we modified the criteria to include studies pertaining to any of the new technologies that used a cytological reference standard if applied by an independent panel and if at least half of high-grade cytological results were verified histologically, as suggested in guidelines produced by the Intersociety Working Group for Cytology Technologies (1998). We further modified the screening criteria to include studies that failed to verify diagnosis in patients negative on two independent tests being compared. Though such studies do not estimate sensitivity and specificity, they can still provide estimates of relative TPR and relative FPR, as described by Chock, Irwig, Berry et al. (1997).

We considered a total of 59 studies (12 on AutoPap®, 27 on Papnet® and 20 on ThinPrep®) during this final stage of the screening process (Step 3). Forty-three of these articles had initially been excluded in Step 2, and 16 were brought to our attention at this time by reviewers or manufacturers (7 unpublished manuscripts and 2 published only in abstract form). Ultimately, we decided to describe in the evidence table and the text of this report all the studies considered for Step 3 that also met the criteria requiring the application of a concurrent reference standard (histology, colposcopy, or cytology). The net result was the inclusion of 6 studies of AutoPap®, 11 of Papnet®, and 8 of ThinPrep®.

Six studies of either AutoPap 300 QC® or the AutoPap Primary Screening System® permitted estimates of sensitivity. Little information was available on which to base an estimate of the specificity of rescreening. In three studies of the performance of AutoPap 300 QC® in slides manually screened as negative, the estimated sensitivity for detecting ASCUS or worse at the 20 percent review rate was 0.43 (Stevens, Milne, James et al., 1997), 0.51 (Patten, Lee, Wilbur et al., 1997a), and 0.663 (Colgan, Patten, and Lee, 1995). The estimated sensitivity for detecting LSIL or worse at the 20 percent review rate was 0.66 and 0.77 in the latter two studies. Studies suggested that sensitivity at the 20 percent review rate for ASCUS or worse was higher when these technologies were used for screening than when they were used for rescreening: 0.86 (Wilbur, Bonfiglio, Rutkowski et al., 1996) for the AutoPap 300 QC®, and 0.86 (Wilbur, Prey, Miller et al., 1998) and 0.92 (Lee, Kuan, Oh et al., 1998) for the AutoPap Primary Screening System®. The estimated sensitivity for detecting LSIL or worse at the 20 percent review rate was 0.92 (Wilbur, Prey, Miller et al., 1998). An estimate of specificity of 0.98 was obtained for screening use of the AutoPap Primary Screening System® (Lee, Kuan, Oh et al., 1998); however, this estimate was based on initial screening rather than rescreening. Because the spectrum of disease is different in patients whose smears are initially manually screened as negative compared with patients undergoing initial screening, this specificity estimate may be inaccurate. Furthermore, the Primary Screening System uses a somewhat different algorithm for classification than does the AutoPap 300 QC® and therefore could be expected to have a different specificity.

For Papnet®, one study provided estimates of both sensitivity and specificity (Kaufman, Schreiber, and Carter, 1998). This study estimated sensitivity at 0.38-0.41 and specificity at 0.82-0.92, depending on whether the screening test threshold was ASCUS or LSIL. Other available studies, described in Evidence Table 1, provided data on the impact of Papnet® rescreening on estimated sensitivity. For slides that had initially been read as normal, studies suggested that the yield of Papnet® rescreening (that is, the number of false negatives [FNs] detected divided by the number of slides rescreened) was between 0.3 percent and 3.3 percent. One study reported a reference standard diagnosis on all slides rescreened, permitting an estimated sensitivity of the rescreening step for detecting FN slides of 0.89 (Jenny, Isenegger, Boon et al., 1997). Other studies using a histological reference standard obtained lower estimates (Mango and Radensky, 1998). Mango and Radensky (1998) provide a comprehensive review of all studies of the accuracy of Papnet® screening and rescreening, including unpublished studies reviewed by the FDA as part of the premarketing application.

Two studies of ThinPrep® permitted estimation of sensitivity and specificity, either directly or relative to conventional smears (Bolick and Hellman, 1998; Roberts, Gurley, Thurloe et al., 1997). Bolick and Hellman (1998) compared ThinPrep® Pap smear diagnoses of LSIL or higher with a histological reference standard of CIN2-3 or higher, permitting direct estimation of sensitivity (0.94) and specificity (0.58). In the same report, conventionally prepared Pap smears achieved a sensitivity of 0.85 and a specificity of 0.36 at the same thresholds. Roberts, Gurley, Thurloe et al. (1997) compared conventional and ThinPrep® slides prepared with a split-sample technique. Any positive result on either test was verified either cytologically or histologically; histological verification was obtained on a majority of HSIL samples. Although sensitivity and specificity could not be directly estimated in this study, the performance of ThinPrep® relative to conventional Pap smears could be. The relative TPR was 1.13, indicating that ThinPrep® had higher sensitivity, and the relative FPR was 1.12, indicating that ThinPrep® had slightly lower specificity. Details of all these studies are provided in Evidence Table 1.
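Relative TPR and relative FPR can be computed from a split-sample design even when slides negative on both tests are never verified, because the unknown number of diseased (or non-diseased) women cancels in each ratio. The following is a minimal sketch using hypothetical counts, not the Roberts, Gurley, Thurloe et al. data; the function name and argument layout are illustrative assumptions.

```python
def relative_rates(both_pos, new_only, conv_only):
    """Relative TPR and FPR of a new test versus a conventional test.

    Each argument is a (diseased, non_diseased) pair of counts among
    verified slides: positive on both tests, positive on the new test
    only, and positive on the conventional test only. Slides negative
    on both tests are not needed because the denominators cancel.
    """
    d_both, n_both = both_pos
    d_new, n_new = new_only
    d_conv, n_conv = conv_only
    rel_tpr = (d_both + d_new) / (d_both + d_conv)
    rel_fpr = (n_both + n_new) / (n_both + n_conv)
    return rel_tpr, rel_fpr


# Hypothetical counts for illustration only.
rel_tpr, rel_fpr = relative_rates(
    both_pos=(80, 20), new_only=(20, 10), conv_only=(10, 5)
)
print(f"relative TPR = {rel_tpr:.2f}, relative FPR = {rel_fpr:.2f}")
```

A relative TPR above 1 means the new test detects more verified disease than the conventional test; a relative FPR above 1 means it also flags more women without disease.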

Conventional Pap Test

Characteristics of the 84 studies that met the inclusion criteria for the meta-analysis of conventional Pap test accuracy are displayed in Table 20. Sample sizes in these studies ranged from 14 to 22,412, with a median of 194. The majority of studies (83%) used histology as a reference standard, but only 31 percent obtained verification of all or a random fraction of test negative subjects. A quarter of the studies did not ensure independent assessment of the test and reference standard or blinding. A majority (87%) of studies selected consecutive subjects. All included studies were published in full-length reports; abstracts identified in the screening process uniformly failed to provide sufficient data to meet inclusion criteria. In contrast to the studies of the new technologies, few of these studies were performed or supported by a medical device manufacturer.

Table 20. Quality of Selected Studies.

Data were available according to different combinations of test thresholds and reference standard thresholds: ASCUS/CIN1 (31 studies), LSIL/CIN1 (69 studies), LSIL/CIN2-3 (43 studies), and HSIL/CIN2-3 (45 studies). In two studies, the test and reference standard thresholds were not explicitly stated and could not be precisely inferred. Most studies permitted us to construct 2-by-2 tables using more than one combination of test and reference standard threshold.

Sensitivity estimates from these studies covered nearly the entire spectrum, ranging from 0.06 to 0.99. Similarly, specificity estimates ranged from 0.06 to 0.99. Because sensitivity and specificity are interdependent, simple statistics such as means do not provide an accurate description of the diagnostic performance of the test. We used several analytical strategies to determine the cause of the large amount of variability in sensitivity and specificity estimates and to ascertain a reasonable estimate of the performance of Pap tests in a low prevalence screened population. These analyses are described in the section on "Meta-analysis of Pap Smear Accuracy."

The setting in which most of these studies were conducted has important implications for interpreting their results and explaining the differences among them. Most of the studies were conducted in colposcopy clinics in patients who had been referred for colposcopy because of a cytological abnormality on an initial Pap smear screening. In such studies, the repeat Pap smear taken at the time of colposcopy usually served as the "test" result that was compared to the result of either colposcopy (available for all subjects) or histology (available only for women who underwent biopsy; that is, women with a negative colposcopy would be systematically excluded).

The assembly of the study sample from a population referred because of cytological abnormality on initial Pap screening can be expected to bias the spectrum of disease. Women who have a negative Pap smear at the time of colposcopy (following an initial positive smear) are likely to be different from women with a negative result on initial smear. Some have suggested that the removal of cervical epithelium with the initial Pap smear might remove enough abnormal cells that the subsequent Pap may be normal. Our analysis assumes that women with subsequent normal Pap smears had an initial false positive smear. If untrue, this assumption would lead to underestimation of Pap test accuracy.

Returning to the above example, when disease status verification is preferentially obtained in women with a colposcopically visible lesion who undergo biopsy, the study sample is further biased. This selection of subjects for verification results not only in a high prevalence of histological abnormalities in the study sample, but also in "workup" bias (Ransohoff and Feinstein, 1978). Although sensitivity and specificity are relatively invariant to prevalence, they are subject to workup bias. This bias can be expected to lead to elevated estimates of sensitivity and lowered estimates of specificity (Feinstein, 1985).

Many studies comparing conventional Pap screening with new technologies applied the reference standard (adjudicated cytology or histology) only to discrepant cases. Concordant positive and concordant negative test results are assumed to be true positives and true negatives, respectively, but may actually be concordant false results. This study design has a consistent underlying bias that can be expected to overestimate sensitivity and specificity of the new test (Miller, 1998). When the conventional and new test are conditionally dependent, as when tests may have similar problems with sample collection or interpretation of mild disease, this bias can be substantial.

Our goal was to obtain estimates of Pap test performance that are applicable to a low prevalence screened population. However, only three studies identified patients undergoing initial Pap smear screening and verified all (or a random fraction) of test negative subjects. The study of Baldauf, Dreyfus, Lehmann et al. (1995) was the largest of these, comprising 1,539 women. In this study, a 10 percent random sample of Pap negative women underwent colposcopy and biopsy to verify their disease status; this allowed for correction of any verification bias. Davison and Marty (1994) studied 200 women, and Hockstad (1992) tested 73 women; all test negative women in these two studies had their disease status verified with the reference standard test. These studies estimated sensitivity at 56 percent, 53 percent, and 29 percent, respectively, and specificity at 98 percent, 100 percent, and 97 percent, respectively.
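When only a random fraction of test-negative women is verified, as in the 10 percent sample described above, the verified negatives can be weighted by the inverse of the sampling fraction before sensitivity and specificity are computed. The sketch below illustrates that correction with hypothetical counts; it is not the calculation reported by any of the three studies, and the function name is an assumption.

```python
def corrected_accuracy(tp, fp, fn_sampled, tn_sampled, sampling_fraction):
    """Sensitivity and specificity when only a random fraction of
    test-negative women is verified.

    Verified test-negative counts are scaled up by 1 / sampling_fraction
    (inverse-probability weighting); test-positive women are assumed to
    be fully verified.
    """
    weight = 1.0 / sampling_fraction
    fn = fn_sampled * weight
    tn = tn_sampled * weight
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, not taken from any of the three studies.
sens, spec = corrected_accuracy(
    tp=50, fp=30, fn_sampled=4, tn_sampled=140, sampling_fraction=0.10
)
# Treating the verified subset as if it were the whole sample
# overstates sensitivity and understates specificity.
naive_sens = 50 / (50 + 4)
naive_spec = 140 / (140 + 30)
print(f"corrected: sens={sens:.3f}, spec={spec:.3f}")
print(f"naive:     sens={naive_sens:.3f}, spec={naive_spec:.3f}")
```

The naive figures show the direction of verification bias discussed earlier: higher apparent sensitivity and lower apparent specificity.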

Results: Meta-analysis of Pap Smear Accuracy

Data from the 84 studies of conventional Pap tests included in the meta-analysis are summarized in Table 21. These data were examined in a series of logistic regression models, one for each of the four screening test/reference test threshold combinations, and one for each of the outcomes sensitivity, specificity, and effectiveness score (D). Data from studies using a colposcopy reference standard were combined with those from studies using a histological reference standard. To attempt to explain between-study variation in effectiveness score, we added a term describing an implicit threshold (S) to the models; however, it did not explain a statistically significant amount of variation in the outcomes; therefore, the simpler one-parameter models are displayed. The summary effectiveness score and log odds ratio for each of the four thresholds are shown in Table 22. The summary effectiveness scores ranged from 1.03 (95% confidence interval [CI]; 0.78 to 1.14) for ASCUS/CIN1 to 1.29 (95% CI; 1.08 to 1.50) for HSIL/CIN2-3. Although the effectiveness score for each threshold was relatively low, better discrimination was seen with higher cytological and histological thresholds.

Table 21. Studies Meeting Inclusion Criteria for Meta-analysis of Conventional Pap Smear Accuracy.

Table 22. Summary Effectiveness Scores from Single-Parameter Model.

Effectiveness score estimates were used to derive the ROC curves displayed in Figures 14-17, with each figure representing a different test threshold/reference standard threshold combination. Points representing sensitivity and specificity combinations from individual studies are also plotted to describe the range of data from which the ROC curve was generated. It is important to note in interpreting these figures that the movement along a single ROC curve does not correspond to movement across different grades of cytological abnormality (e.g., ASCUS through CIN2/3) because each ROC curve describes studies using a single combination of test threshold and reference standard threshold. Rather, the change in sensitivity and specificity should reflect differences in "implicit" threshold.
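The link between an effectiveness score and a summary ROC curve can be sketched as follows. This section does not restate the definition of the score, so the code assumes the Hasselblad and Hedges form, d = (√3/π) × ln(odds ratio), and a symmetric curve on which the log odds ratio is constant. Under that assumption, a sensitivity of 0.51 with a specificity of 0.98 gives d of about 2.17, close to the 2.185 reported for the three-study subgroup later in this section. Function names are illustrative.

```python
import math

# Assumed scaling constant relating the log odds ratio to the score.
SCALE = math.sqrt(3) / math.pi


def logit(p):
    return math.log(p / (1.0 - p))


def effectiveness_score(sensitivity, specificity):
    """Assumed definition: d = (sqrt(3) / pi) * ln(diagnostic odds ratio)."""
    return SCALE * (logit(sensitivity) + logit(specificity))


def sroc_sensitivity(d, fpr):
    """Sensitivity on the symmetric summary ROC curve with score d
    at a given false positive rate (1 - specificity)."""
    log_odds_ratio = d / SCALE
    return 1.0 / (1.0 + math.exp(-(log_odds_ratio + logit(fpr))))


d = effectiveness_score(0.51, 0.98)
# Moving along the curve trades specificity for sensitivity at a
# fixed level of discrimination.
curve = [(fpr, sroc_sensitivity(d, fpr)) for fpr in (0.01, 0.02, 0.05, 0.10, 0.25)]
for fpr, tpr in curve:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each point on such a curve shares the same score; a higher score shifts the whole curve toward the upper left corner.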

Figure 14. Summary ROC curve of studies reporting on ASCUS/CIN1 threshold.

Figure 15. Summary ROC curve of studies reporting on LSIL/CIN1 threshold.

Figure 16. Summary ROC curve of studies reporting on LSIL/CIN2-3 threshold.

Figure 17. Summary ROC curve of studies reporting on HSIL/CIN2-3 threshold.

Next, we attempted to explain between-study variation in effectiveness score by including the prevalence of "disease" according to the reference standard threshold as an independent variable in each logistic regression model. This term proved to be highly statistically significant. The proportion of "diseased" subjects in the included studies ranged from 0.02 to 0.98. The parameter estimates from the model are shown in Table 23. The negative parameter estimate for the prevalence term indicates that as prevalence increases, the effectiveness score decreases, sensitivity increases, and specificity decreases. This effect is consistent with the idea that prevalence is a marker of workup bias. To illustrate its effect, we calculated separate ROC curves for disease prevalences of 0.10, 0.25, and 0.50. As seen in Figures 18-21, as prevalence increases, test effectiveness decreases.

Table 23. Parameter Estimates From Model That Includes Disease Prevalence.

Figure 18. Summary ROC curve of studies reporting on ASCUS/CIN1 threshold with varying prevalence of disease.

Figure 19. Summary ROC curve of studies reporting on LSIL/CIN1 threshold with varying prevalence of disease.

Figure 20. Summary ROC curve of studies reporting on LSIL/CIN2-3 threshold with varying prevalence of disease.

Figure 21. Summary ROC curve of studies reporting on HSIL/CIN2-3 threshold with varying prevalence of disease.

We next examined the effect of the various determinants of the quality of studies on effectiveness scores. We examined the presence of blinded evaluation of the screening test in a model that included prevalence for each of the four test threshold/reference standard threshold combinations. A significant effect was observed only for the HSIL/CIN2-3 threshold; the parameter estimate of 0.940 (95% CI; 0.455 to 1.425) was positive, indicating that studies employing blinded evaluation of cytology demonstrated better discrimination.

The effects of verification of test negative subjects and type of reference standard on between-study variation in effectiveness scores were difficult to assess together with prevalence because there were only a few studies and because of collinearity, leading to nonconvergence of maximum likelihood estimates for most thresholds. Because of these problems, we evaluated the effects of these variables separately.

We also evaluated the effect of the quality score on sensitivity, specificity, and effectiveness. Neither verification nor the quality score, whether entered as a numerical or a dichotomous variable, was significantly associated with the test operating characteristics. The type of reference standard was significantly associated with effectiveness only at the HSIL/CIN2-3 threshold, where a histology reference standard led to improved effectiveness scores compared with a colposcopy reference standard.

To calculate estimates of Pap test performance that are applicable to a low-prevalence screened population, we had two options. One option was to use predictions from the regression model that included prevalence, substituting appropriately low values for prevalence; from the resulting effectiveness score, we would then need to choose an appropriate combination of sensitivity and specificity. The other option was to use the subset of studies that were conducted in low-prevalence screened populations and that avoided workup bias.

We chose the latter option for two reasons. First, we suspected that the prevalence term was more an indicator of biases than of different prevalence in the underlying population of interest; thus, adjusting for "low prevalence" might have uncertain effects. Second, we had no clear rationale for choosing any particular combination of sensitivity and specificity from the joint distribution described by the adjusted effectiveness score that would result from the first option.

Only three studies identified patients undergoing initial Pap smear screening and verified all (or a random fraction of) test negative subjects. In this subgroup, the combined sensitivity was 0.51 (95% CI; 0.37-0.66) and the combined specificity was 0.98 (95% CI; 0.97-0.99). The summary effectiveness score was 2.185 (95% CI; 1.658-2.712). Prevalence of disease in these studies ranged from 10 percent to 19 percent. The data points representing these three studies are circled in Figure 14.

Results: Cost Estimates of Screening, Diagnosis, and Treatment

The following sections describe cost estimates for using the Pap smear (and other technologies) to screen for, diagnose, and treat cervical cancer and its precursors.

Cost of Screening

Estimating the costs of using the Pap smear to screen for cervical cancer is a complex process because costs need to be estimated for both obtaining and processing the smear. Estimates of these screening costs, and the methods used to obtain the estimates, are described below. Estimates of the total cost of Pap smear screening and of the cost of new screening techniques are also provided.

Cost of Obtaining Pap Smears

Several steps were taken in this study to estimate the cost of collecting cytology samples. In the past, studies have used either the cost of an entire office visit or the amount of reimbursement made for the sample collection as a proxy. This overestimates the cost, as Pap smears are usually performed as part of an annual gynecological exam, which involves other procedures as well. In this study, attempts were made to arrive at more accurate estimates. Only the cost of the proportion of the office visit that was directly attributable to obtaining a Pap smear was included. To do so, first the time spent by the physician and other health care staff in obtaining smears was estimated. Second, the cost per minute of office visit time was calculated. Finally, the time devoted to collecting the samples during the office visit was multiplied by the cost per minute to obtain the cost attributable to the collection process.

The physician's time spent to obtain the smears was estimated from the National Ambulatory Medical Care Survey (NAMCS). This survey provides data on ambulatory medical care provided in physicians' offices and is based on a sample of nationally representative patient visits. The survey contains information on age, race, and sex of the patient, organized by selected physician characteristics such as type of specialty and geographic location. Information provided about the clinical substance of the visit includes the patient's problem, the physician's diagnosis (ICD-9-CM), the diagnostic/screening services provided, the surgical procedures performed, and the duration of the visit. Health Economics Research, Inc., analyzed the 1992 NAMCS data, which contained 34,606 patient records from 1,558 doctors who participated in the survey.

The NAMCS data provide information on the amount of time (minutes) spent in direct face-to-face contact with the physician. Separate estimates of physician time were obtained for patients who were 20-64 years old and for those who were 65 years or older. A regression technique was employed to derive the proportion of time spent obtaining a Pap smear. The model used is shown below:

log(MINUTES) = B1 + B2 × PAP_TEST

where PAP_TEST = 1 if a Pap test was performed and 0 otherwise.

The B1 value is the intercept, and the B2 value, the coefficient on PAP_TEST, is interpreted as the proportion of visit time spent obtaining the smear. The logarithm of minutes was modeled because the distribution of office visit minutes is skewed.
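The regression above can be sketched with a few hypothetical visit durations. With a single 0/1 regressor, the least-squares estimates reduce to group means of log-minutes, so no regression library is needed. The data are invented, and the conversion of B2 into minutes shown here is one possible reading; the exact calculation behind Table 24 is not given in this section.

```python
import math
from statistics import mean


def fit_log_minutes(minutes_with_pap, minutes_without_pap):
    """Least-squares fit of log(MINUTES) = B1 + B2 * PAP_TEST.

    With a single 0/1 regressor, B1 is the mean log-minutes of visits
    without a Pap test and B2 is the difference in mean log-minutes.
    """
    b1 = mean(math.log(m) for m in minutes_without_pap)
    b2 = mean(math.log(m) for m in minutes_with_pap) - b1
    return b1, b2


# Hypothetical visit durations in minutes (not NAMCS data).
without_pap = [10, 12, 15, 15, 20, 30]
with_pap = [12, 15, 15, 20, 20, 30]

b1, b2 = fit_log_minutes(with_pap, without_pap)
baseline = math.exp(b1)  # geometric-mean visit length without a Pap test
added_minutes = baseline * (math.exp(b2) - 1.0)  # assumed conversion to minutes
print(f"B1={b1:.3f}, B2={b2:.3f}, added minutes={added_minutes:.2f}")
```

For small values, B2 is approximately the proportional increase in visit length associated with a Pap test, which matches the interpretation given in the text.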

Table 24 shows the calculations used to derive estimates of time spent by physicians to obtain Pap smears during office visits. The physician time spent in obtaining the smears was 1.44 minutes for women under the age of 65 and 5.42 minutes for those 65 and older. The difference in the amounts of physician time reflects the difficulties that may be experienced in obtaining cytology samples from women as they age.

Table 24. Estimates of Physician and Total Time Spent Obtaining Pap Smears During Office Visits.

The NAMCS data only contain information on time the doctor spent with the patient, not on time the patient spent receiving care from someone else, for instance, a nurse or a technician. Therefore, time spent by other medical staff who assisted in obtaining the smears was estimated by subtracting physician time from total time, calculated by Waugh, Smith, Robertson et al. (1996a), for collecting smears from the under-65 population. When the time spent by other staff was included, the total time spent obtaining the smears was estimated as 7.30 minutes for the younger age group and 11.28 minutes for the older age group.
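The time arithmetic in this paragraph can be checked directly. The sketch below assumes that the other-staff time derived for the under-65 group was also applied to the older group; this is inferred from the reported totals rather than stated in the text.

```python
# Physician minutes per smear, from the text.
PHYSICIAN_MIN = {"20-64": 1.44, "65+": 5.42}
# Total minutes for the under-65 group, from the text.
TOTAL_MIN_UNDER_65 = 7.30

# Other-staff time is total time minus physician time.
other_staff_min = TOTAL_MIN_UNDER_65 - PHYSICIAN_MIN["20-64"]

# Assumption: the same other-staff time applies to both age groups.
total_min = {age: phys + other_staff_min for age, phys in PHYSICIAN_MIN.items()}

for age, minutes in total_min.items():
    print(f"{age}: {minutes:.2f} minutes")
```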

In Table 25, the cost per minute of office visits for Pap smears is shown. For the 20- to 64-year-old age group, office visit cost was obtained from MEDSTAT data; for patients 65 years old and older, data from both MEDSTAT and Medicare payments were used. Weighted average costs per visit and time spent were obtained using the distribution of Pap smears by type of office visit. The cost per minute derived from these estimates was $2.70 for those 20-64 years old (from MEDSTAT data), and $2.55 and $2.20 for the 65 years and older group (from MEDSTAT data and from the Medicare Fee Schedule, respectively). Ranges of the cost estimates were also provided to allow for sensitivity analyses.

Table 25. Calculating of Office Visits for Pap Smears.

Cost of Processing Pap Smears

The average cost of processing cytology samples from Pap smears is shown in Table 26. The proportion of smears requiring physician interpretation and the proportion of those with definitive hormonal evaluation are presented, along with the charges associated with processing the cytology samples. The laboratory charges were derived from MEDSTAT data and from Medicare's Clinical Laboratory fee schedule and from the resource-based relative value scale fee schedule (RBRVS). The weighted average cost per smear read from the MEDSTAT data was $18.97, with a standard deviation of $6.94, for patients aged 20 to 64. The average Medicare reimbursement for processing Pap smears was estimated to be $10.19 for patients aged 65 years and older.

Table 26. Average Cost for Processing Cytology Samples from Pap Smears.

Total Cost of Pap Smear Screening

In Table 27, the total cost of performing Pap smears is presented. Total cost, which included the office visit cost and the processing cost, was calculated for the two groups separately. For women ages 20-64 years, the total cost in 1997 dollars was $38.68. For women 65 years or older, the MEDSTAT estimate was $47.73 and the Medicare estimate was $35.01. Range estimates were provided for all cost estimates to allow for sensitivity analyses.
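The totals above can be reproduced from figures given earlier in this section: minutes spent obtaining the smear, cost per minute of the office visit, and processing cost per smear. One assumption in the sketch below is that the MEDSTAT estimate for women 65 and older uses the $18.97 MEDSTAT processing cost, which the text reports for ages 20 to 64; that pairing is inferred because it reproduces the reported total.

```python
def total_cost(minutes, cost_per_minute, processing_cost):
    """Office-visit share of the cost plus laboratory processing cost."""
    return minutes * cost_per_minute + processing_cost


# Inputs are taken from the text: (minutes, cost per minute, processing cost).
estimates = {
    "20-64, MEDSTAT": total_cost(7.30, 2.70, 18.97),
    # Assumption: MEDSTAT processing cost applied to the older group.
    "65+, MEDSTAT": total_cost(11.28, 2.55, 18.97),
    "65+, Medicare": total_cost(11.28, 2.20, 10.19),
}

for label, cost in estimates.items():
    print(f"{label}: ${cost:.2f}")
```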

Table 27. Total Cost of Pap Smear Screening.

Cost of New Screening Techniques

Numerous technological advances have been made in the collection of specimens and in the interpretation of Pap smears. Table 28 contains the cost or charge per smear for the screening procedures currently on the market: conventional Pap, ThinPrep®, AutoPap®, and Papnet®. Cost estimates were provided for the conventional Pap smear, and charges were presented for the others (cost information was not available).

Table 28. Estimates of Screening Costs for Cervical Cancer.

ThinPrep®

The costs to perform a ThinPrep® Pap test can be broken into the following categories for a clinical laboratory: supplies, preparation costs, evaluation costs, and indirect expenses. The laboratory supplies consist of the ThinPrep® Pap test kit that is distributed to clinicians, along with stains and fixatives. The current retail price to laboratories for the ThinPrep® Pap test kit and related supplies is $9.75. Preparation costs include the ThinPrep® 2000 Processor cost of $45,000 as well as the personnel costs of preparing the slide for processing and disposing of waste after processing. Amortizing the cost of the ThinPrep® 2000 Processor over a 5-year period and assuming that 40,000 slides are processed annually yields an estimated equipment cost of $0.22 per slide.

The last two categories of costs, evaluation and indirect expenses, as well as the personnel costs for preparing the slide for processing and disposing of waste, are essentially equivalent between the ThinPrep® technology and the conventional Pap smear technology. A recent study conducted by the College of American Pathologists (CAP) estimated these costs for conventional Pap smears to be approximately $14.60. Thus, a total cost estimate of $24.57 per slide can be derived by summing the components ($9.75 + $0.22 + $14.60).
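The ThinPrep® cost build-up can be sketched as a short calculation. The sketch assumes straight-line amortization with no discounting or salvage value, which the text does not specify. The unrounded equipment figure is $0.225 per slide; the report states $0.22, and the total below uses the reported figure.

```python
# Inputs taken from the text.
PROCESSOR_COST = 45_000.00      # ThinPrep 2000 Processor
AMORTIZATION_YEARS = 5
SLIDES_PER_YEAR = 40_000
KIT_AND_SUPPLIES = 9.75         # test kit and related supplies, per slide
OTHER_COSTS_CAP = 14.60         # evaluation, indirect, and personnel (CAP estimate)

# Assumption: straight-line amortization, no discounting.
equipment_per_slide = PROCESSOR_COST / (AMORTIZATION_YEARS * SLIDES_PER_YEAR)

# The report states the equipment cost as $0.22 per slide.
REPORTED_EQUIPMENT_PER_SLIDE = 0.22
total_per_slide = KIT_AND_SUPPLIES + REPORTED_EQUIPMENT_PER_SLIDE + OTHER_COSTS_CAP

print(f"equipment per slide (unrounded): ${equipment_per_slide:.3f}")
print(f"total per slide: ${total_per_slide:.2f}")
```

Because the equipment share is so small, the total is driven almost entirely by the kit price and the costs shared with conventional smears; a lower annual slide volume would raise the equipment share proportionally.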

This estimate, however, excludes the cost savings attributed to a reduction in the number of Pap smears that may be repeated because the specimen is determined to have a "Satisfactory but limited by" (SBLB) diagnosis. The manufacturer of ThinPrep® argues that specimen quality is significantly improved with this new technology and that it can reduce the number of specimens labeled SBLB (currently estimated to be 40 percent) as well as the number of repeat Pap smears that result from this labeling. Cytyc Corporation estimates a 50 percent reduction in the number of specimens labeled as SBLB; however, the number of women who return for a repeat Pap smear as a result of an SBLB diagnosis is not well known. The cost estimate of $24.57 does not include any adjustments to reflect these potential cost savings. Ignoring these savings will tend to favor conventional Pap smears over the new ThinPrep® technology, so this is a conservative assumption.

However, the cost analysis has been conducted from the perspective of the health care system, with the majority of costs estimated using payments or allowed charges as a proxy for costs. Thus, it seems reasonable to take a similar approach to the estimation of costs for the new technologies. Cytyc Corporation estimates from a large number of private insurers that the average payment to laboratories for primary screening using the ThinPrep® technology ranges from $15.00 to $30.00 and includes all nonoffice costs associated with the processing and evaluation of the ThinPrep® smears, including pathologist fees for reviewing abnormal slides. Because Medicare does not currently reimburse laboratories at a different level from conventional Pap smears when ThinPrep® smears are performed, we do not have current payment estimates for the Medicare population. Thus, the ThinPrep® payment rates from the private sector will be used as an estimate for the Medicare population as well.

Using the payment levels from the private sector for the ThinPrep® technology and office visit payment estimates derived from MEDSTAT data and Medicare's RBRVS physician fee schedule (see Table 27), the cost estimates for ThinPrep® smears were obtained (Table 28).

Papnet®

The incremental costs associated with the Papnet® technology may be broken down into the following categories: the costs of preparation and shipment of the "normal" slides to a central facility for processing, the costs associated with the Papnet® neural network computer system's processing of the slide at a central facility, the costs of a viewing station to review the data shipped back to the laboratory on compact disks, and the costs of triaging and reviewing selected slides by the cytotechnologist. Neuromedical Systems Inc. (NSI) estimates that the average cost to laboratories for these components ranges from $12.00 to $18.00, depending on the laboratory's volume. These costs are in addition to the cytotechnologist and pathologist costs associated with the manual review of normal and abnormal slides and the costs of supplies for obtaining and processing the sample, costs that are already incurred with conventional Pap smears.

The charge for Papnet® rescreening has been reported in the literature and from NSI to range from $30.00 (Schechter, 1996) to $40.00 (literature provided by NSI). This charge is in addition to charges for the conventional Pap smear, which includes all nonoffice costs associated with the processing and evaluation of the conventional Pap smear, including pathologist fees for reviewing abnormal slides. However, we were unable to obtain information from NSI regarding current average payments for rescreening with Papnet® technology. Because cost estimates for the competing technologies were derived using average payments, rather than charges, we believe that it would be inappropriate to use charges for Papnet®. Thus, we model Papnet®'s cost-effectiveness using a cost estimate that more closely approximates the projected costs, rather than reflecting current charges (see Table 28).

AutoPap®

The incremental costs associated with the AutoPap® technology may be broken down into the following categories: equipment, software, service, and software licensing. NeoPath, Inc., estimates the total average cost to laboratories for these four cost components to be $4.50, with a range from $3.75 to $4.50. These costs are in addition to the cytotechnologist and pathologist costs associated with manual review of normal and abnormal slides and the costs of supplies for obtaining and processing the sample, which are already occurring with conventional Pap smears. When AutoPap® is used for quality control purposes, there are no additional incremental costs or savings associated with labor. In contrast, when it is used as a primary screening vehicle, NeoPath, Inc., estimates that there is an overall reduction in cytotechnologist labor of about 20 percent across all slides. The reduction is a function of fewer slides being processed for review by the laboratory's cytotechnologist.

This savings in cytotechnologist labor can be demonstrated through a simple example. Assuming a laboratory currently processes 100 patient slides using the conventional technique of manual review of all 100 slides by a cytotechnologist, the cytotechnologist must actually review 110 slides because of the CLIA quality control requirement of a 10 percent re-review. In contrast, the AutoPap® scores all 100 slides and archives 25 percent of them as the "most normal" of all slides. There is no manual review of these slides by the cytotechnologist. There is manual review of the remaining 75 slides as well as a review of 15 of the slides to meet CLIA quality control standards (set at 15 percent for AutoPap® versus 10 percent for conventional manual review). Thus, the total number of slides undergoing manual review by the cytotechnologist is 90 when the AutoPap® technology is used versus 110 when conventional manual review is done. Ignoring this labor savings will tend to favor conventional Pap smears relative to AutoPap®.
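The arithmetic in this example can be reproduced with a short sketch. The function name and its percentage parameters are illustrative assumptions, not part of the report; the sketch also assumes, as the example does, that the quality control percentage is applied to all patient slides.

```python
def slides_manually_reviewed(total_slides, archived_pct, qc_pct):
    """Count slides a cytotechnologist reviews by hand (hypothetical helper).

    archived_pct: percent of slides archived with no manual review
                  (0 for conventional screening).
    qc_pct:       percent of all slides re-reviewed for quality control.
    """
    primary_reviews = total_slides * (100 - archived_pct) // 100
    qc_reviews = total_slides * qc_pct // 100
    return primary_reviews + qc_reviews


# Values taken from the example in the text.
conventional = slides_manually_reviewed(100, archived_pct=0, qc_pct=10)
autopap = slides_manually_reviewed(100, archived_pct=25, qc_pct=15)
fewer_reviews = conventional - autopap

print(conventional, autopap, fewer_reviews)
```

Under these assumptions the difference is 20 fewer manual reviews per 100 patient slides, which is the source of the labor reduction described above.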

The cost analysis has been conducted from the perspective of the health care system, however, with the majority of costs estimated using payments or allowed charges as a proxy for costs. Thus, it seems reasonable to take a similar approach to the estimation of costs for the new technologies. NeoPath, Inc., estimates that average payment to laboratories for quality control screening using the AutoPap® technology ranges from $7.15 to $8.00 from a large number of public and private insurers, respectively. This payment is in addition to payment for the conventional Pap smear, which includes all nonoffice costs associated with the processing and evaluation of the conventional Pap smears, including pathologist fees for reviewing abnormal slides. Because Medicare does not currently reimburse laboratories at a different level from conventional Pap smears when AutoPap® is used to identify slides for review, we do not have current payment estimates for the Medicare population. Thus, the AutoPap® payment rates from the non-Medicare public and private sector will be used as an estimate for the Medicare population as well.

Using the above-referenced payment levels for the AutoPap® technology, conventional Pap smear payment rates to account for cytotechnologist and pathologist review costs and slide preparation costs, and office visit payment estimates derived from MEDSTAT data and Medicare's RBRVS physician fee schedule (see Table 27), the cost estimates for AutoPap® smears were obtained (see Table 28).

Cost of Diagnostic Procedures and Treatment

A claims analysis was performed to obtain the cost of diagnostic procedures and treatments for women 20-64 years old. Because no primary data analysis was undertaken for patients 65 years old and older, Medicare's RBRVS fee schedule, Clinical Laboratory fee schedule, diagnosis-related group payment rates, and ambulatory surgery center payment rates were reported.

All outpatient and inpatient claims for women (20-64 years old) with ICD-9 diagnosis codes for cervical cancer, carcinoma in situ, and dysplasia were included in the sample analyzed. The specific diagnosis codes used are shown below:

ICD-9 Code    Description
180.x         Neoplasm of cervix, primary (malignant)
233.1         Carcinoma in situ of cervix
622.1         Dysplasia of cervix

Pregnant women were excluded from the analysis, as costs associated with diagnosing and treating this group differ from those of the general population. Cases with only an ICD-9 code of 180.8, "other specified sites of cervix," and no other relevant diagnostic codes (180.0, 180.1, or 180.9) were excluded from the analysis as they may have indicated cases requiring specialized treatment. Very few cases were excluded because of this selection criterion.

Costs for procedures performed on an inpatient and on an outpatient basis are provided. Certain procedures are provided only on an inpatient basis (for example, exenteration), though others (colposcopy, cone biopsy) may be performed in either an inpatient or an outpatient setting. For the procedures in the latter category, the appropriate setting or the setting where the procedure was most often performed for women with cervical cancer was utilized to derive the estimates. Most of these procedures were estimated using outpatient (physician and hospital outpatient department) costs. For procedures where both inpatient and outpatient settings were deemed reasonable, the cost of the procedure was estimated for both the outpatient and the inpatient setting.

The analysis was limited to those individuals with a single procedure on a given day or admission to minimize differences related to multiple procedures. Thus, individuals who had both cone biopsy and LEEP performed on the same day would not have been included in the estimation because accurate allocation of costs to each specific procedure was not always possible. For outpatient procedures, in addition to the costs specifically related to the procedure (identified by the procedure code), we also estimated the cost of related services inherent in the performance of the procedure. Therefore, we evaluated all other outpatient services that were performed on the same day to determine if they were appropriate to include in our cost definitions. For instance, costs associated with anesthesia were added to the cost estimates to capture the total cost of performing the procedure. Supplementing procedure-specific costs with other related services provided allowed for a more comprehensive estimate of the costs associated with the procedure. In addition, all separately billed pathology services (CPT codes 88305 and 88307) were included in the final procedure cost.

Cost estimates in 1997 dollars for diagnosis and treatment of cervical cancer are provided in Tables 29 and 30, respectively. In Table 31, additional cost estimates are provided for other relevant procedures associated with the diagnosis and treatment of cervical cancer. These include procedures such as endocervical curettage, CAT scans, and cystoscopy. In each table, Medicare cost estimates are provided for physician services (Medicare's RBRVS fee schedule) and hospital admissions (DRG payments).

Table 29. Cost Estimates for Diagnostic Procedures for Cervical Cancer.


Table 30. Cost Estimates for Cervical Cancer Treatment.

Table 31. Cost Estimates for Other Procedures Used to Diagnose and Treat Cervical Cancer.

For the cost estimates derived from MEDSTAT data, the number of observations available for analysis and the standard deviation associated with the average cost are provided (see Tables 29, 30, and 31). For inpatient procedures, the number of hospital days is shown (see Table 30). In certain cases, the standard deviation is very large and, therefore, the values for the first quartile, the median, and the third quartile are also provided (see Tables 29, 30, and 31). These values may be used to perform sensitivity analyses. In addition, comparing these values with the mean provides a good indication of the distribution (skewness) of the cost for the procedure. For instance, the average cost for LEEP ($564) and the median ($542) are similar. This suggests that the costs are distributed roughly symmetrically (closely approximating a normal distribution). On the other hand, the mean cost for exenteration ($51,413) is much larger than the median ($19,373), indicating a skewed distribution where a few large charges are influencing the mean.
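As a rough illustration of this comparison, the sketch below computes the ratio of mean to median cost for the two procedures cited. The helper name is an assumption made for illustration, and the ratio is only an informal screen for skewness, not the method used in the report.

```python
def mean_to_median_ratio(mean_cost, median_cost):
    """Informal skewness screen (hypothetical helper).

    A ratio near 1 suggests a roughly symmetric cost distribution;
    a ratio well above 1 suggests a long right tail of large charges.
    """
    return mean_cost / median_cost


# Mean and median costs as reported in the text (1997 dollars).
leep_ratio = mean_to_median_ratio(564, 542)
exenteration_ratio = mean_to_median_ratio(51413, 19373)

print(round(leep_ratio, 2), round(exenteration_ratio, 2))
```

A ratio this far above 1 for exenteration is one reason the quartile values, rather than the mean alone, are useful inputs for sensitivity analyses.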

Cost by Stage of Disease

Costs associated with six categories were identified. These categories included: (1) low-grade squamous intraepithelial lesions (LSIL); (2) high-grade squamous intraepithelial lesions (HSIL); (3) Stage I; (4) Stage II; (5) Stage III; and (6) Stage IV. Table 32 specifies the ICD-9 codes used to distinguish the stages of cervical cancer.

Table 32. Identifying Stages of Cervical Cancer.


The cost by stage of cancer was only estimated for the 20-64 age group, as current treatment protocols were not analyzed for women 65 and older. To estimate the total cost associated with treating cervical cancer, all costs for inpatient admissions and outpatient visits with the relevant diagnosis were included. The costs by stage, estimated from the MEDSTAT data, are presented in Table 33. The average inpatient, outpatient, and total costs are provided for each stage. In addition, the number of hospital days is also indicated.

Table 33. Cost Associated with Diagnosis, Treatment, & Follow-up of Cervical Cancer by Stage of Disease.


Very few cases were identified as Stage III and, therefore, a combined cost estimate is provided for patients with Stage II and III cervical cancer. Because ICD-9 codes do not include stage, there may be some misclassification of recurrent or persistent disease as higher stage cancers. The costs associated with all the treatments women received were included in a single estimate. Thus, in Table 33, for Stage II/III and Stage IV, two separate cost estimates were provided: one indicating costs specifically related to that particular stage and another that included costs associated with treatment of earlier stages as well. Because stage does not change once initially assigned, these estimates presumably represent women with recurrent or persistent disease. As would be expected, the average total cost increased as the severity of the disease increased. The cost for diagnosing, treating, and providing follow-up care for LSIL is $1,728, compared with $17,645 and $40,280 for Stages I and IV, respectively.

In Table 34, the standard deviation of the cost estimates and the values for the first quartile, the median, and the third quartile are provided. The mean costs associated with the lower stages are very close to the median values, indicating that these cost values closely approximate a normal distribution. On the other hand, the costs for the later stages of cervical cancer (Stages II, III, and IV) are very skewed, and the mean value is determined to a large extent by a few cases with large average treatment costs. Again, this probably includes the care of women with persistent or recurrent disease, especially women with recurrent Stage I disease who subsequently undergo radical therapy such as exenteration. Because of the lack of precision in the ICD-9 codes, there is no way to determine the actual clinical scenario with certainty. It is therefore important to use a range of cost estimates to perform sensitivity analyses when these average cost estimates are used. For the cost-effectiveness analysis, we used higher estimates for the treatment of cancer cases in order to bias the analysis in favor of more sensitive screening strategies.

Table 34. Mean and Standard Deviation of Total Cost for Cervical Cancer by Stage of Disease.

Results: Review of Studies and Models of Effectiveness and Cost-effectiveness

Because no definitive randomized controlled trials have been published on the effectiveness of cervical cancer screening, policymakers have relied on decision modeling studies to integrate epidemiological data on the natural history of cervical cancer precursors, performance data on diagnostic tests for early cervical cancer or cervical cancer precursors, and cost data to draw conclusions about the relative benefits and costs of various screening strategies. Table 35, Studies Meeting Inclusion Criteria for Evidence Table 2, identifies 34 studies that modeled health or economic outcomes of alternative cervical cancer screening programs or alternative strategies for managing cytological abnormalities. These studies were found through a search of MEDLINE and a search of citations from relevant publications.

Table 35. Studies Meeting Inclusion Criteria for Evidence Table 2.


Alternative screening programs or strategies addressed by the studies included the following:

1.

Pap screening compared with no use of Pap screening

2.

Screening intervals for repeating Pap smear screening

3.

Rescreening of negative Pap smears

4.

Management of women found to have abnormal Pap smears

The studies used various approaches to address the question of the impact of Pap screening on health and economic outcomes. Some studies used cost-benefit analysis, and others used Markov modeling to estimate cost-effectiveness. Some studies used data from actual populations, and others used hypothetical cohorts combined with incidence and prevalence data. The best-quality models (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992) are the most often cited, and both have been reused for updated analyses of related questions. Eddy's (1990) model, which initially addressed age of onset for Pap screening and Pap screening intervals, was reused for the Blue Cross and Blue Shield Association's Technology Evaluation Center analysis of Papnet®, AutoPap®, and ThinPrep® (Brown and Garber, 1998) and for Radensky and Mango's (1998) analysis of Papnet® rescreening versus conventional Pap testing. The analysis by Fahs, Mandelblatt, Schechter et al. (1992), based on a 1990 report for the U.S. Congress Office of Technology Assessment (OTA), was reused for the analysis by Schechter (1996) of Papnet® cost-effectiveness.

Cost-effectiveness of Conventional Pap Testing

A series of comprehensive models of the cost-effectiveness of cervical cancer screening was initiated with a 1981 OTA report (U.S. Congress Office of Technology Assessment, 1981). Subsequently, efforts by Eddy (1990), Mandelblatt and Fahs (1988), and OTA (1990) (and a related study by Fahs, Mandelblatt, Schechter et al., 1992) addressed similar questions in somewhat different populations and perspectives. Eddy's target population for initial screening was asymptomatic 20-year-old women of average risk, whereas Fahs, Mandelblatt, Schechter et al. (1992) assessed women over the age of 65. Both analyses were conducted from the payer's perspective. All of these models used Markov processes, with the modification that transition probabilities change as patients or population age. Of all the models, Eddy's is considered to be the most general.

There are significant discrepancies in the results of different models of cervical cancer screening. For example, both Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) reported the marginal cost-effectiveness of triennial Pap smear screening for women age 65 or over under two conditions: (1) assuming no prior screening and (2) assuming previous regular screening. Triennial Pap screening beginning at age 65 in women with no prior screening was estimated to cost $22,448 per year of life saved, according to Eddy (1990), but was actually cost-saving in the Fahs, Mandelblatt, Schechter et al. (1992) analysis. When considering screening in women with prior regular screening, Eddy and Fahs et al. estimated different marginal cost-effectiveness ratios: $52,241 per year of life saved versus $33,572 per year of life saved, respectively (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992). It is difficult to assess which differences in model assumptions account for the differing results. However, sensitivity of Pap smears is known to strongly influence cost-effectiveness ratios. Eddy used 85 percent sensitivity in his base case analysis, whereas Fahs et al. used 75 percent sensitivity in their model. Another principal difference between Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) was whether the models assessed implementing a change in screening practices that might detect a large, one-time benefit by detecting prevalent cases in an unscreened population versus assessing the ongoing benefit one might see in a continuing policy. By attempting to recreate the results of Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) with the model created for this report, we were able to identify other differences, which are discussed in greater detail below.

Cost-Effectiveness of Three New Technologies

Two analyses have addressed the cost-effectiveness of Papnet®, which provides interactive neural-network rescreening, versus unassisted manual interpretation of Pap smears (Schechter, 1996; Radensky and Mango, 1998). Schechter (1996) revised a previously published model constructed for OTA that examined the cost-effectiveness of implementing Pap screening as a Medicare benefit (Fahs, Mandelblatt, Schechter et al., 1992; U.S. Congress Office of Technology Assessment, 1990). Parameters in the model were updated to reflect the general U.S. population of women undergoing Pap testing, and a Papnet® strategy was added using information on the sensitivity of the new technology supplied by the manufacturer. The primary result of this study, $48,474 per life-year saved for biennially screened women, is given in terms of the overall cost-effectiveness ratio (which compares Papnet® screening to no screening) rather than in terms of the conventional marginal (or incremental) cost-effectiveness ratio (which would require comparing Papnet® screening with conventional Pap screening). This makes it appear that Papnet® is highly cost-effective when, in fact, most of the cost-effectiveness is from conventional Pap testing; the incremental cost-effectiveness ratio for Papnet®-assisted testing compared with conventional Pap testing is much higher.
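The distinction between the overall and the incremental ratio can be illustrated with a minimal sketch. All costs and life expectancies below are hypothetical values chosen only to show the arithmetic; they are not taken from Schechter (1996) or from any table in this report.

```python
def cost_per_life_year(cost_new, life_years_new, cost_ref, life_years_ref):
    """Extra cost per extra life-year of a strategy relative to a comparator."""
    return (cost_new - cost_ref) / (life_years_new - life_years_ref)


# HYPOTHETICAL per-woman lifetime cost ($) and life expectancy (years).
no_screening = (100.0, 25.000)
conventional_pap = (500.0, 25.040)
assisted_pap = (560.0, 25.041)

# Overall ratio: assisted screening compared with no screening at all.
overall_ratio = cost_per_life_year(*assisted_pap, *no_screening)

# Incremental ratio: assisted screening compared with conventional Pap.
incremental_ratio = cost_per_life_year(*assisted_pap, *conventional_pap)

print(round(overall_ratio), round(incremental_ratio))
```

With these made-up inputs the overall ratio looks favorable because it absorbs the large benefit of conventional screening, while the incremental ratio isolates the small added benefit bought by the added cost.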

Radensky and Mango (1998) adapted Eddy's (1990) model to estimate the marginal cost-effectiveness ratio of using interactive neural network-assisted (INNA) rescreening of negative smears compared with unassisted manual examination of cervical smears at a frequency of every 3 years. The cost-effectiveness ratios ranged from $39,087 to $79,440, depending on the sensitivity of the Pap smear examination and on the specificity of INNA rescreening. More frequent screening intervals than every 3 years were not examined in this analysis.

Brown and Garber (1998) performed a cost-effectiveness analysis of Papnet®, AutoPap®, and ThinPrep® for cervical cancer screening for the Blue Cross and Blue Shield Association's Technology Evaluation Center. This analysis was based upon the model by Eddy (1990). Results from this model show that Papnet®, AutoPap®, and ThinPrep® modestly improve on the diagnostic accuracy of conventional Pap screening. These screening technologies are likely to increase life expectancy by only 1 or 2 days for most women and only by a few hours for those women who already get Pap tests frequently. The marginal cost-effectiveness ratios of the new technologies were found to exceed $80,000 per additional year of life saved with biennial screening and to be far greater with annual screening. However, the cost-effectiveness ratios are sensitive to the costs of each technology, sensitivity of the initial test, prevalence of cervical cancer, and proportion of tumors that grow rapidly.

Results: Cost-Effectiveness Analysis

For the cost-effectiveness analysis, we compared conventional Pap smears at 1-, 2-, and 3-year intervals, with 10 percent manual rescreening of smears initially read as normal, with two types of technologies: one that improves the sensitivity of the initial screening step (10 percent of initially normal smears are rescreened), and one in which the sensitivity of the initial screening step is unchanged but 100 percent of initially normal smears are rescreened with improved sensitivity.

We attempted to define thresholds of cost, sensitivity, and specificity at which improved initial or rescreening technology would produce cost-effectiveness ratios of $50,000 per life-year or less. Because of the significant uncertainty surrounding both the effectiveness and the incremental costs of the new technologies, we did not attempt to estimate cost-effectiveness ratios for any specific technology.

The section begins with tables illustrating the effect of increasing conventional Pap sensitivity. We examine the effect of reducing the false negative rate by 40 percent, 60 percent, and 90 percent, using any technology, on life expectancy, cases of cancer, cancer deaths, and morbid treatments predicted by the model at 1-, 2-, and 3-year screening intervals. We also present estimates of the cumulative number of cases and deaths predicted for various age groups with each alternative screening strategy.

Because increasing the frequency of Pap smear screening is a legitimate option for increasing the sensitivity of a screening program, we present the results as incremental costs per life-year saved, comparing all possible combinations of technology and frequency.

Terminology

The sensitivity of a test is defined as the conditional probability that, given the presence of disease, the test will be positive. In the setting of our model, where the presence of disease is defined by the various Markov states, the sensitivity is equivalent to the true positive rate. The false negative rate (FNR) is equal to the quantity:
FNR = 1 - sensitivity.

Because modeling probabilities is easiest with variables between 0 and 1, we expressed the potential improvement in Pap test performance as a proportional reduction, X, in the FNR, so that the FNR is reduced by the quantity:
X × (1 - sensitivity).

A reduction in the FNR is equivalent to improving sensitivity within the context of the model. The two expressions are used interchangeably throughout this section of the report.
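The relationship between a proportional FNR reduction and the resulting sensitivity can be sketched as follows. The function name is illustrative, and the baseline sensitivity of 0.50 is a hypothetical value used only for the example; the base case value used in the model is given in Table 37.

```python
def improved_sensitivity(sensitivity, fnr_reduction):
    """Sensitivity after the FNR is reduced by the proportion fnr_reduction.

    Both arguments are probabilities between 0 and 1.
    """
    fnr = 1.0 - sensitivity
    reduced_fnr = (1.0 - fnr_reduction) * fnr
    return 1.0 - reduced_fnr


# HYPOTHETICAL baseline sensitivity; the three reductions are those examined in the text.
baseline = 0.50
results = {x: improved_sensitivity(baseline, x) for x in (0.4, 0.6, 0.9)}

for x, s in results.items():
    print(f"FNR reduced by {x:.0%}: sensitivity = {s:.2f}")
```

Expressed this way, the improved sensitivity equals the original sensitivity plus X × (1 - sensitivity), so it can never exceed 1 for any X between 0 and 1.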

Base Case Parameter Estimates

Sources of parameter estimates are described in detail in the section on Methodology. In Tables 36 and 37 these estimates are summarized for the base case implementation of the model.

Table 36. Probability Estimates for Natural History Model.


Table 37. Probability Estimates for Screening.


Table 38 illustrates the calculation of the total sensitivity and specificity for conventional Pap testing with 10 percent rescreening of initially normal smears, improved initial screening with 10 percent rescreening of normal smears, and improved rescreening of 100 percent of initially normal smears.

Table 38. Total Sensitivity of Conventional Pap with 10 percent Rescreening, Improved Initial Screening with 10 percent Rescreening, and Improved Rescreening of 100 percent of Initially Normal Smears.


Effect of Increasing Conventional Pap Sensitivity

Although life expectancy is the most common measure of effectiveness used in economic analyses of health policies, there are other important measures. We first present the predicted effects of various screening strategies on cervical cancer incidence, mortality, and treatment. Then we present the potential impact of improving the sensitivity of cervical cancer screening strategies on life expectancy, cost, and cost-effectiveness.

Cancer Cases and Mortality Avoided

Table 39 shows the effect of reducing the Pap smear false negative rate by 40 percent, 60 percent, and 90 percent on lifetime cancer incidence and mortality. Table 40 shows the predicted cumulative number of cervical cancer cases and deaths per 100,000 between ages 15 and 50, 15 and 65, and 15 and 85 under each strategy.

Table 39. Effect of Reducing Pap Smear False Negative Rate by 40 percent, 60 percent, and 90 percent at Various Screening Intervals on Lifetime Cancer Incidence and Mortality.


Table 40. Cumulative Number of Cancer Cases and Deaths by Ages 50, 65, and 85 by Reducing Pap Smear False Negative Rate 40 percent, 60 percent, and 90 percent at Various Screening Intervals.


Table 39 illustrates that, as expected, progressively better screening methods lead to fewer cervical cancer cases and deaths, as does progressively more frequent screening. In Table 40 we see that at every level of improved test performance, the majority of cervical cancer cases and deaths in screened women occur in women younger than 50 years. The proportion of cases in younger women actually increases as sensitivity increases. For example, without screening, 47.7 percent of cases and 42.6 percent of deaths occur in younger women. With Pap screening every 2 years, the proportions are 65.7 percent of cases and 65.8 percent of deaths. For women screened yearly with a test 1.9 times more sensitive than conventional Pap (i.e., FNR reduced by 0.9), the proportions are 71.1 percent of cases and 73.3 percent of deaths.

In infrequently screened populations, part of this phenomenon is a result of detection of early cancer prior to the development of symptoms. In frequently screened populations, this is more consistent with variation in tumor biology, where some tumors will be more rapidly growing. It is also consistent with length bias, where screening tests will be more likely to detect slower growing, less aggressive tumors. This prediction has several implications. First, improving the sensitivity of cervical cancer screening will reduce the overall numbers of cervical cancer cases and deaths, but it will not substantially change the proportion of cases and deaths that occur in younger women. Second, changes in the age distribution of cancer cases may be related to the success of screening programs as well as to changes in the epidemiology of HPV.

Major Interventions Avoided

A successful screening strategy will shift the distribution of cases primarily to Stages I and II. The model predicts that very sensitive strategies will result in 100 percent Stage I tumors. Table 41 shows the expected number of lifetime cervical cancer treatments resulting from progressively better screening tests.

Table 41. Expected Number of Lifetime Cervical Cancer Treatments by Reducing Pap False Negative Rate by 40 percent, 60 percent, and 90 percent at Various Screening Intervals.


These results show that increasing the sensitivity of the screening strategy by increasing the frequency and/or decreasing the FNR of the screening test results in significant decreases in morbid treatments. The relative proportion of treatments for early stage disease -- simple and radical hysterectomy, and radiation only -- also increases with increased sensitivity, as the distribution of cases shifts to primarily Stage I and II. As with all other outcomes, the incremental improvement with each successive strategy is small.

The shifts in stage distribution resulting from improved screening sensitivity would clearly reduce the number of cervical cancer treatments, many of which have significant short- and long-term morbidity. Given the potential impact of these treatments on quality of life, it is possible that adjustment for the effects of morbidity associated with cervical cancer treatment would alter cost-effectiveness estimates based on life expectancy alone. Further research on measurement of quality of life in women with cervical cancer is clearly needed.

Cost-Effectiveness: Costs per Life-Year Saved

In Table 42, we present the model results for the base case cost-effectiveness of various combinations of screening strategy and frequency. These results are based on the assumption that the relative reduction in FNR for each step is identical for each technology. However, as illustrated by Table 38, we assumed that the FNR reduction of the rescreening step for the improved initial screening technology is identical to that of the initial screening step, which means that this technology will have an overall sensitivity that is slightly higher than that seen with conventional Pap testing and 100 percent rescreening. Furthermore, it is based on an estimated incremental cost for each improved test of $10. This cost estimate is within the range of estimated incremental costs for all three of the currently available technologies.

Table 42. Base Case: Costs per Life-year Saved, 60 Percent Reduction in False Negative Rate.


Because improved initial screening with 10 percent rescreening has a slightly higher sensitivity than improved rescreening only, it is both less expensive and more effective under the assumptions of identical incremental costs and relative specificity. In addition, the slightly higher specificity compared with improved rescreening also reduces relative costs. Eliminating strategies dominated by conventional or extended dominance, the ranking of screening strategies in terms of cost-effectiveness is shown in Table 43.

Table 43. Screening Strategies Ranked by Cost-Effectiveness.

We can see from this table that in the base case, all strategies involving improved rescreening are dominated. These results were not affected by varying the relative improvement in sensitivity from 0.4 to 0.9; however, the cost per life-year saved varied substantially (Table 44).
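The elimination of dominated strategies described above can be sketched in a few lines. This is a minimal illustration of the standard procedure for removing strategies by conventional and extended dominance, not the model used in this report; the function name `efficient_frontier` and the tuple layout are assumptions made for the example.

```python
def efficient_frontier(strategies):
    """Return [(name, icer)] for the strategies that survive dominance.

    strategies: iterable of (name, cost, life_years) tuples (hypothetical layout).
    The cheapest surviving strategy has icer None (it has no comparator).
    """

    def icers(rows):
        # Incremental cost per life-year versus the next cheaper survivor.
        out = [None]
        for prev, cur in zip(rows, rows[1:]):
            out.append((cur[1] - prev[1]) / (cur[2] - prev[2]))
        return out

    # Order by cost; among equal costs, consider the more effective first.
    ordered = sorted(strategies, key=lambda s: (s[1], -s[2]))

    # Conventional dominance: drop a strategy that is no more effective
    # than a strategy costing the same or less.
    kept = []
    for s in ordered:
        if not kept or s[2] > kept[-1][2]:
            kept.append(s)

    # Extended dominance: drop a strategy whose ratio is higher than that of
    # the next more effective strategy; repeat until the ratios rise steadily.
    while True:
        ratios = icers(kept)
        drop = next(
            (i for i in range(1, len(kept) - 1) if ratios[i] > ratios[i + 1]),
            None,
        )
        if drop is None:
            return [(s[0], r) for s, r in zip(kept, ratios)]
        del kept[drop]
```

Called with a list of strategy names, average costs, and average life expectancies (for example, values read from a summary table), the function returns only the strategies on the efficient frontier, each with its incremental cost per life-year relative to the next less costly survivor.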

Table 44. Screening Strategies Ranked by Cost-effectiveness, for Increasing Reductions in FNR.

We can explain these findings as follows: At screening intervals of less than 3 years, the majority of lesions detected by improving sensitivity are LSIL, which are less likely to progress. Therefore, the extra costs of diagnosis and treatment of these low-grade lesions outweigh the increase in life expectancy and the decrease in costs related to invasive cervical cancer diagnosis and treatment, resulting in a higher cost per life-year as sensitivity increases. With screening every 3 years, enough significant lesions are detected that increasing sensitivity (i.e., reducing FNR) improves life expectancy. Similar results were obtained using an incremental cost of $5 per slide: incremental cost-effectiveness ratios were less than $50,000 per life-year at screening intervals of 3 years and decreased as the incremental sensitivity increased, but cost-effectiveness ratios were consistently above $50,000 per life-year at screening intervals of 1 or 2 years and increased as sensitivity increased.

Improving the sensitivity of the initial screening step had consistently favorable cost-effectiveness ratios compared with improving the sensitivity of the rescreening step provided that (1) a certain proportion of smears read as normal with the improved initial technology were rescreened at the same improved sensitivity, (2) the incremental costs per slide of both screening technologies were identical, (3) the sensitivity of the initial screening technology was at least as high as that of the rescreening technology, and (4) the specificity of the two technologies relative to conventional Pap smears was identical. We next present the results of threshold analyses performed to identify thresholds where (1) a greater reduction in FNR, (2) higher specificity relative to improved initial screening, or (3) lower cost for a rescreening technology would favor it over an initial screening technology, followed by sensitivity analyses of model and test parameters on cost-effectiveness.

Threshold Analysis

Improved Reduction in FNR

If both types of technology add $10 to the cost of a Pap test, and the initial screening technology reduces the FNR of the conventional Pap smear by 0.6, then the threshold reduction in FNR for the rescreening technology is 0.85. Inputs are shown in Table 45. Results are shown in Table 46 with dominated strategies eliminated.

Table 45. Parameter Inputs for Threshold Analysis of Improved Rescreening Technology Reduction In FNR.

Table 46. Incremental Cost/Life-year Saved with Reduction in FNR of Rescreening Technology of 0.85, Compared with Initial Screening Technology Reduction in FNR of 0.6.

Varying the incremental cost of the technologies from $5 to $15 did not significantly change the threshold value. With incremental costs at $10 for both technologies, and a relative reduction in FNR of 0.4 for the initial screening technology, a relative reduction in FNR of 0.5 for the rescreening technology resulted in cost-effectiveness ratios of less than $50,000 per life-year at 3-year intervals. At a reduction in FNR of 0.9 for the initial screening technology, a relative reduction of more than 0.99 for the rescreening technology would be needed.

Decreased Specificity of Initial Screening Step

Changing the assumption of equal specificity of the two types of technology relative to conventional Pap substantially changed the rankings. Inputs are shown in Table 47. Results are shown in Table 48 with dominated strategies eliminated.

Table 47. Parameter Inputs for Threshold Analysis of Relative Specificity.

Table 48. Incremental Cost per Life-year Saved with Relative Specificity of Initial Screening Step of 0.96 Compared With Conventional Pap or Rescreening Technology.

Because the overall sensitivity of the improved initial screening step is slightly higher than improved rescreening at every screening level, the overall increase in life expectancy will be slightly higher. However, with only a 4 percent reduction in relative specificity, the excess diagnostic costs make improved rescreening at 3-year intervals the only strategy with a cost-effectiveness ratio of less than $50,000 per life-year. Given that the overwhelming majority of screened patients will not have LSIL, HSIL, or cancer, a small decrease in relative specificity can greatly increase the overall costs of screening of the cohort.
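The arithmetic behind this point can be illustrated with a short calculation. The sketch below assumes a hypothetical cohort of 100,000 screened women with a 5 percent prevalence of any lesion (both values are illustrative assumptions, not model inputs), the 97 percent specificity used for conventional Pap elsewhere in this report, and the relative specificity of 0.96 examined in Table 48.

```python
def false_positives(n_screened, prevalence, specificity):
    """Expected false positive smears in one round of screening."""
    disease_free = n_screened * (1.0 - prevalence)
    return disease_free * (1.0 - specificity)


N_SCREENED = 100_000         # hypothetical cohort size
PREVALENCE = 0.05            # hypothetical; for illustration only
BASE_SPECIFICITY = 0.97      # conventional Pap specificity cited in this report
RELATIVE_SPECIFICITY = 0.96  # relative specificity examined in Table 48

baseline = false_positives(N_SCREENED, PREVALENCE, BASE_SPECIFICITY)
reduced = false_positives(
    N_SCREENED, PREVALENCE, BASE_SPECIFICITY * RELATIVE_SPECIFICITY
)
print(f"False positives per round: {baseline:.0f} vs. {reduced:.0f}")
```

Under these assumed inputs, the expected number of false positive smears per screening round rises from about 2,850 to about 6,536, more than doubling the number of women without disease who incur diagnostic costs.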

Incremental Costs

We next identified a cost threshold where, holding sensitivity and specificity equal, improved rescreening would be favored. Inputs are shown in Table 49. Results are shown in Table 50 with dominated strategies eliminated.

Table 49. Parameter Inputs for Threshold Analysis on Incremental Cost per Slide of Rescreening Technology.

Table 50. Incremental Cost per Life-year Saved of Screening Strategies Using Incremental Cost of Rescreening of $3 per Slide Compared with $10 per Slide for Initial Screening Technology.

Summary of Threshold Analysis

Given equivalent improvements in sensitivity, equivalent incremental costs, and equivalent relative specificities, a technology that improves the sensitivity of the primary screening step will always be more effective if a certain proportion of slides initially read as "normal" are rescreened. In this scenario, improved primary screening every 3 years is the only strategy with an incremental cost per life-year of $50,000 or less. We were able to identify thresholds of improved sensitivity (reduced FNR), improved relative specificity, or reduced cost where improved rescreening every 3 years was the preferred strategy, at a cost per life-year of $50,000 or less. We could not identify any strategy where screening more often than every 3 years, using any technology, including conventional Pap smear screening, had a cost-effectiveness ratio of less than $50,000. Table 51 summarizes our threshold values for an incremental cost of $10 for the initial screening technology.

Table 51. Threshold Value Where Improved Rescreening Every 3 years Favored Over Improved Initial Screening at Cost per Life-year of $50,000 or Less.

There are several patterns that emerge from these results. First, as the reduction in FNR of the initial screening technology increases, the additional relative reduction in FNR required for a rescreening technology to be favored also increases. Proportionately greater improvements in detection are needed as the initial screening technology detection rate increases. Second, the relative decrease in specificity required from the initial screening technology in order to favor the rescreening technology decreases as sensitivity increases -- as the overall diagnostic costs increase, relatively small increases in false positive rates have a greater impact on costs. Finally, if test characteristics are equivalent, the incremental cost of the rescreening technology must be substantially lower in order for rescreening to be favored. The threshold differences decrease as sensitivity increases -- again, since most of the lesions detected by increased sensitivity do not contribute substantially to mortality risk, relatively small reductions in overall costs can cause large changes in the cost-effectiveness ratios.

These results have several important implications. First, small decreases in specificity can significantly affect the cost-effectiveness estimates of any technology that improves sensitivity. This makes sense intuitively: Since a large majority of the population, even a "high-risk" population, will be "normal" at any given point in time, a small increase in the false positive rate can lead to substantial increases in cost. Because the incremental gain in life expectancy is small (because of the relative rarity of the disease and the high survival rate for early invasive disease), the cost-effectiveness ratio increases rapidly as specificity decreases. Second, there are likely to be significant interactions between sensitivity, specificity, and cost-effectiveness thresholds. We did not perform formal two-way analyses, but clearly there will be combinations of sensitivity and specificity on either side of the threshold that result in one strategy being favored over the other. Third, the range of relative sensitivity, specificity, and incremental cost for the initial and rescreening technologies that we evaluated is well within the range reported in the literature or claimed by the manufacturers of ThinPrep®, AutoPap®, and Papnet®. Given the uncertainty surrounding these estimates, it is possible that all three technologies fall within accepted ranges of cost-effectiveness at 3-year screening intervals. No strategy or technology used for screening more often than every 3 years results in estimates of less than $50,000 per life-year. Under some scenarios, decreasing the screening frequency to every 3 years and using an improved technology was more effective, at an acceptable cost, than using conventional Pap screening every 2 years.

Sensitivity Analyses

To see the relative effects of varying other model parameters on cost-effectiveness estimates, we elected to use sensitivity and specificity estimates for initial and rescreening technologies that resulted in incremental cost-effectiveness ratios of $50,000 per life-year or less for both technologies. Based on the threshold analysis above, these estimates were a reduction in FNR of 0.6 for initial screening and 0.85 for rescreening. Incremental costs for both were set at $10 and relative specificities were set at 1.0. Thus, the sensitivity analyses on model parameters were performed using parameters for the two types of technologies that resulted in cost-effectiveness ratios of $50,000 per life-year or less for both types in the base case model.

Diagnostic Strategies

Our threshold analyses suggested that a large portion of the excess costs associated with reduction of FNR is a result of the cost of diagnosis and treatment of low-grade lesions. Our base case assumption was that smears originally read as ASCUS would be repeated in 6 months, an assumption that should reduce overall diagnostic costs, but that might miss high-grade histologic lesions with cytologic diagnoses of ASCUS.

We compared an aggressive diagnostic strategy of colposcopy for all Pap smears with ASCUS or higher to one where initial ASCUS results were repeated after 6 months. However, given the results of the threshold analysis above, it is important to bear in mind that, if the reduction in FNR of the rescreening step is less than 0.85 or the relative specificity of the initial screening step is less than 0.97, then only the alternate strategy would have a cost per life-year saved of less than $50,000 at 3-year intervals. The results are seen in Table 52.

Table 52. Comparison of Cost per Life-year Gained for Different Strategies for Managing ASCUS Smears: Repeat Pap Smear versus Immediate Colposcopy.

Changing the strategy does not change the rankings. Again, because the bulk of the lesions detected through improved sensitivity will be low-grade lesions, an aggressive diagnostic workup will increase costs with minimal gain in life expectancy.

Cost of Diagnosis and Treatment of Low-Grade Lesions

We next varied the cost of diagnosis and treatment of low-grade lesions from the base case value of $1,728, the mean value of the MEDSTAT data, to $675, the 25th percentile. The results are summarized in Table 53.

Table 53. Effect of Reducing the Cost of Diagnosis and Treatment of LSIL on Incremental Cost-Effectiveness.

Decreasing the cost of managing LSIL significantly reduces the overall cost per life-year gained, especially at lower screening frequencies, but does not alter the ranking or put any other strategies below the $50,000 per life-year threshold.

Incidence of Cervical Cancer

We varied the incidence of cervical cancer in two ways: by increasing the age-specific incidence rate of HPV infection 1.5 times and by increasing the progression rate from HSIL to Stage I cancer from 40 percent in 12 years to 40 percent in 8 years. The results are seen in Table 54.
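To give a sense of scale for the change in progression rate, a cumulative probability over several years can be converted to an equivalent per-year probability. The sketch below assumes a constant annual probability of progression, which is a simplifying assumption made for illustration and not necessarily how the model applies these rates; the function name `annual_probability` is hypothetical.

```python
def annual_probability(cumulative_probability, years):
    """Constant annual probability that yields the given cumulative
    probability over `years` (assumes the same probability every year)."""
    return 1.0 - (1.0 - cumulative_probability) ** (1.0 / years)


base_case = annual_probability(0.40, 12)  # 40 percent in 12 years
faster = annual_probability(0.40, 8)      # 40 percent in 8 years
print(f"{base_case:.4f} per year vs. {faster:.4f} per year")
```

Under the constant-probability assumption, 40 percent in 12 years corresponds to roughly 4.2 percent per year, and 40 percent in 8 years to roughly 6.2 percent per year.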

Table 54. Effect on Cost per Life-year Gained of Increasing Incidence of HPV Compared with Increasing HSIL Progression Rate as Mechanisms for Increasing Cervical Cancer Incidence.

When the increase in peak incidence is due to an increase in HPV incidence, improved initial screening every 3 years is more expensive but more effective than conventional Pap smears every 2 years. When the increase in peak incidence (and lifetime risk) is due to a more rapid progression of HSIL lesions to cancer, improved initial screening every 3 years is less expensive and more effective than conventional Pap every 2 years. These results illustrate several points. First, although the cost-effectiveness ratios decrease with increasing cancer incidence, very sensitive tests at frequent intervals are still well above the $50,000 threshold. Second, lifetime risk of cancer alone is not the only important component in determining cost-effectiveness. Clearly, the natural history of premalignant changes plays an important role in determining the degree of benefit obtained from a screening program. Third, changing different parameters within the model can result in similar patterns of cancer incidence. This may well explain some of the difference in results between models. Additional data on the natural history of HPV infection and premalignant and malignant changes, as well as further modeling efforts, are needed.

Baseline Sensitivity and Specificity of Conventional Pap Smears

We tested the impact of changing the sensitivity and specificity of conventional Pap to 78.8 percent sensitivity and 88.8 percent specificity, the mean results for each test characteristic in our meta-analysis (Table 55). These estimates are also close to those of Brown and Garber (1998).

Table 55. Effect on Cost per Life-year Gained of Changing Sensitivity and Specificity of Conventional Pap Smear Screening.

Technologies that reduce the FNR of the conventional Pap smear when the FNR is 21.2 percent (sensitivity = 0.788) result in increased cost-effectiveness ratios, and no strategy falls below the $50,000 per life-year threshold. If relative specificity were also lowered, ratios would be even higher. These higher base-case estimates also explain some of the differences between our findings and those of other authors.

Another important point is that increasing the sensitivity and decreasing the specificity, even without incremental costs associated with a new technology, significantly increases the cost-effectiveness ratio.
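The relationship between a relative reduction in FNR and the resulting test sensitivity can be written out explicitly. The sketch below reads a "reduction in FNR of 0.6" as a 60 percent relative reduction, consistent with the title of Table 42; the function name `improved_sensitivity` is a hypothetical helper, not part of the model.

```python
def improved_sensitivity(conventional_sensitivity, fnr_reduction):
    """Sensitivity after a relative reduction in the false negative rate.

    fnr_reduction is the fraction of false negatives removed (0.6 = 60 percent).
    """
    conventional_fnr = 1.0 - conventional_sensitivity
    improved_fnr = conventional_fnr * (1.0 - fnr_reduction)
    return 1.0 - improved_fnr


# Sensitivity values cited in this report: 51 percent and 78.8 percent.
for sensitivity in (0.51, 0.788):
    print(sensitivity, round(improved_sensitivity(sensitivity, 0.6), 4))
```

With a 60 percent reduction in FNR, a conventional sensitivity of 51 percent rises to 80.4 percent (a gain of about 29 percentage points), whereas a conventional sensitivity of 78.8 percent rises to 91.5 percent (a gain of about 13 points). The same relative improvement therefore yields fewer additional detected lesions when the baseline sensitivity is higher.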

Effect of Hysterectomy Incidence

We also examined the effect of our inclusion of hysterectomy risk on cost-effectiveness. The results are seen in Table 56.

Table 56. Effect of Adjustment for Hysterectomy Rate on Cost per Life-year Gained.

At higher screening frequencies, the cost-effectiveness ratios increase somewhat when hysterectomy rates are not included. There are two reasons for this. First, including hysterectomy reduces the number of tests during the lifetime of the cohort, an effect that is most obvious at higher screening frequencies. Second, the actual risk of cervical cancer in women without hysterectomy is higher than the population-based risk that does not correct for hysterectomy, since the denominator in the incidence fraction is smaller. However, the inclusion of hysterectomy risk does not change the rankings of strategies.
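The denominator effect can be illustrated with a simple correction. The sketch below is an assumption-laden example, not the adjustment used in the model: the function name `incidence_in_women_at_risk` and the input values are hypothetical, and it assumes that all observed cancers arise in women who have not had a hysterectomy.

```python
def incidence_in_women_at_risk(population_incidence, hysterectomy_prevalence):
    """Incidence among women with an intact cervix, given the incidence
    computed over all women and the fraction who have had a hysterectomy."""
    if not 0.0 <= hysterectomy_prevalence < 1.0:
        raise ValueError("hysterectomy_prevalence must be in [0, 1)")
    return population_incidence / (1.0 - hysterectomy_prevalence)


# Hypothetical inputs: 10 cases per 100,000 women, 20 percent with hysterectomy.
corrected = incidence_in_women_at_risk(10.0, 0.20)
print(f"{corrected:.1f} per 100,000 women at risk")
```

With these hypothetical inputs, a population-based rate of 10 per 100,000 corresponds to 12.5 per 100,000 among women who remain at risk.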

Discount Rate

We examined the effect of varying the discount rate from 0 to 5 percent (Table 57). As expected, the discount rate does affect the size of the cost-effectiveness estimate. For improved rescreening every 3 years, or for improved initial screening every 2 years, the discount rate does affect whether the strategy meets the $50,000 per life-year threshold.
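The influence of the discount rate follows from the timing of costs and benefits: screening costs are incurred early, whereas most of the gain in life expectancy occurs decades later. The sketch below shows the standard present-value calculation under the assumption that values in the first year are undiscounted; the function name `present_value` and the 30-year delay are illustrative assumptions.

```python
def present_value(yearly_values, discount_rate):
    """Discounted sum of a stream of yearly values.

    yearly_values[t] is the value accrued t years from the start;
    the value at t = 0 is not discounted.
    """
    return sum(
        value / (1.0 + discount_rate) ** t
        for t, value in enumerate(yearly_values)
    )


# Hypothetical: one life-year gained 30 years after the start of screening.
stream = [0.0] * 30 + [1.0]
for rate in (0.0, 0.03, 0.05):
    print(rate, round(present_value(stream, rate), 3))
```

A life-year gained 30 years in the future is worth about 0.41 life-years at a 3 percent discount rate and about 0.23 life-years at 5 percent, which is why ratios near the $50,000 threshold can cross it as the discount rate changes.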

Table 57. Effect of Varying Discount Rates on Cost-Effectiveness.

Age at Beginning of Screening

We compared the cost per life-year saved (at a 0 percent discount rate) at various starting ages for screening: 15 (the base case), 18 (ACOG recommendations), 20 (the starting point for other models), 35, 50, and 65 (the starting point for the model of Fahs, Mandelblatt, Schechter et al., 1992). The results are shown in Table 58. Dominated strategies are not excluded; with exclusion of strategies that are dominated by conventional or extended dominance, no strategy at a frequency of more than every 3 years costs less than $50,000 per life-year.

Table 58. Effect of the Age at Which Cervical Cytological Screening Begins on Incremental Cost-Effectiveness.

Results were similar at discount rates of 3 percent and 5 percent: cost-effectiveness ratios decrease until age 35 and then increase with increasing age. These findings are consistent with those of Gustaffson and Adami (1992). Using age-specific cancer incidence curves similar to ours, their model predicted that focusing screening efforts in the mid-30s age range was the most efficient strategy. Given the natural history of the disease, this is not a surprising finding. HPV infection and LSIL peak in the early 20s, whereas HSIL is a disease primarily of the 30s, and cancer a disease of the late 40s and 50s. A large proportion of LSIL will spontaneously regress, the majority of the HSIL lesions that progress do so slowly, and cancer incidence peaks in the late 40s. Beginning screening at the time when those lesions destined to regress have done so, and before the cancer incidence begins to reach its peak, is the most efficient strategy for increasing life expectancy. This is especially true given the prediction of our model that those cancers that do occur in younger women are less likely to be detected by any regular screening program (see Table 40).

Fixed Screening Frequencies

As demonstrated above, there is an interaction between screening frequency and test characteristics in determining cost-effectiveness. There may be some settings where reducing screening frequency is not a practical option. We therefore present estimates for the cost-effectiveness of an improved initial screening technology and an improved rescreening technology at fixed screening intervals. Conventional Pap smear screening is compared with no screening, not to conventional Pap smear screening at more or less frequent intervals. Costs and days of life are discounted at 3 percent, the incremental cost for each technology is $10, the relative specificity of each is 1.0, the rescreening technology has a reduction in FNR of 0.85, and the initial screening technology has a reduction in FNR of 0.6. Results are summarized in Table 59.

Table 59. Cost per Life-year Gained of an Improved Initial Screening Technology and an Improved Rescreening Technology Compared with Conventional Pap Smear at Fixed Screening Intervals.

The addition of either technology to annual screening does not meet the $50,000 per life-year threshold. At every 3 years, both technologies do meet that threshold under these assumptions but, as illustrated above, reasonable changes in costs or test characteristics can change both the actual cost estimate and the favored strategy. At every 2 years, it is possible that, given these assumptions and no comparison to less frequent screening at higher sensitivity, one (or both) technologies might meet the $50,000 threshold. However, in several scenarios above, less frequent testing with a more sensitive test was both less expensive and more effective than alternatives with higher screening frequencies. Policymakers considering this information should look at all reasonable alternatives for improving the cost-effectiveness of cervical cancer prevention programs.

Summary Tables

Given the degree of uncertainty surrounding estimates of both the costs and effectiveness of the new technologies, we present tables that allow calculation of predicted cost-effectiveness ratios under varying assumptions (Tables 60-62). Each table presents the average costs and life expectancy (discounted at 3 percent), number of cervical cancer cases, and number of cervical cancer deaths, for a cohort of women followed from ages 15 to 85. These outcomes are tabulated for no screening; conventional Pap screening every 1, 2, and 3 years; and for reductions in conventional Pap false negative rates of 0.4, 0.6, and 0.9. The three tables present the costs associated with incremental costs for these reductions in FNR of $10 (our base case), $5, and $15.

Table 60. Average Costs, Average Life Expectancy, Number of Cervical Cancer Cases Expected/100,000, and Number of Cancer Deaths/100,000, at Various Levels of Test Sensitivity and Specificity (Incremental cost to increase test sensitivity: $10. Costs and life expectancy discounted 3 percent).

Table 61. Average Costs, Average Life Expectancy, Number of Cervical Cancer Cases Expected/100,000, and Number of Cancer Deaths/100,000, at Various Levels of Test Sensitivity and Specificity (Incremental cost to increase test sensitivity: $5. Discount rate for costs and life expectancy 3 percent).

Table 62. Average Costs, Average Life Expectancy, Number of Cervical Cancer Cases Expected/100,000, and Number of Cancer Deaths/100,000, at Various Levels of Test Sensitivity and Specificity (Incremental cost to increase test sensitivity: $15. Discount rate for costs and life expectancy 3 percent).

In using these tables to estimate cost-effectiveness of a technology that improves the overall sensitivity of conventional Pap smears (i.e., reduces the FNR), the reader can choose an estimate for incremental cost and reduction in FNR and then use the table to calculate incremental costs per life-year gained, cancer case prevented, or cancer death prevented.

For example, if the estimated incremental cost of a new technology is $10 per slide, and the estimated reduction in FNR is 0.6, then the incremental cost per life-year saved for that technology every 3 years would be:
(Average cost for every-3-year screening at FNR reduction of 0.6 − Average cost for every-3-year conventional Pap) / (Life expectancy for every-3-year screening at FNR reduction of 0.6 − Life expectancy for every-3-year conventional Pap).
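The calculation above can be expressed as a small helper. This is a sketch only: the function name `incremental_cost_per_life_year` is hypothetical, and the numbers in the example are placeholders chosen for illustration, not values taken from Tables 60-62.

```python
def incremental_cost_per_life_year(
    cost_new, cost_conventional, life_years_new, life_years_conventional
):
    """Incremental cost per life-year saved of a new strategy compared
    with conventional Pap screening at the same interval."""
    extra_cost = cost_new - cost_conventional
    extra_life_years = life_years_new - life_years_conventional
    if extra_life_years <= 0:
        raise ValueError("new strategy does not increase life expectancy")
    return extra_cost / extra_life_years


# Placeholder inputs (NOT from the report's tables): average discounted cost
# in dollars and average discounted life expectancy in years per woman.
ratio = incremental_cost_per_life_year(1050.0, 1000.0, 28.510, 28.509)
print(f"${ratio:,.0f} per life-year saved")
```

The same function applies to cancer cases or deaths prevented by substituting the difference in cases or deaths per woman for the difference in life expectancy, with the sign reversed so that the denominator is positive.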

Validation of the Model

The model was validated against two types of external data: epidemiological data and previously published models of cervical cytological screening.

Validation Against Epidemiological Data

The model predictions of age-specific prevalence of HPV infection, LSIL, and HSIL were compared with external epidemiological data from the literature and expected consequences of changing various parameters.

Age-specific prevalence of HPV infection in women with normal cytology

The age-specific prevalence of HPV infection in women with normal cytology predicted by the model using base case estimates is shown in Figure 22. These prevalence figures are consistent with cross-sectional data in low-risk populations (Bauer, Hildesheim, Schiffman et al., 1993). The estimate for HPV prevalence in women over 50 years of age is at the high end of the reported range, but high estimates would result in a higher incidence of cancer at later ages, thus increasing the overall lifetime risk of cancer and favoring more sensitive screening strategies.

Figure 22. Age-specific prevalence of HPV.

Age-specific prevalence of LSIL and HSIL

Figure 23 shows the age-specific prevalence of LSIL and HSIL predicted by the model. Figure 24 shows the relationship between HPV, LSIL, and HSIL prevalence predicted by the model. Distributions are similar to those reported in other studies and are consistent with estimates for HPV incidence and for progression and regression rates for HPV and SIL. In particular, the shape of the LSIL curve is similar to that reported by Carson and DeMay (1993).

Figure 23. Predicted prevalence of LSIL and HSIL.

Figure 24. Predicted prevalence of HPV, LSIL, and HSIL by age.

Sensitivity analyses

We tested the impact of varying the age-specific incidence of HPV from one-half to twice the base case estimates. As shown in Figure 25, the overall shape of the cancer incidence curve does not change, although the peak incidence and overall risk of cervical cancer vary with HPV incidence. This graph suggests the potential impact of primary prevention on cervical cancer incidence by such measures as increased use of barrier methods of contraception or, potentially, vaccination against HPV.

Figure 25. Effect of HPV incidence on cervical cancer incidence.

We also tested the impact of varying the prevalence of HPV and LSIL at age 15 on subsequent cervical cancer incidence (Figure 26). Increasing prevalence at younger ages without any changes in other parameters increases the overall incidence and decreases the youngest ages at which cancer appears. Delaying onset of sexual activity, especially with multiple partners, would also have a significant impact on cervical cancer risk.

Figure 26. Effect of baseline HPV/LSIL prevalence on cancer incidence.

Finally, we tested the impact of our natural history estimates on the lifetime risk of cervical cancer in the absence of screening. Table 63 presents the parameters, the input range for sensitivity analysis, and the resulting range of lifetime cervical cancer risk. Cervical cancer risk is most sensitive to HPV incidence, to the proportion of HPV infections progressing directly to HSIL, and to LSIL progression rates. Changes in these parameters result in twofold to threefold differences in cervical cancer risk. Changes in LSIL regression rates, and HSIL progression and regression rates, resulted in 50-75 percent differences in cancer risk. The proportion of LSIL regressing directly to the "Well" state instead of to the "Unknown HPV" state, and the proportion of HSIL regressing to "Well" instead of to LSIL had minimal impact on cervical cancer risk.

Table 63. Parameters, Input Range for Sensitivity Analysis, and Range of Lifetime Cervical Cancer Risk.

Additional epidemiological data on the natural history parameters that affect cervical cancer risk in the absence of screening are needed. Different combinations of parameter estimates can result in similar predictions of cancer risk. Further refinement of this model may help to determine which parameters are most important in determining cervical cancer incidence.

Validation Against Other Models

Direct comparison with other models is difficult because of differences in terminology, assumptions, parameter estimates, and modeling techniques. However, we adjusted our model estimates to approximate previously published models as both a means of comparison and a validation technique.

We attempted to systematically identify differences in predicted natural history between our model and the two most widely used models: Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992). Table 64 summarizes these differences.

Table 64. Differences Between Current Model and Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992).

Eddy (1990) Model

This widely cited model is the basis for other recently published cost-effectiveness analyses (Brown and Garber, 1998; Radensky and Mango, 1998). We attempted to recreate Eddy's results using our model. Although Eddy's model parameters were adjusted to fit International Agency for Research on Cancer (IARC) data (IARC Working Group on Evaluation of Cervical Cancer Screening Programmes, 1986), the incidence of invasive cervical cancer in an unscreened U.S. population was estimated by assuming that it would be three times higher than that observed in a partially screened population. However, this assumption does not account for the fact that 30-50 percent of cancer cases in the United States are from an unscreened population. Figure 27 shows the effects of different screening frequencies on the age-specific incidence of cervical cancer using a cohort simulation. Increasing screening frequency will lead to detection of earlier stage disease at younger ages than would be expected in an unscreened population. Because the incidence of cervical cancer in the United States reflects both screened and unscreened populations, as well as the effect of different cohorts with varying exposure to HPV, simply increasing the age-specific incidence threefold will overestimate the expected incidence in unscreened patients at younger ages. If the distribution of stages is not changed, then the ratio of incidence to mortality will be overestimated, since the proportion of early-stage cases in the SEER registries is much higher in younger women than in older women (66 percent localized in women under 50 compared with 37 percent in women over 50) (Ries, Kosary, Hankey et al., 1998).

Figure 27. Age-specific incidence of cervical cancer by screening frequency.

To determine how heterogeneous screening frequencies within a population affect the relationship between cervical cancer incidence and mortality, we ran simulations varying the proportions of the cohort who received no screening or Pap smear screening every 5, 3, 2, or 1 years, at a Pap sensitivity of 51 percent and a specificity of 97 percent. In these simulations we attempted to match the lifetime risks of cervical cancer diagnosis (0.79 percent) and mortality (0.29 percent) predicted by SEER data (Ries, Kosary, Hankey et al., 1998). The results are summarized in Table 65.

Table 65. Recreating SEER Lifetime (ages 15-85) Risks of Cervical Cancer Diagnosis and Mortality Using Current Model Parameters.

We were able to approximate lifetime risks for cervical cancer diagnosis and mortality based on current SEER data by simulating a cohort with a reasonable distribution of screening frequencies. The proportion of women not having had a Pap smear in the preceding 3 years is estimated at 5-10 percent (Martin, Calle, Wingo et al., 1996). Because our model assumes perfect patient and provider compliance with appropriate treatment, a higher proportion of unscreened and underscreened women is needed to approximate observed incidence and death rates. This provides further evidence for the overall validity of our model as well as our base case estimate of Pap sensitivity and specificity.

We were able to approximate the lifetime cervical cancer risk of Eddy (2.5 percent) by altering our HPV incidence. Figure 28 shows the age-specific incidence predicted by our model, the age-specific incidence predicted by increasing 1988 SEER incidence rates threefold, and the adjustment in our model resulting in a 2.5 percent lifetime risk. However, the lifetime mortality risk predicted by this adjusted model is 0.88 percent, substantially lower than Eddy's estimate of 1.18 percent. We then adjusted the rates of progression between cancer stages and the probability of symptoms to obtain similar incidence and mortality risks. By changing the progression rates to 90 percent in 2.5 years for Stage I to Stage II, 75 percent in 1 year for Stage II to Stage III, and 100 percent in 1 year for Stage III to Stage IV, and changing the probability of symptoms for Stage III to 35 percent, we obtained a lifetime risk of 2.52 percent and a mortality risk of 1.14 percent. In addition, since Eddy's model does not account for nonspecific cytological diagnoses such as ASCUS, we combined the diagnoses of ASCUS and LSIL in our conditional probabilities. Although these progression rates are substantially higher than those used in other models, they do replicate Eddy's results. We then ran this adjusted model with Eddy's cost and test characteristics estimates to compare results (Table 66).
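The progression rates above are stated over different time horizons (for example, 90 percent in 2.5 years), whereas a Markov cohort model with a fixed cycle length needs a per-cycle transition probability. The report does not say how this conversion was made. The following is a minimal sketch that assumes a constant hazard over the stated horizon and a 1-year cycle; the function name and both assumptions are illustrative, not taken from the report.

```python
def per_cycle_probability(cumulative_prob: float,
                          horizon_years: float,
                          cycle_years: float = 1.0) -> float:
    """Convert a cumulative probability over `horizon_years` into a
    per-cycle probability.

    Assumes a constant hazard over the horizon (an illustrative
    assumption; the report does not state its conversion method).
    """
    if not 0.0 <= cumulative_prob <= 1.0:
        raise ValueError("cumulative_prob must lie in [0, 1]")
    if horizon_years <= 0 or cycle_years <= 0:
        raise ValueError("horizon_years and cycle_years must be positive")
    # Under a constant hazard, the probability of NOT progressing over the
    # horizon equals (1 - p_cycle) ** (horizon_years / cycle_years).
    return 1.0 - (1.0 - cumulative_prob) ** (cycle_years / horizon_years)


# Progression rates quoted in the text for the Eddy-adjusted model.
stage_1_to_2 = per_cycle_probability(0.90, 2.5)  # about 0.60 per year
stage_2_to_3 = per_cycle_probability(0.75, 1.0)  # 0.75 per year
stage_3_to_4 = per_cycle_probability(1.00, 1.0)  # 1.0 per year
```

Under this assumption, "90 percent in 2.5 years" corresponds to roughly a 60 percent annual probability of progressing from Stage I to Stage II. A different assumption about how risk is spread over the interval would give a different annual value.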

Figure 28. Incidence of cervical cancer predicted by current model, threefold increase in 1988 SEER data, and current model adjusted to fit Eddy estimation.

Table 66. Comparison of Current Model Adjusted to Fit Eddy Estimates with Eddy (1990) Model.

By attempting to replicate Eddy's results, we were able to identify several features that may affect conclusions about the cost-effectiveness of cervical cancer screening strategies when Eddy's model is used. First, the age-specific incidence in younger women may be overestimated, since the majority of cases in younger women in a partially screened population (the basis for Eddy's estimates) will be early-stage disease detected through screening. This in turn will contribute to an overestimation of the case-fatality rate if the distribution of stages is not adjusted. Second, because cytological atypia reported as ASCUS is not considered, the overall screening costs will be underestimated.

Fahs, Mandelblatt, Schechter et al. (1992) Model

We adjusted our model to include only two stages of cancer, "early" and "late," and used the parameters cited in the article, including the age-specific survival for early and late cervical cancer. We began the cohort simulation at age 65. Using these parameters, we found that Pap smear screening every 5 or every 3 years results in cost savings compared with no screening. Despite differences in total cost and effectiveness, our adjustments resulted in incremental costs for more frequent screening that were of the same order of magnitude as those of Fahs, Mandelblatt, Schechter et al. (1992). For screening every 3 years versus every 5 years, our model predicted an incremental cost per life-year of $5,581 compared with the published value of $5,956. For screening every year versus every 3 years, our model predicted an incremental cost per life-year of $74,615 compared with the published value of $39,693.
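The incremental cost per life-year quoted above is the difference in cost between two strategies divided by the difference in life expectancy. The sketch below shows that arithmetic only. The function name and all input values are hypothetical and chosen for illustration; they are not outputs of the report's model.

```python
def incremental_cost_per_life_year(cost_a: float, life_years_a: float,
                                   cost_b: float, life_years_b: float) -> float:
    """Incremental cost per life-year of strategy B relative to strategy A.

    A negative result with a positive gain in life-years means B is
    cost-saving relative to A.
    """
    delta_cost = cost_b - cost_a
    delta_life_years = life_years_b - life_years_a
    if delta_life_years <= 0:
        # B adds no life expectancy, so the ratio is not meaningful.
        raise ValueError("strategy B does not gain life-years over strategy A")
    return delta_cost / delta_life_years


# Hypothetical per-woman discounted costs (dollars) and life expectancy (years).
less_frequent = {"cost": 500.0, "life_years": 20.00}
more_frequent = {"cost": 800.0, "life_years": 20.05}

icer = incremental_cost_per_life_year(
    less_frequent["cost"], less_frequent["life_years"],
    more_frequent["cost"], more_frequent["life_years"],
)
# 300 extra dollars for 0.05 extra life-years: about $6,000 per life-year.
```

Because the denominator is a small difference between two large life expectancies, modest changes in predicted life expectancy can move the ratio substantially.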

Brown and Garber (1998) Model

We used our model, adjusted to approximate Eddy's results, in an attempt to replicate the results of Brown and Garber (1998). Our results, using their cost and probability estimates, are found in Table 67.

Table 67. Comparison of the Current Model, Adjusted to Eddy's Parameters, and Brown and Garber.

Discussion

By adjusting certain natural history parameters in our basic model, and by using the cost and sensitivity parameters of Eddy (1990) and Brown and Garber (1998), we were able to reasonably approximate these models' cost-effectiveness estimates for varying frequencies of conventional Pap smears. We were also able to come relatively close to the incremental cost-effectiveness values of different screening frequencies in the model of Fahs, Mandelblatt, Schechter et al. (1992), although we found cost savings for Pap smear screening every 3 and every 5 years compared with no screening using their cost and probability estimates. We were able to replicate the relative ranking for ThinPrep®, AutoPap®, and Papnet® from Brown and Garber (1998); however, our adjusted model resulted in substantially higher estimates of incremental cost-effectiveness for these technologies at each level of screening frequency.

Although we have been unable to identify specific characteristics of our model that explain these differences, several factors may account for some of them.

Our model consistently produced lower gains in life expectancy than other models for a given increase in screening sensitivity. This may be because of differences in the source of other-cause mortality estimates, different distribution of cases within stages between different models, or the inclusion of other Markov states such as those indicating HPV infection. Given these lower gains in life expectancy, it is not surprising that the cost-effectiveness estimates are somewhat higher.

We used specific 1-, 2-, 3-, 4-, and 5-year survival rates rather than 5-year survival rates averaged over 5 years. Because mortality for Stages II, III, and IV is higher in the first 2 years after diagnosis than in the next 3 years, discounting may affect the marginal gains in life expectancy. At the very low gains that all of the models found, this may be sufficient to change cost-effectiveness ratios appreciably.
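To illustrate why the timing of deaths matters once life-years are discounted, the sketch below compares two survival curves that share the same 5-year survival: one with mortality concentrated in the first 2 years and one with constant annual mortality. The survival values, the 3 percent discount rate, and the reading of "averaged over 5 years" as constant annual mortality are all assumptions made for illustration; none are taken from the report.

```python
def discounted_life_years(cumulative_survival, discount_rate=0.03):
    """Sum discounted life-years, crediting each year with the fraction of
    the cohort still alive at the end of that year (a simplification)."""
    return sum(
        alive / (1.0 + discount_rate) ** year
        for year, alive in enumerate(cumulative_survival, start=1)
    )


def constant_mortality_curve(final_survival, years=5):
    """Survival curve with constant annual mortality that reaches
    `final_survival` at the end of `years`."""
    annual_survival = final_survival ** (1.0 / years)
    return [annual_survival ** year for year in range(1, years + 1)]


# Hypothetical cumulative survival at years 1-5, with early excess mortality.
year_specific = [0.60, 0.45, 0.40, 0.37, 0.35]
averaged = constant_mortality_curve(year_specific[-1])

ly_year_specific = discounted_life_years(year_specific)
ly_averaged = discounted_life_years(averaged)
# The averaged curve credits more life-years, because it postpones deaths
# that the year-specific curve places in the first 2 years.
overstatement = ly_averaged - ly_year_specific
```

In this hypothetical case the two curves end at the same 5-year survival yet yield different life-years, so the choice of survival input changes the denominator of a cost-effectiveness ratio.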

None of the other models considered the effect of ASCUS diagnoses. We attempted to correct for this by changing the relative probability of an LSIL cytological diagnosis within each histological state. However, this correction might result in higher diagnostic and treatment costs (our average costs were substantially higher than those seen in all the models, although our incremental costs were similar) and, subsequently, higher cost-effectiveness ratios.

Despite these differences, we were able to replicate the ranking of preferred screening strategies and reasonably approximate the incremental cost per life-year saved of previously published models for conventional Pap smears. Further studies with the current model should help elucidate the reasons for the differences between our model and those previously published.

Although there are numerous differences between our model and previously published models, there are also striking similarities. All of the models show little marginal gain in life expectancy when screening is performed more often than every 3 years. All of the models show only modest improvements in life expectancy from increasing the sensitivity of conventional Pap smears and, for those that evaluate new technologies, high cost-effectiveness ratios for technologies that improve sensitivity (or reduce false negative rates) when screening is performed at frequent intervals.

Summary of Results of Cost-Effectiveness Analysis

We found that the incremental cost per life-year gained increased dramatically as screening frequency increased, as have prior cost-effectiveness analyses of Pap smear screening. We also found that the specificity of the test has profound implications for cost-effectiveness: Because the majority of women screened will be normal, the number of additional diagnostic tests generated by a small decrease in specificity will be quite large and will, in many populations, exceed the number of potentially significant lesions detected by increases in sensitivity.
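The specificity argument can be checked with simple counts. The sketch below uses the base case conventional Pap sensitivity (51 percent) and specificity (97 percent) from the text. The prevalence, the cohort size, and the characteristics of the alternative test are hypothetical values chosen only to illustrate the trade-off.

```python
def screening_counts(n_screened, prevalence, sensitivity, specificity):
    """Expected true-positive and false-positive results in one screening round."""
    with_lesion = n_screened * prevalence
    without_lesion = n_screened - with_lesion
    return {
        "true_positives": with_lesion * sensitivity,
        "false_positives": without_lesion * (1.0 - specificity),
    }


N_SCREENED = 100_000  # hypothetical cohort size
PREVALENCE = 0.02     # hypothetical prevalence of significant lesions

# Base case values from the text: sensitivity 51 percent, specificity 97 percent.
conventional = screening_counts(N_SCREENED, PREVALENCE, 0.51, 0.97)
# Hypothetical alternative: sensitivity up 10 points, specificity down 1 point.
alternative = screening_counts(N_SCREENED, PREVALENCE, 0.61, 0.96)

extra_lesions_detected = (
    alternative["true_positives"] - conventional["true_positives"]
)
extra_false_positives = (
    alternative["false_positives"] - conventional["false_positives"]
)
# Roughly 200 extra lesions detected against roughly 980 extra false positives.
```

With these illustrative inputs, a 1-point loss of specificity generates several times more additional workups than a 10-point gain in sensitivity generates additional detections, because women without lesions greatly outnumber women with them.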

We were able to identify thresholds of cost, reduction in false negative rate, and relative specificity compared with conventional Pap smears at which technologies to improve initial screening and technologies to improve rescreening both meet conventional cost-effectiveness thresholds at 3-year screening intervals. However, under most scenarios, each technology resulted in cost-effectiveness estimates of more than $50,000 per life-year when used every 1 or 2 years. Further studies are needed to provide more precise estimates of the costs and effectiveness of specific technologies and thereby make estimates of cost-effectiveness more reliable.
