The Agency for Health Care Policy and Research (AHCPR), through its Evidence-based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHCPR and conduct additional analyses when appropriate prior to developing their reports and assessments.
To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHCPR encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release.
AHCPR expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality.
We welcome written comments on this evidence report. They may be sent to: Director, Center for Practice and Technology Assessment, Agency for Health Care Policy and Research, 6010 Executive Blvd., Suite 300, Rockville, MD 20852.
| John M. Eisenberg, M.D. | Douglas B. Kamerow, M.D. |
| Administrator | Director, Center for Practice and Technology Assessment |
| Agency for Health Care Policy and Research | Agency for Health Care Policy and Research |
| The authors of this report are responsible for its content. Statements in the report should not be construed as endorsement by the Agency for Health Care Policy and Research or the U.S. Department of Health and Human Services of a particular drug, device, test, treatment, or other clinical service. |
This report compares new technologies for cervical cytological screening with conventional Papanicolaou (Pap) test screening in terms of diagnostic accuracy, costs, effectiveness, and cost-effectiveness in adult women of average cervical cancer risk.
Published literature on the accuracy of cervical cytological screening, costs of screening and treatment, and cost-effectiveness was identified in the MEDLINE, CINAHL, CancerLit, EconLit, HealthSTAR, and EMBASE databases.
Diagnostic test studies were included if they compared cervical cytology diagnosis with concurrent colposcopy or biopsy and provided estimates of sensitivity and specificity. For the new technologies, studies were also included that used a cytology reference standard and allowed estimation of either sensitivity or specificity. Articles on costs and health outcomes were selected if they assessed the effect of screening on life expectancy or quality of life, number of cases of cervical cancer, or total health care costs.
For diagnostic test studies, paired reviewers independently abstracted sensitivity and specificity data from each study. Quality scores were assessed on blind interpretation of screening test results, histological reference standard, verification of test negative subjects, description of disease spectrum, avoidance of bias in sample selection, publication type, and source of support. Diverse articles on costs and health outcomes were summarized and quality-scored according to criteria published by an expert panel.
Supplemental analyses include a meta-analysis to generate summary estimates of Pap test discrimination; cost analysis using claims databases to generate costs of treatment and screening; and a Markov model to estimate the effectiveness and costs of different technologies and clinical strategies.
Conventional Pap smear screening, based on the few studies that avoided severe biases, showed specificity of 98 percent (95 percent confidence interval (CI); 97-99 percent) and sensitivity of 51 percent (95 percent CI; 37-66 percent). The sample prevalence of disease is strongly related to between-study variability in Pap test sensitivity and specificity and may reflect bias. Other indicators of study quality were not significant when prevalence was controlled. The Pap test is more accurate when a high-grade squamous intraepithelial lesion threshold is used with the goal of detecting a high-grade lesion than when lower thresholds, such as a low-grade squamous intraepithelial lesion (LSIL) or atypical squamous cells of uncertain significance (ASCUS), are used with the goal of detecting low- or high-grade dysplasia. Few studies of the new technologies used histology or colposcopy as a reference standard or allowed estimates of both sensitivity and specificity. In studies using a cytology reference standard, each of the new technologies appears to significantly improve sensitivity relative to conventional Pap smear screening; however, little information is available on the effects on specificity.
Cost-effectiveness ratios from published models comparing Pap smear screening with no screening fall into an acceptable range, but these models used parameters that overstate Pap test accuracy.
The base case estimate of the incremental cost-effectiveness of conventional Pap screening every 3 years compared with no Pap screening is $4,097 per life-year saved. A technology applied to the initial step in Pap screening that reduces the false negative rate by a factor of 0.6 at an incremental cost per slide of $10 has an incremental cost of $22,010 per life-year saved when performed every 3 years. With more frequent screening intervals, the cost per life-year saved is greater than $50,000. Technologies that allow 100 percent rescreening of slides initially read as normal by conventional Pap screening, at a reduction in false negative rate of 0.85 or higher, are more effective than technologies that improve initial screening with a reduction of 0.6. At these reductions in false negative rate, with identical incremental costs of $10 per slide and a screening interval of every 3 years, the cost per life-year saved of rescreening technologies compared with improved initial screening is $45,375. Findings were relatively insensitive to assumptions about cervical cancer incidence, cost of technologies, diagnostic strategies for abnormal screening results, and age at onset of screening. Findings were sensitive to both the reduction in false negative rate (i.e., improvement in sensitivity) and the relative specificity of the technologies compared with conventional Pap.
Estimates of the sensitivity of the conventional Pap test are biased in most studies; based on the least biased studies, sensitivity is near 50 percent, much lower than generally believed. Newer technologies improve sensitivity compared with conventional Pap screening; however, there are no precise estimates for their effect on specificity. Under assumptions favorable to improved initial screening technologies and rescreening technologies, either approach can result in acceptable cost per life-year saved at 3-year Pap screening intervals. However, the imprecision in estimates of effectiveness and cost of the new technologies makes drawing firm conclusions about their relative cost-effectiveness problematic.
This document is in the public domain and may be used and reprinted without permission, except for those copyrighted materials noted for which further reproduction is prohibited without the specific permission of copyright holders.
Suggested Citation:
McCrory DC, Matchar DB, Bastian L, et al. Evaluation of Cervical Cytology. Evidence Report/Technology Assessment No. 5. (Prepared by Duke University under Contract No. 290-97-0014.) AHCPR Publication No. 99-E010. Rockville, MD: Agency for Health Care Policy and Research. February 1999.
Worldwide, carcinoma of the cervix is one of the most common malignancies in women. It is estimated that approximately 13,700 new cases of the disease will occur in the United States in 1998. A woman's lifetime risk of being diagnosed with cervical cancer in the United States is currently 0.83 percent, and the risk of dying from the disease is 0.27 percent.
The incidence of cervical cancer and associated mortality have each decreased over 40 percent since 1973; the decreases are largely attributable to the success of mass screening using the Papanicolaou (Pap) test to diagnose premalignant or early-stage cases. The decreases in invasive cervical cancer incidence and mortality since the introduction of the Pap smear have been so dramatic that it is one of the few interventions to receive an "A" recommendation from the U.S. Preventive Services Task Force even though there are no randomized trials demonstrating its effectiveness.
Despite the indisputably dramatic impact of Pap screening, there is still uncertainty about the details of Pap smear performance, and much could be done to improve the performance of the test and followup of patients after screening. Controversy about the details of Pap smear performance is manifest in differing recommendations about the frequency of screening and the age (if any) at which screening may safely be stopped. A significant proportion of patients and providers fail to comply with even the least demanding recommendations for Pap screening frequency. Numerous barriers to screening have been identified that reduce access to Pap smears and other preventive services.
Recently, efforts to improve Pap smear performance have focused on reducing the number of false negative smears, that is, cases in which premalignant or malignant cells have been misdiagnosed as normal. Measures adopted to improve laboratory performance on this point include manual rescreening of a portion of slides initially evaluated as negative, an approach mandated by Federal law (Clinical Laboratory Improvement Amendments [CLIA]). Recently, several technologies have been developed to optimize Pap test screening by reducing the false negative rate. These technologies are a major focus of this report.
The report addresses three main questions:
What is the accuracy of cervical cytology using conventional Pap smears and new technologies (thin-layer cytology, computer rescreening, algorithm-based decisionmaking technology) for detecting cervical cancer and its precursors?
What are the direct medical costs associated with cervical cancer screening, evaluation, treatment, and followup of cervical cytological abnormalities and treatment and followup of cervical cancer?
What are the effects on total health care cost, morbidity, and mortality of regular cervical cytological screening using thin-layer cytology and computer rescreening using neural network or algorithm-based decisionmaking technology compared with the conventional Pap smear in women participating in a screening program?
On the first point, the report will review published studies comparing cervical cytological diagnosis with clinical diagnosis based on colposcopy or biopsy. The results of this review will form the basis for a meta-analysis.
On the second point, the report will identify and examine current claims data and other datasets to estimate empirically costs associated with cervical cytological screening.
On the third point, the report will review the literature on the effectiveness and cost-effectiveness of cervical cytology screening and use these data to develop a comprehensive cost-effectiveness model to examine the impact of the newer screening technologies. In the absence of definitive clinical trials on key questions of cervical cancer screening, policymakers have relied on decision-modeling studies to integrate epidemiological data on the natural history of cervical cancer precursors, data on the performance of diagnostic tests for early cervical cancer or cervical cancer precursors, and data on cost. These models estimate the efficacy of various screening programs, balance estimated efficacy against estimated cost, and lead to decisions about appropriate screening intervals and age cutoffs.
Recent developments in specimen processing and interpretation may substantially improve the Pap smear as a diagnostic test for cervical cancer and cancer precursors. Three new devices recently approved by the Food and Drug Administration (FDA) are considered in this report: ThinPrep®, Papnet®, and AutoPap®. The three devices employ three different types of technology: thin-layer cytology (ThinPrep®) and computerized rescreening utilizing neural-network technology (Papnet®) or algorithmic classification (AutoPap®).
Each of these technologies was developed to reduce the false negative rate associated with cervical cytological screening. The two major components to this false negative rate are false negatives related to sampling error and false negatives related to detection error. About two-thirds of false negatives are a result of sampling error and the remaining one-third a result of detection error. Each of the new technologies is directed at one of these components of false negatives. Thin-layer cytology aims primarily to fix sampling error, whereas computerized rescreening targets detection error. This implies that neither technology will be able to reduce false negatives beyond a certain threshold.
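The ceiling implied by this split can be made concrete with a little arithmetic. The sketch below is illustrative only. The two-thirds/one-third split is taken from the text at face value and the 51 percent baseline sensitivity is the report's summary estimate; the function name and the best-case assumption that a technology eliminates its entire target component are ours.

```python
def residual_false_negative_rate(baseline_fn_rate, component_share, fraction_eliminated=1.0):
    """False negative rate remaining after a technology removes part of one error component.

    baseline_fn_rate    -- false negative rate of the conventional test (1 - sensitivity)
    component_share     -- share of false negatives due to the targeted component
    fraction_eliminated -- how much of that component the technology removes (1.0 = all of it)
    """
    removed = baseline_fn_rate * component_share * fraction_eliminated
    return baseline_fn_rate - removed


BASELINE_SENSITIVITY = 0.51            # summary estimate from this report
BASELINE_FN = 1.0 - BASELINE_SENSITIVITY

SAMPLING_SHARE = 2.0 / 3.0             # share of false negatives from sampling error
DETECTION_SHARE = 1.0 / 3.0            # share of false negatives from detection error

# Best case (assumption): each technology removes ALL of the component it targets.
best_case_thin_layer = residual_false_negative_rate(BASELINE_FN, SAMPLING_SHARE)
best_case_rescreening = residual_false_negative_rate(BASELINE_FN, DETECTION_SHARE)

print(f"thin-layer cytology, best case:      {best_case_thin_layer:.3f}")
print(f"computerized rescreening, best case: {best_case_rescreening:.3f}")
```

Under these assumptions, even a perfect technology of one type leaves the other component of the false negative rate untouched.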
One newly approved device, Papnet®, uses neural-network computerized rescreening of Pap smears initially read as negative by a cytotechnologist. The system works by using automated computerized imaging of Pap smear slides and interpretation of images using a computerized algorithm to identify slides that are likely to contain abnormal cells. The Papnet® system (Neuromedical Systems, Inc.) identifies cells or clusters of cells that require review and can display up to 128 images of the slide likely to contain abnormalities. These images can be reviewed by a cytotechnologist who can decide whether or not to review the slide using light microscopy.
The AutoPap® 300 QC system (Neopath, Inc.), an algorithm-based decisionmaking technology, identifies slides exceeding a certain threshold for the likelihood of abnormal cells. The laboratory can select different thresholds corresponding to 10, 15, and 20 percent review rates. In contrast to random rescreening, the population of slides selected by the AutoPap® 300 QC system is enriched with abnormalities and, at the 10-15 percent sort rate, this population of slides should contain 70-80 percent of the slides containing abnormalities missed by manual screening.
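The contrast with random rescreening can be expressed as an enrichment factor: the share of missed abnormal slides captured, divided by the share of slides reviewed. The sketch below is a hedged illustration. The function name is hypothetical, and pairing the 70 percent figure with the 10 percent review rate and the 80 percent figure with the 15 percent rate is our assumption, since the text gives only ranges.

```python
def enrichment_factor(capture_fraction, review_fraction):
    """Ratio of missed-abnormal slides captured to slides reviewed.

    A purely random selection captures missed abnormalities in proportion
    to the share of slides reviewed, so its enrichment factor is 1.
    """
    if not 0 < review_fraction <= 1:
        raise ValueError("review_fraction must be in (0, 1]")
    return capture_fraction / review_fraction


# Random rescreening of 10 percent of negative slides (expected capture: 10 percent).
random_rescreen = enrichment_factor(0.10, 0.10)

# Assumed pairings of the ranges quoted in the text.
targeted_low = enrichment_factor(0.70, 0.10)
targeted_high = enrichment_factor(0.80, 0.15)

print(f"random rescreening:    {random_rescreen:.1f}x")
print(f"targeted, 10% review:  {targeted_low:.1f}x")
print(f"targeted, 15% review:  {targeted_high:.1f}x")
```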
A variety of other technologies or clinical strategies have been proposed to improve Pap testing including various devices for collecting a cytological sample from the cervix. Still other technologies have been proposed to augment or replace cervical cytological screening, including colposcopic photographs for review by experts (cervicography) and DNA testing for specific human papillomavirus (HPV). These technologies are not considered in the present report.
The primary target population for this evidence report is women of average cervical cancer risk in the United States who are candidates for Pap smear screening. For the purposes of our analysis, candidates for Pap smear screening include women between the age of onset of sexual activity and the age of 85.
Although a large proportion of cervical cancer occurs in women with very limited or no screening, we did not examine programs or policies designed to improve screening compliance. Some previous studies have focused on special populations such as elderly women and elderly women who have not previously been screened.
The principal practice setting considered is the primary care practice in the United States (general internal medicine, family practice, adolescent medicine, and obstetrics/gynecology) and government and nongovernment family planning clinics (e.g., Planned Parenthood, public health clinics).
The comprehensive review of the literature, from identification of databases through abstraction of individual articles into the evidence tables, was a multistep, sequential process. This process is detailed below.
MEDLINE, CancerLit, HealthSTAR, CINAHL, EMBASE, and EconLit computerized database searches, supplemented by manual journal searches and querying experts and device manufacturers, were the sources used to identify English language reports on the accuracy of cervical cytological screening, costs associated with screening and treatment, and cost-effectiveness.
Citations for the review of accuracy of cervical cytological testing were retrieved with a search strategy that combined various text word and index terms for cervical cytological tests with cervical cancer or dysplasia and sensitivity and specificity. The strategy to retrieve articles on the costs and health outcomes associated with cervical cancer screening combined cervical cytological test terms with terms describing cost analysis and mathematical modeling. Experienced librarians assisted with the design and translation of these search strategies for each database searched.
Separate sets of criteria for including articles in the evidence report were developed for the two topics that were the subject of literature reviews (diagnostic testing and cost and health outcomes). In each case, final screening criteria were developed through an iterative process. Each iteration of criteria was pilot-tested by each reviewer/abstractor on a subset of randomly chosen articles.
Articles on diagnostic testing were first screened based on information available through the online databases (primarily title, authors, and abstract when available). Citations were eliminated in Step 1 of the screening process if cervical cytology was not evaluated as a screening test or if the screening test results were not compared with a reference standard. In Step 2 of the screening process, full texts of articles were reviewed to select articles in which a reference standard of colposcopy or histology was used, the screening test and reference standard were reasonably concurrent (i.e., within 3 months), and sufficient data to calculate both sensitivity and specificity were provided (i.e., all cells of a two-by-two table). Of the 939 bibliographic references reviewed, 561, or approximately 60 percent, were excluded during the first screening, and another 293, or 31 percent, during the second screening. Eighty-six articles were included according to these criteria: 84 studies of conventional Pap screening and one study each of ThinPrep® and Papnet®. Because so few studies of the new technologies met the original criteria, we modified the criteria to include studies of the new technologies that used a cytology reference standard and allowed estimation of either sensitivity or specificity. We considered a total of 59 studies (12 on AutoPap®, 27 on Papnet®, and 20 on ThinPrep®) during this final stage of the screening process (Step 3). The net result was the inclusion of 6 studies of AutoPap®, 11 of Papnet®, and 8 of ThinPrep®.
Articles on cost and health outcomes of cervical cytological screening were selected if they assessed the effect of screening on life expectancy or quality of life, number of cases of cervical cancer, or total health care costs for any of the following cytological screening technologies: conventional Pap smears, thin-layer cytology, or Pap smears with computerized rescreening. Of the 672 articles identified, 638, or 95 percent, were eliminated during the screening process. Thirty-four articles were included in the review.
Key information was abstracted onto specially designed forms and verified by either duplicate abstraction (two-by-two tables) or overreading by paired clinician-abstractors. Differences were resolved by consensus.
For the diagnostic testing articles, both members of each abstractor team also independently completed two-by-two tables for each study, extracting the key data to calculate sensitivity, specificity, and prevalence and other data to be used in the meta-analysis. The main outcome measures considered were the sensitivity and specificity of cytological abnormality by Pap test for detecting cases, where "cytological abnormality" was defined by one of three thresholds ranging from atypical squamous cells of uncertain significance (ASCUS) (threshold 1) to low-grade squamous intraepithelial lesion (LSIL) (threshold 2) to high-grade squamous intraepithelial lesion (HSIL) (threshold 3), and where a "case" was defined as a histological diagnosis of dysplasia or carcinoma. Equivalent categories in other classification schemes were also used. Two-by-two tables were constructed for four different combinations of cytological versus histological thresholds: ASCUS/ cervical intraepithelial neoplasia (CIN1), LSIL/CIN1, LSIL/CIN2-3, and HSIL/CIN2-3.
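The quantities abstracted from each two-by-two table follow directly from its four cells. The sketch below shows the standard calculations. The class and field names are ours, and the counts are invented for illustration (chosen so that the results match the summary estimates reported elsewhere in this report); they are not data from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one study, cross-classifying cytology against histology."""
    true_pos: int    # cytology abnormal, histology confirms a case
    false_neg: int   # cytology normal, histology confirms a case
    false_pos: int   # cytology abnormal, histology shows no case
    true_neg: int    # cytology normal, histology shows no case

    @property
    def sensitivity(self) -> float:
        """Proportion of histological cases that cytology called abnormal."""
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        """Proportion of non-cases that cytology called normal."""
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def prevalence(self) -> float:
        """Proportion of the study sample with histologically confirmed disease."""
        total = self.true_pos + self.false_neg + self.false_pos + self.true_neg
        return (self.true_pos + self.false_neg) / total


# Hypothetical counts for a single cytological/histological threshold pair.
example = TwoByTwo(true_pos=51, false_neg=49, false_pos=18, true_neg=882)
print(f"sensitivity={example.sensitivity:.2f} "
      f"specificity={example.specificity:.2f} "
      f"prevalence={example.prevalence:.2f}")
```

One table of this kind would be built for each of the four threshold combinations, since moving the cytological or histological threshold reassigns subjects among the four cells.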
Quality scores for articles on diagnostic testing were assigned according to predetermined methodological criteria based on blind interpretation of screening test results, use of a reference standard of histology, selection of test negative patients for verification, avoidance of bias in sample collection, description of the spectrum of disease in the sample, publication as a full report (as opposed to abstract), and source of support.
The quality of articles on costs and health outcomes was described according to recently published criteria by an expert panel on cost and effectiveness in medicine.
We used the effectiveness score to combine data from multiple studies describing the performance of the conventional Pap test in discriminating between patients with and without cervical lesions. The effectiveness score takes account of both sensitivity and specificity by fitting a receiver operating characteristic (ROC) curve through a logistic odds transformation of the two and thus accounts for their interdependence. The effectiveness score is more normally distributed than either sensitivity or specificity and can be thought of as a gauge of the overall discriminatory ability of the test. Standardized effectiveness scores can be interpreted across different diagnostic tests. In general, a score of 3 reflects a test with good discrimination, whereas a score of 1 reflects a test that does not discriminate between disease positives and disease negatives.
We used maximum likelihood estimation techniques and a random effects model to calculate summary measures of effectiveness at each of the four explicit diagnostic thresholds (ASCUS/CIN1, LSIL/CIN1, LSIL/CIN2-3, HSIL/CIN2-3). We further evaluated the effect of variations in disease prevalence and in quality of study design and reporting on test discrimination.
Several available datasets were analyzed to estimate direct medical costs of screening, diagnosing, and treating cervical cancer, calculating separate estimates for women 20-64 years of age and those 65 years and older (eligible for Medicare). For women 20-64, the unit cost of screening, diagnosis, and treatment of cervical cancer was estimated from MEDSTAT data from 1992, 1993, and 1994, inflated to reflect 1994 charges and converted to costs using 1994 cost-to-charge ratios published by the American Hospital Association.
For women over 65, Medicare's resource-based relative value scale (RBRVS) fee schedule for physician services, Medicare's clinical laboratory fee schedule for laboratory services, and national average diagnosis-related group (DRG) payments for hospital admissions were used to identify the payments associated with services received for cervical cancer screening, diagnosis, and treatment. Charges and payment information obtained from all sources were then converted to reflect costs associated with the services provided and all costs were inflated to 1997 dollars.
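The two conversions described above (charges to costs, then earlier-year dollars to 1997 dollars) are simple multiplications. The sketch below shows their general form. The function names, the cost-to-charge ratio of 0.60, and the price index values are all hypothetical placeholders; the actual ratios and inflation factors used in the analysis are not given in this summary.

```python
def charge_to_cost(charge, cost_to_charge_ratio):
    """Convert a billed charge to an estimated cost using a cost-to-charge ratio."""
    return charge * cost_to_charge_ratio


def inflate(amount, index_from_year, index_to_year):
    """Restate an amount in another year's dollars using a price index."""
    return amount * index_to_year / index_from_year


# Hypothetical inputs, for illustration only.
billed_charge_1994 = 1000.00   # charge for a service, 1994 dollars
ratio_1994 = 0.60              # assumed cost-to-charge ratio
index_1994 = 100.0             # assumed medical price index, 1994
index_1997 = 112.0             # assumed medical price index, 1997

cost_1994 = charge_to_cost(billed_charge_1994, ratio_1994)
cost_1997 = inflate(cost_1994, index_1994, index_1997)
print(f"1994 cost: ${cost_1994:,.2f}; restated in 1997 dollars: ${cost_1997:,.2f}")
```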
We constructed a 20-state Markov model that follows a cohort of women from age 15 to 85 and assumes that there are no prevalent cases of HPV infection or squamous intraepithelial lesion (SIL) at age 15. Cycle lengths are 1 year long. No Pap smear screening is compared with the following screening strategies: conventional Pap smears at 1-, 2- and 3-year intervals, thin-layer cytology smears at 1-, 2- and 3-year intervals, and 100 percent computerized rescreening at 1-, 2- and 3-year intervals.
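The mechanics of a cohort model of this kind can be illustrated with a much smaller example. The sketch below is a toy four-state model, not the 20-state model used in this report. The state names, every transition probability, and the simplification that a detected lesion returns a woman to the well state are assumptions made for illustration. It keeps only the structural features described above: 1-year cycles, a cohort followed from age 15 to 85 with no prevalent disease at entry, and screening applied at a fixed interval.

```python
STATES = ("well", "sil", "cancer", "dead")

# HYPOTHETICAL annual transition probabilities (each row sums to 1).
# These are placeholders, NOT the parameter estimates used in the report.
TRANSITIONS = {
    "well":   {"well": 0.970, "sil": 0.020, "cancer": 0.000, "dead": 0.010},
    "sil":    {"well": 0.100, "sil": 0.860, "cancer": 0.030, "dead": 0.010},
    "cancer": {"well": 0.000, "sil": 0.000, "cancer": 0.850, "dead": 0.150},
    "dead":   {"well": 0.000, "sil": 0.000, "cancer": 0.000, "dead": 1.000},
}


def advance_one_cycle(cohort):
    """Redistribute the cohort across states for one 1-year cycle."""
    nxt = {state: 0.0 for state in STATES}
    for src, share in cohort.items():
        for dst, prob in TRANSITIONS[src].items():
            nxt[dst] += share * prob
    return nxt


def apply_screening(cohort, sensitivity):
    """Detect a fraction of SIL equal to test sensitivity and return it to 'well'.

    Treating every detected lesion as cured is a simplifying assumption.
    """
    detected = cohort["sil"] * sensitivity
    screened = dict(cohort)
    screened["sil"] -= detected
    screened["well"] += detected
    return screened


def run_cohort(start_age=15, end_age=85, interval=None, sensitivity=0.51):
    """Follow a cohort with no prevalent disease; return final state shares
    and the cumulative proportion of the cohort that developed cancer."""
    cohort = {"well": 1.0, "sil": 0.0, "cancer": 0.0, "dead": 0.0}
    cumulative_cancer = 0.0
    for age in range(start_age, end_age):
        if interval and age > start_age and (age - start_age) % interval == 0:
            cohort = apply_screening(cohort, sensitivity)
        cumulative_cancer += cohort["sil"] * TRANSITIONS["sil"]["cancer"]
        cohort = advance_one_cycle(cohort)
    return cohort, cumulative_cancer


for label, interval in (("no screening", None), ("every 3 years", 3), ("every year", 1)):
    _, cases = run_cohort(interval=interval)
    print(f"{label:>14}: cumulative cancer incidence = {cases:.4f}")
```

A full model would attach costs and life-years to each state and cycle, distinguish lesion grades and cancer stages, and model diagnostic followup and test specificity, none of which this sketch attempts.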
We used a U.S. health system perspective and evaluated the direct and health-care specific costs associated with screening, diagnosis, and treatment of cervical cancer and its precursors. We did not consider other societal costs such as work loss. The model considers the following outcomes: cost per year of life saved, cost per cervical cancer death prevented and per cervical cancer case prevented, and the number of morbid therapies avoided.
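The cost per year of life saved reported by the model is an incremental cost-effectiveness ratio: the difference in cost between two strategies divided by the difference in life expectancy. A minimal sketch follows. The function name is ours and the per-woman costs and life-years are invented round numbers, not outputs of the model described in this report.

```python
def incremental_cost_effectiveness(cost_new, effect_new, cost_ref, effect_ref):
    """Incremental cost per unit of effect gained, comparing a new strategy
    with a reference strategy (for example, screening versus no screening)."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("strategies have identical effectiveness; ratio is undefined")
    return (cost_new - cost_ref) / delta_effect


# Hypothetical discounted lifetime values per woman, for illustration only.
reference = {"cost": 1000.00, "life_years": 25.000}
new_strategy = {"cost": 1200.00, "life_years": 25.010}

ratio = incremental_cost_effectiveness(
    new_strategy["cost"], new_strategy["life_years"],
    reference["cost"], reference["life_years"],
)
print(f"incremental cost per life-year saved: ${ratio:,.0f}")
```

Because the denominator is typically a small difference between two large numbers, modest changes in the estimated effectiveness of a technology can move the ratio substantially.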
We discounted costs and years of life at 3 percent annually in the base case and varied the discount rate from 0 to 5 percent in a sensitivity analysis. Specific parameter estimates were derived from a preliminary literature assessment conducted for this report and prior published models of cervical cancer screening.
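Discounting converts a stream of future costs or life-years into present value by dividing the amount in year t by (1 + r) raised to the power t. The sketch below applies the base case rate and the two ends of the sensitivity analysis range to an invented stream. The function name and the example stream are ours, and the convention that the first year is undiscounted is an assumption.

```python
def present_value(stream, annual_rate):
    """Present value of a yearly stream, with year 0 undiscounted (assumed convention)."""
    return sum(amount / (1.0 + annual_rate) ** year
               for year, amount in enumerate(stream))


# Hypothetical stream: one life-year accrued in each of 10 consecutive years.
life_years = [1.0] * 10

for rate in (0.00, 0.03, 0.05):   # sensitivity range and base case from the text
    print(f"discount rate {rate:.0%}: {present_value(life_years, rate):.3f} life-years")
```

Higher discount rates reduce the weight given to benefits far in the future, which matters for screening because costs are incurred early and most life-years are gained decades later.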
Important findings regarding the accuracy of cervical cytological screening include the following:
Despite the demonstrated ability of cervical cytological screening to reduce cervical cancer mortality, the conventional Pap test is less sensitive than it is generally believed to be.
Few studies of primary screening were unaffected by "workup" bias, but the few that were provided estimates of the specificity of Pap smear screening of 0.98 (95 percent confidence interval; 0.97-0.99) and sensitivity of 0.51 (95 percent confidence interval; 0.37-0.66).
The Pap test is more accurate when a higher cytological threshold (HSIL) is used with the goal of detecting a high-grade lesion. Lower test thresholds or use of the Pap test for detecting low-grade dysplasia results in poorer discrimination.
The accuracy of the Pap test is strongly affected by disease prevalence. Higher disease prevalence is associated with higher estimates of sensitivity and lower estimates of specificity (with a greater effect on specificity). These findings are consistent with prevalence as a marker for workup bias and perhaps also reflect an imperfect reference standard that is more specific than sensitive.
Quality of the studies reviewed, based on previously described criteria, varied widely; however, quality score did not explain a statistically significant amount of the between-study variation in discrimination when the variation in the prevalence of disease was controlled.
Existing information fails to provide accurate estimates for specificity of thin-layer cytology or computerized rescreening technologies. Our initial requirement for verification of test negatives with colposcopy or histology led to the exclusion of all but one study each of ThinPrep® and Papnet® and all studies of AutoPap®. The values reported for sensitivity and specificity in the few studies that use histological or colposcopic reference standards are well within the range of sensitivity and specificity reported for the conventional Pap test. However, including studies that directly compare these new technologies with conventional Pap smear testing (screening or rescreening) using a cytological reference standard results in significant improvements in sensitivity.
Important findings regarding the costs of cervical cytological screening and cervical cancer diagnosis and treatment include the following:
Pap smear screening cost is somewhat higher in older women than younger women chiefly because physician and total time spent in obtaining Pap smears during office visits is longer for older women.
Estimated costs of cervical cancer treatment calculated from episodes of care are substantially higher than estimates based on average procedure-specific costs because of both the provision of related services and the effect of complicated cases with unusually high costs. Estimates based on procedure-related costs alone will underestimate the true direct medical costs.
Important findings from a review of previously published models of the cost and effectiveness of cervical cytological screening include the following:
Published models examining the cost and effectiveness of Pap smear screening have consistently found Pap screening to have a significant impact on the incidence and mortality of cervical cancer and to have an acceptable range of cost-effectiveness ratios when compared with no screening.
Estimates of Pap test accuracy used in these models generally overestimated Pap test performance, as determined by recent unbiased studies, the findings of this report, and a previously published meta-analysis. Best estimates of Pap test performance fall outside the range used in sensitivity analyses of some models.
Important findings from a new model of cost and effectiveness of cervical cytological screening include the following:
The cost-effectiveness of either a technology that improves primary screening sensitivity (e.g., thin-layer cytology) or one that improves rescreening sensitivity (e.g., computerized rescreening) is directly related to the frequency of screening: longer intervals result in lower estimates of cost per life-year saved.
Our findings were relatively insensitive to assumptions about cervical cancer incidence, the cost of technologies, diagnostic strategies for abnormal screening results, age at onset of screening, or most other variables tested.
There is substantial uncertainty about the estimates of sensitivity and specificity of thin-layer cytology and computerized rescreening technologies compared with each other and with conventional Pap testing. The uncertainty is not reflected in the point estimates for effectiveness or cost-effectiveness. Although it is clear that both thin-layer cytology and computerized rescreening technologies provide an improvement in effectiveness at higher cost, the imprecision in estimates of effectiveness makes drawing conclusions about the relative cost-effectiveness of thin-layer cytology and computerized rescreening technologies problematic.
Our research suggests several areas for possible future study.
Future decision models, cost-effectiveness studies, and health policy decisions should assume a sensitivity of Pap smear screening close to 50 percent.
Thin-layer cytology technology (ThinPrep®), the neural-network computerized rescreening device (Papnet®), and the algorithm-based decisionmaking technology (AutoPap®) have received regulatory approval from the FDA based on their demonstration of improved sensitivity compared with conventional Pap smear techniques. However, the evidence currently available does not fully describe the impact of these technologies on the specificity of the screening process. It is possible that a new technology might simultaneously raise both sensitivity and specificity; however, this has not been conclusively demonstrated for the devices reviewed in this report. Future studies of these technologies should include verification of test negative subjects to allow estimation of specificity.
Comparisons with cytological reference standards attest to the validity of the new technologies compared with optimal Pap screening, but comparison with a histological reference standard provides a more relevant outcome for clinical decisionmakers, since histological diagnosis forms the basis of most clinical management decisions. Further research is needed to validate negative cytological diagnoses made with the new technologies with colposcopy, in both low-prevalence and high-prevalence populations. This could be accomplished by subjecting a random sample of cytology-negative women to colposcopy, which would permit statistical correction for workup bias and estimation of test specificity.
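The statistical correction mentioned above can be sketched as inverse-probability weighting: if all cytology-positive women and a known random fraction of cytology-negative women are verified, the counts among verified negatives are scaled up by the reciprocal of the sampling fraction. The example below is a hedged illustration of that idea. The function names are ours and the counts are invented; this is one common approach to the correction, not necessarily the method a future study would adopt.

```python
def naive_accuracy(tp, fp, fn_verified, tn_verified):
    """Sensitivity and specificity computed from verified subjects only (biased)."""
    sensitivity = tp / (tp + fn_verified)
    specificity = tn_verified / (tn_verified + fp)
    return sensitivity, specificity


def corrected_accuracy(tp, fp, fn_verified, tn_verified, negative_sampling_fraction):
    """Correct for workup bias by weighting verified test negatives.

    Assumes every test-positive woman was verified and test-negative women
    were selected for verification completely at random.
    """
    weight = 1.0 / negative_sampling_fraction
    fn_estimated = fn_verified * weight
    tn_estimated = tn_verified * weight
    sensitivity = tp / (tp + fn_estimated)
    specificity = tn_estimated / (tn_estimated + fp)
    return sensitivity, specificity


# Hypothetical study: all 69 cytology-positive women verified,
# and a 10 percent random sample of cytology-negative women verified.
counts = dict(tp=51, fp=18, fn_verified=5, tn_verified=88)

naive_sens, naive_spec = naive_accuracy(**counts)
corr_sens, corr_spec = corrected_accuracy(**counts, negative_sampling_fraction=0.10)

print(f"verified subjects only: sensitivity={naive_sens:.2f} specificity={naive_spec:.2f}")
print(f"after correction:       sensitivity={corr_sens:.2f} specificity={corr_spec:.2f}")
```

In this invented example, ignoring the unverified negatives overstates sensitivity and understates specificity, which is the direction of bias described for studies affected by workup bias.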
Further research is needed to quantify the effect of cervical cancer and premalignant cervical lesions and various treatments for cervical cancer or dysplasia on quality of life. These data will allow a more comprehensive assessment of the impact of technologies for cervical cytological screening.
Worldwide, carcinoma of the cervix is one of the most common malignancies in women. It is estimated that approximately 13,700 new cases of the disease will occur in the United States in 1998 (Landis, Murray, Bolden et al., 1998). A woman's lifetime risk of being diagnosed with cervical cancer in the United States is currently 0.83 percent, and the risk of dying from the disease is 0.27 percent (Ries, Miller, Hankey et al., 1997).
The incidence of cervical cancer and associated mortality have both decreased over 40 percent since 1973. The discrepancy between incidence and mortality risk, and the decrease in both over time, are largely attributable to the success of mass screening using the Papanicolaou (Pap) test to diagnose premalignant or early-stage disease (Cannistra and Niloff, 1996; Ries, Kosary, Hankey et al., 1998; Shingleton, Patrick, Johnston et al., 1995; U.S. Preventive Services Task Force, 1996). The decrease in invasive cervical cancer incidence and mortality since the introduction of the Pap smear has been so dramatic that it is one of the few interventions to receive an "A" recommendation from the U.S. Preventive Services Task Force even though there are no randomized trials demonstrating its effectiveness (U.S. Preventive Services Task Force, 1996).
Despite the indisputably dramatic impact of Pap screening, there is still uncertainty about the details of Pap smear performance, and much could be done to improve the performance of the test and followup of patients after screening. Controversy about the details of Pap smear performance is manifest in differing recommendations about the frequency of screening and the age (if any) at which screening may safely be stopped (American Academy of Pediatrics, 1988; American Cancer Society, 1993; American College of Obstetricians and Gynecologists, 1995; American College of Physicians, 1991; American Medical Association, 1994; Canadian Task Force on the Periodic Health Examination, 1994; Green, 1994).
A significant proportion of patients and providers fail to comply with even the least demanding recommendations for Pap screening frequency. Numerous barriers to screening have been identified that reduce access to Pap smears and other preventive services (Womeodu and Bailey, 1996). Organized screening programs implemented in other countries have shown higher compliance with recommended screening rates than ad hoc screening programs (Koopmanschap, van Oortmarssen, van Agt et al., 1990; van Ballegooijen, van den Akker-van Marle, Warmerdam et al., 1997).
Efforts have also been made to improve test performance by ensuring appropriate followup of results. In the case of abnormal test results, efforts have been focused on improving patient compliance. In the case of normal results, models that vary the frequency of testing intervals have been used to determine the optimum timing of screening visits, although there is disagreement on the interpretation of the models and on the recommendations of authoritative bodies.
More recently, efforts to improve Pap smear performance have focused on reducing the number of false negative smears, that is, cases in which premalignant or malignant cells have failed to be diagnosed either because of sampling error (abnormal cells failed to be placed on the slide) or detection error (abnormal cells were misdiagnosed as normal). Measures adopted to improve laboratory performance on this point include manual rescreening of a portion of slides initially evaluated as negative, an approach mandated by Federal law (Clinical Laboratory Improvement Amendments [CLIA]). Recently, several technologies have been developed to optimize the Pap test by reducing sampling or detection error. These technologies are a major focus of this report.
The purpose of the present report is threefold:
To determine the accuracy of cervical cytology using conventional Pap smears and newer technologies (thin-layer cytology, computerized rescreening, algorithm-based decisionmaking technology) for detecting cervical cancer and its precursors.
To estimate the direct medical costs associated with cervical cancer screening and evaluation, treatment and followup of cervical cytological abnormalities, and treatment and followup of cervical cancer.
To estimate the effects on total health care cost, morbidity, and mortality of regular cervical cytological screening using newer screening technologies (thin-layer cytology, computer rescreening, algorithm-based decisionmaking technology) compared with conventional Pap smear in women participating in a screening program.
On the first point, the report will review published studies comparing cervical cytological diagnosis with clinical diagnosis based on colposcopy or biopsy. The results of this review will form the basis for a meta-analysis.
On the second point, the report will identify and examine current claims data and other datasets to empirically estimate costs associated with cervical cytological screening.
On the third point, the report will review the literature on the effectiveness and cost-effectiveness of cervical cytology screening and use these data to develop a comprehensive cost-effectiveness model to examine the impact of the newer screening technologies. In the absence of definitive clinical trials on key questions of cervical cancer screening, policymakers have relied on decision-modeling studies to integrate epidemiological data on the natural history of cervical cancer precursors, data on the performance of diagnostic tests for early cervical cancer or cervical cancer precursors, and data on cost. These models estimate the efficacy of various screening programs, balance estimated efficacy against estimated cost, and lead to decisions about appropriate screening intervals and age cutoffs (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992; Mandelblatt and Fahs, 1988; Schechter, 1996; U.S. Congress Office of Technology Assessment, 1981, 1990).
Three new devices recently approved by the Food and Drug Administration (FDA) are considered in this report: ThinPrep®, Papnet®, and AutoPap®. The three devices employ three different types of technology: thin-layer cytology (ThinPrep®) and computerized rescreening using either neural-network technology (Papnet®) or algorithmic classification (AutoPap®).
Each of these technologies was developed to reduce the false negative rate associated with cervical cytological screening. The two major components of this false negative rate are false negatives related to sampling error and false negatives related to detection error. About two-thirds of false negatives are due to sampling error and the remaining one-third to detection error. Each of the new technologies is directed at one of these components of false negatives. Thin-layer cytology aims primarily to fix sampling error, whereas computerized rescreening targets detection error. This implies that neither technology will be able to reduce false negatives beyond a certain threshold.
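The arithmetic behind this ceiling can be sketched briefly. The split of false negatives (two-thirds sampling error, one-third detection error) comes from the text; the assumption that each technology acts on exactly one component and leaves the other untouched, and the function name, are illustrative simplifications.

```python
from fractions import Fraction

SAMPLING_SHARE = Fraction(2, 3)   # share of false negatives due to sampling error
DETECTION_SHARE = Fraction(1, 3)  # share of false negatives due to detection error


def remaining_false_negatives(component_share, fraction_eliminated):
    """Share of the original false negatives left after a technology removes
    `fraction_eliminated` of one component and none of the other."""
    return 1 - component_share * Fraction(fraction_eliminated)


# Best case: the targeted component is eliminated entirely.
best_case_thin_layer = remaining_false_negatives(SAMPLING_SHARE, 1)
best_case_rescreening = remaining_false_negatives(DETECTION_SHARE, 1)
print(best_case_thin_layer, best_case_rescreening)
```

Under these assumptions, even a perfect fix for sampling error leaves the detection-error share of false negatives in place, and the reverse holds for rescreening.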
Thin-layer cytology is a new technology for processing cytological samples. The sample is collected as in the conventional Pap test using a cervical broom or cervical spatula and endocervical brush, but rather than smearing the cytological sample directly onto a microscope slide, this new method suspends the sample cells in a fixative solution, disperses them, and then selectively collects cells on a filter. The cells are then transferred to a microscope slide for cytological interpretation. Because cytological samples are fixed immediately after collection, there are fewer artifacts in cellular morphology. Fewer cells on the slide are obscured, both because the process reduces artifactual material such as blood and mucus and because cells are deposited on the slide in a monolayer. Clinical studies of the ThinPrep® 2000 (Cytyc Corporation, Boxborough, MA) have shown that the sensitivity is improved compared with conventional Pap smears; however, few data are available on the specificity of this technology compared with conventional Pap smears. The improvement in sensitivity appears to be greater in populations with a low prevalence of cytological abnormalities.
Papnet® is a newly approved device that uses neural-network computerized rescreening of Pap smears initially read as negative by a cytotechnologist. The system works by using automated imaging of Pap smear slides and computerized interpretation of images. The Papnet® system (Neuromedical Systems, Inc.) identifies cells or clusters of cells that require review and displays up to 128 images of the slide likely to contain abnormalities. These images must be reviewed by a cytotechnologist who can decide whether to review the slide using light microscopy.
The AutoPap® 300 QC system (Neopath, Inc.), which uses algorithm-based decisionmaking technology, identifies slides exceeding a certain threshold for the likelihood of abnormal cells. The laboratory can select different thresholds, corresponding to 10, 15, and 20 percent review rates. In contrast to random rescreening, the population of slides selected by the AutoPap® 300 QC system is enriched with abnormalities and should contain 70-80 percent of the slides containing abnormalities missed by manual screening.
A variety of other technologies or clinical strategies have been proposed to improve Pap testing, including various devices for collecting a cytological sample from the cervix. Still other technologies have been proposed to augment or replace cervical cytological screening; for example, colposcopic photographs for review by experts (cervicography) and DNA testing for specific human papillomavirus (HPV). These technologies are not considered in the present report.
The majority of cases of invasive cervical cancer occur in women who have either never had a Pap smear or have not had a smear in the previous 5 years (Shingleton, Patrick, Johnston et al., 1995). Lack of screening may explain many of the differences in cervical cancer rates among ethnic groups (Miller, Kolonel, Bernstein et al., 1998); for example, overall mortality from cervical cancer in black women is higher than in white women; stage-specific survival is identical, but black women present with more advanced disease (Ries, Kosary, Hankey et al., 1997).
Although all sexually active women are at risk for cervical cancer, the disease is more common in women of low socioeconomic status and in those who smoke, have a history of multiple sex partners, or have early onset of sexual intercourse.
Certain types of HPV have a strong epidemiological association with cervical cancer. Types 16 and 18 are two of several oncogenic strains of the more than 70 types of HPV. Both incidence and prevalence of HPV are greatest in women under the age of 25 (Koutsky, 1997). Ho, Bierman, Beardsley et al. (1998) have recently reported a cumulative 3-year incidence of 43 percent, or an average annual incidence of 14 percent, in a cohort of college women. Prevalence of HPV DNA in women with normal cervical cytology decreases with age in a variety of populations (Bauer, Hildesheim, Schiffman et al., 1993; Coker, Jenkins, Busnardo et al., 1993; Figueroa, Ward, Luthi et al., 1995; Hildesheim, Gravitt, Schiffman et al., 1993; Wheeler, Parmenter, Hunt et al., 1993). A second peak is seen after age 40 in some studies (Figueroa, Ward, Luthi et al., 1995); whether this is due to reinfection, or a cohort effect secondary to differences in age at onset of sexual activity, is unclear. Data on postmenopausal women are rare. Reported prevalence in peri- and postmenopausal women ranges from 1 percent (Ferenczy, Gelfand, Franco et al., 1997) to 38.1 percent (Smith, Johnson, Figuerres et al., 1997), a difference that may be attributable to differences in populations, cohort effects, or viral assay techniques. An international study found prevalence in women between 50 and 59 ranging from 4.1 percent in Spain to 13.6 percent in Brazil (Munoz, Kato, Bosch et al., 1996). Prevalence in a German cohort of women over 55 was 3.2 percent (De Villiers, Wagner, Schneider et al., 1992).
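The average annual incidence quoted from Ho et al. corresponds to the cumulative 3-year incidence divided by three. The short sketch below, offered for illustration only, compares that simple average with the annual risk implied if the yearly risk were constant; the constant-risk assumption is ours and is not taken from the cited study.

```python
cumulative_3yr = 0.43   # cumulative 3-year incidence reported in the text
years = 3

# Simple average: cumulative incidence spread evenly over the period.
simple_average = cumulative_3yr / years

# Alternative under an assumed constant annual risk r: 1 - (1 - r)**years = cumulative.
constant_risk = 1 - (1 - cumulative_3yr) ** (1 / years)

print(f"simple average: {simple_average:.1%}, constant-risk equivalent: {constant_risk:.1%}")
```

The two figures differ because, under a constant annual risk, fewer women remain uninfected and at risk in each successive year.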
The natural history of HPV infection is complex, with clearance and persistence of viral DNA, along with progression to squamous intraepithelial lesion (SIL), varying depending on the viral type, patient characteristics such as age, and study design and assay methods (Herrero, 1996; Kiviat, 1996; Koutsky, 1997; Mitchell, Tortolero-Luna, Wright et al., 1996; Schiffman, Bauer, Hoover et al., 1993).
Although a small percentage of cervical cancers do not have detectable HPV DNA, even with sensitive assays, there is consensus that HPV infection is the causative agent for the vast majority of cervical cancers (Herrero, 1996; Kiviat, 1996; Koutsky, 1997; Schiffman, Bauer, Hoover et al., 1993). Certain HPV types are clearly more likely to progress to cancer than others, and identification of these types in cervical cells may have a role in determining optimal diagnostic and treatment strategies for patients with abnormal Pap smears (Cox, Lorincz, Schiffman et al., 1995).
Invasive cervical cancer usually is preceded by a long, premalignant phase during which it is readily treatable (Cannistra and Niloff, 1996; Wright and Richart, 1992; Wright, Kurman, and Ferenczy, 1994). Even if this premalignant phase progresses to invasive cancer, early-stage disease is highly curable: 5-year survival rates for early-stage disease are above 90 percent, but if the disease has spread outside of the pelvis (Stage IV), cure rates are only 10-15 percent.
Epidemiological data suggest that a substantial proportion of patients with low-grade squamous intraepithelial lesions (LSIL) will have regression if the lesions are not treated, a finding that supports a "watchful waiting" strategy involving retesting.
Despite the success of Pap smear screening, women continue to develop cervical cancer. Treatment of early-stage disease is effective, but both of the alternatives, radical hysterectomy or radiation therapy, can have significant short- and long-term morbidity (Cannistra and Niloff, 1996; Landoni, Maneo, Columbo et al., 1997). Mortality from late-stage disease remains high.
The primary target population for this evidence report is women of average cervical cancer risk in the United States who are candidates for Pap smear screening. For the purposes of our analysis, candidates for Pap smear screening include women between the age of onset of sexual activity and the age of 85.
Although a large proportion of cervical cancer occurs in women with very limited or no screening, we did not examine programs or policies designed to improve screening compliance. Some previous studies have focused on special populations such as elderly women (Fahs, Mandelblatt, Schechter et al., 1992; U.S. Congress Office of Technology Assessment, 1990) and elderly women who have not previously been screened (Mandelblatt and Fahs, 1988).
The principal practice setting considered will be the primary care practice in the United States (general internal medicine, family practice, adolescent medicine, and obstetrics/gynecology) and government and nongovernment family planning clinics (e.g., Planned Parenthood, public health clinics).
This section documents explicitly the methods and procedures used to develop the evidence report, emphasizing the comprehensive evaluation of the evidence and the analytic framework used. The section begins with a description of the research questions and evidence model that guided our work and proceeds to a detailed description of the techniques and approaches used in the literature review, including descriptions of the literature search and review parameters, MeSH terms used, types of study design included, number and identity of databases searched, and years included in the search. The methodologies used to produce cost estimates and carry out the meta-analysis and cost-effectiveness analysis are then described. Finally, the role of the report partner (American College of Obstetricians and Gynecologists) and quality control methods are described.
The report addresses three main questions:
What is the accuracy of cervical cytology using conventional Pap smears and new technologies (thin-layer cytology, computerized rescreening using neural network or algorithm-based technology) for detecting cervical cancer and its precursors?
What are the direct medical costs associated with cervical cancer screening, evaluation, treatment, and followup of cervical cytological abnormalities and treatment and followup of cervical cancer?
What are the effects on total health care cost, morbidity, and mortality of regular cervical cytological screening using thin-layer cytology, computerized rescreening using neural network technology, and computerized rescreening using algorithm-based technology compared with conventional Pap smear in women participating in a screening program?
The evidence model or "causal pathway" used is illustrated in Figure 1.
We drafted and revised the evidence model, soliciting input from the study's Advisory Panel of Technical Experts regarding the outcomes and harms of interest, and identified the type of evidence that exists to support each linkage. The specific linkages identified as key questions were the subject of a systematic literature review.
The project focused on several linkages that are crucial to the question of the cost and effectiveness of several new technologies for cervical cytological screening. The overall question of the effectiveness of cervical cytological screening on morbidity and mortality is described by the overarching linkages between cervical cytological screening and reductions in mortality and morbidity, as well as influences on cost and other health outcomes such as quality of life. Strong but indirect evidence supports the effectiveness of cervical cytological screening in reducing cervical cancer morbidity and mortality. This evidence includes quasi-experimental observational studies describing cervical cancer incidence before and after implementation of screening programs. Additional indirect evidence comes from decision- and simulation-modeling of cervical cancer screening.
Current costs associated with cervical cytological screening are not well described in the literature, and in this project an empirical investigation of health care claims and other databases was undertaken to address this linkage in the evidence model.
The link between cervical cytological screening and detection of abnormalities is investigated in a literature review on the diagnostic accuracy of conventional Pap smear screening and the new technologies. Discrimination, sensitivity, and specificity are the most relevant data for the assessment of the new technologies.
We did not specifically address the well-established linkages between detection of abnormal cytology or early cervical cancer and reductions in cervical cancer morbidity and mortality through early treatment of cervical lesions.
The evidence model also functioned as a conceptual basis for the construction of a working mathematical model of the cost and effectiveness of cervical cytological screening that synthesizes all the linkages depicted regarding cervical cancer screening. Our approach thus extends the idea of the evidence model to a mathematical model that allows us to examine the impact of various cervical cancer screening strategies on the health and economic outcomes depicted in the evidence model. Both models provide an inventory of the inputs relevant to a screening decision and give a sense of which screening questions are tractable and for which questions reasonable scientific data exist. The models thus also guided the design of our literature search to focus on those inputs that most strongly influence the relative desirability of alternative decision strategies to various clinical decisionmakers.
The comprehensive review of the literature, from identification of databases through abstraction of individual articles into the evidence tables, was a multistep, sequential process. This process is detailed below.
Primary sources of literature for the two key research questions that were the subject of literature reviews were six of the most widely used online bibliographic databases: MEDLINE, CancerLit, HealthSTAR, CINAHL, EMBASE, and EconLit. Although there is considerable overlap among these databases, each contains unique information.
Produced by the U.S. National Library of Medicine, the MEDLINE database is widely recognized as the premier source for bibliographic and abstract coverage of biomedical literature. MEDLINE coverage began in 1966. More than 8.7 million records from more than 3,600 journals are indexed, plus selected monographs of congresses and symposia. Abstracts are included for about 67 percent of the records.
Produced by the U.S. National Cancer Institute, CancerLit is an important source of bibliographic records (most with abstracts) beginning in 1983 and pertaining to all aspects of cancer therapy, including experimental and clinical cancer therapy; chemical, viral, and other cancer-causing agents; mechanisms of carcinogenesis; biochemistry, immunology, and physiology of cancer; and mutagen and growth factor studies. Approximately 200 core journals contribute a large percentage of the records. Other entries are drawn from proceedings of meetings, government reports, symposia reports, theses, and selected monographs. Indexed materials include articles from journals, abstracts of papers presented at professional meetings, government and technical reports, dissertations, and monographs.
HealthSTAR, produced cooperatively by the U.S. National Library of Medicine and the American Hospital Association, contains citations to the published literature on health services, technology, administration, and research from 1975 to the present. Topics included that were of particular interest to us are evaluation of patient outcomes; effectiveness of procedures, programs, products, services, and processes; health economics and financial management; and quality assurance. This database contains citations and abstracts (when available) to journal articles, monographs, technical reports, meeting abstracts and papers, book chapters, government documents, and newspaper articles.
The Cumulative Index to Nursing & Allied Health Literature (CINAHL) provides authoritative coverage of the literature related to nursing and allied health from 1983 to the present. CINAHL was chosen for the literature searches on cervical cytology primarily because it indexes several medical laboratory journals.
EMBASE is a comprehensive pharmacological and biomedical database containing over 3 million records from 1980 to the present, indexed from 3,500 journals published in 70 countries. More than 65 percent of the records contain abstracts.
EconLit contains abstracts of journal articles, books, and working papers, as well as book reviews and citations to articles in collective volumes. It covers economic literature broadly, from 1969 to the present.
Computerized searches of these online databases were supplemented by secondary searches, including the manual scanning of newly published journal issues that had not yet been indexed in the online databases. This scanning was accomplished through online searches of journal websites as well as searches of printed journals obtained through subscriptions and the Duke University Medical Center Library. In addition, the World Wide Web sites of the manufacturers of the automated cytology devices reviewed in this report were checked several times a week, along with those of relevant professional societies.
The two strategies used to search the above computerized databases were developed in consultation with a Duke University Medical Center librarian who specializes in evidence-based medicine research. There were several iterations of each search strategy before the final strategies were agreed upon. The final strategy for articles on diagnostic testing for cervical cancer was:
1. Vaginal smears/
2. ((pap or papan$) and (smear$ or test$)).tw.
3. (papnet or autopap or thinprep).tw.
4. 1 or 2 or 3
5. exp Cervix neoplasms/
6. Cervix dysplasia/
7. Cervical Intraepithelial Neoplasia/
8. dyskaryo$.tw.
9. 5 or 6 or 7 or 8
10. exp "Sensitivity and Specificity"/
11. (sensitivity or specificity).tw.
12. exp Diagnostic errors/
13. 4 and (10 or 11 or 12)
14. 13 and 9
15. limit 14 to (human and english language)
16. Papillomavirus, Human/
17. 15 not 16
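The way the numbered search sets combine can be illustrated with ordinary set operations. The sketch below is not database search syntax; it mimics the Boolean logic of the diagnostic testing strategy using Python sets of hypothetical article identifiers, and it omits the "limit" step (human, English language). The function and argument names are assumptions introduced for illustration.

```python
def combine_diagnostic_search(test_terms, disease_terms, accuracy_terms, hpv_heading):
    """Mirror the set combination of the diagnostic testing strategy.

    test_terms, disease_terms, accuracy_terms: lists of sets of article identifiers,
        one set per search line.
    hpv_heading: set of articles indexed under the human papillomavirus heading.
    """
    tests = set().union(*test_terms)          # "1 or 2 or 3"
    disease = set().union(*disease_terms)     # "5 or 6 or 7 or 8"
    accuracy = set().union(*accuracy_terms)   # "10 or 11 or 12"
    candidates = tests & accuracy & disease   # "4 and (10 or 11 or 12)", then "13 and 9"
    return candidates - hpv_heading           # "15 not 16"


# Hypothetical article identifiers for each search line.
result = combine_diagnostic_search(
    test_terms=[{1, 2, 3}, {3, 4}, {5}],
    disease_terms=[{2, 3, 4, 9}, {5}],
    accuracy_terms=[{3, 4, 5}, {7}],
    hpv_heading={5},
)
print(sorted(result))
```

An article is retained only if it matches at least one screening-test term, at least one disease term, and at least one accuracy term, and is not indexed under the excluded heading.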
The final search strategy for articles on the costs and health outcomes associated with cervical cancer screening was as follows:
1. Vaginal smears/
2. ((pap or papan$) and (smear$ or test$)).tw.
3. (papnet or autopap or thinprep).tw.
4. 1 or 2 or 3
5. exp "costs and cost analysis"/
6. (cost$ or expenditure$).tw.
7. ec.fs.
8. 5 or 6 or 7
9. 4 and 8
10. limit 9 to (human and english language)
11. exp Decision Support Techniques/
12. exp Models, Statistical/
13. Technology assessment, biomedical/
14. Monte carlo method/ or Survival Analysis/
15. 11 or 12 or 13 or 14
16. 4 and 15
17. limit 16 to (human and english language)
18. 10 or 17
These strategies were initially developed using the National Library of Medicine MeSH key word nomenclature developed for MEDLINE. The Duke University Medical Center librarians assisted with the translation of these search strategies to the key word structures used by other databases. The basic search terms remained the same. Each online search began with the first year of the database (see individual database descriptions above).
After the initial runs in January 1998, the computerized searches were updated twice, in February and March. The final yield of the two searches was 1,538 unduplicated articles--939 articles were identified by the diagnostic testing searches and 671 by the cost and health outcomes searches, including articles identified through manual searches. Approximately 67 percent of all articles were identified through MEDLINE, 23 percent through EMBASE, and 5 percent through manual searches, with the remaining 5 percent divided among CancerLit, HealthSTAR, and CINAHL. EconLit did not identify any relevant or unique articles.
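The relationship between the per-search counts and the unduplicated total can be checked with simple inclusion-exclusion arithmetic. The sketch assumes that 1,538 is the number of unique articles across both searches, so that any article found by both searches is counted once; that reading is our interpretation of "unduplicated."

```python
diagnostic = 939            # articles identified by the diagnostic testing searches
cost_outcomes = 671         # articles identified by the cost and health outcomes searches
unduplicated_total = 1538   # unique articles across both searches (assumed interpretation)

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
in_both = diagnostic + cost_outcomes - unduplicated_total
print(in_both)
```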
As online literature searches were conducted, results were imported into Pro-Cite Version 2.0, a database software program for storing, managing, and retrieving bibliographic information. Each article was given a unique identifier within the Pro-Cite database. An ongoing search for duplicates was conducted. Initially, each article was coded to indicate its source and key question(s). As screening and data abstraction progressed, each article was coded for the stage at which it was excluded from the evidence report.
Separate sets of criteria for including articles in the evidence report were developed for the two topics that were the subject of literature reviews (diagnostic testing and cost and health outcomes). In each case, final screening criteria were developed through an iterative process. Each iteration of criteria was pilot-tested by each reviewer/abstractor on a subset of randomly chosen articles.
Articles on diagnostic testing were clinically and methodologically complex and thus required a two-step screening process. In Step 1, articles were screened on the basis of the basic information available through the online databases (primarily title, authors, and abstract when available). Articles at this step were eliminated from further consideration based on answers to three questions:
Was cervical cytology evaluated as a screening test? If no, then article excluded.
Was the screening test for primary screening only, rescreening only, or primary screening with rescreening? If none of these, then article excluded.
Was the reference standard histology, histology or negative colposcopy, or cytology? If none of these, then article excluded.
At this stage, the following information was also recorded:
Type of rescreening included: none, manual, interactive, or independent.
Description of target population.
Bibliographic records were screened by three pairs of clinicians, two pairs consisting of a general internist and obstetrician/gynecologist and one pair consisting of a medical student and obstetrician/gynecologist. The decisions of the two partners in each pair were compared, and a kappa statistic was calculated to estimate the strength of the agreement between them. Differences of opinion were then reconciled. Only when both clinician-reviewers agreed to include or exclude an article was the decision entered into the Pro-Cite database. Copies of the full text of all articles passing the initial screening were obtained. Each included article was subjected to a further screening (Step 2) by the same pair of reviewers who had performed the initial screening of the corresponding bibliographic record (Step 1).
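The agreement calculation for a reviewer pair can be sketched as follows. The report does not state which kappa variant was used; the sketch assumes Cohen's kappa for two raters making include/exclude decisions, and the counts are hypothetical.

```python
def cohen_kappa(both_include, a_only, b_only, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions.

    both_include: records both reviewers included
    a_only: records only reviewer A included
    b_only: records only reviewer B included
    both_exclude: records both reviewers excluded
    """
    n = both_include + a_only + b_only + both_exclude
    if n == 0:
        raise ValueError("no records to compare")
    observed = (both_include + both_exclude) / n
    a_rate = (both_include + a_only) / n    # reviewer A's inclusion rate
    b_rate = (both_include + b_only) / n    # reviewer B's inclusion rate
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)  # agreement expected by chance
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical decisions on 100 bibliographic records.
kappa = cohen_kappa(both_include=40, a_only=5, b_only=5, both_exclude=50)
print(round(kappa, 3))
```

Kappa discounts the agreement that two reviewers would reach by chance given their individual inclusion rates, which is why it is lower than the raw proportion of matching decisions.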
As in Step 1, the criteria used in Step 2 went through several iterations to assure that the most salient articles would be included. The four questions used for excluding articles at Step 2 were:
Did the study use a reference standard? If not, then article excluded.
Was the reference standard histology or colposcopy? Histology and colposcopy qualified the article for inclusion; articles with a cytology reference standard were excluded.
For studies comparing histology or colposcopy with cytology as a screening test: were these tests reasonably concurrent, i.e., up to 3 months apart? If not, then article excluded.
Can all cells of a 2-by-2 table be completed? If not, then article excluded. The 2-by-2 table is defined as a tabulation of the number of subjects with normal and abnormal screening test results according to the specified screening test threshold versus the number of subjects with normal or abnormal reference standard results according to the reference standard threshold.
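The 2-by-2 table required by the last criterion is what makes sensitivity and specificity computable. A minimal sketch, with a hypothetical class name and hypothetical counts, shows why all four cells are needed:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Screening test result versus reference standard result."""
    tp: int  # test abnormal, reference abnormal
    fp: int  # test abnormal, reference normal
    fn: int  # test normal, reference abnormal
    tn: int  # test normal, reference normal

    def sensitivity(self) -> float:
        # Proportion of reference-abnormal subjects with an abnormal screening test.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-normal subjects with a normal screening test.
        return self.tn / (self.tn + self.fp)


# Hypothetical study counts.
table = TwoByTwo(tp=45, fp=30, fn=45, tn=880)
print(f"sensitivity={table.sensitivity():.2f}, specificity={table.specificity():.3f}")
```

Sensitivity depends on the false-negative cell and specificity on the true-negative cell, so a study that does not verify test-negative subjects cannot supply either estimate directly.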
The review process was the same as in the first step, with the two clinicians in each team comparing their decisions and reaching agreement to include or exclude the article. The decision to exclude at Step 2 was then entered into the Pro-Cite database. The two major reasons for excluding an article at this stage were lack of histology/colposcopy as a reference standard and the lack of sufficient data to complete a 2-by-2 table.
Of the 939 bibliographic references reviewed, 561, or approximately 60 percent, were excluded during Step 1 screening, and another 293, or 31 percent, during Step 2 screening. Eighty-six articles passed both screens.
Few articles pertaining to the new technologies were included among the 86 articles passing both screens. In most cases, this was due to the lack of a histology/colposcopy reference standard. We sought guidance from cytology society guidelines (Intersociety Working Group for Cytology Technologies, 1998) and FDA documents (Food and Drug Administration, 1994) about acceptable reference standards other than histology or colposcopy. These resources suggested that cytology would be an acceptable surrogate reference standard, but only when an independent panel of cytology professionals was used to arrive at a consensus diagnosis. Furthermore, the Intersociety Working Group guidelines called for histological validation of a significant proportion of high-grade cytological findings.
We also explored the methodological literature on diagnostic tests to determine techniques for dealing with lack of verification of test negative subjects. Information about the incremental test performance characteristics can be obtained from a direct comparison of independently applied conventional and new tests, in which case it is not necessary to obtain reference standard results for subjects who are negative on both tests (Chock, Irwig, Berry et al., 1997). Using this method, one cannot calculate sensitivity and specificity directly, but a relative true positive rate (TPR) and a relative false positive rate (FPR) can be calculated. These statistics can be used to modify estimates of the conventional test performance to yield estimates of the performance of the new test relative to the reference standard. These methods apply only when the tests being compared are used independently. For example, studies that applied initial manual screening followed by computerized rescreening could not be evaluated with these methods because the use of rescreening was conditional on a negative initial screen and thus the two tests were not applied independently. These methods would apply primarily to studies of alternatives to conventional Pap when used as primary screeners or studies that use independent comparisons of computerized versus manual rescreening.
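The relative-rate approach can be sketched as follows. The sketch assumes a paired design in which both tests are applied to every subject and the reference standard is obtained for anyone positive on either test; the function names, dictionary keys, counts, and the conventional-test estimates are illustrative assumptions, and the formulas are one common formulation rather than a reproduction of the cited method.

```python
def relative_rates(diseased, non_diseased):
    """Relative TPR and FPR of a new test versus the conventional test.

    Each argument is a dict of verified subject counts with keys
    'both' (positive on both tests), 'new_only', and 'conv_only'.
    Subjects negative on both tests are not needed.
    """
    def ratio(counts):
        new_positive = counts["both"] + counts["new_only"]
        conv_positive = counts["both"] + counts["conv_only"]
        return new_positive / conv_positive

    return ratio(diseased), ratio(non_diseased)


def apply_to_conventional(rel_tpr, rel_fpr, conv_sensitivity, conv_specificity):
    """Scale conventional-test estimates by the relative rates (capped at 1)."""
    new_sensitivity = min(1.0, rel_tpr * conv_sensitivity)
    new_fpr = min(1.0, rel_fpr * (1 - conv_specificity))
    return new_sensitivity, 1 - new_fpr


# Hypothetical verified counts from a paired comparison.
rel_tpr, rel_fpr = relative_rates(
    diseased={"both": 60, "new_only": 30, "conv_only": 15},
    non_diseased={"both": 20, "new_only": 25, "conv_only": 20},
)
new_sens, new_spec = apply_to_conventional(rel_tpr, rel_fpr,
                                           conv_sensitivity=0.50,
                                           conv_specificity=0.98)
print(rel_tpr, rel_fpr, round(new_sens, 3), round(new_spec, 4))
```

Because subjects negative on both tests appear in neither the numerator nor the denominator of the ratios, their reference standard results are not required, which is the property the text relies on.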
This review of methodological issues led to the development of a separate set of screening criteria for studies of the new technologies (Step 3 screen). We reevaluated articles on monolayer slide preparation and neural-network or algorithm-based screening technologies that had been excluded in Step 2 either because they used a cytology reference standard or because they failed to allow estimates of both sensitivity and specificity. Again, two clinicians in each team compared their decisions and reached agreement on whether to include or exclude an article. The Step 3 screening criteria were as follows:
Did the study use a two-armed prospective design? If not, then article excluded. (A two-arm design implies that the tests being compared are applied to the same set of subjects or slides.)
Were discordant results from the two study arms adjudicated by an independent panel of experienced cytology professionals? If not, then article excluded.
Were the majority of those testing positive for HSIL verified with histology or colposcopy? Studies that did not verify at least half of those with screening tests positive for HSIL were excluded.
Did the study design allow for separate analyses of sensitivity (or relative TPR) and specificity (or relative FPR)? If not, then article excluded.
A total of 59 articles were considered at this stage (12 on AutoPap, 27 on Papnet®, and 20 on ThinPrep®). Forty-three of these fit the description provided above (excluded during the Step 2 screen because of a cytology reference standard or failure to allow estimates of sensitivity and specificity). Sixteen new articles were also brought to our attention by reviewers or manufacturers. Of these 16 articles, 7 were unpublished manuscripts and 2 were published only in abstract form.
Articles were excluded on the basis of responses to two questions:
Was the screening test AutoPap®, Papnet®, ThinPrep®, or conventional Pap? If none of these, then article excluded.
Does the article assess the effect of screening on life expectancy or quality, number of cervical cancer cases avoided, or total health care costs? If none of these, then article excluded. Note: Studies that reported costs of screening test alone, without subsequent evaluation- and management-related costs, were excluded.
Bibliographic records (title, author, and abstract, when available) from the Pro-Cite database were screened by two pairs of reviewers, each including a doctoral-level economics graduate student and either a clinician or health policy analyst. Decisions by the partners were recorded, a kappa statistic was calculated, and agreement was reached where discrepancies existed. If the abstract satisfied the screen, then the full article was reviewed with respect to the same two criteria. The decision to exclude an article was entered into the computer database after the abstract and/or the full article had been screened.
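The report does not say which kappa statistic was used. As an illustration only, the sketch below computes Cohen's kappa for two reviewers who each label the same records as "include" or "exclude"; the function name and the labels are assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of records")
    n = len(rater_a)
    # Observed proportion of records on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1.0 - expected)
```

A kappa of 1 indicates perfect agreement and a kappa of 0 indicates agreement no better than chance.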
Of the 672 articles identified, 638, or 95 percent, were eliminated during the screening process. Thirty-four articles were included in the review.
At the time the literature searches were conducted, there were no completed randomized clinical trials comparing conventional Pap tests with new technologies to detect cervical cancer and its precursors. Study designs included in the diagnostic testing review were diagnostic test evaluations, meta-analyses, and review articles. Cost and health outcomes articles included were cost studies, typically cost-effectiveness and cost-benefit analyses.
The review team responsible for including a given article was also responsible for abstracting data and other key information from that article. The goal of the data abstraction process was to abstract essential information directly into the template of the evidence table in the format required by the Agency for Health Care Policy and Research (AHCPR).
The evidence table entries were created by one member of each review team, over-read by the other team member, and then revised. For the cost and health outcomes articles, each evidence table entry was again over-read by a clinician.
| Classification System | Categories (least to most severe) |
|---|---|
| The Bethesda System (National Cancer Institute Workshop, 1993) | Within normal limits: normal. Benign cellular changes: infection, reactive repair. Epithelial abnormalities: ASCUS; squamous intraepithelial lesion (SIL), low grade (LSIL) or high grade (HSIL); invasive carcinoma |
| Richart (1973) | Condyloma; cervical intraepithelial neoplasia (CIN) Grade 1, Grade 2, Grade 3 |
| Reagan (1979) (WHO) | Atypia; mild dysplasia; moderate dysplasia; severe dysplasia; in situ carcinoma |
| Papanicolaou (Nyirjesy, 1972) | I; II; III; IV; V |
ASCUS = atypical squamous cells of undetermined significance.
Some studies that failed to permit calculation of sensitivity and specificity directly nonetheless allowed calculation of relative TPR and relative FPR (false positive rate) according to the method of Chock, Irwig, Berry et al. (1997). This method was applied to studies comparing two independently applied screening tests in which verification was obtained on any patient testing positive on either test (but not those testing negative on both tests).
Given the two very distinct sets of articles reviewed for the evidence report, two separate sets of criteria to judge their quality were needed.
The goal was to develop a single numeric or alphabetic summary of the methodological quality of individual studies. A systematic approach was used to reach consensus on rigorous, predetermined methodological criteria and the relative numeric weights to be assigned to each. The participants were nine members of the study's work group (six clinicians, two economists, one health policy analyst). Each member of the group was familiar with the diagnostic testing articles. The group initially identified more than a dozen evaluation criteria; the consensus process narrowed the list to seven. The weight given to each criterion was determined by each participant writing his/her numerical assessment of the value of that criterion (blinded to the rest of the group). The means of these "votes" were calculated. Each participant received a copy of his/her own responses, graphically depicted in relation to the mean of all responses, and was requested to confirm or reconsider his/her responses. The revised responses were then calculated and the means affirmed by the group as the weight. The quality review criteria and their weights (in parentheses) are:
Were the test and reference standard measured independently (blind) of each other? yes (2); no (0)
Was the test compared with a valid reference standard? histology (2); histology or negative colposcopy (1); cytology (0)
Was the choice of patients who were assessed by the reference standard independent of the test's results? all test positives and test negatives verified (2); test positives and random fraction of test negatives verified (1); test positives and selected sample or none of test negatives verified (0)
How was the study sample collected? consecutive or random (2); other (0)
Was the spectrum of disease/nondisease defined? yes (1); no (0)
How was the study published? paper (1); abstract (0)
What was the industry relationship to the study? neither done nor supported by a manufacturer (1); supported by a manufacturer (1/2); done by a manufacturer (0)
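The seven weighted criteria above lend themselves to a simple lookup-and-sum calculation. The sketch below is one way to tabulate them; the dictionary keys and level names are shorthand invented for this example, while the weights are those listed above. The maximum attainable score under these weights is 11.

```python
# Weights from the seven quality criteria; key names are illustrative shorthand.
QUALITY_WEIGHTS = {
    "blinding": {"yes": 2.0, "no": 0.0},
    "reference_standard": {
        "histology": 2.0,
        "histology_or_negative_colposcopy": 1.0,
        "cytology": 0.0,
    },
    "verification": {
        "all_positives_and_negatives": 2.0,
        "positives_and_random_negatives": 1.0,
        "positives_and_selected_or_no_negatives": 0.0,
    },
    "sampling": {"consecutive_or_random": 2.0, "other": 0.0},
    "spectrum_defined": {"yes": 1.0, "no": 0.0},
    "publication": {"paper": 1.0, "abstract": 0.0},
    "industry": {
        "independent": 1.0,
        "manufacturer_supported": 0.5,
        "manufacturer_done": 0.0,
    },
}

# Highest possible total: 2 + 2 + 2 + 2 + 1 + 1 + 1 = 11.
MAX_SCORE = sum(max(levels.values()) for levels in QUALITY_WEIGHTS.values())


def quality_score(study: dict[str, str]) -> float:
    """Sum the weights for one study's answers to all seven criteria."""
    missing = set(QUALITY_WEIGHTS) - set(study)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(QUALITY_WEIGHTS[name][study[name]] for name in QUALITY_WEIGHTS)
```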
The major difficulty with this set of articles was that the two types of studies (health outcomes and costs), sometimes distinct and sometimes overlapping, generated many criteria. There were several expert sources of quality criteria by which to judge these articles that were readily affirmed by the work group and reaffirmed by the study's Advisory Panel of Technical Experts (Drummond, Richardson, O'Brien et al., 1997; Gold, Siegel, Russell et al., 1996; O'Brien, Heyland, Richardson et al., 1997; Siegel, Weinstein, Russell et al., 1996).
Letters representing the criteria that were not satisfied by a given article are shown in the last column of the evidence table. The criteria were:
For articles that modeled health outcomes:
There was a description or diagram of a decision model.
The clinical strategies were described in detail.
A comprehensive literature review was conducted.
The control group was described.
Sources of data for probability estimates were identified.
The source of the utility scale or measurement method was given.
Clinically relevant outcomes of interest were described in detail.
Results of a sensitivity analysis were presented.
For articles that modeled costs:
There was a description of the framework for the cost analysis.
Modeling assumptions were stated explicitly.
There was sufficient description of estimates of types of resources used, amount of resource use, and unit costs, and their sources.
There was a critique of data quality.
The cost data were current, i.e., 1988 or more recent.
There was a description of a method to adjust costs for inflation.
There was a statement of discount rates used (for multiperiod or longitudinal studies).
For articles that compared health outcomes with and without Pap smear screening or with and without use of new Pap technology:
Study population was representative or clearly defined.
Screened and unscreened populations were concurrent (e.g., before-after design).
Screened and unscreened populations were randomly allocated.
The method of ascertainment of health outcomes was the same in the screened and unscreened groups.
The ascertainment of health outcomes was complete (loss to followup <5 percent).
The costs of screening, diagnosing, and treating cervical cancer were estimated by Health Economics Research, Inc., utilizing both claims and secondary data sources. A comprehensive approach was used to incorporate costs associated with all medical services provided. For instance, the costs associated with both physician and pathology services were included in the total cost of screening for cervical cancer. In the case of diagnosis and treatment, all costs of providing outpatient and inpatient services were estimated. However, societal costs or indirect costs, such as those associated with lost wages or with waiting time at the physician's office, were not included. The methodology employed is discussed in detail below.
Separate estimates are provided for those 20-64 years of age, and those 65 years and older (eligible for Medicare). The costs for the under-65 population were estimated from MEDSTAT data, and for the older subgroup Medicare's resource-based relative value scale (RBRVS) fee schedule, clinical laboratory fee schedule, diagnosis-related group payment rates, and ambulatory surgery center payment rates were utilized. Charges and payment information obtained from these sources were then converted to reflect costs associated with the services provided. Finally, all costs were inflated to 1997 dollars. The methodology used to estimate the cost for each group is discussed separately, starting with the estimates for women 20-64 years.
To obtain cost estimates for this age group, Health Economics Research, Inc., analyzed the MarketScan database created by the MEDSTAT Group and previously used in cost analyses of AHCPR practice guidelines. The MEDSTAT MarketScan database provides claims information for a large sample of privately insured individuals employed at large private companies across the country. The claims data are collected from over 100 different insurance companies, Blue Cross and Blue Shield plans, and third-party administrators (for self-insured employers). The data represent the medical experiences of insured active employees, early retirees, Consolidated Omnibus Budget Reconciliation Act (COBRA) participants, and their dependents. These data were used to estimate screening and treatment costs for the population under the age of 65. For our analyses, we used four types of claims files: inpatient claims, physician claims, outpatient department claims, and pharmaceutical claims. Information on service dates, procedures, diagnoses, and payments was abstracted. We analyzed data for a 3-year period, 1992-94, which represented the medical experience for roughly 2.5 million privately insured beneficiaries with more than 500,000 inpatient admissions and over 76 million outpatient claims.
Although the MarketScan database is a rich source of medical utilization information for the under-65 population, a number of issues arise when these data are used that may or may not affect our analyses. First, our ability to track treatment costs over time is limited because individuals drop in and out of the MarketScan database as their employment changes. Thus, the data were used to estimate average costs of treatment and screening at an aggregated level. It is also important to note that the MarketScan data represent a nonrandom sample of the under-65 population. The MarketScan database includes a disproportionate number of persons employed in large firms, especially in California, Michigan, Ohio, and Texas. To the extent that large employers generally offer more generous insurance benefits than do smaller firms, utilization estimates derived from the MarketScan database are likely to overstate the general under-65 utilization experience. This bias would most directly affect cost estimates for cancer treatment, rather than cost estimates for specific diagnostic procedures or therapeutic services, as episode of care estimates are most affected by differential utilization patterns. Better insured women may use more services as well as more expensive services than less well-insured women. Last, the MarketScan database reflects the mix of fee-for-service and managed care arrangements present in the early 1990s. To the extent that the risk sharing and payment arrangements for the set of large firms in the MarketScan database differ systematically from those observed for the general under-65 population, there might be biases in the derived cost estimates. However, the direction of the bias is not readily known.
We used the MEDSTAT data to estimate the unit cost of screening, diagnosis, and treatment of cervical cancer. Since 3 years of data (1992, 1993, and 1994) were analyzed, all 1992 and 1993 charges were inflated to reflect 1994 charges. National medical care services (hospital and physician services) inflation rates of 12.02 percent for 1992 charges and 5.17 percent for 1993 charges were used (CPI-U, U.S. Bureau of Labor Statistics, 1982-84).
The costs associated with hospital and physician services were derived from the MEDSTAT data estimates by utilizing appropriate adjustments. In the case of services provided in a hospital setting, the total covered charge for each service submitted by the hospital was identified from the MEDSTAT data. The inpatient hospital charges include room and board and ancillary charges. To convert these charges to costs, 1994 cost-to-charge ratios published by the American Hospital Association (AHA) (obtained by AHA through its Annual Survey of Hospitals) were utilized. The 1994 cost-to-charge ratio was 0.58. That is, on average across the Nation, the cost associated with the inpatient services provided was 58 percent of the charge.
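The two adjustments described for inpatient hospital charges can be chained as in the sketch below. It assumes the stated inflation rates are applied as simple multipliers (1992 charges times 1.1202, 1993 charges times 1.0517), which is our reading of the text; the function and constant names are illustrative.

```python
# Inflation from the charge year to 1994, as stated in the text.
INFLATION_TO_1994 = {1992: 0.1202, 1993: 0.0517, 1994: 0.0}

# 1994 national cost-to-charge ratio from the AHA Annual Survey of Hospitals.
COST_TO_CHARGE_1994 = 0.58


def charge_in_1994_dollars(charge: float, year: int) -> float:
    """Inflate a hospital charge from its claim year to 1994 charges."""
    if year not in INFLATION_TO_1994:
        raise ValueError(f"no inflation factor for year {year}")
    return charge * (1.0 + INFLATION_TO_1994[year])


def inpatient_cost_1994(charge: float, year: int) -> float:
    """Convert an inpatient charge to an estimated 1994 cost."""
    return charge_in_1994_dollars(charge, year) * COST_TO_CHARGE_1994
```

Under these assumptions, a $1,000 inpatient charge from 1992 corresponds to $1,120.20 in 1994 charges and to an estimated cost of about $650.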
Since no similar cost-to-charge ratio was available for physician charges, it was assumed that the payments made by the health care plans closely reflected the costs associated with providing these services. Thus, payments were used as a proxy for costs. This approach provides a "total" cost estimate for a given procedure or set of procedures. It not only includes the amount reimbursed by the insurance plan but also any copayments made by the patient. These payments differ from charges because they include only the amount eligible for payment under the benefit plan after pricing guidelines such as discounts and fee schedules are applied.
Medicare's RBRVS fee schedule for physician services, Medicare's Clinical Laboratory fee schedule for laboratory services, national average diagnosis-related group (DRG) payments for hospital admissions, and ambulatory surgery center payment rates for outpatient procedures were used to identify the payments associated with services received for cervical cancer screening, diagnosis, and treatment. The 1994 Medicare payments were used to allow for comparability in the estimates between the over-65 and the under-65 age groups (MEDSTAT data estimates). The Medicare payments were assumed to closely approximate the national average cost of providing the service, and therefore no conversion factors were employed (Eisenberg, 1989).
Cost estimates for both the under-65 and the over-65 age groups were updated to reflect 1997 dollars. Separate inflators were employed for services provided at institutional and noninstitutional settings. Services received at institutions were updated to 1997 dollars using the DRI-Prospective Payment System (PPS) Hospital Market Index and Health Care Financing Administration (HCFA) Capital Index. A rate of 9.85 percent was utilized. The rate was calculated using DRI-PPS Hospital Consumer Price Survey (CPS) Market Index (DRI/McGraw Hill) for operating costs, and HCFA Capital Index (Federal Register) for capital cost. A rate combining the operating and capital inflator was obtained by assuming that 90 percent of costs were related to operations and 10 percent to capital. The Medicare Economic Index (MEI) was used to inflate 1994 costs of all noninstitutional services to 1997 dollars. The MEI of 8.40 percent was utilized.
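The update to 1997 dollars can be sketched in the same way. The 9.85 percent and 8.40 percent rates and the 90/10 operating-to-capital split come from the text; the component operating and capital rates are not reported, so any values passed to `composite_rate` are hypothetical. Function names are illustrative.

```python
INSTITUTIONAL_RATE = 0.0985     # combined DRI-PPS / HCFA capital rate, 1994 to 1997
NONINSTITUTIONAL_RATE = 0.0840  # Medicare Economic Index, 1994 to 1997


def composite_rate(operating_rate: float, capital_rate: float,
                   operating_share: float = 0.90) -> float:
    """Weighted average of operating and capital inflation rates."""
    return operating_share * operating_rate + (1.0 - operating_share) * capital_rate


def to_1997_dollars(cost_1994: float, institutional: bool) -> float:
    """Inflate a 1994 cost to 1997 dollars using the setting-specific rate."""
    rate = INSTITUTIONAL_RATE if institutional else NONINSTITUTIONAL_RATE
    return cost_1994 * (1.0 + rate)
```

For instance, a 1994 institutional cost of $580 becomes about $637 in 1997 dollars, and a $100 physician service becomes $108.40.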
To combine data from multiple studies describing the performance of the conventional Pap test in discriminating between patients with and without cervical lesions, we used the effectiveness score, a measure described by Hasselblad and Hedges (1995) and Fahey, Irwig, and Macaskill (1995). The effectiveness score takes account of both sensitivity and specificity. As is well known, sensitivity and specificity are interdependent; both vary according to the test threshold used, but in opposite directions. The effectiveness score allows us to consider both measures by fitting a receiver operating characteristic (ROC) curve (Swets, 1988) through a logistic odds transformation of sensitivity and specificity, thus accounting for the interdependence of the two.
The classifications used to describe the results of cervical cytology in the clinical research literature have an ordered categorical response instead of a dichotomous one. Such a response permits the use of multiple discrete cut-points for defining an abnormal Pap test. In addition to these explicit thresholds, there may be differences in the implicit thresholds used by cytologists in labeling a slide as abnormal. One of the main advantages of using the effectiveness score is that it is often independent of the cut-points used for the test.
The effectiveness score is more normally distributed than either sensitivity or specificity and can be thought of as a gauge of the overall discriminatory ability of the test. To standardize the effectiveness score, we adjusted the standard deviation of the logistic normal distribution curve by the square root (3)/pi to get the following measure:
$$\delta = \frac{\sqrt{3}}{\pi}\left[\log\left(\frac{\text{sensitivity}}{1-\text{sensitivity}}\right) + \log\left(\frac{\text{specificity}}{1-\text{specificity}}\right)\right]$$
Because it is standardized, the effectiveness score can be interpreted across different diagnostic tests. In general, a score of 3 reflects a test with good discrimination, whereas a score of 0 (for example, sensitivity and specificity both equal to 50 percent) reflects a test that does not discriminate between disease positives and disease negatives.
Another measure, the log odds ratio, may be more familiar to some readers:
$$\text{log odds ratio} = \log\left(\frac{\text{sensitivity}}{1-\text{sensitivity}}\right) + \log\left(\frac{\text{specificity}}{1-\text{specificity}}\right)$$
This measure differs from the effectiveness score by multiplication by a constant, square root (3)/pi.
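A minimal sketch of both measures for a single study follows, using natural logarithms; the function names are ours.

```python
import math

# Constant that rescales the log odds ratio to the effectiveness score.
SCALE = math.sqrt(3.0) / math.pi


def logit(p: float) -> float:
    """Log odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def log_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Diagnostic log odds ratio for one study."""
    return logit(sensitivity) + logit(specificity)


def effectiveness_score(sensitivity: float, specificity: float) -> float:
    """Effectiveness score: the log odds ratio scaled by sqrt(3)/pi."""
    return SCALE * log_odds_ratio(sensitivity, specificity)
```

A study reporting sensitivity and specificity of 0.90 each has a log odds ratio of about 4.39 and an effectiveness score of about 2.42. Studies reporting a sensitivity or specificity of exactly 0 or 1 would need a continuity correction before this calculation; the text does not describe how such cases were handled.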
We calculated summary measures of effectiveness at each of the four explicit diagnostic thresholds (ASCUS/CIN1, LSIL/CIN1, LSIL/CIN2-3, HSIL/CIN2-3), using maximum likelihood estimation techniques and a random effects model. We chose a random effects model because of the wide variation in reported sensitivity and specificity among the included studies. We also evaluated the effect of differences in implicit threshold on effectiveness scores by including the following variable, S, in the model:
$$S = \log\left(\frac{\text{sensitivity}}{1-\text{sensitivity}}\right) - \log\left(\frac{\text{specificity}}{1-\text{specificity}}\right)$$
Because the model results were not changed by including this value, we used the parameter estimates from the simpler one-parameter (intercept only) logistic equations to generate summary estimates of sensitivity and specificity and to produce ROC curves.
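To show how an intercept-only model translates into an ROC curve, the sketch below traces the curve implied by a single pooled log odds ratio. It assumes the log odds ratio is constant across thresholds, which is what dropping S from the model amounts to; the pooled value itself would come from the random effects estimation, which is not reproduced here. Function names are illustrative.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def threshold_term(sensitivity: float, specificity: float) -> float:
    """S: difference of the two logits, a proxy for the implicit threshold."""
    return logit(sensitivity) - logit(specificity)


def sroc_sensitivity(log_or: float, fpr: float) -> float:
    """Sensitivity implied by a constant log odds ratio at a given FPR.

    Since logit(sens) + logit(spec) = log_or and logit(spec) = -logit(fpr),
    it follows that logit(sens) = log_or + logit(fpr).
    """
    return expit(log_or + logit(fpr))


def sroc_curve(log_or: float, n_points: int = 9) -> list[tuple[float, float]]:
    """(FPR, sensitivity) pairs at evenly spaced false positive rates."""
    grid = [k / (n_points + 1) for k in range(1, n_points + 1)]
    return [(fpr, sroc_sensitivity(log_or, fpr)) for fpr in grid]
```

A log odds ratio of 0 reproduces the diagonal (sensitivity equal to the false positive rate), and larger values bow the curve toward the upper left corner.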
Our data included studies with different reference standards, primarily histology (cone biopsy, hysterectomy, or punch biopsy) and negative colposcopy. These reference standards are themselves subject to inaccuracy, and one problem that may occur when a diagnostic test is compared with an imperfect independent reference standard is underestimation of sensitivity and specificity (Boyko, 1996). In such a case, the sensitivity and specificity of the test may vary with prevalence of disease. To address the possible imperfect nature of histology/colposcopy, we evaluated the effect of disease prevalence on our summary estimates of effectiveness. Multiple logistic regression was used to estimate the variation in overall effectiveness due to variations in prevalence. We also evaluated the effect of study quality on summary effectiveness scores, initially using individual components of the score, then using the total score both as a continuous and a dichotomous (quality score <7, >7) variable.
This section describes our response to the question, "What are the effects on total health care costs, morbidity, and mortality of regular cervical cytological screening using thin-layer cytology or computer rescreening compared with conventional Pap smear in women participating in a screening program?"
Our basic approach to this question was to create a decision analytic model to simulate the health and economic impact of various cervical cancer screening strategies based on the results of our literature review and meta-analysis. The model, borrowing from previous similar efforts, was created to generate a wide variety of health and cost outcome projections, including survival, cumulative cost, and incremental cost-effectiveness. However, because limited data were available regarding the natural history of cervical cancer and the costs and effectiveness of cervical cancer screening and treatment, we concluded that the sort of precise projections that would be of use to decisionmakers would not be possible. In particular, we determined that the aggregate effect of uncertainty in the model inputs (see below) led to substantial uncertainty in the outputs, such that a point estimate of incremental cost-effectiveness could not be taken as a reliable representation of the value of many of the screening strategies under investigation. As a consequence, we focused the model on the general question, "What characteristics would a cervical cancer screening strategy need to have in order to be cost-effective?" The results of this investigation are intended to guide future research regarding cervical cancer screening strategies.
Two model inputs that are crucial to the precise estimation of the cost-effectiveness of new screening technologies are particularly uncertain, namely, test operating characteristics (sensitivity and specificity) and test costs. Evaluation of the sensitivity and specificity of new screening technologies is particularly problematic given the discrepancy in reference standards. As previously stated, because the clinical management of the patient depends on cytology, colposcopy, and histology taken together (and certainly not cytology alone), histologic diagnosis, or at least colposcopic diagnosis, is the most appropriate reference standard. However, given the practical and methodological difficulties in applying this reference standard, and the paucity of studies available that used such a standard, we attempted to identify other studies that met alternative criteria, as outlined above. Unfortunately, there were few studies that met even these alternative criteria, and the resulting estimates of relative true positive rates and false positive rates had very wide confidence intervals.
Our estimates for the costs associated with implementation of ThinPrep®, AutoPap®, and Papnet® are based on a variety of sources, including the manufacturers. There is a wide range of estimates, and issues such as training costs, amortization of equipment, and labor costs are contestable. Our original estimate for one technology, based primarily on information from the manufacturer, was thought to be unacceptably low by every reviewer of the draft report except that manufacturer.
Because of the difficulty in estimating operating characteristics and costs that have face validity for all readers, we decided that presenting cost-effectiveness estimates for generic technologies across the range reported would ultimately prove more useful to readers than selecting base case estimates that were uncertain and lacked face validity. Our alternative approach is to provide cost-effectiveness estimates across a range of test characteristics and incremental costs and to identify ranges of these parameters (thresholds) that lead to cost-effective cervical cancer screening. This allows readers to use their own judgment about which costs and effectiveness measures are most appropriate. The thresholds also provide a "target" for future studies -- convincing evidence of either cost or effectiveness that meets these threshold values will be evidence for acceptable cost-effectiveness. To perform this analysis, we considered two generic strategies for improving Pap sensitivity: improving the sensitivity of the initial screening test, or allowing 100 percent rescreening with improved sensitivity. By varying sensitivity, specificity, incremental cost, and screening frequency, we sought to identify threshold values where the incremental cost per life-year saved was within generally acceptable limits (in this case, we used a threshold value of $50,000/life-year).
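The threshold logic can be illustrated with a small sketch. The incremental cost and life-year inputs would come from the decision model; the figures in the usage note are invented for illustration, and the function names are ours.

```python
THRESHOLD = 50_000.0  # dollars per life-year saved, as used in this analysis


def icer(delta_cost: float, delta_life_years: float) -> float:
    """Incremental cost per life-year saved of a new strategy vs. a comparator."""
    if delta_life_years <= 0:
        raise ValueError("the new strategy must save life-years for a ratio to apply")
    return delta_cost / delta_life_years


def is_cost_effective(delta_cost: float, delta_life_years: float,
                      threshold: float = THRESHOLD) -> bool:
    """True when the incremental ratio falls at or below the threshold."""
    return icer(delta_cost, delta_life_years) <= threshold


def max_incremental_cost(delta_life_years: float,
                         threshold: float = THRESHOLD) -> float:
    """Largest added cost per woman that still meets the threshold."""
    return threshold * delta_life_years
```

For example, if a hypothetical technology adds $100 in lifetime cost per woman and gains 0.004 life-years, its ratio is $25,000 per life-year, and it would remain under the threshold at any added cost up to $200.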
To address the general question of what would constitute a substantially valuable improvement in Pap technology, we reframed the specific question to be addressed by the cost-effectiveness modeling to: "What are the ranges of incremental cost, sensitivity, specificity, and screening frequency that meet conventional levels of cost per life-year saved for technologies that improve Pap test performance by (1) improving the sensitivity of the initial screening step, or (2) allowing 100 percent rescreening at improved sensitivity?"
Also, since some decisionmakers prefer presentation of specific outcomes rather than an aggregate outcome, the incremental cost-effectiveness ratio, we pursued a secondary question: "What are the expected number of cervical cancer cases, deaths, and treatments prevented at different levels of test sensitivity and screening frequency?"
For the cost-effectiveness analysis, we constructed a model with two major components. The first component is a 20-state Markov model (Sonnenberg and Beck, 1993) intended to simulate the natural history of cervical cancer in the absence of screening. The second component is an intervention model that represents possible screening strategies. The model was developed in DATA 3.0 software (Boston, MA: TreeAge Software, Inc).
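The mechanics of a Markov cohort simulation can be shown with a deliberately small sketch. It uses only four of the model's 20 states, and every transition probability is hypothetical; it is not a reconstruction of the DATA 3.0 model. In each annual cycle, the proportion of the cohort in each state is redistributed according to that state's transition probabilities.

```python
STATES = ["well", "undetected_hpv", "lsil", "death_other"]

# Hypothetical annual transition probabilities; each row sums to 1.
TRANSITIONS = {
    "well": {"well": 0.93, "undetected_hpv": 0.06, "death_other": 0.01},
    "undetected_hpv": {"well": 0.30, "undetected_hpv": 0.59,
                       "lsil": 0.10, "death_other": 0.01},
    "lsil": {"well": 0.15, "undetected_hpv": 0.10,
             "lsil": 0.74, "death_other": 0.01},
    "death_other": {"death_other": 1.0},  # absorbing state
}


def step(dist: dict[str, float]) -> dict[str, float]:
    """Advance the cohort distribution by one cycle."""
    nxt = {state: 0.0 for state in STATES}
    for src, mass in dist.items():
        for dst, prob in TRANSITIONS[src].items():
            nxt[dst] += mass * prob
    return nxt


def run_cohort(start: dict[str, float], cycles: int) -> list[dict[str, float]]:
    """Return the state distribution at cycle 0 and after each later cycle."""
    dist = {state: start.get(state, 0.0) for state in STATES}
    trace = [dist]
    for _ in range(cycles):
        dist = step(dist)
        trace.append(dist)
    return trace
```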
| State | Definition | Allowable Transitions |
|---|---|---|
| Well | Never infected with HPV, no history of SIL | Well, benign hysterectomy, death from other cause, undetected HPV |
| Benign hysterectomy | Hysterectomy for cause other than cervical cancer or SIL | Benign hysterectomy, death from other cause |
| Death from other cause | Death from cause other than cervical cancer | Absorbing state |
| Death from cervical cancer | Death from cervical cancer | Absorbing state |
| Undetected HPV | Undiagnosed cervical HPV infection | Well, death from other cause, benign hysterectomy, undetected HPV, LSIL |
| Detected HPV | Diagnosed cervical HPV infection or posttreatment for SIL | Well, detected HPV, death from other cause, benign hysterectomy, LSIL, HSIL |
| LSIL | Low-grade squamous intraepithelial lesion | LSIL, death from other cause, benign hysterectomy, undetected HPV, detected HPV, well, HSIL |
| HSIL | High-grade squamous intraepithelial lesion | HSIL, death from other cause, benign hysterectomy, detected HPV, undetected HPV, LSIL, well |
| Unknown Stage I | Undiagnosed Stage I cervical cancer | Unknown Stage I, detected Stage I, death from other cause, unknown Stage II |
| Detected Stage I | Stage I cancer diagnosed by Pap or symptoms | Detected Stage I, death from cervical cancer (over 5 years), death from other cause |
| Stage I survivor | 5 years after initial diagnosis of Stage I | Stage I survivor, death from other cause |
| Unknown Stage II | Undiagnosed Stage II cervical cancer | Unknown Stage II, detected Stage II, death from other cause, unknown Stage III |
| Detected Stage II | Stage II cancer diagnosed by Pap or symptoms | Detected Stage II, death from cervical cancer (over 5 years), death from other cause |
| Stage II survivor | 5 years after initial diagnosis of Stage II | Stage II survivor, death from other cause |
| Unknown Stage III | Undiagnosed Stage III cervical cancer | Unknown Stage III, detected Stage III, death from other cause, unknown Stage IV |
| Detected Stage III | Stage III cancer diagnosed by Pap or symptoms | Detected Stage III, death from cervical cancer (over 5 years) |
| Stage III survivor | 5 years after initial diagnosis of Stage III | Stage III survivor, death from other cause |
| Unknown Stage IV | Undiagnosed Stage IV cervical cancer | Unknown Stage IV, detected Stage IV, death from other cause |
| Detected Stage IV | Stage IV cancer diagnosed by Pap or symptoms | Detected Stage IV, death from cervical cancer (over 5 years) |
| Stage IV survivor | 5 years after initial diagnosis of Stage IV | Stage IV survivor, death from other cause |
| Stage | Definition |
|---|---|
| 0 | Carcinoma in situ (corresponds to CIN III or HSIL) |
| Ia | Confined to the cervix, detectable by microscopy only, depth of invasion not greater than 3 mm in depth or 7 mm in width |
| Ib | Confined to the cervix, larger than Ia |
| IIa | Involvement of upper 2/3 of vagina, no parametrial involvement |
| IIb | Involvement of parametria but not to pelvic sidewall |
| IIIa | Involvement of lower 1/3 of vagina but not to pelvic sidewall if parametria involved |
| IIIb | Extension to pelvic sidewall and/or hydronephrosis or nonfunctioning kidney |
| IVa | Involvement of mucosa of bladder or rectum |
| IVb | Distant metastasis or disease outside of pelvis |
Because cervical cancer is clinically staged, findings at exploratory surgery or during subsequent treatment do not change the stage. Progression of disease also does not change the stage. In addition, imaging modalities such as computerized axial tomography (CAT) scanning or magnetic resonance imaging (MRI), while useful in determining the extent of the disease, are not permitted to be used in assigning stage.
The FIGO staging system is not used by the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute, the source of the most reliable estimates of cervical cancer incidence and mortality. SEER reports cancer stages as "Local," "Regional," and "Distant." This classification is also used in the MEDSTAT data we used to generate cost estimates associated with cervical cancer. Although cervical cancer staging does not create as many difficulties in synthesizing data for the purposes of constructing a model as does the variety of classification systems for preinvasive cervical changes, there are still some issues that must be addressed.
First, the definition of Stage 0 has changed with time. Initially, this referred only to carcinoma in situ (CIS). Subsequently, when carcinoma in situ was included as CIN3, lesions that previously would not have been included as cervical cancer cases were classified as such at some SEER sites. Adoption of The Bethesda System further increases the potential number of cases, since lesions previously classified as CIN2 are included along with CIN3 and CIS in the HSIL classification. Because of this, cases that would not meet FIGO criteria for invasive cancer may be included in SEER data (Noller, 1996).
Second, it is unclear whether the Local/Regional/Distant classification is based on clinical, radiological, or surgical findings. Cervical cancer spreads most commonly by local extension and/or by lymphatic spread. The likelihood of pelvic lymph node metastases increases with advancing FIGO stage (DiSaia and Creasman, 1997), but if the diagnosis of nodal metastases is made on the basis of surgery or imaging modalities such as CAT or MRI, the initial FIGO stage should not change.
We did not model combinations of screening intervals--for example, every 3 years after three consecutive negative annual Pap tests as recommended by the American Cancer Society (U.S. Preventive Services Task Force, 1996). Modeling such strategies significantly increases the complexity of the model. A model derived from age-specific incidence rates in unscreened populations suggests that the age at which screening occurs, rather than the interval between screens or the number of preceding negative smears, is the most important determinant of screening efficiency (Gustafsson and Adami, 1992). This is consistent with the natural history of the disease, where the incidence and prevalence of low-grade lesions that are unlikely to progress are high in younger ages, and high-grade lesions have peak incidence and prevalence between ages 30 and 45. Thus, three negative annual smears at age 18 or 20, just when HPV incidence is peaking, might have a low predictive value for the development of lesions 15-20 years later and would not significantly improve screening efficiency.
We assumed that any cytologic result of HSIL or cancer will result in immediate referral for colposcopy and biopsy if warranted and that endocervical curettage is performed routinely as part of the colposcopic evaluation. Because not every patient will undergo endocervical curettage, this assumption may overestimate the sensitivity of colposcopy and overestimate the average cost of colposcopy. However, because we use the global estimate for the evaluation of LSIL and HSIL in the base case and vary these costs in sensitivity analysis, this assumption should not have a major impact on the cost-effectiveness estimates. We also assumed that a colposcopic or histologic diagnosis that is significantly less advanced than the cytologic result (for example, a biopsy result of LSIL in the setting of a cytologic diagnosis of cancer, or a biopsy result of normal in the setting of cytologic HSIL) would result in either a cone biopsy or a loop electrosurgical excision procedure (LEEP) (DiSaia and Creasman, 1997) after the exclusion of disease elsewhere in the lower genital tract.
We compared two possible strategies for the management of ASCUS and LSIL. In the base case, we assumed that all patients with LSIL or higher will receive colposcopy, whereas those with ASCUS will receive a repeat smear within 6 months, followed by a colposcopy if abnormal. Because this strategy depends on a Pap of high sensitivity (since colposcopy depends on the persistence of an abnormality), we used this in the base case. We also evaluated an "aggressive" strategy where any Pap with a result of ASCUS or higher is referred for colposcopy as the base case, since this should result in the highest detection rate. Both of these strategies are within current American College of Obstetricians and Gynecologists (ACOG) recommendations.
We estimated the cost-effectiveness of cervical cytology using several outcomes. First, we calculated the cost per year of life saved. This allows comparison with other health interventions and with other cost-effectiveness analyses of cervical cancer screening. However, use of life expectancy, or life-years saved, may underestimate some of the benefits of screening. Early-stage cervical cancer is slow-growing and has a very high cure rate (approximately 85-90 percent 5-year survival for Stage I tumors). Thus, failure to detect preinvasive lesions that are then detected as Stage I tumors will not result in much decrement in life expectancy. However, the therapeutic options for all but the most minimally invasive Stage I tumors are radical hysterectomy or radiation therapy, both of which have substantial short- and long-term morbidity that may have significant impact on quality of life.
We were unable to identify any data that would allow calculation of quality-adjusted life years. We therefore calculated other outcomes that can be used to draw some inferences about morbidity. We produced estimates of cost per cervical cancer death prevented and per cervical cancer case prevented. In addition, by estimating the number of cases presenting in each stage and using data on the probability of specific treatments by stage, we also estimated the number of morbid therapies avoided using different screening strategies.
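The three outcome measures described above share the same arithmetic: the extra cost of a strategy divided by the extra benefit it buys. The following is a minimal sketch of that calculation, not the report's actual model. The function name `incremental_ratio` and all of the per-woman cost and outcome values are hypothetical and chosen only for illustration.

```python
def incremental_ratio(cost_new, cost_old, benefit_gained):
    """Incremental cost per unit of benefit (life-year saved, death or case prevented)."""
    if benefit_gained <= 0:
        raise ValueError("the new strategy must add benefit for the ratio to be meaningful")
    return (cost_new - cost_old) / benefit_gained


# Hypothetical discounted per-woman values, for illustration only
# (these are NOT results from the report).
no_screening = {"cost": 500.0, "life_years": 27.00, "cancer_deaths": 0.012, "cancer_cases": 0.030}
screening = {"cost": 900.0, "life_years": 27.05, "cancer_deaths": 0.002, "cancer_cases": 0.006}

# Cost per year of life saved: benefit is the gain in life expectancy.
cost_per_life_year = incremental_ratio(
    screening["cost"], no_screening["cost"],
    screening["life_years"] - no_screening["life_years"],
)

# Cost per cervical cancer death prevented: benefit is the drop in deaths.
cost_per_death_prevented = incremental_ratio(
    screening["cost"], no_screening["cost"],
    no_screening["cancer_deaths"] - screening["cancer_deaths"],
)

# Cost per cervical cancer case prevented: benefit is the drop in cases.
cost_per_case_prevented = incremental_ratio(
    screening["cost"], no_screening["cost"],
    no_screening["cancer_cases"] - screening["cancer_cases"],
)
```

Because the numerator is the same in all three ratios, the measures differ only in how benefit is counted, which is why a strategy can look expensive per life-year saved yet more favorable per case prevented.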
We used a U.S. health system perspective and evaluated the direct health care specific costs associated with screening, diagnosis, and treatment of cervical cancer and its precursors. We did not consider other societal costs such as direct nonmedical costs (e.g., costs attributable to work lost by patients or caregivers due to screening, diagnosis, treatment, or premature mortality). A societal perspective that included these costs would be preferable (Gold, Siegel, Russell et al., 1996); however, estimating these costs (for example, wage/wage equivalent costs, transportation time, and time lost while waiting for an appointment) would be extremely difficult. Invasive cervical cancer is a disease affecting mostly midlife and older women, whereas the much more common HPV infection and SIL, most of which do not progress to cervical cancer, are seen primarily in younger women; therefore, it seems likely that the nonhealth care costs associated with screening and treatment of SIL are greater than those associated with the diagnosis and treatment of invasive cervical cancer. If this is true, then the cost-effectiveness of screening, and of new technologies to improve screening performance, will be more favorable from a health care system perspective than from a societal perspective, especially if new technologies increase sensitivity at the expense of specificity.
We discounted costs and years of life at 3 percent annually in the base case and varied the discount rate from 0 to 5 percent in sensitivity analysis (Gold, Siegel, Russell et al., 1996).
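As an illustration of the discounting described above, the sketch below computes the present value of a stream of annual costs or life-years at the base case rate and at the bounds used in sensitivity analysis. It assumes amounts accrue at the start of each year, so the first year is undiscounted. That timing convention and the function names are assumptions made for this example and are not taken from the report.

```python
def discount_factor(rate, years):
    """Present value of one unit received `years` years from now."""
    return 1.0 / (1.0 + rate) ** years


def present_value(stream, rate):
    """Discounted sum of a stream, where stream[t] accrues t years from the start."""
    return sum(amount * discount_factor(rate, t) for t, amount in enumerate(stream))


# Ten life-years lived, one per year (illustrative input).
ten_life_years = [1.0] * 10

# Base case (3 percent) and the sensitivity analysis bounds (0 and 5 percent).
discounted_life_years = {
    rate: present_value(ten_life_years, rate) for rate in (0.0, 0.03, 0.05)
}
```

At a 0 percent rate the ten life-years count in full, and higher rates shrink the value of benefits that occur far in the future. This matters for screening, where costs are incurred early and most cancer deaths prevented occur decades later.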
Our model follows a cohort of U.S. women from age 15 through 85. We calculated screening strategies as described above beginning at age 15, at age 18 (as recommended by ACOG), at age 20 (Eddy, 1990), at age 35 (Gustafsson and Adami, 1992), at age 50, and at age 65 (Fahs, Mandelblatt, Schechter et al., 1992). We also varied the incidence and progression rates of HPV and SIL to reflect higher risk populations. However, it is likely that other parameters in the model would also be affected in these populations (for example, smoking, which is associated with higher rates of preinvasive and invasive cervical cancer, is also associated with a higher risk of mortality from other causes).
Squamous cancer of the cervix accounts for approximately 80-85 percent of invasive cervical cancer cases. Adenocarcinoma, which accounts for another 10-15 percent, may be increasing in incidence (DiSaia and Creasman, 1997). Cervical cytology may also be less sensitive for adenocarcinoma. However, we did not distinguish between histologic subtypes in any of our estimates for screening or treatment. This is consistent with other models.
Similarly, the clinical implications of a cytological diagnosis of ASCUS (atypical SQUAMOUS cells of uncertain significance) and AGUS (atypical GLANDULAR cells of uncertain significance) appear to be quite different. A cytological diagnosis of AGUS may be associated with premalignant or malignant changes of the cervix or uterus in as many as 30-35 percent of cases (Veljovoich, Slater, Anderson et al., 1998). The diagnostic evaluation of AGUS usually involves endometrial biopsy as well as colposcopy and endocervical curettage. When ASCUS and AGUS are combined for purposes of conditional probabilities, the proportion of true histological lesions with cytologic diagnoses of atypia is increased. Using ASCUS as our threshold has the overall effect of increasing the sensitivity of screening. Given the relatively low proportion of AGUS as a diagnosis (0.3%), this is unlikely to have a significant impact on our cost-effectiveness estimates.
Our model assumes that all women in the cohort receive the screening test at the appropriate interval and that all patients receive appropriate diagnostic and therapeutic interventions based on the results of the screening tests. Lack of appropriate followup on the part of the patient or provider accounts for a significant proportion of cervical cancer cases (Janerich, Hadjimichael, Schwartz et al., 1995), and lack of patient compliance with treatment has been shown to increase the marginal cost-effectiveness of other screening tests (Myers, Thompson, and Simpson, 1998). Between 30 and 75 percent of women with abnormal Pap smear results do not return for followup (Paskett and Rimer, 1995; Paskett, Carter, Chu et al., 1990). We did not directly test the impact of postscreening behavior on cost-effectiveness estimates. However, patient failure to return for followup visits, or inappropriate management by providers, clearly decreases the effectiveness of a screening program. The actual cost-effectiveness of any screening program will vary depending on patient and provider behavior.
We also did not test the impact of complete lack of screening on the overall cost-effectiveness of different screening strategies. A model based on Dutch data showed that increasing participation was significantly more cost-effective than decreasing screening intervals (Koopmanschap, van Oortmarssen, van Agt et al., 1990). The proportion of women with no prior screening in series of women with invasive cervical cancer is between 30 and 40 percent (Bearman, MacMillan, and Creasman, 1987; Janerich, Hadjimichael, Schwartz et al., 1995; Pretorius, Semrad, Watring et al., 1991; Schwartz, Hadjimichael, Lowell et al., 1996). Current Public Health Service estimates are that 5-10 percent of women have never had a Pap smear (Martin, Calle, Wingo et al., 1996). Clearly, the issue of the appropriate allocation of resources for improving the sensitivity of current screening tests versus improving the utilization of any screening test in women who are unscreened or underscreened is important for policymakers to consider. Given reliable estimates of the costs and effectiveness of interventions designed to improve screening rates, this model could be further developed to explore this issue. However, estimating the effectiveness of programs to improve participation in screening is difficult and is beyond the scope of the current report.
Specific assumptions associated with individual model parameters are discussed in detail in the appropriate sections. Our overall aim in selecting base case estimates was to bias the model in favor of more sensitive screening strategies. In addition to the use of a health system perspective (likely creating a bias in favor of more sensitive screening strategies as mentioned above), the following assumptions were incorporated into the cost-effectiveness model:
We chose an age-specific incidence rate for cervical cancer 30 percent higher than that reported in the unscreened U.S. population of the 1930s.
We chose estimates for progression rates from HPV to LSIL, from LSIL to HSIL, and from HSIL to cancer at the higher end of the reported ranges.
We used a base case estimate of conventional Pap sensitivity that is considerably lower than that used in most previous models of cervical cancer screening.
Our cost estimates for the diagnosis and treatment of cervical cancer are based on average global diagnosis and treatment charges. Because these incorporate the full range of outcomes, including complications, they are considerably higher than estimates based on average charges for the components of care, the technique used in previous models. The higher the estimated cost of treating cancer, the greater the potential savings from detection and treatment of premalignant lesions.
We assumed that the diagnostic evaluation of abnormal cervical cytology would detect all true histological abnormalities. Failure of colposcopy and biopsy to detect true premalignant or malignant lesions would decrease the clinical effectiveness of screening, therefore increasing the cost-effectiveness ratio.
We assumed that all patients with abnormalities on cervical cytology would receive appropriate followup and treatment. Failure of the patient to return for appropriate followup, or of the provider to institute appropriate followup, decreases the clinical effectiveness of screening, again increasing the cost-effectiveness ratio.
We assumed in the base case that new technologies would increase sensitivity without any decrement in specificity.
Although Pap smears can detect vaginal or cervical infections with organisms such as Candida, Trichomonas, or Chlamydia, we did not consider the impact of detection of these diseases. Other tests are available for the detection of these organisms that are more sensitive and specific than Pap screening.
These assumptions favor strategies that increase the detection of premalignant and early malignant changes; therefore, our assumptions tend to favor more sensitive screening strategies.
The sources and rationale for specific transition probability estimates and ranges for sensitivity analysis are discussed in detail below. Some transition probabilities are based on large datasets using standard reporting methods (e.g., age-specific death or hysterectomy rates, or stage-specific mortality for cervical cancer); in such cases, estimation was relatively straightforward. However, some transition probabilities had to be estimated from smaller datasets that used nonstandard reporting; in these cases, it was difficult to estimate transition probabilities related to HPV and SIL from reports of incidence, regression, persistence, and progression rates (Miller and Homan, 1994). Regression and progression are usually reported as percentages of either prevalent or incident infections, with followup time varying widely among studies. In addition, very few data are available that allow calculation of transition probabilities for untreated cervical cancer, such as the annual stage-specific probability of symptom development or the annual probability of progression between stages. Estimates for these probabilities were derived from the literature and prior published models of cervical cancer screening, especially those of Eddy (1990) and Muller, Mandelblatt, Schechter et al. (1990), which have served as the basis for several subsequent models (Brown and Garber, 1998; Radensky and Mango, 1998; Schechter, 1996).
Age-specific hysterectomy rates were estimated from the National Hospital Discharge Survey (Lepine, Hillis, Marchbanks et al., 1997) and Maryland discharge data (Kjerluff, Guzinski, Langenberg et al., 1993). We did not correct these rates for procedures performed for cervical cancer or preinvasive disease for two reasons. First, the rates do not include radical hysterectomies for invasive cervical cancer, which have a separate International Classification of Diseases, ninth revision (ICD-9) code. Second, the proportion of procedures performed for preinvasive disease is relatively small, less than 2 percent of all procedures over a 6-year period in North Carolina (E. Myers, unpublished data, 1998).
We did not correct for hysterectomy prevalence in our natural history model, since most population-based cancer incidence rates do not do so. However, since the relative likelihood of hysterectomy with removal of the cervix clearly affects an individual's risk for developing cervical cancer, we included this factor in the cost-effectiveness model and tested the impact of doing so on cost-effectiveness estimates. We did not specifically address continued cytological screening after hysterectomy because of questions about its effectiveness (Fetters, Fischer, and Reed, 1996) and because posthysterectomy Pap smears are screening tests for vaginal cancer, not cervical cancer.
We also did not consider the effectiveness of conventional Pap smears or the new technologies in the followup of treatment for invasive cervical cancer. Our analysis is limited to the use of cervical cytology in screening for previously undetected disease, not the detection of persistent or recurrent cancer after treatment. Costs for followup smears are included in the global estimates of treatment and followup costs.
Although not essential for the current analysis, HPV is included in the model. In this section, we document this component of our cervical cancer screening model.
The model assumes that all cases of cervical cancer begin with infection with human papillomavirus. We have incorporated HPV status into the model for two reasons. First, although a small percentage of cervical cancers do not have detectable HPV DNA, even with sensitive assays, there is consensus that HPV infection is the causative agent for the vast majority of cervical cancers (Herrero, 1996; Kiviat, 1996; Koutsky, 1997; Schiffman, Bauer, Hoover et al., 1993). Second, certain HPV types are clearly more likely to progress to cancer than others, and identification of these types in cervical cells may have a role in determining optimal diagnostic and treatment strategies for patients with abnormal Pap smears (Cox, Lorincz, Schiffman et al., 1995). Although this role is still unclear, the current model can be adapted in the future to allow consideration of technologies such as HPV testing in screening for cervical cancer and its precursors. In addition, incorporating HPV data also allows our model to be used for assessing the cost-effectiveness of vaccines against HPV.
The natural history of HPV infection is complex, with clearance and persistence of viral DNA, along with progression to SIL, varying depending on the viral type, patient characteristics such as age and immune status, and study design and assay methods (Herrero, 1996; Kiviat, 1996; Koutsky, 1997; Mitchell, Tortolero-Luna, Wright et al., 1996; Schiffman, Bauer, Hoover et al., 1993). We do not distinguish between different types of HPV. Our incidence, progression, and regression estimates are averages for all viral types. The risk of developing cervical cancer after infection is clearly related to HPV type. With further data on the test characteristics of screening tests incorporating HPV typing, this model can be adapted for estimating the utility and cost-effectiveness of such tests.
| Age | Incidence (percent per year) |
|---|---|
| 15 | 0.1 |
| 16 | 0.1 |
| 17 | 0.12 |
| 18 | 0.15 |
| 19 | 0.17 |
| 20 | 0.15 |
| 21 | 0.12 |
| 22 | 0.1 |
| 23 | 0.1 |
| 24-29 | 0.05 |
| 30-49 | 0.01 |
| 50+ | 0.005 |
Estimates of regression and progression rates of HPV infection are subject to variability in study design, patient population, and viral assay techniques and are complicated by the mathematical difficulty of converting rates to probabilities. Reported regression rates for prevalent cases include 70 percent after 2 years in a cohort of adolescents and college-age women (Moscicki, Shiboski, Broering et al., 1998), 68 percent over 14 months for women under 25, and 35 percent for women over 30 (Hildesheim, Schiffman, Gravitt et al., 1994). Ho, Bierman, Beardsley et al. (1998) reported a 1-year regression rate of 70 percent. All of these results were in women with normal cytology. The overall regression rate in a large Finnish cohort was 42.8 percent over 50 months (Kataja, Syrjanen, Mantyjarvi et al., 1989; Kataja, Syrjanen, Syrjanen et al., 1990). However, determination of disease status in this group was made on the basis of cytological evidence of HPV infection. Under the Bethesda System, these would be classified as LSIL.
As with incidence, regression rates and possibly progression rates appear to vary with age. Whether this is truly related to age or is a function of duration of infection (the longer HPV persists, the less likely it is to regress) is unclear. Again, this question can be further explored with the current model. For the purposes of a cohort simulation, there is not much difference between an age-based or duration-based progression rate as long as the age-specific incidence is constant.
| Rate | Estimate | Range |
|---|---|---|
| Regression: | ||
| Age 15-24 | 0.7/18 months | 0.60-0.9 |
| Age 25-29 | 0.50/18 months | 0.45-0.6 |
| Age 30-39 | 0.25/18 months | 0.10-0.2 |
| Age 40-49 | 0.15/18 months | 0.10-0.2 |
| Age 50 + | 0.05/18 months | 0.05-0.2 |
| Progression: | 0.2/36 months | 0.15-0.3 |
| Proportion progressing to HSIL directly | 0.1 | 0.05-0.5 |
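The estimates in the table above are expressed over intervals of 18 or 36 months, whereas a cohort model needs a probability per cycle. The sketch below shows one common conversion. It assumes that each table entry is the cumulative probability of the event over the stated interval and that the underlying hazard is constant within that interval. Both are assumptions made for illustration, the function name is hypothetical, and the report does not state that this exact formula was used.

```python
def annual_probability(cumulative_probability, interval_months):
    """Convert a cumulative probability over an interval to a 1-year probability.

    Assumes a constant hazard: the probability of NOT having the event
    compounds geometrically over time.
    """
    if not 0.0 <= cumulative_probability < 1.0:
        raise ValueError("cumulative probability must be in [0, 1)")
    years = interval_months / 12.0
    return 1.0 - (1.0 - cumulative_probability) ** (1.0 / years)


# HPV regression, age 15-24: 0.7 over 18 months (from the table above).
regression_15_24 = annual_probability(0.7, 18)

# HPV progression: 0.2 over 36 months (from the table above).
progression = annual_probability(0.2, 36)
```

Simply dividing 0.2 by 3 would give 0.067 per year for progression, while the compounding formula gives a slightly higher value. The gap widens as the cumulative probability grows, which is one reason the conversion from reported rates to transition probabilities is not trivial.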
Because there is some uncertainty in the literature about whether treatment of SIL results in total eradication of HPV, we created a Diagnosed HPV state to represent women who have been treated for SIL. We assumed that the probability of progression was 5 percent of the age-specific progression rates. This is equivalent to a 95 percent cure rate for SIL treatment used in other models (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992; Schechter, 1996).
| Rate | Estimate | Range |
|---|---|---|
| LSIL | ||
| Regression | ||
| Age 15-34 | 0.65/72 months | 0.6-0.8 |
| Age 35 + | 0.4/72 months | 0.3-0.6 |
| Progression | ||
| Age 15-34 | 0.1/72 months | 0.1-0.3 |
| Age 35 + | 0.35/72 months | 0.3-0.5 |
| HSIL | ||
| Regression | 0.35/72 months | 0.3-0.5 |
| Progression | 0.4/120 months | 0.3-0.5 |
Because the model is based on transitions from true histological states of LSIL, HSIL, and invasive cancer and Pap smear results are categorical, the probability of a particular cytological diagnosis for a given histological state requires estimates beyond "sensitivity" and "specificity" for cytology.
In the model, any Pap smear result of ASCUS or above is considered abnormal. For patients without histological changes (Well, Unknown HPV Infection, and Detected HPV Infection), the probability of a normal smear will be the specificity of the Pap smear or other test for a cytological threshold of ASCUS or above, since specificity is defined as the probability of a normal test result given no pathology, whereas the probability of an abnormal result, or false positive, is 1.0 minus the specificity.
For patients with histological changes (LSIL and higher), our base case estimate of the overall probability of an abnormal result is equal to our sensitivity estimate, using an ASCUS threshold, from the three studies by Baldauf, Dreyfus, Lehmann et al. (1995); Davison and Marty (1994); and Hockstad (1992). This estimate was 51 percent. We varied the sensitivity up to 78 percent, the mean estimate of the meta-analysis.
For the base case for conventional Pap smears, we used an estimate of specificity based on the three studies of the use of conventional Pap smear in primary screening that were unaffected by verification bias (Baldauf, Dreyfus, Lehmann et al., 1995; Davison and Marty, 1994; Hockstad, 1992). This estimate is 97 percent, which we varied down to 88 percent, the mean specificity estimate for all of the studies included in the meta-analysis. For the base case, we assumed that new technologies would have no impact on specificity. We varied the relative specificity of the new technologies from a 50 percent increase in specificity to a 50 percent decrease in specificity.
We assumed that the sensitivity of the screening test is independent of the underlying histology. In other words, the sensitivity using an ASCUS threshold is identical for LSIL, HSIL, or cancer. This assumption may not be true. This is a complex issue that would be difficult to model in a useful way. In keeping with other models, we have assumed constant sensitivity and specificity across histological states. Misclassification of cases, whether to a higher or lower stage, could significantly alter procedure or cost parameters.
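The rule described above, in which every true histological state maps to a probability of an abnormal smear through a single sensitivity and a single specificity, can be sketched as follows. The state names for women without histological changes and the base case values of 51 percent and 97 percent come from the text. The function names and the 5 percent lesion prevalence are hypothetical and used only to show the arithmetic.

```python
# States without histological changes, as named in the text.
NO_LESION_STATES = {"Well", "Unknown HPV Infection", "Detected HPV Infection"}


def p_abnormal_smear(state, sensitivity=0.51, specificity=0.97):
    """Probability of a smear read as ASCUS or above, given the true state.

    Sensitivity is assumed identical for LSIL, HSIL, and cancer.
    """
    if state in NO_LESION_STATES:
        return 1.0 - specificity  # false-positive probability
    return sensitivity  # true-positive probability


def positive_predictive_value(prevalence, sensitivity=0.51, specificity=0.97):
    """Share of abnormal smears that come from women with a true lesion."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical prevalence of LSIL or higher, for illustration only.
ppv_at_5_percent = positive_predictive_value(0.05)
```

With these inputs, more than half of abnormal smears would come from women with no histological lesion, which is why even small decrements in specificity can add substantially to colposcopy costs.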
| Cancer Stage | Progression Rate | Annual Probability of Symptoms | Proportion of Cases in Unscreened Population Predicted by Model | Proportion Reported in Literature |
|---|---|---|---|---|
| I | 0.9/4 years | 0.15 | 0.463 | 0.34-0.55 |
| II | 0.9/3 years | 0.225 | 0.270 | 0.13-0.28 |
| III | 0.9/1.25 years | 0.6 | 0.181 | 0.12-0.19 |
| IV | N/A | 0.9 | 0.085 | 0.06-0.10 |
| Treatment | Stage I | Stage II | Stage III | Stage IV |
|---|---|---|---|---|
| Simple hysterectomy | 0.146 | 0.006 | 0.004 | 0.003 |
| Radical hysterectomy | 0.355 | 0.043 | 0.013 | 0.009 |
| Radiation therapy | 0.204 | 0.647 | 0.537 | 0.398 |
| Radiation therapy and surgery | 0.146 | 0.072 | 0.046 | 0.017 |
| Radiation therapy and chemotherapy | 0.022 | 0.161 | 0.320 | 0.307 |
| Chemotherapy only | 0.002 | 0.004 | 0.004 | 0.103 |
| Other therapy | 0.038 | 0.028 | 0.027 | 0.020 |
| No Therapy | 0.087 | 0.039 | 0.049 | 0.143 |
These proportions were used to calculate the number of specified procedures resulting from each diagnosed case of a specific stage of cervical cancer predicted by the model. These proportions were also used to generate estimates of stage-specific treatment costs (see below).
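As a worked illustration of this calculation, the sketch below multiplies a stage distribution of diagnosed cases by the treatment proportions in the table above. The proportions are copied from that table, and the example stage distribution is the one the model predicts for an unscreened population, scaled to roughly 1,000 cases. The function and variable names are hypothetical.

```python
# Probability of each treatment by stage (Stage I, II, III, IV), from the table above.
TREATMENT_BY_STAGE = {
    "Simple hysterectomy": (0.146, 0.006, 0.004, 0.003),
    "Radical hysterectomy": (0.355, 0.043, 0.013, 0.009),
    "Radiation therapy": (0.204, 0.647, 0.537, 0.398),
    "Radiation therapy and surgery": (0.146, 0.072, 0.046, 0.017),
    "Radiation therapy and chemotherapy": (0.022, 0.161, 0.320, 0.307),
    "Chemotherapy only": (0.002, 0.004, 0.004, 0.103),
    "Other therapy": (0.038, 0.028, 0.027, 0.020),
    "No Therapy": (0.087, 0.039, 0.049, 0.143),
}


def expected_procedures(cases_by_stage):
    """Expected number of women receiving each treatment, given cases in Stages I-IV."""
    return {
        treatment: sum(n * p for n, p in zip(cases_by_stage, proportions))
        for treatment, proportions in TREATMENT_BY_STAGE.items()
    }


# Stage distribution predicted by the model for an unscreened population
# (0.463, 0.270, 0.181, 0.085), applied to about 1,000 diagnosed cases.
unscreened_cases = (463, 270, 181, 85)
procedures = expected_procedures(unscreened_cases)
radical_hysterectomies = procedures["Radical hysterectomy"]
```

Running the same function on the stage distribution produced by a screening strategy and subtracting the two results gives the number of morbid therapies avoided by that strategy.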
Survival probabilities at 1, 2, 3, 4, and 5 years postdiagnosis for each stage were obtained from PCE data (Fremgen, personal communication; Jones, Shingleton, Russell et al., 1995). These values were chosen because they represent data from a wide range of facilities treating women with cervical cancer. Five-year survival rates based on these data were Stage I, 86.0 percent; Stage II, 62.5 percent; Stage III, 37.9 percent; and Stage IV, 11.3 percent.
We assumed that there was no cancer-related mortality after 5 years. Although the PCE data show some deaths after 5 years in all stages, they are relatively rare compared with those in the first 5 years. Other models have also used 5-year survivals. The PCE data are disease-specific; patients are also at risk for other causes of death during the 5-year postdiagnosis period.
Mortality from causes other than cervical cancer was estimated by subtracting age-specific cervical cancer mortality rates from age-specific all-cause mortality rates using U.S. life tables for 1995 (obtained from the National Center for Health Statistics World Wide Web page: http://www.cdc.gov/nchswww/daa/gm291_1.pdf).
We derived our estimates for conventional Pap sensitivity and specificity from the systematic literature review and meta-analysis. For sensitivity and specificity of the new technologies, we used articles reviewed under the revised criteria, the review of Brown and Garber (1998), and estimates provided by the manufacturers. The derivation of these estimates is described in greater detail below.
For conventional Pap smears, we used estimates of sensitivity and specificity from the meta-analysis for a cytological threshold of ASCUS and a histological diagnosis of LSIL or higher. For the base case, we used an estimate derived from the three studies not affected by verification bias (Baldauf, Dreyfus, Lehmann et al., 1995; Davison and Marty, 1994; Hockstad, 1992). These values have a sensitivity of 51 percent and a specificity of 97 percent. We varied this level up to the mean values of the test characteristics from the meta-analysis, 78.8 percent for sensitivity and 88.8 percent for specificity. However, these estimates, like those used in all previous models, assume that there are no joint effects for sensitivity and specificity, an assumption that is not confirmed by the meta-analysis.
The prevalence of underlying cervical neoplasia was a significant predictor of test sensitivity in both the meta-analysis of Fahey, Irwig, and Macaskill (1995) and in ours. We used sensitivity estimates based on fairly low prevalence estimates for the base case. We also did not account for the effect of increasing the screening interval on Pap sensitivity: Because prolonging the screening interval should increase the prevalence of the disease, Pap test performance should be better at longer screening intervals compared with shorter. We made no adjustment for this bias against longer screening intervals.
We assumed in the base case that the combination of history, pelvic examination, and Pap has a sensitivity of 100 percent in the diagnosis of Stage II, III, or IV cervical cancer (cancer that has spread beyond the cervix). Because, by definition, these tumors involve the vagina and/or parametrial tissues, the combination of a speculum and bimanual examination and cervical cytology should, in theory, detect all disease. This is especially true because cervical cancer is clinically staged. Therefore, the failure to detect malignancy cannot be attributed solely or, arguably, even primarily to a failure of cytology. We used a sensitivity of 85 percent as the lower bound.
The adequacy of a cervical slide for interpretation can be affected by the presence of inflammatory cells, blood, vaginal or cervical secretions, the number or type of cells collected, and other factors. The Bethesda System requires a statement of adequacy by the pathologist, along with the reason for any inadequacy. One category used is "Satisfactory but limited by" (SBLB), which states the reason the interpretation of the slide may be compromised. The American College of Obstetricians and Gynecologists recommends that the decision to repeat a smear in such cases be made at the discretion of the individual practitioner.
A technology that improves the adequacy of the specimen may be superior to conventional techniques in two ways. First, the sensitivity of the smear may be enhanced because the slide may be easier to read and interpret. Second, the number of inadequate slides, and subsequent need for repeating the smear, may be reduced. New thin-layer technologies may produce both types of improvement. For example, in a large study of ThinPrep®, the number of "Satisfactory but limited by" smears was reduced from 27.8 percent with conventional smears to 19.8 percent (Lee, Ashfaq, Birdsong et al., 1997). However, of note, the number of smears that were inadequate because of a lack of endocervical cells was higher in the ThinPrep® group in this same study (15.8 percent versus 9.4 percent for conventional smears). Many clinicians would argue that the absence of endocervical cells is of potentially greater significance than "Satisfactory but limited by" smears, since it may indicate that sampling of the transformation zone was inadequate or that lesions were missed in the endocervical canal. Although this finding may be a result of technical factors in the design of the study, the main issue is that the clinical significance of an "unsatisfactory" slide is widely variable.
Although we do not specifically address the issue of specimen adequacy in our model, we do address its consequences. First, we assume that the primary screening technology has improved sensitivity compared with conventional Pap. Second, reducing the number of "Satisfactory but limited by" smears may reduce costs by reducing the number of repeat smears. However, we are unaware of any data on the actual proportion of "Satisfactory but limited by" smears that are repeated in clinical practice. This will probably vary widely, depending on patient populations, perception of malpractice risk, reimbursement issues, etc. Previously published models of cervical cancer screening suggest that the risk of an undetected premalignant or malignant lesion in a patient with a history of regular normal smears is very small. It is likely that the number of repeat smears actually saved by reducing the "Satisfactory but limited by" category is less than the full 8 percentage points (27.8 percent versus 19.8 percent) reported by Lee, Ashfaq, Birdsong et al. (1997). There are several reasons for this. First, the compliance rate with followup for Pap smears that are truly abnormal is only 30-70 percent (Paskett and Rimer, 1995). Second, the most compliant patients are likely to be those at lowest risk. Third, ACOG has left the decision to repeat the smear up to the individual practitioner. A small pilot study by Paskett, Carter, Chu et al. (1990) found that patient perception of the accuracy and severity of an abnormal report was a significant predictor of the likelihood of returning for followup. Even in the best case scenario, where 100 percent of providers would order a repeat smear for a "Satisfactory but limited by" result, the actual reduction in repeat smears would not be 0.08 (0.278 - 0.198) but instead would be between 0.024 (0.3 x 0.08) and 0.056 (0.7 x 0.08).
In any event, the effect on the need for repeat specimens is primarily a cost issue, which we address by decreasing the extra cost of the new technology in sensitivity analysis.
Because our estimates of conventional Pap smear sensitivity and specificity are based on a histological gold standard, there are limited data that allow direct comparison of the new technologies with conventional Pap smears. We have modeled the potential impact of these technologies on the cost-effectiveness of various screening strategies by assigning two separate sensitivity and specificity estimates, one for initial screening and one for rescreening. We assumed that 10 percent of all slides initially read as normal using either a conventional Pap or a hypothetical new technology will be rescreened. We also assumed that the new technology improves the sensitivity of the initial screening step with no decrement in specificity. We incorporated the cost of rescreening 10 percent of slides into the global cost estimate for Pap tests and further assumed that, with either of these procedures, the sensitivity and specificity of the rescreening step is identical to the initial step. We also tested the impact of 100 percent rescreening using an automated rescreening device, again with improved sensitivity and no decrement in specificity. Because of significant uncertainty about the actual changes in sensitivity and specificity of the new technologies, we have modeled their test characteristics as functions of the sensitivity and specificity of the conventional Pap smear.
We approached the problem of calculating improvement in test characteristics in two ways:
As part of the reevaluation of new technology studies, we calculated relative true positive rates for studies where relative performance could be estimated. The relative TPR is the ratio of the true positive rate of a new technology to the true positive rate of the conventional test (TPRnt/TPRpap). For the studies of Roberts, Gurley, Thurloe et al. (1997) and Bolick and Hellman (1998), we calculated a relative TPR for ThinPrep® of 1.16, with confidence limits from 0.9 to 1.4.
From the review of Brown and Garber (1998), we calculated the reduction in false negative rate based on their base case estimates of sensitivity and specificity for conventional Pap smears and the new technologies. Sensitivity of the new technology was related to the sensitivity of conventional Pap smear as a function of the parameter x using the following formula:
Sensitivity of New Technology = Sensitivity of Conventional Pap + [x * (1-Sensitivity of Conventional Pap)] (equation a).
This variable, x, is useful for modeling probabilities. We refer to this variable in the Results section as the reduction in false negative rate. For example, if x = 0.4, then there is a 40 percent reduction in false negative rate.
Note that since TPR is the same as test sensitivity, x is algebraically related to relative TPR and sensitivity of conventional Pap smear. That is,
x = [Sensitivity of Conventional Pap * (relative TPR - 1)] / (1-Sensitivity of Conventional Pap) (equation b).
For example, Bolick and Hellman (1998) calculated a sensitivity of 0.94 for ThinPrep® and 0.85 for conventional Pap. The relative true positive rate would be 1.11 (0.94/0.85). On the basis of equation a, these numbers correspond to a reduction in FNR (x) of 0.6, since
0.94 = 0.85 + [x * (1- 0.85)],
so,
x = (0.94 - 0.85) / 0.15 = 0.6.
From equation b, the reduction in FNR would also be 0.6, since
[0.85 * (1.11-1)]/(1 - 0.85) = 0.6.
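As a cross-check on this arithmetic, equations a and b can be sketched in a few lines of Python. The function names are illustrative only and are not part of the original model; the inputs are the Bolick and Hellman (1998) sensitivities quoted above.

```python
def new_tech_sensitivity(sens_pap: float, x: float) -> float:
    """Equation a: sensitivity of the new technology for a reduction in FNR of x."""
    return sens_pap + x * (1.0 - sens_pap)


def reduction_in_fnr(sens_pap: float, relative_tpr: float) -> float:
    """Equation b: reduction in FNR (x) from relative TPR and conventional sensitivity."""
    return sens_pap * (relative_tpr - 1.0) / (1.0 - sens_pap)


# Sensitivities reported by Bolick and Hellman (1998), as quoted in the text.
sens_pap, sens_thinprep = 0.85, 0.94

relative_tpr = sens_thinprep / sens_pap          # unrounded ratio, about 1.11
x = reduction_in_fnr(sens_pap, relative_tpr)     # reduction in false negative rate

print(round(relative_tpr, 2), round(x, 2))
```

Using the unrounded ratio 0.94/0.85 gives x = 0.6; substituting the rounded relative TPR of 1.11 gives about 0.62, which rounds to the same value.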
| Source | ThinPrep® Sensitivity | Conventional Pap Sensitivity | Calculated Reduction in False Negative Rate |
|---|---|---|---|
| Roberts, Gurley, Thurloe et al. (1997) | Relative TPR 1.13 | NA | 0.91 (assumes same sensitivity and specificity as Bolick and Hellman, 1998) |
| Bolick and Hellman (1998) | 0.94 | 0.85 | 0.6 |
| Brown and Garber (1998) | 0.91 | 0.8 | 0.55 |
We therefore use 0.6 as our base case reduction in FNR for the initial primary step. For sensitivity analysis, we varied the estimate of reduction in FNR from 0.4 to 0.9. This range is consistent with the "50 percent increase in detection of disease" cited by the manufacturer (R. Silverman, personal communication to D.C. McCrory, 1998).
Because these values are calculated relative to the sensitivity of the conventional Pap testing using adjudicated review, they should, in theory, be independent of the actual sensitivity of the conventional Pap testing. In fact, this argument is the justification for not requiring histological confirmation studies for FDA approval of these technologies. Therefore, in our model, the base case sensitivity of the initial screening technology will be
0.51 + (0.6 * 0.49) = 0.804.
We assume that 10 percent of slides will be rescreened at the same improved sensitivity, for an overall sensitivity of
0.51 + (0.6 * 0.49) + (0.1 * 0.6 * 0.49) = 0.833.
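The two base case figures above can be reproduced with a short Python sketch. The function names are hypothetical; the conventional Pap sensitivity of 0.51, the reduction in FNR of 0.6, and the 10 percent rescreening fraction are the values stated in the text.

```python
def initial_sensitivity(sens_pap: float, x: float) -> float:
    """Sensitivity of the initial screen after a reduction in FNR of x."""
    return sens_pap + x * (1.0 - sens_pap)


def overall_sensitivity(sens_pap: float, x: float, rescreen_fraction: float) -> float:
    """Initial-screen sensitivity plus the increment credited to rescreening,
    following the additive form used in the text."""
    return initial_sensitivity(sens_pap, x) + rescreen_fraction * x * (1.0 - sens_pap)


SENS_PAP = 0.51          # base case conventional Pap sensitivity
X_BASE = 0.6             # base case reduction in false negative rate
RESCREEN_FRACTION = 0.1  # share of slides read as normal that are rescreened

print(round(initial_sensitivity(SENS_PAP, X_BASE), 3))
print(round(overall_sensitivity(SENS_PAP, X_BASE, RESCREEN_FRACTION), 3))
```

The same functions can be called with x between 0.4 and 0.9 to trace the range used in sensitivity analysis.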
It is important to note that these estimates do not distinguish test sensitivity for screening and rescreening, nor do they differentiate sampling error and interpretive error. Test performance is likely to be different in a rescreening application, since the population subjected to rescreening is different from the primary screening population. For practical purposes, if sensitivity is improved sufficiently to make a new screening technology cost-effective, then it should not matter if that differential is due to technological superiority or difference in the type of error that is reduced. However, to examine the impact of this difference, we model differential sensitivities between initial and rescreening testing. Regarding different types of test errors, sampling errors may result from either poor technique on the part of the person obtaining the smear, or placement of cells on the slide which may not be representative of the potential areas of abnormality. Determining the relative effect of errors in technique (inadequate visualization of the cervix, inadequate sampling of the transformation zone, failure to clear off blood and mucus prior to obtaining the smear) versus errors in cell sampling from the collection device itself would require more histological confirmation than is currently available. Again, to permit examination of this issue, we model the effect of changes in differential sensitivities between initial and rescreening technologies.
For rescreening technologies, the overall improvement in sensitivity can be calculated as either
Sensitivity Conventional Pap + {1.0 * [Sensitivity Conventional Pap + x * (1-Sensitivity Conventional Pap)] * (1-Sensitivity Conventional Pap)}
or as
Sensitivity Conventional Pap + [1.0 * Sensitivity Rescreening Device * (1-Sensitivity Conventional Pap)].
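The two rescreening expressions can be compared in a minimal Python sketch. It assumes that the factor of 1.0 is the fraction of slides sent for rescreening (100 percent), which the text does not state explicitly; the function names are illustrative.

```python
def overall_via_x(sens_pap: float, x: float, rescreen_fraction: float = 1.0) -> float:
    """First form: the rescreening device's sensitivity is derived from x."""
    sens_device = sens_pap + x * (1.0 - sens_pap)
    return sens_pap + rescreen_fraction * sens_device * (1.0 - sens_pap)


def overall_via_device(sens_pap: float, sens_device: float,
                       rescreen_fraction: float = 1.0) -> float:
    """Second form: the rescreening device's sensitivity is supplied directly."""
    return sens_pap + rescreen_fraction * sens_device * (1.0 - sens_pap)


sens_pap, x = 0.51, 0.6                      # base case values from the text
sens_device = sens_pap + x * (1.0 - sens_pap)

# Both forms agree when the device sensitivity is derived from x.
print(overall_via_x(sens_pap, x), overall_via_device(sens_pap, sens_device))
```

With every slide rescreened, the second form has the same shape as equation a, with the device sensitivity playing the role of x.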
| Source | Rescreening Sensitivity | Calculated Reduction in False Negative Rate |
|---|---|---|
| AutoPap® (Rescreening Only) | | |
| Stevens, Milne, James et al. (1997) | 0.43 | 0.43 |
| Patten, Lee, Wilbur et al. (1997a) | 0.51 | 0.51 |
| Colgan, Patten, and Lee (1995) | 0.663 | 0.663 |
| Brown and Garber (1998) | 0.77 | 0.77 |
| Papnet® | | |
| Kaufman, Schreiber, and Carter (1998) | 0.38-0.41 | 0.38-0.41 |
| Jenny, Isenegger, Boon et al. (1997) | 0.89 | 0.89 |
| Brown and Garber (1998) | 0.88 | 0.88 |
Again, we chose 0.6 as the base case for the rescreening technology and varied it between 0.4 and 0.9. The base case estimate was chosen because it was in the mid-range of estimates. By beginning at identical values, identification of thresholds between the two types of technologies was simplified.
Our base case assumes that the new technologies do not affect the specificity of Pap screening. We varied the relative specificity of the technologies from 0.9 to 1.0. However, since these are relative sensitivities and specificities, one would expect that the actual test characteristics of the new technologies will vary depending on the "true" sensitivity and specificity of conventional Pap testing when a histological gold standard is used. Thus, the lower the "true" conventional sensitivity, the lower the "true" sensitivity of the new technology.
Another issue is the relative contribution of sampling errors (failure to adequately collect abnormal cells) and interpretation errors (overlooking or falsely classifying abnormal cells on a smear) to the overall sensitivity of the test. There are several considerations. First, a certain unknown percentage of sampling errors will be attributable to the provider obtaining the smear; changes in Pap technology will not improve the ability of a provider to adequately visualize, clean, and sample the cervix. Second, although sampling errors may be more common, the relative impact on overall test sensitivity of different technologies would need to be demonstrated by direct comparison of smears obtained from conventional Paps, monolayers, and selected by rescreening devices, preferably with histological confirmation. It is difficult to see how the reduction in sampling error for a given technology can be determined without histological, or at least colposcopic, confirmation. We therefore did not consider sampling versus interpretive errors in the base case analysis. However, because it is possible that a differential impact on false negatives due to sampling or interpretation could have an impact on the relative cost-effectiveness of different technologies, we tested this in sensitivity analysis. We divided the false negative rate, 1-Sensitivity Conventional Pap, into two components as follows:
False Negative Rate = [Relative Contribution Sampling Error * (1-Sensitivity)] + [Relative Contribution Interpretive Error * (1-Sensitivity)].
We assumed in this sensitivity analysis that a rescreening technology would not affect false negatives attributable to sampling, but that a primary screening technology such as ThinPrep® could potentially reduce both types of errors. Again, the relative contributions of sampling and interpretive errors on the relative sensitivities of different technologies can ultimately only be answered by direct comparison of the technologies, preferably with colposcopic and histological confirmation.
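The decomposition can be illustrated with a minimal Python sketch. The 50 percent sampling share is a hypothetical value chosen only for illustration, and the function names are not part of the original model; the sensitivity of 0.51 and the reduction of 0.6 are the base case values given earlier.

```python
def split_false_negatives(sens_pap: float, sampling_share: float):
    """Split the false negative rate into sampling and interpretive components."""
    fnr = 1.0 - sens_pap
    return sampling_share * fnr, (1.0 - sampling_share) * fnr


def adjusted_sensitivity(sens_pap: float, sampling_share: float,
                         x_sampling: float, x_interpretive: float) -> float:
    """Apply a separate proportional reduction to each error component."""
    sampling_fn, interpretive_fn = split_false_negatives(sens_pap, sampling_share)
    return sens_pap + x_sampling * sampling_fn + x_interpretive * interpretive_fn


SENS_PAP = 0.51
SAMPLING_SHARE = 0.5   # hypothetical: half of false negatives due to sampling error

# Rescreening device: assumed to reduce interpretive errors only.
rescreening = adjusted_sensitivity(SENS_PAP, SAMPLING_SHARE, 0.0, 0.6)
# Primary screening technology: assumed here to reduce both components equally.
primary = adjusted_sensitivity(SENS_PAP, SAMPLING_SHARE, 0.6, 0.6)

print(round(rescreening, 3), round(primary, 3))
```

When both components are reduced by the same proportion, the result collapses to equation a, so the split matters only when the two technologies act on different error types.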
Cost estimates for cervical cancer screening and for the diagnosis and treatment of LSIL, HSIL, and invasive cervical cancer were derived primarily from the review of MEDSTAT data conducted by Health Economics Research, Inc.
| Test | Base Case | Range | Medicare Cost |
|---|---|---|---|
| Conventional Pap, 20-64 | $38.68 | $35.32-43.70 | |
| Conventional Pap, 65+ | $47.73 | $44.24-56.98 | $35.01 (range: $34.55-36.92) |
| New technology which improves primary screening sensitivity | Incremental cost: $10.00 | $5.00-15.00 | Not available |
| New technology which improves rescreening sensitivity | Incremental cost: $10.00 | $5.00-15.00 | Not available |
Diagnostic costs were estimated in two ways. First, the mean cost for diagnosis and treatment of LSIL, HSIL, and Stages I, II, III, and IV cervical cancer from MEDSTAT data was used to assign a global diagnosis and treatment cost to each level of pathology. Sensitivity analyses were run using the 25th and 75th percentiles as the minimum and maximum values. The MEDSTAT dataset did not contain information specific to FIGO stage, only World Health Organization (WHO) stage (Local, Regional, Distant). We assumed that "local" disease equaled Stage I, "regional" equaled Stage II or III, and "distant" equaled Stage IV.
| Diagnostic Procedure | Base Case | Range | Medicare Cost |
|---|---|---|---|
| Colposcopy + biopsy + endocervical curettage | $375 | $244-477 | $244 |
| Colposcopy + biopsy only | $276.57 | $195-351 | $172.63 |
| LEEP | $564 | $352-710 | $205 |
| Cone biopsy | $919 | $623-1114 | $409 |
| Procedure | Average Cost | Stage I | Stage II | Stage III | Stage IV |
|---|---|---|---|---|---|
| Examination under anesthesia | $201 | 0.414 | 0.458 | 0.462 | 0.462 |
| Cystoscopy | $982 | 0.201 | 0.388 | 0.454 | 0.454 |
| Proctoscopy | $774 | 0.152 | 0.281 | 0.316 | 0.316 |
| Chest X-ray | $43 | 0.727 | 0.781 | 0.779 | 0.779 |
| Intravenous pyelogram | $339 | 0.313 | 0.327 | 0.311 | 0.311 |
| Barium enema | $108 | 0.136 | 0.199 | 0.213 | 0.213 |
| Bone scan | $236 | 0.045 | 0.112 | 0.136 | 0.136 |
| CT scan | $312 | 0.408 | 0.696 | 0.768 | 0.768 |
| MRI scan | $563 | 0.046 | 0.067 | 0.071 | 0.071 |
| Lymphangiogram | NA | 0.013 | 0.032 | 0.039 | 0.039 |
| Fine needle biopsy | NA | 0.017 | 0.035 | 0.042 | 0.042 |
NA = not available.
| Cervical Cancer Stage | Base Case | Range | Medicare Costs |
|---|---|---|---|
| I | $714 | $274-1042 | $348 |
| II | $1138 | $429-1642 | $520 |
| III | $1257 | $470-1837 | $563 |
| IV | $1257 | $470-1837 | $563 |
Because MEDSTAT data did not include professional charges for anesthesia for pelvic examination under anesthesia, or charges for lymphangiography or fine needle biopsy, the diagnostic costs are likely to underestimate the true average costs of diagnosis of cervical cancer to some extent.
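The stage-specific diagnostic costs are consistent with an expected-cost calculation in which each procedure's average cost is weighted by the stage-specific values in the procedure table. The Python sketch below makes that reading explicit; treating the stage columns as the proportion of patients undergoing each procedure is an assumption, and procedures with no available cost are omitted, which matches the underestimate noted above.

```python
# (procedure, average cost in dollars, assumed proportion of patients in Stage I-IV)
PROCEDURES = [
    ("Examination under anesthesia", 201, (0.414, 0.458, 0.462, 0.462)),
    ("Cystoscopy",                   982, (0.201, 0.388, 0.454, 0.454)),
    ("Proctoscopy",                  774, (0.152, 0.281, 0.316, 0.316)),
    ("Chest X-ray",                   43, (0.727, 0.781, 0.779, 0.779)),
    ("Intravenous pyelogram",        339, (0.313, 0.327, 0.311, 0.311)),
    ("Barium enema",                 108, (0.136, 0.199, 0.213, 0.213)),
    ("Bone scan",                    236, (0.045, 0.112, 0.136, 0.136)),
    ("CT scan",                      312, (0.408, 0.696, 0.768, 0.768)),
    ("MRI scan",                     563, (0.046, 0.067, 0.071, 0.071)),
    # Lymphangiogram and fine needle biopsy are omitted: no cost was available.
]
STAGES = ("I", "II", "III", "IV")


def expected_diagnostic_cost(stage_index: int) -> float:
    """Sum of average procedure costs weighted by the stage-specific proportion."""
    return sum(cost * shares[stage_index] for _, cost, shares in PROCEDURES)


costs = {stage: round(expected_diagnostic_cost(i)) for i, stage in enumerate(STAGES)}
print(costs)
```

The rounded results match the base case diagnostic costs of $714, $1138, $1257, and $1257 for Stages I through IV.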
| Pathological Findings | Base Case | Range |
|---|---|---|
| LSIL | $1728 | $675-2274 |
| HSIL | $3049 | $1384-4203 |
| Stage I | $17,645 | $9,439-21,946 |
| Stage II-III | $27,069 | $11,734-29,089 |
| Stage IV | $40,280 | $12,670-46,509 |
| Therapeutic Procedure | Base Case | Range | Medicare Costs |
|---|---|---|---|
| LEEP | $564 | $352-710 | $205 |
| Cryotherapy | $156 | $108-184 | $111 |
| Laser ablation | $730 | $390-976 | $205 |
As noted elsewhere in the report, mean costs for cervical cancer treatment were often substantially higher than median costs, especially for more advanced procedures. This probably represents the effects of varying severity of disease within stages or of complications of treatment, which substantially increase costs. We have chosen to use mean costs for our base case estimate in order to include the economic impact of varying disease severity, comorbidities, and complications. Because of the extremely wide standard deviations resulting from the effect of outliers, we use the 25th and 75th percentiles as endpoints in sensitivity analysis.
| Treatment | Number of Inpatient Procedures | Number of Outpatient Procedures |
|---|---|---|
| Simple hysterectomy only | 1 | 0 |
| Radical hysterectomy only | 1 | 0 |
| Surgery + radiation | 1 Radical hysterectomy or 1 Simple hysterectomy + 2 radiation treatments | 20 outpatient |
| Radiation + chemotherapy | 2 radiation treatments and 0 inpatient chemotherapy, or 6 inpatient chemotherapy | 6 outpatient (if no inpatient) |
| Chemotherapy | 6 | 6 (if no inpatient) |
| Radiation therapy | 2 | 20 |
The cost estimates resulting from these assumptions about treatment regimens for each stage were then multiplied by the proportion of patients in each stage receiving each treatment in the American College of Surgeons PCE dataset to arrive at the following weighted averages for treatment of cervical cancer (Tables 17 and 18):
| Stage | Procedure-Specific Estimate: Diagnosis | Procedure-Specific Estimate: Treatment | Procedure-Specific Estimate: Total | MEDSTAT Global Estimate |
|---|---|---|---|---|
| I | $714 | $12,243 | $12,957 | $17,645 |
| II | $1,138 | $15,244 | $16,382 | $27,069 |
| III | $1,257 | $15,244 | $16,501 | $27,069 |
| IV | $1,257 | $17,431 | $18,688 | $40,280 |
| Stage | Base Case | Range | Medicare |
|---|---|---|---|
| I | $12,243 | $7,320-15,185 | $7,724 |
| II | $15,244 | $7,196-21,660 | $10,105 |
| III | $15,244 | $7,526-25,144 | $10,159 |
| IV | $17,431 | $9,296-34,365 | $8,001 |
The MEDSTAT data do not allow direct attribution of a cost associated with terminal care for cervical cancer patients. Eddy (1990) estimated a cost of $22,150 in 1989 dollars, based on 1987 Medicare data. The MEDSTAT data do report a category of "All stages (final diagnosis Stage 4)," with an average incremental cost of $16,530 above the average cost for Stage IV cancers. It is unclear whether these represent terminal care, recurrent cases, or progressive cases, since stage, once assigned, should not change.
Eddy's cost, inflated to 1997 dollars, is $34,805. This figure represents charges. Since we use a 0.5 cost/charge ratio, the MEDSTAT costs above represent charges of $33,060. Because of the similarity of these figures, we assume that the cost of terminal care for cervical cancer is $16,530 and assign this cost to all patients dying of cervical cancer.
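The comparison between the two terminal care figures is a cost-to-charge conversion, sketched below in Python. The variable and function names are illustrative; the dollar amounts and the 0.5 ratio are those given in the text.

```python
COST_TO_CHARGE_RATIO = 0.5   # ratio used in the report


def cost_to_charge(cost: float, ratio: float = COST_TO_CHARGE_RATIO) -> float:
    """Convert a cost estimate into the equivalent charge."""
    return cost / ratio


medstat_incremental_cost = 16_530   # MEDSTAT "final diagnosis Stage 4" increment
eddy_charge_1997 = 34_805           # Eddy (1990) estimate inflated to 1997 dollars

medstat_charge = cost_to_charge(medstat_incremental_cost)
relative_gap = (eddy_charge_1997 - medstat_charge) / eddy_charge_1997

print(medstat_charge, round(relative_gap, 3))
```

On these inputs the two charge figures differ by about 5 percent, which is the similarity the text relies on.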
Epidemiological data on the incidence and prevalence of cervical cancer and its precursors (LSIL, HSIL, and HPV infection) were used to calibrate the cost-effectiveness model.
Converting reported incidence, regression, persistence, and progression rates for HPV and SIL to reliable transition probabilities for the model was difficult because of the way most of these values are reported in the literature. Regression and progression rates are usually reported as percentages of either prevalent or incident infections, but followup time varied widely among studies. Data are rarely provided in a way that easily allows conversion of these rates to yearly transition probabilities (Miller and Homan, 1994). In addition, very few data are available that allow calculation of transition probabilities for untreated cervical cancer, such as the annual stage-specific probability of symptom development or the annual probability of progression between stages. Estimates for these probabilities were derived from the epidemiological literature and prior published models of cervical cancer screening, especially those of Eddy (1990) and Muller, Mandelblatt, Schechter et al. (1990).
The model was calibrated by adjusting natural history parameters to recreate the age-specific incidence of cervical cancer in an unscreened population (Figure 10).
Gustafsson, Ponten, Bergstrom et al. (1997) described two separate curves, one with a peak between ages 40 and 50 with a more rapid decline, and one similar to the U.S. data with a peak between ages 50 and 65 and a more gradual decline. Their modeling suggests that some of the difference results from increased incidence in successive birth cohorts. The early peak curves primarily were observed in western European countries with data collected predominantly between 1950 and 1975. Conversely, the U.S. data come from Connecticut between 1935 and 1949. It is likely that early sexual activity, and therefore exposure to HPV, was more frequent in western Europe and Scandinavia during the 1950s and 1960s than in 1940s Connecticut. In addition, there are likely to be differences in access to care, or willingness to seek care for symptoms, across time and cultures.
We elected to use values resulting in an earlier onset for the base case scenario. Prevention of fatal disease occurring at younger ages results in greater increases in life expectancy compared with prevention of disease at older ages; thus, our base case model is biased in favor of screening strategies that have increased sensitivity.
The peak incidence of cervical cancer used in the model (81/100,000) is somewhat higher than that observed in an unscreened U.S. population in the 1930s (approximately 60/100,000) but lower than that seen in developing nations (Gustafsson, Ponten, Bergstrom et al., 1997). Because there have been significant increases in sexual activity in adolescents over time (Centers for Disease Control and Prevention, 1991), it seems likely that the actual expected rate in an unscreened population would be higher than the rate observed 60 years ago, especially since mortality from other causes, particularly in younger women, has declined. This higher incidence estimate is consistent with our choice of other base case estimates for the model, in that we have tried to bias the model in favor of screening strategies with increased sensitivity compared with conventional Pap smears at longer screening intervals.
In an unscreened population, age-specific incidence of cervical cancer will be dependent both on the rate of progression of the disease and the stage-specific probability of developing symptoms that lead to diagnosis. Very few data are available to validate parameter estimates for rates of progression between stages of cancer or the probability of developing symptoms in any given stage. Therefore, these parameters were adjusted to result in a distribution of cases by stage that approximates that seen in patients without Pap smear screening (Bearman, MacMillan, and Creasman, 1987; Eddy, 1990). These proportions are also similar to those observed in larger series of cervical cancer patients (Jones, Shingleton, Russell et al., 1995; Pretorius, Semrad, Watring et al., 1991), presumably because at least 30-40 percent of cervical cancer patients have never been screened or have not been screened within the past 5 years.
The American College of Obstetricians and Gynecologists formed the "private" half of the "public-private partnership" fostered by AHCPR through the Evidence-based Practice Centers initiative. ACOG played a significant role in this study by nominating evaluation of cervical cytology as a task order topic for funding through the EPC program. Dr. Stanley Zinberg, ACOG's Vice President of Practice Activities, is a member of the study's Advisory Panel of Technical Experts and Peer Review Panel. Through Dr. Zinberg's participation on these panels, ACOG was involved at key decision stages of the study, providing consultation on the key literature search questions and reviewing the first drafts of the evidence tables, the plans for the cost-effectiveness analysis and meta-analysis, and the draft evidence report. ACOG also assisted in writing a dissemination plan for this evidence report for AHCPR.
Internal and external quality control mechanisms were incorporated into each phase of the development of the evidence report.
Quality control procedures were integrated into each step of the literature review. The search strategies used were checked and finalized by two professional medical librarians. For additional checks on completeness, the articles cited in reference lists of reviewed articles, particularly meta-analyses, were compared with those in the Pro-Cite literature database developed for this evidence report, and when appropriate, articles were copied, reviewed, and added to the Pro-Cite database. Further, the manufacturers of AutoPap®, Papnet®, and ThinPrep® were asked to submit articles and other documents that met the specified inclusion criteria for the evidence report.
With regard to the content of the evidence tables and data analyses, quality control procedures included: screening of all articles by at least two clinicians, abstracting of information into the evidence report by two clinicians, and independent construction of each 2-by-2 table by two clinicians and over-reading by a third. No data or other information was inserted into the evidence table or data analyses without absolute agreement by two, and often three, clinicians. Further, once the content was agreed upon by the clinicians, the information for the evidence table and data for the meta-analysis and cost-effectiveness analysis were double-entered and compared, and any discrepancies reconciled. Where appropriate, kappa statistics were calculated to show the strength of agreement between clinician-reviewers, particularly in the article screening stage and the construction of the 2-by-2 tables. When the strength of agreement was low, all reviewers met to discuss the nature of the discrepancies and to come to common agreement on the interpretation of the criteria and their application to the articles as well as agreeing on responses to articles that were in a "gray" area. In the latter case, the decision was to err on the side of including the article, recognizing that the data abstraction process and the development of the 2-by-2 tables would further illuminate the inclusion/exclusion status of the article.
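For readers unfamiliar with the agreement measure, the sketch below computes Cohen's kappa for two reviewers making include/exclude decisions. The report does not specify which kappa variant was used, so Cohen's kappa is an assumption, and the counts are hypothetical.

```python
def cohens_kappa(both_include: int, only_first: int,
                 only_second: int, both_exclude: int) -> float:
    """Cohen's kappa for two raters and a binary include/exclude decision."""
    n = both_include + only_first + only_second + both_exclude
    observed = (both_include + both_exclude) / n

    # Agreement expected by chance, from each reviewer's marginal rates.
    first_include = (both_include + only_first) / n
    second_include = (both_include + only_second) / n
    expected = (first_include * second_include
                + (1 - first_include) * (1 - second_include))

    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions on 100 articles.
kappa = cohens_kappa(both_include=40, only_first=10, only_second=5, both_exclude=45)
print(round(kappa, 2))
```

Kappa discounts the agreement that two reviewers would reach by chance alone, which is why it is more informative than raw percent agreement when most articles fall into a single category.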
Quality control in the cost-effectiveness analysis was provided by extensive sensitivity analyses, by validation with epidemiological data, and by comparisons with previously published cost-effectiveness models.
Two external oversight and review panels were established. The Advisory Panel of Technical Experts was initiated early in the 11-month time frame for the study. The Advisory Panel of Technical Experts reviewed progress on the evidence report at four key stages of its development: the identification of the key literature search questions, the first drafts of the evidence tables, the plans for the cost-effectiveness analysis and meta-analysis, and the draft of the evidence report. The Technical Experts were sent draft documents that were discussed as a group in two conference calls and also in individual phone calls and written communications. The eight-member Panel consisted of clinical and methodological experts in relevant specialty areas, including clinical and laboratory pathology, obstetrics/gynecology, family practice, oncology, women's health, meta-analysis, cost-effectiveness analysis, and epidemiology. Consumers were represented on the Advisory Panel of Technical Experts by Family Health International, a not-for-profit organization committed to improving the health of women and children; helping women and men have access to safe, effective, acceptable, and affordable family planning methods; and preventing the spread of AIDS and other sexually transmitted diseases (STDs).
The primary function of the second group, the Peer Review Panel, was to review and comment on the complete draft of the evidence report. The 23-member panel, which included AHCPR staff and the Advisory Panel of Technical Experts, consisted of clinical and methodological experts in relevant specialty areas, including clinical and laboratory pathology, obstetrics/gynecology, family practice, oncology, women's health, meta-analysis, cost-effectiveness analysis, and epidemiology, as well as the manufacturers of the Pap screening devices reviewed. The report was modified on the basis of their comments, with close attention to relevant studies not included in the draft report, misinterpretation of findings, and other issues deserving revision within the constraints of the methodology, time frame, and budget. The format for the report was designed by AHCPR.
This section describes results on the diagnostic accuracy of cervical cytology screening tests, the meta-analysis of Pap test accuracy, and cost estimates of screening, diagnosis, and treatment of cervical cancer and its precursors. A description is also provided of the review of studies and models of effectiveness and cost-effectiveness of cervical cancer screening. Finally, results are provided from the cost-effectiveness analysis.
| Total Number of Articles | 939 |
| Articles eliminated in initial screening | 561 |
| Articles eliminated in second screening | 292 |
| Articles included and abstracted | 86 |
Because very few studies of the new technologies met the original Step 1 and Step 2 screening criteria (no studies of AutoPap®, one study of Papnet®, and one study of ThinPrep®), we modified the criteria to include studies pertaining to any of the new technologies that used a cytological reference standard if applied by an independent panel and if at least half of high-grade cytological results were verified histologically, as suggested in guidelines produced by the Intersociety Working Group for Cytology Technologies (1998). We further modified the screening criteria to include studies that failed to verify diagnosis in patients negative on two independent tests being compared. Though such studies do not estimate sensitivity and specificity, they can still provide estimates of relative TPR and relative FPR, as described by Chock, Irwig, Berry et al. (1997).
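The reason relative rates remain estimable can be shown with a short Python sketch. When both tests are applied to the same women and every woman positive on either test is verified, the unknown number of diseased women missed by both tests cancels out of the ratio. The counts below are hypothetical and the function name is illustrative.

```python
def relative_rate(both_positive: int, new_only: int, conventional_only: int) -> float:
    """Ratio of positive rates (new / conventional) within one verified group.

    Applied to verified diseased women this is the relative TPR; applied to
    verified nondiseased women it is the relative FPR. The group total, which
    includes unverified double negatives, cancels out of the ratio.
    """
    new_positive = both_positive + new_only
    conventional_positive = both_positive + conventional_only
    if conventional_positive == 0:
        raise ValueError("conventional test has no positives in this group")
    return new_positive / conventional_positive


# Hypothetical verified counts from a paired (split-sample) study.
relative_tpr = relative_rate(both_positive=80, new_only=15, conventional_only=5)
relative_fpr = relative_rate(both_positive=20, new_only=10, conventional_only=10)

print(round(relative_tpr, 3), round(relative_fpr, 3))
```

Absolute sensitivity and specificity cannot be recovered from the same counts, because they depend on the double-negative women who were never verified.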
We considered a total of 59 studies (12 on AutoPap®, 27 on Papnet® and 20 on ThinPrep®) during this final stage of the screening process (Step 3). Forty-three of these articles had initially been excluded in Step 2, and 16 were brought to our attention at this time by reviewers or manufacturers (7 unpublished manuscripts and 2 published only in abstract form). Ultimately, we decided to describe in the evidence table and the text of this report all the studies considered for Step 3 that also met the criteria requiring the application of a concurrent reference standard (histology, colposcopy, or cytology). The net result was the inclusion of 6 studies of AutoPap®, 11 of Papnet®, and 8 of ThinPrep®.
Six studies of either AutoPap 300 QC® or the AutoPap Primary Screening System® permitted estimates of sensitivity. Little information was available on which to base an estimate of the specificity of rescreening. In three studies of the performance of AutoPap 300 QC® in slides manually screened as negative, the estimated sensitivity for detecting ASCUS or worse at the 20 percent review rate was 0.43 (Stevens, Milne, James et al., 1997), 0.51 (Patten, Lee, Wilbur et al., 1997a), and 0.663 (Colgan, Patten, and Lee, 1995). The estimated sensitivity for detecting LSIL or worse at the 20 percent review rate was 0.66 and 0.77 in the latter two studies. Studies suggested that sensitivity at the 20 percent review rate for ASCUS or worse was higher when these technologies were used for screening than when they were used for rescreening: 0.86 (Wilbur, Bonfiglio, Rutkowski et al., 1996) for the AutoPap 300 QC®, and 0.86 (Wilbur, Prey, Miller et al., 1998) and 0.92 (Lee, Kuan, Oh et al., 1998) for the AutoPap Primary Screening System®. The estimated sensitivity for detecting LSIL or worse at the 20 percent review rate was 0.92 (Wilbur, Prey, Miller et al., 1998). An estimate of specificity of 0.98 was obtained for screening use of the AutoPap Primary Screening System® (Lee, Kuan, Oh et al., 1998); however, this estimate was based on initial screening rather than rescreening. Because the spectrum of disease is different in patients whose smears are initially manually screened as negative compared with patients undergoing initial screening, this specificity estimate may be inaccurate. Furthermore, the Primary Screening System uses a somewhat different algorithm for classification than does the AutoPap 300 QC® and therefore could be expected to have a different specificity.
| Quality Criterion | Number of Studies (N = 84) | Percent |
|---|---|---|
| Reference standard | | |
| Histology | 70 | 83 |
| Histology or negative colposcopy | 14 | 17 |
| Independence of assessments | | |
| Blinded | 22 | 26 |
| Not blinded | 62 | 74 |
| Verification | | |
| All test positives and test negatives verified | 23 | 27 |
| Test positives and random fraction of test negatives verified | 3 | 4 |
| Test positives and selected sample or none of test negatives verified | 58 | 69 |
| Study sample | | |
| Consecutive or random | 73 | 87 |
| Other | 11 | 13 |
| Spectrum of disease/nondisease | | |
| Defined | 9 | 11 |
| Not defined | 75 | 89 |
| Publication type | | |
| Paper | 84 | 100 |
| Abstract | 0 | 0 |
| Industry relationship | | |
| Neither done nor supported by a manufacturer | 82 | 98 |
| Supported by a manufacturer | 1 | 1 |
| Done by a manufacturer | 1 | 1 |
Data were available according to different combinations of test thresholds and reference standard thresholds: ASCUS/CIN1 (31 studies), LSIL/CIN1 (69 studies), LSIL/CIN2-3 (43 studies), and HSIL/CIN2-3 (45 studies). In two studies, the test and reference standard thresholds were not explicitly stated and could not be precisely inferred. Most studies permitted us to construct 2-by-2 tables using more than one combination of test and reference standard threshold.
Sensitivity estimates from these studies covered nearly the entire spectrum, ranging from 0.06 to 0.99. Similarly, specificity estimates ranged from 0.06 to 0.99. Because sensitivity and specificity are interdependent, simple statistics such as means do not provide an accurate description of the diagnostic performance of the test. We used several analytical strategies to determine the cause of the large amount of variability in sensitivity and specificity estimates and to ascertain a reasonable estimate of the performance of Pap tests in a low prevalence screened population. These analyses are described in the section on "Meta-analysis of Pap Smear Accuracy."
The setting in which most of these studies were conducted has important implications for interpreting their results and explaining the differences among them. Most of the studies were conducted in colposcopy clinics in patients who had been referred for colposcopy because of a cytological abnormality on an initial Pap smear screening. In such studies, the repeat Pap smear taken at the time of colposcopy usually served as the "test" result that was compared to the result of either colposcopy (available for all subjects) or histology (available only for women who underwent biopsy; that is, women with a negative colposcopy would be systematically excluded).
The assembly of the study sample from a population referred because of cytological abnormality on initial Pap screening can be expected to bias the spectrum of disease. Women who have a negative Pap smear at the time of colposcopy (following an initial positive smear) are likely to be different from women with a negative result on initial smear. Some have suggested that the removal of cervical epithelium with the initial Pap smear might remove enough abnormal cells that the subsequent Pap may be normal. Our analysis assumes that women with subsequent normal Pap smears had an initial false positive smear. If untrue, this assumption would lead to underestimation of Pap test accuracy.
Returning to the above example, when disease status verification is preferentially obtained in women with a colposcopically visible lesion who undergo biopsy, the study sample is further biased. This selection of subjects for verification results not only in a high prevalence of histological abnormalities in the study sample, but also in "workup" bias (Ransohoff and Feinstein, 1978). Although sensitivity and specificity are relatively invariant to prevalence, they are subject to workup bias. This bias can be expected to lead to elevated estimates of sensitivity and lowered estimates of specificity (Feinstein, 1985).
Many studies comparing conventional Pap screening with new technologies applied the reference standard (adjudicated cytology or histology) only to discrepant cases. Concordant positive and concordant negative test results are assumed to be true positives and true negatives, respectively, but may actually be concordant false results. This study design has a consistent underlying bias that can be expected to overestimate sensitivity and specificity of the new test (Miller, 1998). When the conventional and new test are conditionally dependent, as when tests may have similar problems with sample collection or interpretation of mild disease, this bias can be substantial.
Our goal was to obtain estimates of Pap test performance that are applicable to a low prevalence screened population. However, only three studies identified patients undergoing initial Pap smear screening and verified all (or a random fraction) of test negative subjects. The study of Baldauf, Dreyfus, Lehmann et al. (1995) was the largest of these, comprising 1,539 women. In this study, a 10 percent random sample of Pap negative women underwent colposcopy and biopsy to verify their disease status; this allowed for correction of any verification bias. Davison and Marty (1994) studied 200 women, and Hockstad (1992) tested 73 women; all test negative women in these two studies had their disease status verified with the reference standard test. These studies estimated sensitivity at 56 percent, 53 percent, and 29 percent, respectively, and specificity at 98 percent, 100 percent, and 97 percent, respectively.
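As a minimal sketch of how such estimates follow from 2-by-2 counts, the Python snippet below recomputes sensitivity and specificity for the two fully verified studies, using the true/false positive and negative counts listed for Davison (1994) and Hockstad (1992) in this report's table of study data. The function name and the dictionary layout are illustrative assumptions. Baldauf is omitted because its published estimate reflects a correction for partial verification that the raw counts alone do not reproduce.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the cells of a 2-by-2 table."""
    sensitivity = tp / (tp + fn)  # diseased subjects with a positive test
    specificity = tn / (tn + fp)  # nondiseased subjects with a negative test
    return sensitivity, specificity


# Counts as tabulated in the report: (TP, FN, FP, TN)
studies = {
    "Davison 1994": (16, 14, 0, 166),
    "Hockstad 1992": (2, 5, 2, 61),
}

results = {name: sensitivity_specificity(*cells) for name, cells in studies.items()}

for name, (se, sp) in results.items():
    print(f"{name}: sensitivity={se:.2f}, specificity={sp:.2f}")
```

The rounded values agree with the 53 percent/100 percent and 29 percent/97 percent figures quoted in the text.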
| First Author | Year | Test Threshold | Reference Standard Threshold | Number of Subjects | True Positives | False Negatives | False Positives | True Negatives | Prevalence of Disease 1 |
|---|---|---|---|---|---|---|---|---|---|
| Anderson | 1992 | HSIL | CIN2-3 | 228 | 60 | 99 | 22 | 47 | 0.7 |
| Andrews | 1989 | ASCUS | CIN1 | 353 | 62 | 56 | 89 | 146 | 0.33 |
| | | LSIL | CIN1 | 353 | 31 | 87 | 29 | 206 | 0.33 |
| Baldauf | 1995 | ASCUS | CIN1 | 117 | 84 | 1 | 25 | 7 | 0.73 |
| | | HSIL | CIN2-3 | 117 | 24 | 21 | 3 | 69 | 0.38 |
| | | LSIL | CIN1 | 117 | 79 | 6 | 22 | 10 | 0.73 |
| | | LSIL | CIN2-3 | 117 | 42 | 3 | 59 | 13 | 0.38 |
| Baldauf | 1995 | ASCUS | CIN1 | 324 | 35 | 27 | 24 | 238 | 0.19 |
| | | ASCUS | CIN2-3 | 324 | 15 | 8 | 44 | 256 | 0.07 |
| Beeby | 1993 | LSIL | CIN1 | 1000 | 417 | 218 | 80 | 285 | 0.64 |
| | | LSIL | CIN2-3 | 1000 | 420 | 51 | 304 | 225 | 0.47 |
| Bigrigg | 1990 | HSIL | CIN2-3 | 981 | 567 | 140 | 117 | 157 | 0.72 |
| | | LSIL | CIN1 | 981 | 900 | 34 | 31 | 16 | 0.95 |
| | | LSIL | CIN2-3 | 981 | 691 | 16 | 240 | 34 | 0.72 |
| Bolick | 1998 | LSIL | CIN1 | 89 | 57 | 10 | 14 | 8 | 0.75 |
| Cecchini | 1993 | ASCUS | CIN2-3 | 486 | 5 | 3 | 69 | 409 | 0.02 |
| | | LSIL | CIN2-3 | 486 | 5 | 3 | 15 | 463 | 0.02 |
| Chomet | 1987 | HSIL | CIN2-3 | 14 | 2 | 1 | 0 | 11 | 0.21 |
| | | HSIL | CIN2-3 | 130 | 14 | 44 | 5 | 67 | 0.45 |
| | | LSIL | CIN1 | 130 | 63 | 31 | 14 | 22 | 0.72 |
| | | LSIL | CIN1 | 14 | 3 | 6 | 1 | 4 | 0.64 |
| | | LSIL | CIN2-3 | 130 | 44 | 14 | 33 | 39 | 0.45 |
| | | LSIL | CIN2-3 | 14 | 2 | 1 | 2 | 9 | 0.21 |
| Coibion | 1994 | LSIL | CIN1 | 163 | 27 | 96 | 6 | 34 | 0.75 |
| | | LSIL | CIN2-3 | 163 | 14 | 10 | 19 | 120 | 0.15 |
| Cox | 1992 | ASCUS | CIN1 | 482 | 113 | 24 | 111 | 234 | 0.28 |
| | | LSIL | CIN1 | 482 | 60 | 77 | 27 | 318 | 0.28 |
| Cox | 1995 | ASCUS | CIN1 | 217 | 30 | 20 | 38 | 129 | 0.23 |
| | | LSIL | CIN1 | 217 | 19 | 31 | 7 | 160 | 0.23 |
| Davis | 1981 | HSIL | CIN2-3 | 87 | 40 | 31 | 3 | 13 | 0.82 |
| | | LSIL | CIN1 | 87 | 57 | 25 | 0 | 5 | 0.94 |
| | | LSIL | CIN2-3 | 87 | 51 | 20 | 6 | 10 | 0.82 |
| Davison | 1994 | LSIL | CIN1 | 196 | 16 | 14 | 0 | 166 | 0.15 |
| Del Priore | 1995 | LSIL | CIN1 | 52 | 16 | 4 | 8 | 24 | 0.38 |
| | | LSIL | CIN2-3 | 52 | 5 | 1 | 19 | 27 | 0.12 |
| DiBonito | 1993 | ASCUS | CIN1 | 918 | 190 | 59 | 47 | 622 | 0.27 |
| | | ASCUS | CIN2-3 | 918 | 88 | 4 | 149 | 677 | 0.1 |
| | | LSIL | CIN1 | 918 | 152 | 97 | 29 | 640 | 0.27 |
| | | LSIL | CIN2-3 | 918 | 88 | 4 | 93 | 733 | 0.1 |
| Fahim | 1991 | LSIL | CIN1 | 757 | 275 | 48 | 165 | 269 | 0.43 |
| Ferenczy | 1996 | ASCUS | CIN1 | 364 | 130 | 56 | 45 | 133 | 0.51 |
| | | ASCUS | CIN2-3 | 364 | 67 | 14 | 108 | 175 | 0.22 |
| Ferris | 1998 | ASCUS | CIN1 | 279 | 92 | 8 | 87 | 92 | 0.36 |
| | | ASCUS | CIN2-3 | 279 | 19 | 4 | 160 | 96 | 0.08 |
| | | LSIL | CIN1 | 279 | 37 | 63 | 26 | 153 | 0.36 |
| | | LSIL | CIN2-3 | 279 | 5 | 18 | 58 | 198 | 0.08 |
| Frisch | 1994 | LSIL | CIN1 | 51 | 14 | 26 | 3 | 8 | 0.78 |
| Fung | 1997 | LSIL | CIN1 | 301 | 188 | 29 | 8 | 76 | 0.72 |
| | | LSIL | CIN2-3 | 301 | 96 | 14 | 100 | 91 | 0.37 |
| Garutti | 1994 | LSIL | CIN1 | 200 | 30 | 42 | 18 | 110 | 0.36 |
| Germain | 1994 | LSIL | CIN1 | 430 | 80 | 40 | 65 | 245 | 0.28 |
| Giles | 1989 | LSIL | CIN1 | 112 | 61 | 14 | 0 | 37 | 0.67 |
| | | LSIL | CIN2-3 | 112 | 38 | 3 | 23 | 48 | 0.37 |
| Giles | 1988 | LSIL | CIN1 | 200 | 17 | 15 | 25 | 143 | 0.16 |
| Glenthoj | 1988 | HSIL | CIN2-3 | 93 | 64 | 21 | 1 | 7 | 0.91 |
| | | LSIL | CIN1 | 93 | 71 | 16 | 0 | 6 | 0.94 |
| | | LSIL | CIN2-3 | 93 | 70 | 15 | 1 | 7 | 0.91 |
| Gonzalez | 1996 | ASCUS | CIN1 | 62 | 17 | 4 | 17 | 24 | 0.34 |
| | | LSIL | CIN1 | 62 | 8 | 13 | 5 | 36 | 0.34 |
| | | LSIL | CIN2-3 | 62 | 1 | 0 | 12 | 49 | 0.02 |
| Gundersen | 1988 | LSIL | CIN1 | 56 | 4 | 22 | 3 | 27 | 0.46 |
| | | LSIL | CIN2-3 | 56 | 3 | 2 | 4 | 47 | 0.09 |
| Haddad | 1988 | HSIL | CIN2-3 | 121 | 4 | 70 | 0 | 47 | 0.61 |
| | | LSIL | CIN1 | 121 | 91 | 14 | 9 | 7 | 0.87 |
| | | LSIL | CIN2-3 | 121 | 69 | 5 | 31 | 16 | 0.61 |
| Hellberg | 1987 | HSIL | CIN2-3 | 51 | 31 | 10 | 2 | 8 | 0.8 |
| | | LSIL | CIN1 | 51 | 40 | 5 | 2 | 4 | 0.88 |
| | | LSIL | CIN2-3 | 51 | 37 | 4 | 5 | 5 | 0.8 |
| Helmerhorst | 1987 | HSIL | CIN2-3 | 132 | 41 | 61 | 1 | 29 | 0.77 |
| Herrington | 1995 | ASCUS | CIN1 | 141 | 103 | 25 | 6 | 7 | 0.91 |
| | | ASCUS | CIN1 | 165 | 103 | 25 | 15 | 22 | 0.78 |
| | | HSIL | CIN2-3 | 141 | 19 | 21 | 0 | 101 | 0.28 |
| | | HSIL | CIN2-3 | 165 | 19 | 21 | 0 | 125 | 0.24 |
| | | LSIL | CIN1 | 141 | 88 | 40 | 5 | 8 | 0.91 |
| | | LSIL | CIN1 | 165 | 88 | 40 | 11 | 26 | 0.78 |
| | | LSIL | CIN2-3 | 141 | 35 | 5 | 58 | 43 | 0.28 |
| | | LSIL | CIN2-3 | 165 | 35 | 5 | 64 | 61 | 0.24 |
| Hirschowitz | 1992 | HSIL | CIN2-3 | 111 | 76 | 11 | 12 | 12 | 0.78 |
| Hockstad | 1992 | ASCUS | CIN1 | 70 | 2 | 5 | 2 | 61 | 0.1 |
| Johansen | 1979 | ASCUS | CIN1 | 182 | 111 | 11 | 17 | 43 | 0.67 |
| | | LSIL | CIN1 | 182 | 110 | 12 | 17 | 43 | 0.67 |
| Jones | 1996 | ASCUS | CIN1 | 22412 | 14948 | 1444 | 3206 | 2814 | 0.73 |
| | | LSIL | CIN1 | 22412 | 12210 | 4182 | 1526 | 4494 | 0.73 |
| Jones | 1992 | HSIL | CIN2-3 | 143 | 7 | 18 | 0 | 118 | 0.17 |
| | | LSIL | CIN1 | 143 | 30 | 52 | 8 | 53 | 0.57 |
| Jones | 1987 | HSIL | CIN2-3 | 236 | 3 | 7 | 2 | 224 | 0.04 |
| | | LSIL | CIN1 | 236 | 10 | 48 | 4 | 174 | 0.25 |
| | | LSIL | CIN2-3 | 236 | 5 | 5 | 9 | 217 | 0.04 |
| Kaufman | 1997 | ASCUS | CIN1 | 462 | 158 | 167 | 36 | 101 | 0.7 |
| | | ASCUS | CIN2-3 | 462 | 42 | 25 | 152 | 243 | 0.15 |
| | | HSIL | CIN2-3 | 462 | 17 | 50 | 17 | 378 | 0.15 |
| Kealy | 1986 | HSIL | CIN2-3 | 300 | 63 | 11 | 10 | 216 | 0.25 |
| | | LSIL | CIN1 | 300 | 80 | 13 | 25 | 182 | 0.31 |
| | | LSIL | CIN2-3 | 300 | 70 | 4 | 35 | 191 | 0.25 |
| Koonings | 1992 | HSIL | CIN2-3 | 147 | 39 | 19 | 11 | 78 | 0.39 |
| | | HSIL | CIN2-3 | 143 | 34 | 28 | 11 | 70 | 0.43 |
| | | LSIL | CIN1 | 143 | 61 | 27 | 20 | 35 | 0.62 |
| | | LSIL | CIN1 | 147 | 62 | 16 | 20 | 49 | 0.53 |
| | | LSIL | CIN2-3 | 143 | 48 | 14 | 33 | 48 | 0.43 |
| | | LSIL | CIN2-3 | 147 | 49 | 9 | 33 | 56 | 0.39 |
| Korach | 1995 | NS | NS | 50 | 31 | 3 | 6 | 10 | 0.68 |
| | | NS | NS | 95 | 31 | 3 | 20 | 41 | 0.36 |
| Korn | 1994 | HSIL | CIN2-3 | 85 | 10 | 17 | 3 | 55 | 0.32 |
| | | HSIL | CIN2-3 | 73 | 4 | 5 | 0 | 64 | 0.12 |
| | | LSIL | CIN1 | 85 | 42 | 24 | 5 | 14 | 0.78 |
| | | LSIL | CIN1 | 73 | 22 | 13 | 6 | 32 | 0.48 |
| | | LSIL | CIN2-3 | 85 | 19 | 8 | 52 | 6 | 0.32 |
| | | LSIL | CIN2-3 | 73 | 7 | 2 | 21 | 43 | 0.12 |
| Kwikkel | 1986 | HSIL | CIN2-3 | 184 | 79 | 21 | 26 | 58 | 0.54 |
| | | LSIL | CIN1 | 184 | 144 | 3 | 34 | 3 | 0.8 |
| | | LSIL | CIN2-3 | 184 | 99 | 1 | 79 | 5 | 0.54 |
| Lederer | 1973 | HSIL | CIN2-3 | 3073 | 267 | 99 | 37 | 2670 | 0.12 |
| Lozowski | 1982 | HSIL | CIN2-3 | 155 | 66 | 20 | 25 | 44 | 0.55 |
| | | LSIL | CIN1 | 155 | 107 | 8 | 20 | 20 | 0.74 |
| | | LSIL | CIN2-3 | 155 | 83 | 3 | 44 | 25 | 0.55 |
| MacCormac | 1988 | ASCUS | CIN1 | 6680 | 2016 | 221 | 372 | 4071 | 0.33 |
| | | LSIL | CIN1 | 6680 | 1893 | 344 | 239 | 4204 | 0.33 |
| Maggi | 1989 | HSIL | CIN2-3 | 142 | 12 | 17 | 2 | 111 | 0.2 |
| | | LSIL | CIN1 | 142 | 40 | 12 | 43 | 47 | 0.37 |
| | | LSIL | CIN2-3 | 142 | 24 | 5 | 59 | 54 | 0.2 |
| Mann | 1993 | ASCUS | CIN1 | 243 | 9 | 20 | 1 | 213 | 0.12 |
| Mayeaux | 1995 | ASCUS | CIN1 | 428 | 149 | 180 | 20 | 79 | 0.77 |
| | | HSIL | CIN2-3 | 428 | 27 | 83 | 18 | 300 | 0.26 |
| | | LSIL | CIN1 | 428 | 144 | 185 | 19 | 80 | 0.77 |
| | | LSIL | CIN2-3 | 428 | 65 | 45 | 98 | 220 | 0.26 |
| Melnikow | 1997 | LSIL | CIN1 | 771 | 583 | 75 | 100 | 13 | 0.85 |
| | | LSIL | CIN1 | 6018 | 5830 | 75 | 100 | 13 | 0.98 |
| Morrison | 1988 | ASCUS | CIN1 | 115 | 17 | 4 | 78 | 16 | 0.18 |
| Morrison | 1992 | HSIL | CIN2-3 | 15 | 5 | 3 | 2 | 5 | 0.53 |
| | | LSIL | CIN1 | 15 | 10 | 2 | 0 | 3 | 0.8 |
| | | LSIL | CIN2-3 | 15 | 8 | 0 | 2 | 5 | 0.53 |
| Naslund | 1986 | HSIL | CIN2-3 | 36 | 19 | 2 | 4 | 11 | 0.58 |
| | | LSIL | CIN1 | 36 | 23 | 0 | 3 | 10 | 0.64 |
| | | LSIL | CIN2-3 | 36 | 21 | 0 | 5 | 10 | 0.58 |
| Nyirjesy | 1972 | LSIL | CIN1 | 136 | 66 | 44 | 13 | 13 | 0.81 |
| Okagaki | 1991 | HSIL | CIN2-3 | 3545 | 340 | 271 | 366 | 2568 | 0.17 |
| Oyer | 1986 | LSIL | CIN1 | 402 | 223 | 74 | 22 | 83 | 0.74 |
| Pang | 1977 | HSIL | CIN2-3 | 52 | 40 | 3 | 3 | 6 | 0.83 |
| Parham | 1991 | HSIL | CIN2-3 | 1262 | 624 | 166 | 131 | 341 | 0.63 |
| | | LSIL | CIN1 | 1262 | 1088 | 17 | 129 | 28 | 0.88 |
| | | LSIL | CIN2-3 | 1262 | 782 | 8 | 435 | 37 | 0.63 |
| Parker | 1986 | LSIL | CIN1 | 365 | 102 | 35 | 35 | 193 | 0.38 |
| Ramirez | 1990 | HSIL | CIN2-3 | 18 | 7 | 3 | 4 | 4 | 0.56 |
| Rasbridge | 1995 | HSIL | CIN2-3 | 595 | 197 | 104 | 76 | 218 | 0.51 |
| | | LSIL | CIN1 | 595 | 374 | 58 | 81 | 82 | 0.73 |
| | | LSIL | CIN2-3 | 595 | 269 | 32 | 186 | 108 | 0.51 |
| Ritter | 1988 | ASCUS | CIN1 | 191 | 111 | 39 | 11 | 30 | 0.79 |
| Ronco | 1996 | HSIL | CIN2-3 | 50 | 20 | 12 | 4 | 14 | 0.64 |
| Singh | 1985 | ASCUS | CIN1 | 107 | 99 | 2 | 5 | 1 | 0.94 |
| | | HSIL | CIN2-3 | 107 | 17 | 80 | 1 | 9 | 0.91 |
| | | LSIL | CIN1 | 107 | 76 | 25 | 4 | 2 | 0.94 |
| | | LSIL | CIN2-3 | 107 | 73 | 24 | 7 | 3 | 0.91 |
| Skehan | 1990 | HSIL | CIN2-3 | 97 | 40 | 20 | 18 | 19 | 0.62 |
| | | LSIL | CIN1 | 97 | 56 | 12 | 24 | 5 | 0.7 |
| | | LSIL | CIN2-3 | 97 | 51 | 9 | 29 | 8 | 0.62 |
| Slawson | 1992 | ASCUS | CIN1 | 121 | 28 | 38 | 17 | 38 | 0.55 |
| Smith | 1987 | LSIL | CIN1 | 122 | 71 | 20 | 13 | 18 | 0.75 |
| Soost | 1991 | ASCUS | CIN1 | 2086 | 1486 | 173 | 344 | 83 | 0.8 |
| | | LSIL | CIN1 | 2086 | 1205 | 186 | 454 | 241 | 0.67 |
| Soutter | 1986 | HSIL | CIN2-3 | 172 | 61 | 42 | 14 | 55 | 0.6 |
| | | LSIL | CIN1 | 172 | 89 | 18 | 37 | 28 | 0.62 |
| | | LSIL | CIN2-3 | 172 | 86 | 17 | 40 | 29 | 0.6 |
| Spinillo | 1998 | ASCUS | CIN1 | 932 | 39 | 16 | 3 | 874 | 0.06 |
| | | HSIL | CIN2-3 | 932 | 83 | 16 | 8 | 825 | 0.11 |
| | | ASCUS | CIN1 | 202 | 18 | 19 | 4 | 161 | 0.18 |
| | | HSIL | CIN2-3 | 202 | 47 | 17 | 4 | 134 | 0.32 |
| Spitzer | 1987 | ASCUS | CIN1 | 97 | 32 | 23 | 19 | 23 | 0.57 |
| | | LSIL | CIN1 | 97 | 10 | 45 | 2 | 40 | 0.57 |
| Stafl | 1981 | LSIL | CIN1 | 26 | 7 | 9 | 1 | 9 | 0.62 |
| Syrjanen | 1987 | LSIL | CIN1 | 385 | 118 | 44 | 40 | 183 | 0.42 |
| Tait | 1988 | HSIL | CIN2-3 | 127 | 20 | 21 | 20 | 66 | 0.32 |
| | | LSIL | CIN1 | 127 | 38 | 13 | 14 | 62 | 0.4 |
| | | LSIL | CIN2-3 | 127 | 32 | 9 | 20 | 66 | 0.32 |
| Tawa | 1988 | ASCUS | CIN1 | 397 | 14 | 67 | 25 | 291 | 0.2 |
| Tay | 1987 | HSIL | CIN2-3 | 44 | 3 | 15 | 0 | 26 | 0.41 |
| | | LSIL | CIN1 | 44 | 12 | 24 | 0 | 8 | 0.82 |
| | | LSIL | CIN2-3 | 44 | 7 | 11 | 5 | 21 | 0.41 |
| Upadhyay | 1984 | HSIL | CIN2-3 | 308 | 222 | 0 | 68 | 18 | 0.72 |
| | | LSIL | CIN1 | 308 | 239 | 1 | 53 | 15 | 0.78 |
| | | LSIL | CIN2-3 | 308 | 222 | 0 | 70 | 16 | 0.72 |
| Walker | 1986 | LSIL | CIN1 | 214 | 140 | 42 | 15 | 17 | 0.85 |
| Wetrich | 1986 | HSIL | CIN2-3 | 1607 | 491 | 250 | 164 | 702 | 0.46 |
| | | LSIL | CIN1 | 1607 | 954 | 221 | 143 | 289 | 0.73 |
| | | LSIL | CIN2-3 | 1607 | 657 | 84 | 440 | 426 | 0.46 |
| Wheelock | 1989 | ASCUS | CIN1 | 273 | 101 | 32 | 73 | 67 | 0.49 |
| | | HSIL | CIN2-3 | 273 | 17 | 69 | 9 | 178 | 0.32 |
| | | LSIL | CIN1 | 273 | 64 | 69 | 26 | 114 | 0.49 |
| | | LSIL | CIN2-3 | 273 | 48 | 38 | 42 | 145 | 0.32 |
| Wright | 1995 | ASCUS | CIN1 | 398 | 123 | 56 | 94 | 125 | 0.45 |
| Wright | 1994b | ASCUS | CIN1 | 353 | 8 | 7 | 22 | 316 | 0.04 |
| | | LSIL | CIN1 | 353 | 4 | 11 | 5 | 333 | 0.04 |
| Young | 1993 | ASCUS | CIN1 | 412 | 276 | 52 | 42 | 42 | 0.8 |
| | | HSIL | CIN2-3 | 412 | 56 | 43 | 11 | 302 | 0.24 |
| | | LSIL | CIN1 | 412 | 236 | 92 | 23 | 61 | 0.8 |
| | | LSIL | CIN2-3 | 412 | 91 | 8 | 168 | 145 | 0.24 |
1 Disease defined according to reference standard threshold.
ASCUS = atypical squamous cells of undetermined significance; CIN = cervical intraepithelial neoplasia; LSIL = low-grade squamous intraepithelial lesion; HSIL = high-grade squamous intraepithelial lesion; NS = not stated.
| Test Thresholds (screening/reference) | Summary Effectiveness Score | 95% Confidence Interval | Log Odds Ratio | 95% Confidence Interval |
|---|---|---|---|---|
| ASCUS/CIN1 | 1.027 | 0.777 to 1.144 | 1.863 | 1.409 to 2.075 |
| LSIL/CIN1 | 1.084 | 0.942 to 1.226 | 1.966 | 1.709 to 2.224 |
| LSIL/CIN2-3 | 1.062 | 0.872 to 1.252 | 1.926 | 1.582 to 2.271 |
| HSIL/CIN2-3 | 1.287 | 1.075 to 1.499 | 2.334 | 1.950 to 2.719 |
ASCUS = atypical squamous cells of undetermined significance; CIN = cervical intraepithelial neoplasia; LSIL = low grade squamous intraepithelial lesion; HSIL = high grade squamous intraepithelial lesion
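The report does not spell out how the two columns of this table are related. The tabulated pairs are, however, consistent with the common scaling in which the effectiveness score equals the log odds ratio multiplied by sqrt(3)/pi. The sketch below checks that relationship against the four rows; the scaling constant and the function name are assumptions inferred from the table, not a method stated in the report.

```python
import math

# Assumed scaling between the summary effectiveness score and the log odds
# ratio: log OR = (pi / sqrt(3)) * score. Inferred from the tabulated values.
LOGISTIC_SCALE = math.pi / math.sqrt(3)


def log_odds_ratio_from_score(score):
    """Convert a summary effectiveness score to a log odds ratio."""
    return score * LOGISTIC_SCALE


# (effectiveness score, log odds ratio) pairs as tabulated
table_rows = {
    "ASCUS/CIN1": (1.027, 1.863),
    "LSIL/CIN1": (1.084, 1.966),
    "LSIL/CIN2-3": (1.062, 1.926),
    "HSIL/CIN2-3": (1.287, 2.334),
}

for threshold, (score, tabulated_log_or) in table_rows.items():
    computed = log_odds_ratio_from_score(score)
    print(f"{threshold}: computed {computed:.3f}, tabulated {tabulated_log_or:.3f}")
```

Under this assumption each computed value matches the tabulated log odds ratio to three decimal places.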
| Threshold | Parameter | Estimate | 95% Confidence Interval |
|---|---|---|---|
| ASCUS/CIN1 | Prevalence | -0.704 | -1.772 to 0.364 |
| | Intercept | 1.346 | 1.058 to 1.634 |
| LSIL/CIN1 | Prevalence | -0.756 | -1.212 to -0.300 |
| | Intercept | 1.537 | 1.346 to 1.728 |
| LSIL/CIN2-3 | Prevalence | -0.906 | -1.483 to -0.330 |
| | Intercept | 1.401 | 1.292 to 1.510 |
| HSIL/CIN2-3 | Prevalence | -2.107 | -2.692 to -1.522 |
| | Intercept | 2.257 | 2.122 to 2.391 |
ASCUS = atypical squamous cells of undetermined significance; CIN = cervical intraepithelial neoplasia; LSIL = low grade squamous intraepithelial lesion; HSIL = high grade squamous intraepithelial lesion
Effectiveness score = intercept + β1 × prevalence, where β1 is the prevalence parameter estimate.
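To illustrate how the fitted model is read, the sketch below evaluates the regression equation with the tabulated intercepts and prevalence coefficients. It assumes prevalence is entered as a proportion between 0 and 1, which matches how prevalence is reported in the study-level table; the dictionary and function names are hypothetical.

```python
# (intercept, prevalence coefficient) per threshold, as tabulated in the report
MODELS = {
    "ASCUS/CIN1": (1.346, -0.704),
    "LSIL/CIN1": (1.537, -0.756),
    "LSIL/CIN2-3": (1.401, -0.906),
    "HSIL/CIN2-3": (2.257, -2.107),
}


def predicted_effectiveness(threshold, prevalence):
    """Effectiveness score = intercept + beta1 * prevalence.

    `prevalence` is assumed to be a proportion in [0, 1].
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    intercept, beta1 = MODELS[threshold]
    return intercept + beta1 * prevalence


for p in (0.05, 0.50):
    score = predicted_effectiveness("HSIL/CIN2-3", p)
    print(f"HSIL/CIN2-3 at prevalence {p:.2f}: {score:.3f}")
```

Because every prevalence coefficient is negative, the predicted score falls as study prevalence rises, most steeply for the HSIL/CIN2-3 threshold.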
We next examined the effect of the various determinants of study quality on effectiveness scores. We examined the presence of blinded evaluation of the screening test in a model that included prevalence for each of the four test threshold/reference standard threshold combinations. A significant effect was observed only for the HSIL/CIN2-3 threshold; the parameter estimate of 0.940 (95% CI, 0.455 to 1.425) was positive, indicating that studies employing blinded evaluation of cytology demonstrated better discrimination.
The effects of verification of test negative subjects and type of reference standard on between-study variation in effectiveness scores were difficult to assess together with prevalence because there were only a few studies and because of collinearity, leading to nonconvergence of maximum likelihood estimates for most thresholds. Because of these problems, we evaluated the effects of these variables separately.
We also evaluated the effect of the quality score on sensitivity, specificity, and effectiveness. Neither verification nor the numerical dichotomous quality score variables were significantly associated with the test operating characteristics. The type of reference standard was significantly associated with effectiveness only at the HSIL/CIN2-3 thresholds, where a histology reference standard led to improved effectiveness scores compared with a colposcopy reference standard.
To calculate estimates for Pap test performance that are applicable to a low-prevalence screened population we had two options. One option was to use predictions from the regression model that included prevalence, substituting appropriately low values for prevalence. Then from the resulting effectiveness score, we would need to choose an appropriate combination of sensitivity and specificity. The other option was to use the subset of studies that were conducted in low-prevalence screened populations and avoid workup bias.
We chose the latter option for two reasons. First, we suspected that the prevalence term was more an indicator of biases than of different prevalence in the underlying population of interest; thus, adjusting for "low prevalence" might have uncertain effects. Second, we had no clear rationale for choosing any particular combination of sensitivity and specificity from the joint distribution described by the adjusted effectiveness score that would result from the first option.
The following sections describe cost estimates for using the Pap smear (and other technologies) to screen for, diagnose, and treat cervical cancer and its precursors.
Estimating the costs of using the Pap smear to screen for cervical cancer is a complex process because costs need to be estimated for both obtaining and processing the smear. Estimates of these screening costs, and the methods used to obtain the estimates, are described below. Estimates of the total cost of Pap smear screening and of the cost of new screening techniques are also provided.
Several steps were taken in this study to estimate the cost of collecting cytology samples. In the past, studies have used either the cost of an entire office visit or the amount of reimbursement made for the sample collection as a proxy. This overestimates the cost, as Pap smears are usually performed as part of an annual gynecological exam, which involves other procedures as well. In this study, attempts were made to arrive at more accurate estimates. Only the cost of the proportion of the office visit that was directly attributable to obtaining a Pap smear was included. To do so, first the time spent by the physician and other health care staff in obtaining smears was estimated. Second, the cost per minute of office visit time was calculated. Finally, the time devoted to collecting the samples during the office visit was multiplied by the cost per minute to obtain the cost attributable to the collection process.
The physician's time spent to obtain the smears was estimated from the National Ambulatory Medical Care Survey (NAMCS). This survey provides data on ambulatory medical care provided in physicians' offices and is based on a sample of nationally representative patient visits. The survey contains information on age, race, and sex of the patient, organized by selected physician characteristics such as type of specialty and geographic location. Information provided about the clinical substance of the visit includes the patient's problem, the physician's diagnosis (ICD-9-CM), the diagnostic/screening services provided, the surgical procedures performed, and the duration of the visit. Health Economics Research, Inc., analyzed the 1992 NAMCS data, which contained 34,606 patient records from 1,558 doctors who participated in the survey.
The NAMCS data provide information on the amount of time (minutes) spent in direct face-to-face contact with the physician. Separate estimates of physician time were obtained for patients who were 20-64 years old and for those who were 65 years or older. A regression technique was employed to derive the proportion of time spent obtaining a Pap smear. The model used is shown below:
log(MINUTES) = B1 + B2 × PAP_TEST
where
PAP_TEST = 1 if a Pap test was performed, and 0 otherwise
The B1 value is an intercept, and the B2 value is the proportion of time spent obtaining the smears. A log distribution was used to model the skewed distribution of office visit minutes.
| Age (yr) | Percent of Time Spent Obtaining Pap Smear 1 (%) | × | Average Duration of Pap Smear Visit (minutes) | = | Physician Time Spent Obtaining Pap Smear (minutes) | + | Other Medical Staff Time Spent 2 (minutes) | = | Total Time Spent Obtaining Pap Smears (minutes) |
|---|---|---|---|---|---|---|---|---|---|
| Age 20-64 | 7.67 | × | 18.78 | = | 1.44 | + | 5.86 | = | 7.30 |
| Age 65 and older | 27.0 | × | 20.07 | = | 5.42 | + | 5.86 | = | 11.28 |
1 Percentage increase in time was obtained by econometric estimation: log(MINUTES) = B1 + B2 PAP_TEST, where PAP_TEST = 1 if performed; 0 otherwise.
2 Other medical staff time was calculated as total time minus physician time (7.30 - 1.44 = 5.86 minutes). For total time spent obtaining Pap smear, see Waugh, Smith, Robertson et al. (1996a).
SOURCE: Analysis by Health Economics Research, Inc., of data from the National Ambulatory Medical Care Survey.
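The arithmetic behind the table can be sketched as follows. The snippet multiplies the estimated percentage of visit time by the average visit duration to get physician minutes, then adds the 5.86 minutes of other staff time, using only the values reported in the table. The variable and function names are illustrative.

```python
OTHER_STAFF_MINUTES = 5.86  # other medical staff time, as reported in the table

# age group -> (percent of visit spent obtaining smear, average visit minutes)
visit_data = {
    "20-64": (7.67, 18.78),
    "65 and older": (27.0, 20.07),
}


def collection_minutes(percent_of_visit, visit_minutes):
    """Return (physician minutes, total minutes) for obtaining a Pap smear."""
    physician = round(percent_of_visit / 100 * visit_minutes, 2)
    total = round(physician + OTHER_STAFF_MINUTES, 2)
    return physician, total


times = {age: collection_minutes(*values) for age, values in visit_data.items()}

for age, (physician, total) in times.items():
    print(f"Age {age}: physician {physician:.2f} min, total {total:.2f} min")
```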
The NAMCS data only contain information on time the doctor spent with the patient, not on time the patient spent receiving care from someone else, for instance, a nurse or a technician. Therefore, time spent by other medical staff who assisted in obtaining the smears was estimated by subtracting physician time from total time, calculated by Waugh, Smith, Robertson et al. (1996a), for collecting smears from the under-65 population. When the time spent by other staff was included, the total time spent obtaining the smears was estimated as 7.30 minutes for the younger age group and 11.28 minutes for the older age group.
| CPT Code | Age 20-64: No. of Pap Visits | Age 20-64: % Smears | Age 20-64: Cost per Visit (MEDSTAT) | Age 20-64: Time (minutes) | Age 65 & Older: No. of Pap Visits | Age 65 & Older: % Smears | Age 65 & Older: Cost per Visit (MEDSTAT) | Age 65 & Older: Cost per Visit (Medicare) | Age 65 & Older: Time (minutes) |
|---|---|---|---|---|---|---|---|---|---|
| 99211 | 2,539 | 1.9 | $32.09 | 5 | 26 | 2.2 | $32.09 | $13.89 | 5 |
| 99212 | 10,651 | 8.0 | 38.59 | 10 | 41 | 3.5 | 38.59 | 24.85 | 10 |
| 99213 | 41,188 | 30.9 | 50.62 | 15 | 278 | 23.9 | 50.62 | 35.45 | 15 |
| 99214 | 44,059 | 33.1 | 67.21 | 25 | 366 | 31.5 | 67.21 | 54.83 | 25 |
| 99215 | 16,292 | 12.2 | 89.65 | 40 | 346 | 29.8 | 89.65 | 86.62 | 40 |
| 99201 | 690 | 0.5 | 54.63 | 10 | 8 | 0.7 | 54.63 | 28.88 | 10 |
| 99202 | 2,031 | 1.5 | 60.27 | 20 | 10 | 0.9 | 60.27 | 46.42 | 20 |
| 99203 | 6,486 | 4.9 | 75.01 | 30 | 23 | 2.0 | 75.01 | 63.60 | 30 |
| 99204 | 6,429 | 4.8 | 94.96 | 45 | 33 | 2.8 | 94.96 | 95.03 | 45 |
| 99205 | 2,725 | 2.0 | 113.06 | 60 | 30 | 2.6 | 113.06 | 119.16 | 60 |
| Average | -- | -- | $64.35 | 23.93 | -- | -- | $70.11 | $60.42 | 27.52 |
| Range | -- | -- | $50.62-$89.65 | 15-40 | -- | -- | $50.62-$89.65 | $35.45-$86.62 | 15-40 |
| Cost per minute | -- | -- | $2.70 | -- | -- | -- | $2.55 | $2.20 | -- |
| Cost per minute, range | -- | -- | $2.24-$3.37 | -- | -- | -- | $2.24-$3.37 | $2.16-$2.37 | -- |
NOTE: All cost estimates are reported in 1997 dollars. CPT = Current Procedural Terminology. SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data.
| CPT Code | Description | % of Smears | Age 20-64 MEDSTAT Costs | Age 65 & Older Medicare Payments |
|---|---|---|---|---|
| 88150 | Screening by technician under physician supervision | 80.7 | $18.75 | $8.33 |
| 88151 | Requiring interpretation by physician | 6.1 | $19.08 | $36.16 |
| 88155 | With definitive hormonal evaluation | 13.2 | $20.81 | $9.65 |
| -- | Average cost per smear | -- | $18.97 | $10.19 |
| -- | STD | -- | $6.94 | -- |
| -- | Range +/- STD | -- | $12.03-$25.91 | -- |
All cost estimates are reported in 1997 dollars.
CPT = Current Procedural Terminology; RBRVS = resource based relative value scale; STD = standard deviation.
SOURCE: MEDSTAT data analysis; Medicare's RBRVS fee schedule and Clinical Laboratory fee schedule.
| Age (yrs) | Total Time Spent Obtaining Pap Smear (minutes) | × | Cost per Minute of Office Visit Time | = | Office Visit Cost for Pap Smear | + | Pap Smear Lab Processing Cost | = | Total Cost of Performing Pap Smears |
|---|---|---|---|---|---|---|---|---|---|
| Age 20-64 | 7.30 | × | $2.70 | = | $19.71 | + | $18.97 | = | $38.68 |
| Range | -- | | $2.24-$3.37 | | $16.35-$24.60 | | -- | | $35.32-$43.57 |
| Age 65 and Older | | | | | | | | | |
| MEDSTAT | 11.28 | × | $2.55 | = | $28.76 | + | $18.97 | = | $47.73 |
| Range | -- | | $2.24-$3.37 | | $25.27-$38.01 | | -- | | $44.24-$56.98 |
| Medicare | 11.28 | × | $2.20 | = | $24.82 | + | $10.19 | = | $35.01 |
| Range | -- | | $2.16-$2.37 | | $24.36-$26.73 | | -- | | $34.55-$36.92 |
All cost estimates are reported in 1997 dollars.
SOURCE: Analysis by Health Economics Research, Inc., of National Ambulatory Medical Care Survey and MEDSTAT data, and Clinical Laboratory fee schedule.
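A short sketch of the calculation summarized in this table: collection time is multiplied by the cost per minute of office visit time, and the laboratory processing cost is added. All inputs are the averages reported in the table; the scenario labels and function name are illustrative, and rounding the office visit cost to cents before adding the laboratory cost is an assumption that reproduces the published totals.

```python
# scenario -> (minutes to obtain smear, cost per minute, lab processing cost)
scenarios = {
    "Age 20-64 (MEDSTAT)": (7.30, 2.70, 18.97),
    "Age 65 and older (MEDSTAT)": (11.28, 2.55, 18.97),
    "Age 65 and older (Medicare)": (11.28, 2.20, 10.19),
}


def pap_smear_cost(minutes, cost_per_minute, lab_cost):
    """Return (office visit cost, total cost) in 1997 dollars."""
    office_cost = round(minutes * cost_per_minute, 2)  # rounded to cents first
    total_cost = round(office_cost + lab_cost, 2)
    return office_cost, total_cost


costs = {name: pap_smear_cost(*inputs) for name, inputs in scenarios.items()}

for name, (office_cost, total_cost) in costs.items():
    print(f"{name}: office ${office_cost:.2f}, total ${total_cost:.2f}")
```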
| Screening Procedure | Office Visit Cost for Pap Test | Cost of Conventional Pap Smear | Cost of New Technology | Total Cost Estimate |
|---|---|---|---|---|
| Conventional Pap | | | | |
| Age 20-64, average | $19.71 | $18.97 | -- | $38.68 |
| Age 20-64, range | $16.35-$24.80 | -- | -- | $35.32-$43.77 |
| Age 65 and older (MEDSTAT), average | $28.76 | $18.97 | -- | $47.73 |
| Age 65 and older (MEDSTAT), range | $25.27-$38.01 | -- | -- | $44.24-$56.98 |
| Age 65 and older (Medicare), average | $24.82 | $10.19 | -- | $35.01 |
| Age 65 and older (Medicare), range | $24.36-$26.73 | -- | -- | $34.55-$36.92 |
| AutoPap® | | | | |
| Age 20-64, average | $19.71 | $18.97 | $7.58 | $46.26 |
| Age 20-64, range | $16.35-$24.80 | -- | $7.15-$8.00 | $42.47-$51.77 |
| Age 65 and older (MEDSTAT), average | $28.76 | $18.97 | $7.58 | $55.31 |
| Age 65 and older (MEDSTAT), range | $25.27-$38.01 | -- | $7.15-$8.00 | $51.39-$64.98 |
| Age 65 and older (Medicare), average | $24.82 | $10.19 | $7.58 | $42.59 |
| Age 65 and older (Medicare), range | $24.36-$26.73 | -- | $7.15-$8.00 | $41.70-$44.92 |
| Papnet® | | | | |
| Age 20-64, average | $19.71 | $18.97 | $15.00 | $53.68 |
| Age 20-64, range | $16.35-$24.80 | -- | $10.00-$18.00 | $45.32-$61.77 |
| Age 65 and older (MEDSTAT), average | $28.76 | $18.97 | $15.00 | $62.73 |
| Age 65 and older (MEDSTAT), range | $25.27-$38.01 | -- | $10.00-$18.00 | $54.24-$74.98 |
| Age 65 and older (Medicare), average | $24.82 | $10.19 | $15.00 | $50.01 |
| Age 65 and older (Medicare), range | $24.36-$26.73 | -- | $10.00-$18.00 | $44.55-$54.92 |
| ThinPrep® | | | | |
| Age 20-64, average | $19.71 | -- | $24.57 | $44.28 |
| Age 20-64, range | $16.35-$24.80 | -- | $15.00-$30.00 | $31.35-$54.80 |
| Age 65 and older (MEDSTAT), average | $28.76 | -- | $24.57 | $53.33 |
| Age 65 and older (MEDSTAT), range | $25.27-$38.01 | -- | $15.00-$30.00 | $40.27-$68.01 |
| Age 65 and older (Medicare), average | $24.82 | -- | $24.57 | $49.39 |
| Age 65 and older (Medicare), range | $24.36-$26.73 | -- | $15.00-$30.00 | $39.36-$56.73 |
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data from the National Ambulatory Medical Care Survey, and information from ThinPrep, AutoPap, and Papnet developers.
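The average totals in this table combine three components, which the sketch below makes explicit. AutoPap® and Papnet® costs are added on top of the conventional smear cost, whereas the ThinPrep® cost replaces it, as the blank conventional-smear cells indicate. The function and dictionary names are illustrative; the dollar values are the tabulated averages.

```python
NEW_TECHNOLOGY_COST = {"AutoPap": 7.58, "Papnet": 15.00, "ThinPrep": 24.57}
REPLACES_CONVENTIONAL = {"ThinPrep"}  # no separate conventional smear cost


def total_screening_cost(office_visit, conventional_lab, technology=None):
    """Average total cost per screen in 1997 dollars."""
    if technology is None:
        return round(office_visit + conventional_lab, 2)
    tech_cost = NEW_TECHNOLOGY_COST[technology]
    if technology in REPLACES_CONVENTIONAL:
        return round(office_visit + tech_cost, 2)
    return round(office_visit + conventional_lab + tech_cost, 2)


# Age 20-64: office visit $19.71, conventional lab processing $18.97
for tech in (None, "AutoPap", "Papnet", "ThinPrep"):
    label = tech or "Conventional Pap"
    print(f"{label}: ${total_screening_cost(19.71, 18.97, tech):.2f}")
```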
The costs to perform a ThinPrep® Pap test can be broken into the following categories for a clinical laboratory: supplies, preparation costs, evaluation costs, and indirect expenses. The laboratory supplies consist of the ThinPrep® Pap test kit that is distributed to clinicians along with stains and fixatives. The current retail price to laboratories for the ThinPrep® Pap test kit and related supplies is $9.75. Preparation costs include the ThinPrep® 2000 Processor cost of $45,000 as well as costs associated with personnel for preparing the slide for processing and disposing of waste after processing. Amortizing the cost of the ThinPrep® 2000 Processor over a 5-year period and assuming that 40,000 slides are processed annually yields an estimated $0.22 equipment cost.
The last two categories of costs, evaluation and indirect expenses, as well as the personnel costs for preparing the slide for processing and disposing of waste, are essentially equivalent between the ThinPrep® technology and the conventional Pap smear technology. A recent study conducted by the College of American Pathologists (CAP) estimated these costs for conventional Pap smears to be approximately $14.60. Thus, a total cost estimate of $24.57 can be derived by summing the components.
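The component sum can be checked with a few lines of Python. Straight-line amortization of the $45,000 processor over 5 years and 40,000 slides per year gives $0.225 per slide; truncating to the cent reproduces the reported $0.22, and that rounding mode is an assumption rather than something the report states. `Decimal` is used so that the cents add exactly.

```python
from decimal import Decimal, ROUND_DOWN

PROCESSOR_COST = Decimal("45000")
AMORTIZATION_YEARS = 5
SLIDES_PER_YEAR = 40000

SUPPLIES_PER_SLIDE = Decimal("9.75")           # ThinPrep kit and related supplies
EVALUATION_AND_INDIRECT = Decimal("14.60")     # CAP estimate for shared costs

# Equipment cost per slide under straight-line amortization
equipment_exact = PROCESSOR_COST / (AMORTIZATION_YEARS * SLIDES_PER_YEAR)

# Assumption: truncate to cents, which matches the reported $0.22
equipment_per_slide = equipment_exact.quantize(Decimal("0.01"), rounding=ROUND_DOWN)

thinprep_total = SUPPLIES_PER_SLIDE + equipment_per_slide + EVALUATION_AND_INDIRECT

print(f"Equipment cost per slide: ${equipment_per_slide}")
print(f"Total ThinPrep cost per slide: ${thinprep_total}")
```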
This estimate, however, excludes the cost savings attributed to a reduction in the number of Pap smears that may be repeated because the specimen is determined to have a "Satisfactory but limited by" (SBLB) diagnosis. The manufacturer of ThinPrep® argues that specimen quality is significantly improved with this new technology, and that it can reduce the number of specimens labeled SBLB (currently estimated to be 40 percent) as well as the number of repeat Pap smears that result from this labeling. Cytyc Corporation estimates a 50 percent reduction in the number of specimens labeled as SBLB; however, the number of women who return for a repeat Pap smear as a result of an SBLB diagnosis is not well known. The cost estimate of $24.57 does not include any adjustments to reflect these potential cost savings. Ignoring these potential savings will tend to favor conventional Pap smears over the new ThinPrep® technology; it is therefore a conservative assumption.
However, the cost analysis has been conducted from the perspective of the health care system, with the majority of costs estimated using payments or allowed charges as a proxy for costs. Thus, it seems reasonable to take a similar approach to the estimation of costs for the new technologies. Cytyc Corporation estimates from a large number of private insurers that the average payment to laboratories for primary screening using the ThinPrep® technology ranges from $15.00 to $30.00 and includes all nonoffice costs associated with the processing and evaluation of the ThinPrep® smears, including pathologist fees for reviewing abnormal slides. Because Medicare does not currently reimburse laboratories at a different level from conventional Pap smears when ThinPrep® smears are performed, we do not have current payment estimates for the Medicare population. Thus, the ThinPrep® payment rates from the private sector will be used as an estimate for the Medicare population as well.
The incremental costs associated with the Papnet® technology may be broken down into the following categories: the costs of preparation and shipment of the "normal" slides to a central facility for processing, the costs associated with the Papnet® neural network computer system's processing of the slide at a central facility, the costs of a viewing station to review the data shipped back to the laboratory on compact disks, and the costs of triaging and reviewing selected slides by the cytotechnologist. Neuromedical Systems Inc. (NSI) estimates that the average cost to laboratories for these components ranges from $12.00 to $18.00 depending on the laboratory's volume. These costs are in addition to the cytotechnologist and pathologist costs associated with the manual review of normal and abnormal slides and the costs of supplies for obtaining and processing the sample, which are already incurred with conventional Pap smears.
The incremental costs associated with the AutoPap® technology may be broken down into the following categories: equipment, software, service, and software licensing. NeoPath, Inc., estimates the total average cost to laboratories for these four cost components to be $4.50, with a range from $3.75 to $4.50. These costs are in addition to the cytotechnologist and pathologist costs associated with manual review of normal and abnormal slides and the costs of supplies for obtaining and processing the sample, which are already occurring with conventional Pap smears. When AutoPap® is used for quality control purposes, there are no additional incremental costs or savings associated with labor. In contrast, when it is used as a primary screening vehicle, NeoPath, Inc., estimates that there is an overall reduction in cytotechnologist labor of about 20 percent across all slides. The reduction is a function of fewer slides being processed for review by the laboratory's cytotechnologist.
This savings in cytotechnologist labor can be demonstrated through a simple example. Assume a laboratory currently processes 100 patient slides using the conventional technique of manual review of all 100 slides by a cytotechnologist. The cytotechnologist must actually review 110 slides because of the CLIA quality control requirement of a 10 percent re-review. In contrast, the AutoPap® scores all 100 slides and archives 25 percent of them as the "most normal" of all slides; these slides receive no manual review by the cytotechnologist. The remaining 75 slides are reviewed manually, and 15 slides are re-reviewed to meet CLIA quality control standards (set at 15 percent for AutoPap® versus 10 percent for conventional manual review). Thus, the total number of slides undergoing manual review by the cytotechnologist is 90 when the AutoPap® technology is used versus 110 when conventional manual review is done, a reduction of about 18 percent. Ignoring this labor savings will tend to favor conventional Pap smears relative to AutoPap®.
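The slide counts in this example can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are introduced here for illustration only, and the percentages are those stated in the example. As in the example, the AutoPap® quality control re-review is taken as a fraction of all slides scored.

```python
def slides_reviewed_conventional(n_slides, qc_fraction=0.10):
    """Manual reviews when every slide is read and a fraction is re-reviewed."""
    return n_slides + round(n_slides * qc_fraction)


def slides_reviewed_autopap(n_slides, archived_fraction=0.25, qc_fraction=0.15):
    """Manual reviews when the 'most normal' slides are archived unread.

    The quality control re-review is computed on all slides scored,
    which is how the example in the text arrives at 15 slides.
    """
    archived = round(n_slides * archived_fraction)
    return (n_slides - archived) + round(n_slides * qc_fraction)


conventional = slides_reviewed_conventional(100)
autopap = slides_reviewed_autopap(100)
reduction = 1 - autopap / conventional  # relative reduction in manual reviews

print(conventional, autopap, round(reduction, 3))
```

With 100 slides this gives 110 manual reviews for the conventional technique and 90 with AutoPap®, a relative reduction of roughly 18 percent, which is in line with the manufacturer's estimate of about 20 percent.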
However, the cost analysis has been conducted from the perspective of the health care system, with the majority of costs estimated using payments or allowed charges as a proxy for costs. Thus, it seems reasonable to take a similar approach to the estimation of costs for the new technologies. Based on a large number of insurers, NeoPath, Inc., estimates that the average payment to laboratories for quality control screening using the AutoPap® technology ranges from $7.15 for public insurers to $8.00 for private insurers. This payment is in addition to payment for the conventional Pap smear, which includes all nonoffice costs associated with the processing and evaluation of the conventional Pap smears, including pathologist fees for reviewing abnormal slides. Because Medicare does not currently reimburse laboratories at a different level from conventional Pap smears when AutoPap® is used to identify slides for review, we do not have current payment estimates for the Medicare population. Thus, the AutoPap® payment rates from the non-Medicare public and private sector will be used as an estimate for the Medicare population as well.
A claims analysis was performed to obtain the cost of diagnostic procedures and treatments for women 20-64 years old. Because no primary data analysis was undertaken for patients 65 years old and older, Medicare's RBRVS fee schedule, Clinical Laboratory fee schedule, diagnosis-related group payment rates, and ambulatory surgery center payment rates were reported.
All outpatient and inpatient claims for women (20-64 years old) with ICD-9 diagnosis codes for cervical cancer, carcinoma in situ, and dysplasia were included in the sample analyzed. The specific diagnosis codes used are shown below:
| ICD-9 Code | Description |
|---|---|
| 180.x | Neoplasm of cervix, primary (malignant) |
| 233.1 | Carcinoma in situ of cervix |
| 622.1 | Dysplasia of cervix |
Pregnant women were excluded from the analysis, as costs associated with diagnosing and treating this group differ from those of the general population. Cases with only an ICD-9 code of 180.8, "other specified sites of cervix," and no other relevant diagnostic codes (180.0, 180.1, or 180.9) were excluded from the analysis as they may have indicated cases requiring specialized treatment. Very few cases were excluded because of this selection criterion.
Costs for procedures performed on an inpatient and on an outpatient basis are provided. Certain procedures are provided only on an inpatient basis (for example, exenteration), though others (colposcopy, cone biopsy) may be performed in either an inpatient or an outpatient setting. For the procedures in the latter category, the appropriate setting or the setting where the procedure was most often performed for women with cervical cancer was utilized to derive the estimates. Most of these procedures were estimated using outpatient (physician and hospital outpatient department) costs. For procedures where both inpatient and outpatient settings were deemed reasonable, the cost of the procedure was estimated for both the outpatient and the inpatient setting.
The analysis was limited to those individuals with a single procedure on a given day or admission to minimize differences related to multiple procedures. Thus, individuals who had both cone biopsy and LEEP performed on the same day would not have been included in the estimation because accurate allocation of costs to each specific procedure was not always possible. For outpatient procedures, in addition to the costs specifically related to the procedure (identified by the procedure code), we also estimated the cost of related services inherent in the performance of the procedure. Therefore, we evaluated all other outpatient services that were performed on the same day to determine if they were appropriate to include in our cost definitions. For instance, costs associated with anesthesia were added to the cost estimates to capture the total cost of performing the procedure. Supplementing procedure-specific costs with other related services provided allowed for a more comprehensive estimate of the costs associated with the procedure. In addition, all separately billed pathology services (CPT codes 88305 and 88307) were included in the final procedure cost.
| Procedure | No. of Obs. (Age 20-64) | MEDSTAT Average Cost (Age 20-64) | Standard Deviation | 25% Q1 | 50% Med. | 75% Q3 | Medicare Payments (Age 65 and Older) |
|---|---|---|---|---|---|---|---|
| Colposcopy | |||||||
| Exploration | 5,412 | $142.63 | 73.32 | $81 | $130 | $179 | $68.60 |
| With Biopsy | 16,878 | $276.57 | 101.30 | $195 | $267 | $351 | 172.63 |
| Cervical Biopsy | 1,266 | $189.81 | 121.64 | $115 | $164 | $217 | 131.46 |
All costs presented are in 1997 dollars.
Med. = median; Obs. = observations; Q1 = quartile 1; Q3 = quartile 3; RBRVS = resource based relative value scale.
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data and Medicare's RBRVS fee schedule.
| Treatment | No. of Obs. (Age 20 to 64) | MEDSTAT Average Cost (Age 20 to 64) | Standard Deviation | Days | 25% Q1 | 50% Med. | 75% Q3 | Medicare Payments (Age 65 and Older) |
|---|---|---|---|---|---|---|---|---|
| 1. Cryotherapy | 5,659 | $156 | $63 | -- | $108 | $142 | $184 | $111 |
| 2. Loop electrode excision procedure | 1,972 | 564 | 268 | -- | 352 | 542 | 710 | 205 |
| 3. Laser ablation | 1,393 | 730 | 457 | -- | 390 | 650 | 976 | 627 |
| 4. Cone biopsy | ||||||||
| Outpatient | 4,856 | 919 | 443 | -- | 623 | 821 | 1,114 | 831 |
| Inpatient | 10 | 4,052 | 2,200 | 1.40 | 2,640 | 3,268 | 4,564 | 2,459 |
| 5. Simple hysterectomy | 518 | 5,899 | 2,348 | 3.05 | 4,364 | 5,407 | 6,934 | 5,694 |
| 6. Radical hysterectomy | 110 | 13,077 | 6,395 | 6.54 | 8,688 | 11,453 | 15,105 | 8,535 |
| 7. Exenteration | 9 | 51,413 | 61,663 | 21.78 | 17,011 | 19,373 | 53,261 | 9,370 |
| 8. Radiation therapy | ||||||||
| Inpatient (per admission) | | | | | | | | 3,653 |
| STAGE I | 72 | 3,766 | 1,631 | 2.38 | 2,616 | 3,397 | 4,537 | -- |
| STAGE II-III | 17 | 3,883 | 2,335 | 2.82 | 2,439 | 3,861 | 4,363 | -- |
| STAGE IV | 14 | 7,074 | 5,810 | 7.36 | 3,332 | 4,695 | 10,838 | -- |
| Outpatient (per service) | 4,531 | 410 | 468 | -- | 133 | 258 | 526 | 196² |
| 9. Chemotherapy | ||||||||
| Inpatient (per admission) | | | | | | | | 2,523 |
| All cases | 51 | 5,226 | 4,970 | 3.63 | 2,253 | 3,492 | 6,228 | -- |
| Less than 5 days | 39 | 3,543 | 1,676 | 2.64 | 2,212 | 3,246 | 3,932 | -- |
| Outpatient (per service) | 2,756 | 191 | 241 | -- | 56 | 105 | 222 | 56³ |
All costs presented are in 1997 dollars.
² Medicare payment for most common radiation treatment received by women with cervical cancer -- weekly therapy (CPT code 77430).
³ Medicare payment for most common chemotherapy treatment received by women with cervical cancer -- infusion technique (CPT code 96410). This estimate does not include the cost of the chemotherapeutic agent.
CPT = Current Procedural Terminology; DRG = diagnosis-related group; Med. = median; Obs. = observations; Q1 = quartile 1; Q3 = quartile 3; RBRVS = resource based relative value scale.
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data and Medicare's RBRVS fee schedule/DRG payment.
| Procedures | No. of Obs. (Age 20-64) | MEDSTAT Average Cost (Age 20-64) | Standard Deviation | 25% Q1 | 50% Med. | 75% Q3 | Medicare Payments (Age 65 and Older) |
|---|---|---|---|---|---|---|---|
| Endocervical curettage | 2,546 | 98 | 75 | $49 | $81 | $126 | $71 |
| Dilation and curettage | 925 | 252 | 73 | $163 | $280 | $322 | 642 |
| Dilation of cervical canal | 97 | 119 | 88 | $75 | $82 | $136 | 52 |
| Endometrial sampling (biopsy) | 2,833 | 199 | 6 | $162 | $199 | $236 | 58 |
| Cauterization of cervix | 703 | 178 | 247 | $55 | $108 | $201 | 95 |
| Pelvic exam under anesthesia | 199 | 201 | 170 | $81 | $184 | $341 | 136 |
| Chest X-ray | 5,534 | 43 | 58 | $24 | $31 | $48 | 35 |
| CAT scan | 2,104 | 312 | 248 | $170 | $193 | $379 | 271 |
| Barium enema | 1,204 | 108 | 56 | $56 | $107 | $132 | 101 |
| Bone scan | 23 | 236 | 251 | $56 | $143 | $261 | 122² |
| Magnetic resonance imaging | 156 | 563 | 453 | $314 | $490 | $916 | 497 |
| Intravenous pyelogram | 7 | 339 | 267 | $105 | $248 | $477 | 149 |
| Cystoscopy | 28 | 982 | 851 | $226 | $674 | $1,546 | 184³ |
| Proctoscopy | 7 | 774 | 543 | $335 | $672 | $1,179 | 196³ |
All costs presented are in 1997 dollars.
² This estimate does not include the cost of the scanning agent.
³ These estimates reflect only costs associated with the professional component.
Med. = median; Obs. = observations; Q1 = quartile 1; Q3 = quartile 3; RBRVS = resource based relative value scale; CAT = computerized axial tomography.
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data and Medicare's RBRVS fee schedule.
| Stage | Primary ICD-9 Codes | Secondary ICD-9 Codes |
|---|---|---|
| LSIL | 622.1 | No diagnosis indicating cancer or CIS |
| HSIL | 233.1 | No diagnosis indicating cancer |
| Stage I | 180.0, 180.1, 180.9 | No diagnosis indicating other forms of cancer |
| Stage II | 180.0, 180.1, 180.9 | 198.82, 198.6, 179, 181, 182.x, 183.x, 184.x |
| Stage III | 180.0, 180.1, 180.9 | 198.82, 198.6, 179, 181, 182.x, 183.x, 184.x and either of these codes 593.5, 599.6, 593.89, 591 |
| Stage IV | 180.0, 180.1, 180.9 | 199.1, 199.0, 140.x - 176.x, 188.x - 195.x, 197.x , 198.0, 198.1, 198.2, 198.3, 198.4, 198.5, 198.7, 198.81, 198.89 |
HSIL = high-grade squamous intraepithelial lesions; ICD-9 = International Classification of Diseases, version 9; LSIL = low-grade squamous intraepithelial lesions; CIS = Carcinoma in situ
SOURCE: ICD-9-CM Code Book, St. Anthony Publishing.
| Lesions/Stage of Disease | No. of Obs. | Average Inpatient Cost | Average Outpatient Cost | Average Total Cost | Hospital Days |
|---|---|---|---|---|---|
| LSIL | 9,329 | -- | $1,728 | $1,728 | -- |
| HSIL | |||||
| With outpatient services only | 1,257 | -- | 3,049 | 3,049 | -- |
| With outpatient & inpatient services | 337 | $6,352 | 2,806 | 9,159 | 3.17 |
| Stage I | 248 | 10,948 | 6,697 | 17,645 | 6.31 |
| Stage II/III | |||||
| Only Stage II/III costs | 39 | 20,298 | 6,771 | 27,069 | 9.59 |
| Stages I and II/III | 54 | 19,734 | 11,439 | 31,173 | 13.17 |
| Stage IV | |||||
| Only Stage IV costs | 17 | 31,115 | 9,164 | 40,280 | 20.29 |
| All stages (final diagnosis Stage IV) | 70 | 37,258 | 19,552 | 56,810 | 28.46 |
All costs presented are in 1997 dollars; costs are associated with women ages 20-64 years old diagnosed with cervical cancer.
HSIL = high-grade squamous intraepithelial lesions; LSIL = low-grade squamous intraepithelial lesions; Obs. = observations.
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data.
| Lesions/Stage of Disease | Total Cost | Standard Deviation | 25% Q1 | 50% Med. | 75% Q3 |
|---|---|---|---|---|---|
| LSIL | $1,728 | $1,498 | $675 | $1,143 | $2,274 |
| HSIL | |||||
| With outpatient services only | 3,049 | 2,033 | 1,384 | 2,733 | 4,203 |
| With outpatient & inpatient services | 9,159 | 4,207 | 6,710 | 8,556 | 11,540 |
| Stage I | 17,645 | 17,572 | 9,439 | 14,005 | 21,946 |
| Stage II/III | |||||
| Only Stage II/III costs | 27,069 | 55,953 | 11,734 | 15,392 | 29,089 |
| Stages I and II/III | 31,173 | 51,231 | 13,303 | 22,631 | 32,858 |
| Stage IV | |||||
| Only Stage IV costs | 40,280 | 53,390 | 12,670 | 17,772 | 46,509 |
| All stages (final diagnosis Stage IV) | 56,810 | 43,930 | 27,595 | 50,705 | 75,350 |
All costs presented are in 1997 dollars; costs are associated with women 20-64 years old diagnosed with cervical cancer.
HSIL = high-grade squamous intraepithelial lesions; LSIL = low-grade squamous intraepithelial lesions; Med. = median; Q1 = quartile 1; Q3 = quartile 3.
SOURCE: Analysis by Health Economics Research, Inc., of MEDSTAT data.
| Study | Intervention | Study Design | Economic Impact? | Health Impact? | Cost-Effective? |
|---|---|---|---|---|---|
| Barnes, 1981 | Screening | Mathematical model | Yes | Yes | Yes |
| Barnes and Barnes, 1977 | Screening | Mathematical model | Yes | Yes | Cost-benefit |
| Bergstrom, Adami, Gustafsson et al., 1993 | Screening | Population-based cohort | - | Yes | - |
| Bethwaite, Rayner, and Bethwaite, 1986 | Screening interval | Mathematical model | Yes | Yes | Yes |
| Boon, de Graaff Guilloud, 1981 | Screening | Mathematical model | Yes | Yes | Yes |
| Brown and Garber, 1998 | Screening test (Papnet® vs. AutoPap® vs. ThinPrep® vs. conventional Pap) | Mathematical model | Yes | Yes | Yes |
| Carter, Coburn, and Luszczak, 1993 | Screening in pregnancy | Mathematical model | Yes | Yes | Yes |
| Cecchini, Bonardi, Iossa et al., 1997 | Various screening strategies | Mathematical model | Yes | Yes | Yes |
| Chesebro and Everett, 1996 | Treatment of abnormal Pap | Mathematical model | Yes | Yes | Yes |
| Dickinson, 1972 | Screening | Mathematical model | Yes | Yes | Yes |
| Eddy, 1990 | Screening intervals, age of onset | Mathematical model | Yes | Yes | Yes |
| Fahs, Mandelblatt, Schechter et al., 1992 | Screening intervals | Mathematical model | Yes | Yes | Yes |
| Johnson, Sutton, Thornton et al., 1993 | Treatment of abnormal Pap | Mathematical model | - | Yes | - |
| Koopmanschap, van Oortmarssen, van Agt et al., 1990 | Screening program (organized vs. spontaneous) | Mathematical model | Yes | Yes | Yes |
| Luce, 1981 | Screening intervals | Mathematical model | Yes | Yes | Yes |
| Mandelblatt and Fahs, 1988 | Screening | Mathematical model | Yes | Yes | Yes |
| Mandelblatt, Freeman, Winczewski et al., 1997 | Opportunistic screening in emergency room | Mathematical model | Yes | Yes | Yes |
| Matsunaga, Tsuji, Sato et al., 1997 | Screening | Mathematical model | Yes | Yes | Yes |
| Melnikow, Nuovo, and Paliescheskey, 1996 | Treatment of abnormal Pap | Mathematical model | Yes | Yes | Yes |
| Prabhakar, 1992 | Screening | Mathematical model | Yes | Yes | Yes |
| Raab, 1997 | Screening, rescreening strategies | Mathematical model | Yes | Yes | Yes |
| Radensky and Mango, 1998 | Screening and rescreening of negatives | Mathematical model | Yes | Yes | Yes |
| Richart and Barron, 1981 | Screening intervals | Mathematical model | - | Yes | - |
| Schechter, 1996 | Papnet® vs. conventional Pap screening | Mathematical model | Yes | Yes | Yes |
| Schweitzer, 1974 | Screening | Mathematical model | Yes | Yes | Yes |
| Schweitzer and Luce, 1978 | Screening intervals | Mathematical model | Yes | Yes | Yes |
| Sherlaw-Johnson, Gallivan, Jenkins et al., 1994 | Treatment of abnormal Pap | Mathematical model | Yes | Yes | Yes |
| van Ballegooijen, Habbema, van Oortmarssen et al., 1992 | Screening program (organized vs. spontaneous) | Mathematical model | Yes | Yes | Yes |
| van Ballegooijen, van den Akker-van Marle, Warmerdam et al., 1997 | Screening (Pap vs. Pap plus human papillomavirus) | Mathematical model | Yes | Yes | Yes |
| van Nagell, Dudik, Frank et al., 1987 | Screening | Mathematical model | Yes | Yes | Yes |
| van Oortmarssen, Habbema, and van Ballegooijen, 1992 | Screening | Mathematical model | Yes | Yes | Yes |
| van Oortmarssen and Habbema, 1995 | Screening | Mathematical model | - | Yes | - |
| Waugh and Robertson, 1996b | Screening interval | Mathematical model | Yes | Yes | Yes |
| Waugh, Smith, Robertson et al., 1996 | Screening program (organized vs. spontaneous) | Mathematical model | Yes | Yes | Yes |
Alternative screening programs or strategies addressed by the studies included the following:
Pap screening compared with no use of Pap screening
Screening intervals for repeating Pap smear screening
Rescreening of negative Pap smears
Management of women found to have abnormal Pap smears
The studies used various approaches to address the question of the impact of Pap screening on health and economic outcomes. Some studies used cost-benefit analysis, and others used Markov modeling to estimate cost-effectiveness. Some studies used data from actual populations, and others used hypothetical cohorts combined with incidence and prevalence data. The best-quality models (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992) are the most often cited, and both have been reused for updated analyses of related questions. Eddy's (1990) model initially addressed the age of onset for Pap screening and Pap screening intervals; it was reused for the Blue Cross and Blue Shield Association's Technology Evaluation Center analysis of Papnet®, AutoPap®, and ThinPrep® (Brown and Garber, 1998) and for Radensky and Mango's (1998) analysis of Papnet® rescreening versus conventional Pap testing. The analysis by Fahs, Mandelblatt, Schechter et al. (1992), based on a 1990 report for the U.S. Congress Office of Technology Assessment (OTA), was reused for the analysis by Schechter (1996) of Papnet® cost-effectiveness.
A series of comprehensive models of the cost-effectiveness of cervical cancer screening was initiated with a 1981 OTA report (U.S. Congress Office of Technology Assessment, 1981). Subsequently, efforts by Eddy (1990), Mandelblatt and Fahs (1988), and OTA (1990) (and a related study by Fahs, Mandelblatt, Schechter et al., 1992) addressed similar questions in somewhat different populations and perspectives. Eddy's target population for initial screening was asymptomatic 20-year-old women of average risk, whereas Fahs, Mandelblatt, Schechter et al. (1992) assessed women over the age of 65. Both analyses were conducted from the payer's perspective. All of these models used Markov processes, with the modification that transition probabilities change as patients or population age. Of all the models, Eddy's is considered to be the most general.
There are significant discrepancies in the results of different models of cervical cancer screening. For example, both Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) reported the marginal cost-effectiveness of triennial Pap smear screening for women age 65 or over under two conditions: (1) assuming no prior screening and (2) assuming previous regular screening. Triennial Pap screening beginning at age 65 in women with no prior screening was estimated to cost $22,448 per year of life saved, according to Eddy (1990), but was actually cost-saving in the Fahs, Mandelblatt, Schechter et al. (1992) analysis. For screening in women with prior regular screening, Eddy and Fahs et al. estimated different marginal cost-effectiveness ratios: $52,241 per year of life saved versus $33,572 per year of life saved, respectively (Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992). It is difficult to assess which differences in model assumptions account for the differing results. However, the sensitivity of Pap smears is known to strongly influence cost-effectiveness ratios. Eddy used 85 percent sensitivity in his base case analysis, whereas Fahs et al. used 75 percent sensitivity in their model. Another principal difference between Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) was whether the models assessed a change in screening practices, which might yield a large, one-time benefit by detecting prevalent cases in an unscreened population, or the ongoing benefit one might see under a continuing policy. By attempting to recreate the results of Eddy (1990) and Fahs, Mandelblatt, Schechter et al. (1992) with the model created for this report, we were able to identify other differences, which are discussed in greater detail below.
Two analyses have addressed the cost-effectiveness of Papnet®, which provides interactive neural-network rescreening, versus unassisted manual interpretation of Pap smears (Schechter, 1996; Radensky and Mango, 1998). Schechter (1996) revised a previously published model constructed for OTA that examined the cost-effectiveness of implementing Pap screening as a Medicare benefit (Fahs, Mandelblatt, Schechter et al., 1992; U.S. Congress Office of Technology Assessment, 1990). Parameters in the model were updated to reflect the general U.S. population of women undergoing Pap testing, and a Papnet® strategy was added using information on the sensitivity of the new technology supplied by the manufacturer. The primary result of this study, $48,474 per life-year saved for biennially screened women, is given in terms of the overall cost-effectiveness ratio (which compares Papnet® screening to no screening) rather than in terms of the conventional marginal (or incremental) cost-effectiveness ratio (which would require comparing Papnet® screening with conventional Pap screening). This makes it appear that Papnet® is highly cost-effective when, in fact, most of the benefit comes from conventional Pap testing; the incremental cost-effectiveness ratio for Papnet®-assisted testing compared with conventional Pap testing is much higher.
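The distinction between an overall (average) ratio and a marginal (incremental) ratio can be illustrated numerically. The sketch below uses made-up costs and life expectancies chosen only to show the effect; none of the numbers come from Schechter (1996) or from this report, and the function name is introduced here for illustration.

```python
def cost_effectiveness_ratio(cost_a, effect_a, cost_b, effect_b):
    """Extra cost per extra life-year of strategy A relative to comparator B."""
    return (cost_a - cost_b) / (effect_a - effect_b)


# HYPOTHETICAL (lifetime cost in dollars, life expectancy in years) per woman.
no_screening = (0.0, 25.000)
conventional_pap = (500.0, 25.050)
new_technology = (700.0, 25.052)

# Overall ratio: new technology compared with no screening at all.
overall = cost_effectiveness_ratio(*new_technology, *no_screening)

# Incremental ratio: new technology compared with conventional Pap screening.
incremental = cost_effectiveness_ratio(*new_technology, *conventional_pap)

print(round(overall), round(incremental))
```

With these illustrative inputs the overall ratio is far lower than the incremental ratio, because almost all of the gain in life expectancy is already achieved by conventional screening.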
Radensky and Mango (1998) adapted Eddy's (1990) model to estimate the marginal cost-effectiveness ratio of using interactive neural network-assisted (INNA) rescreening of negative smears compared with unassisted manual examination of cervical smears at a frequency of every 3 years. The cost-effectiveness ratios ranged from $39,087 to $79,440, depending on the sensitivity of the Pap smear examination and on the specificity of INNA rescreening. More frequent screening intervals than every 3 years were not examined in this analysis.
Brown and Garber (1998) performed a cost-effectiveness analysis of Papnet®, AutoPap®, and ThinPrep® for cervical cancer screening for the Blue Cross and Blue Shield Association's Technology Evaluation Center. This analysis was based upon the model by Eddy (1990). Results from this model show that Papnet®, AutoPap®, and ThinPrep® modestly improve on the diagnostic accuracy of conventional Pap screening. These screening technologies are likely to increase life expectancy by only 1 or 2 days for most women and only by a few hours for those women who already get Pap tests frequently. The marginal cost-effectiveness ratios of the new technologies were found to exceed $80,000 per additional year of life saved with biennial screening and to be far greater with annual screening. However, the cost-effectiveness ratios are sensitive to the costs of each technology, sensitivity of the initial test, prevalence of cervical cancer, and proportion of tumors that grow rapidly.
For the cost-effectiveness analysis, we compared conventional Pap smears at 1-, 2-, and 3-year intervals, with 10 percent manual rescreening of smears initially read as normal, with two types of technologies: one that improves the sensitivity of the initial screening step (10 percent of initially normal smears are rescreened), and one in which the sensitivity of the initial screening step is unchanged but 100 percent of initially normal smears are rescreened with improved sensitivity.
We attempted to define thresholds of cost, sensitivity, and specificity at which improved initial or rescreening technology would produce cost-effectiveness ratios of $50,000 per life-year or less. Because of the significant uncertainty surrounding both the effectiveness and the incremental costs of the new technologies, we did not attempt to estimate cost-effectiveness ratios for any specific technology.
The section begins with tables illustrating the effect of increasing conventional Pap sensitivity. We examine the effect of reducing the false negative rate by 40 percent, 60 percent, and 90 percent, using any technology, on life expectancy, cases of cancer, cancer deaths, and morbid treatments predicted by the model at 1-, 2-, and 3-year screening intervals. We also present estimates of the cumulative number of cases and deaths predicted for various age groups with each alternative screening strategy.
Because increasing the frequency of Pap smear screening is a legitimate option for increasing the sensitivity of a screening program, we present the results as incremental costs per life-year saved, comparing all possible combinations of technology and frequency.
The sensitivity of a test is defined as the conditional probability that, given the presence of disease, the test will be positive. In the setting of our model, where the presence of disease is defined by the various Markov states, the sensitivity is equivalent to the true positive rate. The false negative rate (FNR) is equal to the quantity:
FNR = 1 - sensitivity.
Because modeling probabilities is easiest with variables between 0 and 1, we expressed the potential improvement in Pap test performance as a relative reduction, r, in the FNR, so that the gain in sensitivity is:
r X (1 - sensitivity).
A reduction in the FNR is equivalent to improving sensitivity within the context of the model. The two expressions are used interchangeably throughout this section of the report.
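The equivalence between a reduction in the FNR and an improvement in sensitivity can be checked directly. The following is a minimal sketch; the function names and the symbol for the relative reduction are introduced here for illustration, and the inputs are the base case sensitivity (0.51) and the reductions examined in this section (0.4, 0.6, and 0.9).

```python
def reduced_fnr(sensitivity, relative_reduction):
    """False negative rate after a relative reduction is applied."""
    fnr = 1.0 - sensitivity
    return (1.0 - relative_reduction) * fnr


def improved_sensitivity(sensitivity, relative_reduction):
    """Sensitivity implied by the reduced FNR.

    Algebraically equal to sensitivity + relative_reduction * (1 - sensitivity).
    """
    return 1.0 - reduced_fnr(sensitivity, relative_reduction)


for r in (0.4, 0.6, 0.9):
    print(r, round(improved_sensitivity(0.51, r), 3))
```

For the base case sensitivity of 0.51, reductions of 0.4, 0.6, and 0.9 correspond to sensitivities of about 0.706, 0.804, and 0.951.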
| Parameter | Base Case | Range | References |
|---|---|---|---|
| Prevalence of HPV infection, age 15 | 0.10 | 0-0.25 | Assumption |
| Prevalence of LSIL, age 15 | 0.01 | 0-0.1 | Assumption |
| Age-specific incidence of HPV infection | | | Kiviat, 1996; Koutsky, 1997; Ho, Bierman, Beardsley et al., 1998; Hildesheim, Schiffman, Gravitt et al., 1994 |
| 15 | 0.1 | 0.5-2 X base estimate | |
| 16 | 0.1 | ||
| 17 | 0.12 | ||
| 18 | 0.15 | ||
| 19 | 0.17 | ||
| 20 | 0.15 | ||
| 21 | 0.12 | ||
| 22 | 0.10 | ||
| 23 | 0.10 | ||
| 24-29 | 0.05 | ||
| 30-49 | 0.01 | ||
| 50 + | 0.005 | ||
| Age-specific regression rate, HPV infection (HPV to Well) | | | Ho, Bierman, Beardsley et al., 1998; Moscicki, Shiboski, Broering et al., 1998; Koutsky, Holmes, Critchlow et al., 1992 |
| 15-24 | 0.7/18 months | 0.6-0.9 | |
| 25-29 | 0.5/18 months | 0.45-0.6 | |
| 30+ | 0.15/18 months | 0.1-0.2 | |
| Progression rate (HPV to LSIL) | 0.2/36 months | 0.15-0.3 | |
| Proportion of infections progressing directly to HSIL | 0.1 | 0.05-0.5 | |
| Probability of progression after treatment for SIL | 0.05 | | Eddy, 1990; Fahs, Mandelblatt, Schechter et al., 1992; Schechter, 1996 |
| Regression rate (LSIL to unknown HPV) | | | Syrjanen, Kataja, Yliskoski et al., 1992; van Oortmarssen and Habbema, 1991; Hildesheim, Schiffman, Gravitt et al., 1994; Munoz, Kato, Bosch et al., 1996; Bearman, MacMillan, and Creasman, 1987; Pretorius, Semrad, Watring et al., 1991; Janerich, Hadjimichael, Schwartz et al., 1995; Schwartz, Hadjimichael, Lowell et al., 1996 |
| Age 15-34 | 0.65/72 months | 0.6-0.8 | |
| 35+ | 0.4/72 months | 0.3-0.6 | |
| Progression rate (LSIL to HSIL) | |||
| Age 15-34 | 0.1/72 months | 0.1-0.3 | |
| 35+ | 0.35/72 months | 0.3-0.5 | |
| Regression rate (HSIL to LSIL) | 0.35/72 months | 0.3-0.5 | |
| Progression rate (HSIL to Stage I cancer) | 0.4/120 months | 0.3-0.5 | |
| Progression rates and probability of symptoms in unscreened patients with cancer | |||
| Stage I | |||
| Progression rate (Stage I to Stage II) | 0.9/4 years | ||
| Annual probability of symptoms | 0.15 | ||
| Stage II | |||
| Progression rate (Stage II to Stage III) | 0.9/3 years | ||
| Annual probability of symptoms | 0.225 | ||
| Stage III | |||
| Progression rate (Stage III to Stage IV) | 0.9/2 years | ||
| Annual probability of symptoms | 0.6 | ||
| Stage IV | |||
| Annual probability of symptoms | 0.9 | ||
| Annual probability of survival after diagnosis by stage | | | Fremgen, personal communication; Jones, Shingleton, Russell et al., 1995 |
| Stage I | |||
| Year 1 | 0.97 | ||
| Year 2 | 0.97 | ||
| Year 3 | 0.97 | ||
| Year 4 | 0.97 | ||
| Year 5 | 0.97 | ||
| Stage II | |||
| Year 1 | 0.90 | ||
| Year 2 | 0.89 | ||
| Year 3 | 0.91 | ||
| Year 4 | 0.95 | ||
| Year 5 | 0.96 | ||
| Stage III | |||
| Year 1 | 0.70 | ||
| Year 2 | 0.74 | ||
| Year 3 | 0.86 | ||
| Year 4 | 0.93 | ||
| Year 5 | 0.83 | ||
| Stage IV | |||
| Year 1 | 0.40 | ||
| Year 2 | 0.50 | ||
| Year 3 | 0.86 | ||
| Year 4 | 0.85 | ||
| Year 5 | 0.90 |
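Several of the natural history parameters above are stated over periods other than 1 year (for example, 0.7 per 18 months), whereas a Markov model is typically run with a fixed cycle length. The report does not describe the conversion used, so the sketch below shows only one common convention and is not necessarily the authors' method: it assumes a constant hazard over the stated period and an annual cycle. The function name and the 1-year cycle are assumptions made for illustration.

```python
def annual_probability(prob_over_period, period_months):
    """Convert a probability over `period_months` to a 12-month probability,
    assuming the underlying hazard is constant over the period."""
    if not 0.0 <= prob_over_period < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return 1.0 - (1.0 - prob_over_period) ** (12.0 / period_months)


# Base case values taken from the table above.
hpv_regression_15_24 = annual_probability(0.7, 18)   # HPV to Well, ages 15-24
hsil_to_stage_1 = annual_probability(0.4, 120)       # HSIL to Stage I cancer

print(round(hpv_regression_15_24, 3), round(hsil_to_stage_1, 3))
```

Under this convention, compounding the annual value over the original period recovers the tabulated probability, which is a quick way to check the conversion.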
| Parameter | Base Case | Range | References |
|---|---|---|---|
| Conventional Pap | |||
| Sensitivity, cytology ASCUS+ for histology LSIL + | 0.51 | 0.51-0.788 | See meta-analysis |
| Specificity, cytology ASCUS+ for histology LSIL + | 0.97 | 0.888-0.97 | See meta-analysis |
| Percentage of smears read as normal in initial screening step re-read at same sensitivity and specificity | 0.10 | NA | NA |
| Technology improving initial screening step | |||
| Relative decrease in false negative rate of initial screening step | 0.6 | 0.4-0.9 | Roberts, Gurley, Thurloe et al., 1997; Bolick and Hellman, 1998; Brown and Garber, 1998 |
| Relative specificity | 1 | 0.9-1.0 | Assumption |
| Percentage of smears read as normal in initial screening step re-read at same sensitivity and specificity | 10% | NA | NA |
| Technology improving rescreening step | |||
| Relative decrease in false negative rate of rescreening step | 0.6 | 0.4-0.9 | Stevens, Milne, James et al., 1997; Patten, Lee, Wilbur et al., 1997a; Colgan, Patten, and Lee, 1995; Brown and Garber, 1998; Kaufman, Schreiber, and Carter, 1998; Jenny, Isenegger, Boon et al., 1997 |
| Relative specificity of rescreening step | 1 | 0.9-1.0 | Assumption |
| Percentage of smears read as normal in initial screening step re-read at same sensitivity and specificity | 100% | NA | NA |
NA = not available.
| Strategy | Primary Step | Rescreening Step | Overall Sensitivity |
|---|---|---|---|
| | Sensitivity | Sensitivity X False negative rate | Sensitivity + (proportion rescreened X Sensitivity X False negative rate) |
| Conventional Pap | 0.51 | 0.51 X (1-0.51) | 10% Rescreening: 0.51 + (0.1) X (0.51) X (0.49) = 0.535 |
| Improved initial screening | [0.51 + (0.6) X (1-0.51)] | [0.51 + (0.6) X (1-0.51)] X (1-0.51) | 10% Rescreening: [0.51 + (0.6) X (1-0.51)] + {(0.1) X [0.51 + (0.6) X (1-0.51)] X 0.49} = 0.843 |
| Improved rescreening | 0.51 | (0.6 X 0.49) | 100% Rescreening: 0.51 + [(1.0) X (0.6 X 0.49)] = 0.804 |
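The overall sensitivities in the table above can be reproduced as follows. This is a minimal sketch that mirrors the arithmetic shown in each table row; the function names are introduced here for illustration. Note that, following the table, the rescreening term for the improved initial screening strategy uses the conventional false negative rate (0.49).

```python
BASE_SENSITIVITY = 0.51   # conventional Pap, base case
FNR_REDUCTION = 0.6       # relative decrease in false negative rate, base case


def overall_conventional(sens, frac_rescreened=0.10):
    """Primary reading plus manual rescreening of a fraction of 'normal' smears."""
    fnr = 1.0 - sens
    return sens + frac_rescreened * sens * fnr


def overall_improved_initial(sens, reduction, frac_rescreened=0.10):
    """Improved primary step; the rescreening term follows the table row,
    multiplying the improved sensitivity by the conventional FNR."""
    fnr = 1.0 - sens
    improved = sens + reduction * fnr
    return improved + frac_rescreened * improved * fnr


def overall_improved_rescreening(sens, reduction, frac_rescreened=1.0):
    """Conventional primary step; rescreening recovers a share of the misses."""
    fnr = 1.0 - sens
    return sens + frac_rescreened * reduction * fnr


print(round(overall_conventional(BASE_SENSITIVITY), 3))
print(round(overall_improved_initial(BASE_SENSITIVITY, FNR_REDUCTION), 3))
print(round(overall_improved_rescreening(BASE_SENSITIVITY, FNR_REDUCTION), 3))
```

The three values agree with the table to three decimal places: 0.535, 0.843, and 0.804.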
Although life expectancy is the most common measure of effectiveness used in economic analyses of health policies, there are other important measures. We first present the predicted effects of various screening strategies on cervical cancer incidence, mortality, and treatment. Then we present the potential impact of improving the sensitivity of cervical cancer screening strategies on life expectancy, cost, and cost-effectiveness.
| Strategy | Cervical Cancer Cases/100,000 | Cancer Cases Prevented | Cervical Cancer Deaths/100,000 | Cancer Deaths Prevented |
|---|---|---|---|---|
| No Pap | 3014.6 | | 1057.3 | |
| Pap every 3 years: | ||||
| Base case | 506 | 2509 | 115.7 | 941.6 |
| FNR Reduced 0.4 | 322 | 184 | 68.3 | 47.4 |
| FNR Reduced 0.6 | 246 | 76 | 49.8 | 18.5 |
| FNR Reduced 0.9 | 161 | 86 | 29.9 | 19.9 |
| Pap every 2 years: | ||||
| Base case | 305 | | 65 | |
| FNR Reduced 0.4 | 181 | 124 | 35.9 | 29.1 |
| FNR Reduced 0.6 | 132 | 49 | 25.2 | 10.7 |
| FNR Reduced 0.9 | 79 | 54 | 14 | 11.2 |
| Pap every 1 year: | ||||
| Base case | 109 | | 20.9 | |
| FNR Reduced 0.4 | 55 | 54 | 9.9 | 11 |
| FNR Reduced 0.6 | 33 | 22 | 5.8 | 4.1 |
| FNR Reduced 0.9 | 9 | 24 | 1.5 | 4.3 |
FNR = false negative rate
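| Strategy | Cumulative Cases/100,000, ages 15-50 | Cumulative Cases/100,000, ages 15-65 | Cumulative Cases/100,000, ages 15-85 | Cumulative Deaths/100,000, ages 15-50 | Cumulative Deaths/100,000, ages 15-65 | Cumulative Deaths/100,000, ages 15-85 |
|---|---|---|---|---|---|---|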
| No Pap | 1438.2 | 2970.2 | 3014.6 | 450.1 | 786.0 | 1057.3 |
| Pap every 3 years: | ||||||
| Base case | 321.9 | 403.6 | 506 | 74.2 | 93.5 | 115.7 |
| FNR Reduced 0.4 | 209.7 | 258.2 | 322.2 | 45.2 | 55.6 | 68.3 |
| FNR Reduced 0.6 | 162.6 | 198.4 | 246.2 | 33.5 | 40.8 | 49.8 |
| FNR Reduced 0.9 | 108.8 | 130.8 | 160.6 | 20.7 | 24.8 | 29.9 |
| Pap every 2 years: | ||||||
| Base case | 200.7 | 249.9 | 305.4 | 42.8 | 52.7 | 65 |
| FNR Reduced 0.4 | 121.7 | 149.6 | 181.3 | 24.2 | 29.4 | 35.9 |
| FNR Reduced 0.6 | 90.2 | 110.0 | 132.4 | 17.2 | 20.7 | 25.2 |
| FNR Reduced 0.9 | 55.1 | 66.4 | 78.9 | 9.8 | 11.6 | 14 |
| Pap every 1 year: | ||||||
| Base case | 74.3 | 89.8 | 109 | 14.3 | 17.2 | 20.9 |
| FNR Reduced 0.4 | 38.1 | 45.6 | 54.9 | 6.9 | 8.2 | 9.9 |
| FNR Reduced 0.6 | 23.0 | 27.3 | 32.8 | 4.1 | 4.8 | 5.8 |
| FNR Reduced 0.9 | 6.4 | 7.5 | 9.0 | 1.1 | 1.3 | 1.5 |
FNR = false negative rate.
In infrequently screened populations, part of this phenomenon is a result of detection of early cancer prior to the development of symptoms. In frequently screened populations, this is more consistent with variation in tumor biology, where some tumors will be more rapidly growing. It is also consistent with length bias, where screening tests will be more likely to detect slower growing, less aggressive tumors. This prediction has several implications. First, improving the sensitivity of cervical cancer screening will reduce the overall numbers of cervical cancer cases and deaths, but it will not substantially change the proportion of cases and deaths that occur in younger women. Second, changes in the age distribution of cancer cases may be related to the success of screening programs as well as to changes in the epidemiology of HPV.
| Cumulative Treatments/100,000, ages 15-85 | ||||||
|---|---|---|---|---|---|---|
| Strategy | Simple Hysterectomy | Radical Hysterectomy | Radiation Only | Radiation + Surgery | Radiation + Chemotherapy | Chemotherapy Only |
| No Pap | 211 | 539 | 1209 | 296 | 416 | 35 |
| Pap every 3 years: | ||||||
| Base case | 52 | 127 | 171 | 61 | 37 | 2 |
| FNR reduced 0.4 | 36 | 90 | 101 | 41 | 22 | 2 |
| FNR reduced 0.6 | 29 | 72 | 71 | 32 | 13 | <1 |
| FNR reduced 0.9 | 21 | 51 | 41 | 22 | 7 | <1 |
| Pap every 2 years: | ||||||
| Base case | 34 | 84 | 96 | 39 | 19 | 1 |
| FNR reduced 0.4 | 22 | 66 | 51 | 24 | 9 | <1 |
| FNR reduced 0.6 | 17 | 42 | 34 | 18 | 5 | <1 |
| FNR reduced 0.9 | 11 | 26 | 18 | 11 | 3 | <1 |
| Pap every 1 year: | ||||||
| Base case | 14 | 33 | 30 | 15 | 5 | 1 |
| FNR reduced 0.4 | 7 | 18 | 13 | 8 | 2 | <1 |
| FNR reduced 0.6 | 5 | 11 | 8 | 5 | 1 | <1 |
| FNR reduced 0.9 | 1 | 3 | 2 | 1 | <1 | <1 |
FNR = false negative rate
These results show that increasing the sensitivity of the screening strategy by increasing the frequency and/or decreasing the FNR of the screening test results in significant decreases in morbid treatments. The relative proportion of treatments for early stage disease -- simple and radical hysterectomy, and radiation only -- also increases with increased sensitivity, as the distribution of cases shifts to primarily Stage I and II. As with all other outcomes, the incremental improvement with each successive strategy is small.
The shifts in stage distribution resulting from improved screening sensitivity would clearly reduce the number of cervical cancer treatments, many of which have significant short- and long-term morbidity. Given the potential impact of these treatments on quality of life, it is possible that adjustment for the effects of morbidity associated with cervical cancer treatment would alter cost-effectiveness estimates based on life expectancy alone. Further research on measurement of quality of life in women with cervical cancer is clearly needed.
| Strategy | Average Cost¹ | Incremental Cost¹ | Incremental Life Expectancy (days)¹ | Incremental Cost/Life-year¹ |
|---|---|---|---|---|
| No Pap | $893 | |||
| Pap every 3 years | $1,108 | $214 | 19.2 | $4,079 |
| Improved primary screening every 3 years | $1,240 | $132 | 2.2 | $22,010 |
| Pap every 2 years | $1,255 | $15 | -0.65 | Dominated |
| Improved rescreening every 3 years | $1,276 | $21 | 0.46 | $16,396 |
| Improved initial screening every 2 years | $1,433 | $158 | 0.98 | $58,731 |
| Improved rescreening every 2 years | $1,490 | $56 | -0.11 | Dominated |
| Pap every 1 year | $1,702 | $212 | 0.17 | $448,469 |
| Improved initial screening every 1 year | $2,000 | $298 | 0.63 | $173,484 |
| Improved rescreening every 1 year | $2,128 | $129 | -0.05 | Dominated |
¹ Discounted at 3 percent annually.
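The last column of the table above is the incremental cost divided by the incremental life expectancy, expressed in years. The sketch below assumes 365.25 days per year, a conversion factor the report does not state, and uses the rounded figures printed in the table, so it only approximates the published ratios. The function name is illustrative.

```python
DAYS_PER_YEAR = 365.25  # assumption: the report does not state its conversion factor


def cost_per_life_year(incremental_cost, incremental_days):
    """Incremental cost per life-year gained.

    Returns None when the strategy is dominated, that is, when it costs
    at least as much as its comparator and adds no life expectancy.
    """
    if incremental_days <= 0:
        if incremental_cost >= 0:
            return None
        raise ValueError("cheaper and less effective: ratio not meaningful here")
    return incremental_cost / (incremental_days / DAYS_PER_YEAR)


# (strategy, incremental cost in dollars, incremental days) from the table
rows = [
    ("Pap every 3 years", 214, 19.2),
    ("Improved primary screening every 3 years", 132, 2.2),
    ("Pap every 2 years", 15, -0.65),
]

ratios = {name: cost_per_life_year(cost, days) for name, cost, days in rows}
for name, ratio in ratios.items():
    print(name, "Dominated" if ratio is None else f"${ratio:,.0f}")
```

Because the inputs are rounded, the first two ratios land within a few percent of the table's $4,079 and $22,010 rather than matching them exactly.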
| Strategy | Cost/Life-Year Saved |
|---|---|
| Conventional Pap every 3 years | $4,079 |
| Improved initial screening every 3 years | $22,010 |
| Improved initial screening every 2 years | $89,993 |
| Improved initial screening every 1 year | $302,277 |
| Strategy | Cost/Life-year Saved | ||
|---|---|---|---|
| | Reduction in FNR 0.4 | Reduction in FNR 0.6 | Reduction in FNR 0.9 |
| Conventional Pap every 3 years | $4,079 | $4,079 | $4,079 |
| Improved initial screening every 3 years | $26,678 | $22,010 | $19,480 |
| Improved initial screening every 2 years | $71,118 | $89,993 | $129,300 |
| Improved initial screening every 1 year | $234,911 | $302,277 | $439,605 |
FNR = false negative rate.
We can explain these findings as follows: At screening intervals shorter than 3 years, the majority of lesions detected by improving sensitivity are low-grade squamous intraepithelial lesions (LSIL), which would be less likely to progress. Therefore, the extra costs of diagnosing and treating these low-grade lesions outweigh the increase in life expectancy and the decrease in costs related to invasive cervical cancer diagnosis and treatment, resulting in a higher cost per life-year as sensitivity increases. With every-3-year screening, enough significant lesions are detected that increasing sensitivity (i.e., reducing FNR) improves life expectancy. Similar results were obtained using an incremental cost of $5 per slide; incremental cost-effectiveness ratios were less than $50,000 per life-year at screening intervals of 3 years and decreased as the incremental sensitivity increased, but cost-effectiveness ratios were consistently above $50,000 per life-year at screening intervals of 1 or 2 years and increased as sensitivity increased.
Improving the sensitivity of the initial screening step had consistently favorable cost-effectiveness ratios compared with improving the sensitivity of the rescreening step provided that (1) a certain proportion of smears read as normal with the improved initial technology were rescreened at the same improved sensitivity, (2) the incremental costs per slide of both screening technologies were identical, (3) the sensitivity of the initial screening technology was at least as high as that of the rescreening technology, and (4) the specificity of the two technologies relative to conventional Pap smears was identical. We next present the results of threshold analyses performed to identify thresholds where (1) a greater reduction in FNR, (2) higher specificity relative to improved initial screening, or (3) lower cost for a rescreening technology would favor it over an initial screening technology, followed by sensitivity analyses of model and test parameters on cost-effectiveness.
| Parameter Inputs | |||
|---|---|---|---|
| Technology | Reduction in FNR | Specificity | Incremental Cost |
| Conventional Pap | 0.51 | 0.97 | -- |
| Improved initial screening step | 0.6 X (0.49) | 1.0 X 0.97 | $10 |
| Improved rescreening step | 0.85 X (0.49) | 1.0 X 0.97 | $10 |
FNR = false negative rate.
| Strategy | Cost/Life-year Saved |
|---|---|
| Conventional Pap every 3 years | $4,079 |
| Improved initial screening every 3 years | $22,010 |
| Improved rescreening every 3 years | $45,375 |
| Improved rescreening every 2 years | $121,103 |
| Improved initial screening every 1 year | $403,973 |
| Improved rescreening every 1 year | $511,993 |
FNR = false negative rate.
Varying the incremental cost of the technologies from $5 to $15 did not significantly change the threshold value. With incremental costs at $10 for both technologies, and a relative improvement of 1.4 for the initial screening technology, a relative reduction in FNR of 0.5 for the rescreening technology resulted in cost-effectiveness ratios of less than $50,000 per life-year at 3-year intervals. At a reduction in FNR of 0.9 for the initial screening technology, a relative reduction of more than 0.99 for the rescreening technology would be needed.
| Parameter Inputs | |||
|---|---|---|---|
| Technology | Reduction in FNR | Specificity | Incremental Cost |
| Conventional Pap | 0.51 | 0.97 | -- |
| Improved initial screening step | 0.6 X (0.49) | 0.96 X 0.97 | $10 |
| Improved rescreening step | 0.6 X (0.49) | 1.0 X 0.97 | $10 |
FNR = false negative rate.
| Strategy | Cost/Life-year Saved |
|---|---|
| Conventional Pap every 3 years | $4,079 |
| Conventional Pap every 2 years | $34,940 |
| Improved rescreening every 3 years | $16,396 |
| Improved initial screening every 3 years | $99,280 |
| Improved rescreening every 2 years | $87,754 |
| Improved initial screening every 2 years | $243,017 |
| Conventional Pap every 1 year | $863,580 |
| Improved rescreening every 1 year | $270,452 |
| Improved initial screening every 1 year | $1,367,852 |
Because the overall sensitivity of the improved initial screening step is slightly higher than improved rescreening at every screening level, the overall increase in life expectancy will be slightly higher. However, with only a 4 percent reduction in relative specificity, the excess diagnostic costs make improved rescreening at 3-year intervals the only strategy with a cost-effectiveness ratio of less than $50,000 per life-year. Given that the overwhelming majority of screened patients will not have LSIL, HSIL, or cancer, a small decrease in relative specificity can greatly increase the overall costs of screening of the cohort.
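To illustrate why a small loss of specificity matters, the sketch below counts false positives among disease-free screens under the two specificities in this scenario (0.97 for conventional Pap and 0.96 X 0.97 for the improved initial screening step). The cohort of 100,000 disease-free screens is a hypothetical round number chosen for illustration, not a figure from the model, and the names are illustrative.

```python
BASE_SPECIFICITY = 0.97         # conventional Pap (from the parameter table)
RELATIVE_SPECIFICITY = 0.96     # improved initial screening step in this scenario
DISEASE_FREE_SCREENS = 100_000  # hypothetical round number for illustration


def false_positives(specificity, n_disease_free):
    """Expected number of disease-free screens read as abnormal."""
    return (1 - specificity) * n_disease_free


conventional_fp = false_positives(BASE_SPECIFICITY, DISEASE_FREE_SCREENS)
improved_fp = false_positives(
    BASE_SPECIFICITY * RELATIVE_SPECIFICITY, DISEASE_FREE_SCREENS
)

print(f"Conventional Pap: {conventional_fp:,.0f} false positives")
print(f"Relative specificity 0.96: {improved_fp:,.0f} false positives")
print(f"Ratio: {improved_fp / conventional_fp:.2f}")
```

Under these assumptions, a 4 percent relative drop in specificity more than doubles the number of false positive results that require diagnostic workup.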
| Parameter Inputs | |||
|---|---|---|---|
| Technology | Reduction in FNR | Specificity | Incremental Cost |
| Conventional Pap | 0.51 | 0.97 | -- |
| Improved initial screening step | 0.6 X (0.49) | 1.0 X 0.97 | $10 |
| Improved rescreening step | 0.6 X (0.49) | 1.0 X 0.97 | $3 |
FNR = false negative rate.
| Strategy | Cost/Life-year Saved |
|---|---|
| Conventional Pap every 3 years | $4,079 |
| Conventional Pap every 2 years | $34,940 |
| Improved rescreening every 3 years | $16,396 |
| Improved initial screening every 3 years | $99,280 |
| Improved rescreening every 2 years | $87,754 |
| Improved initial screening every 2 years | $243,017 |
| Conventional Pap every 1 year | $863,580 |
| Improved rescreening every 1 year | $270,452 |
| Improved initial screening every 1 year | $1,367,852 |
| Model Parameter | Threshold Value where Improved Rescreening every 3 years Favored at Cost/Life-year of $50,000 or less | ||
|---|---|---|---|
| | Reduction in FNR of Initial Screening Step 0.4 | Reduction in FNR of Initial Screening Step 0.6 | Reduction in FNR of Initial Screening Step 0.9 |
| Reduction in FNR of rescreening technology | 0.48 | 0.85 | >0.99 |
| Relative specificity of primary screening technology | Conventional Pap every 2 years favored over both technologies below 0.95 | 0.96 | 0.98 |
| Incremental cost of rescreening technology | $2 | $3 | $4.50 |
FNR = false negative rate.
There are several patterns that emerge from these results. First, as the reduction in FNR of the initial screening technology increases, the additional relative reduction in FNR required for a rescreening technology to be favored also increases. Proportionately greater improvements in detection are needed as the initial screening technology detection rate increases. Second, the relative decrease in specificity required from the initial screening technology in order to favor the rescreening technology decreases as sensitivity increases -- as the overall diagnostic costs increase, relatively small increases in false positive rates have a greater impact on costs. Finally, if test characteristics are equivalent, the incremental cost of the rescreening technology must be substantially lower in order for rescreening to be favored. The threshold differences decrease as sensitivity increases -- again, since most of the lesions detected by increased sensitivity do not contribute substantially to mortality risk, relatively small reductions in overall costs can cause large changes in the cost-effectiveness ratios.
These results have several important implications. First, small decreases in specificity can significantly affect the cost-effectiveness estimates of any technology that improves sensitivity. This makes sense intuitively: Since a large majority of the population, even a "high-risk" population, will be "normal" at any given point in time, a small increase in the false positive rate can lead to substantial increases in cost. Because the incremental gain in life expectancy is small (owing to the relative rarity of the disease and the high survival rate for early invasive disease), the cost-effectiveness ratio increases rapidly as specificity decreases. Second, there are likely to be significant interactions between sensitivity, specificity, and cost-effectiveness thresholds. We did not perform formal two-way analyses, but clearly there will be combinations of sensitivity and specificity on either side of the threshold that result in one strategy being favored over the other. Third, the range of relative sensitivity, specificity, and incremental cost for the initial and rescreening technologies that we evaluated is well within the range reported in the literature or claimed by the manufacturers of ThinPrep®, AutoPap®, and Papnet®. Given the uncertainty surrounding these estimates, it is possible that all three technologies fall within accepted ranges of cost-effectiveness at 3-year screening intervals. No strategy or technology used for screening more often than every 3 years results in estimates of less than $50,000 per life-year. Under some scenarios, lengthening the screening interval to every 3 years and using an improved technology was more effective, at an acceptable cost, than using conventional Pap screening every 2 years.
To see the relative effects of varying other model parameters on cost-effectiveness estimates, we used sensitivity and specificity estimates for initial and rescreening technologies that resulted in incremental cost-effectiveness ratios of $50,000 per life-year or less for both technologies in the base case model. Based on the threshold analysis above, these estimates were a reduction in FNR of 0.6 for initial screening and 0.85 for rescreening. Incremental costs for both were set at $10 and relative specificities were set at 1.0.
Our threshold analyses suggested that a large portion of the excess costs associated with reduction of FNR is a result of the cost of diagnosis and treatment of low-grade lesions. Our base case assumption was that smears originally read as ASCUS would be repeated in 6 months, an assumption that should reduce overall diagnostic costs, but that might miss high-grade histologic lesions with cytologic diagnoses of ASCUS.
| Cost/Life-year Gained | ||
|---|---|---|
| Strategy | Repeat Pap in 6 Months | Immediate Colposcopy |
| Conventional Pap every 3 years | $4,079 | $5,254 |
| Improved initial screening every 3 years | $22,010 | $22,333 |
| Conventional Pap every 2 years | Dominated | Dominated |
| Improved rescreening every 3 years | $11,035 | $12,707 |
| Improved initial screening every 2 years | $129,912 | $163,788 |
| Improved rescreening every 2 years | $105,540 | $150,517 |
| Conventional Pap every 1 year | Dominated | Dominated |
| Improved initial screening every 1 year | $173,484 | $178,367 |
| Improved rescreening every 1 year | $511,993 | $702,748 |
ASCUS = atypical squamous cells of uncertain significance.
Changing the strategy does not change the rankings. Again, because the bulk of the lesions detected through improved sensitivity will be low-grade lesions, an aggressive diagnostic workup will increase costs with minimal gain in life expectancy.
| Strategy | Cost/Life-year Gained | |
|---|---|---|
| | Base Case: Cost of Managing LSIL $1,728 | Cost of Managing LSIL $675 |
| Conventional Pap every 3 years | $4,079 | $1,810 |
| Improved initial screening every 3 years | $22,010 | $15,525 |
| Conventional Pap every 2 years | Dominated | Dominated |
| Improved rescreening every 3 years | $11,035 | $4,119 |
| Improved initial screening every 2 years | $129,912 | $116,211 |
| Improved rescreening every 2 years | $105,540 | $100,162 |
| Conventional Pap every 1 year | Dominated | Dominated |
| Improved initial screening every 1 year | $173,484 | $159,828 |
| Improved rescreening every 1 year | $511,993 | $486,790 |
LSIL = low-grade squamous intraepithelial lesion.
Decreasing the cost of managing LSIL significantly reduces the overall cost per life-year gained, especially at lower screening frequencies, but does not alter the ranking or put any other strategies below the $50,000 per life-year threshold.
| Strategy | Cost/Life-year Gained | ||
|---|---|---|---|
| | Base Case: Peak Incidence 81/100,000 | Peak Incidence 106/100,000; HPV incidence increased 1.5X | Peak Incidence 106/100,000; HSIL progression rate increased |
| Conventional Pap every 3 years | $4,079 | $2,123 | $1,042 |
| Improved initial screening every 3 years | $22,010 | $7,801 (compared with Pap every 2 years) | $13,377 |
| Conventional Pap every 2 years | Dominated | $22,931 (compared with conventional Pap every 3 years) | Dominated |
| Improved rescreening every 3 years | $11,035 | $30,328 | $6,179 |
| Improved initial screening every 2 years | $129,912 | $89,520 | $85,166 |
| Improved rescreening every 2 years | $105,540 | $6,991 | $70,229 |
| Conventional Pap every 1 year | Dominated | Dominated | Dominated |
| Improved initial screening every 1 year | $173,484 | $135,216 | $116,780 |
| Improved rescreening every 1 year | $511,993 | $341,279 | $344,290 |
HPV = human papillomavirus; HSIL = high-grade squamous intraepithelial lesion.
When the increase in peak incidence is due to an increase in HPV incidence, improved initial screening every 3 years is more expensive but more effective than conventional Pap smears every 2 years. When the increase in peak incidence (and lifetime risk) is due to a more rapid progression of HSIL lesions to cancer, improved initial screening every 3 years is less expensive and more effective than conventional Pap every 2 years. These results illustrate several points. First, although the cost-effectiveness ratios decrease with increasing cancer incidence, very sensitive tests at frequent intervals are still well above the $50,000 threshold. Second, lifetime risk of cancer alone is not the only important component in determining cost-effectiveness. Clearly, the natural history of premalignant changes plays an important role in determining the degree of benefit obtained from a screening program. Third, changing different parameters within the model can result in similar patterns of cancer incidence. This may well explain some of the difference in results between models. Additional data on the natural history of HPV infection and premalignant and malignant changes, as well as further modeling efforts, are needed.
| Strategy | Cost/Life-year Gained | |
|---|---|---|
| | Base Case: Sensitivity 51%, Specificity 97% | Alternative: Sensitivity 78.8%, Specificity 88.8% |
| Conventional Pap every 3 years | $4,079 | $8,272 |
| Improved initial screening every 3 years | $22,010 | $73,300 |
| Conventional Pap every 2 years | Dominated | $219,637 |
| Improved rescreening every 3 years | $11,035 | Dominated |
| Improved initial screening every 2 years | $129,912 | $117,206 |
| Improved rescreening every 2 years | $105,540 | $975,591 |
| Conventional Pap every 1 year | Dominated | $515,216 |
| Improved initial screening every 1 year | $173,484 | $711,828 |
| Improved rescreening every 1 year | $511,993 | $4,482,673 |
Technologies that reduce the FNR of the conventional Pap smear when the FNR is 21.2 percent (sensitivity = 0.788) result in increased cost-effectiveness ratios, and no strategy falls below the $50,000 per life-year threshold. If relative specificity were also lowered, ratios would be even higher. These higher base-case estimates also explain some of the differences between our findings and those of other authors.
Another important point is that increasing the sensitivity and decreasing the specificity, even without incremental costs associated with a new technology, significantly increases the cost-effectiveness ratio.
| Strategy | Cost/Life-year Gained | |
|---|---|---|
| | Base Case: Adjusted for Hysterectomy | No Adjustment for Hysterectomy |
| Conventional Pap every 3 years | $4,079 | $3,579 |
| Improved initial screening every 3 years | $22,010 | $21,746 |
| Conventional Pap every 2 years | Dominated | Dominated |
| Improved rescreening every 3 years | $11,035 | $8,585 |
| Improved initial screening every 2 years | $129,912 | $139,536 |
| Improved rescreening every 2 years | $105,540 | $111,657 |
| Conventional Pap every 1 year | Dominated | Dominated |
| Improved initial screening every 1 year | $173,484 | $180,965 |
| Improved rescreening every 1 year | $511,993 | $549,649 |
At higher screening frequencies, the cost-effectiveness ratios increase somewhat when hysterectomy rates are not included. This is because including hysterectomy reduces the number of tests during the lifetime of the cohort, an effect that is most obvious at higher screening frequencies, and also because the actual risk of cervical cancer in women without hysterectomy is higher than the population-based risk that does not correct for hysterectomy, since the denominator in the incidence fraction is smaller. However, the inclusion of hysterectomy risk does not change the rankings of strategies.
| Strategy | Cost per Life-year Gained | ||
|---|---|---|---|
| | Discount Rate 0% | Discount Rate 3% | Discount Rate 5% |
| Conventional Pap every 3 years | $545 | $4,079 | $10,157 |
| Improved initial screening every 3 years | $9,903 | $22,010 | $51,756 |
| Conventional Pap every 2 years | Dominated (by improved rescreening every 3 years) | Dominated | $6,537 |
| Improved rescreening every 3 years | $26,984 (compared with improved initial screening every 3 years) | $11,035 | $67,787 |
| Improved initial screening every 2 years | $23,512 | $129,912 | $171,985 |
| Improved rescreening every 2 years | $67,274 | $105,540 | $148,306 |
| Conventional Pap every 1 year | Dominated | Dominated | Dominated |
| Improved initial screening every 1 year | $104,905 | $173,484 | $250,175 |
| Improved rescreening every 1 year | $355,979 | $511,993 | $669,308 |
| Strategy | Age at Beginning Screening | |||||
|---|---|---|---|---|---|---|
| | 15 (base case) | 18 | 20 | 35 | 50 | 65 |
| Conventional Pap every 3 years | $545 | $366 | $275 | $246 | $3,681 | $35,951 |
| Improved initial screening every 3 years | $9,903 | $9,450 | $8,897 | $6,074 | $11,022 | $33,855 |
| Improved rescreening every 3 years | $26,984 (compared with improved initial screening every 3 years) | $25,254 | $23,800 | $20,302 | $23,348 | Dominated (by Conventional Pap every 2 years) |
| Conventional Pap every 2 years | Dominated | Dominated | Dominated | Dominated | Dominated | $34,766 (compared with improved initial screening every 3 years) |
| Improved initial screening every 2 years | $23,512 | $21,959 | $21,014 | $16,042 | $23,567 | $48,462 |
| Improved rescreening every 2 years | $67,274 | $61,792 | $59,227 | $52,293 | $53,809 | $89,147 |
| Conventional Pap every 1 year | Dominated | Dominated | Dominated | Dominated | $127,659 | Dominated |
| Improved initial screening every 1 year | $104,905 | $99,011 | $93,726 | $71,088 | $81,633 | $151,864 |
| Improved rescreening every 1 year | $355,979 | $334,433 | $317,291 | $259,147 | $224,587 | $349,767 |
| Strategy | Cost | Incremental Cost | Days of Life Gained | Cost/Life-year |
|---|---|---|---|---|
| Pap every 3 years: | ||||
| Conventional Pap | $1,107.74 | $214.40 (compared with no Pap) | 19.2 (compared with no Pap) | $4,079 |
| Initial screening technology | $1,239.90 | $132.16 | 2.2 | $22,010 |
| Rescreening technology | $1,285.94 | $46.04 | 0.4 | $45,375 |
| Pap every 2 years: | ||||
| Conventional Pap | $1,254.99 | $361.65 (compared with no Pap) | 20.7 (compared with no Pap) | $6,370 |
| Initial screening technology | $1,433.29 | $178.30 | 1.4 | $45,265 |
| Rescreening technology | $1,500.59 | $67.31 | 0.2 | $105,450 |
| Pap every 1 year: | ||||
| Conventional Pap | $1,701.76 | $808.42 | 22.2 | $13,281 |
| Initial screening technology | $1,999.64 | $297.89 | 0.6 | $173,484 |
| Rescreening technology | $2,137.96 | $138.32 | 0.1 | $511,993 |
The addition of either technology to annual screening does not meet the $50,000 per life-year threshold. At every 3 years, both technologies do meet that threshold under these assumptions but, as illustrated above, reasonable changes in costs or test characteristics can change both the actual cost estimate and the favored strategy. At every 2 years, it is possible that, given these assumptions and no comparison to less frequent screening at higher sensitivity, one (or both) technologies might meet the $50,000 threshold. However, in several scenarios above, less frequent testing with a more sensitive test was both less expensive and more effective than alternatives with higher screening frequencies. Policymakers considering this information should look at all reasonable alternatives for improving the cost-effectiveness of cervical cancer prevention programs.
| Strategy | Average Cost ($) | Average Life Expectancy | Cancer Cases/100,000 | Cancer Deaths/100,000 |
|---|---|---|---|---|
| No Pap | 893.34 | 27.7038193 | 3014.6 | 1057.3 |
| Every 3 years: | ||||
| Conventional Pap | 1,107.74 | 27.7563797 | 506 | 115.7 |
| FNR reduction 0.4 | 1,258.37 | 27.7581568 | 322 | 68.3 |
| FNR reduction 0.6 | 1,262.49 | 27.7592261 | 246 | 49.8 |
| FNR reduction 0.9 | 1,268.76 | 27.7606046 | 161 | 29.9 |
| Every 2 years: | ||||
| Conventional Pap | 1,254.99 | 27.7605940 | 305 | 65 |
| FNR reduction 0.4 | 1,468.84 | 27.7618176 | 181 | 35.9 |
| FNR reduction 0.6 | 1,474.28 | 27.7625318 | 132 | 25.2 |
| FNR reduction 0.9 | 1,481.97 | 27.7634295 | 79 | 14 |
| Every 1 year: | ||||
| Conventional Pap | 1,701.76 | 27.7646895 | 109 | 20.9 |
| FNR reduction 0.4 | 2,107.31 | 27.7651945 | 55 | 9.9 |
| FNR reduction 0.6 | 2,113.25 | 27.7655211 | 33 | 5.8 |
| FNR reduction 0.9 | 2,121.10 | 27.7659225 | 9 | 1.5 |
| Strategy | Average Cost ($) | Average Life Expectancy | Cancer Cases/ 100,000 | Cancer Deaths/100,000 |
|---|---|---|---|---|
| No Pap | 893.34 | 27.7038193 | 3014.6 | 1057.3 |
| Every 3 years: | ||||
| Conventional Pap | 1,107.74 | 27.7563797 | 506 | 115.7 |
| FNR reduction 0.4 | 1,210.18 | 27.7581568 | 322 | 68.3 |
| FNR reduction 0.6 | 1,214.29 | 27.7592261 | 246 | 49.8 |
| FNR reduction 0.9 | 1,220.56 | 27.7606046 | 161 | 29.9 |
| Every 2 years: | ||||
| Conventional Pap | 1,254.99 | 27.7605940 | 305 | 65 |
| FNR reduction 0.4 | 1,405.98 | 27.7618176 | 181 | 35.9 |
| FNR reduction 0.6 | 1,468.84 | 27.7625318 | 132 | 25.2 |
| FNR reduction 0.9 | 1,481.97 | 27.7634295 | 79 | 14 |
| Every 1 year: | ||||
| Conventional Pap | 1,701.76 | 27.7646895 | 109 | 20.9 |
| FNR reduction 0.4 | 1,978.09 | 27.7651945 | 55 | 9.9 |
| FNR reduction 0.6 | 1,984.01 | 27.7655211 | 33 | 5.8 |
| FNR reduction 0.9 | 1,991.82 | 27.7659225 | 9 | 1.5 |
| Strategy | Average Cost ($) | Average Life Expectancy | Cancer Cases/100,000 | Cancer Deaths/100,000 |
|---|---|---|---|---|
| No Pap | 893.34 | 27.7038193 | 3014.6 | 1057.3 |
| Every 3 years: | ||||
| Conventional Pap | 1,107.74 | 27.7563797 | 506 | 115.7 |
| FNR reduction 0.4 | 1,306.57 | 27.7581568 | 322 | 68.3 |
| FNR reduction 0.6 | 1,310.69 | 27.7592261 | 246 | 49.8 |
| FNR reduction 0.9 | 1,216.95 | 27.7606046 | 161 | 29.9 |
| Every 2 years: | ||||
| Conventional Pap | 1,254.99 | 27.7605940 | 305 | 65 |
| FNR reduction 0.4 | 1,468.84 | 27.7618176 | 181 | 35.9 |
| FNR reduction 0.6 | 1,481.97 | 27.7625318 | 132 | 25.2 |
| FNR reduction 0.9 | 1,542.58 | 27.7634295 | 79 | 14 |
| Every 1 year: | ||||
| Conventional Pap | 1,701.76 | 27.7646895 | 109 | 20.9 |
| FNR reduction 0.4 | 2,236.53 | 27.7651945 | 55 | 9.9 |
| FNR reduction 0.6 | 2,242.49 | 27.7655211 | 33 | 5.8 |
| FNR reduction 0.9 | 2,250.37 | 27.7659225 | 9 | 1.5 |
In using these tables to estimate cost-effectiveness of a technology that improves the overall sensitivity of conventional Pap smears (i.e., reduces the FNR), the reader can choose an estimate for incremental cost and reduction in FNR and then use the table to calculate incremental costs per life-year gained, cancer case prevented, or cancer death prevented.
For example, if the estimated incremental cost of a new technology is $10 per slide, and the estimated reduction in FNR is 0.6, then the incremental cost per life-year saved for that technology every 3 years would be:
(Average cost for every 3-year screening at FNR reduction of 0.6 - Average cost for every 3-year conventional Pap) / (Life expectancy for every 3-year screening at FNR reduction of 0.6 - Life expectancy for every 3-year conventional Pap).
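The same calculation can be written out as a short script. This sketch uses the every-3-year rows from the first of the three tables above, the one listing an average cost of $1,262.49 at an FNR reduction of 0.6. Which incremental cost per slide that table corresponds to is not stated here, so the choice of table is an assumption, as is the reading of average life expectancy in years and average cost in dollars per woman.

```python
# Rows copied from the first summary table above (every-3-year screening).
# Assumptions: life expectancy is in years, cost is dollars per woman.
conventional = {"cost": 1107.74, "life_years": 27.7563797,
                "cases": 506, "deaths": 115.7}
fnr_0_6 = {"cost": 1262.49, "life_years": 27.7592261,
           "cases": 246, "deaths": 49.8}

extra_cost = fnr_0_6["cost"] - conventional["cost"]

# Incremental cost per life-year gained
per_life_year = extra_cost / (fnr_0_6["life_years"] - conventional["life_years"])

# Cases and deaths are reported per 100,000 women, so convert them to a
# per-woman probability before dividing.
per_case_prevented = extra_cost / (
    (conventional["cases"] - fnr_0_6["cases"]) / 100_000
)
per_death_prevented = extra_cost / (
    (conventional["deaths"] - fnr_0_6["deaths"]) / 100_000
)

print(f"Cost per life-year gained:   ${per_life_year:,.0f}")
print(f"Cost per cancer case avoided: ${per_case_prevented:,.0f}")
print(f"Cost per cancer death avoided: ${per_death_prevented:,.0f}")
```

Substituting rows from the other two tables, or a different FNR reduction or screening interval, gives the corresponding ratios for other cost assumptions.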
The model was validated against two types of external data: epidemiological data and previously published models of cervical cytological screening.
The model predictions of age-specific prevalence of HPV infection, LSIL, and HSIL were compared with external epidemiological data from the literature and expected consequences of changing various parameters.
The age-specific prevalence of HPV infection in women with normal cytology predicted by the model using base case estimates is shown in Figure 22.
We tested the impact of varying the age-specific incidence of HPV from one-half to twice the base case estimates. As shown in Figure 25
We also tested the impact of varying the prevalence of HPV and LSIL at age 15 on subsequent cervical cancer incidence (Figure 26).
| Parameter | Base Case | Range | Lifetime Risk of Cervical Cancer (Base Case 3.67%) |
|---|---|---|---|
| Relative risk of HPV infection (age-specific) | 1.0 | 0.5-2.0 | 2.15-6.00% |
| Proportion of HPV progressing directly to HSIL | 0.1 | 0-0.3 | 2.84-5.35% |
| Proportion of LSIL regressing to Well instead of HPV | 0.05 | 0-1 | 3.61-4.27% |
| Proportion of HSIL regressing to Well instead of LSIL | 0.01 | 0.1 | 3.47-3.67% |
| HPV prevalence at age 15 | 0.1 | 0-0.15 | 3.57-3.72% |
| HPV progression rate | 0.2/36 months | 0.15-0.3 | 2.5-4.7% |
| LSIL progression rate | 0.1-0.3/72 months (age dependent) | 0.1-0.5 | 2.42-5.88% |
| LSIL regression rate | 0.4-0.65/120 months (age dependent) | 0.3-0.8 | 2.92-4.83% |
| HSIL progression rate | 0.35/72 months | 0.3-0.5 | 2.91-4.1% |
| HSIL regression rate | 0.4/120 months | 0.3-0.5 | 2.98-3.8% |
Additional epidemiological data on the natural history parameters that affect cervical cancer risk in the absence of screening are needed. Different combinations of parameter estimates can result in similar predictions of cancer risk. Further refinement of this model may help to determine which parameters are most important in determining cervical cancer incidence.
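Several parameters in the table above are expressed as cumulative probabilities over multi-year periods (for example, 0.2 per 36 months). A Markov model with a fixed cycle length needs a per-cycle probability. The report does not state how this conversion was done; the sketch below uses the common constant-hazard assumption and a 1-year cycle, and the function name is hypothetical.

```python
def per_cycle_probability(cumulative_prob, period_months, cycle_months=12):
    """Convert a cumulative probability over a period to a per-cycle probability.

    Assumes a constant hazard over the period (an assumption, not taken
    from the report).
    """
    if not 0.0 <= cumulative_prob < 1.0:
        raise ValueError("cumulative_prob must be in [0, 1)")
    cycles = period_months / cycle_months
    return 1.0 - (1.0 - cumulative_prob) ** (1.0 / cycles)


# Base case values from the sensitivity analysis table.
hpv_progression = per_cycle_probability(0.2, 36)    # HPV progression, 0.2 per 36 months
hsil_regression = per_cycle_probability(0.4, 120)   # HSIL regression, 0.4 per 120 months

print(round(hpv_progression, 4), round(hsil_regression, 4))
```

Compounding the annual value back over the original period recovers the cumulative probability, which is a quick check that the conversion is consistent.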
Direct comparison with other models is difficult because of differences in terminology, assumptions, parameter estimates, and modeling techniques. However, we adjusted our model estimates to approximate previously published models as both a means of comparison and a validation technique.
| Model Characteristic | Eddy (1990) | Fahs et al. (1992) | Current Model |
|---|---|---|---|
| Model type | Markov cohort | Markov cohort | Markov cohort |
| Population | 20-75 | 65-104 | 15-85 |
| Natural History Parameters: | | | |
| HPV effect | No | No | Yes |
| Incidence in unscreened population | 3 X age-specific incidence in US, 1988 | Based on incidence, regression, progression rates; prevalent cases at start of cohort | Calibrated to shape of incidence curve in unscreened populations (Gustafsson, Ponten, Bergstrom et al., 1997; Gustafsson, Ponten, Zack et al., 1997) |
| Proportion of rapidly progressive lesions | 0.05 | Not given | 0.1 (Proportion of HPV progressing directly to HSIL) |
| Average duration of preclinical lesions | Long interval: 8 years, range 0-16; rapidly progressive: 1 year, range 0-2 | Based on regression and progression rates | Based on regression and progression rates |
| Preinvasive regression rates | Not given | CIN: 3.81%/year; CIS: 0/year | LSIL: 0.4-0.65/10 years (age dependent); HSIL: 0.4/10 years |
| Preinvasive progression rates | Not given | Well to CIN: 0.09/year; CIN to CIS: 17.8%/year; CIS to Stage I: 26.1%/year | LSIL: 0.1-0.3/6 years (age dependent); HSIL: 0.35/6 years |
| Cancer progression rates | Not given | Stage I to Stage II-IV: 39%/year | Stage I to Stage II: 90%/4 years; Stage II to Stage III: 90%/3 years; Stage III to Stage IV: 90%/2 years |
| Symptom recognition rate | Not given | Stage I: 12%/year; Stage II-IV: 80%/year | Stage I: 15%/year; Stage II: 22.5%/year; Stage III: 60%/year; Stage IV: 90%/year |
| Proportion of cases in each stage | Stage I: 46%; Stage II: 26%; Stage III: 16%; Stage IV: 12% | Stage I: 50%; Stage II-IV: 50% | (Predicted) Stage I: 46.4%; Stage II: 27.0%; Stage III: 18.1%; Stage IV: 8.5% |
| 5-year survival by stage | Stage I: 86%; Stage II: 58%; Stage III: 40%; Stage IV: 18% | Stage I: age-specific, varies from 67.7% to 52%; Stage II-IV: age-specific, varies from 44.4% to 15% | Stage I: 83.9%; Stage II: 65.7%; Stage III: 37.9%; Stage IV: 11.3% |
| Lifetime risk, cervical cancer | 2.5% (3 X SEER) | Not given | 3.67% |
| Lifetime risk, cervical cancer death | 1.18% | Not given | 1.26% |
| Discount Rate | 5% | 5% | 3%, 5% |
| Screening Parameters: | | | |
| Pap sensitivity | 0.8 | 0.8 | 0.535 |
| Pap specificity | 0.995 | 0.95 | 0.967 |
| Sensitivity of combination of Pap and pelvic exam for Stages II-IV | 0.8 | 0.8 | 1.0 |
| Atypia/ASCUS considered? | No | No | Yes |
| Rescreening of smears initially read as normal? | No | No | Yes |
| Costs: | | | |
| Treatment of preinvasive lesions | No difference between CIN/CIS, based on inpatient costs | CIN, CIS differentiated, outpatient and inpatient costs included | LSIL, HSIL differentiated, outpatient and inpatient costs included |
Eddy's widely cited model (Eddy, 1990) is the basis for other recently published cost-effectiveness analyses (Brown and Garber, 1998; Radensky and Mango, 1998). We attempted to recreate Eddy's results using our model. Although Eddy's model parameters were adjusted to fit International Agency for Research on Cancer (IARC) data (IARC Working Group on Evaluation of Cervical Cancer Screening Programmes, 1986), the incidence of invasive cervical cancer in an unscreened U.S. population was estimated by assuming that it would be three times higher than that observed in a partially screened population. However, this assumption does not account for the fact that 30-50 percent of cancer cases in the United States are from an unscreened population. Figure 27
| Screening Frequency | Proportion of Cohort | Lifetime Risk of Cancer | Lifetime Risk of Cancer Mortality |
|---|---|---|---|
| None | 12.5% | 3.57% | 1.22% |
| Every 5 years | 7.5% | 0.98% | 0.25% |
| Every 3 years | 20% | 0.58% | 0.13% |
| Every 2 years | 40% | 0.35% | 0.07% |
| Every 1 year | 20% | 0.12% | 0.02% |
| Total for cohort | | 0.798% | 0.23% |
| Calculated from SEER data | | 0.79% | 0.26% |
We were able to approximate lifetime risks for cervical cancer diagnosis and mortality based on current SEER data by simulating a cohort with a reasonable distribution of screening frequencies. The proportion of women not having had a Pap smear in the preceding 3 years is estimated at 5-10 percent (Martin, Calle, Wingo et al., 1996). Because our model assumes perfect patient and provider compliance with appropriate treatment, a higher proportion of unscreened and underscreened women is needed to approximate observed incidence and death rates. This provides further evidence for the overall validity of our model as well as our base case estimates of Pap sensitivity and specificity.
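The cohort totals in the table above are proportion-weighted averages of the strategy-specific risks. As a rough check, the short sketch below recomputes them from the rounded table values; the variable names are illustrative, and small differences from the reported totals reflect rounding of the inputs.

```python
# (screening frequency, proportion of cohort, lifetime cancer risk %, lifetime mortality risk %)
cohort = [
    ("None",          0.125, 3.57, 1.22),
    ("Every 5 years", 0.075, 0.98, 0.25),
    ("Every 3 years", 0.200, 0.58, 0.13),
    ("Every 2 years", 0.400, 0.35, 0.07),
    ("Every 1 year",  0.200, 0.12, 0.02),
]

total_proportion = sum(p for _, p, _, _ in cohort)
cancer_risk = sum(p * cancer for _, p, cancer, _ in cohort)
mortality_risk = sum(p * death for _, p, _, death in cohort)

print(f"proportions sum to {total_proportion:.3f}")
print(f"cohort lifetime risk of cancer: {cancer_risk:.2f}%")
print(f"cohort lifetime risk of cancer mortality: {mortality_risk:.2f}%")
```

Both weighted averages land close to the values calculated from SEER data, which is the comparison the paragraph above relies on.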
| | No Pap | Pap Every 4 Years | Pap Every 3 Years | Pap Every 2 Years | Pap Every 1 Year |
|---|---|---|---|---|---|
| Cancer cases/10,000: | | | | | |
| Eddy | 250 | 43 | 40 | 34 | 28 |
| Current model, Eddy estimates | 252 | 20 | 17 | 10 | 4.2 |
| Cancer deaths/10,000: | | | | | |
| Eddy | 118 | 14 | 13 | 11 | 10 |
| Current model, Eddy estimates | 114 | 6 | 4 | 2 | 0.8 |
| Increase in life expectancy in days:¹ | | | | | |
| Eddy | | 9.54 | 9.72 | 9.88 | 10.07 |
| Current model, Eddy estimates | | 8.59 | 9.28 | 9.56 | 9.77 |
| Average cost:¹ | | | | | |
| Eddy | | $264 | $355 | $439 | $1,093 |
| Current model, Eddy estimates | | $913 | $1,130 | $1,355 | $1,924 |
| Incremental cost/life-year:¹ | | | | | |
| Eddy | | $10,000 | $184,528 | $262,800 | >$1,000,000 |
| Current model, Eddy estimates | | $25,726 | $114,485 | $296,915 | $971,377 |

¹ Discounted at 5 percent.
By attempting to replicate Eddy's results, we were able to identify several features that may affect conclusions about the cost-effectiveness of cervical cancer screening strategies when Eddy's model is used. First, the age-specific incidence in younger women may be overestimated, since the majority of cases in younger women in a partially screened population (the basis for Eddy's estimates) will be early-stage disease detected through screening. This in turn will contribute to an overestimation of the case-fatality rate if the distribution of stages is not adjusted. Second, because cytological atypia reported as ASCUS is not considered, the overall screening costs will be underestimated.
To approximate the model of Fahs, Mandelblatt, Schechter et al. (1992), we adjusted our model to include only two stages of cancer, "early" and "late"; used the parameters cited in the article, including the age-specific survival for early and late cervical cancer; and began the cohort simulation at age 65. Using these parameters, we found that Pap smear screening every 5 or every 3 years results in cost-savings compared with no screening. Despite differences in total cost and effectiveness, our adjustments resulted in incremental costs for more frequent screening that were of the same order of magnitude as those of Fahs, Mandelblatt, Schechter et al. (1992). For screening every 3 years versus every 5 years, our model predicted an incremental cost per life-year of $5,581 compared with the published value of $5,956. For screening every year versus every 3 years, our model predicted an incremental cost per life-year of $74,615 compared with the published value of $39,693.
| Screening Strategy | Incremental Cost/Life-year Saved: Brown and Garber | Incremental Cost/Life-year Saved: Current Model Adjusted to Eddy Parameters |
|---|---|---|
| Screening every 3 years: | | |
| Conventional Pap (compared with no screening) | $8,996 | $8,393 |
| ThinPrep® | $37,074 (dominated by AutoPap®) | $67,124 (dominated by AutoPap®) |
| AutoPap® | $16,259 (compared with conventional Pap) | $48,455 (compared with conventional Pap) |
| Papnet® | $146,783 | $230,902 |
| Screening every 2 years: | | |
| Conventional Pap (compared with no screening) | $13,334 | $13,712 |
| ThinPrep® | $94,258 (dominated by AutoPap®) | $181,705 (dominated by AutoPap®) |
| AutoPap® | $42,666 (compared with conventional Pap) | $132,705 (compared with conventional Pap) |
| Papnet® | $343,444 | $624,870 |
| Screening every 1 year: | | |
| Conventional Pap (compared with no screening) | $26,888 | $30,655 |
| ThinPrep® | $369,893 (dominated by AutoPap®) | $979,183 (dominated by AutoPap®) |
| AutoPap® | $166,474 (compared with conventional Pap) | $726,610 (compared with conventional Pap) |
| Papnet® | $1,069,660 | $3,314,970 |
By making adjustments to certain natural history parameters in our basic model, and by using the cost and sensitivity parameters of Eddy (1990) and Brown and Garber (1998), we were able to reasonably approximate the cost-effectiveness estimates for varying frequencies of conventional Pap smears of these models. We were also able to come relatively close to the incremental cost-effectiveness values of different screening frequencies in the model of Fahs, Mandelblatt, Schechter et al. (1992), although we found cost-savings for every 3- and 5-year Pap smear screening compared with no screening using their cost and probability estimates. We were able to replicate the relative ranking for ThinPrep®, AutoPap®, and Papnet® from Brown and Garber (1998); however, our adjusted model resulted in significantly higher estimates of incremental cost-effectiveness for these technologies at each level of screening frequency.
Although we have been unable to identify the specific characteristics of our model that explain these differences, several factors may account for some of them.
Our model consistently produced lower gains in life expectancy than other models for a given increase in screening sensitivity. This may be because of differences in the source of other-cause mortality estimates, different distribution of cases within stages between different models, or the inclusion of other Markov states such as those indicating HPV infection. Given these lower gains in life expectancy, it is not surprising that the cost-effectiveness estimates are somewhat higher.
We used specific 1-, 2-, 3-, 4-, and 5-year survival rates rather than 5-year survival rates averaged over 5 years. Because mortality for Stages II, III, and IV is higher in the first 2 years after diagnosis than in the next 3 years, discounting may affect the marginal gains in life expectancy. At the very low gains that all of the models found, this may be sufficient to change cost-effectiveness ratios appreciably.
None of the other models considered the effect of ASCUS diagnoses. We attempted to correct for this by changing the relative probability of an LSIL cytological diagnosis within each histological state. However, this correction might result in higher diagnostic and treatment costs (our average costs were substantially higher than those seen in all the models, although our incremental costs were similar) and, subsequently, higher cost-effectiveness ratios.
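The point above about year-specific versus averaged survival can be illustrated with a small calculation. In the sketch below, two survival patterns reach the same 5-year survival (37.9 percent, the Stage III value used in the current model), but one concentrates deaths in the first 2 years. The early-year survival of 0.75, the mid-year timing of deaths, and the function name are all hypothetical choices made for illustration; they are not values from the report.

```python
def discounted_life_years(conditional_survival, discount_rate):
    """Life-years accrued over the listed years, discounted to the time of diagnosis.

    conditional_survival[i] is the probability of surviving year i + 1 given
    survival to its start. Deaths are assumed to fall mid-year on average.
    """
    alive = 1.0
    total = 0.0
    for year, s in enumerate(conditional_survival):
        alive_end = alive * s
        total += 0.5 * (alive + alive_end) / (1.0 + discount_rate) ** year
        alive = alive_end
    return total


FIVE_YEAR_SURVIVAL = 0.379  # Stage III 5-year survival in the current model

# Averaged: the same conditional survival in each of the 5 years.
averaged = [FIVE_YEAR_SURVIVAL ** 0.2] * 5

# Year-specific (hypothetical): lower survival in years 1-2, with years 3-5
# chosen so that 5-year survival is identical to the averaged case.
EARLY = 0.75
late = (FIVE_YEAR_SURVIVAL / EARLY ** 2) ** (1.0 / 3.0)
year_specific = [EARLY, EARLY, late, late, late]

for rate in (0.0, 0.03, 0.05):
    print(rate,
          round(discounted_life_years(averaged, rate), 3),
          round(discounted_life_years(year_specific, rate), 3))
```

Under these assumptions the front-loaded pattern yields fewer life-years at every discount rate, even though 5-year survival is the same. When the gains being compared are measured in days, a difference of this kind could plausibly shift cost-effectiveness ratios.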
Despite these differences, we were able to replicate the ranking of preferred screening strategies and reasonably approximate the incremental cost per life-year saved of other previously published models for conventional Pap smears. Additional studies with the current model should help to elucidate the reasons for the differences between our model and those previously published.
Although there are numerous differences between our model and other previously published models, there are striking similarities. All of the models show little marginal gain in life expectancy when screening is performed more often than every 3 years. All of the models show only modest improvements in life expectancy from increasing the sensitivity of conventional Pap smears and, for those that evaluate new technologies, high cost-effectiveness ratios for technologies that improve sensitivity (or reduce false negative rates) when screening is performed at frequent intervals.
We found that the incremental cost per life-year gained increased dramatically as screening intervals shortened, as have prior cost-effectiveness analyses of Pap smear screening. We also found that the specificity of the test has profound implications for cost-effectiveness: Because the majority of women screened will be normal, the number of additional diagnostic tests generated by a small decrease in specificity will be quite large and will in fact exceed the number of potentially significant lesions detected by increases in sensitivity in many populations.
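The specificity argument can be made concrete with simple arithmetic. The sketch below compares the extra true positives from a gain in sensitivity with the extra false positives from a small loss of specificity. The baseline sensitivity (0.535) and specificity (0.967) are the current model's base case values; the 2 percent prevalence and the performance of the "new" test are hypothetical values chosen only to illustrate the trade-off.

```python
def screening_yield(n_women, prevalence, sensitivity, specificity):
    """Return (true positives, false positives) for one round of screening."""
    diseased = n_women * prevalence
    healthy = n_women - diseased
    return diseased * sensitivity, healthy * (1.0 - specificity)


N_WOMEN = 10_000
PREVALENCE = 0.02  # hypothetical prevalence of significant lesions

base_tp, base_fp = screening_yield(N_WOMEN, PREVALENCE, 0.535, 0.967)
# Hypothetical new technology: sensitivity up to 0.70, specificity down 2 points.
new_tp, new_fp = screening_yield(N_WOMEN, PREVALENCE, 0.70, 0.947)

extra_tp = new_tp - base_tp
extra_fp = new_fp - base_fp
print(f"additional lesions detected: {extra_tp:.0f}")
print(f"additional false positives:  {extra_fp:.0f}")
```

In this hypothetical population, a 2-point drop in specificity generates roughly six times as many additional false-positive workups as the sensitivity gain generates additional detected lesions.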
We were able to identify thresholds of cost, reduction in false negative rate, and relative specificity compared with conventional Paps at which both technologies to improve initial screening and technologies to improve rescreening meet conventional cost-effectiveness thresholds when screening is performed every 3 years. However, under most scenarios, each technology resulted in cost-effectiveness estimates of more than $50,000 per life-year when used every 1 or 2 years. Further studies are needed to provide more precise estimates of the costs and effectiveness of specific technologies and, in turn, more reliable estimates of cost-effectiveness.
Recent studies provide important new evidence with which to estimate the accuracy of cervical cytological screening. Despite the demonstrated capability of such screening to reduce cervical cancer mortality, conventional Pap smear screening discriminates between diseased and nondiseased patients less well than has generally been believed. Pap smear screening is more accurate when a higher cytological threshold (HSIL) is used to detect a high-grade lesion. Lower test thresholds, or use of the Pap smear to detect low-grade dysplasia, result in poorer discrimination.
The accuracy of Pap smear screening is strongly dependent on study characteristics. In the studies reviewed for this report, prevalence of disease was the strongest and most consistent factor associated with between-study variation in effectiveness scores. Higher disease prevalence was associated with higher estimates of sensitivity and lower estimates of specificity (with a greater effect on specificity). These findings are consistent with prevalence as a surrogate for "workup" bias and perhaps also reflect an imperfect reference standard that is more specific than sensitive.
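The direction of the effect described above is what one would expect from "workup" bias. The sketch below assumes that every test-positive woman receives the reference standard but only a fraction of test-negative women do, and computes the sensitivity and specificity that a study would then observe. The true accuracy values (51 percent sensitivity, 98 percent specificity) are the unbiased estimates cited in this report; the prevalence, the verification fraction, and the function name are hypothetical.

```python
def apparent_accuracy(prevalence, sensitivity, specificity, verify_negatives):
    """Observed (sensitivity, specificity) under partial verification.

    All test-positives are verified; only the fraction `verify_negatives`
    (greater than 0, at most 1) of test-negatives receives the reference standard.
    """
    tp = prevalence * sensitivity
    fn = prevalence * (1.0 - sensitivity)
    fp = (1.0 - prevalence) * (1.0 - specificity)
    tn = (1.0 - prevalence) * specificity
    fn_verified = fn * verify_negatives
    tn_verified = tn * verify_negatives
    return tp / (tp + fn_verified), tn_verified / (tn_verified + fp)


unbiased = apparent_accuracy(0.05, 0.51, 0.98, verify_negatives=1.0)
biased = apparent_accuracy(0.05, 0.51, 0.98, verify_negatives=0.10)
print("full verification:   sens=%.2f spec=%.2f" % unbiased)
print("10%% of negatives:    sens=%.2f spec=%.2f" % biased)
```

With only a tenth of test-negative women verified, the observed sensitivity rises well above the true value and the observed specificity falls below it, the same pattern the reviewed studies showed with higher prevalence.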
The quality of the studies reviewed in this report varied widely; however, quality scores did not explain a statistically significant amount of the between-study variation in effectiveness scores, sensitivity, or specificity. It is possible that the separate components of the quality score affected the test effectiveness scores in different directions, leading to an overall lack of significance. However, when controlling for prevalence of disease, because of small numbers and collinearity, we were unable to separately assess the effects of verification of test-negative subjects and type of reference standard on between-study variation in sensitivity and specificity.
Few studies of initial screening were unaffected by workup bias; those that were yielded estimates of the specificity of Pap smear screening of 97 to 100 percent and of sensitivity of 29 to 56 percent. These sensitivity estimates are much lower than those generally believed to be true.
Most of the studies reviewed were conducted on women referred for evaluation of a cytological abnormality. The cervical cytological test repeated at the time of colposcopy was often the cytological result reported, and that result was then compared with colposcopy or histology results. The fact that cytology-"negative" women in many of these studies had often had a recent positive Pap smear suggested that they may have been systematically different from women who had received a negative Pap test on initial screening.
Based on studies using a cytological or histological reference standard, ThinPrep® demonstrated improved sensitivity compared with conventional Pap smears. The only study allowing estimation of the relative FPR suggested that ThinPrep® had slightly lower specificity than did conventional smear; however, this difference was not statistically significant.
Both computer rescreening technologies (AutoPap®, Papnet®) have been shown to be capable of improving the diagnostic yield of false negatives, although the magnitude of the false negative reduction under different circumstances is difficult to estimate precisely. Similarly, estimates of the effect of rescreening on specificity are unreliable.
The evidence regarding the accuracy of the newer technologies for cervical cytological screening is insufficient for several reasons. First, little evidence is available with which to assess the effects of thin-layer cytology or computer rescreening on specificity. Second, most estimates of effect on sensitivity (or reducing the false negative rate) are based on a surrogate reference standard: cytology. Criteria for cytological reference standards put forth by the FDA and the Intersociety Working Group for Cytology Technologies improve the validity of the cytological reference standard by requiring consensus among an independent panel of cytology professionals; however, this standard is often applied only to discrepant cases, and histological verification of high-grade lesions is also often lacking. Both of these deficiencies lead to overestimation of diagnostic performance.
Our analysis provides cost estimates in 1997 dollars for procedures associated with screening, diagnosis, and treatment of cervical cancer for women ages 20-64 years and those 65 and older. Previous cost-effectiveness analyses of cervical cancer diagnosis and treatment in the United States have relied on cost estimates derived from 1988 data (U.S. Congress Office of Technology Assessment, 1990). The present analysis adds substantially to the existing cost literature on cervical cancer because it is the first to estimate costs associated with younger (20-64 years) and older women (65 years and older) separately.
The cost of Pap smear screening was somewhat higher in older women than in younger women, chiefly because physician and total time spent in obtaining Pap smears during office visits was greater for older women. For costs associated with cervical cancer treatment, estimates were substantially lower for older women because the cost estimates for the under-65 age group were derived from primary analysis of claims data, whereas cost estimates for the 65-and-older age group were obtained from Medicare payments. Although the same time period was used to enhance comparability between the costs provided for the two groups, the use of different data sources likely affected the estimates derived. The claims data analyses reflect more comprehensive costs of providing the services than do Medicare payments, since costs for related services were not always added to the cost of the primary service.
Cost estimates for diagnosis and treatment of cervical dysplasia or cancer calculated from episodes of care are substantially higher than are estimates based on average procedure-specific costs because of both the provision of related services and the effect of complicated cases with unusually high costs. Estimates based on procedure-related costs alone will underestimate the true direct medical costs.
Published models examining the cost and effectiveness of Pap smear screening have consistently found Pap screening to have a significant impact on the incidence and mortality of cervical cancer and to have an acceptable range of cost-effectiveness ratios when compared with no screening. Existing models were all consistent in this finding despite different modeling techniques, parameters, and assumptions.
Estimates of Pap test accuracy used in these models generally overestimated Pap test performance, as determined by recent unbiased studies and previous meta-analyses. Pap test performance was not found to be a key determinant of cost-effectiveness in studies that included sensitivity analyses of Pap test accuracy; however, our best estimates of Pap test performance fall outside the range used in sensitivity analyses in some models.
The studies suggest that screening programs become less cost-effective when the age of onset for Pap screening is lowered (e.g., to 17 years), when applied to low-risk subpopulations (e.g., previously screened pregnant women), or when the screening interval is more frequent (e.g., annual). Screening programs were found to be more cost-effective when applied to higher risk subpopulations (e.g., previously unscreened women, older women) or at less frequent screening intervals (e.g., every 3-5 years).
The several high-quality existing models of effectiveness and costs associated with Pap testing for cervical cancer screening provided a valuable resource for this project. These models provided a conceptual framework for the assumptions, modeling techniques, and literature evidence. The number of similar models and their degree of detail were sufficient to support a significant effort to describe and critique them. These models indicated that high-quality cost-effectiveness analysis is possible, and they not only provided resources to support the creation of such a model, but also indicated areas for improvement (e.g., better cost parameters).
We found that under favorable assumptions, the use of technologies that improve initial screening sensitivity or rescreening sensitivity can have acceptable cost-effectiveness compared with conventional Pap smear screening at a frequency of every 3 years. However, the cost-effectiveness of these new technologies (and of conventional Pap smear screening) is directly related to the frequency of screening, with longer intervals resulting in lower cost-effectiveness estimates. Our findings were relatively insensitive to assumptions about cervical cancer incidence, the cost of technologies, diagnostic strategies for abnormal screening results, age at onset of screening, or most other variables tested. However, there is substantial uncertainty about the estimates of sensitivity and specificity of the new technologies compared with each other and with conventional Pap testing. It is clear from our sensitivity analysis that both sensitivity and specificity are important in determining cost-effectiveness. Although it is clear that both types of technology provide an improvement in effectiveness at higher cost, the imprecision in estimates of their effectiveness makes drawing conclusions about the relative cost-effectiveness of thin-layer cytology and computerized rescreening technologies problematic.
Although a societal perspective would be preferable, the inherent difficulties in estimating nondirect costs such as lost productivity led us to choose the simpler health care perspective. Invasive cervical cancer affects mostly women in midlife and older, whereas the much more common HPV infection and SIL (most of which do not progress to cervical cancer) are primarily seen in younger women. It therefore seems likely that the nonhealth care costs associated with screening and treatment of SIL may be greater than those associated with the diagnosis and treatment of invasive cervical cancer. If this is true, then the cost-effectiveness of screening, and of new technologies to improve screening performance, will be more favorable from a health care system perspective than from a societal perspective, especially if new technologies increase sensitivity at the expense of specificity.
Our model predicts an age-specific incidence of cervical cancer similar to that reported in unscreened populations. Our peak incidence, 81/100,000 at age 50, is higher and comes at an earlier age than that reported in an unscreened population in the United States in the 1930s (60/100,000 at age 60); is similar to that seen in India in the 1960s, Canada in the 1960s, or Singapore in the 1970s; and is somewhat lower than in Germany in the 1960s (peak 110/100,000) (Gustafsson, Ponten, Bergstrom et al., 1997). Our estimates of age-specific prevalence of HPV, LSIL, and HSIL are also within reported ranges. We are therefore confident that our model represents a reasonable approximation of the natural history of cervical cancer. Clearly, further refinements can be made, especially in the modeling of transitions between states. Further development of this model could be useful for analyzing the natural history of cervical cancer in greater detail, even at the molecular level. The model could also be used for future technology assessments of HPV testing or HPV vaccination programs.
Our finding that the incremental gain in life expectancy decreases dramatically as screening becomes more frequent is consistent with those of other previously published models. We found a similar pattern for cervical cancer deaths, cervical cancer cases, and morbidity associated with cervical cancer. Although further research on quality-of-life issues associated with cervical cancer screening and treatment is clearly needed, it seems unlikely that data on quality-of-life issues would significantly change our findings. Given the rarity of cervical cancer relative to HPV infection and SIL, the inconvenience, potential discomfort, and psychological distress (Paskett and Rimer, 1995) associated with screening and treatment of cancer precursors (many of which will never progress to become cancer) might well outweigh the negative impact of cancer itself on women's quality of life at the population level.
Our model also predicts that, even with the most sensitive screening program, there will continue to be cervical cancer cases and that the bulk of these cases will be in younger women. There are clearly differences in biological behavior of different tumors. The occurrence of tumors in women under 30 is itself evidence that there are some rapidly growing tumors. Since one would expect screening to preferentially detect slower growing tumors, the proportion of younger women among cervical cancer cases should increase as screening sensitivity increases.
Several authors have discussed the difficulties involved in modeling cervical cancer (Prorok, 1986; Wilson and Woodman, 1995). In addition to the uncertainty surrounding almost all of the parameter estimates, which are discussed in detail elsewhere, and given our assumptions about natural history, diagnosis, and treatment, there are limitations inherent in the underlying choice of a Markov model for our cost-effectiveness analysis.
First, we assumed that the probabilities of regression and progression of HPV and SIL, although varying with age, are constant within specified age ranges. Because the progression from HPV to SIL to invasive cancer appears to be dependent on a series of specific mutations in the HPV genome (Mitchell, Tortolero-Luna, Wright et al., 1996), it is likely that the true transition probabilities are not constant, but a function of the specific HPV viral type and host co-factors. A different model structure, using either stochastic modeling or Monte Carlo simulation, might result in a closer approximation of the true natural history. We chose a Markov cohort simulation because (1) it allowed faster computation of results, an advantage given the large number of variables, clinical strategies, and sensitivity analyses to be evaluated; (2) the interpretation of the model and its results may be more intuitive to readers with less experience in decision-analytic techniques; and (3) we lack data that would allow more precise estimation of the distribution of regression and progression probabilities.
Given the robustness of our findings to variation over a wide range of possible parameter values, it seems unlikely that a different model structure would have resulted in substantially different conclusions.
Second, like all Markov models, our model also assumes that transitions occur only at the beginning of each cycle. For life expectancy, we applied the half-cycle correction (Sonnenberg and Beck, 1993); however, this assumption may affect some of the results with certain diagnostic strategies. For example, if a Pap test with a cytological result of ASCUS or LSIL is followed by a repeat Pap test in 6 months rather than by immediate colposcopy, the underlying histological state may progress, regress, or persist during that time period, whereas our model assumes that, in the absence of treatment, a given state persists for at least 1 year. However, because regression or persistence is more likely than progression for LSIL or HSIL within a given year, this assumption results in an overestimation of the likelihood of finding SIL on the subsequent Pap. This, in turn, is a bias in favor of screening and in favor of any strategy that improves the sensitivity of screening.
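The half-cycle correction mentioned above can be shown with a deliberately minimal two-state (alive, dead) cohort. This is a sketch of the general technique, not the report's model: the constant 2 percent annual death probability, the 70-cycle horizon, and the function name are hypothetical. Without the correction, deaths are treated as occurring at the start of each 1-year cycle, so only survivors are credited with the year; with it, each cycle is credited with the average of the cohort alive at its start and end.

```python
def life_expectancy(annual_death_prob, n_cycles, half_cycle_correction=True):
    """Expected life-years for a cohort followed over n_cycles 1-year cycles."""
    alive = 1.0
    total = 0.0
    for _ in range(n_cycles):
        alive_after = alive * (1.0 - annual_death_prob)
        if half_cycle_correction:
            # Deaths are assumed to occur, on average, halfway through the cycle.
            total += 0.5 * (alive + alive_after)
        else:
            # Transitions at the start of the cycle: only survivors accrue the year.
            total += alive_after
        alive = alive_after
    return total


uncorrected = life_expectancy(0.02, 70, half_cycle_correction=False)
corrected = life_expectancy(0.02, 70)
print(round(uncorrected, 2), round(corrected, 2), round(corrected - uncorrected, 2))
```

The two totals differ by half a year for each member of the cohort who dies during follow-up. The correction adjusts how time in a state is counted, but it does not allow more than one transition per cycle, which is the limitation the paragraph above describes for 6-month repeat Pap tests.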
Our validation technique does not account for the fact that age-specific incidence data collected at any one point in time may be dependent on the distribution in the population of different cohorts, each with its own age-specific incidence (Gustafsson, Ponten, Bergstrom et al., 1997). Our choice of an earlier onset of sexual activity and HPV infection is consistent with current demographic trends, and, again, is biased in favor of screening.
Many issues have been raised to justify the use of improved technologies for cervical cytological screening. This report does not address them all. Attention has been focused on quality control in cytopathology laboratories in an attempt to reduce the problem of false negative Pap smear tests; in fact, this has been one of the primary motivations for the development of computerized rescreening technologies. Variability in sensitivity and specificity between laboratories could be expected to have a profound effect on the cost-effectiveness, yet our model assumed uniform performance among the simulated cohort. In theory, the 10 percent manual rescreening mandated by CLIA could be a useful tool to assure uniform quality among cytopathology laboratories; however, it is an ineffective strategy alone for solving the problem of false negative Pap smear screening tests. Computerized rescreening technologies offer, in addition to improvements in performance, the advantage of greater uniformity across laboratories.
The shortage of cytotechnologists and the difficulties in implementing screening recommendations in medically underserved areas are additional factors that may provide incentives for using computerized rescreening or initial screening technologies. These issues were not considered in this analysis because many of these factors are difficult to quantify and model.
Another set of issues not included in the model relate to compliance with Pap smear screening recommendations. We assumed uniform compliance by patients and providers; however, increased costs of screening from implementation of these new technologies could result in worse compliance with cervical cytological screening and, on a national level, result in no reductions in cervical cancer incidence.
Our model suggests that improving the sensitivity of screening, even at no additional cost, might actually increase the cost-effectiveness ratio of a cervical cancer screening program (that is, make it less cost-effective). Most of the abnormalities found are low-grade, a finding that leads to costs for diagnosis and treatment, but yields little benefit in terms of reduced cervical cancer incidence or life-years saved. Along these lines, there may not be a price at which improved initial or rescreening technologies can be cost-effective according to years of life saved or cervical cancer incidence. Only if one places a high value on detecting and treating low-grade cytological abnormalities can improvements in Pap smear screening be justified. No analyses, including this one, have taken such factors as psychological distress, quality of life, or costs of litigation for failure to diagnose cervical lesions into account; it may be factors such as these that are needed to justify the implementation of these technologies that hover near the threshold of acceptable cost-effectiveness.
The currently available data on the accuracy of thin-layer cytology and computerized rescreening technologies used for cervical cytological screening fail to provide reliable estimates of test specificity. According to our model, the increased sensitivity offered by both types of technology would yield moderate improvements in life expectancy at much higher cost than conventional Pap screening alone. The imprecision of the accuracy estimates precludes determining the relative cost-effectiveness of thin-layer cytology and computerized rescreening. However, our assumptions about their diagnostic performance do allow us to use the model to estimate the cost-effectiveness of these new technologies relative to conventional Pap screening or no screening. Especially when considered as part of a strategy with screening intervals of 3 years, the new technologies have incremental cost-effectiveness ratios that are within the range of accepted health care practices.
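The incremental cost-effectiveness ratio referred to above is the additional cost of a strategy divided by the additional health benefit it yields relative to a comparison strategy. The following is a minimal sketch in Python. The function name `icer` and all cost and life-expectancy figures are hypothetical placeholders chosen for illustration; they are not inputs to or results from the model described in this report.

```python
def icer(cost_new, effect_new, cost_ref, effect_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_cost = cost_new - cost_ref
    delta_effect = effect_new - effect_ref
    if delta_effect <= 0:
        # With no added benefit, a cost-per-benefit ratio is not meaningful.
        raise ValueError("new strategy adds no health benefit over the reference")
    return delta_cost / delta_effect


# Hypothetical per-woman lifetime cost (dollars) and life expectancy (years).
conventional = {"cost": 1000.0, "life_years": 25.000}
new_technology = {"cost": 1100.0, "life_years": 25.002}

ratio = icer(
    new_technology["cost"], new_technology["life_years"],
    conventional["cost"], conventional["life_years"],
)
print(f"{ratio:,.0f} dollars per life-year saved")
```

Because the gain in life expectancy appears in the denominator, even a modest added cost produces a large ratio when the added benefit is small, which is the pattern described for shorter screening intervals.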
Few studies of primary Pap screening were conducted in low-prevalence populations and were unaffected by "workup" bias. These few studies provided estimates of 98 percent for the specificity of Pap smear screening and 51 percent for its sensitivity. The sensitivity estimate is much lower than that generally believed to be true. Future decision models, cost-effectiveness studies, and health policy decisions should consider this lower estimate.
Thin-layer cytology technology (ThinPrep®), the computerized rescreening device (Papnet®), and the algorithmic classifier (AutoPap®) have received regulatory approval from the FDA based on their demonstration of improved sensitivity compared with conventional Pap smear techniques. However, the evidence currently available does not fully describe the impact of these technologies on the specificity of the screening process. It is possible that a new technology might simultaneously raise both sensitivity and specificity; however, this has not been conclusively demonstrated for the devices reviewed in this report.
Thin-layer cytology and computerized rescreening devices primarily aim to reduce different sources of error: sampling error and detection error, respectively. Combined use of thin-layer cytology and computerized rescreening has the potential to exceed the effectiveness of either technology alone. Although this strategy would not be economically feasible at current prices, further research could lead to additional improvements in sensitivity.
Comparisons with cytological reference standards attest to the validity of the new technologies compared with optimal Pap screening. Comparison with a histological reference standard provides a more relevant outcome for clinical decisionmakers, since histological diagnosis forms the basis of most clinical management decisions. Further research is needed to validate, by colposcopy, negative cytological diagnoses made with the new technologies in both low-prevalence and high-prevalence populations. This could be accomplished by subjecting a random sample of cytology-negative women to colposcopy, which would permit statistical correction for "workup" bias and estimation of test specificity. Precise estimates of the relative performance of new versus conventional technologies require verification of all patients in whom at least one test is positive. Methods such as discrepant analysis, in which subjects with concordant test results (both negative or both positive) are not verified, should be avoided in future studies.
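One way to carry out the statistical correction for "workup" bias described above is to weight each verified cytology-negative woman by the inverse of the sampling fraction, so that the verified sample stands in for the whole cytology-negative group. The Python sketch below assumes that every cytology-positive woman is verified and that a simple random fraction of cytology-negative women is verified. The function name `corrected_accuracy` and all counts are hypothetical and are not taken from any study reviewed in this report.

```python
def corrected_accuracy(tp, fp, fn_sampled, tn_sampled, sampling_fraction):
    """Estimate sensitivity and specificity when all test-positive women are
    verified but only a random fraction of test-negative women are verified."""
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    # Scale the verified test-negative counts up to the full test-negative group.
    fn_est = fn_sampled / sampling_fraction
    tn_est = tn_sampled / sampling_fraction
    sensitivity = tp / (tp + fn_est)
    specificity = tn_est / (tn_est + fp)
    return sensitivity, specificity


# Hypothetical study: 100 cytology-positive women, all verified (40 diseased,
# 60 not); 10 percent of cytology-negative women verified (4 diseased, 896 not).
sens, spec = corrected_accuracy(
    tp=40, fp=60, fn_sampled=4, tn_sampled=896, sampling_fraction=0.10
)

# Uncorrected estimates that ignore the partial verification of negatives.
naive_sens = 40 / (40 + 4)
naive_spec = 896 / (896 + 60)

print(f"corrected:   sensitivity={sens:.3f}, specificity={spec:.3f}")
print(f"uncorrected: sensitivity={naive_sens:.3f}, specificity={naive_spec:.3f}")
```

In this hypothetical example the uncorrected calculation overstates sensitivity and understates specificity, because diseased women with negative cytology are underrepresented among those verified.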
Differences in the performance of Pap testing have been observed between different types of laboratories. Attempts at quality control aimed at assessing and reducing between-laboratory variation in the interpretation of cervical cytology have had mixed success. It has been suggested that the 10 percent manual rescreening mandated by federal law is costly and ineffective in improving sensitivity; however, it may have an important role to play in assessing and maintaining quality in cervical cytology interpretation. The computerized rescreening strategy reviewed in this report functions as a quality control mechanism and will likely reduce between-laboratory variation in interpretation. Further research in this area may be warranted.
Many of the existing models of cervical cancer screening have provided insights into the effect of various screening intervals and strategies on cancer incidence and mortality and on the cost-effectiveness of screening. Despite consistent demonstration of extremely high marginal cost-effectiveness ratios for annual screening compared with less frequent biennial or triennial screening, many agencies recommend annual screening. This recommendation may be based on other considerations, including improving patient and provider compliance, providing marginal economies of scale, or even attempting to improve compliance with other health screening practices, such as breast or colorectal cancer screening. However, it is possible that a recommendation for annual screening might actually deter women who are apprehensive about pelvic examinations or Pap smears from receiving any care. Although this topic was beyond the scope of our analysis, increasing the number of women who have regular cervical cytological screening would appear to have a greater impact on mortality than improving the sensitivity of the screening test. Further research is needed on ways to improve patient and provider compliance with screening at appropriate intervals and to assure followup of abnormal results, especially when screening is performed at less frequent intervals.
Deaths from cervical cancer have been substantially reduced since the introduction of the Pap test. By limiting one's assessment of the impact of new technologies for cervical cancer screening to reduction in cervical cancer mortality, one fails to consider a number of other health outcomes that may be quite important. Pap screening has resulted in three trends: an increased proportion of preinvasive lesions, an increased proportion of earlier stage invasive cancer and, as suggested by our model, an increasing proportion of cases among younger women. Although these trends improve the likelihood of curative treatment, they also lead to more treatment and to more years lived with the potential sequelae of treatment. The shift toward diagnosis of more premalignant lesions, and the inconvenience, potential discomfort, and psychological distress associated with screening and treatment of precursors (many of which will never progress to cancer), might well outweigh the negative impact of cancer itself on quality of life at the population level.
Further research on quality of life issues associated with cervical cancer screening and treatment is needed to quantify screening behaviors, quality of life associated with various treatments for cervical cancer, and quality of life associated with low-grade lesions.
Our current understanding of the biology and natural history of cervical cancer places HPV as the main causative agent. HPV testing may have an important role in cervical cancer screening or in triage of women with abnormal cytology. Studies are currently under way to evaluate the utility of HPV testing in predicting progression of atypia and low-grade lesions. The results of these and future studies may substantially change the clinical practice of cervical cancer screening. Our model has been constructed to allow assessment of the potential role of HPV testing or vaccination on various strategies for cervical cancer prevention.
A new computerized device not reviewed in this report, the AutoPap® Primary Screening System, has received FDA approval to be used for primary screening rather than rescreening of Pap smears. Because of a lack of data, we did not specifically assess the cost-effectiveness of using this type of computerized primary screening. Our model suggests that improving the sensitivity of the primary screening step is more cost-effective at a given screening interval than improving rescreening sensitivity, so it is possible that this strategy might compare favorably with the strategies we evaluated. However, more reliable estimates of the incremental cost, sensitivity, and specificity of the AutoPap® Primary Screening System are needed before meaningful comparisons can be made. Results of cervical cytological screening with this new device should also be compared with colposcopy or a histological reference standard, with verification of a random sample of test-negative women.
AHCPR: Agency for Health Care Policy and Research
ACOG: American College of Obstetricians and Gynecologists
AGUS: atypical glandular cells of uncertain significance
AHA: American Hospital Association
AJCC: American Joint Committee on Cancer
ASCUS: atypical squamous cells of uncertain significance
CAP: College of American Pathologists
CAT: computerized axial tomography
CDC: Centers for Disease Control
CE: cost-effectiveness
CER: cost-effectiveness ratio
CI: confidence interval
CIN: cervical intraepithelial neoplasia
CINAHL: Cumulative Index to Nursing & Allied Health
CIS: carcinoma in situ
CLIA: Clinical Laboratory Improvement Amendments
COBRA: Consolidated Omnibus Budget Reconciliation Act
CPS: Consumer Price Survey
CPT: Current Procedural Terminology
DRG: diagnosis-related group
EPC: Evidence-based Practice Center
FDA: Food and Drug Administration
FIGO: Fédération Internationale de Gynécologie et d'Obstétrique
FN: false negative
FNR: false negative rate
FPR: false positive rate
HCFA: Health Care Financing Administration
HPV: human papilloma virus
HSIL: high-grade squamous intraepithelial lesion
IARC: International Agency for Research on Cancer
ICD: International Classification of Diseases
INNA: interactive neural network-assisted
LEEP: loop electrosurgical excision procedure
LSIL: low-grade squamous intraepithelial lesion
Med.: median
MEI: Medicare Economic Index
MRI: magnetic resonance imaging
NAMCS: National Ambulatory Medical Care Survey
NSI: Neuromedical Systems Inc.
Obs.: observations
OTA: Office of Technology Assessment
Pap: Papanicolaou
PCE: patient care evaluation
Q1: quartile 1
Q3: quartile 3
RBRVS: resource-based relative value scale
ROC: receiver operating characteristic
SBLB: Satisfactory but limited by
SEER: Surveillance, Epidemiology, and End Results
SIL: squamous intraepithelial lesion
STD: sexually transmitted disease
TNM: Tumor, Nodes and Metastases
TPR: true positive rate
WHO: World Health Organization
Douglas C. McCrory, MD, MHSc
Task Order Director and EPC Co-Director
Duke Center for Clinical Health Policy Research;
Department of Medicine, Division of General Internal Medicine;
Durham Veterans Affairs Medical Center
Lori Bastian, MD
Department of Medicine, Division of General Internal Medicine;
Durham Veterans Affairs Medical Center
Santanu Datta, MS, MBA
Duke Center for Clinical Health Policy Research
Vic Hasselblad, PhD
Department of Community and Family Medicine
Jason Hickey, MS
Duke University Medical School
David B. Matchar, MD
EPC Co-Director
Duke Center for Clinical Health Policy Research;
Department of Medicine, Division of General Internal Medicine;
Durham Veterans Affairs Medical Center
Evan Myers, MD, MPH
Department of Obstetrics & Gynecology
Kavita Nanda, MD
Durham Veterans Affairs Medical Center;
Department of Obstetrics & Gynecology
(currently at Family Health International, Durham, NC)
Paul Abrahamse, MS
Duke Center for Clinical Health Policy Research
(currently in Ann Arbor, MI)
Ruth Goslin, MAT
Duke Center for Clinical Health Policy Research
Rebecca N. Gray, D.Phil.
Duke Center for Clinical Health Policy Research
Jane T. Kolimaga, MA
Project Manager
Duke Center for Clinical Health Policy Research
Nancy McCall, PhD
Sujha Subramanian, PhD
(currently at Boston Scientific Corporation)
Leslie Dodd, MD
Duke University Medical Center
Margaret Gradison, MD
Duke University Medical Center
Andrea W. McChesney, FNP
Durham Veterans Affairs Medical Center
Charles S. Morrison, PhD
Family Health International
Kenneth L. Noller, MD
University of Massachusetts
Joanne T. Piscitelli, MD
Duke University Medical Center
David N. Sundwall, MD
American Clinical Laboratory Association
Stanley Zinberg, MD, MS, FACOG
American College of Obstetricians and Gynecologists
Adalsteinn D. Brown, AB
University of Western Ontario
Charles Dunton, MD
Thomas Jefferson University Hospital
Alex Ferenczy, MD
Jewish General Hospital, Quebec
Daron G. Ferris, MD
Medical College of Georgia
Bruce Flamm, MD
Kaiser Permanente, Riverside, CA
Amy Fremgen, PhD
American College of Surgeons
Alan M. Garber, MD, PhD
Stanford University
Les Irwig, MBBCh, PhD
University of Sydney
Paul Krieger, MD
Quest Diagnostics, Inc.
Nancy C. Lee, MD
Centers for Disease Control and Prevention
James Linder, MD
University of Nebraska; Cytyc Corporation
Jeanne Mandelblatt, MD
Georgetown University
Laurie J. Mango, MD
Neuromedical Systems, Inc.
C. Jay Marshall, MD
ARUP Laboratories
Charles S. Morrison, PhD
Family Health International
Kenneth L. Noller, MD
University of Massachusetts
Jeffrey F. Peipert, MD, MPH
Women & Infants Hospital, Providence, RI
Carolyn D. Runowicz, MD
Albert Einstein College of Medicine
David N. Sundwall, MD
American Clinical Laboratory Association
Robin T. Vollmer, MD
Durham Veterans Affairs Medical Center
David C. Wilbur, MD
University of Rochester; NeoPath, Inc.
Stanley Zinberg, MD, MS, FACOG
American College of Obstetricians and Gynecologists
Jean Slutsky, PA, MSPH
1 Code appears when article fails to meet criterion.
For articles that model health outcomes:
For articles that model costs:
For articles that compare health outcomes with and without Pap smear screening, or with and without use of new Pap technology: