NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Gutman SI, Piper M, Grant MD, et al. Progression-Free Survival: What Does It Mean for Psychological Well-Being or Quality of Life? [Internet] Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Apr.


Progression-free survival (PFS) is defined as the time from random assignment in a clinical trial to disease progression or death from any cause. PFS as an outcome is of interest to a variety of disciplines, most especially, for purposes of this project, to oncologists, pharmacologists, trialists, social scientists, and other scientists with interest in designing or interpreting clinical trials. This background section addresses how PFS is used, its role as a surrogate for overall survival (OS), the challenges it presents in obtaining accurate and reproducible measurements, and finally its role as a health outcome.
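As an illustration of this definition, the following minimal sketch derives a PFS time and an event indicator for a single patient. The function name, the argument names, and the choice to censor at the last assessment when neither progression nor death has been recorded are assumptions made for this example, not part of any particular trial protocol.

```python
from datetime import date
from typing import Optional, Tuple


def pfs_time(randomized: date,
             progression: Optional[date],
             death: Optional[date],
             last_assessment: date) -> Tuple[int, bool]:
    """Return (days, event_observed) for one patient.

    The PFS event is whichever of progression or death (from any cause)
    comes first. If neither has been recorded, the patient is treated as
    censored at the last assessment (an assumption of this sketch).
    """
    events = [d for d in (progression, death) if d is not None]
    if events:
        return (min(events) - randomized).days, True
    return (last_assessment - randomized).days, False


if __name__ == "__main__":
    # Hypothetical patient: progression precedes death, so progression defines PFS.
    print(pfs_time(date(2020, 1, 1), date(2020, 7, 1), date(2021, 1, 1), date(2021, 1, 1)))
    # Hypothetical patient with no event yet: censored at the last assessment.
    print(pfs_time(date(2020, 1, 1), None, None, date(2020, 3, 1)))
```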

Overall Survival

This section briefly reviews the use of OS as a standard outcome, and the reasons why other survival outcomes, such as PFS, have garnered interest from the clinical research community. OS has long been considered by the Food and Drug Administration (FDA) and the European Medicines Agency as the gold standard for the evaluation of new oncologic therapies.1 It is defined as the time from random assignment to the date of death due to any cause, or to the date of censoring at the last time the subject was known to be alive in intention-to-treat populations. OS is “an unambiguous endpoint measure because it is evaluated on a continuous time scale, which gives precise accuracy for the time of the event.”2

However, the use of OS can be challenging. For example, if survival is only incrementally improved by a new treatment, the demonstration of increased OS may require large patient populations, several years of accrual and followup, and higher costs.3, 4 This is especially true if the natural history of the disease course is lengthy.

The Growing Interest in PFS

Over the past 10 years there has been increasing interest in the use of outcomes other than OS to study new drugs, including PFS. The interest in PFS stems in part from the challenges associated with OS as an endpoint, but it has also been fueled by the fact that many new drugs are targeted toward molecular mechanisms of action that are cytostatic rather than cytotoxic. These drugs are not expected to provide the same objective response rates as earlier drugs; instead, they act to prevent progression rather than to cause tumors to regress and thereby impact mortality. Interest in PFS has also been sparked by the increasingly common use of treatment paradigms that allow for multiple rounds of drug treatment (first-, second-, third-, and even fourth-line therapies), each producing incremental changes difficult to capture in the context of a single study using OS as the primary endpoint. In contrast, PFS can be studied in the short-term context of each treatment, without the confounding influence of the next. The FDA has published a regulation (21 CFR 314, subpart H) that allows the use of PFS or other surrogate clinical endpoints other than survival or irreversible morbidity in the accelerated approval of new drugs for serious or life-threatening illnesses.

While this methods project focuses specifically on PFS, it is recognized that there is widespread interest in a number of alternative endpoints, including disease-free survival, relapse-free survival, time to progression, and objective response rate. For informational interest, definitions of these can be found in Appendix A.

Relationship of PFS to OS

The Issue of Surrogacy

The term “clinical endpoint” has been defined by the Biomarkers Definition Working Group as an outcome that measures how a patient feels, functions, or survives.5 The term “surrogate endpoint” is an outcome measure that has been validated as an adequate substitute for the clinical endpoint. Ideally, identification of a surrogate endpoint in a drug study provides a reliable signal that the clinical endpoint of the study has been met. Surrogate endpoints may be laboratory variables, single measures of disease activity (recurrence, progression, etc.), or composite measures of disease activity. It is important to note that the validation of a surrogate endpoint requires evidence that goes beyond merely showing a statistical association between the surrogate and clinical endpoints. As noted by Shi and Sargent,6 as a guiding principle, “the treatment effect observed on a valid surrogate endpoint (substitute) should reliably and precisely predict the treatment effect on the clinical endpoint (entity being replaced).” A number of statistical methodologies can be used, including hypothesis testing,7 estimation and prediction,8, 9 and meta-analytical approaches.6

PFS as a Surrogate for OS

Considerable interest has been focused on the use of PFS as a surrogate endpoint for predicting OS. It is now recognized that the correlation between PFS and OS is both variable and unpredictable and depends on tumor type and tumor stage, as well as the particular drug being investigated.6 That PFS is not always a reliable surrogate for OS is not entirely surprising, given that the tumor pathways affected by new drugs and the nature of drug and tumor interaction, as well as drug toxicity, are often incompletely known.

Broglio and Berry10 have recently performed simulation studies partitioning OS into two parts, the first PFS, and the second what they call survival post-progression (SPP). They defined SPP as OS minus PFS. Using preset 6- and 9-month medians for PFS in each arm of a hypothetical two-arm study, they concluded that a statistically significant increase in OS was detected with 90 percent probability if median SPP was 2 months, but less than 20 percent if median SPP was 24 months. They recommended PFS be used as a primary endpoint only when median SPP is short. These conclusions are confirmed by Amir et al.,11 who evaluated 26 studies of chemotherapy for solid tumors in which a hazard ratio was reported for both OS and PFS (or time to progression, related to PFS, see definition, Appendix A). They also found a higher correlation between OS and PFS when SPP is short than when SPP is long. However, even in instances in which SPP was less than 12 months, they identified only a moderate correlation coefficient of 0.64 between PFS and OS.
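The dilution effect described above can be sketched with a small simulation. This is not a reproduction of Broglio and Berry's method. It assumes exponentially distributed PFS (medians of 6 and 9 months, as in the text) and exponentially distributed SPP with the same median in both arms, no censoring, a hypothetical sample size of 200 patients per arm, and a hand-written two-sided log-rank test at the 0.05 level. All function names are illustrative.

```python
import math
import random


def exp_sample(rng, median):
    """Exponential draw parameterized by its median (rate = ln 2 / median)."""
    return rng.expovariate(math.log(2) / median)


def logrank_z(control, treated):
    """Log-rank z statistic for two samples in which every time is an event.

    Assumes no censoring and no tied times (true for continuous draws).
    """
    pooled = sorted([(t, 0) for t in control] + [(t, 1) for t in treated])
    at_risk = [len(control), len(treated)]
    obs_minus_exp = 0.0
    variance = 0.0
    for _, arm in pooled:
        n = at_risk[0] + at_risk[1]
        p_treated = at_risk[1] / n          # expected share of this event in the treated arm
        obs_minus_exp += (1 if arm == 1 else 0) - p_treated
        variance += p_treated * (1 - p_treated)
        at_risk[arm] -= 1
    return obs_minus_exp / math.sqrt(variance)


def os_power(median_spp, n_per_arm=200, n_trials=200, seed=0):
    """Fraction of simulated trials with a significant OS difference (|z| > 1.96)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # OS is built as PFS + SPP; only PFS differs between arms.
        control = [exp_sample(rng, 6) + exp_sample(rng, median_spp) for _ in range(n_per_arm)]
        treated = [exp_sample(rng, 9) + exp_sample(rng, median_spp) for _ in range(n_per_arm)]
        if abs(logrank_z(control, treated)) > 1.96:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for spp in (2, 24):
        print(f"median SPP = {spp} months: estimated power for OS = {os_power(spp):.2f}")
```

Under these assumptions, the same PFS benefit is detected as an OS benefit far more often when SPP is short than when it is long, because a long and variable SPP adds noise that is unrelated to treatment. The exact power values depend on the assumed sample size and distributions.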

PFS as an OS Surrogate for Specific Cancers

Efforts to establish PFS as a surrogate for OS in oncology trials have had variable results depending on the specific cancer. For example, several studies have shown that PFS is a valid surrogate for OS in colorectal cancer,12–15 and it has been argued that PFS is a reasonable primary endpoint for the disease on its own merit.16, 17 Similar conclusions have been reached about PFS as a surrogate for OS in first-line therapy for ovarian cancer.18–20 Expert panelists at two major workshops agreed, however, that the PFS to OS relationship with regard to ovarian cancer may be different for different patient groups or for first-line compared with second- or third-line therapy.18, 20

Contrary to the relative success of PFS as a surrogate endpoint in first-line treatment of colorectal and ovarian cancer, a strong relationship between PFS and OS has not been demonstrated in studies of metastatic breast cancer.4, 12, 14, 21

Depending on the toxicity of a new drug that has been found to increase PFS, it is possible to postulate scenarios in which treatment accelerates both psychological and physical morbidity, resulting in decreased patient quality of life (QOL). In the worst case scenario, use of a drug to increase PFS may have the unanticipated downside of actually compromising the balance between tumor and patient resistance to tumor, causing a shorter rather than a longer duration in OS. As witnessed in the recent FDA decision to remove the indication for use of bevacizumab (Avastin®, Genentech/Roche) in breast cancer, there are strong feelings about both the use and interpretation of PFS, as well as differing opinions on the risk-to-benefit value of drug efficacy and toxicity.

Issues in the Measurement and Reporting of PFS

There are several potential sources of measurement bias or variability in studies using PFS as a primary endpoint. Potential for bias should be addressed prospectively in trial designs to ensure the validity of any differences in PFS found between treatment arms.3, 22–25 This section describes the main sources of potential bias, as well as suggested mechanisms for controlling their impact. In addition to four major sources of bias (assessment, evaluation, performance, and attrition), it describes the role of detection error. Like bias, this measurement issue can lead to incorrect conclusions about the performance of drugs.

Assessment Bias

The exact date of progression cannot be known, since it is determined based on the types and timing of assessments. At the point in time that progression is identified, it is only known that this event occurred at some point between the last negative evaluation and the one at which this reclassification of disease status occurs. In general, the date of first progression is taken as the date of the evaluation at which progression was first evident, which is likely to be an overestimate of PFS. As Panageas et al.26 have recently noted, in a trial this overestimation of median PFS can lead to erroneous conclusions about new treatments, suggesting benefits that in fact may not actually exist.

Use of the last date the patient was identified to be progression-free, or of an intermediate point such as the midpoint between the two dates, has been considered as a possible alternative or additional mechanism for reporting PFS. The former may underestimate PFS and the latter, like the date of first identification, likely overestimates PFS. According to Panageas et al.,26 what is important in capturing an accurate PFS measurement is the timing of the measurement interval in relationship to the true median PFS. They suggested PFS be characterized using interval reporting, in which estimates of this event are characterized by the time interval in which they occur. Zhuang et al.25 recommended that the assessment interval not exceed the expected improvement in median PFS in the experimental versus control arm.
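The candidate progression dates discussed above can be illustrated with a short sketch. It assumes an idealized schedule in which scans occur at exact multiples of a fixed interval after random assignment, with no missed or unscheduled visits; the function name and the example values are hypothetical.

```python
import math


def progression_estimates(true_time, scan_interval):
    """Return candidate progression dates for one patient.

    true_time is the (unobservable) time of progression, measured from
    random assignment; scans are assumed to fall at exact multiples of
    scan_interval.
    """
    # First scheduled scan at or after the true progression time.
    first_positive = math.ceil(true_time / scan_interval) * scan_interval
    # Previous scan, at which the patient was still judged progression-free.
    last_negative = first_positive - scan_interval
    midpoint = (first_positive + last_negative) / 2
    return {
        "last_negative": last_negative,
        "midpoint": midpoint,
        "first_positive": first_positive,
    }


if __name__ == "__main__":
    # Hypothetical patient progressing at 4.2 months, scanned every 3 months.
    print(progression_estimates(4.2, 3))
```

In this idealized setting the true progression time always lies in the interval bounded by the last negative scan and the first positive scan, which is the interval that would be reported under the interval approach.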

Freidlin et al.27 recommended the use of two preselected scan timepoints with strictly chosen schedule limits, instead of multiple regular testing intervals. For optimal evaluation of performance, they suggested the selected timepoints represent the median PFS and twice the median PFS expected in the control arm of a study, and that a significance test of the difference in PFS rates at the two scan timepoints be assessed on the grouped data.

It is important to note that progression may be detected as a result of the occurrence of symptoms which cause the patient to receive what would otherwise be an unscheduled imaging exam. These unforeseen events clearly must be addressed in the protocol of studies and data should be collected and analyzed in a manner that accounts for them.

Evaluation Bias

Another timing bias has to do with unevenness in the timing of tumor assessment between the two treatment arms. This can result in progression being identified earlier in one arm than the other, even when there is no actual difference in efficacy.3, 25 Asymmetry can result when assessments are scheduled around the treatment cycle and one arm has more cycle delays than the other, or when there is disparity in unscheduled or missed visits between the two treatment arms. Small treatment-related differences in measurement time (as short as 2 days) have been reported to result in false study conclusions.
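A simplified sketch of this effect follows. It gives both arms identical, hypothetical true progression times and varies only the effective interval between scans, as might happen if cycle delays pushed assessments back in one arm. Progression is recorded at the first scheduled scan at or after the true time. The data and function names are assumptions for illustration only.

```python
import math
import statistics


def recorded_pfs(true_times, scan_interval):
    """Time of the first scheduled scan at or after each true progression time."""
    return [math.ceil(t / scan_interval) * scan_interval for t in true_times]


def apparent_median_pfs(true_times, scan_interval):
    """Median PFS as it would be recorded under a fixed scan interval."""
    return statistics.median(recorded_pfs(true_times, scan_interval))


# Hypothetical true progression times in months, identical in both arms.
TRUE_TIMES = [2.5, 4.0, 5.5, 7.0, 8.5, 10.0, 11.5]

if __name__ == "__main__":
    print("true median:", statistics.median(TRUE_TIMES))
    print("arm scanned every 2 months:", apparent_median_pfs(TRUE_TIMES, 2.0))
    print("arm scanned every 3 months:", apparent_median_pfs(TRUE_TIMES, 3.0))
```

With these inputs the arm scanned less often shows a longer recorded median PFS even though the underlying progression times are the same, which is the kind of spurious difference that balanced assessment schedules are meant to prevent.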

For a reliable measurement, patients must be evaluated on a regular and balanced basis across treatment arms. While most well-designed comparative studies address this issue, they should still be monitored for asymmetry. It is important to report and analyze progression events confirmed at preplanned timepoints and at unscheduled visits. One statistical technique for assessing this form of bias is to perform sensitivity analyses to examine the strength of a positive result in a clinical trial relative to the sources of bias.28

Performance Bias

The patient’s response to treatment or progression status may be influenced by knowledge of the treatment arm.3, 29, 30 Physicians treating patients in an experimental drug study may believe the drug offers the best treatment outcome and, as a result, be inclined to under-diagnose progression, leaving patients on the experimental drug for a longer period of time. Conversely, physicians may over-diagnose progression in patients receiving control therapy in order to assure an opportunity for cross-over to the experimental drug.

The ideal mechanism for addressing performance bias is to perform a double-blinded study. Unfortunately, because of differences in drug administration or in the toxic profiles of treatments, blinding is not always possible. One mechanism for addressing bias in local evaluations, particularly in studies using standardized radiologic endpoints, is use of blinded independent central review (BICR).3, 24, 29, 30 Radiological images being evaluated are blindly and independently reviewed by an outside centralized, often expert, group of readers. Conditions for reading are standardized as much as possible (e.g., images are evaluated serially to assure changes from baseline are carefully tracked).

Although recommended in regulatory guidance,31 BICR has generated some controversy because of its complexity and cost.24, 29, 32 In addition, depending on the timing of central review, discrepancies in evaluation of progression between local and central reviews can lead to informative censoring (i.e., removal of patients from study who were identified as progressed by local evaluators, but not confirmed by central review), leading to potential bias. Suggestions to address this problem include performing BICR in real time and feeding back results to local study sites or designing studies to allow for continued evaluation of progression for at least one scan after local progression is called.

Tang et al.33 studied eight trials using PFS as an endpoint, comparing results of local evaluations with those of BICR. They concluded that although benefits of treatment could be quite variable (−2 to 2.4 months), there was no evidence of systematic bias. Amit et al.,34 in a meta-analysis of 27 blinded studies with independent central review of progression performed by the Pharmaceutical Research and Manufacturers of America, reported a strong correlation between local evaluation and BICR. They concluded that when studies are blinded, and/or when large effect sizes are anticipated, a BICR might not be warranted. They described a sample-based approach to BICR that defines early and late discrepancy rates between local and central review. Although they did not define a threshold for differential discordance, they suggested this discordance be used in decision-making about whether a full BICR is necessary. Unfortunately, it appears that small but potentially significant differences (up to 15 percent discordance) are not detected by their approach.32

Attrition Bias

When too many patients withdraw from a study or are lost to followup, and when losses are not at random, remaining results may be biased. This is especially problematic when attrition is greater in one arm than the other.3, 25 PFS data are censored at the time of last available assessment, so the proportion of censored patients should be reported for both treatment arms, along with the reasons for censorship. Short assessment intervals and ongoing physician and patient education regarding the goal of treatment have been found to help minimize patient withdrawal and loss to followup.25 Sensitivity analysis can be performed to look at various subgroups of patients subject to attrition bias, as well as the effect of attrition bias in total, to assess the impact of this bias on study conclusions.28
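The per-arm reporting recommended above can be sketched as a simple tabulation. The record layout, arm labels, and censoring reasons below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def censoring_summary(records):
    """Summarize censoring by treatment arm.

    records is an iterable of (arm, censor_reason) pairs, where
    censor_reason is None for patients with an observed PFS event.
    """
    summary = {}
    for arm, reason in records:
        entry = summary.setdefault(arm, {"n": 0, "censored": 0, "reasons": Counter()})
        entry["n"] += 1
        if reason is not None:
            entry["censored"] += 1
            entry["reasons"][reason] += 1
    for entry in summary.values():
        entry["censored_fraction"] = entry["censored"] / entry["n"]
    return summary


if __name__ == "__main__":
    # Hypothetical records: None means progression or death was observed.
    records = [
        ("control", None), ("control", "lost to followup"),
        ("control", None), ("control", None),
        ("experimental", "withdrew consent"), ("experimental", "lost to followup"),
        ("experimental", None), ("experimental", None),
    ]
    for arm, entry in censoring_summary(records).items():
        print(arm, entry["censored_fraction"], dict(entry["reasons"]))
```

A marked imbalance in the censored fraction between arms, or in the reasons given, would be a signal to run the sensitivity analyses described in the text.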

Detection Error

PFS is most commonly a composite endpoint including radiologic progression, death, and in some cases, nonradiologic criteria, such as symptomatic progression. While death is an absolute endpoint, radiologic progression is a subjective measurement prone to reading errors by the radiologist. Errors in identification of these endpoints are referred to as detection errors.

Dancey et al.3 have identified four criteria to establish progression: the appearance of new radiologic lesions; an increase in the size of measurable target lesions; a clear, unequivocal increase in nontarget disease; and/or worsening of nonradiologic signs and symptoms.

There are numerous caveats associated with these criteria. New radiologic lesions, for example, must be unequivocal and significant enough in size to avoid the measurement error in the methodology being used. Oxnard et al.,35 recently studied 30 patients with non-small cell carcinoma of the lung (1 cm or larger in size), undergoing two computed tomography (CT) scan evaluations within a 15-minute interval. All scans were read side-by-side by three radiologists. Measurement changes were within ± 10 percent for 84 percent of measurements. Changes of 20 percent or more were observed in 3 percent of measurements. They concluded that CT scan measurement of lung lesions has clinically meaningful variability and suggest caution in the interpretation of small changes in lesion size in the care of individual patients and in the interpretation of clinical trial results.

Identification of new lesions should unequivocally demonstrate a metastatic deposit. In order to be certain this is the case, baseline anatomic scanning is required to detect the presence of disease in all areas likely to be the site of metastases based on what is known about the tumor being evaluated.

Most problematic is establishing progression in disease that is detectable, but not measurable. Assessing a worsening of disease burden, such as disease-related symptoms, from a nonmeasurable baseline, is a largely subjective determination. Efforts should be directed at creating an operational definition of progression of nonmeasurable disease. Commonly, these will be imaging studies at lesion sites difficult to quantify and/or evidence of adverse events related to disease. The additional data elements identified should be relevant to the disease setting, clearly understood, collected in case report forms, and appropriately included in the analysis plan. Optimally, these elements would lend themselves to independent verification.

If changes indicating disease progression are equivocal, Dancey et al.,3 recommend, when medically possible, that the patient remain on study until progression is unequivocal. At that time, a decision would need to be made as to whether the progression date is backdated to the first equivocal finding or recorded as the date of the unequivocal determination. Of note, measurement variability will generally not lead to study bias since it occurs in all treatment arms of the study. It is, however, likely to lead to failure to identify changes in disease status that result from use of a new drug.

PFS as a Health Outcome

Many advocates for the use of PFS as an endpoint contend that delaying tumor progression independently confers clinical benefit, since being progression-free is considered an indication of disease control and stabilization. A direct result should be stability, and perhaps even reduction, in disease symptoms, thus improving QOL for patients. While in PFS, patients are spared the symptoms of progressive disease, further treatment with additional therapies and their attendant toxicities, and the psychological burden and uncertainty associated with disease progression.36 In this scenario, the main impact of PFS is expected to be on QOL, which may or may not represent a causal relationship. Of interest in exploring this relationship is the question of how knowledge about PFS can impact perception of patient symptoms and other more global measures of QOL.

As noted by Fallowfield and Fleissig in their abstract, “New treatments that increase PFS may not be of sufficient value to patients with advanced-stage cancer unless accompanied by tangible quantity or QOL advantages. Any symptom relief that patients gain from treatment resulting in tumor shrinkage or stabilization must be balanced against the toxic effects that drug therapy itself creates.”37 A task force—Assessing the Symptoms of Cancer Using Patient-Reported Outcomes (ASCPRO) Multisymptom Task Force—has recently proposed that the measurement of symptomatic change, as a subset of QOL, may be a sufficient outcome in clinical trials to allow health providers, patients and regulators confidence in the use of new treatments.38 Of note, the FDA has explicitly included a symptom benefit as an option for the pharmaceutical industry for both anticancer and supportive-care agents in its 2007 “Guidance for Industry: Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics.”39

Recent studies have used conjoint analysis to try to understand and quantify the value of PFS to patients versus avoidance of risk of toxicities. Mohamed et al.,40 evaluated the benefit-risk preferences of patients with renal cell carcinoma, using a series of 12 tradeoff questions to determine what magnitude of PFS improvement was worth significant treatment-related risks. Patients were willing to accept significant treatment-related risks of 2 to 3 percent for liver failure and blood clot to increase PFS by 11 months.

Bridges et al.,41 in a conjoint analysis of patients with advanced non–small-cell lung cancer, examined the tradeoffs patients were willing to make between increased PFS and the risk of experiencing disease symptoms, such as fatigue, diarrhea, nausea and vomiting, fever, infection, and rash. They concluded the value patients attribute to an increase in PFS was conditional upon the severity of disease symptoms experienced. These studies suggest that QOL, in relation to PFS, is important to patients. As Hartzband and Groopman42 have recently noted “basing decisions on the outcome of death ignores vital dimensions of life that are not easily quantified. …There is more to life than death.”

Because PFS itself is an outcome, it is not possible to study it in the same manner applied to a drug treatment. Normally, randomization of patients to either intervention or control arms in a clinical trial ensures the equal distribution of confounding variables. However, because PFS is an outcome, it cannot be predicted in advance, and patients cannot be randomized according to its improvement or lack thereof. Thus, an important aspect to the investigation of PFS, and its impact on other outcomes of importance to patients, is an examination of how the relationship is studied and whether it is feasible to clearly define the relationship exclusive of other factors.

The objective of this methods project was to determine whether PFS is an outcome related to psychological well-being or QOL.

