NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Simpkin AJ, Rooshenas L, Wade J, et al. Development, validation and evaluation of an instrument for active monitoring of men with clinically localised prostate cancer: systematic review, cohort studies and qualitative study. Southampton (UK): NIHR Journals Library; 2015 Jul. (Health Services and Delivery Research, No. 3.30.)

Development, validation and evaluation of an instrument for active monitoring of men with clinically localised prostate cancer: systematic review, cohort studies and qualitative study.

Chapter 4 Testing the accuracy of the prostate-specific antigen model in four separate cohorts

Aims and background

Here, we test whether or not the model for PSA presented in Chapter 3 is suitable for general application, by using it to predict circulating PSA levels in four external cohorts of men on AM/AS. We summarise the predictions for each cohort comparing the ProtecT model with a model based on all the data from each cohort in turn. If the ProtecT model is found to describe age-related PSA changes in other cohorts, then each PSA measured for a man in any clinic can be compared with the value expected with age-related change.

Methods

Study populations

The RMH AS cohort is an ongoing study into the impact of initial conservative management of clinically localised PCa [101]. Data from 499 men, comprising 9472 PSA tests along with Gleason score and several other clinical covariates, were available. PSA test results were obtained between 1999 and 2012. The study eligibility criteria were baseline PSA < 15 ng/ml, Gleason score of ≤ 3 + 4 and percentage of positive biopsy cores ≤ 50%. In most men, diagnosis was based on a raised PSA and subsequent positive biopsy, so they represent a modern AS cohort. Men on AS were followed up with PSA tests every 3–4 months in the first 2 years and every 6 months thereafter. Biochemical progression was defined as PSA velocity > 1 ng/ml/year, whereas histological progression on rebiopsy was defined as having a Gleason score of 4 + 3 or > 50% positive biopsy cores. Clinical outcomes of metastases and PCSM were collected.

The Johns Hopkins (JH) AS programme began recruitment of men with clinically localised PCa in 1995 [82]. Men were eligible if they had a Gleason score of ≤ 3 + 3, T1c, PSAD < 0.15 ng/ml/cm³, two or fewer positive biopsy cores and maximum involvement of 50% per core. Data from 961 men, comprising 9993 PSA test results performed between 1993 and 2012, along with diagnostic Gleason score and several other clinical covariates, were available. Radical treatment was recommended once men no longer met the eligibility criteria described above. Clinical outcomes of all-cause mortality and PCSM were included in the data.

The SPCG4 data contained 290 men with 2987 PSA tests [12]. These men were randomised to WW as part of the SPCG4 RCT comparing WW and radical prostatectomy. They were diagnosed between 1989 and 1999, for the most part through clinical presentation with symptoms. To be suitable for randomisation the men were required to be < 75 years of age, with a life expectancy of > 10 years. Data were available on whether or not the men developed metastases and/or died from PCa.

Data received from the University of Connecticut Health Center (UCHC) cohort consist of 114 men with 884 PSA test results [102], followed for an average of 4.7 years (SD 3.9 years). Men were diagnosed between 1989 and 1993 (before the advent of widespread PSA testing), for the most part by clinical presentation with symptoms. Fewer baseline clinical exposures were available than from other cohorts, and information was not present on whether or not men developed metastases or died from PCa.

These four cohorts come from two eras, the PSA detection era (RMH and JH) and the clinical detection era (UCHC and SPCG4). As described above, the advent of PSA screening resulted in many more men being diagnosed with PCa at an earlier stage than before. The modern cohorts also come from populations with different screening prevalence. In the USA there is widespread PSA screening, whereby men are likely to have several PSA tests through their lifetime. Thus, men diagnosed with PCa in the USA are likely to have had mostly ‘normal’ PSA results before the raised value, which resulted in a biopsy and subsequent diagnosis. In the UK there is no such screening programme, and the men diagnosed with PCa in RMH may have had any level of PSA before the single high PSA that led to their diagnosis. Hence, the US men are likely to be a lower-risk group with lower PSA on average than their UK counterparts. These issues, and their impact on the poorly understood natural history of PCa, need to be considered when interpreting findings from the cohorts.

The coefficients found in the ProtecT trial model were applied to data from the RMH and JH AS cohorts as well as the UCHC and SPCG4 WW cohorts. All data were restricted to PSA test results ≤ 50 ng/ml and to men with an initial PSA ≤ 20 ng/ml, to exclude values that would be atypical in a modern AM/AS study. Two cohorts of men without PCa were also included to examine differences in PSA change between men with and without cancer. Data from the Baltimore Longitudinal Study of Aging (BLSA) [103] contained 5012 repeated PSA measurements from 1032 men. A model for PSA change in 1432 men without cancer from the Krimpen study [104], a large prospective community-based study in the Netherlands, has appeared elsewhere [35].
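The two data restrictions (initial PSA ≤ 20 ng/ml per man, and PSA ≤ 50 ng/ml per test) can be sketched as a simple filter. This is a minimal illustration only: the data layout, the function name restrict_cohort and the order in which the two rules are applied are assumptions, and the example values are not study data.

```python
def restrict_cohort(records, max_initial_psa=20.0, max_psa=50.0):
    """Apply the two restrictions described in the text.

    records: dict mapping a man's id to a list of (age, psa) tuples,
             sorted by age (hypothetical layout).
    Men whose first PSA exceeds max_initial_psa are dropped entirely;
    for the remaining men, individual tests above max_psa are dropped.
    """
    restricted = {}
    for man_id, tests in records.items():
        # Exclude the man if his initial PSA is above the threshold.
        if not tests or tests[0][1] > max_initial_psa:
            continue
        # Keep only the test results at or below the per-test threshold.
        kept = [(age, psa) for age, psa in tests if psa <= max_psa]
        if kept:
            restricted[man_id] = kept
    return restricted


# Illustrative values only, not study data.
example = {
    "A": [(62.0, 6.1), (62.5, 7.0), (63.0, 55.0)],  # last test exceeds 50 ng/ml
    "B": [(70.0, 24.0), (70.5, 26.0)],              # initial PSA exceeds 20 ng/ml
}
cleaned = restrict_cohort(example)
```

In this example man B is excluded entirely, and man A keeps his first two test results.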

Measuring performance

First, to investigate the behaviour of PSA between cohorts, we fit a simple random intercept and slope multilevel model to log(PSA) in each of these four cohorts:

$$\log(\mathrm{PSA}_{ij}) = \beta_0 + u_{0i} + (\beta_1 + u_{1i})\,\mathrm{age}_{ij} + \varepsilon_{ij}, \tag{4}$$

where $\mathrm{PSA}_{ij}$ is the $j$th PSA test result for man $i$, $u_{0i}$ is the random intercept, which allows each man to have his own adjustment to the intercept $\beta_0$ [i.e. the average value of log(PSA) at age 50 years], $u_{1i}$ is the random slope, which allows each man to have his own adjustment to the slope $\beta_1$ [i.e. the average change in log(PSA) per year increase in age], and $\varepsilon_{ij}$ is the residual error. This is carried out to compare the age-related change in PSA for men with and without PCa.
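To make model (4) concrete, the sketch below back-transforms the linear predictor to the PSA scale for a single man. It assumes that age is centred at 50 years (so that the intercept is the average log(PSA) at age 50, as stated above) and it ignores the residual error term. The coefficient and random-effect values are hypothetical and are not estimates from any of the cohorts.

```python
import math


def predicted_psa(age, beta0, beta1, u0=0.0, u1=0.0, centre_age=50.0):
    """PSA (ng/ml) implied by the random intercept and slope model.

    beta0, beta1: fixed intercept and slope on the log(PSA) scale.
    u0, u1: the man's own adjustments to intercept and slope
            (both zero for the average man).
    Age is assumed to be centred at centre_age.
    """
    log_psa = beta0 + u0 + (beta1 + u1) * (age - centre_age)
    return math.exp(log_psa)


# Hypothetical coefficients: PSA of 2 ng/ml at age 50 and a rise of
# 0.03 in log(PSA) per year of age.
beta0 = math.log(2.0)
beta1 = 0.03

average_at_60 = predicted_psa(60.0, beta0, beta1)
# A man with a higher starting level and a steeper slope than average.
faster_man_at_60 = predicted_psa(60.0, beta0, beta1, u0=0.2, u1=0.01)
```

Because the model is linear in log(PSA), a constant slope corresponds to a constant percentage change in PSA per year rather than a constant absolute change.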

The coefficients for our model of PSA change have been previously estimated using a cohort of 512 men with clinically localised PCa participating in the ProtecT study [93]. In the present analysis, the model is used to predict PSA in each of four external cohorts. We measure the accuracy of the predictions using the average absolute difference between observed and predicted PSA value per PSA test:

$$\frac{\sum_{i=1}^{n}\sum_{j=1}^{n_i}\sqrt{\left(\mathrm{PSA}_{ij}-\widehat{\mathrm{PSA}}_{ij}\right)^{2}}}{N}, \tag{5}$$

where $\mathrm{PSA}_{ij}$ is the observed PSA test result for person $i = 1, \ldots, n$ measured at time $j = 1, \ldots, n_i$ and $\widehat{\mathrm{PSA}}_{ij}$ is the predicted PSA test result for person $i$ at time $j$. The number of PSA tests can be different from person to person, and the total number of PSA tests in the cohort is $N$. The average absolute difference between observed and predicted PSA will always be positive, with a value of zero if the model predicted PSA perfectly.
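The accuracy measure described above, the average absolute difference per PSA test, can be sketched as follows. The dictionary layout and the function name are assumptions made for illustration, and the numbers are invented.

```python
def average_absolute_difference(observed, predicted):
    """Average absolute difference between observed and predicted PSA.

    observed, predicted: dicts mapping a man's id to a list of PSA
    values (ng/ml), one per test, in the same order (hypothetical layout).
    The sum runs over every test of every man and is divided by the
    total number of tests N, so men with more tests contribute more.
    """
    total = 0.0
    n_tests = 0
    for man_id, obs_values in observed.items():
        pred_values = predicted[man_id]
        if len(obs_values) != len(pred_values):
            raise ValueError(f"mismatched test counts for man {man_id}")
        for obs, pred in zip(obs_values, pred_values):
            total += abs(obs - pred)
            n_tests += 1
    if n_tests == 0:
        raise ValueError("no PSA tests supplied")
    return total / n_tests


# Illustrative values only, not study data.
observed = {"A": [4.0, 5.0, 6.5], "B": [8.0, 9.0]}
predicted = {"A": [4.5, 5.0, 5.5], "B": [7.0, 11.0]}
result = average_absolute_difference(observed, predicted)
```

Here the five absolute differences are 0.5, 0, 1, 1 and 2 ng/ml, giving an average of 0.9 ng/ml per test.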

For each of the four cohorts, predictions are made using (a) a model with coefficients derived from the external data themselves; (b) the ProtecT model coefficients; and (c) the ProtecT model updated using the first three PSA values for each man in the external cohort [97,98]. Prediction (a) gives the hypothetical upper limit of performance but is not clinically useful, because the model coefficients cannot be estimated until all the PSA measurements have been taken over the duration of AM. Predictions (b) and (c) indicate what could be achieved in clinical practice, as they apply coefficients estimated using the ProtecT cohort to the other data sets, and so can be applied each time a new measure becomes available for a man on monitoring.
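The updating procedure itself is described in the cited references and is not reproduced here. The sketch below illustrates only the general idea behind prediction (c), using a deliberately simplified random-intercept model on the log scale: a man's first few PSA results are used to estimate his personal offset from the population curve, shrunk towards zero according to the between-man and residual variances. The random slope is ignored, and all coefficients and variances are hypothetical rather than ProtecT estimates.

```python
import math


def shrunken_intercept(ages, psas, beta0, beta1, var_u0, var_e, centre_age=50.0):
    """Shrinkage estimate of one man's random intercept.

    Under a random-intercept-only model for log(PSA), the estimate is
    the man's mean residual from the population line multiplied by
    var_u0 / (var_u0 + var_e / m), where m is the number of tests used.
    """
    residuals = [
        math.log(psa) - (beta0 + beta1 * (age - centre_age))
        for age, psa in zip(ages, psas)
    ]
    m = len(residuals)
    mean_residual = sum(residuals) / m
    shrinkage = var_u0 / (var_u0 + var_e / m)
    return shrinkage * mean_residual


# Hypothetical population coefficients and variances (not ProtecT estimates).
beta0, beta1 = math.log(2.0), 0.03
var_u0, var_e = 0.5, 0.05

# First three PSA results for one man (invented values).
first_ages = [60.0, 60.5, 61.0]
first_psas = [5.0, 5.2, 5.5]
u0_hat = shrunken_intercept(first_ages, first_psas, beta0, beta1, var_u0, var_e)

# Predictions at a later age, without and with the man's own offset.
age_next = 63.0
standard = math.exp(beta0 + beta1 * (age_next - 50.0))
updated = math.exp(beta0 + u0_hat + beta1 * (age_next - 50.0))
```

In this example the man's early results sit above the population curve, so the updated prediction is higher than the standard one. With more early tests, or a smaller residual variance, the estimate moves closer to the man's own mean residual.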

To calculate the coverage of the model in predicting PSA, we check whether or not a 95% prediction interval (calculated using unconditional standard errors) from the ProtecT model contains the corresponding observed value of PSA. We measure performance of the models further by tabulating the model failures, which we define as an absolute difference between predicted and observed PSA of > 5 ng/ml, and model successes, defined as a predicted PSA within 2 ng/ml of the observed PSA. These are tabulated to obtain the proportion of test-level failures/successes (i.e. for how many PSA test results does the model fail/succeed) and subject-level failures/successes (i.e. for how many men does the model fail/succeed on average across all their PSA test results). These cut-offs were chosen to reflect what we believe to be clinically significant ranges.
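The coverage, failure and success summaries described above can be sketched as follows. The tuple layout and function name are assumptions for illustration, the prediction interval limits are taken as given rather than computed, and the treatment of values exactly on a cut-off (strict inequalities here) is also an assumption.

```python
def performance_summary(tests, fail_cutoff=5.0, success_cutoff=2.0):
    """Coverage and test-level / subject-level failure and success rates.

    tests: dict mapping a man's id to a list of
           (observed, predicted, lower, upper) tuples, where lower and
           upper are the limits of the 95% prediction interval
           (hypothetical layout).
    """
    n_tests = covered = test_fail = test_success = 0
    n_men = man_fail = man_success = 0
    for rows in tests.values():
        if not rows:
            continue
        abs_diffs = []
        for observed, predicted, lower, upper in rows:
            diff = abs(observed - predicted)
            abs_diffs.append(diff)
            n_tests += 1
            covered += lower <= observed <= upper
            test_fail += diff > fail_cutoff
            test_success += diff < success_cutoff
        # Subject level: average the absolute differences over all of
        # the man's tests, then apply the same cut-offs.
        man_mean = sum(abs_diffs) / len(abs_diffs)
        n_men += 1
        man_fail += man_mean > fail_cutoff
        man_success += man_mean < success_cutoff
    return {
        "coverage": covered / n_tests,
        "test_failure": test_fail / n_tests,
        "test_success": test_success / n_tests,
        "subject_failure": man_fail / n_men,
        "subject_success": man_success / n_men,
    }


# Illustrative values only, not study data.
example = {
    "A": [(4.0, 4.5, 2.0, 9.0), (5.0, 5.0, 2.2, 10.0), (6.5, 5.5, 2.4, 11.0)],
    "B": [(8.0, 7.0, 3.0, 14.0), (15.0, 8.0, 3.5, 14.5)],
}
summary = performance_summary(example)
```

In this example four of the five observed values fall inside their intervals, one test is a failure, and man B counts as neither a success nor a failure because his average absolute difference is 4 ng/ml.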

Results

Age-related prostate-specific antigen change

Men on AM/AS and men without PCa have similar age-related PSA change (Table 8). For example, the PSA change per year is very similar in the Krimpen, BLSA, RMH and JH cohorts. Figure 6 shows the predicted average pattern of change if each cohort had an average PSA of 2 ng/ml at age 50 years. This hypothetical graph shows the similarities of the four modern cohorts involving men with or without PCa. However, the results from the multilevel models suggest that men without cancer have much lower average PSA values at age 50 years. In Figure 7 we see that the estimated average PSA at age 50 years is much lower in the Krimpen and BLSA cohorts.

TABLE 8. Coefficients from simple models of PSA change across six cohorts.

FIGURE 6. Hypothetical PSA change in the cohorts if average PSA at 50 years was 2 ng/ml in each.

FIGURE 7. Estimated PSA change in the cohorts.

Screen-detected cohorts

The model updated with the initial three PSA results is, on average, 2 ng/ml away from the true PSA for RMH and 1.8 ng/ml away from the true PSA for JH (Table 9). Models fitted using all the PSA data from these AS cohorts lead to an average difference between observed and predicted PSA of 1.1 ng/ml for RMH and 0.82 ng/ml for JH. This indicates the best prediction that can be achieved after collecting all PSA data from these cohorts. Furthermore, the coverage of the model (i.e. whether the observed value lies within the 95% prediction interval) is close to the nominal value of 95%, at 98% for RMH and 97% for JH. Updating the model using the initial three PSA values improved the accuracy of the ProtecT model. In both RMH and JH the difference between the observed and predicted PSA values reduces by 1.3 ng/ml per PSA test when using the updated model instead of the standard one.

TABLE 9. Average absolute difference between observed and predicted PSA and coverage of observed PSA using the ProtecT model on four separate cohorts.

Clinically detected cohorts

The ProtecT model achieves less accurate predictions in the older cohorts of SPCG4 and UCHC (see Table 9). Although an optimal model fitted using all the SPCG4 data gives values of PSA within an average of 1.7 ng/ml from the observed PSA, the ProtecT predictions differ by 4.6 ng/ml on average. Similarly, in the UCHC cohort, the ProtecT model leads to predictions that differ by 3.7 ng/ml from the observed PSA on average, whereas, for an optimal model fitted using all the UCHC data, the average difference between observed and predicted PSA is 1.3 ng/ml. However, coverage was 95% for SPCG4 and 96% for UCHC. The results from both these cohorts further demonstrate the improvements gained using the model updated with the initial three PSAs, with the average difference per test reducing by 2.8 ng/ml in SPCG4 and 1.9 ng/ml in UCHC.

Comparison across cohorts

A clear dichotomy between the cohorts is found in Table 10. Roughly 3–4% of men have an average absolute difference between predicted and observed PSA of > 5 ng/ml using the external model in the modern AS cohorts. However, this rises to 14% and 30% for the UCHC and SPCG4 cohorts, respectively. A similar divide is apparent in the percentage of PSA tests where the model fails. This is 7–8% for the modern AS cohorts but 20% and 25% in the UCHC and SPCG4 cohorts, respectively. The reverse of these results is also true, in that the number of model successes also shows a dichotomy between the cohorts (Table 11). Between 67% and 79% of men in the modern AS cohorts have an average difference between predicted and observed PSA of < 2 ng/ml across all their PSA test results. The two older cohorts have a lower proportion of men, between 39% and 51%, whose average absolute difference between predicted and observed PSA is within 2 ng/ml. The percentage of individual predicted PSAs within 2 ng/ml of the actual test results is 70–73% in the modern cohorts but 55% and 47% in UCHC and SPCG4, respectively.

TABLE 10. Model failure by study using prediction conditioned on first, second and third PSA.

TABLE 11. Model success by study using prediction conditioned on first, second and third PSA.

Summary

The ProtecT trial model leads to a useful prediction of PSA in AS cohorts, with up to 79% of men having an average difference between predicted and observed PSA of < 2 ng/ml. Predictions from the model were less accurate in the two older cohorts, made up of men probably at a later stage of disease and with higher PSA values. The model predicts with an average difference between predicted and observed PSA of roughly 2 ng/ml in both screen-detected cohorts and roughly 4 ng/ml in the symptom-presenting cohorts. However, using a hypothetical model based on all the collected data from these cohorts, the average difference between observed and predicted PSA is still between 0.8 ng/ml and 1.7 ng/ml. The dichotomy of performance was evident in the number of men not well described by the model (i.e. men whose average difference between observed and predicted PSA across their PSA test results was > 5 ng/ml). In the AS cohorts, 3–4% of men had an average difference between observed and predicted PSA of > 5 ng/ml in the prediction model, whereas 14–30% had at least this average difference in the older cohorts. In every cohort, using the updated model reduced the difference between observed and predicted PSA.

A large number of data were available for model development and validation. The model was developed using data from over 7000 PSA tests, and we have attempted to predict over 15,000 PSA test results in further cohorts using this model. Our data come from both sides of the Atlantic and traverse two eras of PCa detection: the symptomatic-presenting man from the early 1990s and the screen-detected man of the 2000s. The model performs well in both the modern US and European cohorts even though it was developed on a UK cohort. This suggests that, although PSA at diagnosis is likely to be lower in the USA, PSA change is similar. The model performs less well in predicting PSA for men in the older cohorts, but some of these men with clinically detected disease could be seeing PSA changes beyond those expected. These rapid rises of PSA would not be predicted by the model and could potentially be above a 95% reference range. In some ways, the lack of performance in these cohorts can be viewed as evidence for the use of PSARRs in AM.

Copyright © Queen’s Printer and Controller of HMSO 2015. This work was produced by Simpkin et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.

Included under terms of UK Non-commercial Government License.

Bookshelf ID: NBK305589
