Assessment of clinical utility of F-18-FDG PET in patients with head and neck cancer: a probability analysis

Goerres GW, Mosna-Firlejczyk K, Steurer J, et al.

Publication Details

CRD summary

This review assessed the use of positron emission tomography (PET) to diagnose and detect recurrence of head and neck cancer. It concluded that PET is most useful to rule out disease in patients whose probability of having the disease is low. The results need careful interpretation because of uncertainty about the quality of the studies and the types of patients in the studies reviewed.

Authors' objectives

To determine the diagnostic accuracy of positron emission tomography (PET) in the primary assessment and detection of recurrence of head and neck cancer.

Searching

MEDLINE and PREMEDLINE (via PubMed) were searched up until October 2001. The search terms were reported and there were no language restrictions. In addition, reference lists were checked and the Science Citation Index (ISI Web of Science) was searched.

Study selection

Study designs of evaluations included in the review

Prospective and retrospective observational cohort or cross-sectional studies were eligible for inclusion. Studies with patient populations (or sub-populations) of fewer than 10 were excluded. The designs of the studies included in the review were not stated.

Specific interventions included in the review

Studies of primary or secondary assessment using a dedicated PET scanner were eligible for inclusion. The specific test involved the use of fluorine-18 fluorodeoxyglucose (FDG). For studies of recurrence, only those reporting accuracy data for testing at least 1 month after the end of treatment were included. Studies that evaluated treatment response for quantitative FDG measurements were excluded.

Reference standard test against which the new test was compared

Studies that used histopathological verification or suitable follow-up were selected for inclusion. Histological verification was the reference standard used in the review.

Participants included in the review

Studies of patients with known primary head and neck cancer (squamous cell cancer or adenocarcinoma) were eligible for inclusion. There was no information on the characteristics of the patients in the included studies.

Outcomes assessed in the review

Studies that provided or enabled the construction of a 2x2 table were eligible for inclusion. The outcome measures used in the review were sensitivities, specificities, likelihood ratios (LRs), diagnostic odds ratios (DORs), and the natural logarithm of the DOR (lnDORs).

How were decisions on the relevance of primary studies made?

Two reviewers independently selected studies from the searches. Decisions on the final selection were made upon receipt of the full papers, and any disagreements were resolved by consensus.

Assessment of study quality

The studies were assessed for validity using existing checklists (references given). A study was considered to be of good quality if it employed a prospective or retrospective design, consecutive enrolment, adequate description and blinding of the test result. The authors did not state how the papers were assessed for validity, or how many reviewers performed the validity assessment.

Data extraction

The authors did not state how the data were extracted for the review, or how many reviewers performed the data extraction. The PET test result and histological verification data for the primary diagnosis and detection of locoregional recurrence were extracted. These data were used to calculate measures of test accuracy and their standard error. Data were extracted on two levels: patient and lymph node. Where data included one or more zeros, 0.5 was added to each cell in the 2x2 table.
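The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `accuracy_measures` and the example counts are hypothetical, and the standard error of the lnDOR uses the usual square root of the summed reciprocal cell counts, which the review does not explicitly confirm.

```python
import math

def accuracy_measures(tp, fp, fn, tn):
    """Accuracy measures from one 2x2 table (hypothetical helper).

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives against the reference standard.
    """
    # Continuity correction as described in the review: if any cell
    # is zero, add 0.5 to every cell of the table.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)
    ln_dor = math.log(dor)
    # Assumed standard error formula for lnDOR (not stated in the review).
    se_ln_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "dor": dor,
        "ln_dor": ln_dor,
        "se_ln_dor": se_ln_dor,
    }

# Illustrative counts only; these are not data from the review.
example = accuracy_measures(tp=10, fp=0, fn=2, tn=8)
```

In the example, the zero false-positive cell triggers the correction, so the measures are computed from 10.5, 0.5, 2.5 and 8.5 rather than the raw counts.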

Methods of synthesis

How were the studies combined?

At the patient and at the lymph node level, lnDORs were combined using a fixed-effect (where the p-value of the chi-squared test of heterogeneity was greater than 0.1) or random-effects (otherwise) meta-analysis. Summary receiver operating characteristic (ROC) curves were presented. LRs were pooled and presented in forest plots, shown separately for primary diagnosis and assessment of recurrence.
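The choice between the two models can be illustrated with a short sketch. It assumes inverse-variance weighting, Cochran's Q as the chi-squared heterogeneity statistic, and the DerSimonian-Laird estimator for the between-study variance; the review does not say which estimators were used, and the function names `chi2_sf` and `pool_ln_dor` are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Closed form for even df: exp(-h) * sum_{i<df/2} h^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Closed form for odd df: erfc(sqrt(h)) plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (j + 0.5)
    return total

def pool_ln_dor(ln_dors, ses, threshold=0.1):
    """Pool lnDORs; returns (pooled, se, model, p_heterogeneity)."""
    if len(ln_dors) < 2 or len(ln_dors) != len(ses):
        raise ValueError("need at least two studies with matching SEs")
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, ln_dors)) / sum_w
    # Cochran's Q, referred to chi-squared with k - 1 degrees of freedom.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, ln_dors))
    df = len(ln_dors) - 1
    p = chi2_sf(q, df)
    if p > threshold:
        return fixed, math.sqrt(1.0 / sum_w), "fixed", p
    # DerSimonian-Laird between-study variance (an assumption here).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, ln_dors)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), "random", p
```

Exponentiating the pooled lnDOR gives the pooled DOR on its original scale.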

How were differences between studies investigated?

Differences between the studies were investigated with meta-regression. The lnDOR was regressed on predefined independent variables: blinding of test results, study design (prospective or retrospective), publication language, assessment of lymph node detection (yes or no) and PET for primary or secondary assessment. The results were used to stratify further analyses, in which forest and Galbraith plots were used to visually investigate residual heterogeneity.
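As a simplified illustration of the idea, the sketch below regresses lnDOR on a single yes/no covariate using inverse-variance weights. With one binary covariate, the weighted regression slope equals the difference between the two groups' weighted mean lnDORs. This is a fixed-effect simplification with a hypothetical function name (`binary_meta_regression`); the review's actual model, which used several covariates, is not described in enough detail to reproduce.

```python
import math

def binary_meta_regression(ln_dors, ses, flags):
    """Weighted regression of lnDOR on one binary covariate.

    flags[i] is 1 if study i has the characteristic (for example,
    lymph node level assessment) and 0 otherwise.
    Returns (intercept, slope, se_slope, z).
    """
    w = [1.0 / se ** 2 for se in ses]
    w0 = sum(wi for wi, f in zip(w, flags) if f == 0)
    w1 = sum(wi for wi, f in zip(w, flags) if f == 1)
    if w0 == 0 or w1 == 0:
        raise ValueError("both covariate groups must contain studies")
    # Weighted mean lnDOR within each group.
    mean0 = sum(wi * y for wi, y, f in zip(w, ln_dors, flags) if f == 0) / w0
    mean1 = sum(wi * y for wi, y, f in zip(w, ln_dors, flags) if f == 1) / w1
    slope = mean1 - mean0            # change in lnDOR when the flag is 1
    se_slope = math.sqrt(1.0 / w0 + 1.0 / w1)
    return mean0, slope, se_slope, slope / se_slope
```

A large absolute z value for the slope suggests that the covariate is associated with a change in DOR, which is the kind of finding used to stratify the later analyses.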

Results of the review

Eighteen studies were included in the review, from which 25 2x2 tables were generated. The total number of participants could not be derived from the data shown.

The meta-regression found that assessment of lymph node detection was associated with a significant change in DOR. Therefore, summary results were presented separately for lymph node and patient level data. However, the authors appear to have drawn their conclusions from the latter. The results were presented using pre-test disease probabilities of 20, 40 and 60%. Those for 20% and 60% are summarised below. The Galbraith plots showed no evidence of heterogeneity between the studies.

For primary diagnosis, positive and negative LRs were 3.90 (95% confidence interval, CI: 2.56, 5.93) and 0.24 (95% CI: 0.14, 0.41), respectively. For restaging (recurrence assessment), the positive LR used was 3.96 (95% CI: 2.79, 5.63) and the negative LR was 0.16 (95% CI: 0.10, 0.25).

Using a 20% pre-test disease probability, a meta-analysis of patient level data for primary diagnosis showed positive and negative post-test probabilities of 49.4% and 5.7%, respectively. For restaging (recurrence assessment) the positive and negative post-test probabilities were 49.7% and 3.8%, respectively. Those for lymph node level data for primary diagnosis were 81.2% and 4.5% respectively. For restaging, the results were 73.3% and 3.4%.
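The patient level figures above follow from the pooled LRs by converting the pre-test probability to odds, multiplying by the LR, and converting back to a probability. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability from a pre-test probability and an LR."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)

# Patient level pooled LRs reported in the abstract, 20% pre-test probability.
print(f"{post_test_probability(0.20, 3.90):.1%}")  # primary, positive: 49.4%
print(f"{post_test_probability(0.20, 0.24):.1%}")  # primary, negative: 5.7%
print(f"{post_test_probability(0.20, 3.96):.1%}")  # restaging, positive: 49.7%
print(f"{post_test_probability(0.20, 0.16):.1%}")  # restaging, negative: 3.8%
```

The lymph node level LRs are not quoted in this abstract, so the corresponding figures cannot be reproduced in the same way.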

Given the results for negative post-test probabilities in patient level data, the authors suggested that (in patients with low pre-test probability) the FDG PET test was able to rule out disease with approximately 6% post-test disease probability. Therefore, further imaging could be suspended.

Using a 60% pre-test disease probability, a meta-analysis of patient level data for primary diagnosis showed positive and negative post-test probabilities of 85.4% and 26.5%, respectively. For restaging, the positive and negative post-test probabilities were 85.6% and 19.4%, respectively. Those for lymph node level data for primary diagnosis were 96.3% and 22.2%. For restaging, the results were 94.3% and 17.4%.

Given the higher results for positive post-test probabilities in patient level data, the authors suggested that (in patients with higher pre-test probability) the FDG PET test would be considered accurate enough to start or modify treatment.

The authors advised cautious interpretation of the results, given the possibility of inconsistent LRs at different pre-test values. Observer agreement at the data extraction stage was 90 to 100%, with kappa values ranging from 0.9 to 1.0.
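For readers unfamiliar with the kappa statistic, the sketch below shows how Cohen's kappa is computed from two reviewers' decisions. It assumes the reported values are Cohen's kappa, which the abstract does not confirm; the function name and the example decisions are hypothetical and are not data from the review.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Illustrative decisions only (1 = positive, 0 = negative).
reviewer_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
reviewer_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

Here the reviewers agree on 9 of 10 items, but because half of that agreement would be expected by chance, kappa is lower than the raw agreement.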

Authors' conclusions

The FDG PET test can reliably rule out the presence of lymph node metastasis, or residual disease or locoregional recurrence, in patients with a low pre-test probability of the disease.

CRD commentary

This review addressed a specific question with clear inclusion criteria and an adequate search strategy. A limited number of sources were searched. The validity assessment was appropriately carried out using published checklists, although failure to include the results of this assessment meant that the reliability of the findings was difficult to determine. Steps were taken to minimise selection bias, but it was unclear how many reviewers were involved in the data extraction and validity assessment processes and whether these were carried out independently.

Detail on the primary studies was lacking, especially with regard to patient characteristics. The possibility of publication bias could not be ruled out. Due to the absence of reporting on statistical heterogeneity between the studies, it was not clear whether the pooling of LRs was appropriate. Pooled LRs were used, in combination with a range of pre-test probability values (not derived from the review), to generate post-test probabilities upon which the authors based their conclusions. The reliability of the pooled LRs on which the calculations were based is unclear. In general terms, the pooled LR quoted would only be considered of moderate to low use in ruling out disease, hence only useful in populations where disease is relatively unlikely. Given these limitations, the extent to which the authors' conclusions are reliable is not clear.

Implications of the review for practice and research

Practice: The authors stated that FDG PET can be used in all patients with head and neck cancer, for both staging purposes and to exclude disease in cases of suspected recurrence or residual disease.

Research: The authors did not state any implications for further research.

Bibliographic details

Goerres G W, Mosna-Firlejczyk K, Steurer J, von Schulthess G K, Bachmann L M. Assessment of clinical utility of F-18-FDG PET in patients with head and neck cancer: a probability analysis. European Journal of Nuclear Medicine and Molecular Imaging 2003; 30(4): 562-571. [PubMed: 12589477]

Indexing Status

Subject indexing assigned by NLM

MeSH

Adult; Aged; Aged, 80 and over; Female; Fluorodeoxyglucose F18 /diagnostic use; Head and Neck Neoplasms /epidemiology /pathology /radionuclide imaging; Humans; Lymphatic Metastasis; Male; Middle Aged; Models, Biological; Models, Statistical; Neoplasm Recurrence, Local /epidemiology /radionuclide imaging; Prevalence; Radiopharmaceuticals /diagnostic use; Reproducibility of Results; Risk Assessment /methods; Sensitivity and Specificity; Switzerland /epidemiology; Tomography, Emission-Computed /methods /statistics & numerical data

AccessionNumber

12003001052

Database entry date

30/09/2005

Record Status

This is a critical abstract of a systematic review that meets the criteria for inclusion on DARE. Each critical abstract contains a brief summary of the review methods, results and conclusions followed by a detailed critical assessment on the reliability of the review and the conclusions drawn.