NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Newberry SJ, FitzGerald J, Maglione MA, et al. Diagnosis of Gout [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2016 Feb. (Comparative Effectiveness Reviews, No. 158.)
Key Findings and Strength of Evidence
The key findings and strength of evidence are summarized below and in Table 6.
Table 6
Summary of findings and strength of evidence.
Accuracy of Tests for the Diagnosis of Gout
- Few studies that assessed the accuracy of clinical signs and symptoms consistently applied the same reference standard (either analysis of MSU crystals in synovial fluid or a single clinical algorithm) to all participants with suspected gout.
- Studies that compared diagnostic clinical algorithms with synovial fluid analysis for MSU crystals reported widely varying sensitivities and specificities. However, one algorithm, developed from clinical signs and symptoms used by primary care physicians, showed good positive and negative predictive value and was validated in a small secondary care population; it needs further validation. The strength of evidence for this conclusion is low, based on the identification of only two studies that assessed this particular clinical algorithm.
- In three studies that enrolled only patients not previously diagnosed with gout, the sensitivities and specificities of DECT for predicting gout ranged from 85% to 100% and 83% to 92%, respectively, compared with synovial fluid analysis for MSU crystals or a validated clinical algorithm.
- Ultrasound was more variable than DECT in its ability to detect gout: Four studies of ultrasound showed sensitivities that ranged from 37% to 100% and specificities that ranged from 68% to 97%, depending on the signs assessed. The strength of evidence for this conclusion is low.
- No studies were identified that assessed the validity of serum urate, CT scan, or plain x-ray for diagnosing gout. The strength of evidence for these tests is insufficient.
- No studies were identified that directly assessed the effect of joint site or number of affected joints on diagnostic accuracy. The strength of evidence for this question is insufficient for all diagnostic methods.
- No studies were identified that directly assessed the effect of duration of symptoms on the accuracy of diagnostic tests. The strength of evidence for this question is insufficient for all diagnostic methods.
- Agreement among personnel examining synovial fluid using polarizing microscopy for detection of MSU crystals appears to be poor, but it is unclear whether the experience and training of analysts are a factor. No studies examined the effect of the type of practitioner performing fluid aspiration on the ability to obtain a sample. Because of the relatively small number of studies identified, the strength of evidence for definitive influential factors is insufficient.
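The accuracy measures cited in the findings above (sensitivity, specificity, and positive and negative predictive value) all derive from a 2x2 table that cross-classifies the index test against the reference standard. The following is a minimal sketch of those definitions; the class name `TwoByTwo` and the counts are hypothetical and are not taken from any study in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the reference standard."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative

    def sensitivity(self) -> float:
        # Proportion of reference-positive patients the index test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative patients the index test clears.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of index-test positives that are reference-positive.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of index-test negatives that are reference-negative.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only.
table = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity={table.sensitivity():.3f}")
print(f"specificity={table.specificity():.3f}")
print(f"PPV={table.ppv():.3f}")
print(f"NPV={table.npv():.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common gout is in the enrolled population, which is one reason results from rheumatology settings may not transfer directly to primary care.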
Adverse Events Associated With Testing for Gout
- We considered potential adverse effects of diagnostic tests for gout, including pain, infection at the aspiration site, and the short- or long-term effects of radiation exposure. None of the studies included in this report documented any adverse events associated with diagnostic testing. The strength of evidence for this conclusion is low, based on one study that reported no adverse events associated with joint fluid aspiration for MSU analysis or DECT, and no studies that reported on adverse events associated with ultrasound or clinical examination.
- Missed diagnosis or delayed diagnosis of acute gout (failure to find MSU crystals in synovial fluid) was reported in a retrospective two-center study to be associated with a longer interval between the onset of attack and joint aspiration. A negative MSU finding was associated with higher risk for undergoing arthroscopic drainage, longer hospital stay, and delays in anti-inflammatory treatment. The strength of evidence for this conclusion is insufficient.
Findings in Relation to What Is Already Known
Over the past 25 to 30 years, gout diagnosis has been an area of some controversy. Efforts have been aimed at determining whether the assessment of MSU crystals in synovial fluid aspirated from joints is really the gold standard, validating algorithms comprising various combinations of clinical and laboratory criteria, and validating the use of ultrasound and DECT imaging.
The focus of this report was on evaluating the validity and safety of existing diagnostic methods for use in a primary, urgent, and emergent care setting, where the majority of gout patients are first seen and diagnosed. Patients who present in these settings with an inflamed joint and who have not had a prior diagnosis of gout (or another rheumatic condition) are almost certainly having an acute attack, which may be the first or the latest of a number of attacks. Thus they may be in an early stage of the disease or at least will be less advanced in the disease process than patients seen in the rheumatology setting. Important considerations in diagnosing gout in these patients include ensuring criteria are sensitive enough to diagnose less advanced disease and specific enough to rule out other conditions, including septic arthritis and calcium pyrophosphate deposition disease.
Monosodium Urate Crystal Assessment
The validity of assessment of MSU crystals in synovial fluid for the diagnosis of gout has been questioned, as noted in the introduction to this report and confirmed by several studies we reviewed, suggesting its suboptimal nature as the gold standard against which potential diagnostic methods are measured.37 Further confirming these findings, an abstract presented at the 2013 EULAR meeting tested the competence of a group of rheumatologists, laboratory technicians, and rheumatology residents in identifying MSU and calcium pyrophosphate crystals. Fewer than half identified all samples correctly. Performance was fairly comparable across rheumatologists, residents, and technicians, although residents performed much more poorly on identification of calcium pyrophosphate crystals.48
Nevertheless, recent guidelines continue to recommend the use of MSU assessment for definitive diagnosis. For example, the 2011 Postgraduate Medicine guidelines for diagnosis of gout (which aimed to update the EULAR 2006 guidelines, neither of which has been clinically validated) emphasize that diagnosis based on clinical signs and symptoms alone has reasonable accuracy when patients have a typical presentation of gout but that MSU constitutes the definitive diagnosis.49 The 2014 3e (Evidence, Expertise, Exchange) Initiative recommendations on the diagnosis and treatment of gout (the 3e Initiative is a multinational effort to promote evidence-based practice) recognize the use of MSU as the gold standard but also note the difficulty in performing this test under some circumstances, asserting that if MSU cannot be performed, the diagnosis “can be supported by classical clinical features, and/or characteristic imaging findings.”47
At the 2014 ACR Meeting, new ACR/EULAR diagnostic criteria were presented (updating the 2006 EULAR diagnostic criteria). Based on a systematic review (yet to be published) and consensus panel, the new guidelines advocate the use of MSU for any patient with suspected gout. However, the authors of these latest guidelines also acknowledge the difficulty of assessing MSU, and note that in its absence, a combination of clinical signs and symptoms is suggestive of, but not definitive for, gout.50
Accuracy of Algorithms Comprising Clinical Signs and Symptoms for the Diagnosis of Gout
This report identified a series of algorithms, some intended for classification of gout for research purposes (but used in diagnosis as well) and some intended for diagnosis.
Comparing the more recent diagnostic algorithms with the earlier algorithms highlights the likely importance of patient population and duration of disease in determining diagnostic criteria. Janssens' Diagnostic Rule and the CGD were developed and validated on patients first identified in primary care; these patients were likely to be in an earlier stage of the disease than the patients on whom earlier diagnostic criteria, such as the ACR criteria, were based. The patients in the earlier validation studies were hand-picked by rheumatologists, which would have increased the sensitivity of the tests compared with their use on a more typical population with a less certain diagnosis.
The incremental utility of MSU over clinical diagnostic criteria alone was recently assessed and compared in patients with shorter (2 years or less) and longer durations of symptoms (history of attacks). This study compared the sensitivities of the classification criteria that include the use of MSU (the Rome, New York, ARA, and CGD criteria) with and without the MSU findings. The authors found that in patients with shorter symptom duration, inclusion of MSU assessment improved sensitivity considerably over those same criteria without MSU. Nevertheless, the sensitivities of the CGD criteria without an MSU assessment and of the Diagnostic Rule (which does not include MSU) were still fairly high (87.2% and 87.9%, respectively). In patients with symptom duration longer than 2 years, the sensitivities of all clinical diagnostic and classification criteria are greater than for newer patients. In addition, omission of MSU and reliance on the clinical diagnostic criteria alone resulted in a much smaller decrease in sensitivity for these more advanced patients. None of the studies we identified limited inclusion to patients having a first attack.
Accuracy of DECT for the Diagnosis of Gout
DECT is a non-invasive imaging method that can detect urate deposits in joints, tendons, bursae, and soft tissues. The radiographic signature of urate can be distinguished from that of calcium. DECT requires special machines and software to process the images and currently is not widely available. Radiation exposure is no greater than that of standard CT scanning and is limited to the extremities, which are not radio-sensitive organs.
Studies looking at diagnostic utility of DECT are promising, generally demonstrating good sensitivity and specificity for gout.
A 2014 study28 sought to determine the additive value of DECT among 30 patients whose clinical presentation did not yield a clear diagnosis. Of these 30, 14 had a positive DECT. Two of the 14 refused aspiration, and 11 of the remaining 12 had crystal confirmation of gout using ultrasound-guided aspiration, suggesting DECT may be a useful adjunct to clinical algorithms. However, among another group of 40 patients with newly diagnosed gout (confirmed with MSU assessment), all four patients with false negative DECT had new onset gout (first attack and symptom duration less than 6 months). This finding suggests DECT may be less useful in very early cases than in patients with disease of longer duration. The low sensitivity reported for the knee double contour sign (DCS) in a 2011 study by Lai and colleagues was also thought to be attributable to the (shorter) duration of disease in the included patients.35 Another 2011 study prospectively evaluated inflammatory mono-arthritis patients and demonstrated high sensitivity and specificity for crystal-confirmed gout cases.18
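The crystal-confirmation figure reported for the 2014 study (11 of the 12 DECT-positive patients who underwent aspiration) can be expressed as a proportion, and an interval around it shows how much uncertainty a sample of 12 carries. The sketch below is illustrative arithmetic only: the Wilson score interval is our choice of method and was not reported by the study, the function name `wilson_interval` is hypothetical, and the two patients who refused aspiration are excluded from the denominator.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 12 DECT-positive patients were aspirated,
# and 11 of them had crystal confirmation of gout.
confirmed = 11
aspirated = 12

ppv = confirmed / aspirated
low, high = wilson_interval(confirmed, aspirated)
print(f"Proportion confirmed among aspirated DECT-positive patients: {ppv:.3f}")
print(f"Approximate 95% interval: {low:.3f} to {high:.3f}")
```

The width of the interval reflects the small sample and is one reason the finding is described as suggestive rather than conclusive.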
The summary of the literature demonstrates that DECT is both specific and sensitive for gout. Utility of DECT may be best for evaluating urate burden in established gout patients. Limited data suggest that for patients with recurrent attacks of inflammatory mono- or oligo-arthritis where the question of gout is unresolved (for example, no fluid available for aspiration or negative study), DECT should demonstrate good diagnostic value. However, for patients with a first inflammatory mono-articular attack (due to gout), DECT may not be sensitive. The limited availability of DECT machines in most regions may also restrict application of this technology.
Accuracy of Ultrasound for the Diagnosis of Gout
Use of ultrasound as a diagnostic test for gout has promising potential. Sensitivity and specificity for specific ultrasound characteristics or signals (such as the “double contour sign” or combinations of these signals) were typically high, with one exception. In addition, ultrasound is relatively inexpensive, non-invasive, and well accepted by patients.
However, several challenges must be overcome prior to ultrasound being accepted as a standard diagnostic technique for gout. The various signals, which include the “double contour sign,” characteristic intra-articular findings (bright spots or “snow”), and tophaceous findings, can present in many different joints, and the analyses we reviewed each used different methodology for identifying which joints they studied. The number of joints studied ranged from a single target (inflamed) joint to 26 joints. Additionally, up to 20 tendon areas and 6 bursae were also examined. Such exhaustive scanning is not practical. Some authors (notably Lamers-Karnebeck and colleagues)36 described limited systematic evaluation of inflammatory mono-arthritis patients with sensitivities and likelihood ratios for specific findings. Nevertheless, even this focused methodology (4 to 6 joints) may be beyond what would be available from most radiology centers, which typically focus on more comprehensive examinations of single joints. The tendency to conduct multisite scans to diagnose and characterize gout appears to be greatest in the rheumatology community.
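Likelihood ratios, which some ultrasound studies reported for specific findings, follow directly from sensitivity and specificity and can be used to update a pre-test probability. The sketch below shows the standard relationships; the function names and the input values (sensitivity 0.80, specificity 0.90, pre-test probability 0.50) are hypothetical and do not come from any study in this review.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical values for illustration only.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.80, specificity=0.90)
after_positive = post_test_probability(0.50, lr_pos)
after_negative = post_test_probability(0.50, lr_neg)
print(f"LR+={lr_pos:.2f}, LR-={lr_neg:.2f}")
print(f"Post-test probability after a positive finding: {after_positive:.3f}")
print(f"Post-test probability after a negative finding: {after_negative:.3f}")
```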
The low sensitivity reported for the knee double contour sign by Lai and colleagues was attributed to the (shorter) duration of disease in the included patients,35 suggesting better diagnostic value in patients with more advanced disease (although another study reported no differences between patients having their first attack and those having had several attacks).36
Furthermore, we did not find any studies that evaluated the marginal utility of using ultrasound data to diagnose gout, above that of clinical criteria or in lieu of joint aspiration.
Thus, the present review also confirms the results of several relatively recent systematic reviews on the validity and potential superiority of both DECT and ultrasound for the diagnosis of gout. However, as the 3e Recommendations note, the “availability, cost, and the need for trained personnel and specific equipment…” might limit their use in routine clinical practice. Thus, these guidelines seem to suggest that in primary care settings, diagnosis can be based on a set of clinical criteria.51
Applicability
Two factors may reduce the applicability of this review.
First, of the studies we identified that assessed the validity of clinical diagnostic algorithms and imaging for the diagnosis of gout, most included at least some participants who had already had a definitive diagnosis. Relatively few studies enrolled only participants with an inflamed joint or even suspected gout but no definitive diagnosis. The present review excluded studies of individuals with a prior gout diagnosis; however, we identified no studies that limited inclusion only to patients presenting with a first attack, and almost no studies considered the duration of the disease or the number of prior attacks in their assessments.
Second, all imaging studies were conducted in a rheumatology setting, usually an academic rheumatology department. Patients seen in this setting may have more advanced disease than those seen in a primary care setting, or may have comorbidities that add complexity to their treatment.
Implications for Clinical and Policy Decisionmaking
The findings of this review provide some evidence to support the further development and validation of diagnostic algorithms based on a combination of clinical signs and symptoms for the diagnosis of gout in the primary care setting. The review further supports the use of imaging modalities (ultrasound and DECT) in cases where a definitive diagnosis cannot be made from signs and symptoms alone.
Limitations of the Comparative Effectiveness Review Process
Assessing the comparative validity of diagnostic tests in systematic reviews presents a number of challenges that are not faced with comparative effectiveness reviews of treatment strategies. These limitations are magnified by several issues surrounding tests for gout and the natural history of the disease itself. To increase applicability to the specific patient population and health care settings of interest, we limited included studies to those that enrolled previously undiagnosed patients; in doing so, we excluded a number of studies on the use of ultrasound and DECT for monitoring patients with chronic gout or hyperuricemia. Previous systematic reviews on the use of ultrasound and DECT included studies that enrolled patients with asymptomatic hyperuricemia and studies of patients with definitive gout diagnoses in various stages of the disease, as well as studies of patients with suspected gout but without a definitive diagnosis (their findings were similar to ours).
Our searches were aimed at identifying studies on gout diagnosis. Searches that identified studies on gout would be expected to identify studies on the differential diagnosis of gout, septic arthritis, calcium pyrophosphate deposition disease, and other such conditions. If a study was aimed at diagnosing patients with a mono- or oligo-arthritis, the chance that the word “gout” would appear is nearly 100%, as that would be one possible diagnosis. However, we might have overlooked an occasional study on differential diagnosis of inflammatory joint conditions that was applicable to gout.
In addition, our consideration of unpublished literature was limited. We were unable to obtain information from manufacturers of microscopes and imaging equipment used to diagnose gout. In addition, we did not include conference proceedings as sources of data but cited several in discussing our findings in the context of what is known about gout diagnosis.
Limitations of the Evidence Base
The literature that addresses the diagnosis of gout has numerous limitations that make it difficult to draw firm conclusions. These limitations can be divided into three categories: study volume, design, and reporting quality. We have already addressed some of the issues in the discussion above. Few studies have attempted to address the diagnosis of gout. Almost no studies have examined the impact of diagnostic test accuracy on decision-making (decisions to order further testing or to initiate particular treatments) or any clinical or patient centered outcomes, and almost no studies addressed adverse events potentially associated with diagnostic testing. Most studies of gout address management issues or monitoring of patients with chronic gout.
Of the diagnostic studies we did identify, few studies limited enrollment to gout suspects or patients with a monoarthritis or some other clinical signs or symptoms that might suggest gout. Many studies enrolled only patients with known gout and included no control group.
Even studies that enrolled patients who were gout suspects or included a control group and employed blinded assessment systematically failed to limit enrollment to patients in their first attack or with recent onset or did not stratify findings by duration of the condition (as would be ascertained by asking, “How long have you been having these attacks?”). The lack of stratification by duration of condition affects the sensitivity and specificity of both clinical diagnostic criteria and imaging techniques.
Most studies also fail to stratify by other relevant factors, such as time since the onset of the current or most recent flare, sex, and comorbidities. The time since onset of the current flare definitely affects the presence of crystals as well as clinical signs and symptoms.
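To illustrate what stratified reporting would involve, the following minimal sketch computes sensitivity separately for each stratum of symptom duration. The records, the two strata labels, and the function name `sensitivity_by_stratum` are hypothetical; the same grouping could be applied to time since flare onset, sex, or comorbidities.

```python
from collections import defaultdict

# Hypothetical records: (duration stratum, index test positive, reference positive)
records = [
    ("<=2 years", True, True),
    ("<=2 years", False, True),
    ("<=2 years", False, False),
    ("<=2 years", True, True),
    (">2 years", True, True),
    (">2 years", True, True),
    (">2 years", True, True),
    (">2 years", False, False),
]


def sensitivity_by_stratum(rows):
    """Sensitivity per stratum, using only reference-positive patients."""
    true_positives = defaultdict(int)
    false_negatives = defaultdict(int)
    for stratum, test_positive, reference_positive in rows:
        if not reference_positive:
            continue  # reference-negative patients do not enter sensitivity
        if test_positive:
            true_positives[stratum] += 1
        else:
            false_negatives[stratum] += 1
    strata = set(true_positives) | set(false_negatives)
    return {
        s: true_positives[s] / (true_positives[s] + false_negatives[s])
        for s in strata
    }


result = sensitivity_by_stratum(records)
for stratum in sorted(result):
    print(f"{stratum}: sensitivity={result[stratum]:.3f}")
```

A pooled estimate across both strata would mask the difference between them, which is the concern raised above about studies that do not stratify.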
No studies tested the validity of combining a diagnostic algorithm comprising clinical signs and symptoms with an imaging test, compared with clinical signs and symptoms or imaging alone.
As described above, issues concerning the use of synovial fluid MSU crystal identification as the reference standard abound. Taking the validity of the reference standard at face value, some studies assessed MSU in a fraction of participants only (e.g., those for whom synovial fluid could be aspirated, those most suspected of having gout, or those willing to undergo the test), using the ACR criteria or individual clinical judgment as the reference standard for the remaining participants. The technical problems with aspiration and analysis have been assessed and described extensively and include inconsistencies introduced by patient factors (e.g., the time lapse from the start of the flare to aspiration), sample handling factors (storage duration and temperature), and practitioner skills in aspiration and analysis. Finally, failure to report important study design details in publications is a further limitation. Studies tended to be vague regarding blinding of assessors and the time lapse between implementation of the index test and reference standard (and the sequence of tests), a critical detail considering the short duration of gout attacks.
Research Gaps
In a 2013 commentary, Dalbeth17 noted that thus far, none of the current diagnostic (classification) criteria have been adequately validated. Efforts to validate the existing classification criteria have either failed to enroll patients prospectively (i.e., before a definitive diagnosis has been made) or have been limited to very small numbers of patients. The ongoing Study for Updated Gout classification cRiteria (SUGAR) project is validating gout classification criteria to improve case ascertainment for recruitment into research studies and for epidemiological purposes. As we suggested above in describing limitations of the research base, promising algorithms for diagnosis in the primary care setting, such as the Diagnostic Rule and the CGD, have been validated but need additional validation in larger, broader populations.
In addition, specific elements of the criteria, such as hyperuricemia, require additional testing. Most clinical diagnostic and classification criteria for gout include hyperuricemia as a criterion.16, 22, 31, 39, 41, 42 However, a 1994 study concluded that serum urate was not a valid criterion for diagnosing gout, as there was no lower level below which gout was not a possibility (and no upper limit beyond which it is a certainty).52 The 2011 Postgraduate Medicine criteria also excluded hyperuricemia as an element of its clinical diagnostic criteria for that reason,49 and the new 2014 ACR/EULAR criteria include hyperuricemia but state that it should not be the sole criterion on which a diagnosis of gout is made.50 Thus, further assessment of the effect of hyperuricemia on the sensitivity and specificity of the clinical diagnostic algorithms may be needed.
Patient-level factors that influence test behavior have also been understudied: These include the influence of duration of a flare; number and identity of joints involved; and patient age, sex, and comorbidities. A 2010 systematic review on the diagnosis of gout in women noted that clinical features and risk factors of gout in women differ from those in men.44 Women have later onset, are more likely to be taking diuretics, have more CVD and renal comorbidity, are less likely to drink alcohol, are less likely to have podagra (more involvement of other joints), are more likely to have polyarticular gout, and have less frequent recurrent attacks. These findings suggest the need for different clinical diagnostic criteria for women. Likewise, a number of the clinical diagnostic criteria, including the Diagnostic Rule and the new 2014 ACR/EULAR criteria, include cardiovascular comorbidities as a criterion. The sensitivity and specificity of this criterion may need to be established across a broad group of populations.
The findings of Park and colleagues on the effects of gout misdiagnosis37 suggest that studies are needed on differential diagnosis of gout and other inflammatory joint conditions, particularly septic arthritis and calcium pyrophosphate deposition disease. We identified two recent studies that assessed the validity of a simple laboratory test for the differential diagnosis of gout from septic arthritis. Neither study met our inclusion criteria because the gout diagnosis was made prior to the studies. A 2014 study conducted in Germany analyzed multiple inflammatory markers in serum and synovial fluid drawn from patients seen in a hospital emergency room; gout and septic arthritis were ascertained by synovial fluid aspiration with MSU crystal identification and culture, respectively. Among the markers assayed (e.g., serum UA, synovial fluid white blood cells, synovial fluid total protein), synovial fluid lactate had the greatest diagnostic potential to differentiate septic arthritis from gout, followed by glucose and serum uric acid concentrations.53 A 2014 study conducted in an academic orthopedics department in China found that serum and synovial fluid procalcitonin can discriminate between septic arthritis and the non-infectious forms of arthritis (gout, rheumatoid arthritis, and osteoarthritis) in the knee, but that synovial fluid procalcitonin is much more sensitive;54 unfortunately, this assessment would still require joint aspiration. Response to colchicine, which has been suggested as a diagnostic criterion for gout, also does not distinguish gout from other crystal arthropathies. Ultrasound and DECT show some evidence of distinguishing gout from calcium pyrophosphate deposition disease, but further work is needed.
Finally, studies are needed that assess the incremental value of ultrasound and DECT imaging over the use of a clinical diagnostic algorithm or even MSU analysis alone. One study assessed the potential additive value of DECT in patients with uncertain diagnosis: the findings suggested DECT may be a useful adjunct to clinical algorithms among patients with disease of longer duration but not those with new onset gout (first attack and symptom duration less than 6 months).28 Another study purported to assess the added value of ultrasound in a clinical diagnostic algorithm, but this study fell short of actually achieving that outcome.36 This information will be necessary in determining the importance and the practicality of setting a guideline for referring patients for imaging in making a diagnosis of gout. Of potential utility would be an appropriateness assessment study that creates a panel of possible clinical scenarios of inflammatory joint presentation with the goal of eliciting the most appropriate diagnostic workup for the primary/urgent/emergency care setting.
Conclusions
This review highlights the need for further, broader validation of promising diagnostic algorithms in primary care settings, where the majority of patients with signs and symptoms suggestive of gout, but no definitive gout diagnosis, are likely to be seen. A clinical algorithm with high diagnostic accuracy can ideally form part of a diagnostic decision tree, with referral of more clinically challenging cases to rheumatologists for more invasive tests or imaging. Research is needed to assess the incremental value of synovial fluid MSU crystal analysis and imaging over that of a diagnostic clinical algorithm.