Can J Gastroenterol. 2011 May;25(5):261-4.

Conventional versus Rosemont endoscopic ultrasound criteria for chronic pancreatitis: comparing interobserver reliability and intertest agreement.

Author information

Digestive Disease Center, Division of Gastroenterology and Hepatology, Department of Medicine, Medical University of South Carolina, Charleston, South Carolina 29425, USA.



The Rosemont criteria (RC) were recently proposed by expert consensus to standardize endoscopic ultrasound (EUS) features and thresholds for diagnosing chronic pancreatitis (CP); however, they are cumbersome and are not validated.


To compare interobserver agreement for the RC and conventional criteria (CC), and to assess intertest agreement between the two in the diagnosis of CP.


Thirty-six consecutive patients who underwent EUS for abdominal pain or pancreatitis were retrospectively reviewed. Anonymized images were independently chosen as best representations of the pancreatic body and reviewed by two experts who recorded the presence of CC and RC features. Agreement (proportion and kappa statistic) between CC and RC was calculated. Interobserver agreement within the CC and RC was assessed. Secondary comparisons with endoscopic retrograde cholangiopancreatography were made where available.
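
The agreement statistics named above can be illustrated with a minimal sketch. It computes the proportion of agreement, an unweighted Cohen's kappa, and a two-sided exact McNemar P value for paired binary readings. The function names and the toy ratings are hypothetical and are not the study data; the study itself reports a weighted kappa for the four-category scales, which this simplified version does not implement.

```python
from math import comb


def proportion_agreement(a, b):
    """Fraction of cases on which the two readers gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa: chance-corrected agreement of two readers."""
    n = len(a)
    labels = set(a) | set(b)
    p_obs = proportion_agreement(a, b)
    # Agreement expected by chance from each reader's marginal label rates.
    p_exp = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_obs - p_exp) / (1 - p_exp)


def mcnemar_exact(a, b, positive=1):
    """Two-sided exact McNemar P value, based only on discordant pairs."""
    n10 = sum(x == positive and y != positive for x, y in zip(a, b))
    n01 = sum(x != positive and y == positive for x, y in zip(a, b))
    n = n10 + n01
    if n == 0:
        return 1.0
    k = min(n10, n01)
    # Binomial(n, 0.5) lower tail, doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy readings (1 = positive for CP, 0 = negative).
reader_a = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
reader_b = [1, 0, 0, 0, 0, 0, 1, 0, 1, 0]

print(proportion_agreement(reader_a, reader_b))
print(cohen_kappa(reader_a, reader_b))
print(mcnemar_exact(reader_a, reader_b))
```

Because the exact McNemar test uses only the discordant pairs, two sets of readings with similar raw agreement can yield very different P values: the test asks whether disagreements run predominantly in one direction, not how often the readers agree.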


Using CC, 60 readings (83.3%) were negative for CP, while 12 readings (16.7%) were positive. Using RC, 59 readings (81.9%) were negative for CP, while 13 (18.1%) were positive. The weighted kappa for interobserver agreement for CC (four categories: normal/low probability, indeterminate, high probability or calcific) was 0.50, with 80.0% overall agreement, versus 0.27 and 68.1% for the four RC categories (normal, indeterminate, suggestive of and consistent with). Agreement on a positive diagnosis with CC was 86.1% (P=0.38 [McNemar's exact test]), with a kappa of 0.47; for RC, agreement was lower at 80.6% (P=0.016 [McNemar's exact test]), with a kappa of 0.38. For patients who underwent endoscopic retrograde cholangiopancreatography (n=12), false-negative and false-positive rates between CC and RC did not appear to be different.


The RC do not appear to achieve the goals of improving accuracy and interobserver agreement for diagnosing CP.

[Indexed for MEDLINE]
Free PMC Article
