Arch Pathol Lab Med. 2018 May;142(5):613-625. doi: 10.5858/arpa.2017-0181-OA. Epub 2018 Feb 19.

Reproducibility and Feasibility of Strategies for Morphologic Assessment of Renal Biopsies Using the Nephrotic Syndrome Study Network Digital Pathology Scoring System.

Author information

From Biostatistics, Arbor Research Collaborative for Health, Ann Arbor, Michigan (Dr Zee); the Departments of Pathology (Dr Hodgin), Internal Medicine (Dr Mariani), and Biostatistics (Dr Gillespie), University of Michigan, Ann Arbor; Arbor Research Collaborative for Health, Ann Arbor, Michigan (Dr Mariani); the Department of Pathology & Immunology, Washington University, St Louis, Missouri (Dr Gaut); the Departments of Pathology and Laboratory Medicine (Dr Palmer) and Medicine (Dr Holzman), University of Pennsylvania, Philadelphia; the Department of Pathology, Johns Hopkins University, Baltimore, Maryland (Drs Bagnasco and Rosenberg); the Kidney Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, Maryland (Dr Rosenberg); the Laboratory of Pathology, National Cancer Institute, Bethesda, Maryland (Dr Hewitt); and the Department of Pathology, University of Miami, Miami, Florida (Dr Barisoni).


Context: Testing reproducibility is critical for the development of methodologies for morphologic assessment. Our previous study using the descriptor-based Nephrotic Syndrome Study Network Digital Pathology Scoring System (NDPSS) on glomerular images revealed variable reproducibility.

Objective: To test reproducibility and feasibility of alternative scoring strategies for digital morphologic assessment of glomeruli and explore use of alternative agreement statistics.

Design: The original NDPSS was modified (NDPSS1 and NDPSS2) to evaluate (1) independent scoring of each individual biopsy level, (2) use of continuous measures, (3) groupings of individual descriptors into classes and subclasses prior to scoring, and (4) indication of pathologists' confidence/uncertainty for any given score. Three and 5 pathologists scored 157 and 79 glomeruli using the NDPSS1 and NDPSS2, respectively. Agreement was tested using conventional (Cohen κ) and alternative (Gwet agreement coefficient 1 [AC1]) agreement statistics and compared with previously published data (original NDPSS).

Results: Overall, pathologists' uncertainty was low, favoring application of the Gwet AC1. Greater agreement was achieved using the Gwet AC1 compared with the Cohen κ across all scoring methodologies. Mean (standard deviation) differences in agreement estimates using the NDPSS1 and NDPSS2 compared with the single-level original NDPSS were -0.09 (0.17) and -0.17 (0.17), respectively. Using the Gwet AC1, 79% of the original NDPSS descriptors had good or excellent agreement. Pathologist feedback indicated the NDPSS1 and NDPSS2 were time-consuming.

Conclusions: The NDPSS1 and NDPSS2 increased pathologists' scoring burden without improving reproducibility. Use of alternative agreement statistics was strongly supported. We suggest using the original NDPSS on whole slide images for glomerular morphology assessment and for guiding future automated technologies.
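
To illustrate why the Gwet AC1 can report higher agreement than the Cohen κ, the following is a minimal two-rater sketch in Python. It is not the authors' analysis: the study involved 3 and 5 pathologists, whereas this sketch handles only two raters, and the ratings, the category labels ("absent", "present"), and the function names are hypothetical and chosen only for illustration. Both statistics share the form (po - pe) / (1 - pe) and differ only in how chance agreement pe is estimated.

```python
from collections import Counter


def cohen_kappa(r1, r2):
    """Cohen kappa for two raters scoring the same items (nominal categories)."""
    n = len(r1)
    cats = set(r1) | set(r2)
    # Observed agreement: fraction of items given the same category by both raters.
    po = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: product of the two raters' marginal proportions, summed.
    pe = sum((c1[k] / n) * (c2[k] / n) for k in cats)
    return (po - pe) / (1 - pe)


def gwet_ac1(r1, r2):
    """Gwet AC1 for two raters scoring the same items (nominal categories)."""
    n = len(r1)
    cats = set(r1) | set(r2)
    q = len(cats)  # assumes at least two categories were used
    po = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: based on the average marginal proportion pi_k per category.
    pe = sum(
        ((c1[k] + c2[k]) / (2 * n)) * (1 - (c1[k] + c2[k]) / (2 * n))
        for k in cats
    ) / (q - 1)
    return (po - pe) / (1 - pe)


# Hypothetical data: 100 glomeruli, one rarely present descriptor.
# Both raters agree on 92 items (90 "absent", 2 "present") and disagree on 8.
rater1 = ["absent"] * 90 + ["present"] * 2 + ["present"] * 4 + ["absent"] * 4
rater2 = ["absent"] * 90 + ["present"] * 2 + ["absent"] * 4 + ["present"] * 4

kappa = cohen_kappa(rater1, rater2)
ac1 = gwet_ac1(rater1, rater2)
print(f"Cohen kappa = {kappa:.3f}, Gwet AC1 = {ac1:.3f}")
```

In this hypothetical example the raters agree on 92% of items, yet the κ is about 0.29 because the skewed prevalence inflates its chance-agreement term, while the AC1 is about 0.91. This shows only the general behavior of the two statistics; it says nothing about the values reported in the study.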

[Indexed for MEDLINE]