Ophthalmic Plast Reconstr Surg. 2018 Nov/Dec;34(6):544-546. doi: 10.1097/IOP.0000000000001080.

Soft Tissue Metrics in Thyroid Eye Disease: An International Thyroid Eye Disease Society Reliability Study.

Author information

Vanderbilt Eye Institute, Vanderbilt University School of Medicine, Nashville, Tennessee, U.S.A.
Department of Ophthalmology and Visual Sciences, Eye Care Centre, University of British Columbia, Vancouver, British Columbia, Canada.
Edward S. Harkness Eye Institute, Columbia University Medical Center, New York, New York, U.S.A.
Department of Neuroscience, University of Naples, Federico II, Naples, Italy.
King Khaled Eye Specialist Hospital, Riyadh, Saudi Arabia.
Hospital Universitario de Fuenlabrada, Universidad Rey Juan Carlos, Fuenlabrada, Madrid, Spain.
Department of Ophthalmology and Visual Science, Chinese University of Hong Kong, Hong Kong, SAR.
Department of Ophthalmology, Royal Brisbane and Women's Hospital, University of Queensland, Brisbane, Queensland, Australia.
Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, California, U.S.A.
Prasad Eye Institute, Hyderabad, Telangana, India.
Department of Ophthalmology, University of North Carolina, Chapel Hill, North Carolina, U.S.A.
Department of Ophthalmology, Otorhinolaryngology and Head and Neck Surgery, School of Medicine of Ribeirão Preto, University of São Paulo, São Paulo, Brazil.
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, U.S.A.



To determine the reliability of 3 scales for assessing soft tissue inflammatory and congestive signs associated with thyroid eye disease.


This multicenter, prospective, observational study recruited 55 adults with thyroid eye disease from 9 international centers. Six thyroid eye disease soft tissue features were measured, and each sign was graded using 3 scales (presence/absence [0-1], 3-point scale [0-2], and percentage [0-100]). Each eye was graded twice by 2 independent raters. For each inflammatory/congestive feature, accuracy (fraction of agreement) was calculated between the 2 trials for each rater (intrarater reliability) and between raters for all trials (interrater reliability), to determine the most sensitive scale for each feature that maintained agreement greater than 0.70.
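The agreement measure described above can be illustrated with a minimal sketch. It assumes that "fraction of agreement" means the share of paired grades that match exactly, and that scales are ordered from least to most sensitive. The function names and the grades are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence, Tuple


def fraction_agreement(first: Sequence[int], second: Sequence[int]) -> float:
    """Share of paired grades that match exactly (assumed definition)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty grade lists of equal length")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return matches / len(first)


def most_sensitive_reliable_scale(
    agreement_by_scale: Sequence[Tuple[str, float]],
    threshold: float = 0.70,
) -> Optional[str]:
    """Pick the most sensitive scale whose agreement exceeds the threshold.

    agreement_by_scale must be ordered from least to most sensitive.
    The comparison is strict (> threshold), following the Methods wording.
    """
    reliable = [name for name, value in agreement_by_scale if value > threshold]
    return reliable[-1] if reliable else None


# Hypothetical 0-2 grades for one feature from two raters (not study data).
rater_a = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
rater_b = [0, 1, 2, 1, 0, 1, 1, 1, 0, 2]
interrater = fraction_agreement(rater_a, rater_b)  # 9 of 10 grades match

# Hypothetical agreement values per scale for one feature.
chosen = most_sensitive_reliable_scale(
    [("0-1", 0.85), ("0-2", 0.74), ("0-100", 0.41)]
)
```

The same `fraction_agreement` call would serve for intrarater reliability if the two lists held one rater's first and second trials.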


Fifty-five patients had 218 assessments for 6 thyroid eye disease metrics. Intrarater reliability for each feature was consistently better than interrater reliability. Using an agreement threshold of 0.70 or better for the interrater tests, conjunctival and eyelid edema could be reliably measured using the 0-1 or 0-2 scale, whereas conjunctival and eyelid redness could be reliably measured only with the binary 0-1 scale. Caruncular edema and superior conjunctival redness could not be measured reliably between 2 raters with any scale. The percentage scale had poor agreement unless slippage intervals of >20% were allowed on either side of the measurements.
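The effect of a slippage interval on the percentage scale can be sketched as follows. The abstract does not state how slippage was computed, so this assumes two percentage grades count as agreeing when their absolute difference is within a tolerance in percentage points. The function name and the grades are hypothetical and are not study data.

```python
from typing import Sequence


def agreement_within_tolerance(
    first: Sequence[float], second: Sequence[float], tolerance: float
) -> float:
    """Share of paired grades whose absolute difference is <= tolerance.

    Assumed reading of a "slippage interval": a band of +/- tolerance
    percentage points around each measurement.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty grade lists of equal length")
    within = sum(1 for a, b in zip(first, second) if abs(a - b) <= tolerance)
    return within / len(first)


# Hypothetical 0-100 grades from two raters (absolute gaps: 10, 30, 15, 0, 25).
pct_a = [10, 40, 75, 0, 55]
pct_b = [20, 70, 60, 0, 30]

exact = agreement_within_tolerance(pct_a, pct_b, 0)     # only the 0/0 pair
band_20 = agreement_within_tolerance(pct_a, pct_b, 20)  # gaps of 10, 15, 0
band_30 = agreement_within_tolerance(pct_a, pct_b, 30)  # all five pairs
```

In this made-up example, agreement rises as the band widens, which is the general pattern the Results describe for the percentage scale.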


Among the periocular soft tissue inflammatory features from the Clinical Activity Score and the Vision, Inflammation, Strabismus, Appearance scale that were compared between raters, edema of the eyelids and conjunctiva could be measured reliably by both the 0-1 and 0-2 scales, whereas erythema of the eyelid and bulbar conjunctiva could be measured reliably only by the 0-1 scale. Superior bulbar erythema and caruncular edema were not reliably measured by any scale.

[Indexed for MEDLINE]
