Int J Health Geogr. 2016 Aug 3;15(1):27. doi: 10.1186/s12942-016-0056-6.

Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics.

Author information

1. Division of Biostatistics, Research Institute of Convergence for Biomedical Science and Technology, Pusan National University Yangsan Hospital, Pusan, Korea.
2. Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA. li.zhu@nih.gov.
3. Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
4. Information Management Services, Calverton, MD, USA.
5. Westat, Inc, Rockville, MD, USA.
6. Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA.

Abstract

BACKGROUND:

Spatial and space-time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and to detect disease outbreaks early. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate thousands of overlapping windows as potential clusters, while adjusting for multiple testing. Almost always, the method finds many very similar overlapping clusters, and it is not useful to report all of them. This paper proposes using the Gini coefficient to help select which of the many overlapping clusters to report.

METHODS:

The Gini coefficient provides a quick and intuitive way to evaluate the degree of heterogeneity in a collection of clusters, which helps explain how well the cluster collection reveals the true underlying cluster patterns. Using simulation studies and real cancer mortality data, the Gini-based approach is compared with the traditional approach for reporting non-overlapping clusters.
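
As an illustration of the idea, the following minimal sketch computes a population-weighted Gini coefficient from a Lorenz curve. It assumes the map is partitioned into the reported clusters plus one "rest of the map" region, each described by a case count and a population. The function name `lorenz_gini`, the input layout, and the numbers are hypothetical; the abstract does not specify the exact construction used in the paper or in SaTScan, so this should be read as a generic Gini calculation, not the authors' implementation.

```python
def lorenz_gini(cases, population):
    """Population-weighted Gini coefficient of case counts across regions.

    Assumption (not from the abstract): the regions partition the whole
    map, e.g. each reported cluster plus one remainder region.
    """
    if len(cases) != len(population) or not cases:
        raise ValueError("cases and population must be non-empty and equal length")
    total_cases = sum(cases)
    total_pop = sum(population)
    if total_cases <= 0 or total_pop <= 0:
        raise ValueError("totals must be positive")

    # Sort regions by rate (cases per person), lowest first, so the
    # Lorenz curve lies on or below the diagonal.
    regions = sorted(zip(cases, population), key=lambda cp: cp[0] / cp[1])

    area = 0.0            # area under the Lorenz curve (trapezoid rule)
    x_prev = y_prev = 0.0
    cum_cases = cum_pop = 0.0
    for c, p in regions:
        cum_cases += c
        cum_pop += p
        x = cum_pop / total_pop        # cumulative share of population
        y = cum_cases / total_cases    # cumulative share of cases
        area += (x - x_prev) * (y + y_prev) / 2
        x_prev, y_prev = x, y

    # Gini = 1 - 2 * (area under the Lorenz curve)
    return 1 - 2 * area


# Hypothetical example: one cluster (30 cases / 100 people)
# and the rest of the map (70 cases / 900 people).
example_gini = lorenz_gini([30, 70], [100, 900])
```

With these made-up numbers the coefficient is about 0.2; when every region has the same rate, the Lorenz curve coincides with the diagonal and the coefficient is 0.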

RESULTS:

The Gini coefficient can identify a more refined collection of non-overlapping clusters to report. For example, it is able to determine when it makes more sense to report a collection of smaller non-overlapping clusters rather than a single large cluster containing all of them. It also fulfils a set of desirable theoretical properties, such as being invariant when all population numbers are multiplied by the same constant.
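
The two points in this paragraph can be sketched with a small, self-contained example. It assumes that candidate cluster collections are compared by their Gini coefficient, with the larger value preferred; this selection rule, the helper `gini`, and all numbers are assumptions made for illustration and are not taken from the abstract.

```python
def gini(regions):
    """Gini coefficient for a list of (cases, population) pairs.

    Assumption: the pairs cover the whole map (clusters plus remainder).
    """
    total_cases = sum(c for c, _ in regions)
    total_pop = sum(p for _, p in regions)
    area = x0 = y0 = 0.0
    cum_cases = cum_pop = 0.0
    # Walk the regions from lowest to highest rate to trace the Lorenz curve.
    for c, p in sorted(regions, key=lambda r: r[0] / r[1]):
        cum_cases += c
        cum_pop += p
        x1, y1 = cum_pop / total_pop, cum_cases / total_cases
        area += (x1 - x0) * (y0 + y1) / 2   # trapezoid under the curve
        x0, y0 = x1, y1
    return 1 - 2 * area


# Hypothetical map with 100 cases among 1000 people.
one_large = [(40, 200), (60, 800)]            # one large cluster + rest
two_small = [(18, 50), (17, 50), (65, 900)]   # two small clusters + rest

# Under the assumed rule, the collection with the larger Gini is reported.
preferred = max([one_large, two_small], key=gini)

# Invariance check: multiplying every population by the same constant
# leaves the population shares, and hence the Gini, unchanged.
scaled = [(c, 10 * p) for c, p in two_small]
same_after_scaling = abs(gini(scaled) - gini(two_small)) < 1e-12
```

In this made-up case the two small clusters yield the larger coefficient, which corresponds to reporting them separately instead of as one large cluster.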

CONCLUSIONS:

The Gini coefficient can be used to determine which set of non-overlapping clusters to report. It has been implemented in the free SaTScan™ software version 9.3 (www.satscan.org).

KEYWORDS:

Cancer mortality; Cluster detection; Cluster reporting size; Disease surveillance; Gini coefficient; Log likelihood ratio; SaTScan; Scan statistic; Spatial statistics

PMID: 27488416
PMCID: PMC4971627
DOI: 10.1186/s12942-016-0056-6
[Indexed for MEDLINE]
Free PMC Article
