Bioinformatics. 2013 Nov 15;29(22):2884-91. doi: 10.1093/bioinformatics/btt498. Epub 2013 Aug 29.

A-clustering: a novel method for the detection of co-regulated methylation regions, and regions associated with exposure.

Author information

Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, SPH2, 4th floor, Boston, MA 02115, USA; Department of Statistics, University of Connecticut, 215 Glenbrook Road, Storrs, CT 06269, USA; NIEHS, Epidemiology Branch, MD A3-05, PO Box 12233, Research Triangle Park, NC 27709, USA; Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, 680 N Lake Shore Drive, Suite 1400, Chicago, IL 60611, USA; Department of Environmental Health and Department of Epidemiology, Harvard School of Public Health, 401 Park Drive, Landmark Ctr Room 415E, Boston, MA 02215, USA.



DNA methylation is a heritable, modifiable chemical process that affects gene transcription and is associated with other molecular markers (e.g. gene expression) and biomarkers (e.g. cancer or other diseases). Current technology measures methylation at hundreds of thousands, or millions, of CpG sites throughout the genome. Neighboring CpG sites are often highly correlated with each other, and current literature suggests that clusters of adjacent CpG sites are co-regulated.


We develop the Adjacent Site Clustering (A-clustering) algorithm to detect sets of neighboring CpG sites that are correlated with each other. To detect methylation regions associated with exposure, we propose an analysis pipeline for high-dimensional methylation data in which CpG sites within regions identified by A-clustering are modeled as multivariate responses to environmental exposure, using a generalized estimating equation approach that assumes exposure affects all sites in the cluster equally. We develop a correlation-preserving simulation scheme and study the proposed methodology via simulations. We also study the clusters detected by the algorithm in a high-dimensional dataset of peripheral blood methylation of pesticide applicators.
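To make the idea of grouping correlated neighboring CpG sites concrete, the following is a minimal Python sketch, not the Aclust implementation. The abstract does not state the merging rule, so everything here is an assumption made for illustration: the function names `pearson` and `adjacent_clusters`, the thresholds `min_corr` and `max_gap`, and the rule itself (a site joins the current cluster when it lies within `max_gap` base pairs of the previous site and their Pearson correlation across samples is at least `min_corr`). The published algorithm may use a different distance measure and merging scheme.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def adjacent_clusters(positions, betas, min_corr=0.75, max_gap=1000):
    """Group neighboring CpG sites into clusters (illustrative rule, see lead-in).

    positions: base-pair coordinates on one chromosome, sorted ascending.
    betas:     betas[i] holds the methylation values of site i across samples.
    Returns a list of clusters, each a list of site indices.
    """
    if not positions:
        return []
    clusters = [[0]]
    for i in range(1, len(positions)):
        close_enough = positions[i] - positions[i - 1] <= max_gap
        # Hypothetical merging rule: compare each site only with its left neighbor.
        if close_enough and pearson(betas[i - 1], betas[i]) >= min_corr:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


# Toy data: 5 sites measured in 4 samples.
positions = [100, 150, 220, 5000, 5050]
betas = [
    [0.10, 0.20, 0.30, 0.40],
    [0.12, 0.22, 0.29, 0.41],
    [0.20, 0.30, 0.45, 0.50],
    [0.90, 0.80, 0.70, 0.60],  # far from the first three sites
    [0.10, 0.50, 0.20, 0.60],  # nearby, but negatively correlated with the previous site
]
clusters = adjacent_clusters(positions, betas)
```

In this toy example the first three sites form one cluster, while the last two remain singletons: one is too far from its neighbor and the other is not positively correlated with it. In a pipeline like the one described above, only the multi-site clusters would then be passed on as multivariate responses to the exposure model.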


We provide the R package Aclust that efficiently implements the A-clustering and the analysis pipeline, and produces analysis reports. The package is found on


[Indexed for MEDLINE]
Free PMC Article
