Nat Commun. 2018 Aug 15;9(1):3265. doi: 10.1038/s41467-018-05691-7.

Decoding topologically associating domains with ultra-low resolution Hi-C data by graph structural entropy.

Li A1,2,3, Yin X4,5, Xu B6,7, Wang D6,7, Han J5, Wei Y8, Deng Y7, Xiong Y9, Zhang Z10,11.

Author information

1. State Key Laboratory of Software Development Environment, School of Computer Science, Beihang University, 100083, Beijing, P.R. China. angsheng@ios.ac.cn.
2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, P.R. China. angsheng@ios.ac.cn.
3. School of Computer Science, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China. angsheng@ios.ac.cn.
4. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, P.R. China.
5. School of Computer Science, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China.
6. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, 100101, Beijing, P.R. China.
7. School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China.
8. School of Mathematics, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China.
9. School of Physics, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China.
10. CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, 100101, Beijing, P.R. China. zhangzhihua@big.ac.cn.
11. School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, P.R. China. zhangzhihua@big.ac.cn.

Abstract

Submegabase-size topologically associating domains (TADs) have been observed in high-throughput chromatin interaction data (Hi-C). However, accurate detection of TADs depends on ultra-deep sequencing and sophisticated normalization procedures. Here we propose a fast and normalization-free method to decode the domains of chromosomes (deDoc) that utilizes structural information theory. By treating the Hi-C contact matrix as a representation of a graph, deDoc partitions the graph into segments with minimal structural entropy. We show that structural entropy can also be used to determine the proper bin size of the Hi-C data. By applying deDoc to pooled Hi-C data from 10 single cells, we detect megabase-size TAD-like domains. This result implies that the modular structure of the genome spatial organization may be fundamental to even a small cohort of single cells. Our algorithms may facilitate systematic investigations of chromosomal domains on a larger scale than has hitherto been possible.
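
To make the idea of partitioning a contact graph by structural entropy concrete, the following is a minimal sketch, not the deDoc implementation. It assumes the two-dimensional structural entropy from structural information theory, in which each bin is a node, each contact count is an edge weight, and a candidate set of domains is a partition of the bins into contiguous segments. The function name `structural_entropy`, the `boundaries` argument, and the choice to ignore self-contacts on the diagonal are illustrative assumptions, and the toy matrix is made up for demonstration.

```python
import numpy as np


def structural_entropy(contacts, boundaries):
    """Two-dimensional structural entropy of a partition into contiguous segments.

    contacts   : symmetric, non-negative matrix of raw contact counts (bins x bins)
    boundaries : increasing bin indices starting at 0 and ending at the number
                 of bins; segment j covers bins boundaries[j] .. boundaries[j+1]-1
    """
    w = np.array(contacts, dtype=float)   # copy, so the input is not modified
    np.fill_diagonal(w, 0.0)              # assumption: ignore self-contacts
    degree = w.sum(axis=1)                # weighted degree of each bin
    volume = degree.sum()                 # total volume of the graph
    if volume == 0:
        return 0.0

    entropy = 0.0
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        d = degree[start:end]
        seg_volume = d.sum()
        if seg_volume == 0:
            continue
        # Weight of edges with exactly one endpoint inside the segment.
        cut = seg_volume - w[start:end, start:end].sum()
        d = d[d > 0]
        # Uncertainty of locating a bin once its segment is known.
        entropy -= np.sum(d / volume * np.log2(d / seg_volume))
        # Uncertainty of entering the segment from outside.
        entropy -= cut / volume * np.log2(seg_volume / volume)
    return float(entropy)


# Toy contact matrix (made-up counts): two dense 2-bin blocks, weakly linked.
toy = np.array([
    [0, 10, 0, 0],
    [10, 0, 1, 0],
    [0, 1, 0, 10],
    [0, 0, 10, 0],
])

aligned = structural_entropy(toy, [0, 2, 4])     # boundary between the blocks
misplaced = structural_entropy(toy, [0, 1, 4])   # boundary inside a block
print(aligned < misplaced)
```

In this toy case the partition whose boundary falls between the two dense blocks has the lower entropy, which is the sense in which a minimal-entropy partition recovers domain-like segments. Because the quantity is computed directly from raw counts and their ratios, the sketch needs no separate normalization step; how deDoc searches over partitions is not described in the abstract and is not reproduced here.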
