Proc Natl Acad Sci U S A. 2019 Jul 9;116(28):14011-14018. doi: 10.1073/pnas.1901423116. Epub 2019 Jun 24.

Robust single-cell Hi-C clustering by convolution- and random-walk-based imputation.

Author information

1. Genomic Analysis Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037.
2. Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093.
3. Department of Medicine, University of California San Diego, La Jolla, CA 92093.
4. Computational Neurobiology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037.
5. Division of Biological Sciences, University of California San Diego, La Jolla, CA 92093.
6. Department of Bioengineering, University of California San Diego, La Jolla, CA 92093.
7. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801.
8. Peptide Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037.
9. Genomic Analysis Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037; ecker@salk.edu.
10. Howard Hughes Medical Institute, Salk Institute for Biological Studies, La Jolla, CA 92037.

Abstract

Three-dimensional genome structure plays a pivotal role in gene regulation and cellular function. Single-cell analysis of genome architecture has been achieved using imaging and chromatin conformation capture methods such as Hi-C. To study variation in chromosome structure between different cell types, computational approaches are needed that can utilize sparse and heterogeneous single-cell Hi-C data. However, few methods can accurately and efficiently cluster such data into constituent cell types. Here, we describe scHiCluster, a single-cell clustering algorithm for Hi-C contact matrices that is based on imputations using linear convolution and random walk. Using both simulated and real single-cell Hi-C data as benchmarks, scHiCluster significantly improves clustering accuracy on low-coverage datasets compared with existing methods. After imputation by scHiCluster, topologically associating domain (TAD)-like structures (TLSs) can be identified within single cells, and their consensus boundaries are enriched at the TAD boundaries observed in bulk-cell Hi-C samples. In summary, scHiCluster facilitates visualization and comparison of single-cell 3D genomes.
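The two imputation steps named in the abstract, linear convolution and random walk, can be illustrated with a minimal sketch. This is not the scHiCluster implementation: the function names, the window size (`pad`), the use of a random walk with restart, and the restart probability are all assumptions chosen for illustration, and the input is assumed to be a symmetric contact matrix for a single chromosome in a single cell.

```python
import numpy as np


def convolve_neighbors(contacts, pad=1):
    """Smooth a square contact matrix with a (2*pad+1) x (2*pad+1) mean filter.

    `pad` is a hypothetical parameter; zero padding is used at the edges.
    """
    n = contacts.shape[0]
    padded = np.pad(contacts.astype(float), pad, mode="constant")
    width = 2 * pad + 1
    smoothed = np.zeros((n, n))
    # Sum every shifted copy of the matrix that falls inside the window.
    for di in range(width):
        for dj in range(width):
            smoothed += padded[di:di + n, dj:dj + n]
    return smoothed / width**2


def random_walk_with_restart(matrix, restart=0.5, tol=1e-6, max_iter=100):
    """Propagate contacts along the graph defined by `matrix`.

    `restart`, `tol`, and `max_iter` are illustrative values, not published ones.
    """
    n = matrix.shape[0]
    row_sums = matrix.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for empty rows
    transition = matrix / row_sums  # row-normalized transition matrix
    identity = np.eye(n)
    walk = identity.copy()
    for _ in range(max_iter):
        updated = (1 - restart) * walk @ transition + restart * identity
        converged = np.linalg.norm(updated - walk) < tol
        walk = updated
        if converged:
            break
    return walk


def impute(contacts, pad=1, restart=0.5):
    """Apply convolution smoothing, then the random walk, to one contact matrix."""
    return random_walk_with_restart(convolve_neighbors(contacts, pad), restart)
```

In this sketch, the convolution fills in contacts between neighboring bins, and the random walk then spreads signal along indirect paths, so a very sparse input matrix becomes much denser. Any downstream clustering would operate on the imputed matrices rather than on the raw counts.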

KEYWORDS:

3D chromosome structure; Hi-C; random walk; single cell

Conflict of interest statement

J.R.E., J.R.D., and A.K. are coauthors on a 2015 research article. J.R.E. and A.K. are coauthors on a 2017 correspondence article.
