BMC Genomics. 2017 Jul 12;18(1):527. doi: 10.1186/s12864-017-3879-z.

Scaffolding of long read assemblies using long range contact information.

Author information

1. Department of Computer Science, University of Maryland, 20742 College Park, Maryland, USA.
2. Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 21702 Bethesda, Maryland, USA.
3. Cell Wall Biology and Utilization Research, US Dairy Forage Research Center, 53706 Madison, Wisconsin, USA.
4. Pacific Biosciences, 94205 Menlo Park, California, USA. jchin@pacificbiosciences.com.

Abstract

BACKGROUND:

Long read technologies have revolutionized de novo genome assembly by generating contigs orders of magnitude longer than those of short read assemblies. Although assembly contiguity has increased, long read assemblies usually do not reconstruct a full chromosome or a chromosome arm, so the result falls short of a finished chromosome-level assembly. To increase assembly contiguity to the chromosome level, different strategies are used that exploit long range contact information between chromosomes in the genome.

METHODS:

We develop a scalable and computationally efficient scaffolding method that can substantially increase assembly contiguity using genome-wide chromatin interaction data such as Hi-C.

RESULTS:

We demonstrate an algorithm that uses Hi-C data for longer-range scaffolding of de novo long read genome assemblies. We tested our method on human and goat genome assemblies and compared our scaffolds with those generated by LACHESIS using various metrics.

CONCLUSION:

Our new algorithm, SALSA, produces more accurate scaffolds than the existing state-of-the-art method, LACHESIS.

KEYWORDS:

Assembly; Hi-C; Long reads; Scaffolding

PMID: 28701198
PMCID: PMC5508778
DOI: 10.1186/s12864-017-3879-z
[Indexed for MEDLINE]
Free PMC Article
