Sci Rep. 2015 Nov 20;5:16894. doi: 10.1038/srep16894.

Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

Author information

1 Department of Biosciences and Informatics, Keio University, Japan.
2 Department of Genome Medicine, National Research Institute for Child Health and Development, NCCHD, Japan.
3 Department of Applied Developmental Biology, Central Institute for Experimental Animals, Japan.
4 Principles of Informatics Research Division, National Institute of Informatics, Japan.
5 Center for Information Biology, National Institute of Genetics, Japan.
6 Preventive Medicine and Diagnosis Innovation Program, RIKEN, Japan.
7 Laboratory for Symbolic Cognitive Development, Brain Science Institute RIKEN, Japan.
8 Keio Advanced Research Center, Japan.
9 Department of Physiology, Keio University School of Medicine, Japan.
10 Laboratory for Marmoset Neural Architecture, RIKEN Brain Science Institute.

Abstract

The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium (MGSAC). The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacchus-3.2.1; however, the draft genome still contains 187,214 undetermined gap regions, as well as supercontigs and relatively short contigs that are not mapped to chromosomes. We resequenced and assembled the common marmoset genome by deep sequencing with high-throughput sequencing technology. Several sequencing runs on Illumina platforms yielded 181 Gbp of high-quality bases, corresponding to approximately 60× coverage, including mate-pair reads with long insert lengths of 3, 8, 20, and 40 Kbp. The resequencing significantly improved the MGSAC draft genome sequence. The contig N50, a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs. The improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs, and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
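
The abstract relies on two summary figures, the contig N50 and the fold coverage, without defining how they are computed. The Python sketch below illustrates the standard definitions and is not the authors' pipeline. The function names `n50` and `fold_coverage` are hypothetical, the contig lengths are toy values chosen only to show an N50 doubling, and the 3.0 Gbp genome size is an assumption inferred from 181 Gbp at roughly 60×, not a figure stated in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy contig lengths (arbitrary units), not data from the paper.
toy_before = [8, 8, 4, 3, 3, 2, 2, 2]
toy_after = [16, 8, 8]  # same total length, fewer and longer contigs

print(n50(toy_before), n50(toy_after))

# Assumed genome size of 3.0 Gbp (not stated in the abstract).
print(round(fold_coverage(181e9, 3.0e9), 1))
```

Merging contigs leaves the total assembled length unchanged while shifting bases into longer sequences, which is why N50 rises when gaps are closed and unplaced contigs are joined to chromosomal ones.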

PMID: 26586576
PMCID: PMC4653617
DOI: 10.1038/srep16894
[Indexed for MEDLINE]
Free PMC Article
