Genome Res. 2019 Jun;29(6):1009-1022. doi: 10.1101/gr.244830.118. Epub 2019 May 23.

Recompleting the Caenorhabditis elegans genome.

Author information

1 Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8583, Japan.
2 Department of Pathology, Stanford University, Stanford, California 94305, USA.
3 Department of Genetics, Stanford University, Stanford, California 94305, USA.
4 Department of Zoology and Michael Smith Laboratories, University of British Columbia, Vancouver V6T 1Z3, British Columbia, Canada.
5 Department of Genetics, Cell Biology, and Development, University of Minnesota, Minneapolis, Minnesota 55454, USA.
6 Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA.
# Contributed equally

Abstract

Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.
