Send to

Choose Destination
Nucleic Acids Res. 2009 Jan;37(1):289-97. doi: 10.1093/nar/gkn916. Epub 2008 Nov 28.

Assessing the gene space in draft genomes.

Author information

UC Davis Genome Center, University of California Davis, Davis, CA, USA.


Genome sequencing projects have been initiated for a wide range of eukaryotes. A few projects have reached completion, but most exist as draft assemblies. As one of the main reasons to sequence a genome is to obtain its catalog of genes, an important question is how complete or completable the catalog is in unfinished genomes. To answer this question, we have identified a set of core eukaryotic genes (CEGs), that are extremely highly conserved and which we believe are present in low copy numbers in higher eukaryotes. From an analysis of a phylogenetically diverse set of eukaryotic genome assemblies, we found that the proportion of CEGs mapped in draft genomes provides a useful metric for describing the gene space, and complements the commonly used N50 length and x-fold coverage values.

[Indexed for MEDLINE]
Free PMC Article

Supplemental Content

Full text links

Icon for Silverchair Information Systems Icon for PubMed Central
Loading ...
Support Center