J Biomol Struct Dyn. 2009 Feb;26(4):413-20.

Identify protein-coding genes in the genomes of Aeropyrum pernix K1 and Chlorobium tepidum TLS.

Author information

School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China.


How many protein-coding genes the genome of Aeropyrum pernix K1 contains has been a matter of debate since the genome was sequenced in 1999. In this paper, the protein-coding genes in A. pernix K1 are identified from the original and current NITE annotations using the Aper_ORFs method. Of 704 experimentally validated genes, 702 are correctly predicted as coding, giving the method a sensitivity of 702/704, or approximately 99.7%. This sensitivity is one percent higher than that of the versatile bacterial gene-finding program ZCURVE 1.0. The number of genes determined in this work is 1699, which is very close to the 1700 of the current NITE annotation. The agreement between these two independent predictions may therefore end the ten-year controversy over the gene number in this genome. Furthermore, the Aper_ORFs method is extended to identify protein-coding genes in the genome of Chlorobium tepidum TLS, where about 98% of the genes with known function are correctly predicted as coding. In addition, 188 hypothetical ORFs are identified as non-coding in the genome. Mapping point analysis shows that the base frequency distribution of these ORFs differs from that of the genes with known function, suggesting that most of them do not encode proteins. It is hoped that the Aper_ORFs method will become a useful tool for gene annotation in newly sequenced bacterial and archaeal genomes, provided that their G+C content is similar to that of A. pernix.
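The sensitivity figure quoted in the abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and is not part of the Aper_ORFs method: the function name `sensitivity` and its parameters are assumptions, and the only inputs taken from the abstract are the counts 702 and 704. It uses the standard definition of sensitivity as true positives divided by the sum of true positives and false negatives.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of known coding genes that were predicted as coding."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("at least one validated gene is required")
    return true_positives / total


# Counts taken from the abstract: 702 of 704 validated genes predicted as coding.
validated_genes = 704
correctly_predicted = 702
missed = validated_genes - correctly_predicted  # genes not predicted as coding

value = sensitivity(correctly_predicted, missed)
print(f"sensitivity = {value:.1%}")  # prints "sensitivity = 99.7%"
```

Only the two missed genes separate the result from 100%, which is why the figure rounds to 99.7%.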

[Indexed for MEDLINE]
