Nucleic Acids Res. Sep 11, 1982; 10(17): 5303–5318.
PMCID: PMC320873

Recognition of protein coding regions in DNA sequences.


We give a test for protein coding regions which is based on simple and universal differences between protein-coding and noncoding DNA. The test is simple enough to use without a computer and is completely objective. The test has been thoroughly proven on 400,000 bases of sequence data: it misclassifies 5% of the regions tested and gives an answer of "No Opinion" one fifth of the time. We predict some new coding and noncoding regions in published sequences.
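The abstract does not spell out the statistic behind the test. As an illustration only, one simple difference between coding and noncoding DNA is that base frequencies tend to vary with codon position, so a period-3 imbalance can hint at a coding region. The sketch below computes two features for each base: an asymmetry ratio across the three positions modulo 3, and the overall base content. The function name `period3_features`, the choice of features, and the decision to stop before any classification are assumptions made for this example; they should not be read as the paper's exact procedure.

```python
BASES = "ACGT"

def period3_features(seq):
    """Return {base: (asymmetry, content)} for a DNA string.

    asymmetry = max(count per position mod 3) / (min(count per position mod 3) + 1)
    content   = fraction of counted bases that are this base
    Characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    # counts[base][k] = occurrences of base at indices k, k+3, k+6, ...
    counts = {b: [0, 0, 0] for b in BASES}
    for i, ch in enumerate(seq):
        if ch in counts:
            counts[ch][i % 3] += 1

    total = sum(sum(c) for c in counts.values())
    features = {}
    for b in BASES:
        c = counts[b]
        asymmetry = max(c) / (min(c) + 1)   # +1 avoids division by zero
        content = sum(c) / total if total else 0.0
        features[b] = (asymmetry, content)
    return features


if __name__ == "__main__":
    # A strongly periodic toy sequence: A always falls in the first position.
    for base, (asym, cont) in period3_features("ATG" * 10).items():
        print(base, round(asym, 3), round(cont, 3))
```

A sequence with no positional preference gives ratios near 1 or below, while a strongly periodic one gives large ratios. Turning such features into a "coding", "noncoding", or "No Opinion" verdict would need calibrated thresholds, which are not given here.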



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press

