Nat Genet. 1993 Mar;3(3):266-72.

Identification of protein coding regions by database similarity search.

Author information

  • National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland 20894-0001.


Sequence similarity between a translated nucleotide sequence and a known biological protein can provide strong evidence for the presence of a homologous coding region, even between distantly related genes. The computer program BLASTX performs conceptual translation of a nucleotide query sequence followed by a protein database search in one programmatic step. We characterized the sensitivity of BLASTX recognition to the presence of substitution, insertion and deletion errors in the query sequence and to sequence divergence. Reading frames were reliably identified in the presence of 1% query errors, a rate that is typical for primary sequence data. BLASTX is appropriate for use in moderate and large scale sequencing projects at the earliest opportunity, when the data are most prone to containing errors.
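To make "conceptual translation" concrete, the following is a minimal Python sketch of translating a nucleotide query in all six reading frames (three offsets on each strand) using the standard genetic code. It is an illustration only and is not taken from BLASTX itself: the function names (`reverse_complement`, `translate`, `six_frame_translations`), the frame labels, and the use of `X` for codons containing ambiguous bases are assumptions made for this example, and the protein database search step is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map hypothetical frame labels '+1'..'+3', '-1'..'-3' to protein strings."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

A single inserted or deleted base moves the downstream coding region into a different frame, so the protein comparison needs every frame's translation to recover a reading frame from error-prone query data.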

[PubMed - indexed for MEDLINE]