Proc Natl Acad Sci U S A. 1996 Feb 20;93(4):1560-5.

Trinucleotide repeats and long homopeptides in genes and proteins associated with nervous system disease and development.

Department of Mathematics, Stanford University, CA 94305-2125, USA.


Several human neurological disorders are associated with proteins containing abnormally long runs of glutamine residues. Strikingly, most of these proteins contain two or more additional long runs of amino acids other than glutamine. We screened the current human, mouse, Drosophila, yeast, and Escherichia coli protein sequence data bases and identified all proteins containing multiple long homopeptides. This search found multiple long homopeptides in about 12% of Drosophila proteins but in only about 1.7% of human, mouse, and yeast proteins and none among E. coli proteins. Most of these sequences show other unusual sequence features, including multiple charge clusters and excessive counts of homopeptides of length ≥ two amino acid residues. Intriguingly, a large majority of the identified Drosophila proteins are essential developmental proteins and, in particular, most play a role in central nervous system development. Almost half of the human and mouse proteins identified are homeotic homologs. The role of long homopeptides in fine-tuning protein conformation for multiple functional activities is discussed. The relative contributions of strand slippage and of dynamic mutation are also addressed. Several new experiments are proposed.
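
The screen described in the abstract, flagging proteins that contain multiple long homopeptides, can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not state the length cutoff for a "long" run, so the `min_len=5` and `min_runs=2` defaults, like the function names, are hypothetical choices made only for illustration.

```python
import re
from typing import List, Tuple

# Greedy match of one residue followed by repeats of itself,
# so each match is a maximal single-residue run.
_RUN = re.compile(r"(.)\1*")


def homopeptide_runs(seq: str, min_len: int = 5) -> List[Tuple[str, int, int]]:
    """Return (residue, start, length) for every maximal run of one
    amino acid with length >= min_len. min_len is an assumed cutoff."""
    runs = []
    for match in _RUN.finditer(seq.upper()):
        length = match.end() - match.start()
        if length >= min_len:
            runs.append((match.group(1), match.start(), length))
    return runs


def has_multiple_long_homopeptides(
    seq: str, min_len: int = 5, min_runs: int = 2
) -> bool:
    """True if the sequence has at least min_runs long homopeptides.
    Both thresholds are illustrative assumptions."""
    return len(homopeptide_runs(seq, min_len)) >= min_runs


if __name__ == "__main__":
    # Toy sequence (not a real protein): one glutamine run, one alanine run.
    toy = "MAQQQQQQLSAAAAAGK"
    print(homopeptide_runs(toy))
    print(has_multiple_long_homopeptides(toy))
```

Applying `has_multiple_long_homopeptides` to every entry in a protein sequence collection and dividing the number of hits by the number of entries would give a per-species fraction comparable in kind to the percentages quoted above, although the values would depend on the chosen cutoffs and database.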

[Indexed for MEDLINE]
Free PMC Article