Sci Rep. 2016 Jun 3;6:26941. doi: 10.1038/srep26941.

Non-random distribution of homo-repeats: links with biological functions and human diseases.

Author information

Group of Bioinformatics, Institute of Protein Research, Russian Academy of Sciences, 4 Institutskaya str., Pushchino, Moscow Region, 142290, Russia.
Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain.
Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain.
Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, 08010 Barcelona, Spain.


The biological function of multiple repetitions of single amino acids, or homo-repeats, is largely unknown, but their occurrence in proteins has been associated with more than 20 hereditary diseases. Analysing 122 bacterial and eukaryotic genomes, we observed that the number of proteins containing homo-repeats is significantly larger than expected from theoretical estimates. Analysis of statistical significance indicates that the minimal size of homo-repeats varies with amino acid type and proteome. In an attempt to characterize proteins harbouring long homo-repeats, we found that those containing repeats of the polar or small amino acids S, P, H, E, D, K, Q and N are enriched in structural disorder as well as in protein and RNA interactions. We observed that E, S, Q, G, L, P, D, A and H homo-repeats are strongly linked with human diseases. Moreover, S, E, P, A, Q, D and T homo-repeats are significantly enriched in neuronal proteins associated with autism and other disorders. We release a webserver for further exploration of homo-repeat occurrence in human pathology at
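
To make the notion of a homo-repeat concrete, the following is a minimal sketch of how runs of a single amino acid might be located in a protein sequence. It is not the authors' method: the function name `find_homo_repeats`, the example sequence, and the default threshold of 5 residues are all assumptions made for illustration. The abstract itself notes that the minimal significant repeat size varies with amino acid type and proteome, so a single fixed threshold is a simplification.

```python
import re
from typing import List, Tuple


def find_homo_repeats(sequence: str, min_length: int = 5) -> List[Tuple[str, int, int]]:
    """Return (residue, start, length) for each run of one repeated residue.

    `min_length` is a hypothetical fixed cutoff chosen for illustration;
    it is not a value taken from the paper.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # One letter followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start(), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


# Made-up sequence with a run of six Q and a run of five S.
example = "MKQQQQQQAASSSSSTP"
repeats = find_homo_repeats(example)
```

Each tuple gives the repeated residue, its zero-based start position, and the run length, so per-residue thresholds could be applied afterwards by filtering on the first and third fields.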

[Indexed for MEDLINE]
Free PMC Article
