Bringing order to protein disorder through comparative genomics and genetic interactions

Genome Biol. 2011;12(2):R14. doi: 10.1186/gb-2011-12-2-r14. Epub 2011 Feb 16.

Abstract

Background: Intrinsically disordered regions are widespread, especially in proteomes of higher eukaryotes. Recently, protein disorder has been associated with a wide variety of cellular processes and has been implicated in several human diseases. Despite its apparent functional importance, the sheer range of different roles played by protein disorder often makes its exact contribution difficult to interpret.

Results: We attempt to better understand the different roles of disorder using a novel analysis that leverages both comparative genomics and genetic interactions. Strikingly, we find that disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but the amino acid sequence evolves quickly (flexible disorder); regions where both disorder and the amino acid sequence are highly conserved (constrained disorder); and, lastly, non-conserved disorder. Flexible disorder bears many of the characteristics commonly attributed to disorder and is associated with signaling pathways and multi-functionality. Conversely, constrained disorder has markedly different functional attributes and is associated with RNA binding and protein chaperones. Finally, non-conserved disorder lacks clear functional hallmarks based on our analysis.
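
The three-way partition described in the Results can be illustrated with a minimal sketch. It assumes each residue already carries two precomputed scores in the range 0 to 1: the fraction of orthologs in which the aligned position is predicted disordered, and the fraction in which the amino acid is unchanged. The function name `classify_residue`, both score definitions, and the 0.5 cutoffs are hypothetical choices made for illustration; the abstract does not specify how conservation was scored or thresholded.

```python
def classify_residue(
    is_disordered: bool,
    disorder_conservation: float,
    sequence_conservation: float,
    disorder_cutoff: float = 0.5,   # hypothetical threshold, not from the paper
    sequence_cutoff: float = 0.5,   # hypothetical threshold, not from the paper
) -> str:
    """Assign a residue to one of the categories named in the abstract.

    disorder_conservation: assumed fraction of orthologs whose aligned
        residue is also predicted disordered.
    sequence_conservation: assumed fraction of orthologs whose aligned
        residue has the same amino acid.
    """
    if not is_disordered:
        return "ordered"
    if disorder_conservation < disorder_cutoff:
        # Disorder is not maintained across species.
        return "non-conserved disorder"
    if sequence_conservation >= sequence_cutoff:
        # Disorder and amino acid sequence are both maintained.
        return "constrained disorder"
    # Disorder is maintained while the sequence changes quickly.
    return "flexible disorder"


if __name__ == "__main__":
    examples = [
        (True, 0.9, 0.2),
        (True, 0.9, 0.8),
        (True, 0.1, 0.8),
        (False, 0.0, 0.9),
    ]
    for args in examples:
        print(args, classify_residue(*args))
```

Checking disorder conservation before sequence conservation mirrors the order of the definitions: sequence conservation only distinguishes flexible from constrained disorder once disorder itself is known to be conserved.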

Conclusions: Our new perspective on protein disorder clarifies a variety of previous results by putting them into a systematic framework. Moreover, the clear and distinct functional associations of flexible and constrained disorder will allow for new approaches and more specific algorithms for disorder detection in a functional context. Finally, in flexible disordered regions, we demonstrate clear evolutionary selection for protein disorder with little selection on primary structure, which has important implications for sequence-based studies of protein structure and evolution.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Conserved Sequence
  • Databases, Protein
  • Escherichia coli
  • Evolution, Molecular
  • Genomics / methods*
  • Humans
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Folding*
  • Protein Structure, Tertiary / genetics*
  • Proteins / chemistry*
  • Proteins / genetics
  • Transcriptome

Substances

  • Proteins