Wolfson Centre for Age-Related Diseases, The Wolfson Wing, Hodgkin Building, King's College London, London SE1 1UL, UK. gareth.2.williams@kcl.ac.uk
BACKGROUND: The wealth of information on protein structure has led to a variety of statistical analyses of the role played by individual amino acid types in the protein fold. In particular, the contact propensities between the various amino acids can be converted into folding energies that have proved useful in structure prediction. The present study addresses how these folding propensities relate to the evolutionary relationship between residues.
RESULTS: The contact preferences of residue types observed in a representative sample of protein structures are converted into a residue similarity matrix, expressed as an inter-residue distance matrix. Remarkably, these distances correlate closely with evolutionary substitution costs. Residue vectors are then derived from the distance matrix. These vectors give a concrete picture of how residues group into families sharing properties crucial for protein folding.
CONCLUSIONS: Inter-residue distances have proved useful in showing the explicit relationship between contact preferences and evolutionary substitution rates. It is proposed that the distance matrix derived from structural analysis may be useful in aligning proteins where remote homologs share structural features. Residue vectors derived from the distance matrix illustrate the spatial arrangement of residues and point to ways in which they can be grouped.
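The abstract does not specify how contact preferences are turned into distances, so the following is only a minimal sketch of one plausible route, not the author's method. It assumes log-odds contact propensities and a Euclidean distance between the propensity profiles of two residue types. The three residue classes, the contact counts, and the substitution costs are all made-up toy values, and the function names are hypothetical.

```python
import math

def log_odds_propensities(counts):
    """Log-odds of observed vs. expected contacts for each residue pair.

    Assumption: expected contacts follow from the row totals alone.
    """
    row_totals = [sum(row) for row in counts]
    grand_total = sum(row_totals)
    n = len(counts)
    return [
        [math.log(counts[i][j] * grand_total / (row_totals[i] * row_totals[j]))
         for j in range(n)]
        for i in range(n)
    ]

def profile_distances(propensities):
    """Euclidean distance between the propensity rows of each residue pair."""
    return [[math.dist(a, b) for b in propensities] for a in propensities]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy symmetric contact counts for three hypothetical residue classes:
# index 0 and 1 are "hydrophobic-like", index 2 is "polar-like".
counts = [
    [40, 35, 10],
    [35, 45, 12],
    [10, 12, 50],
]
distances = profile_distances(log_odds_propensities(counts))

# Made-up substitution costs for the pairs (0,1), (0,2), (1,2),
# standing in for values one would read from a real substitution matrix.
toy_costs = [1.0, 4.0, 3.5]
pair_distances = [distances[0][1], distances[0][2], distances[1][2]]
r = pearson(pair_distances, toy_costs)
```

In this toy setting the two hydrophobic-like classes end up closer to each other than either is to the polar-like class, which is the kind of grouping the abstract describes. With real data, `counts` would be a 20 x 20 matrix tallied from a representative set of structures, and `toy_costs` would be replaced by costs from an established substitution matrix.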