NIH Public Access Author Manuscript; accepted for publication in a peer-reviewed journal.
Proteins. Author manuscript; available in PMC Sep 16, 2012.
Published in final edited form as:
Published online Sep 16, 2011. doi:  10.1002/prot.23161
PMCID: PMC3212657
NIHMSID: NIHMS323389

Evaluation of disorder predictions in CASP9

Abstract

Lack of stable three-dimensional structure, or intrinsic disorder, is a common phenomenon in proteins. Naturally unstructured regions have been shown to be essential for the function of many proteins, and identification of such regions is therefore an important issue. CASP has been assessing the state of the art in predicting disordered regions from amino acid sequence since 2002. Here we present the results of the evaluation of the disorder predictions submitted to CASP9. The assessment is based on the evaluation measures and procedures used in previous CASPs. The balanced accuracy and the Matthews correlation coefficient were chosen as basic measures for evaluating the correctness of binary classifications. The area under the receiver operating characteristic curve was the measure of choice for evaluating probability-based predictions of disorder. The CASP9 methods are shown to perform slightly better than the CASP7 methods but not better than the methods in CASP8. It was also shown that the capability of most CASP9 methods to predict disorder decreases with increasing minimum disorder segment length.

Keywords: CASP, intrinsically disordered proteins, unstructured proteins, prediction of disordered regions, assessment of disorder prediction

INTRODUCTION

It has been widely accepted that the ability of proteins to perform specific cellular functions is directly associated with their unique spatial structure1. Numerous experiments have shown that proteins lose their activity upon loss of ordered structure due to exposure to non-physiological environments such as high temperature, urea or acid. It was also shown that a denatured protein can regain practically all of its original activity by recovering its structure upon restoration of physiological conditions2. Based on these observations, the concept that proteins achieve their biological function upon folding into unique structural conformations became widely accepted. In the last two decades, ample evidence has been collected of proteins that do not follow this general rule3–6. These so-called naturally unstructured or intrinsically disordered proteins (IDPs) lack stable structures under physiological conditions but are nevertheless biologically active. Many other proteins contain structured regions alongside extended intrinsically disordered regions (IDRs) that often play an important functional role (e.g., BRCA1, a breast cancer susceptibility protein, contains approximately 1,500 unstructured non-termini residues participating in repairing damaged DNA7). The IDPs and proteins including IDRs are highly abundant in both eukaryotes and prokaryotes8–10 and tend to be enriched for regulatory functions related to molecular recognition and signal transduction11–14. The number of experimentally verified IDPs and IDRs is rapidly rising15: the DisProt database16 currently contains annotations for more than 640 proteins with disordered regions, and recent reviews on this topic6,11,12 cite hundreds of papers. The association of several IDPs with human disease, such as cancer, cardiovascular disease, amyloidoses, diabetes, neurodegenerative diseases, and others, has triggered additional research on the subject17–19.

With the high level of interest in disordered proteins, substantial effort has been devoted to developing experimental and computational methods to study this phenomenon12,20. Computational methods have quickly become a particularly valuable tool, in part because of their ability to keep pace with the large-scale genome sequencing projects. These techniques are based on the premise that the amino acid sequence encodes protein non-folding similarly to protein folding. Indeed, comparison of composition and complexity of protein sequences in ordered and disordered regions shows that they are statistically different12,21. Based on this observation, many methods were built to predict the IDRs through recognition of amino acid motifs characteristic of disorder.

The first formal method for computational protein disorder prediction was published in 199722, and, since then, more than fifty methods to identify disorder have been developed23–25. In a recent review, He et al.24 provide a historical perspective of progress in this field, pointing out also the important role that the CASP experiments have played in these advancements since 200226.

The present paper analyzes the results obtained by the thirty-two disorder prediction groups participating in CASP9. While the initial round of disorder prediction (2002) was assessed by the organizers, the following three rounds were evaluated by independent assessors (2004, 2006, 2008). Since the methods to evaluate disorder prediction in CASP have developed to the point where assessments are relatively straightforward and can be made fully automatically, in CASP9 the evaluations were again performed by the organizers.

MATERIALS AND METHODS

Targets and definition of disorder

One hundred and twenty nine targets were released for modeling in CASP9 and all of them were made available for disorder prediction. The structures of twelve targets were not solved in time for prediction assessment and thus were canceled27, leaving 117 targets (98 X-ray and 19 NMR structures) for assessment*. Structure data for five of these targets (T0533, T0536, T0600, T0612, T0637) were compromised in the period between the corresponding server and human prediction deadlines27 and therefore only the server predictions were evaluated on these targets.

Disorder regions for each target were defined based on the best structure determination available at the time of the assessment and using the sequence released for prediction (which sometimes was slightly different than the sequence later deposited in the PDB). In cases where both the NMR and X-ray structures were available (T0551), the X-ray structure was used.

Disorder in CASP9 was defined similarly to the previous CASPs28–31. A residue was considered to be in a disordered state if it appeared in the protein’s amino acid sequence but either (1) lacked the spatial coordinates or (2) showed a high conformational variability across different X-ray chains or NMR models. We have defined “high variability” as cases where distances between positions of the same residue in any pair of models in the NMR ensemble or in any pair of X-ray chains in the asymmetric unit exceeded 3.5Å in the optimal LGA32 superposition. In all other cases, the residue was assumed to be ordered. This is an oversimplification as residues may be disordered under physiological conditions but forced into the “ordered” state by crystallization. Also, long disordered regions often contain “dual personality” fragments33 that become structured when binding to a partner. These segments are often predicted to be ordered even though they are disordered in the absence of their partner34. However, such transitions are impossible to detect given only the crystal structure of the isolated protein.
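The two-part definition above can be illustrated with a short sketch. It is not the assessors' implementation: the function name `is_disordered`, the use of one representative position per residue, and the choice to treat a residue as disordered when its coordinates are missing in any copy are assumptions made for this example, and the positions are assumed to be already superimposed (the LGA step is not reproduced).

```python
from itertools import combinations
from math import dist

CUTOFF = 3.5  # angstroms, the deviation threshold quoted in the text


def is_disordered(positions, cutoff=CUTOFF):
    """Classify one residue from its positions in several chains or models.

    positions: one entry per X-ray chain or NMR model, already superimposed.
    Each entry is an (x, y, z) tuple, or None when the residue has no
    coordinates in that copy (assumed encoding).
    """
    # Criterion (1): missing coordinates (assumption: missing in any copy).
    if any(p is None for p in positions):
        return True
    # Criterion (2): some pair of copies deviates by more than the cutoff.
    return any(dist(a, b) > cutoff for a, b in combinations(positions, 2))


print(is_disordered([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))   # small deviation
print(is_disordered([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]))   # deviation above 3.5
print(is_disordered([(0.0, 0.0, 0.0), None]))               # missing coordinates
```

With a single chain and no missing coordinates there are no pairs to compare, so the residue is reported as ordered, which matches the "in all other cases" rule.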

Overall, 2,677 residues (or 10.2% of residues in all CASP9 targets) were classified as disordered, including 403 residues in NMR structures (or 19.9% of all residues in NMR targets) and 2,274 residues (9.4%) in X-ray targets. Thus, percent-wise, CASP9 NMR structures contain approximately twice as many disordered residues as X-ray structures. At the target level, the fraction of disordered residues is approximately the same for both types of targets, varying from 0 to 55% in X-ray structures and from 2 to 53% in NMR structures. Two targets at the high end of this range were T0603 (X-ray, 305 residues) containing 6 separate unstructured regions summing up to 55% of its length, and T0590 (NMR, 137 residues) containing two long disordered segments covering 54% of its sequence. The statistics on the number and length of the IDRs in the CASP9 targets are shown in Figure 1. Short disordered regions are much more common than the long ones. To reduce noise due to experimental uncertainty, segments consisting of less than four consecutive residues of the same order/disorder type were not considered in the assessment. After eliminating short segments, the assessment was performed on a set of 26,075 residues, including 2,417 classified as disordered. We also assessed the ability of methods to identify longer disordered regions by setting the minimum length of a disordered region to 20, 30 and 40 residues.
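The short-segment filter described above can be sketched as follows. This is a minimal illustration rather than the assessment code: the function name `mask_short_segments`, the 'O'/'D' string encoding of the reference states, and the '-' symbol for residues excluded from scoring are assumptions, and the sketch applies the filter in a single pass without re-merging neighbouring segments.

```python
from itertools import groupby


def mask_short_segments(states, min_len=4, skip="-"):
    """Mask runs of identical states that are shorter than min_len.

    states: string of per-residue reference labels, 'O' or 'D'.
    Returns a string of the same length in which short runs are
    replaced by `skip`, marking residues left out of the assessment.
    """
    out = []
    for label, run in groupby(states):
        n = len(list(run))
        out.append((label if n >= min_len else skip) * n)
    return "".join(out)


reference = "DD" + "O" * 6 + "D" * 5 + "OOO" + "DDDD"
masked = mask_short_segments(reference)
print(masked)  # --OOOOOODDDDD---DDDD
```

Raising `min_len` does not by itself reproduce the 20, 30 and 40 residue analyses, because the text does not say how shorter disordered segments were treated in those runs.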

Figure 1
Length distribution of disordered regions in CASP9 target proteins.

Participating groups and prediction format

Thirty-two groups participated in prediction of disordered regions in CASP9, including 22 servers and 10 human-expert groups. These groups could submit up to five DR predictions (here called models) per target, but only models identified by the predictors as number “1” were evaluated. The overwhelming majority of groups submitted predictions on all or almost all of the targets (see Table I). The two exceptions were human-expert groups G147 and G462, which submitted predictions on 53 and 57, respectively, of the 112 targets assessed for human-expert groups (the 117 assessed targets minus the five evaluated for servers only). We assessed the performance of these two groups but did not include them in the final rankings.

Table I
Summary of assessment scores for disorder prediction groups in CASP9

The format of the predictions in the DR category has not changed since CASP5. The predictors were asked to identify the IDRs by assigning to each residue a binary classifier of order or disorder (“O” for the ordered state and “D” for the disordered), and a probability of belonging to a disordered region (a real number in the [0;1] range). The detailed description of the DR format can be found at the Prediction Center website http://predictioncenter.org/casp9/index.cgi?page=format#DR. Learning from the lessons of CASP8, in CASP9 we required that all the residues that were assigned a binary disordered/ordered tag were also assigned probability values above/below 0.5, respectively. The value of 0.5 was reserved for residues where predictors were undecided.
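The consistency requirement between the binary tag and the probability can be expressed as a small check. The sketch below is illustrative only: it does not parse the actual DR file format, and the function name `check_residue` and the three result labels are hypothetical. How a residue with probability exactly 0.5 should be tagged is not stated in the text, so such residues are simply reported as undecided.

```python
def check_residue(tag, prob):
    """Check one (tag, probability) pair against the CASP9 rule.

    tag:  'D' (disordered) or 'O' (ordered)
    prob: predicted probability of disorder, expected in [0, 1]
    Returns 'ok', 'undecided' or 'inconsistent' (hypothetical labels).
    """
    if not 0.0 <= prob <= 1.0:
        return "inconsistent"   # probability outside the allowed range
    if prob == 0.5:
        return "undecided"      # 0.5 is reserved for undecided residues
    if tag == "D":
        return "ok" if prob > 0.5 else "inconsistent"
    if tag == "O":
        return "ok" if prob < 0.5 else "inconsistent"
    return "inconsistent"       # unknown tag


records = [("D", 0.91), ("O", 0.12), ("D", 0.30), ("O", 0.50)]
report = [check_residue(tag, prob) for tag, prob in records]
print(report)  # ['ok', 'ok', 'inconsistent', 'undecided']
```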

Evaluation criteria

Disorder predictions in CASP9 were evaluated with the MCC, Acc, and AUC measures also used in previous CASPs29–31. The Sw measure was dropped from the assessment as it was shown35 to be equivalent to the Acc when calculated with the weights used in CASP6–8. The statistical significance of the differences in group performance was assessed using procedures adopted in CASP730, i.e. the bootstrap confidence interval method36,37 and the DeLong tests38.

Measures for evaluating binary order/disorder predictions

In CASP, the ability to correctly assign the order/disorder tags to residues in a target has been evaluated with several measures28–31: sensitivity and specificity (used in CASP5–8), statistical accuracy Q2 (CASP6), Matthews correlation coefficient MCC (CASP6), the weighted score Sw (CASP6–8), and the balanced accuracy Acc (CASP7–8).

The disorder prediction data are characterized by a large class imbalance: in the latest five CASPs ordered residues outnumbered disordered ones 9 to 1 or higher. As disordered residues are relatively rare and therefore harder to predict, their correct prediction should be rewarded more generously than the prediction of ordered residues, and vice versa – the incorrect prediction of disordered regions should be penalized less severely than the incorrect prediction of ordered residues. Not all measures are equally effective in handling these tasks. Below, we briefly discuss the relative strengths and weaknesses of the aforementioned evaluation measures for the disorder assessment.

Sensitivity and specificity

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} = \frac{TP}{N_d},$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} = \frac{TN}{N_o}$$

are the two statistical measures routinely used for evaluating the accuracy of a two-class binary predictor. In prediction of disorder, TP (true positives) and TN (true negatives) are the numbers of correctly predicted disordered and ordered residues, respectively; FP (false positives) and FN (false negatives) are the numbers of misclassified ordered and disordered residues, and Nd and No are the total numbers of disordered and ordered residues in all targets predicted by a particular group. Specificity determines the fraction of negative examples (ordered residues) correctly identified in a prediction. For datasets dominated by negative examples, specificity is high for practically all predictors and therefore is not a discriminative measure of prediction quality. Sensitivity represents the fraction of positive examples (disordered residues) correctly identified in a prediction and has a better discriminative power but at the same time is completely insensitive to negative examples (see the corresponding formula). Predictors can increase the sensitivity or specificity of their classifiers by deliberately predicting more residues as disordered or ordered, respectively. There is a tradeoff between these two measures and increasing one of them usually leads to decreasing the other. A prediction method can be considered to perform well only if it scores high in both sensitivity and specificity; neither of these two measures is a good estimator of methods’ strength when used alone. The one-sided nature of sensitivity and specificity can be overcome by employing measures that use all four parameters of prediction quality (TP, FP, TN and FN).
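As a concrete illustration of these two measures, the sketch below counts TP, TN, FP and FN from a reference and a predicted string and evaluates both ratios. The 'O'/'D' string encoding, the function names and the toy 20-residue example are assumptions made for the example, not part of the CASP data.

```python
def confusion_counts(reference, predicted):
    """Count TP, TN, FP, FN with 'D' (disordered) as the positive class."""
    tp = tn = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == "D":
            if pred == "D":
                tp += 1
            else:
                fn += 1
        elif ref == "O":
            if pred == "O":
                tn += 1
            else:
                fp += 1
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    return tp / (tp + fn)   # fraction of disordered residues recovered


def specificity(tn, fp):
    return tn / (tn + fp)   # fraction of ordered residues recovered


# Toy example: 4 disordered and 16 ordered residues.
reference = "D" * 4 + "O" * 16
predicted = "DD" + "O" * 16 + "DD"

tp, tn, fp, fn = confusion_counts(reference, predicted)
print(tp, tn, fp, fn)            # 2 14 2 2
print(sensitivity(tp, fn))       # 0.5
print(specificity(tn, fp))       # 0.875
```

Even in this small example the specificity stays high although half of the disordered residues are missed, which is the effect of class imbalance discussed above.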

The statistical accuracy (used under the name of Q2 in previous CASP disorder prediction assessments) is calculated according to

$$Q_2 = \frac{TP + TN}{TP + FN + TN + FP} \tag{1}$$

and accounts for all four components of prediction quality. Nevertheless, it strongly favors conservative classifications (i.e. predicting more residues as ordered)29,30 and therefore is not well suited for disorder assessment.

The balanced accuracy Acc

$$\mathrm{Acc} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2} = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right) \tag{2}$$

is a much better measure as it does not reward over-prediction of the ordered state. On the contrary, it has the desirable feature of rewarding prediction of the disordered state more generously than prediction of the ordered state, but it is also known to strongly favor greedy classifications (i.e. predicting more residues as disordered)29.
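The opposite biases of Q2 and Acc can be seen on a small invented example. The counts below are hypothetical (a data set with 100 disordered and 900 ordered residues, roughly the class ratio mentioned in the text) and do not correspond to any CASP9 group.

```python
def q2(tp, tn, fp, fn):
    """Statistical accuracy: fraction of all residues classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Hypothetical predictors on 100 disordered + 900 ordered residues.
all_ordered = dict(tp=0, tn=900, fp=0, fn=100)     # never predicts disorder
riskier = dict(tp=70, tn=780, fp=120, fn=30)       # finds most disorder

for name, counts in [("all ordered", all_ordered), ("riskier", riskier)]:
    print(f"{name}: Q2={q2(**counts):.2f}  Acc={balanced_accuracy(**counts):.2f}")
# all ordered: Q2=0.90  Acc=0.50
# riskier: Q2=0.85  Acc=0.78
```

Q2 ranks the predictor that never predicts disorder first, while Acc scores it at the level of a random classifier and prefers the riskier one.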

The Matthews correlation coefficient (MCC)

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{3}$$

does not favor over-prediction of any of the prediction classes and had been recommended for handling cases with skewed class frequencies39,40. MCC varies between −1 and 1 with a random prediction scoring zero. It was noticed, though, that MCC can yield unreasonably high scores in cases where prediction algorithms assign very few or no false positives and at the same time very few true positives41. As this situation can happen in DR prediction (over-prediction of ordered residues), we have conducted a numerical experiment to estimate the scale of possible discrepancies. We have run these calculations on artificial datasets with the TP, TN, FP and FN values varying in the ranges typical of the CASP9 data. The MCC appeared to yield reasonable and consistent scores for all combinations of prediction characteristics, leading to a conclusion that in general it does not overinflate scores for over-prediction of ordered residues in our data.
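A minimal sketch of the MCC calculation is given below. Returning zero when the denominator vanishes (for example, when no residue is predicted as disordered) is a common convention adopted here as an assumption; the text does not say how such cases were handled. The input counts are invented for illustration.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention for an empty row or column
    return (tp * tn - fp * fn) / denom


print(mcc(2, 14, 2, 2))       # 0.375
print(mcc(100, 900, 0, 0))    # 1.0  (perfect prediction)
print(mcc(0, 0, 900, 100))    # -1.0 (every residue misclassified)
print(mcc(0, 900, 0, 100))    # 0.0  (nothing predicted as disordered)
```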

The general conclusions on the effectiveness of measures (1)–(3) hold true for the CASP9 data. First, the tendency of Q2 to unreasonably favor conservative predictions can be illustrated by an example of two CASP9 groups: G291 and G067. Group G291 is ranked high according to all three measures used in our evaluation (Table I), while group G067 is at the very bottom of the table. Surprisingly, G067 outscores G291 0.91 to 0.87 according to Q2 (data not shown). This result can be directly attributed to more conservative predictions submitted by G067 (only 391 residues predicted as disordered; the remaining 98.5% of residues predicted as ordered, the highest figure in CASP9). Second, both Acc and MCC reproduce the overall trends in prediction quality fairly well but emphasize numerical contributions from TP, TN, FP and FN differently (on CASP data, the Spearman rank correlation coefficient between these two measures is only ρ=0.56). The balanced accuracy (Acc) generously rewards correct prediction of disordered regions and mildly penalizes their incorrect prediction, encouraging development of riskier methods tuned to identify large numbers of disordered residues. The MCC is more balanced; it does not reward “greedy” predictions as strongly as the Acc does, but instead rewards classifiers with higher predictive precision

$$\mathrm{precision} = \mathrm{PPV} = \frac{TP}{TP + FP}. \tag{4}$$

The difference between the Acc and MCC can be illustrated by the example of groups G119 and G015 (Table I). Group G119 is one of the most “greedy” CASP9 classifiers. It predicted 5115 residues as disordered, but only 1570 of these classifications were correct (31%). Group G015 predicted only 1019 residues as disordered, of which 839 were correct (82%). The Acc score favors G119 as able to identify almost twice as many disordered residues as G015. At the same time the MCC favors G015 for obtaining a much higher level of precision, while still predicting a relatively high number of disordered residues. The decision of which of these two measures should be used in assessments is to some degree subjective and therefore we present the results of both, noting the better balance of the MCC.
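The percentages quoted for the two groups follow directly from the counts given in the text, as the short calculation below shows. Only numbers stated above are used; the helper name `precision` is introduced for the example.

```python
def precision(true_positives, predicted_positives):
    """Fraction of residues predicted as disordered that are truly disordered."""
    return true_positives / predicted_positives


g119 = precision(1570, 5115)   # greedy classifier
g015 = precision(839, 1019)    # conservative classifier
tp_ratio = 1570 / 839          # correctly identified disordered residues, G119 vs G015

print(f"G119 precision: {g119:.2f}")   # 0.31
print(f"G015 precision: {g015:.2f}")   # 0.82
print(f"TP ratio:       {tp_ratio:.2f}")  # 1.87
```

The ratio of about 1.9 is the "almost twice as many" disordered residues that the Acc rewards, while the gap in precision is what the MCC rewards.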

In the previous three CASPs, the Sw score was also used extensively in assessments:

$$S_w = \frac{w_d\,TP - w_o\,FP + w_o\,TN - w_d\,FN}{w_d\,(TP + FN) + w_o\,(TN + FP)}.$$

This measure was specifically designed29 to address the imbalance in the ratio of ordered and disordered residues through adjustable weights wo and wd. It was recently shown35 that for the weights used in CASP6–8

$$w_o = \frac{N_d}{N_o + N_d}, \qquad w_d = \frac{N_o}{N_o + N_d}$$

this score is equivalent to the Acc as there is a linear relationship between the two:

$$S_w = 2\,\mathrm{Acc} - 1.$$

Therefore we kept only one of these measures (Acc) in our analysis.
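The linear relationship between the two scores can be checked directly. The short derivation below is a sketch added for illustration and is not reproduced from reference 35; it uses only the definitions of Sw and its weights together with the identities TP + FN = N_d and TN + FP = N_o.

```latex
\begin{aligned}
\text{numerator:}\quad
  w_d\,(TP - FN) + w_o\,(TN - FP)
    &= \frac{N_o\,(TP - FN) + N_d\,(TN - FP)}{N_o + N_d},\\[4pt]
\text{denominator:}\quad
  w_d\,(TP + FN) + w_o\,(TN + FP)
    &= \frac{N_o N_d + N_d N_o}{N_o + N_d} = \frac{2\,N_o N_d}{N_o + N_d},\\[4pt]
S_w &= \frac{TP - FN}{2\,N_d} + \frac{TN - FP}{2\,N_o}
     = \frac{2\,TP - N_d}{2\,N_d} + \frac{2\,TN - N_o}{2\,N_o}\\[4pt]
    &= \frac{TP}{N_d} + \frac{TN}{N_o} - 1
     = \mathrm{Sensitivity} + \mathrm{Specificity} - 1
     = 2\,\mathrm{Acc} - 1 .
\end{aligned}
```

Because the mapping is linear with a positive slope, the two scores always rank groups identically.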

In addition to the scores used in previous CASPs we have tested other evaluation measures. One such measure is the F-score, which, similarly to the MCC, had been recommended to handle skewed data42,43. Our calculations on CASP data have shown a high correlation of this measure with the MCC (Spearman’s ρ=0.9) and therefore these results are not shown.

Measures for evaluating probability-based predictions of disorder

The ability to identify the IDRs through assigning per residue disorder confidence scores [0;1] was assessed with the receiver operating characteristic (ROC) analysis. This method is frequently used to assess the accuracy of a classifier, and has been previously used in the assessment of protein disorder predictions (both in CASP and elsewhere)29–31,44.

In essence, a ROC curve illustrates the correspondence between the true positive rate of a predictor (Sensitivity) and its false positive rate (FPR = FP/(TN + FP) = 1 − Specificity) for a set of probability thresholds (from 0 to 1 in our case). For each threshold, a residue is considered as a positive example (disordered) if its predicted probability is equal to or greater than the threshold value. The area under a ROC curve (AUC) is indicative of the classifier accuracy. An AUC of 1 identifies a perfect predictor, while an AUC of 0.5 corresponds to a random classifier. We have computed the AUC scores using the trapezoid integration rule with a threshold increment of 0.01.
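The thresholding and trapezoid integration described above can be sketched in a few lines. This is an illustration under stated assumptions, not the assessment code: labels are encoded as 1 (disordered) and 0 (ordered), the function name `roc_auc` is hypothetical, and a closing point (0, 0) is added for a threshold above 1 so that the curve always spans the full FPR range.

```python
def roc_auc(labels, probs, n_steps=100):
    """Area under the ROC curve by the trapezoid rule.

    labels:  1 for disordered, 0 for ordered residues (assumed encoding)
    probs:   predicted disorder probabilities in [0, 1]
    n_steps: number of threshold increments (100 gives a step of 0.01)
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above 1: nothing predicted as disordered
    for i in range(n_steps + 1):
        t = i / n_steps
        # A residue counts as predicted disordered if its probability >= t.
        tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= t)
        fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= t)
        points.append((fp / n_neg, tp / n_pos))  # (FPR, Sensitivity)
    points.sort()
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(labels, probs))            # 0.75
print(roc_auc([1, 0], [0.8, 0.2]))       # 1.0
```

With a fixed threshold grid, predictions that use only a few distinct probability values produce a curve with few points, which is the smoothness issue the assessment has to watch for.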

Statistical significance of differences in group performance

Performance of groups as binary order/disorder classifiers was statistically compared using the re-sampling procedure. For each group, 80% of targets were randomly drawn from the list of targets predicted by that group and the evaluation scores were re-calculated on that subset. The procedure was repeated 1000 times, and a discrete distribution of the two-class classifications was learned for every group. Based on these distributions, we have calculated the 95% confidence intervals for each assessment measure using the two-tailed bootstrap percentile method36,37. Statistical significance of the differences in group performance was inferred based on the comparison of the confidence intervals obtained for each group45.
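The re-sampling procedure can be sketched as follows. Several details are assumptions made for this example because the text does not specify them: the 80% subsets are drawn without replacement, the score is computed on counts pooled over the drawn targets, and the interval limits are taken as the 2.5th and 97.5th percentiles of the sorted scores. The per-target counts are invented.

```python
import random


def balanced_accuracy(tp, tn, fp, fn):
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def resampled_interval(per_target, score, fraction=0.8, rounds=1000, seed=0):
    """95% percentile interval of a score over random subsets of targets.

    per_target: list of (TP, TN, FP, FN) tuples, one per predicted target
    score:      function of the pooled counts (tp, tn, fp, fn)
    """
    rng = random.Random(seed)
    k = max(1, round(fraction * len(per_target)))
    scores = []
    for _ in range(rounds):
        subset = rng.sample(per_target, k)          # assumed: no replacement
        totals = [sum(col) for col in zip(*subset)]  # pool counts over targets
        scores.append(score(*totals))
    scores.sort()
    low = scores[rounds * 25 // 1000]                # 2.5th percentile
    high = scores[rounds * 975 // 1000 - 1]          # 97.5th percentile
    return low, high


# Invented per-target confusion counts for one hypothetical group.
per_target = [
    (8, 80, 5, 2), (3, 90, 10, 7), (12, 70, 8, 3), (5, 95, 4, 5), (9, 60, 6, 1),
    (2, 85, 9, 8), (7, 75, 3, 3), (6, 88, 7, 4), (10, 65, 5, 2), (4, 92, 6, 6),
]
low, high = resampled_interval(per_target, balanced_accuracy)
print(f"Acc 95% interval: [{low:.3f}, {high:.3f}]")
```

Two groups would then be compared by checking whether their intervals overlap, as described in the text.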

Performance of groups as predictors of the per-residue disorder probabilities was compared using the DeLong non-parametric tests38, designed to assess the statistical significance of the differences between the AUC scores in the ROC analysis. The evaluation was performed using the statistical package R46 and the pROC library47.

RESULTS

Performance of disorder prediction methods

Numerical evaluations of DR predictions for all groups participating in CASP9 are summarized in Table I and illustrated in Figure 2. Scores from the main three evaluation measures used in our assessment (Acc, MCC and AUC) are provided together with the ranges of the corresponding 95% confidence intervals and the group ranks. The ROC curves based on the continuous-scale disorder predictions are plotted in Figure 3. Note that the uneven distribution of the assigned probability scores can affect the smoothness of ROC curves, which is imperative for an accurate calculation of the AUC scores. In CASP9, all top-ranked groups have assigned sufficiently distinct probabilities to enable an accurate calculation of the AUC scores. The only exception is group G193, which submitted predictions yielding good scores according to the binary-classification measures Acc and MCC but poor AUC scores in the probability-based analysis. This was due to uniformly assigning a value of zero to all residues predicted as ordered.

Figure 2
Performance of DR groups according to three evaluation scores: AUC (black bars), Acc (grey bars) and MCC (light grey bars). The groups are sorted according to decreasing AUC score. The error bars on the plot indicate boundaries of the 95% confidence intervals ...
Figure 3
ROC curves of disordered region predictions for all CASP9 groups. Legends are shown for the best 12 groups according to the AUC. There are four non-regular ROCs corresponding to poorly performing groups, two of which misinterpreted DR format (G193 used ...

Table I shows that prdos2 is the only group to rank among the top three prediction groups according to all three evaluation measures. In addition to this group, three other groups (Zhou-Spine-D, Multicom-refine and biomine_dr_pdb) rank among the best 10 groups according to all three measures.

Figure 2 shows that there are several groups that perform equally well according to the Acc measure (grey bars). However, as we have discussed in Materials and Methods, some high Acc scores may be an artifact due to over-prediction of disordered residues. As the scores for top groups are very close, the statistical significance of the differences between them could not be established by the comparison of the confidence intervals.

Group DisoPred3C has obtained a relatively low Acc score but, at the same time, the best and the second best MCC and AUC scores, respectively. The high MCC score can be attributed to the high (highest in CASP9) precision (4) of classifications submitted by this group. The low Acc score is most likely due to the relatively low levels of disorder prediction. Based on the comparison of confidence intervals for the MCC scores, results of DisoPred3C are statistically better than those of all other groups, except for biomine_dr_pdb_c, which is second best according to the MCC.

Groups prdos2, DisoPred3C and Multicom are the best performing groups according to the probability-based assessment. These groups are statistically indistinguishable from each other by the AUC score and better than all other groups according to the results of the DeLong tests (see Table II). This conclusion is also confirmed by the comparison of the AUC confidence intervals (Table I).

Table II
Statistical comparison of the best 12 groups according to the AUC score

Evaluation of results for longer disorder regions

As noted earlier, short disordered regions prevail in the CASP data set (see Figure 1). While such regions may sometimes consist of chain termini or short loops without any obvious functional role, they are often of functional importance (for example, flaps over enzyme active sites, pieces of chain that order into DNA grooves, or loops that become ordered in protein-protein interfaces), so their inclusion in the methods testing is important. At the same time, long disorder regions require separate attention as they are found in abundance in the human disease-associated proteins48 and their properties and functional roles are likely different from those of short disorder regions (for example, ordering of complete domains upon complex formation). The issue of the different length of disordered regions has been taken into consideration in several disorder prediction methods49–53. To address this issue in the assessment, we additionally evaluated the predictions taking into account only segments longer than a specified length cutoff.

Figure 4 compares results of CASP9 methods for four minimum length thresholds: 4, 20, 30 and 40 residues. The “average group” splines (‘AVG’, thicker line) in all three panels of the graph show that the discriminatory power of the methods tends to decrease with the increase of the minimum disorder segment length. The average drop in performance is moderate according to the Acc and AUC scores and more pronounced according to the MCC score. The MCC panel suggests that an average CASP9 method can identify 40+ residue long disorder segments just slightly better than a random predictor (MCC=0). It should be mentioned, though, that the results for disorder regions spanning 40 residues or more should be interpreted with caution as there were only four qualified segments, constituting only 0.8% of all residues in CASP9 targets.

Figure 4
Comparison of prediction performance across four different minimum disorder segment length thresholds. Different panels show scores for different evaluation measures (Acc, MCC and AUC). Each group is marked with a different color; groups in the legend ...

Curves for the vast majority of participating groups follow the average trend to decrease, resulting in high correlation (0.85 – 0.98) between the scores of the same evaluation measure at neighboring length thresholds. The lowest (even though still high in absolute value: 0.85) correlation between the 4+ and 20+ Acc score sets reflects the fact that this score was the most prone to the shifting of ranks. While the majority of groups performed worse in identifying 20+ residue long disorder segments (compared to 4+ segments), there were five groups that performed somewhat better, with two of them - DisoPred3C (G015) and GSMetaDisorder3D (G421) - improving their Acc scores significantly (by more than 6%) and consequently raising their ranks by 13 positions (to #8 and #7). DisoPred3C is the only group that demonstrated an ability to better discriminate 20+ residue long disorder regions according to all three evaluation scores, and is the best group in this length range according to the MCC and AUC scores. This group also has quite high scores for the 30+ residue-long regions, comparable to those they obtained for the 4+ ones. GSMetaDisorder3D also proved to be successful in identifying longer disorder segments, consistently placing in the best three according to the MCC and AUC scores at all longer disorder length levels.

Comparison between recent CASPs

To compare the accuracy of disorder predictions across all CASPs, we have re-evaluated predictions in previous CASPs using exactly the same disorder definitions and evaluation measures as in CASP9. Even so, it is hard to ensure full objectivity of such a comparison as the targets, methods and databases change in time.

Figure 5 shows the results of a comparison of the MCC scores for the twelve best performing groups in the latest three CASPs. The CASP9 scores are higher than those in CASP7 but lower than in CASP8. This tendency holds when these methods are compared using other scores (see Figure S1 in Supplementary Material). As the majority of the top performing CASP8 methods were among the best also in CASP9, the drop in performance is most likely due to a greater difficulty of the CASP9 targets54.

Figure 5
Comparison of the performance of the best 12 groups in the latest 3 CASPs. Groups in each CASP are sorted according to the MCC score. CASP8 results are evaluated for both the full set of targets and the set without target T0500, a long, completely unfolded ...

CONCLUSIONS

The number of disorder prediction methods published in the literature is now well over fifty24 and continues to grow as new methods continue to appear55–58. This growth correlates well with the increase in the number of disorder prediction groups participating in CASP experiments. However, the increased number of participating groups does not seem to result in a better performance. Rather, our analysis shows that the scores obtained in CASP9 have slightly decreased in comparison with those in CASP8 according to all three measures used in the CASP9 evaluation. As we discussed in this paper, this might be related to a higher difficulty of the CASP9 targets but perhaps also to the lack of conceptually new methods. New meta-predictors or slight modifications of established methods were not sufficient to achieve substantial progress in the field. By analyzing the submitted abstracts we could identify only one group (Zhou-Spine-D) claiming development of a conceptually new method based on neural networks. This method was assessed to be among the top 10 according to all scores in CASP9, but did not outperform other already established CASP performers. A brief description of the best performing automatic methods that participated in CASP9 is provided in Table III. It seems that performance of disorder prediction methods in CASP has reached a plateau and new breakthroughs are needed.

Table III
Methods description for the best CASP9 DR servers

Besides more effective disorder prediction methods, we also need better target sets in CASP, since the vast majority of targets are solved by X-ray crystallography and therefore typically contain only short disorder regions. This type of data likely does not fully represent the type of disorder observed in functionally relevant long disordered segments. Thus, test sets containing more targets with extended disordered regions are required for more comprehensive testing of disorder prediction methods.

For the first time, we have analyzed differences in the capability of methods to recognize disorder regions of different length. The surprising result is that, independent of the exact evaluation metrics, there is a rather dramatic fall-off in performance with disorder length increase. Perhaps this reflects a tendency for the methods used for CASP to be trained on the short disorder segments typical of the targets. Nevertheless, it is a disturbing result.

Supplementary Material

Supp Fig S1

ACKNOWLEDGEMENTS

This work was partially supported by the US National Library of Medicine (NIH/NLM) – grant LM007085 to KF and by Award No. KUK-I1-012-43 made by King Abdullah University of Science and Technology (KAUST) to AT.

Abbreviations

3D: three-dimensional
DR: disordered residues
IDP (IDR): Intrinsically Disordered Protein (Region)
MCC: the Matthews Correlation Coefficient
ROC: the Receiver Operating Characteristic
AUC: Area Under the ROC Curve

Footnotes

*Target T0549, which was excluded from the tertiary structure assessment as lacking big parts of structure, was retained for the disorder assessment.

This definition differs slightly from the previous CASP definitions, in part due to the 3.5 Å deviation criterion for X-ray structures, which classifies an additional 246 residues (or 1.0% of all residues in CASP9 X-ray structures) as disordered.

REFERENCES

1. Wu H. Studies on denaturation of proteins. XIII. A theory of denaturation. Chin J Physiol. 1931;1:219–234.
2. Anfinsen CB. Principles that govern the folding of protein chains. Science. 1973;181(96):223–230.
3. Wright PE, Dyson HJ. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J Mol Biol. 1999;293(2):321–331.
4. Dunker AK, Obradovic Z, Romero P, Garner EC, Brown CJ. Intrinsic protein disorder in complete genomes. Genome Inform Ser Workshop Genome Inform. 2000;11:161–171.
5. Tompa P. Intrinsically unstructured proteins. Trends Biochem Sci. 2002;27(10):527–533.
6. Dunker AK, Silman I, Uversky VN, Sussman JL. Function and structure of inherently disordered proteins. Curr Opin Struct Biol. 2008;18(6):756–764.
7. Mark WY, Liao JC, Lu Y, Ayed A, Laister R, Szymczyna B, Chakrabartty A, Arrowsmith CH. Characterization of segments from the central region of BRCA1: an intrinsically disordered scaffold for multiple protein-protein and protein-DNA interactions? J Mol Biol. 2005;345(2):275–287.
8. Fuxreiter M, Tompa P, Simon I, Uversky VN, Hansen JC, Asturias FJ. Malleable machines take shape in eukaryotic transcriptional regulation. Nat Chem Biol. 2008;4(12):728–737.
9. Xue B, Williams RW, Oldfield CJ, Dunker AK, Uversky VN. Archaic chaos: intrinsically disordered proteins in Archaea. BMC Syst Biol. 2010;4 Suppl 1:S1.
10. Bogatyreva NS, Finkelstein AV, Galzitskaya OV. Trend of amino acid composition of proteins of different taxa. J Bioinform Comput Biol. 2006;4(2):597–608.
11. Dyson HJ, Wright PE. Intrinsically unstructured proteins and their functions. Nat Rev Mol Cell Biol. 2005;6(3):197–208.
12. Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, Uversky VN, Dunker AK. Intrinsic disorder and functional proteomics. Biophys J. 2007;92(5):1439–1456.
13. Xie H, Vucetic S, Iakoucheva LM, Oldfield CJ, Dunker AK, Uversky VN, Obradovic Z. Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions. J Proteome Res. 2007;6(5):1882–1898.
14. Wright PE, Dyson HJ. Linking folding and binding. Curr Opin Struct Biol. 2009;19(1):31–38.
15. Dunker AK, Oldfield CJ, Meng J, Romero P, Yang JY, Chen JW, Vacic V, Obradovic Z, Uversky VN. The unfoldomics decade: an update on intrinsically disordered proteins. BMC Genomics. 2008;9 Suppl 2:S1.
16. Sickmeier M, Hamilton JA, LeGall T, Vacic V, Cortese MS, Tantos A, Szabo B, Tompa P, Chen J, Uversky VN, Obradovic Z, Dunker AK. DisProt: the Database of Disordered Proteins. Nucleic Acids Res. 2007;35(Database issue):D786–D793.
17. Xie H, Vucetic S, Iakoucheva LM, Oldfield CJ, Dunker AK, Obradovic Z, Uversky VN. Functional anthology of intrinsic disorder. 3. Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins. J Proteome Res. 2007;6(5):1917–1932.
18. Uversky VN, Oldfield CJ, Midic U, Xie H, Xue B, Vucetic S, Iakoucheva LM, Obradovic Z, Dunker AK. Unfoldomics of human diseases: linking protein intrinsic disorder with diseases. BMC Genomics. 2009;10 Suppl 1:S7.
19. Raychaudhuri S, Dey S, Bhattacharyya NP, Mukhopadhyay D. The role of intrinsically unstructured proteins in neurodegenerative diseases. PLoS ONE. 2009;4(5):e5566.
20. Bracken C, Iakoucheva LM, Romero PR, Dunker AK. Combining prediction, computation and experiment for the characterization of protein disorder. Curr Opin Struct Biol. 2004;14(5):570–576.
21. Linding R, Russell RB, Neduva V, Gibson TJ. GlobPlot: Exploring protein sequences for globularity and disorder. Nucleic Acids Res. 2003;31(13):3701–3708.
22. Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK. Identifying disorder regions in proteins from amino acid sequence. Proc IEEE Int Conf Neural Networks. 1997;1:90–95.
23. Ferron F, Longhi S, Canard B, Karlin D. A practical overview of protein disorder prediction methods. Proteins. 2006;65(1):1–14.
24. He B, Wang K, Liu Y, Xue B, Uversky VN, Dunker AK. Predicting intrinsic disorder in proteins: an overview. Cell Res. 2009;19(8):929–949.
25. Dosztanyi Z, Meszaros B, Simon I. Bioinformatical approaches to characterize intrinsically disordered/unstructured proteins. Brief Bioinform. 2010;11(2):225–243.
26. Moult J, Fidelis K, Zemla A, Hubbard T. Critical assessment of methods of protein structure prediction (CASP)-round V. Proteins. 2003;53 Suppl 6:334–339.
27. Kinch L, Shi S, Cheng H, Cong Q, Pei J, Schwede T, Grishin N. CASP9 target classification. Proteins. 2011 (Current)
28. Melamud E, Moult J. Evaluation of disorder predictions in CASP5. Proteins. 2003;53 Suppl 6:561–565.
29. Jin Y, Dunbrack RL., Jr Assessment of disorder predictions in CASP6. Proteins. 2005;61 Suppl 7:167–175.
30. Bordoli L, Kiefer F, Schwede T. Assessment of disorder predictions in CASP7. Proteins. 2007;69 Suppl 8:129–136.
31. Noivirt-Brik O, Prilusky J, Sussman JL. Assessment of disorder predictions in CASP8. Proteins. 2009;77 Suppl 9:210–216.
32. Zemla A. LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res. 2003;31(13):3370–3374.
33. Zhang Y, Stec B, Godzik A. Between order and disorder in protein structures: analysis of "dual personality" fragments in proteins. Structure. 2007;15(9):1141–1147.
34. Cheng Y, Oldfield CJ, Meng J, Romero P, Uversky VN, Dunker AK. Mining alpha-helix-forming molecular recognition features with cross species sequence alignments. Biochemistry. 2007;46(47):13468–13477.
35. Lobanov MY, Furletova EI, Bogatyreva NS, Roytberg MA, Galzitskaya OV. Library of disordered patterns in 3D protein structures. PLoS Comput Biol. 2010;6(10):e1000958.
36. Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat Med. 2000;19(9):1141–1164.
37. Wilcox RR. Fundamentals of modern statistical methods: substantially improving power and accuracy. New York, NY: Springer; 2010. p. 249. xvi.
38. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–845.
39. Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta. 1975;405(2):442–451.
40. Carugo O. Detailed estimation of bioinformatics prediction reliability through the Fragmented Prediction Performance Plots. BMC Bioinformatics. 2007;8:380.
41. Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H. Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics. 2000;16(5):412–424.
42. van Rijsbergen CJ. Foundation of evaluation. J of Documentation. 1974;30(4):365–373.
43. Sokolova M, Japkowicz N, Szpakowicz S. Beyond Accuracy, F-score and ROC: a Family of Discriminant Measures for Performance Evaluation. Lecture Notes in Comp Sci. 2006;4304:1015–1021.
44. Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol. 2004;337(3):635–645.
45. Payton ME, Greenstone MH, Schenker N. Overlapping confidence intervals or standard error intervals: what do they mean in terms of statistical significance? J Insect Sci. 2003;3:34.
46. The R development Core Team. Vienna: 2006. R: a language and environment for statistical computing.
48. Cheng Y, LeGall T, Oldfield CJ, Mueller JP, Van YY, Romero P, Cortese MS, Uversky VN, Dunker AK. Rational drug design via intrinsically disordered protein. Trends Biotechnol. 2006;24(10):435–442.
49. Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z. Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics. 2006;7:208.
50. Radivojac P, Obradovic Z, Smith DK, Zhu G, Vucetic S, Brown CJ, Lawson JD, Dunker AK. Protein flexibility and intrinsic disorder. Protein Sci. 2004;13(1):71–80.
51. Hirose S, Shimizu K, Noguchi T. POODLE-I: Disordered region prediction by integrating POODLE series and structural information predictors based on a workflow approach. In Silico Biology. 2010;10(0015)
52. Han P, Zhang X, Norton RS, Feng ZP. Large-scale prediction of long disordered regions in proteins using random forests. BMC Bioinformatics. 2009;10:8.
53. Vullo A, Bortolami O, Pollastri G, Tosatto SC. Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res. 2006;34(Web Server issue):W164–W168.
54. Kryshtafovych A, Fidelis K, Moult J. CASP9 results compared to those of previous CASP experiments. Proteins. 2011 (Current)
55. Deng X, Eickholt J, Cheng J. PreDisorder: ab initio sequence-based prediction of protein disordered regions. BMC Bioinformatics. 2009;10:436.
56. Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B. Improved disorder prediction by combination of orthogonal approaches. PLoS ONE. 2009;4(2):e4433.
57. Mizianty MJ, Stach W, Chen K, Kedarisetti KD, Disfani FM, Kurgan L. Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources. Bioinformatics. 2010;26(18):i489–i496.
58. Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN. PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim Biophys Acta. 2010;1804(4):996–1010.
59. Ishida T, Kinoshita K. PrDOS: prediction of disordered protein regions from amino acid sequence. Nucleic Acids Res. 2007;35(Web Server issue):W460–W464.
60. Meiler J, Muller M, Zeidler A, Schmaschke F. Generation and evaluation of dimension-reduced amino acid parameter representations by artificial neural networks. J Mol Model. 2001;7(9):360–369.
61. Faraggi E, Yang Y, Zhang S, Zhou Y. Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction. Structure. 2009;17(11):1515–1527.
62. Zhang T, Faraggi E, Zhou Y. Fluctuations of backbone torsion angles obtained from NMR-determined structures and their prediction. Proteins. 2010;78(16):3353–3362.
63. Cheng JL, Sweredoski MJ, Baldi P. Accurate prediction of protein disordered regions by mining protein structure data. Data Min Knowl Disc. 2005;11(3):213–222.
64. Dosztanyi Z, Csizmok V, Tompa P, Simon I. IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics. 2005;21(16):3433–3434.
65. Shimizu K, Hirose S, Noguchi T. POODLE-S: web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix. Bioinformatics. 2007;23(17):2337–2338.
66. Hirose S, Shimizu K, Kanai S, Kuroda Y, Noguchi T. POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions. Bioinformatics. 2007;23(16):2046–2053.
67. Shimizu K, Muraoka Y, Hirose S, Tomii K, Noguchi T. Predicting mostly disordered proteins by using structure-unknown protein data. BMC Bioinformatics. 2007;8:78.
68. McGuffin LJ. Intrinsic disorder prediction from the analysis of multiple protein fold recognition models. Bioinformatics. 2008;24(16):1798–1804.
69. Wang L, Sauer UH. OnD-CRF: predicting order and disorder in proteins using [corrected] conditional random fields. Bioinformatics. 2008;24(11):1401–1402.
70. Rangwala H, Kauffman C, Karypis G. svmPRAT: SVM-based protein residue annotation toolkit. BMC Bioinformatics. 2009;10:439.