Performance in identifying misannotations. a) ROC curves for different types of artificially generated misannotations in the yeast network. The True Negative set 1 (TN1) was generated by randomly assigning incorrect metabolic functions to a fraction of network genes. The TN2 set was generated by reassigning network genes to new metabolic activities only if they had at least 30% sequence identity to the newly assigned (incorrect) activities. The TN3 set was generated by assigning genes to new activities only if they had similar (within 10%) or higher sequence identity to the reassigned (incorrect) activities. In all cases the remaining (not reassigned) activities were used as true positive examples. For the realistic misannotation models, simulated by the sets TN2 and TN3, the method correctly identifies about 70%–90% of misannotations at a 5%–15% false positive rate. The red dot in the figure corresponds approximately to 70% true positives and 5% false positives. b) Cumulative distributions of the classification confidence scores for B. subtilis metabolic assignments. B. subtilis annotations made by all analyzed databases (KEGG, MetaCyc and Swiss-Prot) are shown in red; annotations unique to KEGG, MetaCyc, or Swiss-Prot are shown in black. For comparison, the true negative set TN3 from S. cerevisiae is shown in blue. The cumulative distributions demonstrate that the consensus annotations (red) are, on average, more accurate than the ones unique to individual databases (black, Kolmogorov-Smirnov test P = 2 × 10⁻¹⁹). However, on average, database-specific annotations still score significantly better than true misannotations (blue, KS P = 2 × 10⁻⁴).
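
The two kinds of quantities quoted in this caption (a detection rate at a fixed false positive rate, and a Kolmogorov-Smirnov comparison of score distributions) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy scores, and the convention that a lower confidence score marks a likely misannotation are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def detection_rate_at_fpr(correct_scores, misannotated_scores, max_fpr):
    """Largest fraction of misannotations flagged while flagging at most
    `max_fpr` of the correct annotations.

    Assumption: an annotation is flagged when its confidence score is
    strictly below the chosen threshold.
    """
    correct = sorted(correct_scores)
    mis = sorted(misannotated_scores)
    # Candidate thresholds: every observed score, plus one above the maximum.
    candidates = sorted(set(correct + mis))
    candidates.append(candidates[-1] + 1.0)
    best = 0.0
    for threshold in candidates:
        # bisect_left counts the scores strictly below the threshold.
        fpr = bisect_left(correct, threshold) / len(correct)
        if fpr <= max_fpr:
            best = max(best, bisect_left(mis, threshold) / len(mis))
    return best


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a + b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical confidence scores, invented for this example only.
correct = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.88, 0.92, 0.3]
misannotated = [0.1, 0.2, 0.25, 0.4, 0.65]

rate = detection_rate_at_fpr(correct, misannotated, max_fpr=0.1)
gap = ks_statistic(correct, misannotated)
print(f"detection rate at 10% FPR: {rate:.2f}")
print(f"KS statistic: {gap:.2f}")
```

The sketch computes only the KS statistic, not the P value. In practice a library routine such as `scipy.stats.ks_2samp` returns both.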