Genome Biol. 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Zhou N1,2, Jiang Y3, Bergquist TR4, Lee AJ5, Kacsoh BZ6,7, Crocker AW8, Lewis KA8, Georghiou G9, Nguyen HN1,10, Hamid MN1,2, Davis L2, Dogan T11,12, Atalay V13, Rifaioglu AS13,14, Dalkıran A13, Cetin Atalay R15, Zhang C16, Hurto RL17, Freddolino PL16,17, Zhang Y16,17, Bhat P18, Supek F19,20, Fernández JM21,22, Gemovic B23, Perovic VR23, Davidović RS23, Sumonja N23, Veljkovic N23, Asgari E24,25, Mofrad MRK26, Profiti G27,28, Savojardo C27, Martelli PL27, Casadio R27, Boecker F29, Schoof H30, Kahanda I31, Thurlby N32, McHardy AC33,34, Renaux A35,36,37, Saidi R12, Gough J38, Freitas AA39, Antczak M40, Fabris F39, Wass MN40, Hou J41,42, Cheng J42, Wang Z43, Romero AE44, Paccanaro A44, Yang H45,46, Goldberg T47, Zhao C48,49,50, Holm L51, Törönen P51, Medlar AJ51, Zosa E52, Borukhov I53, Novikov I54, Wilkins A55, Lichtarge O55, Chi PH56, Tseng WC57, Linial M58, Rose PW59, Dessimoz C60,61,62, Vidulin V63, Dzeroski S64,65, Sillitoe I66, Das S67, Lees JG67,68, Jones DT69,70, Wan C71,69, Cozzetto D71,69, Fa R71,69, Torres M44, Warwick Vesztrocy A70,72, Rodriguez JM73, Tress ML74, Frasca M75, Notaro M75, Grossi G75, Petrini A75, Re M75, Valentini G75, Mesiti M75,76, Roche DB77, Reeb J77, Ritchie DW78, Aridhi S78, Alborzi SZ78,79, Devignes MD78,80,79, Koo DCE81, Bonneau R82,83, Gligorijević V84, Barot M85, Fang H86, Toppo S87, Lavezzo E87, Falda M88, Berselli M87, Tosatto SCE89,90, Carraro M90, Piovesan D90, Ur Rehman H91, Mao Q92,93, Zhang S92, Vucetic S92, Black GS94,95, Jo D94,95, Suh E94, Dayton JB94,95, Larsen DJ94,95, Omdahl AR94,95, McGuffin LJ96, Brackenridge DA96, Babbitt PC97,98, Yunes JM99,98, Fontana P100, Zhang F101,102, Zhu S103,104,105, You R103,104,105, Zhang Z103,105, Dai S103,105, Yao S103,104, Tian W106,107, Cao R108, Chandler C108, Amezola M108, Johnson D108, Chang JM109, Liao WH109, Liu YW109, Pascarelli S110, Frank Y111, Hoehndorf R112, Kulmanov M112, Boudellioua I113,114, Politano G115, Di Carlo S115, Benso A115, Hakala K116,117, Ginter F116,118, 
Mehryary F116,117, Kaewphan S116,117,119, Björne J120,121, Moen H118, Tolvanen MEE122, Salakoski T120,121, Kihara D123,124, Jain A125, Šmuc T126, Altenhoff A127,128, Ben-Hur A129, Rost B47,130, Brenner SE131, Orengo CA67, Jeffery CJ132, Bosco G133, Hogan DA6,8, Martin MJ9, O'Donovan C9, Mooney SD4, Greene CS134,135, Radivojac P136, Friedberg I137.

Author information

1
Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA.
2
Program in Bioinformatics and Computational Biology, Ames, IA, USA.
3
Indiana University Bloomington, Bloomington, Indiana, USA.
4
Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, USA.
5
Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, USA.
6
Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
7
Department of Molecular and Systems Biology, Hanover, NH, USA.
8
Department of Microbiology and Immunology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
9
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom.
10
Program in Computer Science, Ames, IA, USA.
11
Department of Computer Engineering, Hacettepe University, Ankara, Turkey.
12
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK.
13
Department of Computer Engineering, Middle East Technical University (METU), Ankara, Turkey.
14
Department of Computer Engineering, Iskenderun Technical University, Hatay, Turkey.
15
CanSyL, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey.
16
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
17
Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA.
18
Achira Labs, Bangalore, India.
19
Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain.
20
Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain.
21
INB Coordination Unit, Life Sciences Department, Barcelona Supercomputing Center, Barcelona, Catalonia, Spain.
22
(former) INB GN2, Structural and Computational Biology Programme, Spanish National Cancer Research Centre, Barcelona, Catalonia, Spain.
23
Laboratory for Bioinformatics and Computational Chemistry, Institute of Nuclear Sciences VINCA, University of Belgrade, Belgrade, Serbia.
24
Molecular Cell Biomechanics Laboratory, Departments of Bioengineering, University of California Berkeley, Berkeley, CA, USA.
25
Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Berkeley, CA, USA.
26
Departments of Bioengineering and Mechanical Engineering, Berkeley, CA, USA.
27
Bologna Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
28
National Research Council, IBIOM, Bologna, Italy.
29
University of Bonn: INRES Crop Bioinformatics, Bonn, North Rhine-Westphalia, Germany.
30
INRES Crop Bioinformatics, University of Bonn, Bonn, Germany.
31
Gianforte School of Computing, Montana State University, Bozeman, Montana, USA.
32
University of Bristol, Computer Science, Bristol, Bristol, United Kingdom.
33
Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Brunswick, Germany.
34
RESIST, DFG Cluster of Excellence 2155, Brunswick, Germany.
35
Interuniversity Institute of Bioinformatics in Brussels, Université libre de Bruxelles - Vrije Universiteit Brussel, Brussels, Belgium.
36
Machine Learning Group, Université libre de Bruxelles, Brussels, Belgium.
37
Artificial Intelligence lab, Vrije Universiteit Brussel, Brussels, Belgium.
38
MRC Laboratory of Molecular Biology, Cambridge, United Kingdom.
39
University of Kent, School of Computing, Canterbury, United Kingdom.
40
School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom.
41
University of Missouri, Computer Science, Columbia, Missouri, USA.
42
Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA.
43
University of Miami, Coral Gables, Florida, USA.
44
Centre for Systems and Synthetic Biology, Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, United Kingdom.
45
School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway, Galway, Ireland.
46
Technical University of Munich, Garching, Germany.
47
Department of Informatics, Bioinformatics & Computational Biology-i12, Technische Universität München, Munich, Germany.
48
Faculty for Informatics, Garching, Germany.
49
Department for Bioinformatics and Computational Biology, Garching, Germany.
50
School of Computing Sciences and Computer Engineering, Hattiesburg, Mississippi, USA.
51
Institute of Biotechnology, Helsinki Institute of Life Sciences, University of Helsinki, Finland, Helsinki, Finland.
52
Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
53
Compugen Ltd., Holon, Israel.
54
Baylor College of Medicine, Department of Biochemistry and Molecular Biology, Houston, TX, USA.
55
Baylor College of Medicine, Department of Molecular and Human Genetics, Houston, TX, USA.
56
National TsingHua University, Hsinchu, Taiwan.
57
Department of Electrical Engineering in National Tsing Hua University, Hsinchu City, Taiwan.
58
The Hebrew University of Jerusalem, Jerusalem, Israel.
59
University of California San Diego, San Diego Supercomputer Center, La Jolla, California, USA.
60
Department of Computational Biology and Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.
61
Department of Genetics, Evolution & Environment, and Department of Computer Science, University College London, London, UK.
62
Swiss Institute of Bioinformatics, Lausanne, Switzerland.
63
Department of Knowledge Technologies, Jozef Stefan Institute, Ljubljana, Slovenia.
64
Jozef Stefan Institute, Ljubljana, Slovenia.
65
Jozef Stefan International Postgraduate School, Ljubljana, Slovenia.
66
Research Department of Structural and Molecular Biology, University College London, London, England.
67
Research Department of Structural and Molecular Biology, University College London, London, United Kingdom.
68
Department of Health and Life Sciences, Oxford Brookes University, London, UK.
69
The Francis Crick Institute, Biomedical Data Science Laboratory, London, United Kingdom.
70
Department of Genetics, Evolution and Environment, University College London, Gower Street, London, WC1E 6BT, United Kingdom.
71
Department of Computer Science, University College London, London, United Kingdom.
72
SIB Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland.
73
Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain.
74
Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
75
Università degli Studi di Milano - Computer Science Department - AnacletoLab, Milan, Milan, Italy.
76
Institut de Biologie Computationnelle, LIRMM, CNRS-UMR 5506, Universite de Montpellier, Montpellier, France.
77
Department of Informatics, Bioinformatics and Computational Biology-i12, Technische Universität München, Munich, Germany.
78
University of Lorraine, CNRS, Inria, LORIA, Nancy, 54000, France.
79
Inria, Nancy, France.
80
University of Lorraine, Nancy, Lorraine, France.
81
Department of Biology, New York University, New York, NY, USA.
82
NYU Center for Data Science, New York, 10010, NY, USA.
83
Flatiron Institute, CCB, New York, 10010, NY, USA.
84
Center for Computational Biology (CCB), Flatiron Institute, Simons Foundation, New York, New York, USA.
85
Center for Data Science, New York University, New York, 10011, NY, USA.
86
Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
87
Department of Molecular Medicine, University of Padova, Padova, Italy.
88
Department of Biology, University of Padova, Padova, Italy.
89
CNR Institute of Neuroscience, Padova, Italy.
90
Department of Biomedical Sciences, University of Padua, Padova, Italy.
91
Department of Computer Science, National University of Computer and Emerging Sciences, Peshawar, Khyber Pakhtoonkhwa, Pakistan.
92
Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA.
93
University of California, Riverside, Philadelphia, PA, USA.
94
Department of Biology, Brigham Young University, Provo, UT, USA.
95
Bioinformatics Research Group, Provo, UT, USA.
96
School of Biological Sciences, University of Reading, Reading, England, United Kingdom.
97
Department of Pharmaceutical Chemistry, San Francisco, CA, USA.
98
Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, 94158, CA, USA.
99
UC Berkeley - UCSF Graduate Program in Bioengineering, University of California, San Francisco, 94158, CA, USA.
100
Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Italy.
101
State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, Shanghai, China.
102
Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.
103
School of Computer Science and Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai, China.
104
Institute of Science and Technology for Brain-Inspired Intelligence and Shanghai Institute of Artificial Intelligence Algorithms, Fudan University, Shanghai, China.
105
Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Fudan University), Ministry of Education, Shanghai, China.
106
State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, Department of Biostatistics and Computational Biology, School of Life Sciences, Fudan University, Shanghai, Shanghai, China.
107
Department of Pediatrics, Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
108
Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA.
109
Department of Computer Science, National Chengchi University, Taipei, Taiwan.
110
Okinawa Institute of Science and Technology, Tancha, Okinawa, Japan.
111
Tel Aviv University, Tel Aviv, Israel.
112
Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Jeddah, Saudi Arabia.
113
Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
114
Computer, Electrical and Mathematical Sciences Engineering Division (CEMSE), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
115
Control and Computer Engineering Department, Politecnico di Torino, Torino, TO, Italy.
116
Department of Future Technologies, Turku NLP Group, University of Turku, Turku, Finland.
117
University of Turku Graduate School (UTUGS), Turku, Finland.
118
University of Turku, Turku, Finland.
119
Turku Centre for Computer Science (TUCS), Turku, Finland.
120
Department of Future Technologies, Faculty of Science and Engineering, University of Turku, Turku, FI-20014, Finland.
121
Turku Centre for Computer Science (TUCS), Agora, Vesilinnantie 3, Turku, FI-20500, Finland.
122
Department of Future Technologies, University of Turku, Turku, Finland.
123
Department of Biological Sciences, Department of Computer Science, Purdue University, 47907, IN, USA.
124
Department of Pediatrics, University of Cincinnati, Cincinnati, 45229, OH, USA.
125
Department of Computer Science, Purdue University, West Lafayette, IN, USA.
126
Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia.
127
Department of Computer Science, ETH Zurich, Zurich, Switzerland.
128
SIB Swiss Institute of Bioinformatics, Zurich, Switzerland.
129
Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
130
Institute for Food and Plant Sciences WZW, Technische Universität München, Freising, Germany.
131
University of California, Berkeley, CA, USA.
132
Biological Sciences, University of Illinois at Chicago, Chicago, Illinois, USA.
133
Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
134
Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
135
Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, Pennsylvania, USA.
136
Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA. predrag@northeastern.edu.
137
Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA. idoerg@iastate.edu.

Abstract

BACKGROUND:

The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

RESULTS:

Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory.

CONCLUSION:

We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

KEYWORDS:

Biofilm; Community challenge; Critical assessment; Long-term memory; Protein function prediction
