Big Data in Plant Science: Resources and Data Mining Tools for Plant Genomics and Proteomics

Methods Mol Biol. 2016:1415:533-47. doi: 10.1007/978-1-4939-3572-7_27.

Abstract

In modern plant biology, progress is increasingly defined by scientists' ability to gather and analyze data sets of high volume and complexity, otherwise known as "big data". Arguably, the largest increase in the volume of plant data sets over the last decade is a consequence of applying next-generation sequencing and mass spectrometry technologies to the study of experimental model and crop plants. The increase in the quantity and complexity of biological data brings challenges, mostly associated with data acquisition, processing, and sharing within the scientific community. Nonetheless, big data in plant science creates unique opportunities to advance our understanding of complex biological processes at an unprecedented level of accuracy, and establishes a foundation for plant systems biology. In this chapter, we summarize the major drivers of big data in plant science and big data initiatives in the life sciences, with a focus on the scope and impact of iPlant, a representative cyberinfrastructure platform for plant science.

Keywords: Big data; Databases; Genomics; Mass spectrometry; Next-generation sequencing; Proteomics; iPlant.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Botany
  • Computational Biology / methods*
  • Data Mining / methods*
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing
  • Internet
  • Mass Spectrometry
  • Plants / genetics*
  • Plants / metabolism*
  • Proteomics / methods
  • Systems Biology