MassBase: A large-scaled depository of mass spectrometry datasets for metabolome analysis

Plant Biotechnol (Tokyo). 2021 Mar 25;38(1):167-171. doi: 10.5511/plantbiotechnology.20.0911a.

Abstract

A depository of low-molecular-weight compounds, or metabolites, detected in various organisms in a non-targeted manner is indispensable for metabolomics research. Because these compounds are chemically diverse, various mass spectrometry (MS) setups with state-of-the-art technologies have been used to detect them. Over the past two decades, we have analyzed various biological samples using gas chromatography-mass spectrometry, liquid chromatography-mass spectrometry (LC-MS), or capillary electrophoresis-mass spectrometry, and archived the datasets in the depository MassBase (http://webs2.kazusa.or.jp/massbase/). Because the format of an MS dataset depends on the MS setup used, we converted each raw binary mass chromatogram dataset to a text file, and then extracted the chromatographic peak information from the converted file into a separate text file. As of August 1, 2020, the depository comprised 46,493 datasets in total, of which 38,750 are from plant species and 7,743 are from authentic or mixed chemicals and other sources (microorganisms, animals, and foods). All files in the depository can be downloaded in bulk from the website. Mass chromatograms of 90 plant species were obtained by LC-Fourier transform ion cyclotron resonance MS or Orbitrap MS, which detect ionized molecules with mass accuracy high enough to allow chemical compositions to be inferred. These chromatograms were converted to text files with the software PowerGet, and a chemical annotation was added to each peak. The processed datasets were deposited in the annotation database KomicMarket2 (http://webs2.kazusa.or.jp/km2/). The archives provide fundamental resources for comparative metabolomics and functional genomics, which may lead to a deeper understanding of living organisms.
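The link between high mass accuracy and inferring a chemical composition can be illustrated with a minimal sketch. It is not the PowerGet procedure or any MassBase code: the function names, the candidate list, the observed m/z value, and the 5 ppm tolerance are all assumptions chosen for illustration. It only shows that a tight mass tolerance separates formulas that share the same nominal mass.

```python
# Illustrative sketch only; not taken from PowerGet, MassBase, or KomicMarket2.
# Monoisotopic masses (Da) of the most abundant isotopes (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
PROTON = 1.00727647  # mass of a proton, used for the [M+H]+ adduct


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def matching_formulas(mz, candidates, tol_ppm=5.0):
    """Names of candidates whose [M+H]+ m/z lies within tol_ppm of mz."""
    hits = []
    for name, formula in candidates.items():
        theoretical_mz = monoisotopic_mass(formula) + PROTON
        if abs(ppm_error(mz, theoretical_mz)) <= tol_ppm:
            hits.append(name)
    return hits


# Hypothetical candidates that all have a nominal neutral mass of 180 Da.
CANDIDATES = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C5H12N2O5": {"C": 5, "H": 12, "N": 2, "O": 5},
    "C7H16O5": {"C": 7, "H": 16, "O": 5},
}

if __name__ == "__main__":
    observed_mz = 181.0707  # made-up peak, assumed to be an [M+H]+ ion
    print(matching_formulas(observed_mz, CANDIDATES, tol_ppm=5.0))
    print(matching_formulas(observed_mz, CANDIDATES, tol_ppm=250.0))
```

Under these assumptions, a 5 ppm window keeps only one of the three candidates, whereas a 250 ppm window cannot tell them apart. Real annotation would also have to consider other adducts, isotope patterns, and a far larger formula space.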

Keywords: database; mass spectrometry; metabolome.