Plant Physiol. 2018 Apr;176(4):2772-2788. doi: 10.1104/pp.17.01764. Epub 2018 Feb 12.

Multi-Omics Driven Assembly and Annotation of the Sandalwood (Santalum album) Genome.

Author information

1 Center for Functional Genomics and Bioinformatics, TransDisciplinary University, Institute of Trans-Disciplinary Health Sciences and Technology, Bengaluru 560064, India.
2 Center for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru 560065, India.
3 Center for Systems Biology and Molecular Medicine, Yenepoya University, Mangalore 575018, India.
4 Institute of Bioinformatics, International Technology Park, Bengaluru 560066, India.
5 Manipal Academy of Higher Education, Manipal 576104, India.
6 School of Biotechnology, Amrita Vishwa Vidyapeetham, Kollam 690525, India.
7 Center for Systems Biology and Molecular Medicine, Yenepoya University, Mangalore 575018, India. keshav@ibioinformatics.org, malalig@tdu.edu.in
8 Center for Functional Genomics and Bioinformatics, TransDisciplinary University, Institute of Trans-Disciplinary Health Sciences and Technology, Bengaluru 560064, India. keshav@ibioinformatics.org, malalig@tdu.edu.in

Abstract

Indian sandalwood (Santalum album) is an important tropical evergreen tree known for its fragrant heartwood-derived essential oil and its valuable carving wood. Here, we applied an integrated genomic, transcriptomic, and proteomic approach to assemble and annotate the Indian sandalwood genome. Our genome sequencing resulted in the establishment of a draft map of the smallest genome for any woody tree species to date (221 Mb). The genome annotation predicted 38,119 protein-coding genes and 27.42% repetitive DNA elements. In-depth proteome analysis revealed the identities of 72,325 unique peptides, which confirmed 10,076 of the predicted genes. The addition of transcriptomic and proteogenomic approaches resulted in the identification of 53 novel proteins and 34 gene-correction events that were missed by genomic approaches. Proteogenomic analysis also helped in reassigning 1,348 potential noncoding RNAs as bona fide protein-coding messenger RNAs. Gene expression patterns at the RNA and protein levels indicated that peptide sequencing was useful in capturing proteins encoded by nuclear and organellar genomes alike. Mass spectrometry-based proteomic evidence provided an unbiased approach toward the identification of proteins encoded by organellar genomes. Such proteins are often missed in transcriptome data sets due to the enrichment of only messenger RNAs that contain poly(A) tails. Overall, the use of integrated omic approaches enhanced the quality of the assembly and annotation of this nonmodel plant genome. The availability of genomic, transcriptomic, and proteomic data will enhance genomics-assisted breeding, germplasm characterization, and conservation of sandalwood trees.

PMID: 29440596
PMCID: PMC5884603 [Available on 2019-04-01]
DOI: 10.1104/pp.17.01764
[Indexed for MEDLINE]
