F1000Res. 2016 Jul 5;5:1574. doi: 10.12688/f1000research.9110.1. eCollection 2016.

An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study.

Author information

  • Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, NY, Box 1603, USA; BD2K-LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, New York, NY, Box 1603, USA; Mount Sinai Knowledge Management Center for Illuminating the Druggable Genome, Icahn School of Medicine at Mount Sinai, New York, NY, Box 1603, USA.

Abstract

RNA-seq analysis is becoming a standard method for global gene expression profiling. However, running open and standard RNA-seq analysis pipelines remains challenging for non-experts because of the large size of the raw data files and the hardware requirements of the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. It enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap between the upregulated genes and genes that, when knocked out in mice, induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns of mothers infected with the virus during pregnancy. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.
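The "significant overlap" between a list of differentially expressed genes and a gene set library entry can be illustrated with a one-sided hypergeometric test. The following is a minimal sketch only: the abstract does not state which statistical test the pipeline uses, and the function name, gene identifiers, and background size below are hypothetical placeholders, not data from the study.

```python
from math import comb


def hypergeom_overlap_pvalue(query_genes, gene_set, background_size):
    """Return (overlap size, one-sided p-value) for a gene-list overlap.

    The p-value is P(X >= observed overlap), where X follows a
    hypergeometric distribution: drawing len(query_genes) genes without
    replacement from a background of `background_size` genes, of which
    len(gene_set) belong to the annotated gene set.
    Assumption: both lists are subsets of the same background.
    """
    query = set(query_genes)
    annotated = set(gene_set)
    k = len(query & annotated)   # observed overlap
    n = len(query)               # size of the query list
    K = len(annotated)           # size of the annotated gene set
    N = background_size          # size of the gene background
    if N < max(n, K):
        raise ValueError("background_size is smaller than one of the lists")

    total = comb(N, n)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / total


if __name__ == "__main__":
    # Toy identifiers, purely illustrative.
    upregulated = ["GENE_A", "GENE_B", "GENE_C"]
    brain_morphology_set = ["GENE_B", "GENE_C", "GENE_D"]
    overlap, p_value = hypergeom_overlap_pvalue(
        upregulated, brain_morphology_set, background_size=10
    )
    print(f"overlap={overlap}, p={p_value:.4f}")
```

In practice an enrichment analysis repeats such a test for every term in every library and then corrects for multiple testing; that step is omitted here for brevity.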

KEYWORDS:

RNA-seq; Systems biology; bioinformatics pipeline; gene expression analysis
