The main goal of the project is to develop a new generation of bioinformatics resources for the integrative analysis of multiple types of omics data. These resources include both novel statistical methodologies and user-friendly software implementations. STATegra methods address many aspects of the omics data integration problem, such as the design of multiomics experiments, integrative transcriptional and regulatory networks, integrative variable selection, data fusion, integration of public domain data, and integrative pathway analysis. To support method development, STATegra uses a model biological system, namely the differentiation process of mouse pre-B cells. The STATegra consortium generated data focused on a critical step in the differentiation of B lymphocytes, which are key components of the adaptive immune system. Transcription factors of the Ikaros family are central to the normal differentiation of B cell progenitors, and their expression increases in response to developmental stage-specific signals to terminate the proliferation of B cell progenitors and to initiate their differentiation. In particular, a novel biological system was used that models the transition from the pre-BI stage to the subsequent pre-BII stage, where B cell progenitors undergo growth arrest and differentiation. The approach involves a pre-B cell line, B3, and an inducible version of the Ikaros transcription factor, Ikaros-ERt2. Ikaros factors act to down-regulate genes that drive proliferation and simultaneously to up-regulate the expression of genes that promote the differentiation of B cell progenitors. Hence, in the B3 system, before induction of Ikaros, cells proliferate and their gene expression pattern is similar to that of proliferating B cell progenitors in vivo.
Following Ikaros induction, B3 cells undergo gene expression changes that resemble those that occur in vivo during the transition from cycling to resting pre-B cells, followed by a marked reduction in cellular proliferation and by G1 arrest. On this system the consortium has created a high-quality data collection consisting of a replicated time course using seven different omics platforms: RNA-seq, miRNA-seq, ChIP-seq, DNase-seq, Methyl-seq, proteomics and metabolomics, which is used to assess and validate STATegra methods.

Overall design: The STATegra experimental design consists of a 6-point time course that captures the differentiation of B3 cells containing Ikaros-ERt2 upon Ikaros induction within a 24-hour period. The process was sampled at 0h, 2h, 6h, 12h, 18h and 24h after Ikaros induction by tamoxifen. As a control, B3 cells transfected with an empty vector were treated and sampled in the same way as the inducible line. Eight different omics technologies were measured on this system: RNA-seq, miRNA-seq, DNase-seq, RRBS-seq, ChIP-seq, scRNA-seq, proteomics and metabolomics. Generally, three biological replicates were obtained per platform and were processed as independent batches. Small RNA-seq analysis was performed using Trizol-extracted total RNA from 3 biological batches (4, 5 and 6) for time 0h and from 3 biological batches (1, 2 and 3) for times 2h, 6h, 12h, 18h and 24h. There were two experimental conditions (C = Control, IK = Ikaros), and the 3 biological replicates per condition and time point were numbered 1, 2 and 4. For some of these biological replicates, technical replicates were also generated in order to measure the variability between batches and subsequently correct for potential batch effects.
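
As a rough illustration of the nominal design described above (two conditions, six time points, three biological replicates per condition and time point), the sample layout could be enumerated as in the following sketch. The `sample_id` naming scheme and the function name are hypothetical and are not taken from the deposited data. The sketch also leaves out the platform-specific batch assignments (for example, the small RNA-seq batches) and the technical replicates.

```python
from itertools import product

# Values taken from the design description.
CONDITIONS = ("C", "IK")               # C = Control, IK = Ikaros
TIME_POINTS_H = (0, 2, 6, 12, 18, 24)  # hours after tamoxifen induction
REPLICATES = (1, 2, 4)                 # replicate labels as stated in the text


def build_sample_sheet():
    """Return one record per nominal sample (condition x time x replicate)."""
    return [
        {
            "condition": cond,
            "time_h": time_h,
            "replicate": rep,
            # Hypothetical naming scheme, for illustration only.
            "sample_id": f"{cond}_{time_h}h_rep{rep}",
        }
        for cond, time_h, rep in product(CONDITIONS, TIME_POINTS_H, REPLICATES)
    ]


sheet = build_sample_sheet()
print(len(sheet))  # 2 conditions x 6 time points x 3 replicates = 36
```

A table of this kind is a common starting point for building the design matrix used in time-course and batch-effect models. The actual number of samples per platform may differ from this nominal count.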