PMC Full-Text Search Results

Items: 4

1.
Fig. 2. From: Removing technical variability in RNA-seq data using conditional quantile normalization.

Empirical distributions. (a) Empirical density estimates of are shown for 6 samples from the Montgomery data. (b) A histogram of counts in a single sample for genes with a GC-content of 45 ± 1% and with a length between 500 and 2000 bp is shown.

Kasper D. Hansen, et al. Biostatistics. 2012 Apr;13(2):204-216.
2.
Fig. 1. From: Removing technical variability in RNA-seq data using conditional quantile normalization.

Exploratory plots. (a) The points show the frequency of counts in the bins shown on the x-axis. The 3 colors represent 3 samples (NA12812, NA12874, and NA11993) from the Montgomery data. (b) log2-RPKM values are stratified by GC-content for 2 biological replicates from the Montgomery data (NA11918 and NA12761) and are summarized by boxplots. The 2 samples are distinguished by the 2 colors (colors can be seen in the online version). Genes with average (across all 60 samples) log2-RPKM values below 2 are not shown. (c) Log fold changes between RPKM values from the 2 samples and the same genes shown in (b) were computed and are plotted against GC-content. Red is used to show the genes with the 10% highest GC-content and blue is used to show the genes with the 10% lowest GC-content. (d) RPKM log fold changes are plotted against average log2 counts for the samples and genes shown in (b), with the same color coding as in (c). (e) As (d) but from values corrected using the method proposed by . (f) As (d) but for values normalized using our approach (see Section 4).

Kasper D. Hansen, et al. Biostatistics. 2012 Apr;13(2):204-216.
3.
Fig. 3. From: Removing technical variability in RNA-seq data using conditional quantile normalization.

Results from normalizing 60 samples. In these plots, we only show genes with a length greater than 100 bp and an average (across all 60 samples) standard log2-RPKM of 2 or greater. (a) Empirical density estimates of log2-RPKM for 5 different biological replicates from the Montgomery data are shown. (b) As (a) but CQN-normalized expression values on the log2-scale are shown. (c) The estimated GC-content effects are shown as curves for all 60 biological replicates in the Montgomery study. We created a 5 versus 5 comparison using the samples highlighted in blue (group 1) and red (group 2) (colors can be seen in the online version). (d) As (c) but curves are shown for the gene length effect instead of GC-content. (e) Average log fold change is plotted against GC-content. Here, we used RPKM values and compared group 2 to group 1. (f) Average log fold change is plotted against GC-content using CQN-normalized expression measures.

Kasper D. Hansen, et al. Biostatistics. 2012 Apr;13(2):204-216.
4.
Fig. 4. From: Removing technical variability in RNA-seq data using conditional quantile normalization.

Improved precision provided by CQN on comparisons across studies. (a) We show boxplots of the estimated log fold change between the 2 groups of 5 samples (the same 2 groups as in Figure 3) from the Montgomery data using standard RPKM, expression values normalized by TMM (trimmed mean of M-values, the method proposed in ), the method proposed in , and CQN with and without quantile normalization. We show genes with length greater than 100 bp and average (across all samples) log2-RPKM greater than or equal to 2. (b) We normalized the 29 samples assayed in both Montgomery and Cheung. For each gene, we computed the mean squared difference between the expression measure based on the Montgomery and the Cheung data. The boxplots show the distribution of these precision measures for the highly expressed genes, for each of the 4 choices of normalization: standard RPKM, TMM, the method proposed in , and CQN. We show genes with length greater than 100 bp and average (across all samples) log2-RPKM greater than or equal to 2. (c) For the MicroArray Quality Control data, we obtained fold change estimates between UHR and brain based on RNA-seq and microarrays. For RNA-seq, we used 2 samples. For the microarrays, we used a 5 versus 5 comparison. The microarray data were normalized using Robust Multi-array Average, and the RNA-seq data were normalized by CQN.

Kasper D. Hansen, et al. Biostatistics. 2012 Apr;13(2):204-216.
