## Results: 6

Figure 4. From: MOABS: model based analysis of bisulfite sequencing data.

**MOABS improves the detection of allele-specific DNA methylation. (a)** The Y-axis shows the number of known DMRs recovered by three different methods.

**(b)** Sensitivity (Y-axis) at 5% FDR at different sequencing depths (X-axis).

Figure 6. From: MOABS: model based analysis of bisulfite sequencing data.

**MOABS detects differential 5hmc using RRBS and oxBS-Seq. (a)** Simulation study of 5hmc detection from oxBS-seq and RRBS. Each point on a curve represents the smallest number of reads (X-axis) needed to detect a 5hmc ratio (Y-axis) at a specified 5mc ratio (indicated by color). The thin and thick curves represent FETP and MOABS, respectively.

**(b)** Beta distribution of the 5hmc ratio in each sample.

Figure 2. From: MOABS: model based analysis of bisulfite sequencing data.

**Overview of the MOABS software pipeline. (a)** Comprehensive workflow of the MOABS pipeline.

**(b)** An example of a hypo-methylated region.

**(c)** A descriptive figure for the global methylation distribution of a mouse methylome. The Y-axis on the left is the percentage of CpGs and the Y-axis on the right is the average local CpG density at each specified methylation ratio.

Figure 3. From: MOABS: model based analysis of bisulfite sequencing data.

**Comparison between MOABS and FETP in detecting DMCs.** We simulated 1,000,000 CpGs in two samples with predefined true positive or true negative states. In both samples, 900,000 true negative CpGs were initially assigned the same methylation ratios. The density of the methylation ratios fits a bimodal distribution (Additional file : Figure S1) frequently observed in real BS-seq data. The remaining 100,000 true positive CpGs were randomly assigned low ratios [0, 0.25] in one sample and high ratios [0.75, 1] in the other sample. Each methylation ratio was then given a +/-0.05 fluctuation to simulate BS-seq errors. Sequencing depth was randomly sampled from 5-fold to 50-fold. The Y-axis shows the percentage of true DMCs predicted at 5% FDR.
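The simulation design described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. Several details are assumptions because the caption does not specify them: the bimodal distribution is approximated by an equal mixture of Beta(1, 8) and Beta(8, 1), the +/-0.05 fluctuation is drawn uniformly, depth is drawn independently per sample, and sample 1 always receives the low ratio for true positives. The function name `simulate_cpgs` and its parameters are hypothetical.

```python
import random


def simulate_cpgs(n_total=10_000, frac_dmc=0.1, noise=0.05, seed=0):
    """Return a list of (is_dmc, (meth1, depth1), (meth2, depth2)) tuples."""
    rng = random.Random(seed)

    def draw_reads(ratio):
        # Depth between 5 and 50 reads (assumed uniform and per sample).
        depth = rng.randint(5, 50)
        # +/- noise fluctuation (assumed uniform), clipped to [0, 1].
        jittered = min(1.0, max(0.0, ratio + rng.uniform(-noise, noise)))
        # Binomial draw: each read is methylated with probability `jittered`.
        meth = sum(rng.random() < jittered for _ in range(depth))
        return meth, depth

    n_dmc = int(n_total * frac_dmc)
    cpgs = []
    for i in range(n_total):
        is_dmc = i < n_dmc
        if is_dmc:
            # True positive: low ratio in sample 1, high ratio in sample 2.
            ratio1 = rng.uniform(0.0, 0.25)
            ratio2 = rng.uniform(0.75, 1.0)
        else:
            # True negative: same ratio in both samples, drawn from an
            # assumed bimodal mixture of Beta(1, 8) and Beta(8, 1).
            if rng.random() < 0.5:
                ratio1 = rng.betavariate(1, 8)
            else:
                ratio1 = rng.betavariate(8, 1)
            ratio2 = ratio1
        cpgs.append((is_dmc, draw_reads(ratio1), draw_reads(ratio2)))
    return cpgs


if __name__ == "__main__":
    data = simulate_cpgs(n_total=1_000)
    print(sum(is_dmc for is_dmc, _, _ in data), "true DMCs out of", len(data))
```

The default of 10,000 CpGs keeps the sketch fast. The caption's 1,000,000 CpGs with 100,000 true positives corresponds to `n_total=1_000_000` with the same `frac_dmc`.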

Figure 1. From: MOABS: model based analysis of bisulfite sequencing data.

**Overview of the MOABS algorithm. (a)** Posterior distribution of methylation ratio inferred from biological replicates. Each curve represents the inferred methylation ratio Beta distribution of a CpG. The symbols at the bottom indicate the observed methylation ratios of all replicates. The values in the top right corner indicate the number of methylated reads over the number of total reads in each replicate.

**(b)** An example of Credible Methylation Difference (CDIF). Dashed curves indicate inferred methylation ratio Beta distributions from low (Sample #1) or high sequencing depth (Sample #2). The black curve is the exact distribution of the methylation difference between two samples. The CDIF is shown as the lower bound of the 95% confidence interval.

**(c)** Ranking of three CpG examples by CDIF, FETP p-value and nominal difference, i.e. direct subtraction of two methylation ratios. The three curves are the exact distributions of methylation differences. The corresponding CDIF values are shown as vertical dashed lines.
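To make the idea in panel (b) concrete, the following sketch approximates a CDIF-like score by Monte Carlo sampling rather than by the exact distribution shown in the figure. It rests on stated assumptions: a uniform Beta(1, 1) prior, so that k methylated reads out of n give a Beta(k + 1, n - k + 1) posterior, and a convention that the score is the interval bound closest to zero, or zero when the interval contains zero. The function name `credible_difference` is hypothetical and is not part of the MOABS software.

```python
import random


def credible_difference(k1, n1, k2, n2, level=0.95, draws=20_000, seed=0):
    """Monte Carlo CDIF-like score for (sample 2 - sample 1).

    k1, k2: methylated read counts; n1, n2: total read counts.
    """
    rng = random.Random(seed)
    # Sample the difference of the two Beta posteriors (uniform prior assumed).
    diffs = sorted(
        rng.betavariate(k2 + 1, n2 - k2 + 1)
        - rng.betavariate(k1 + 1, n1 - k1 + 1)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = diffs[int(tail * draws)]
    upper = diffs[int((1.0 - tail) * draws) - 1]
    if lower > 0:
        return lower  # whole interval above zero: report its lower bound
    if upper < 0:
        return upper  # whole interval below zero: report its upper bound
    return 0.0        # interval covers zero: no credible difference


if __name__ == "__main__":
    # Same nominal ratios (0.1 vs 0.9) at shallow and deep coverage.
    print(credible_difference(1, 10, 9, 10))
    print(credible_difference(10, 100, 90, 100))
```

With identical nominal ratios, the deeper-coverage pair yields a narrower interval and therefore a larger score, which is the depth sensitivity the panel illustrates.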

Figure 5. From: MOABS: model based analysis of bisulfite sequencing data.

**MOABS reveals differential methylation underlying TFBSs. (a)** UCSC genome browser illustration of one TF binding site. The tracks from top to bottom are genomic positions, RefSeq Gene, HSC Methylation, ESC Methylation, and TFBS. For each CpG, an upward bar denotes the methylation ratio.

**(b)** Distribution of the number of DMCs underlying TFBSs. The inserted boxplot indicates the length distribution of TFBSs with 1–3 DMCs.

**(c)** Number of differentially methylated TFBSs predicted by different methods at 5% FDR.

**(d)** Running enrichment scores for TFBSs. All the CpGs are ranked by each method. The score increases if the CpG is in a TFBS and decreases if not. Only 10,000 CpGs are sampled to make this plot, as indicated by the X-axis. Based on 10,000 random shuffles of the TFBSs, the p-values of the maximum enrichment score are 1.4E-3, 1.6E-3, and 4E-3 for MOABS, FETP and BSMOOTH, respectively.
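A running enrichment score with a shuffle-based p-value, as described in panel (d), can be sketched as follows. The caption does not give the step sizes, so this example assumes an unweighted scheme in which each CpG inside a TFBS adds 1/(number of hits) and each CpG outside subtracts 1/(number of misses). It also shuffles the in-TFBS labels over the ranked CpGs as a stand-in for shuffling TFBS positions. Both function names are hypothetical.

```python
import random


def max_enrichment_score(in_tfbs):
    """Maximum of the running score over a ranked list of booleans."""
    n_hits = sum(in_tfbs)
    n_miss = len(in_tfbs) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one hit and one miss")
    up, down = 1.0 / n_hits, 1.0 / n_miss  # assumed unweighted step sizes
    running = best = 0.0
    for hit in in_tfbs:
        running += up if hit else -down
        best = max(best, running)
    return best


def permutation_p_value(in_tfbs, n_shuffles=1000, seed=0):
    """Fraction of label shuffles whose maximum score reaches the observed one."""
    rng = random.Random(seed)
    observed = max_enrichment_score(in_tfbs)
    labels = list(in_tfbs)
    at_least = 0
    for _ in range(n_shuffles):
        rng.shuffle(labels)
        if max_enrichment_score(labels) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Toy ranking: CpGs inside a TFBS are concentrated near the top.
    ranked = [True, True, False, True, True] + [False] * 45
    print(max_enrichment_score(ranked), permutation_p_value(ranked))
```

A ranking that places TFBS-associated CpGs near the top produces a high maximum score and a small p-value. A ranking unrelated to TFBS membership produces a maximum score close to those of the shuffled labels.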

**(e)** and **(f)** Same as **(c)** and **(d)**, with 4X sequencing depth obtained by random sampling. Based on 10,000 random shuffles of the TFBSs, the p-values of the maximum enrichment score are 2.9E-2, 5.1E-2, and 9.2E-2 for MOABS, FETP and BSMOOTH, respectively.