## Results: 6

1. **Figure 6.** Variation of neuronal proportion by age. CETS model-predicted neuronal proportion (y-axis) vs. age in years (x-axis) for frontal cortex (**A**), cerebellum (**B**), and cerebellum excluding outlier neuronal predictions (**C**). Outliers were those predictions located beyond the 9th and 91st percentiles of the predicted neuronal distribution.

2. **Figure 2.** CETS model validation. (**A**) Scatter plot of predicted neuronal proportions vs. empirical mixes of neuronal and glial DNA in increments of 10%. (**B**) Scatter plot of predicted neuronal proportion vs. FACS-based estimate in 20 bulk tissue samples. (**C**) Scatter plot of predicted neuronal proportion vs. NeuN gene (*RBFOX3*) expression ($2^{\log(\text{fold change})}$) in 100 bulk brain tissue samples evaluated on Illumina HM27 microarrays and custom gene expression platforms [24].

3. **Figure 5.** Brain region specific epigenetic variation is a function of neuronal proportion. (**A**) Boxplots of the CETS model-predicted neuronal proportion (y-axis) in pons (green), frontal cortex (FCTX, blue), temporal cortex (TCTX, red), and cerebellum (CER, black) (x-axis). (**B**) A plot of the neuronal proportion (y-axis) vs. Euclidean distance of each array (x-axis). Color coding is the same as in **A**. (**C**) Volcano plots depicting the –log FDR significance of FCTX vs. pons (green), TCTX (red), and CER (black).

4. **Figure 4.** Identification and correction of cell heterogeneity in MDD. (**A**) Boxplots of CETS model-predicted neuronal proportions for control and MDD cases. (**B**) Scatterplot of the log2 fold change (M value) between MDD and controls in non-corrected CHARM data (x-axis) vs. the percentage of DNA methylation difference in FACS-separated neuronal and glial nuclei (y-axis) at overlapping loci between the CHARM and HM450 microarray platforms. (**C**) Scatterplot of the M value between MDD and controls in CETS model-corrected CHARM data (x-axis) vs. the percentage of DNA methylation difference in FACS-separated neuronal and glial nuclei (y-axis) at overlapping loci between the CHARM and HM450 microarray platforms.

5. **Figure 1.** CETS model generation. (**A**) Volcano plot of DNA methylation vs. –log of FDR significance for neuronal vs. glial DNA methylation profiles. Almost the entire microarray identifies FDR-significant changes across cell types. Red spots represent loci significant at a nominal p value of 0.05 in a comparison of MDD vs. control individuals; these loci are excluded from CETS model generation. Green boxes represent the top 10,000 CETS markers. (**B**) Scatterplot of DNA methylation as obtained by HM450 microarrays (x-axis) and by independent pyrosequencing assays (y-axis) at five loci within the top 1000 CETS markers. (**C**) Heat map depicting the in silico virtual gradient of neuronal to glial DNA methylation at the top 10,000 CETS markers generated across our sample of 58 individuals. Yellow and red denote β values of methylated and unmethylated DNA, respectively. Linear modeling F-statistic for each neuronal proportion is overlaid (blue). Blue dashed line depicts the model prediction of 43%.

6. **Figure 3.** Transformation of bulk tissue derived data to in silico neuronal and non-neuronal profiles. (**A**) Heat maps of raw non-transformed, transformed neuronal, and transformed glial DNA methylation values at the top 10,000 CETS markers across the empirically mixed sample range. (**B**) Scatter plots of the raw 100% neuronal vs. the transformed neuronal DNA methylation profile at the top 10,000 CETS markers across the empirically mixed sample range, with the blue line depicting the line of best fit. Red lines represent the line of best fit for the correlation between the 100% neuronal and the non-transformed DNA methylation profile across the empirically mixed sample range. (**C**) Scatter plots of the –log of the p-values generated from 100 iterations of randomly shuffling 20 bulk tissue samples. The –log(p value) of a Fisher's exact test evaluating the degree of overlap between FACS-derived neuronal profiles and the transformed and non-transformed bulk tissues in each of the 100 pairwise comparisons (y-axis) is plotted as a function of the –log(p value) of a test evaluating the group-wise neuronal proportion differences (x-axis) for each of the 100 random comparisons.
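The Figure 3C caption scores the overlap between two sets of loci with a Fisher's exact test and plots the –log(p value). A minimal sketch of that kind of overlap test is given below. It is not the authors' code: the function name `overlap_pvalue`, the one-sided (enrichment) form of the test, and the base-10 logarithm are all assumptions made for illustration.

```python
from math import comb, log10


def overlap_pvalue(n_total, n_a, n_b, n_overlap):
    """One-sided Fisher's exact test for the overlap of two locus sets.

    Hypothetical helper: returns P(X >= n_overlap), where X is the number of
    shared loci when a set of size n_b is drawn at random from n_total loci,
    n_a of which belong to the other set (hypergeometric upper tail).
    """
    max_overlap = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / comb(n_total, n_b)


# Illustrative numbers only: 10 loci on the array, two sets of 5 loci,
# all 5 shared between the sets.
p = overlap_pvalue(n_total=10, n_a=5, n_b=5, n_overlap=5)
neg_log_p = -log10(p)  # the quantity plotted on an axis such as Figure 3C's
```

With exact integer arithmetic from `math.comb`, the tail sum stays accurate even for marker sets in the thousands, although a dedicated statistics library would be the more usual choice for real data.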
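The Figure 6 caption defines outliers as predictions beyond the 9th and 91st percentiles of the predicted neuronal distribution. The sketch below shows one way such a filter could be written. The percentile definition (linear interpolation between order statistics) and the names `percentile` and `drop_outliers` are assumptions; the caption does not say which percentile convention was used.

```python
def percentile(values, q):
    """Percentile q (0-100) of a non-empty sequence, with linear
    interpolation between neighbouring order statistics (assumed convention)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def drop_outliers(predictions, low_q=9, high_q=91):
    """Keep only predictions lying within [low_q, high_q] percentiles."""
    low = percentile(predictions, low_q)
    high = percentile(predictions, high_q)
    return [p for p in predictions if low <= p <= high]


# Illustrative input only: 101 evenly spaced values standing in for
# predicted neuronal proportions expressed in percent.
example = list(range(101))
kept = drop_outliers(example)
```

Values exactly at the cut-offs are retained here; whether boundary values count as outliers is another detail the caption leaves open.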