Proc Natl Acad Sci U S A. Oct 26, 2010; 107(43): 18729–18734.
Published online Oct 11, 2010. doi:  10.1073/pnas.1009695107
PMCID: PMC2972920
Plant Biology

Local DNA hypomethylation activates genes in rice endosperm


Cytosine methylation silences transposable elements in plants, vertebrates, and fungi but also regulates gene expression. Plant methylation is catalyzed by three families of enzymes, each with a preferred sequence context: CG, CHG (H = A, C, or T), and CHH, with CHH methylation targeted by the RNAi pathway. Arabidopsis thaliana endosperm, a placenta-like tissue that nourishes the embryo, is globally hypomethylated in the CG context while retaining high non-CG methylation. Global methylation dynamics in seeds of cereal crops that provide the bulk of human nutrition remain unknown. Here, we show that rice endosperm DNA is hypomethylated in all sequence contexts. Non-CG methylation is reduced evenly across the genome, whereas CG hypomethylation is localized. CHH methylation of small transposable elements is increased in embryos, suggesting that endosperm demethylation enhances transposon silencing. Genes preferentially expressed in endosperm, including those coding for major storage proteins and starch synthesizing enzymes, are frequently hypomethylated in endosperm, indicating that DNA methylation is a crucial regulator of rice endosperm biogenesis. Our data show that genome-wide reshaping of seed DNA methylation is conserved among angiosperms and has a profound effect on gene expression in cereal crops.

Keywords: DNA methylation, embryo, transposable element

Roughly 150 million y ago, flowering plants diverged to form the two dominant extant lineages, monocots and dicots (1). Arabidopsis thaliana, the preeminent plant genetic system, is a dicot, whereas cereal crops, such as rice, wheat, and maize, that feed much of the world are monocots. In both plant groups, pollen grains contain two sperm nuclei, one of which fertilizes a diploid central cell to give rise to triploid endosperm (2). A. thaliana endosperm is consumed by the developing embryo, whereas cereal endosperm persists and makes up the bulk of the mature seed—a developmental difference of particular practical importance (3). Developing seeds are genetic battlegrounds on multiple fronts: parents are proposed to be in conflict over resource allocation (2), whereas the embryo must repress parasitic transposable elements (TEs) to prevent damage to the genome.

A. thaliana endosperm DNA methylation participates in both conflicts (2, 4–6). Plant TEs are preferentially methylated through a small RNA pathway, which recruits the DOMAINS REARRANGED METHYLASES (DRM) methyltransferases that catalyze methylation of cytosines in all sequence contexts (6). This methylation is generally referred to as CHH (H = A, C, or T) to differentiate it from CG methylation, catalyzed by the METHYLTRANSFERASE 1 family, and CHG methylation, catalyzed by the CHROMOMETHYLASE family (6). CHG and CHH methylation is predominantly found in TEs, whereas CG methylation is abundant in both TEs and genes (6–8). Plants also possess DNA glycosylase enzymes capable of removing 5-methylcytosine (6). One of these enzymes, DEMETER (DME), is expressed in the A. thaliana central cell before fertilization, leading to extensive hypomethylation of the maternal genome (2, 4, 5). This methylation difference between the maternal and paternal genomes in endosperm, a form of genomic imprinting, causes differential expression of a number of genes, depending on parent of origin (2, 4). Imprinted expression is generally explained in terms of conflict between parents over resource allocation, with male-expressed genes maximizing resource extraction from the female and female-expressed genes counteracting this drive (2).
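The three sequence contexts can be made concrete with a short sketch. The function below is illustrative only (its name and its handling of ambiguous bases are assumptions, not part of the original analysis) and classifies a forward-strand cytosine as CG, CHG, or CHH from the two bases that follow it.

```python
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at seq[i] (forward strand) as 'CG', 'CHG', or 'CHH'.

    Returns None if seq[i] is not a cytosine, or if the context cannot be
    determined (sequence end reached or a non-ACGT base is present).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    tri = seq[i:i + 3]
    # CG: the next base is G, whatever follows.
    if len(tri) >= 2 and tri[1] == "G":
        return "CG"
    # Non-CG contexts need two unambiguous downstream bases.
    if len(tri) < 3 or any(base not in "ACGT" for base in tri):
        return None
    # Here tri[1] is A, C, or T (i.e., H).
    if tri[2] == "G":
        return "CHG"
    return "CHH"
```

Cytosines on the reverse strand can be classified the same way after taking the reverse complement of the sequence.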

Most of our knowledge about DNA methylation in plant seeds is derived from A. thaliana. Processes involving genetic conflict tend to evolve rapidly (9), and therefore, methylation dynamics in cereal seeds may be quite different. Here, we use deep bisulfite sequencing to examine DNA methylation in rice seeds. Wild-type rice endosperm methylation patterns—globally reduced non-CG methylation and local CG hypomethylation—resemble those of DME-deficient A. thaliana endosperm, a finding consistent with lack of DME in monocots. Reduced endosperm methylation is common in genes with preferential endosperm expression, indicating that demethylation is a major mechanism for gene activation in rice endosperm. Short TEs are hypermethylated at CHH sites in embryo, suggesting that endosperm demethylation functions to immunize the embryo against TEs through small RNAs.


Global Genomic Methylation Patterns Are Similar Across Rice Tissues.

To learn how cytosine methylation regulates cereal seed genomes, we quantified methylation in rice embryos, endosperm, and seedling shoots and roots by sequencing bisulfite-converted genomic DNA (bisulfite treatment converts unmethylated cytosine to uracil) to 11- to 15-fold coverage of the nuclear genome (Table S1). The aggregate methylation patterns in all tissues are very similar to those of mature rice leaves (8) as well as those of A. thaliana (6)—CG methylation is common in gene bodies, except near the transcription start and termination sites, whereas TEs are methylated in all sequence contexts (Fig. 1). Overall CG methylation patterns and levels are virtually indistinguishable between embryos, shoots, roots, and leaves (Fig. 1 A and B). CHG methylation increases modestly with age of the tissue: lowest in embryos, higher in young shoots and roots, and highest in mature leaves (Fig. 1 C and D), consistent with reports of increased methylation in older tissues of maize and petunia (10, 11). CHH methylation is also higher in leaves than in seedling tissues (Fig. 1 E and F).
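The logic of reading methylation from bisulfite-converted DNA can be sketched as follows. Because unmethylated cytosines are converted to uracil and sequenced as thymine, the methylation level of a reference cytosine can be estimated as the fraction of covering reads that still show C. This is a simplified illustration that assumes ungapped forward-strand reads given as (start, sequence) pairs; the function name and input format are hypothetical, and the actual processing pipeline is the one described in refs. 5 and 8.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def methylation_levels(reference: str,
                       reads: List[Tuple[int, str]]) -> Dict[int, float]:
    """Estimate per-cytosine methylation from bisulfite reads.

    reference: forward-strand reference sequence.
    reads: (0-based start, read sequence) pairs, ungapped, forward strand.
    Returns {position: C / (C + T)} for reference cytosines with coverage.
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [C reads, T reads]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base == "C":      # protected from conversion: methylated
                counts[pos][0] += 1
            elif base == "T":    # converted: unmethylated
                counts[pos][1] += 1
    return {pos: c / (c + t) for pos, (c, t) in counts.items()}


# Toy example: three reads covering a 6-bp reference with cytosines at 1 and 4.
example = methylation_levels(
    "ACGTCA",
    [(0, "ACGTTA"), (0, "ACGTTA"), (0, "ATGTCA")],
)
```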

Fig. 1.
Patterns of DNA methylation in rice tissues. Rice genes (A, C, and E) or TEs (B, D, and F) were aligned at the 5′ end (Left) or the 3′ end (Right), and average methylation levels for each 100-bp interval are plotted. The dashed line represents ...
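
The kind of aggregate profile described in this legend can be sketched in a few lines. The example below is a rough illustration rather than the procedure used to draw the figure: it assumes a single chromosome, genes given as (start, end, strand) tuples, and a dictionary of per-cytosine methylation levels. The function name, the flank and span limits, and the coordinate conventions are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def five_prime_profile(genes: List[Tuple[int, int, str]],
                       site_levels: Dict[int, float],
                       bin_size: int = 100,
                       flank: int = 1000,
                       span: int = 2000) -> Dict[int, float]:
    """Average methylation in bins relative to each gene's 5' end.

    Bin 0 covers [0, bin_size) downstream of the 5' end; negative bins lie
    upstream. Positions are measured in the direction of transcription.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for start, end, strand in genes:
        five_prime = start if strand == "+" else end
        for pos, level in site_levels.items():
            rel = pos - five_prime if strand == "+" else five_prime - pos
            if -flank <= rel < span:
                bin_start = (rel // bin_size) * bin_size
                sums[bin_start] += level
                counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sums}


# Toy example with one gene on each strand.
profile = five_prime_profile(
    [(1000, 2000, "+"), (5000, 6000, "-")],
    {950: 0.2, 1050: 0.8, 5950: 0.4, 6050: 0.6},
)
```

The nested loop is quadratic and is meant only to show the binning; a real analysis of a whole genome would index sites by position first.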

Short TEs Are Hypermethylated at CHH Sites in Rice Embryo.

Average CHH methylation of embryo TEs is higher than in seedlings near the points of alignment but indistinguishable past 1 kb into the element (Fig. 1F), a pattern caused by differential methylation of short and long TEs (Fig. 2A and Fig. S1). TEs longer than 1 kb show the same methylation levels in embryos and seedlings, whereas shorter elements [i.e., miniature inverted-repeat transposable elements (MITEs) and short interspersed nuclear elements (SINEs)] are hypermethylated in embryos (Fig. 2A and Fig. S1). The abundance of CHH methylation in short TEs led us to examine whether their genomic distribution accounts for the spike in CHH methylation upstream of genes (Fig. 1E). MITEs, the most abundant short elements in rice (some of which are active) (12), preferentially occur near genes (13). MITE distribution indeed closely parallels that of CHH methylation (Fig. 2B). MITE frequency 5′ and 3′ of genes is directly correlated with gene transcription, whereas MITE frequency within genes is inversely correlated with transcription (Fig. 2C). CHH methylation shows a similar distribution (Fig. 2D), a pattern quite different from CG or CHG methylation (Fig. S2). Thus the distribution of CHH methylation closely follows that of MITEs.

Fig. 2.
MITEs are the predominant target of CHH methylation. (A) Box plots showing methylation levels of different TE classes in rice embryo (Em), shoot (St), root (Rt), and endosperm (En). Each box encloses the middle 50% of the distribution, with the horizontal ...

Global Hypomethylation of Non-CG Sites and Local Hypomethylation of CG Sites in Rice Endosperm.

DNA methylation in rice endosperm is lower in all sequence contexts compared with all other tissues that we examined: CG methylation is about 93% of that of embryos, whereas CHG and CHH methylation is lower by about twofold and fivefold, respectively (Figs. 1 and 2A, Fig. S3, and Table S1). The decrease in CG methylation affects gene bodies, gene-adjacent regions, and TEs (Fig. 1), which may reflect A. thaliana-like even hypomethylation of the entire genome (5) or may result from severe demethylation of specific loci. To identify differentially methylated rice sequences, we calculated fractional methylation in each context within 50-bp windows, subtracted one dataset from another in all pair-wise combinations, and identified loci with significant methylation differences between tissues (Fig. 3 and Table S2). In contrast to A. thaliana, CG methylation of most loci is unchanged in rice endosperm (Fig. 3A), with hypomethylation restricted to specific domains (Fig. 3 D and E and Fig. S4), whereas non-CG methylation is reduced throughout the genome (Fig. 3 B–E). Gene bodies, transcriptional start site (TSS)-proximal regions, and MITEs show similar decreases in endosperm CG methylation, whereas long TEs exhibit virtually no CG hypomethylation (Fig. S5). Long TEs also lose less CHG methylation in endosperm than other sequences (Fig. S5). There are modest (193–1,799 loci) CG methylation differences between tissues other than endosperm, with leaves most similar to seedling shoots and seedling shoots most similar to seedling roots (Table S2), suggesting that age and tissue type influence methylation patterns. CG methylation differences are much greater when endosperm is considered, with 25,655–29,969 loci hypomethylated in endosperm (Table S2).
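The window-based comparison can be sketched as follows. This is a minimal illustration, assuming that fractional methylation in a window is the pooled number of methylated reads divided by the pooled number of total reads over all cytosines of one context in that window; the function names and input layout are hypothetical, and the significance test used to call differentially methylated loci is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

WINDOW = 50  # window size in bp, as stated in the text


def window_methylation(sites: Iterable[Tuple[int, int, int]],
                       window: int = WINDOW) -> Dict[int, float]:
    """Pool (position, methylated_reads, total_reads) sites into windows.

    Returns {window_start: fractional methylation} for covered windows.
    """
    meth = defaultdict(int)
    total = defaultdict(int)
    for pos, m, n in sites:
        w = (pos // window) * window
        meth[w] += m
        total[w] += n
    return {w: meth[w] / total[w] for w in total if total[w] > 0}


def window_differences(a: Dict[int, float],
                       b: Dict[int, float]) -> Dict[int, float]:
    """Subtract dataset b from dataset a for windows covered in both."""
    return {w: a[w] - b[w] for w in a.keys() & b.keys()}


# Toy example: one window loses methylation in endosperm, one does not.
embryo = window_methylation([(10, 9, 10), (30, 8, 10), (60, 5, 10)])
endosperm = window_methylation([(10, 4, 10), (30, 5, 10), (60, 5, 10)])
diff = window_differences(embryo, endosperm)
```

Positive values in `diff` indicate windows that are less methylated in endosperm than in embryo.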

Fig. 3.
Local CG and global non-CG hypomethylation of rice endosperm. (A–C) Kernel density plots of the differences between embryo and endosperm methylation (red trace), the differences between embryo and seedling shoot methylation (blue trace), and the ...

DEMETER Orthology Group Is Restricted to Dicots.

Our data show that a major reduction of DNA methylation occurs in the endosperm of rice, but the specifics of demethylation are quite different compared with A. thaliana (4, 5), leaving open the question of whether global demethylation has a common evolutionary origin among angiosperms. A plausible answer is suggested by the fact that the methylation landscape of rice endosperm closely resembles that of A. thaliana endosperm with a mutation in the DEMETER (DME) DNA glycosylase (5). DME is required for global CG demethylation in A. thaliana endosperm (4, 5), and its loss of function leads to increased CG methylation and decreased non-CG methylation (5) very similar to that of wild-type rice (Fig. 4). DME and its three A. thaliana homologs that function primarily outside the seed (14, 15) belong to three distinct orthology groups within flowering plants: DME, REPRESSOR OF SILENCING 1 (ROS1), and DEMETER-LIKE 3 (DML3) (Fig. 5). The DME orthology group extends to monkey flower (Mimulus guttatus) (Fig. 5), a basal dicot that diverged from A. thaliana about 125 million y ago (1), suggesting that DME function may be conserved across dicots. However, monocots like rice lack DME orthologs (Fig. 5), and therefore, rice endosperm hypomethylation must rely on ROS1 and/or DML3 orthologs or alternate biochemical mechanisms. The similarity between wild-type rice and DME-deficient A. thaliana suggests that the overall process of extensive endosperm demethylation is likely conserved between monocots and dicots, with the observed differences caused by divergent evolution of the DME/ROS1/DML3 family.

Fig. 4.
Wild-type rice endosperm methylation resembles A. thaliana dme endosperm. (A and B) Kernel density plots of CHG and CHH methylation differences within 50-bp windows between rice embryo and endosperm (red trace), A. thaliana endosperm from loss-of-function ...

Fig. 5.
Phylogenetic analysis of the DME/ROS1/DML3 glycosylase family. A phylogenetic tree based on conserved domains of glycosylase proteins, with basal land plant (moss and lycophyte) proteins as an outgroup. Posterior probability values (0–100) are ...

CG and CHG Hypomethylation Is a Major Mechanism of Gene Activation in Rice Endosperm.

Gene expression in A. thaliana endosperm is significantly linked to DNA methylation—those genes with reduced DNA methylation upstream of the TSS tend to be more expressed in endosperm (5)—but the association is not very strong, with most genes that are preferentially expressed in endosperm not directly activated by removal of DNA methylation (4, 5). To examine the situation in rice, we measured gene expression in the same tissues that we used to quantify DNA methylation (embryos, endosperm, and seedling shoots and roots) using tiling microarrays. We identified a stringent group of 165 genes with a strong preference for endosperm expression (Table S3) by selecting only those genes with fourfold or greater RNA levels in endosperm compared with each of the other three tissues and that also met these criteria in previously published expression datasets (16). We similarly identified a control group of 153 genes preferentially expressed in embryo (Table S4).
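The fourfold selection rule can be written out as a short sketch. It assumes linear-scale, positive expression values stored per gene and tissue; the function name and data layout are illustrative, and the additional requirement that genes meet the same criteria in published datasets (16) would be applied by intersecting the results from each dataset.

```python
from typing import Dict, List


def preferred_genes(expr: Dict[str, Dict[str, float]],
                    tissue: str,
                    fold: float = 4.0) -> List[str]:
    """Return genes whose level in `tissue` is at least `fold` times the
    level in every other tissue.

    expr: {gene: {tissue: linear-scale expression level}}.
    """
    selected = []
    for gene, levels in expr.items():
        target = levels[tissue]
        others = [v for t, v in levels.items() if t != tissue]
        if others and all(target >= fold * v for v in others):
            selected.append(gene)
    return selected


# Toy example with hypothetical gene names and values.
expr = {
    "g1": {"endosperm": 40.0, "embryo": 5.0, "shoot": 8.0, "root": 10.0},
    "g2": {"endosperm": 12.0, "embryo": 5.0, "shoot": 2.0, "root": 1.0},
}
endosperm_preferred = preferred_genes(expr, "endosperm")
```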

Embryo-preferred genes have similar methylation patterns in all tissues (Fig. 6 A and B and Fig. S6). In contrast, endosperm-preferred genes are, on average, hypermethylated in embryos, shoots, roots, and leaves, with decreased CG and CHG methylation in endosperm (Fig. 6 C and D); 69 of the endosperm-preferred genes (42%) have a significant decrease (P < 10^-7, Fisher's exact test) in CG methylation within 100 bp of the TSS or in CHG methylation within the gene body, with 38 genes (23%) exhibiting both (Table S3). Only nine embryo-preferred genes (6%) have a significant decrease in CG or CHG methylation in endosperm, and none show both (Table S4), a highly significant difference (P < 10^-11, Fisher's exact test).
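The comparison between the two gene groups can be checked with a small stand-alone calculation. The sketch below assumes a 2×2 table built from the gene counts given in the text (69 of 165 endosperm-preferred genes vs. 9 of 153 embryo-preferred genes with reduced CG or CHG methylation); the original test may have been set up differently, and the function name is illustrative. The two-sided P value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Rows: endosperm-preferred, embryo-preferred genes.
# Columns: hypomethylated in endosperm, not hypomethylated.
p_value = fisher_exact_two_sided(69, 165 - 69, 9, 153 - 9)
```

Only the standard library is used here so that the calculation is transparent; a statistics package would normally be preferred in practice.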

Fig. 6.
Hypomethylation is a major mechanism for activation of rice endosperm genes. (A–D) Embryo-preferred (A and B; n = 153) or endosperm-preferred (C and D; n = 165) genes were aligned as in Fig. 1, and average methylation levels for each 100-bp interval ...

For a global comparison of methylation and transcription changes, we aligned all genes according to our microarray data from those most expressed in endosperm vs. embryo to those least expressed in endosperm vs. embryo, and we displayed embryo methylation levels and the difference between embryo and endosperm as heat maps (Fig. 6 E and F and Fig. S6). Endosperm-expressed genes have higher levels of embryo CG methylation near the TSS (Fig. 6E) and higher embryo CHG methylation throughout the gene body (Fig. 6F), with reduced levels of methylation in endosperm. CHH methylation shows a weak, if any, correlation with embryo/endosperm expression differences (Fig. S6). Our data suggest that DNA hypomethylation is a major mechanism for activation of genes in rice endosperm. Endosperm-preferred genes apparently activated by DNA demethylation include precursors of glutelin, which accounts for about 70% of rice endosperm protein (17), and carbohydrate enzymes that synthesize endosperm starch (Fig. 3 D and E, Fig. S4, and Table S3), genes that create the nutritive molecules relied on by germinating seedlings and much of the human population.


Our data have important implications for engineering of transgenic cereal crops. Glutelin promoters have been investigated for their suitability to drive transgene expression in rice endosperm, and strong endosperm-specific promoters are generally highly sought (17). However, our data suggest that a substantial fraction of endosperm-specific expression is caused by hypomethylation, and transgenic constructs may not recapitulate endogenous expression patterns. Understanding the epigenetic mechanisms regulating such promoters will allow for more informed design of transgenic lines.

Considering the difficulty of targeted mutagenesis in plants (18), RNAi is an attractive mechanism for silencing unwanted endogenous genes. However, the extremely low levels of CHH methylation in rice endosperm indicate that the functionality of the RNAi system is altered in this tissue. Small RNAs may be transported out of the endosperm to immunize the embryo, or the link between RNAi and DNA methylation may be weakened. In either case, targeting RNAi to promoters for the purpose of transcriptional repression may be ineffective. Similarly, transgenes inactivated by DNA methylation in other tissues may be reactivated in endosperm.

We previously suggested that DNA hypomethylation in A. thaliana endosperm is a mechanism to enhance TE silencing in the embryo (5), a hypothesis supported by an abundance of TE-derived small RNAs in the endosperm (19). Our rice data are fully consistent with this hypothesis. The greatest amount of CHH methylation is found in short rice TEs, consistent with the observation that methylation of short TEs is more dependent on the RNAi system (20). Short TEs lose the most CHH methylation in rice endosperm (Fig. 1F) and are the only elements hypermethylated in embryo (Fig. 2A), suggesting that demethylation and activation of TEs in endosperm lead to hypermethylation and silencing in the embryo through the RNAi pathway. Considering that the major targets of rice CHH methylation are MITEs, which tend to be found near the start sites of active genes, this system is likely of particular importance for proper functionality of the rice genome.

Present models of plant gene imprinting posit that DME-mediated removal of DNA methylation in the central cell before fertilization creates epigenetic differences between the maternal and paternal genomes in endosperm, which, in turn, lead to differential gene expression (2). A number of monocot imprinted genes apparently activated by selective maternal demethylation have been identified (21–23), but lack of monocot DME implies that another member of this gene family or a different biochemical mechanism is responsible for monocot imprinting. Furthermore, it is unclear to what extent endosperm DNA hypomethylation and associated gene expression changes are allele-specific. We find that major resource genes are activated by hypomethylation in endosperm, and if their expression is confined to the maternal genome, our observations would seem inconsistent with the prevalent parental conflict theory, which predicts that genes that increase resource allocation should be expressed from the paternal genome. Whether and to what extent the demethylation that we observe is parent-specific and how this translates into imprinted gene expression are urgent issues for further study.


Bisulfite Sequencing and Analysis.

Endosperm and embryos were isolated at the milky stage, and shoots and roots were taken from 14-d-old seedlings grown in liquid medium as described (8). Bisulfite sequencing and data processing were performed as described (5, 8).

Microarray Analysis.

Our custom NimbleGen microarray consists of 2,154,325 45- to 85-bp probes that are tiled across the entire sequenced rice genome (Oryza sativa ssp. japonica cultivar Nipponbare, Michigan State University release 5, http://rice.plantbiology.msu.edu) without repeat masking. Each probe is selected to have a predicted melting temperature close to 76 °C. The array design is deposited in Gene Expression Omnibus (GEO) with accession number GSE22591. cDNA samples were prepared and labeled as described (24), with hybridization and data extraction performed at the Fred Hutchinson Cancer Research Center (www.fhcrc.org) DNA array facility (25). Two independent cDNA samples for each tissue were labeled with Cy5 and cohybridized with sonicated genomic DNA labeled with Cy3. The two replicates were averaged, and outlier probes were removed by median smoothing (three-probe window). An expression score for each gene was calculated by averaging the signal of all probes within the gene's exons.
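The scoring steps described above (replicate averaging, three-probe median smoothing, and exon averaging) can be sketched as follows. Several details are assumptions made for illustration: probes are ordered along the genome, the two end probes are smoothed using only their available neighbor, and a probe counts as exonic only if it lies entirely within an exon. The function names are hypothetical.

```python
from statistics import mean, median
from typing import List, Optional, Tuple


def smooth_median3(values: List[float]) -> List[float]:
    """Running median over a three-probe window (shorter at the ends)."""
    return [median(values[max(0, i - 1): i + 2]) for i in range(len(values))]


def expression_score(rep1: List[float],
                     rep2: List[float],
                     probe_starts: List[int],
                     probe_ends: List[int],
                     exons: List[Tuple[int, int]]) -> Optional[float]:
    """Average replicates, smooth, then average probes inside exons.

    Returns None if no probe falls entirely within an exon.
    """
    averaged = [(x + y) / 2 for x, y in zip(rep1, rep2)]
    smoothed = smooth_median3(averaged)
    exonic = [
        s for s, ps, pe in zip(smoothed, probe_starts, probe_ends)
        if any(es <= ps and pe <= ee for es, ee in exons)
    ]
    return mean(exonic) if exonic else None


# Toy example: the third probe is an outlier that smoothing removes.
score = expression_score(
    rep1=[1.0, 1.2, 9.0, 1.1, 0.9],
    rep2=[1.0, 0.8, 9.0, 0.9, 1.1],
    probe_starts=[0, 50, 100, 150, 200],
    probe_ends=[50, 100, 150, 200, 250],
    exons=[(0, 150)],
)
```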

Phylogenetic Analysis.

Conserved domains of the indicated glycosylase proteins were aligned using MUSCLE v3.7, and phylogenetic trees were inferred using MrBayes v3.1.2 as described (8). The tree was checked and graphically presented in FigTree v1.2.2 (http://tree.bio.ed.ac.uk/software/figtree).



We thank Leath Tonkin for Illumina sequencing, Peijian Cao and Pamela Ronald for processed Affymetrix array data (GSE11966), and Barbara Rotz for rice care. A.Z. is a fellow of the Jane Coffin Childs Memorial Fund for Medical Research. J.A.R. is a Fulbright Scholar. This work was partially funded by a Young Investigator Grant from the Arnold and Mabel Beckman Foundation and Grant IOS-1025890 from the National Science Foundation (to D.Z.).


The authors declare no conflict of interest.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE22591).

*This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1009695107/-/DCSupplemental.


1. Hedges SB, Dudley J, Kumar S. TimeTree: A public knowledge-base of divergence times among organisms. Bioinformatics. 2006;22:2971–2972.
2. Huh JH, Bauer MJ, Hsieh TF, Fischer RL. Cellular programming of plant gene imprinting. Cell. 2008;132:735–744.
3. Chaudhury AM, et al. Control of early seed development. Annu Rev Cell Dev Biol. 2001;17:677–699.
4. Gehring M, Bubb KL, Henikoff S. Extensive demethylation of repetitive elements during seed development underlies gene imprinting. Science. 2009;324:1447–1451.
5. Hsieh TF, et al. Genome-wide demethylation of Arabidopsis endosperm. Science. 2009;324:1451–1454.
6. Law JA, Jacobsen SE. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet. 2010;11:204–220.
7. Feng S, et al. Conservation and divergence of methylation patterning in plants and animals. Proc Natl Acad Sci USA. 2010;107:8689–8694.
8. Zemach A, McDaniel IE, Silva P, Zilberman D. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science. 2010;328:916–919.
9. Swanson WJ, Vacquier VD. The rapid evolution of reproductive proteins. Nat Rev Genet. 2002;3:137–144.
10. Martienssen R, Barkan A, Taylor WC, Freeling M. Somatically heritable switches in the DNA modification of Mu transposable elements monitored with a suppressible mutant in maize. Genes Dev. 1990;4:331–343.
11. Meyer P, et al. Endogenous and environmental factors influence 35S promoter methylation of a maize A1 gene construct in transgenic petunia and its colour phenotype. Mol Gen Genet. 1992;231:345–352.
12. Jiang N, et al. An active DNA transposon family in rice. Nature. 2003;421:163–167.
13. Bureau TE, Wessler SR. Mobile inverted-repeat elements of the Tourist family are associated with the genes of many cereal grasses. Proc Natl Acad Sci USA. 1994;91:1411–1415.
14. Penterman J, et al. DNA demethylation in the Arabidopsis genome. Proc Natl Acad Sci USA. 2007;104:6752–6757.
15. Zhu JK. Active DNA demethylation mediated by DNA glycosylases. Annu Rev Genet. 2009;43:143–166.
16. Xue LJ, Zhang JJ, Xue HW. Characterization and expression profiles of miRNAs in rice seeds. Nucleic Acids Res. 2009;37:916–930.
17. Qu le Q, Xing YP, Liu WX, Xu XP, Song YR. Expression pattern and activity of six glutelin gene promoters in transgenic rice. J Exp Bot. 2008;59:2417–2424.
18. Li J, Hsia AP, Schnable PS. Recent advances in plant recombination. Curr Opin Plant Biol. 2007;10:131–135.
19. Mosher RA, et al. Uniparental expression of PolIV-dependent siRNAs in developing endosperm of Arabidopsis. Nature. 2009;460:283–286.
20. Tran RK, et al. Chromatin and siRNA pathways cooperate to maintain DNA methylation of small transposable elements in Arabidopsis. Genome Biol. 2005;6:R90.
21. Gutiérrez-Marcos JF, et al. Epigenetic asymmetry of imprinted genes in plant gametes. Nat Genet. 2006;38:876–878.
22. Haun WJ, et al. Genomic imprinting, methylation and molecular evolution of maize Enhancer of zeste (Mez) homologs. Plant J. 2007;49:325–337.
23. Jahnke S, Scholten S. Epigenetic resetting of a gene imprinted in plant embryos. Curr Biol. 2009;19:1677–1681.
24. Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S. Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007;39:61–69.
25. Zilberman D, Coleman-Derr D, Ballinger T, Henikoff S. Histone H2A.Z and DNA methylation are mutually antagonistic chromatin marks. Nature. 2008;456:125–129.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences