DNA methylation and differentiation.

The methylation of specific cytosine residues in DNA has been implicated in regulating gene expression and facilitating functional specialization of cellular phenotypes. Generally, the demethylation of certain CpG sites correlates with transcriptional activation of genes. 5-Azacytidine is an inhibitor of DNA methylation and has been widely used as a potent activator of suppressed genetic information. Treatment of cells with 5-azacytidine results in profound phenotypic alterations. The drug-induced hypomethylation of DNA apparently perturbs DNA-protein interactions that may consequently alter transcriptional activity and cell determination. The inhibitory effect of cytosine methylation may be exerted via altered DNA-protein interactions specifically or may be transduced by a change in the conformation of chromatin. Recent studies have demonstrated that cytosine methylation also plays a central role in parental imprinting, which in turn determines the differential expression of maternal and paternal genomes during embryogenesis. In other words, methylation is the mechanism whereby the embryo retains memory of the gametic origin of each component of genetic information. A memory of this type would probably persist during DNA replication and cell division as methylation patterns are stable and heritable.


Introduction
The major impetus of molecular biology research is striving to elucidate the mechanisms governing gene expression during development. The differentiation of cells during embryogenesis involves a complex program of changes in gene activity that dictates the progression of cells into functionally specialized phenotypes. Since all cells in an organism contain the same genetic information, developmental switches turn on phenotypic-specific genes and suppress those genes characteristic of other cell types. Such developmentally regulated changes in gene expression are classified as epigenetic (1,2). DNA methylation is an epigenetic regulator that constitutes one level of the information coding system controlling eukaryotic gene expression. The hypomethylation of specific sites in some, but not all, genes correlates with transcriptional activity (3)(4)(5). Furthermore, the in vitro methylation of genes prior to their introduction into eukaryotic cells results in gene silencing (6,7). Methylation may therefore function to lock certain genes in a transcriptionally inactive state.
Defining the precise role of DNA methylation in eukaryotic cells has been difficult (4). For instance, the expression of some genes is not controlled by DNA methylation, and even in those genes that are regulated by the modification, methylation at some but not all sites correlates with gene control. Additionally, some animals contain extremely low levels of DNA 5-methylcytosine [e.g., Drosophila (8)], whereas the genome of other animals is highly methylated (3). There has therefore been a problem in developing an unequivocal, all-encompassing model to describe the role of DNA methylation in gene control.
However, great excitement has been generated recently by several illuminating studies. Genomic CpG dinucleotide clusters were discovered (9), and the absence of methylation at these sites was thought to functionally distinguish these regions from the remainder of the genome (10). Chandler and colleagues demonstrated the existence of allele-specific methylation patterns that substantiated the concept that differential methylation may be the discriminatory signal conferring distinctions on otherwise identical genomic sequences (11). Moreover, DNA methylation was then implicated as a molecular mechanism for parental imprinting (12)(13)(14). In this chapter, we will discuss these fascinating findings as well as the growing evidence that DNA methylation is one mechanism by which eukaryotes control gene expression during cellular differentiation.
DNA Methylation Patterns
5-Methylcytosine is the only modified base in vertebrate DNA. About 3% of cytosines, predominantly in the dinucleotide 5'-CpG-3', are postreplicatively modified to 5-methylcytosine (15). Significantly, this doublet is underrepresented in eukaryotic DNA (16). The proposed reason for the underrepresentation is that the presence of the 5-methyl moiety increases the spontaneous rate of oxidative deamination of the pyrimidine ring, resulting in the formation of a thymine base that cannot be recognized by repair enzymes and thus creating mutational hotspots (17). Consistent with this explanation is the fact that the diminished frequency of CpG sequences is usually accompanied by an overabundance of TpG and CpA dinucleotides (18). Furthermore, restriction endonuclease sites containing CpG have a high frequency of polymorphism (19). Taken together, the results suggest that the remaining CpG dinucleotides in vertebrate genomes have functional importance (20), or have never been methylated.
The distribution of methylated cytosine residues in eukaryotic DNA is nonrandom. 5-Methylcytosine occurs in repetitive sequences several-fold more frequently than in middle repetitive or unique sequences (21,22). Moreover, the extent and pattern of genomic DNA methylation are species- and tissue-specific (22)(23)(24). During development, the generation of tissue-specific patterns may be determined by de novo sequential changes in methylation patterns (25)(26)(27)(28). The fact that tissue-specific methylation patterns exist indicates that such a pattern can be faithfully inherited in somatic cells. Experiments that have demonstrated this phenomenon provide strong evidence that cytosine DNA modification is associated with the epigenetic control of gene transcription (29)(30)(31).
The dinucleotide CpG is also nonrandomly distributed in the vertebrate genome (9). Many genes contain CpG islands, which are G- and C-rich regions of DNA that have a higher frequency of CpG dinucleotides than bulk DNA (10). CpG islands occur at the 5' end of housekeeping and tissue-specific genes, as well as at the 3' end of some tissue-specific genes (32,33). Bird has estimated that there may be as many as 30,000 CpG islands in the haploid mouse genome (9). It is also predicted that CpG islands will be found associated with the vast majority of genes, especially those that are widely expressed (32). Interestingly, none of the CpG islands that have been examined are methylated, with the exception of those on the inactive X-chromosome in somatic cells. The CpG dinucleotide is therefore abundant and not methylated in these islands, whereas it is relatively scarce and predominantly methylated in the rest of the genome. It is believed that in the germline these clusters are protected from methylation so that the respective genes are poised for early transcription.
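The contrast between CpG-poor bulk DNA and CpG-rich islands can be made concrete with a simple calculation. The following is a minimal sketch, not a method taken from the studies cited here: it computes the G+C fraction of a sequence and the ratio of observed to expected CpG dinucleotides, where the expected count assumes that C and G occur independently. The function name `cpg_stats` and the two short test sequences are hypothetical and serve only as illustration.

```python
def cpg_stats(seq):
    """Return (G+C fraction, CpG observed/expected ratio) for a DNA sequence.

    The expected CpG count assumes C and G are placed independently:
    expected = (number of C) * (number of G) / sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number
    # of CpG dinucleotides on this strand.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    expected = (c * g) / n
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


# Hypothetical sequences: one island-like, one depleted of CpG.
island_like = "CGCGCGCG"
bulk_like = "CATGCATG"
print(cpg_stats(island_like))
print(cpg_stats(bulk_like))
```

A ratio well below 1 corresponds to the underrepresentation of CpG described for bulk vertebrate DNA, whereas a ratio near or above 1 in a G+C-rich stretch is the kind of signal that distinguishes a CpG island. Any numerical cutoff used to call an island would be an additional assumption.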
Since 5-methylcytosine occurs predominantly in the sequence CpG (20), methylation sites in double-stranded DNA are symmetrical. A potential methylation site could therefore exist in three states: unmethylated, hemimethylated, or symmetrically methylated in both strands. Fully methylated sites would be converted to hemimethylated sites as a result of semi-conservative DNA replication (15). DNA methyltransferase enzymes are responsible for the maintenance of methylation patterns (34). These enzymes catalyze the transfer of the methyl group from S-adenosylmethionine (SAM) to the 5-position of specific cytosine residues in DNA. Methyltransferase enzymes recognize hemimethylated DNA (34) and maintain methylation patterns with high fidelity (35). Following replication, the maintenance methylases use the methylation pattern on the template strand to restore symmetry and direct the precise methylation of the newly synthesized DNA strand (15). In addition, methyltransferase enzymes probably mediate de novo methylation by imposing sequence-specific DNA methylation at sites previously not methylated (36,37). De novo methylation has been demonstrated in T-lymphoid cells (38), mouse embryo fibroblast variants (39), viral systems (40,41), and during the establishment of immortal murine cell lines (42). Such alterations in methylation patterns may represent regulatory signals which, directly or indirectly, elicit changes in gene expression.
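The logic of maintenance methylation described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a description of the enzyme's actual kinetics: each symmetrical CpG site is represented by a boolean per strand, replication pairs the parental strand with a new unmethylated strand, and the maintenance step is assumed to act with perfect fidelity. The function names are hypothetical.

```python
def replicate(parental_strand):
    """Semi-conservative replication of one daughter duplex.

    The parental strand keeps its methylation marks; the newly
    synthesized strand starts with no methylated sites.
    """
    nascent_strand = [False] * len(parental_strand)
    return list(parental_strand), nascent_strand


def site_state(template, nascent):
    """Classify one symmetrical CpG site by the marks on its two strands."""
    if template and nascent:
        return "methylated"
    if template or nascent:
        return "hemimethylated"
    return "unmethylated"


def maintain(template_strand, nascent_strand):
    """Idealized maintenance methylation.

    The new strand is methylated only where the template strand is
    methylated, so hemimethylated sites become fully methylated and
    unmethylated sites stay unmethylated.
    """
    return [n or t for t, n in zip(template_strand, nascent_strand)]


# Hypothetical pattern over four CpG sites (True = methylated).
pattern = [True, False, True, True]
template, nascent = replicate(pattern)
print([site_state(t, n) for t, n in zip(template, nascent)])
print(maintain(template, nascent) == pattern)
```

In this idealized picture the pattern on the template strand is copied exactly, which is why a methylation pattern can be inherited through cell division. De novo methylation would correspond to setting a site to methylated without any mark on the template, which this sketch deliberately leaves out.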
Eukaryotic DNA methyltransferase has been partially purified and characterized by several investigators (34)(43)(44)(45)(46). However, controversy still exists as to the existence of two distinct enzymes, since the de novo and maintenance methyltransferase activities do co-purify (26,44,46). Initially, the sequence specificities of the two enzymes were thought to be different (35,47), but recently Bolden et al. suggested that both methyltransferase reactions are catalyzed by the same enzyme, as the maintenance and de novo methyltransferases share the same substrate specificities (48). It has also been postulated that DNA methyltransferase associated with the nuclear matrix is predominantly responsible for the maintenance of inheritable methylation patterns (49,50), but does also possess some de novo methylation activity (49).
The methyltransferase activity is inducible by mitogenic stimuli in a cell cycle-dependent manner that ensures conservation of methylation patterns (51). Moreover, Bestor and Ingram reported that Friend erythroleukemia cells contained three distinct methyltransferase enzymes whose relative amounts depended on the proliferative state of the cells (52). Using methyltransferase purified from these cells, Bestor also demonstrated that the sequence specificity of the enzyme was conditional on the amount of supercoiling or the physical conformation of the DNA substrate (53).
DNA Methylation and Protein-DNA Interactions
5-Methylcytosine affects protein-DNA interactions in prokaryotes (25), so methylation of specific cytosine residues in eukaryotes may also alter the binding of regulatory or transcriptional factors. However, recent experiments in our laboratory have shown no direct effect of cytosine methylation on the binding of the transcription factor Sp1 to its recognition sequence (54). Similar results have been obtained by Doerfler's laboratory for the adenovirus gene (55), so that the methylation signal in these genes is presumably transduced by more indirect mechanisms such as altered chromatin configuration. Alternatively, the effect of cytosine methylation on protein-DNA interactions may be site- and protein-specific, since the modification does prevent protein binding to some but not all upstream sequences of the rat tyrosine aminotransferase gene (56).
DNA associated with nucleosomes is significantly more methylated than DNA in spacer regions between nucleosome cores (21,57,58). Moreover, at least 80% of 5-methylcytosine in chromatin is nonrandomly packaged into nucleosomes that contain histone H1 (59). DNA methylation may therefore inhibit gene expression via conformational changes in chromatin. Consistent with this theory are microinjection experiments that revealed that chromatin formation mediated the inhibitory effect of DNA methylation on transcription of the thymidine kinase gene (60). The expression of the thymidine kinase gene was not blocked by DNA methylation per se, but by the formation of chromatin consisting of the methylated gene reconstituted with histones. Indeed, 5-methylcytosine is involved in the maintenance of X-chromosome inactivation in somatic cells, and this function may be achieved through condensation of chromosome structure. It remains unclear whether DNA methylation is a cause or an effect of chromosome compaction. Inactivation of the X-chromosome precedes methylation of the mouse HPRT gene (61) and may therefore stabilize the inactive configuration rather than induce it directly. In contrast, Keshet et al. transfected methylated and unmethylated M13 constructs into mouse L cells and found that the sequences integrated into the genome, but only the methylated sequences assumed the chromatin conformation characteristic of inactive genes (62). Irrespective of the temporal events, methylation probably does play a role in the selective protein-DNA interactions that maintain chromatin conformation. Since methylation patterns are faithfully inherited, the chromatin structure associated with genetic repression can be stably propagated.

Methylation Inhibitors 5-Azacytidine and 5-Azadeoxycytidine
Studies using the methylation inhibitors 5-azacytidine (5-aza-CR) and 5-aza-2'-deoxycytidine (5-aza-CdR) have provided additional evidence for the role of methylation in gene repression (63). These nucleoside analogs were originally developed as cancer chemotherapeutic agents (64) but have now received wide attention as chemical activators of suppressed genetic information (65). 5-Aza-CR selectively activates eukaryotic gene expression and induces dramatic alterations in the differentiated state of certain eukaryotic cells.
5-Aza-CR and 5-aza-CdR are cytidine analogs with a nitrogen instead of a carbon atom at position 5 of the pyrimidine ring. Both drugs are phosphorylated intracellularly (64,66), and 5-aza-CR is incorporated into RNA and DNA, whereas 5-aza-CdR is only found in DNA (67). Once incorporated into DNA, the azanucleoside ring cannot be methylated and the analogs are thought to mediate their remarkable biological effects via inhibition of DNA methylation (68).
Mechanism of Inhibition of DNA Methylation by 5-Azacytidine
5-Aza-CR and 5-aza-CdR were shown to inhibit the methylation of the newly synthesized strand of DNA in C3H 10T1/2 Cl 8 cells (68) and in L1210 cells (69). Inhibition of the cytosine modification occurred only after incorporation of the fraudulent base into DNA, and the extent of inhibition was dependent on the drug concentration (68,70). However, the demethylation was far more extensive than the level of substitution alone would predict, as the substitution of only 5% of cytosine residues by 5-azacytosine resulted in a greater than 80% decrease in methylation (71). In other words, the presence of 5-azacytosine at a modification site strongly inhibited the methylation of cytosine residues at unsubstituted sites.
Drahovsky and Morris (72) and Panaka et al. (70) showed that the DNA methyltransferase was a processive enzyme that remained associated with DNA while scanning for available hemimethylated sites. Thus, the presence of 5-azacytosine in duplex, hemimethylated DNA was proposed to interfere with the progress and functioning of the enzyme along DNA (71). The incorporation of low amounts of 5-azacytosine into DNA could therefore inactivate the methyltransferases and result in drastic inhibition of methylation at sites downstream from the trapped enzyme. This would lead to the considerable decrease in DNA methylation observed in daughter cells.
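The way in which a low level of substitution could produce a disproportionate loss of methylation can be illustrated with a simple calculation. The sketch below is an illustrative assumption and not a model taken from the cited studies: the enzyme is imagined to move along a run of sites, methylating each unsubstituted site, and to be trapped permanently at the first 5-azacytosine it meets. The run length is a hypothetical parameter with no experimental basis here.

```python
def fraction_methylated(p_aza, run_length):
    """Expected fraction of sites methylated in one processive run.

    Assumptions (illustrative only):
      * each site carries 5-azacytosine independently with probability p_aza;
      * the enzyme starts at site 1 and is trapped at the first substituted
        site, leaving every downstream site unmethylated.
    Site i is methylated only if sites 1..i are all unsubstituted, which
    happens with probability (1 - p_aza) ** i.
    """
    q = 1.0 - p_aza
    expected_sites = sum(q ** i for i in range(1, run_length + 1))
    return expected_sites / run_length


# 5% substitution, with a hypothetical run of 100 sites per enzyme molecule.
print(fraction_methylated(0.05, 100))
# Without any substitution every site in the run is methylated.
print(fraction_methylated(0.0, 100))
```

With 5% substitution and a run of 100 sites, this toy model leaves roughly one fifth of the sites methylated, which is of the same order as the greater than 80% decrease reported. The agreement depends entirely on the assumed run length, so it shows only that enzyme trapping is a plausible explanation, not that it is the correct one.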
Studies that demonstrated a drastic decrease in extractable DNA methyltransferase activity after treatment with 5-aza-CdR implied that the enzyme formed a tight complex with 5-azacytosine and could not subsequently be extracted (70,71,73). Treatment of cells with 5-aza-CR resulted in the rapid time- and dose-dependent formation of a tight-binding complex that could not be dissociated with 0.3 M NaCl (71,73). In fact, the methyltransferase inhibition was irreversible and required new protein synthesis for recovery of active enzyme and restoration of 5-methylcytosine levels (71).
Santi et al. have proposed a mechanism to explain the irreversible inhibition of the DNA methyltransferase enzyme (74). This model, based on similarities to the thymidylate synthetase enzyme, proposes a nucleophilic attack by an S-H group at the active site of the enzyme onto the 6 position of the pyrimidine ring. Normally the covalent bond is broken following addition of the methyl group to the 5 position. Since the addition of the methyl group is blocked, the enzyme-DNA intermediate is stabilized. Consistent with this mechanism is the fact that methyltransferase inhibition can be mimicked by other 5-substituted analogs such as pseudoisocytidine and 5-fluoro-2'-deoxycytidine (71).
The incorporation of 5-azacytosine into DNA also increases the formation of stable protein-DNA complexes with other nuclear proteins (75,76). These tight-binding complexes between specific nonhistone proteins and DNA occur at hemimethylated sites created by the incorporation of 5-azacytosine into DNA during one replication cycle (76). Such interactions may involve regulatory protein factors as well as enzymes associated with DNA. In this way, the perturbations of DNA-protein interactions caused by 5-azacytosine may interfere with normal cellular functioning and thus be responsible for some of the profound changes in gene expression elicited by the drug.

Altered Phenotypes Induced by 5-Azanucleosides
The pathway of differentiation initially involves a stage of determination when precursor cells become determined to a specific cell lineage. Subsequently, appropriate stimuli induce these lineage-determined cells to enter the commitment phase of development and form the end-stage phenotype. Cells primed to differentiate as a result of 5-aza-CR treatment provide an excellent system for analysis of the molecular events controlling the specialization of cellular phenotypes. 5-Aza-CR treatment is thought to trigger the determination of cells, and the presence of the drug is not essential for promotion of the commitment phase. Rather, commitment may be under hormonal (77) or extracellular matrix influences (78).
The dramatic effects of these hypomethylating agents on cell determination have been widely demonstrated in mouse embryo fibroblast cell lines (79). 5-Aza-CR induces the formation of contractile, striated muscle cells in C3H 10T1/2 Cl 8 mouse fibroblasts 10 to 14 days after a single dose of the analog (80,81). Muscle cells were never seen in untreated 10T1/2 cultures. The altered phenotype was heritable since the myocytes were stable and could be propagated in the absence of further drug treatment (82). In addition, functionally differentiated adipocytes and chondrocytes also emerged after 5-aza-CR treatment of 10T1/2 cells, so that alterations in differentiation were not restricted to the muscle phenotype (83). Phenotypic changes were not even confined to the developmental lineage of the treated precursor cell since epithelial cells were induced from 5-aza-CR-treated teratocarcinoma-derived mesenchymal cells (84). The differentiation of nonmesenchymal cells is also dramatically altered by exposure to 5-aza-CR, as exemplified by various leukemia cell differentiation models (73,85,86). Therefore, the effects of these analogs on differentiation are quite general as they have been confirmed with diverse lineages in various cell lines from several different species.
It seems likely that the profound effects of 5-azanucleosides on differentiation may be due to the activation of one or a few determination loci whose subsequent expression defines the phenotype (82). In the muscle system this determination gene is probably replicated early in S phase since it can be activated by 5-aza-CR in a 5 min exposure of S phase-synchronized 10T1/2 cells (87). Exciting experiments performed by Lassar et al. have substantiated this theory (88). DNA from 5-aza-CR-derived myoblasts was transfected into normal 10T1/2 cells and resulted in the emergence of myoblasts at a frequency expected for the transfer of only a few demethylated loci. Characterization of this putative determination gene will be particularly interesting and informative for elucidation of the mechanisms of cell determination and commitment.

Selective Gene Activation by 5-Azacytidine and 5-Azadeoxycytidine
The phenotypic changes invoked by the methylation inhibitors presumably require the concerted activation of many specific genes. One intriguing feature of both analogs is that they selectively activate genes instead of causing genome-wide random derepression.

X-Chromosome Reactivation
DNA methylation is thought to be involved in the inactivation of one of the two X chromosomes in somatic cells of mammalian females (89,90). Considerable excitement was generated when experiments proved that 5-aza-CR treatment reactivated the expression of genes located on inactive mouse X chromosomes. Genes such as hypoxanthine guanine phosphoribosyl transferase (HPRT) (91) and glucose-6-phosphate dehydrogenase (92) were induced after drug treatment. The changes caused by 5-aza-CR treatment are mediated at the level of DNA structure and are heritable in the absence of further analog treatment. This fact was demonstrated by the ability of DNA extracted from hybrid cell lines containing 5-aza-CR-reactivated X chromosomes to restore enzyme activity to HPRT⁻ recipient cells (93,94). Taken together, these studies provide strong evidence that DNA methylation mediates chromatin structure and X chromosome inactivation, as 5-aza-CR induces hypomethylation, chromosome decondensation (95), and gene activation.
Autosomal Gene Activation
Table 1 shows that 5-aza-CR stimulates the expression of suppressed genes within a wide variety of cell types. One of the most dramatic 5-aza-CR-mediated gene inductions was an increase of up to 10^6-fold in thymidine kinase expression reported by Harris (96). Gene reactivation was observed in as many as 10 to 30% of the surviving cells, which is several orders of magnitude higher than that expected for a mutagenic agent. 5-Aza-CR is in fact not measurably mutagenic in mammalian cells (9?). This finding suggests that the absence of expression of some housekeeping genes can, in many cases, be attributed to altered, suppressive methylation patterns rather than to classical mutations. 5-Aza-CR treatment can demethylate those sites important for regulation of gene activity so that the genes can be reexpressed.
It should be emphasized that some genes are not directly induced by 5-aza-CR treatment; exposure to the analog is permissive but not sufficient for activation. 5-Aza-CR treatment of mouse thymoma cells does not result in metallothionein gene expression unless a secondary stimulus (e.g., heavy metal or steroid) is applied (102), and no response is obtained with either individual stimulus. A two-stage mechanism of gene activation has also been observed for globin gene expression in chickens, with sodium butyrate as the secondary stimulus (122), and  (121) in mouse-human hybrids responding to hexamethylenebis-acetamide after 5-aza-CR exposure (123). 5-Aza-CR treatment failed to activate adenine phosphoribosyl transferase (124), or c-mos in 10T1/2 cells (125) and did not enhance α-fetoprotein expression in differentiating F9 cells (126). These experiments suggest that 5-aza-CR action renders some genes poised for transcription but that other trans-acting factors are necessary for expression of the gene. The chromatin structure associated with 5-aza-CR-induced hypomethylation may represent the potentially active conformation necessary but, in the absence of trans-acting regulators, is not sufficient for gene expression. This theory would be consistent with a role of methylation in chromatin compaction and the consequential alteration of this structure by the methylation inhibitors. In this way, methylation may be only one level of the complex network system that regulates eukaryotic gene expression.

Viral Gene Activation
The expression of many different endogenous and exogenous viruses has been induced by 5-aza-CR treatment of various cell types from several species (Table 2). Early studies revealed that 5-aza-CR induced the expression of Rous sarcoma virus from hamster cells (127). Treatment of chicken cells with 5-aza-CR caused hypomethylation and transcriptional activation of the ev-1 endogenous retroviral locus (128). Drug-induced activation of viruses in human (129,130), murine (131)(132)(133)(134)(135)(136), and rat systems (137,138) has also been reported. In most cases, the inactive proviral genomes were hypermethylated and exhibited decreased 5-methylcytosine levels once activated by 5-aza-CR treatment. These viral systems will be extremely useful for analysis of the mechanisms of viral suppression in eukaryotic cells.

De Novo DNA Methylation
De novo DNA methylation is a characteristic of early embryonic cells and may be responsible for the inactivation of genes during development (41). De novo methyltransferase activity is thought to facilitate the repression of genes that are expressed during gametogenesis but that are not required after fertilization. De novo methylation could also function to correct demethylation errors introduced by the maintenance methylase in germ cells.
Studies using viral infection of mouse cells demonstrated that de novo methylation is active in pluripotent cells of preimplantation mouse embryos, but not in postimplantation or newborn mice (41). De novo methylation of the exogenous sequences correlated with transcriptional inactivity. Repression of this type in genes not needed early in development may therefore be essential for maintenance of the pluripotential state of cells.
Groudine and Conklin proposed that de novo methylation of DNA during spermatogenesis would template those genes not immediately needed in the embryo (37).
Conversely, genes protected from de novo methylation at specific point sites would apparently be constitutively expressed in the embryo. Protein factors or local chromatin structure may inhibit the procession of the methyltransferase enzyme along specific DNA sequences, thus protecting certain sites from becoming methylated. The hypomethylated state would ensure that these embryonic genes remained in a transcriptionally competent configuration, thus permitting selective expression of certain paternal genes early in development. Temporal and regional genomic demethylation and progressive remethylation occur during embryogenesis (27,139). Embryonic and extraembryonic lineages are independently methylated, resulting in the observed tissue-specific patterns of cytosine modification. Methylation may therefore be associated with programming of cell lineage determination in mammalian development.

DNA Methylation and Parental Imprinting
Clear evidence exists that paternal and maternal genomes exert different functions during embryogenesis in the mouse (140). Development requires the concerted contribution of both parental genomes, but distinct functions depend on the parental origin of the genetic information.
For example, the paternal genome is largely responsible for development of the mouse extraembryonic tissues, whereas the maternal genome is more important for embryonic development (141,142).
Parental imprinting is the molecular mechanism that determines the differential expression of maternal and paternal genomes during embryogenesis and defines the functional nonequivalence of the parental genomes. This template information must be stable and heritable, must persist during cell division, and must be capable of affecting gene expression. For these very criteria, DNA methylation has become an attractive model for the epigenetic programming of parental genomes (143).
The role of DNA methylation in genomic imprinting has been substantiated by very elegant experiments using transgenic mice (12)(13)(14). In each case, the investigators followed the methylation pattern of an autosomal transgene that was randomly integrated into the mouse genome. Since the transgene locus would only be transmitted to 50% of the progeny, appropriate crosses would distinguish between maternal or paternal inheritance of the transgene, and correlations could be made between the methylation status and gametogenic history of the gene. The studies showed that the methylation pattern of the exogenous DNA sequence could be switched between maternal and paternal patterns depending on the gamete of origin in successive generations.
Swain et al. (14) ingeniously demonstrated that the methylation pattern, as well as the potential for expression of a transgene, were governed by the parental origin of the RSV-myc fusion transgene. The methylation pattern of the transgene was differentially imprinted during gametogenesis in the parent or very early in embryogenesis of the offspring. If the transgene was inherited from a paternal source, expression was detected in offspring heart tissue only, whereas inheritance from the maternal genome precluded transgene expression. The expressed transgene was relatively undermethylated when inherited from the male parent, whereas the transgenic allele of maternal origin was more methylated and not expressed. Autosomal gene expression was thus influenced by the sex of the parent that transmitted the gene. Furthermore, hypomethylation was necessary but not sufficient for expression of the transgene in tissues other than the heart.
These studies provide strong evidence linking methylation with parental imprinting of certain genes. On a global level, sperm DNA is more methylated than oocyte DNA, but imprinting could be achieved by differential modulation of methylation at specific domains in the gamete DNA. Future studies will elucidate whether DNA methylation is a primary signal in imprinting or whether it is a consequence of other chromosomal modifications.
This work was supported by Public Health Service grant CA 39913 from the National Cancer Institute and by the California Foundation for Biochemical Research.