Role of RNA modifications in brain and behavior
Abstract
Much progress in our understanding of RNA metabolism has been made since the first RNA nucleoside modification was identified in 1957. Many of these modifications are found in noncoding RNAs but recent interest has focused on coding RNAs. Here, we summarize current knowledge of cellular consequences of RNA modifications, with a special emphasis on neuropsychiatric disorders. We present evidence for the existence of an “RNA code,” similar to the histone code, that fine-tunes gene expression in the nervous system by using combinations of different RNA modifications. Unlike the relatively stable genetic code, this combinatorial RNA epigenetic code, or epitranscriptome, may be dynamically reprogrammed as a cause or consequence of psychiatric disorders. We discuss potential mechanisms linking dysregulation of the epitranscriptome with brain disorders and identify potential new avenues of research.
1 |. INTRODUCTION
RNA encodes and decodes information essential to organismal survival. Cellular organisms use messenger RNA (mRNA) to direct protein synthesis, a universal function performed by ribosomes. Transfer RNA (tRNA) delivers amino acids to the ribosome, where ribosomal RNA (rRNA), acting as a ribozyme, links amino acids together to form proteins. As multicellular organisms developed ever greater complexity in parallel with expanded functions of RNAs, the need for finely regulating RNA function arose. One mechanism for regulating RNA is direct chemical modification. Here, we survey RNA modifications and their functional and structural consequences in the context of brain and behavior.
It is firmly established that the 4 primary nucleosides in DNA are extensively modified with wide-ranging, complex functional cellular effects.1 Since the discovery of pseudouridine in tRNA 60 years ago,2 more than 100 RNA modifications have been discovered, far surpassing the naturally occurring modifications found in DNA. Recent studies provide mounting evidence that functional effects of RNA modifications are as complex as those of DNA and chromatin. Furthermore, the DNA and chromatin epigenetic codes are modifiable in ways that the genetic code is not, and via enzymatic “writers” and “erasers,” RNA modifications constitute an additional and more highly dynamic layer of epigenetic regulatory control (Figure 1). Early RNA modification studies focused on the abundant noncoding RNAs (ncRNAs). These studies showed critical roles for RNA modifications in translation and splicing. For instance, the tRNA modification N1-methyladenosine (m1A) stabilizes tRNA structure and affects translation by regulating associations between tRNA and polysomes.3 Pseudouridine (Ψ) in snRNA can fine-tune mRNA splicing, whereas Ψ in rRNA regulates internal ribosome entry site (IRES) usage, which ensures translational fidelity.4 5-Methylcytidine (m5C) in tRNA maintains the anticodon stem-loop conformation.4 One of the most extensively characterized RNA modifications is the 7-methylguanosine located at the 5’ terminus of mRNA and some long ncRNAs (lncRNAs), the so-called 5’ cap, which has important functions in RNA stability and translation.
Figure 1. A historical timeline of RNA modification discoveries. Important milestones related to posttranscriptional modification, splicing and editing of RNA. Colored circles along the horizontal axis correspond to periods of time during which discoveries were made. The vertical axis indicates the number of RNA modifications discovered.
Recently, many new and chemically diverse modifications of mRNA, including N6-methyladenosine (m6A), 5-methylcytosine (m5C), 5-hydroxymethylcytosine (hm5C), inosine (I), pseudouridine (Ψ) and N1-methyladenosine (m1A), have been identified. Both mRNA and lncRNA modifications have been mapped across the entire transcriptome using high-throughput sequencing strategies. One of the best-studied RNA modifications, m6A, was actually discovered several decades ago, but its functions remained largely unknown until recently. Mapping the distribution of m6A across the entire transcriptome of many different cell types has revealed a critical role for this RNA modification in RNA stability, translation, splicing and secondary structure. Recent studies have uncovered important roles for other mRNA modifications, including Ψ-mediated translational read-through, m1A-associated translational regulation and inosine-induced recoding; these functions have often been informed by transcriptome-wide maps. Linking RNA modifications with effects on transcription and translation is now known as the study of “epitranscriptomics.”
2 |. RNA MODIFICATIONS
2.1 |. N6-methyladenosine (m6A)
Since the observation that the FTO (fat mass and obesity-associated) protein functions as an m6A demethylase enzyme, the functional role of m6A RNA modifications and their potential contribution to disease etiology have attracted much attention.5–7 Methylated adenosine accounts for 0.2% of the total RNA transcripts, which equates roughly to a frequency of 1 m6A per 2000 nucleotides.8 This means that on average, 1 or 2 methylated adenosines (m6A) are present in each mammalian mRNA transcript. m6A is primarily located in the vicinity of stop codons and within long exons, but not near start codons,5,9,10 in line with its function of regulating mRNA half-life.11 Also, m6A alters RNA folding and structure, and participates in the maturation of RNA through 5’-capping, polyadenylation and splicing.12 In addition, m6A methylation facilitates appropriate cellular localization and nuclear export of mRNA.13
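The density figures quoted above can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only and rests on assumptions that are not stated in the text: the 0.2% figure is read as the fraction of adenosines that are methylated, adenosine is taken to make up about a quarter of mRNA nucleotides, and a typical mammalian mRNA is taken to be 2000 to 4000 nucleotides long. The function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the m6A density quoted in the text.
# ASSUMPTIONS (not from the source): 0.2% is the fraction of adenosines that
# are methylated, adenosine is ~25% of mRNA nucleotides, and a typical
# mammalian mRNA is 2000-4000 nt long.

M6A_FRACTION_OF_A = 0.002  # assumed: 0.2% of adenosines carry m6A
A_FRACTION = 0.25          # assumed: share of adenosine among all nucleotides


def nucleotides_per_m6a(m6a_fraction_of_a: float, a_fraction: float) -> float:
    """Average number of nucleotides per m6A site."""
    return 1.0 / (m6a_fraction_of_a * a_fraction)


def expected_sites(transcript_length_nt: int) -> float:
    """Expected number of m6A sites in a transcript of the given length."""
    return transcript_length_nt / nucleotides_per_m6a(M6A_FRACTION_OF_A, A_FRACTION)


if __name__ == "__main__":
    print("nt per m6A:", nucleotides_per_m6a(M6A_FRACTION_OF_A, A_FRACTION))
    for length in (2000, 4000):
        print(length, "nt transcript ->", expected_sites(length), "expected sites")
```

Under these assumptions the calculation gives about one m6A per 2000 nucleotides and 1 to 2 sites for transcripts in the assumed length range, consistent with the figures in the text. Changing either assumed fraction changes the result proportionally.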
2.2 |. 1-Methyladenosine (m1A)
The RNA modification m1A was discovered many years ago14 and is prevalent in rRNA and tRNA, where it maintains tertiary structure and affects translation.15,16 mRNA molecules have also been found to contain m1A, although its function there is less well understood. m1A is found in all regions of transcripts, including the coding sequence (CDS) and the 5’ and 3’ untranslated regions (UTRs).17,18 m1A can increase translation of mRNA by lowering binding of release factor, or decrease translation by disrupting RNA folding around the translation initiation site.5
2.3 |. 5-Methylcytosine (m5C)
5-Methylcytosine (m5C) is commonly found in DNA but can also be found in RNA.19 m5C is mainly found in UTRs of mRNA and near binding sites for Argonaute, part of the RNA degradation machinery.20,21 Similar to m6A, m5C modification is also dynamic, but unlike m6A, m5C is not removed but is oxidized to 5-hydroxymethylcytidine (hm5C).22–25 hm5C tends to be found in polyribosomes and is associated with increased translation efficiency in Drosophila.26
2.4 |. Pseudouridine (Ψ)
Pseudouridine (Ψ) was the first RNA modification identified and was mistakenly thought to be the fifth nucleotide.27,28 Ψ is an isomer of uridine that does not affect Watson-Crick base pairing.29 Ψ is generated by 2 mechanisms: one guided by H/ACA box snoRNAs30 and one carried out by pseudouridine synthases (PUS).31,32 Ψ is especially abundant in tRNA and rRNA33,34 but is also found in snRNA. Recently, Ψ was mapped in eukaryotic mRNA with next-generation sequencing technology.7,35,36 Ψ formation is initiated at the cellular level by environmental cues and is thought to be an irreversible modification of RNA.37 Mapping of Ψ-modified RNA in conjunction with other functional studies has identified roles for Ψ in mRNA stabilization, intracellular transcript localization38 and translation termination.39
2.5 |. Queuosine (Q)
Queuosine (Q) is a 7-deazaguanosine nucleoside40 that is enzymatically added to specific tRNAs.15,41,42 In eukaryotes, queuosine production begins with queuine base obtained from the diet or produced by the microbiome. The queuine base is then modified and added to RNA posttranscriptionally by tRNA-guanine transglycosylase (TGTase).43 Genetic disruption of mouse TGTase impairs the ability to produce tyrosine from phenylalanine,44 with the potential to influence production of monoamine neurotransmitters such as dopamine.45 In a mouse model of multiple sclerosis, there is recent evidence to suggest that queuine incorporation in tRNA contributes to the pervasive encephalomyelitis seen in this disease.46 Modification of tRNA induced remission of multiple sclerosis in an animal model.46 Since a large fraction of queuine base is produced in the gut but utilized in the brain, queuine-modified RNA may play a role in gut-brain signaling pathways. Future studies may be able to shed light on this intriguing possibility.
2.6 |. RNA editing
RNA editing was first identified as a mismatch between the RNA sequence of the transcript of mitochondrial oxidase 2 and the corresponding DNA sequence.47 The most prevalent substitution is adenosine to inosine.48 This substitution leads to I-U mismatches, and because the translational machinery reads inosine as guanosine, the result is an effective A-to-G change.49 mRNA editing can dramatically alter the properties of the translated protein. One example is the AMPA (alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid) glutamate receptor subunit GluR2. In this case, mRNA editing converts a glutamine to an arginine, which subsequently changes AMPA receptor calcium permeability.50 Another example is the serotonin receptor 5-HTR2C, in which a total of 5 edited positions that dramatically alter G-protein coupling and downstream signaling events have been identified.51
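The recoding logic described above can be sketched in a few lines. This is a minimal illustration, not a model of any real transcript: the 9-nucleotide sequence and the helper names are made up, and the codon table is deliberately partial. It only shows how an A-to-I edit in a glutamine codon (CAG) is decoded as an arginine codon (CGG) because inosine is read as guanosine.

```python
# Minimal sketch: inosine (I) is decoded as guanosine (G) during translation.
# The sequence below is hypothetical and the codon table is partial; both are
# illustrative assumptions, not data from the text.

CODON_TABLE = {
    "AUG": "Met",
    "CAG": "Gln", "CAA": "Gln",
    "CGG": "Arg", "CGA": "Arg",
    "UUC": "Phe",
}


def edit_a_to_i(rna: str, position: int) -> str:
    """Return a copy of `rna` with the adenosine at `position` (0-based) edited to inosine."""
    if rna[position] != "A":
        raise ValueError(f"position {position} is {rna[position]!r}, not an adenosine")
    return rna[:position] + "I" + rna[position + 1:]


def decode(rna: str) -> list[str]:
    """Translate codon by codon, reading inosine as guanosine."""
    as_read = rna.replace("I", "G")
    return [CODON_TABLE[as_read[i:i + 3]] for i in range(0, len(as_read) - 2, 3)]


if __name__ == "__main__":
    unedited = "AUGCAGUUC"              # Met-Gln-Phe (hypothetical)
    edited = edit_a_to_i(unedited, 4)   # CAG -> CIG
    print(decode(unedited))  # ['Met', 'Gln', 'Phe']
    print(decode(edited))    # ['Met', 'Arg', 'Phe']
```

A single edited base is enough to change the encoded amino acid, which is why one editing site can alter a property such as channel calcium permeability.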
2.7 |. Circular RNA
Another class of RNA that has received much attention is circularized RNA (circRNA), which consists of single-stranded RNA molecules whose 5’ and 3’ ends are joined.52 Mammalian circRNAs are generated by “backsplicing,” in which the spliceosome joins the 3’ end of an exon with an upstream 5’ end from the same transcript.53–56 Of particular interest for the present review, circRNAs are found in high abundance in the major brain areas. In neurons, the highest level of circRNAs is found in synaptosomes.57,58 circRNAs are significantly more stable than linear RNA species and therefore have much longer half-lives.59 Very recent data suggest that circRNAs have roles quite different from those of linear RNAs. For example, the circRNA Cdr1as is expressed exclusively in the brain and has more than 70 binding sites for miR-7,59,60 which in turn regulates many other genes in the brain.61–63 Cdr1as may act like a sponge for miR-7,59,60 and animals with a genetic disruption of Cdr1as had deficits in neurotransmission.64
2.8 |. RNA modification cross-talk and the RNA code
The possibility of having multiple RNA modifications on individual RNA molecules suggests the potential for an exquisite level of functional control. Combinations of irreversible and reversible RNA modifications, coupled with the ability to modify all 4 primary ribonucleosides, and the chance to modify tRNA, rRNA, mRNA, lncRNA, snoRNA and other RNA species, perhaps preferentially, would generate an almost infinite combination of RNAs. Evidence for the existence of such complexity is accumulating. For example, m6A mapping identified multiple m6A-containing circRNA species,65 many of which exhibit cell-specific expression patterns. Interestingly, m6A modifications in circRNAs are written and read by the same proteins that perform these functions in mRNAs, but the pattern and location of m6A in circRNAs are completely different from those in mRNAs. Another example of the complexity of the RNA code is found in yeast, where queuine modifications in tRNAs enhance the catalytic activity of the 5-methylcytosine enzyme DNMT2, which subsequently methylates other tRNAs.66 As discussed below, many more discoveries will be needed to crack the RNA code, but these will have to await the development of new single-molecule sequencing technologies.
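To give a rough sense of how quickly the combinatorial space grows, the following sketch counts the modification states of a single RNA molecule. It assumes, as a deliberate simplification not made in the text, that each modifiable site is independent and is either unmodified or carries exactly one of a fixed number of marks. In reality each mark is restricted to particular nucleosides and sites may influence one another. The function name is hypothetical.

```python
# Illustrative count of combinatorial modification states on one RNA molecule.
# ASSUMPTION: every site is independent and is either unmodified or carries
# exactly one of `n_marks` possible marks (a simplification for illustration).

def modification_states(n_sites: int, n_marks: int) -> int:
    """Number of distinct modification patterns for one molecule."""
    if n_sites < 0 or n_marks < 0:
        raise ValueError("n_sites and n_marks must be non-negative")
    return (n_marks + 1) ** n_sites


if __name__ == "__main__":
    print(modification_states(2, 1))    # 4
    print(modification_states(10, 2))   # 59049
    print(modification_states(20, 1))   # 1048576
```

Even a single mark at 20 candidate sites yields over a million possible patterns per molecule, which illustrates why reading several modifications on the same molecule matters for decoding such a code.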
3 |. WRITERS, READERS AND ERASERS
As we have learned in the previous section, a variety of RNA modifications significantly alter the life cycle of RNA and thereby dynamically regulate the transcriptome. Although RNA modifications are of diverse types, and the molecular machinery differs for these types, the proteins that catalyze RNA modification and that recognize and sometimes undo RNA modifications can be classified as writers, readers and erasers. In the following sections, we focus primarily on the enzymatic machinery that adds, reads and erases the well-characterized mRNA m6A modification (Figure 2).
Figure 2. m6A RNA modification is mediated by writers, readers and erasers. Mechanisms of RNA modification (writing), recognition and erasure are highly conserved. The methyltransferase complex is composed of multiple proteins and methylates the mRNA. Methylation occurs simultaneously with transcription and is recognized by the reader protein, which, in turn, recruits other proteins such as the splicing machinery. The reader may also protect against demethylation by physically blocking access of the demethylase enzyme, or the m6A methylation mark may be erased by a demethylase.
3.1 |. Writers
METTL3 and METTL14: N6-methyladenosine (m6A) is incorporated by a methyltransferase complex consisting of METTL3, METTL14, WTAP, KIAA1429 and RBM15.20 The m6A base is recognized by RNA-binding protein readers or may be erased by a demethylase or by RNA turnover. Within this complex, methylation of adenosine in mRNAs is catalyzed by METTL3 (also called MT-A70, MTA and IME4) in cooperation with METTL14.67,68
3.1.1 |. Accessory proteins (WTAP and KIAA1429)
Wilms tumor 1-associated protein (WTAP) associates with Wilms tumor suppressor gene 1 (WT1), which is involved in alternative splicing.69 In plants, the WTAP protein associates with METTL3 and assists in the production of m6A.70 KIAA1429 (also called Virilizer) is another recently identified methylase-associated protein and is thought to be a critical component of the methylase complex.71 The m6A methyltransferase complex components METTL3, METTL14, WTAP, Virilizer and RBM15 are highly conserved across eukaryotes, with the exception of yeast and nematodes.71,72
3.2 |. Readers
The main role of the reader protein is to fine-tune the regulation of methylated transcripts. This may be accomplished by blocking access of writers or erasers to the modified base or by recruiting other RNA-binding proteins to promote chemical modification of the RNA.
3.2.1 |. YTH domain containing protein
The YTH (Yeast Two Hybrid clone 521-b Homologue) domain containing proteins are known as m6A readers because of their high affinity for methylated RNA. There are several known members in this protein family, including YTHDC1/2 and YTHDF1/2/3. Knockdown of YTHDF2 affected mRNA degradation, and m6A binding by YTHDF2 was closely related to mRNA localization and/or decay.11 YTHDF2 also plays a role in conserving methylation around the 5’ UTR. After exposure to external stress, m6A in this region, which can otherwise be demethylated by FTO, is protected by YTHDF2 binding, and the protected m6A methylation then promotes cap-independent translation initiation.22
3.2.2 |. HNRNPC, HNRNPA2B1 and eIF
HNRNPC is an abundant RNA-binding protein, and m6A modulates the binding of HNRNPC to RNA, which affects the alternative splicing of mRNA transcripts.11 A closely related protein, HNRNPG, may also have regulatory functions in RNA metabolism.73 Also, the eukaryotic initiation factor 3 (eIF3) protein binds to m6A near the 5’ UTR, thereby regulating translation initiation.6
3.3 |. Erasers
The m6A mark on RNA can be catalytically reversed to adenosine by the ALKB family of proteins.74 To date, demethylase activity has been reported only in higher eukaryotic organisms and has not been found in lower eukaryotes.
3.3.1 |. FTO (fat mass and obesity-associated protein) and ALKBH5
Knockdown of FTO significantly increased m6A methylation, and overexpression of FTO decreased m6A methylation.75 ALKBH5, like FTO, is a member of the ALKB family with demethylase activity. Similar to FTO, changes in protein levels of ALKBH5 had a significant effect on m6A methylation.74
4 |. BRAIN DISEASE AND NEURONAL BEHAVIOR
RNA modifications are pathoetiologic in specific types of brain cancer.76–78 However, studies of the relationship of RNA modifications to neuropsychiatric disorders are just beginning.
4.1 |. Brain cancer
m6A-modified RNAs play a key role in brain cancer.76–78 A recent study using stem cells derived from glioblastoma multiforme (glioblastoma-derived stem cells; GSCs) found that elevated levels of ALKBH5 are a reliable prognostic indicator of glioblastoma progression.77 This study also identified a functional role for ALKBH5 in the capacity of GSCs for self-renewal.77 Another study reported that miR-29a inhibited the invasion and migration of GSCs. The Quaking (QKI) gene facilitates central nervous system myelination, and when QKI is inhibited by miR-29a there is a strong inhibition of the PI3K/AKT and ERK pathways important for migration and invasion of GSCs.79
4.2 |. Neurodevelopmental and neurodegenerative diseases
FTO, the fat mass and obesity associated gene, which catalyzes N6-methyladenosine demethylation, was originally discovered in Fused Toe (Ft) mutant mice.80,81 FTO plays a critical role during development of the nervous system. Mouse embryos harboring a genetic deletion of the Fto locus show abnormalities of brain patterning, including defective telencephalon and hypothalamus development.80 Genomic variants in the FTO gene are associated with Alzheimer disease,82,83 and expression levels of FTO were significantly lowered in Alzheimer disease.83 Further evidence linking FTO with Alzheimer disease comes from an epidemiological study utilizing meta-analysis.84
It is well-known that tRNA molecules undergo extensive post-transcriptional modifications,85,86 and these modifications may contribute to brain dysfunction. Genetic inactivation of human tRNA methyltransferase 1, which catalyzes dimethylation of guanosines in tRNAs,87 causes cognitive disorder.88 Also, partially inactivating mutations in pseudouridylase 3 (Pus3), which catalyzes isomerization of uridine to Ψ in certain tRNAs, are associated with intellectual disability.89
Recently, a mutation in kinase-associated endopeptidase (KAE1), part of the biosynthetic pathway that generates the tRNA N6-threonyl-carbamoyl-adenosine (t6A) modification, was associated with neurodegenerative disease.90 Nonsyndromic X-linked mental retardation and intellectual disability can be caused by mutations in FtsJ methyltransferase homolog 1 (FTSJ1),55,91 an enzyme that methylates tRNAs.
Many recent studies have identified the elongator complex, a multisubunit protein complex of 6 ELP proteins, as playing an essential role in tRNA uridine modification.92–94 A variant of one of these subunits, ELP2, has been linked to neurodevelopmental disability.88,95 Mutations of ELP4 have been found in atypical rolandic epilepsy patients96 who may have perturbed neuronal migration. Finally, genetic variants of ELP3 have been associated with the progressive motor neuron disease amyotrophic lateral sclerosis (ALS).97
Genetic defects in m5C enzymes are closely associated with neurological disorders. For example, mutations in the m5C RNA methyltransferases Nsun2 (NOP2/Sun domain family, member 2) and Dnmt2 (DNA [cytosine-5-]-methyltransferase 2), which methylate several different tRNAs,98–100 are both associated with nervous system disorders.99 Genetic variants of NSUN2 have also been linked to intellectual disability101,102 such as is seen in Dubowitz-like syndrome.103
mRNA editing is catalyzed by a family of adenosine deaminase enzymes called ADARs.104 However, related adenosine deaminases can also edit tRNA. Heterodimeric adenosine deaminase (hetADAT) catalyzes the conversion of adenosine to inosine in some tRNAs. Mutation of one hetADAT subunit, ADAT3, is found in families with inherited intellectual disability.105
4.3 |. RNA modifications and memory
To date, many studies have shown that DNA and/or histone modifications play an important role in memory formation.106 However, RNA modifications also participate in memory formation. For example, experimentally induced reductions in Fto expression have been shown to enhance contextual fear memory.107 No doubt many other examples await discovery.
4.4 |. RNA modifications in depression
FTO polymorphisms have been found in psychiatric diseases including major depressive disorder (MDD).108 One study found an inverse association between obesity risk and depression in individuals carrying the FTO rs9939609 allele. Another study found an association between MDD and allelic variants of ALKBH5,109 in agreement with data associating m6A-modified mRNAs with anxiety and cognitive disturbances.110,111 Also, serotonergic transmission has long been associated with MDD, suicidal ideation and suicide completion. Modifications of the serotonin receptor 2C (HTR2C) mRNA leading to impaired 5-HTR2C signaling have been detected in the brains of suicide completers.51
4.5 |. RNA modifications in addiction
The combination of environmental factors, such as drug-associated cues, and genetic factors contributes strongly to human behaviors such as drug seeking. The abundant brain expression of FTO and the existence of obesity-associated genetic variants have heightened interest in the role of FTO in relation to food-related cues and reward response.
Imaging studies suggest that insulin sensitivity, along with the contribution of genetic FTO and Taq1A (ANKK1) variants, is related to the reward mechanism mediated by dopamine receptors.112 Other studies have shown an association of FTO and ANKK1 variants with decreased D2 receptor density in the nucleus accumbens.113
In mice, deletion of FTO in D2 type neurons weakened conductance of G-protein-coupled inwardly rectifying potassium (GIRK) channels after acute cocaine treatment.114 Sequencing of immunoprecipitated m6A-modified RNA showed increased m6A methylation in many transcripts related to dopamine signaling in FTO-deficient mice and confirmed altered expression levels of the corresponding proteins.114 Together, these studies suggest that epitranscriptome changes in RNA methylation initiate and/or potentiate the addiction cycle.
5 |. FUTURE DIRECTIONS
Much of the new information about the functional roles of RNA modifications that has been generated over the last decade has relied on novel RNA-sequencing technologies, as exemplified by the transcriptome-wide m6A maps.115 Functional studies based on features uncovered by these maps have been directly linked to molecular changes affecting mRNA splicing, export, translation, stability, structure and mRNA biogenesis.
To achieve a deeper understanding of functional roles played by RNA modifications, new technologies are required. Current sequencing technologies are predominantly antibody-based and thus do not provide a direct readout of RNA modifications. The m6A-specific antibodies, for example, are known to have intrinsic bias for certain RNA sequences and secondary structures and cannot discriminate m6A from other modified ribonucleosides.116 Additionally, antibody-based RNA-sequencing approaches require a priori knowledge of the modified ribonucleotide, thereby preventing their use in the discovery of new RNA modifications.
The biological functions of dynamic chemical modifications of mRNAs such as m5C, Ψ, hm5C and m1A are poorly understood due to the lack of optimized detection methodologies. For m5C detection, RNA bisulfite sequencing can detect endogenous m5C sites at single nucleotide resolution. However, extremely high numbers of sequencing reads are required for accurate base calling making this a prohibitively expensive approach. Also, bisulfite treatment causes significant RNA degradation which may alter transcript representation after next-generation sequencing. Incomplete bisulfite conversion of cytosines may introduce bias, and other RNA modifications may also be converted to produce false positives. More sensitive and accurate m5C detection methods need to be developed in the future.
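The bisulfite logic described above can be sketched as a simple site caller. Bisulfite treatment converts unmodified cytosine to uracil, which is read as T after reverse transcription, whereas m5C is protected and still reads as C. The code below is a toy illustration under stated assumptions, not an established pipeline: reads are given as pre-aligned, ungapped (start, sequence) pairs in the cDNA alphabet, and the function name, coverage cutoff and retention threshold are all hypothetical choices.

```python
# Toy m5C caller for bisulfite-treated RNA (illustrative assumptions only).
# Unconverted C at a reference C position is counted as evidence for m5C.
from collections import Counter


def call_m5c_sites(reference, reads, min_coverage=3, min_retained=0.8):
    """Return 0-based reference C positions whose C-retention rate passes the threshold.

    reference: reference sequence in the cDNA alphabet (A, C, G, T)
    reads: iterable of (start, sequence) ungapped alignments to the reference
    min_coverage, min_retained: hypothetical cutoffs chosen for illustration
    """
    covered = Counter()   # informative reads (C or T) per reference C position
    retained = Counter()  # reads still showing C (not converted) per position
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base in ("C", "T"):
                covered[pos] += 1
                if base == "C":
                    retained[pos] += 1
    return sorted(
        pos for pos, depth in covered.items()
        if depth >= min_coverage and retained[pos] / depth >= min_retained
    )


if __name__ == "__main__":
    reference = "ACGTCA"  # hypothetical; C at positions 1 and 4
    reads = [
        (0, "ACGTTA"), (0, "ACGTTA"), (0, "ACGTTA"),
        (1, "CGTT"),
        (0, "ACGTCA"),  # one read with an unconverted C at position 4
    ]
    print(call_m5c_sites(reference, reads))  # [1]
```

In this example position 1 retains C in every read and is called, while position 4 retains C in only one of five reads and is treated as incomplete conversion. The dependence on coverage at every cytosine is one reason the approach needs very deep sequencing.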
For Ψ detection, current sequencing approaches have achieved single-base RNA resolution but at the cost of significant RNA degradation due to harsh chemical treatment. For hm5C and m1A detection, current sequencing technologies have not reached single-base resolution. Significant advances in sequencing methodologies are urgently required to better detect and functionally analyze these RNA modifications.
Next-generation sequencing technologies were originally designed to sequence DNA and have recently been repurposed to sequence RNA. A successful technology would ideally be able to perform long reads and identify multiple modifications on a single molecule of RNA.117,118 For an excellent recent review of these technologies, please see Jonkhout et al.119 One of the most promising approaches for single-molecule RNA sequencing is nanopore sequencing, which utilizes changes in current as RNA molecules pass through a pore to make base calls in real time.120 The first nanopore sequencing of a reference 16S rRNA molecule showed a low base-calling accuracy (<90%)121 and very low throughput (~1 million sequence reads; https://nanoporetech.com/).122 New and more sensitive methods of base detection, more tightly regulatable nanopores and algorithms for better base calling will no doubt improve these results.
With the development of more and better epitranscriptome sequencing technologies, there will be a need to analyze large sequencing datasets. New bioinformatic tools are needed to supplement the current data analysis pipelines, which were initially designed to analyze chromatin immunoprecipitation sequencing (ChIP-seq) data. These new tools will need to take into account the complications caused by differential splicing and by amplification bias induced during reverse transcription, as well as integrate multiple RNA modifications within the same molecule of RNA, across the entire transcriptome. A comprehensive database for curating and sharing epitranscriptomic data should be established to standardize the experimental and computational procedures that are used in different studies.123 We envision that in the not-so-distant future many new molecular and bioinformatic tools will become available to facilitate rapid advancements in the field of epitranscriptomics.


