Astrocytes and neurons share region-specific transcriptional signatures that confer regional identity to neuronal reprogramming

Region-specific gene expression shared with neurons imparts to astrocytes competence for region-specific neuronal reprogramming.


INTRODUCTION
The development of neuronal diversity is central for the organization and function of the central nervous system (CNS). This diversity is largely determined by specific transcriptional programs already expressed at the progenitor stage (1-7). These programs can undergo temporal regulation, allowing for sequential generation of different progeny from the same original progenitor (4,8). The most drastic case of this temporal regulation occurs at the switch of progenitors from neurogenic to gliogenic competence (9). Moreover, transcriptional programs are also diversified across brain regions to reflect the positional identity of the progenitors. Pioneering work in the spinal cord suggests that the diversification of astrocytes might follow the same organizing principle of positional identity (10,11). This notion has recently received further support from clonal analyses and single-cell transcriptomics that unveiled highly characteristic distributions of heterogeneous astroglia within and across brain regions (12-15). However, given that neurons and astroglia are generated from the same germinal zones, they could share common molecular signatures, reflecting their origin and potentially acting to coordinate region-specific developmental features. Here, we address this possibility and report that thalamic and cortical astrocytes exhibit region-specific transcriptional and epigenetic signatures, which are shared with the neurons generated within the same thalamic or cortical progenitor domain but not beyond. These shared signatures confer a remarkable degree of regional specification for astrocyte-to-neuron reprogramming induced by the proneural factor Neurogenin 2. Last, manipulating the region-specific code in defined astrocyte populations redirects reprogramming toward neurons of different, yet predictable, regional identity.

Shared gene expression signatures between astroglia and neurons
To test the hypothesis that astrocytes and neurons generated within the same brain region share molecular signatures unique to this region, we set out to identify differentially expressed genes (DEGs) between astrocytes of the thalamus and cortex, performed a similar analysis between thalamic and cortical neurons, and then searched for potential overlap among the two sets of DEGs. Toward this, we performed bulk RNA sequencing (RNA-seq) on astrocytes isolated from the thalamus [comprising dorsolateral geniculate (dLG), ventral posteromedial (VPM), and ventromedial geniculate (MGv) nuclei] and primary somatosensory cortex (S1) using astrocyte reporter mice (Gfap::Gfp) (16) at postnatal day 7 (P7) (Fig. 1A and fig. S1A) after the peak of astrogenesis (17). As for the analysis of neurons, we performed RNA-seq on neurons isolated from the thalamus at P0 using a Gbx2-CreER::Tomato-floxed thalamic reporter mouse, where early postmitotic thalamic neurons are labeled (fig. S1B) (18). By intersectional analysis within these astrocytic datasets, we first identified genes specifically expressed by astrocytes irrespective of their region of origin (e.g., Aqp4 and Aldh1l1; fig. S1C). As for neurons, we used canonical genes conserved in all neuronal subtypes (e.g., Rbfox3 or Nefm; fig. S1C) [see (19-21)]. The unambiguous expression pattern of these genes supports the specificity of the Gfap::Gfp and Gbx2-CreER::Tomato-floxed mouse lines used for isolation of astrocytes and thalamic neurons, respectively (fig. S1, A to D). A principal components analysis (PCA) revealed that thalamic and cortical astrocytes clustered according to their anatomical origins (Fig. 1B). Moreover, a differential expression analysis (DEA) revealed 1675 versus 1287 DEGs enriched in the thalamus and cortex, respectively (Fig. 1C). Among the DEGs enriched in each population, we identified several genes, including transcription factors, that are known to be highly expressed in neurons of the respective regions (Fig. 1D and table S1) (20,22). This prompted us to perform Gene Ontology (GO) overrepresentation and gene set enrichment analyses (GSEA) of the DEGs between thalamic and cortical astrocytes, which revealed marked differences in developmental programs and distinct region-specific molecular pathways that have been previously associated with neurons from these regions (Fig. 1, E and F).

Fig. 1. [...] (B) Principal components analysis (PCA) of the transcriptomes of astrocytes (As) from the thalamus (Th), including dLG (n = 5 samples), VPM (n = 4), and MGv (n = 4), and cortex (Ctx, n = 4) at P7. (C) Heatmap of z score normalized regularized logarithm (Rlog) expression and unbiased clustering of significantly DEGs between thalamic (As-Th) and cortical astrocytes (As-Ctx). Each row represents a gene, the columns are biological replicates, and the color code represents the normalized expression for up-regulated genes in yellow versus down-regulated genes in purple. (D) MA plot displaying DEGs. Blue and light gray dots represent thalamic and cortical DEGs with their mean normalized counts, respectively. Dark gray dots represent genes that failed to give a significant result. (E) Enrichment plots from the GSEA of two specific GO terms related to the thalamic and cortical formation. (F) GO biological process (BP) enrichment analysis of significantly DEGs in thalamic and cortical astrocytes and associated gene networks. The size of every node (enriched term) represents the number of genes enriched, and the color code (yellow, high expression; purple, low expression) corresponds to the log2 FC in DE analysis. In (A), scale bars, 400 μm.

To unveil region-specific genes shared among astrocytes and neurons of the corresponding regions, we first identified the most highly DEGs enriched in thalamic and cortical neurons, by comparing RNA-seq data of neurons isolated at P0 from a thalamic reporter line (Gbx2-CreER::Tomato-floxed) (18) with a published dataset of P1 cortical neurons (Fig. 2A) (20). We found that genes specifically enriched in thalamic or cortical neurons were substantially overrepresented among DEGs in thalamic or cortical astrocytes, respectively. Among the 400 most differentially enriched genes in thalamic neurons, only 6% were shared with cortical astrocytes, whereas 32.75% of these genes were significantly expressed by thalamic astrocytes, albeit typically at a lower level (including Gbx2, Ror, or Tcf7l2; Fig. 2, B to D, and fig. S1, E to K). A significant overlap in gene expression was also observed for cortical neurons and cortical astrocytes (17.5%), where genes such as Fezf2 or Foxg1 were identified in both populations. Overlap in gene expression was notably lower between cortical neurons and thalamic astrocytes (4.5%; Fig. 2, B to D).
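The overlap percentages reported above can be read as simple set arithmetic. The Python sketch below shows that calculation under stated assumptions: the gene identifiers are placeholders and the helper name `overlap_percent` is hypothetical, neither taken from the study. Only the list size (400) and the 32.75% figure come from the text.

```python
def overlap_percent(neuron_genes, astrocyte_degs):
    """Percentage of neuron-enriched genes that also appear among astrocyte DEGs."""
    neuron_set = set(neuron_genes)
    shared = neuron_set & set(astrocyte_degs)
    return 100.0 * len(shared) / len(neuron_set)


# Placeholder identifiers standing in for the 400 most enriched thalamic neuron genes.
top_neuron_genes = [f"gene_{i}" for i in range(400)]

# Hypothetical astrocyte DEG list that shares 131 of those 400 genes,
# padded with unrelated placeholder genes.
astrocyte_degs = top_neuron_genes[:131] + [f"other_{i}" for i in range(1000)]

pct = overlap_percent(top_neuron_genes, astrocyte_degs)
print(f"{pct:.2f}% shared")  # 131 of 400 genes is 32.75%
```

If the other percentages are also taken relative to a list of 400 genes (an assumption for the cortical comparisons), 6%, 17.5%, and 4.5% would correspond to 24, 70, and 18 genes, respectively.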
Next, we interrogated the overlap in expression of region-specific genes between neurons and astrocytes at the single-cell level by analyzing an independent, published dataset containing single-cell transcriptomes of thalamic and cortical neurons and astrocytes from juvenile/young adult mice (fig. S2; fig. S3, A and B; and table S2) (23). The analysis of these single-cell data fully confirmed the existence of region-specific gene expression programs shared between astrocytes and neurons of thalamic and cortical origin, respectively (Fig. 2, E to G, and fig. S3, C to G). Notably, among the DEGs at the single-cell level shared between neurons and astrocytes (247 for the thalamus and 442 for the cortex), we found numerous genes known to confer regional neuronal identity (e.g., Ror, Tcf7l2, Fezf2, or Foxg1), as observed in the bulk RNA-seq dataset. These single-cell data demonstrate that the shared region-specific transcriptional signature is not a transient developmental feature but is maintained well beyond the first postnatal week.

Fig. 2. [...] Thalamic neurons were obtained from Gbx2-Cre::Tomato-floxed P0 mice and cortical neurons from publicly available datasets (20). Ns-Th included dLG (n = 4), VPM (n = 4), and MGv (n = 3), and Ns-Ctx (n = 6). (B) Venn diagram showing the genes that overlap between astrocytes (As) and neurons (Ns) in both the thalamus and cortex. Bar plots represent the percentage of the enriched genes shared between populations. (C) Heatmap showing overlapping genes between As and Ns in the thalamus and cortex. (D) Box plots showing expression levels of selected region-specific genes shared between neurons and astrocytes of the thalamus (top) or the cortex (bottom). TPM, transcripts per million. (E) Heatmap of the z score of average expression levels of DEGs at the single-cell level, identified by comparing cell types among different regions of origin (As-Th versus As-Ctx; Ns-Th versus Ns-Ctx) from publicly available data (23). (F) Comparison matrix of the number of shared specific gene lists between As and Ns datasets of every specific region. Color code according to significance of overlap. (G) Schematic of the main conclusion of the experiments. Data are plotted with box-and-whisker plots, which give the median, 25th and 75th percentiles, and range. Dots in (D) represent every single value.
Last, we conducted two additional experiments to validate the expression of region-specific "neuronal" genes in astrocytes. First, we performed fluorescence in situ hybridization (FISH) in fixed slices from Gfap::Gfp brains at P7 and confirmed the region-specific expression of shared genes in astrocytes (GFP+ cells) (fig. S4, A and B). This expression was mainly found at the level of mRNA, as the corresponding proteins were only detected in a low percentage of the astrocytes, at least for the genes analyzed (fig. S4C), which suggests that posttranscriptional regulation might take place (24). Second, we isolated, purified, and cultured astrocytes from the thalamus or cortex and performed quantitative polymerase chain reaction (qPCR) for region-specific genes, confirming the expression of shared genes in astrocytes (fig. S4, D and E). Thus, single-cell RNA-seq (scRNA-seq), FISH, and qPCR provide strong support for the specificity of the detection of neuronal genes in astrocytes, arguing against neuronal contamination of the astrocyte datasets.
We next asked whether region-specific gene expression programs can be identified at the level of individual regional subdivisions such as those of sensory thalamic nuclei. Thus, we compared the transcriptomes of astrocytes and neurons from the three main sensory thalamic nuclei (dLG, VPM, and MGv; Fig. 3A). PCA identified three well-defined clusters corresponding to each nucleus in both astrocytes and neurons, supporting the notion that the identity of each thalamic nucleus is encoded transcriptionally in a cell type-independent manner (Fig. 3, B and C). Hence, the nucleus-specific DEGs of astrocytes exhibited a significant overlap with those of the neurons from the same nucleus (e.g., Sp9 for dLG or Crabp2 for MGv; Fig. 3, D and E, and tables S3 and S4), although the expression levels of these genes were notably lower in the astrocytic populations (Fig. 3F). Our single-cell data analysis also revealed a region-specific pattern of shared genes between astrocytes and neurons along the anteroposterior axis of the cerebral cortex (fig. S5 and table S5), supporting a generalization of the existence of region- and subregion-specific shared transcriptional programs between these two major cell types in the mouse brain.
Thalamic progenitor clones are nucleus specific
Next, we investigated whether the significant gene expression overlap between postmitotic astrocytes and neurons from the same region reflects a common clonal origin during embryonic development. This would imply that within the thalamus, cells belonging to the same clone should not disperse beyond nuclear boundaries. To test this hypothesis, we first analyzed the distribution of astrocytes originating from single clones across thalamic sensory nuclei. We tracked astrocyte clones arising from embryonic day 11.5 (E11.5) progenitors by electroporating a battery of plasmids encoding distinct fluorophores under the control of the Gfap promoter, following transposase-mediated integration ("StarTrack") (12), and analyzed the dispersion of each clone at P8 (Fig. 4, A and B, and fig. S6A). This revealed that clonally related astrocytes remain within the boundaries of a given nucleus with little dispersion to other nuclei, even in the case of larger clones (>10 cells) (Fig. 4C and fig. S6, B and C). Next, we addressed the question of whether thalamic progenitors that generate astrocytes also produce neurons and, if so, whether these neurons stay within the same nuclear boundaries as their sibling astrocytes. Thalamic clones containing neuronal and/or nonneuronal cells were tracked by using the same set of fluorophores under the control of a ubiquitously expressed promoter (Fig. 4, D and E, and fig. S6, D and E) (25). While we found 39% of clones consisting only of neurons or glia, the majority (61%) were mixed, containing similar proportions of neurons and glia (Fig. 4F and fig. S6F). We found that mixed clones covered territories that largely respected nuclear boundaries, although neurons exhibited a wider range of dispersion (Fig. 4, G and H, and fig. S6G), extending and confirming previous studies (26,27).
Our data suggest that the overlap in region-specific gene expression between neurons and astrocytes of each sensory thalamic nucleus is the result of their common clonal origin together with the limited spatial dispersion of clonally related cells and may indicate that positional information is retained from an early progenitor stage onward.

Astrocytes reprogram into region-specific neurons
Since forebrain astrocytes and neurons share region-specific gene expression, we hypothesized that such a molecular signature could instruct transcription factor-induced reprogramming of astrocytes toward an identity akin to their sibling neurons. To test this hypothesis, we injected a retrovirus encoding the proneural gene Neurogenin 2 (Neurog2) and the cell death regulator Bcl2 (28) into the somatosensory cortex and thalamus of P3 mice (Fig. 5A). At this developmental stage, retroviruses only transduce proliferative glia (17). Transduction with a retrovirus encoding Bcl2 and Gfp alone, as control, resulted in labeling of glial cells. In contrast, transduction with Neurog2-and were positive for the astrocytic marker Aldh1l1 and negative for the neuronal markers NeuN and DCX. Furthermore, we found that more than 90% of the transduced cells were also positive for 5-bromo-2′-deoxyuridine (BrdU), revealing that they are proliferating cells at the time of retroviral transduction (fig. S7D). After 7 and 14 dpi, the number of transduced cells positive for DCX or NeuN increased progressively. These DCX- or NeuN-positive cells were also BrdU positive, demonstrating again that they had been generated by the time of retroviral transduction and that DCX expression was not a result of reexpression in embryonically generated neurons (fig. S7, E to H). Consistent with our hypothesis, in vivo induced neurons expressed markers specific for a thalamic (Lef1 and Ror) or cortical (Tbr1 and Ctip2) neuronal identity despite the fact that they were induced with the same transcription factor (Fig. 5, B and C).
Our data suggest that reprogramming of astrocytes into region-specific neurons is a consequence of their shared gene expression through a common lineage. However, it does not exclude the possibility that region-specific reprogramming is influenced by environmental signals provided by other local cells. To test this, we cultured astrocytes from the thalamus and cortex and examined their newly acquired neuronal identity for region-specific gene expression following reprogramming by Neurog2 (fig. S8, A and B). As observed in vivo, thalamic and cortical induced neurons exhibited signatures of the thalamus and cortex, respectively, as shown by the differential expression of thalamic markers such as Slc17a6 (vGlut2), Ror, Gbx2, Pou2f2, or Lef1 or cortical markers such as Tbr1 or Ctip2 (fig. S8, C to G). To exclude a prominent role of the environment in specifying the regional identity of induced neurons, we cocultured thalamic or cortical astrocytes undergoing reprogramming with neurons or astrocytes from the cortex or thalamus, respectively. Neurons induced from thalamic astrocytes expressed thalamic markers, irrespective of whether they had been cultured alone or with cortical cells. Conversely, cortical astrocytes gave rise to neurons expressing cortical markers irrespective of the culture conditions (Fig. 5, D to F). These experiments revealed that the regional identity of induced neurons is largely cell autonomous.
Last, as astrocytes and neurons from distinct thalamic sensory nuclei share expression of nucleus-specific genes, we hypothesized that reprogramming of thalamic astrocytes may yield neurons with nucleus-specific signatures. To this end, we isolated and reprogrammed astrocytes from dLG, VPM, and MGv in vitro with Neurog2 (Fig. 5G).
We found that induced neurons derived from dLG astrocytes expressed dLG-specific genes Sp9 and Hs6st2, while those derived from MGv astrocytes expressed MGv-specific genes Crabp2 and Tshz1. Last, induced neurons of VPM astrocyte origin expressed the VPM marker Cck (Fig. 5H) (22). Together, these results show that Neurog2 triggers specific neuronal gene expression in astrocytes that reflects their place of origin.

Gbx2 respecifies cortical astrocytes toward thalamic fate
The aforementioned results strongly suggest that transcriptional signatures shared between neurons and astrocytes drive the regional specification of the latter during neuronal reprogramming. To directly test this, we examined whether coexpression of the thalamic fate determinant Gbx2 (29), a factor shared between astrocytes and neurons of the thalamus, could induce an early and fast redirection of neuronal reprogramming of cortical astrocytes toward a thalamic identity (Fig. 6A). Whereas in cortical astrocytes, expression of Neurog2 for 2 days induced the expression of the cortical neuron fate determinants Tbr1 and Ctip2, coexpression of Gbx2 strongly suppressed this. Moreover, combined expression of Neurog2 and Gbx2 increased thalamic signature genes such as Pou2f2 and Slc17a6 (vGlut2) in cortical astrocytes (Fig. 6B). These data provide strong support for the partial redirection of neuronal reprogramming toward a thalamic identity (Fig. 6C). In thalamic astrocytes, by contrast, Neurog2 sufficed for inducing significant expression of Pou2f2 and Slc17a6 (Fig. 6B). Genes that displayed differential regulation by Neurog2 with or without Gbx2 (Slc17a6, Pou2f2, Tbr1, and Ctip2) exhibited an epigenetically poised state in cortical or thalamic astrocytes, as determined by the ratio of active (H3K4me3) and repressive (H3K27me3) histone marks in their proximal regulatory elements (Fig. 6D and figs. S9, A to C, and S10, A and B). In contrast, nonresponsive genes (Fezf2, Ror, and Lef1) exhibited origin-dependent baseline expression both transcriptionally and epigenetically in thalamic and cortical astrocytes ( Last, we addressed the question of whether a similar epigenetically poised state might explain the differential induction of nuclei-specific neuronal genes in astrocytes of distinct thalamic territories. To this end, we first compared basal expression levels and presence of active (H3K4me3) and repressive (H3K27me3) epigenetic marks at proximal regulatory elements of these genes, known to be differentially expressed in dLG, VPM, and MGv neurons (Fig. 3) (22). Intriguingly, irrespective of the baseline expression level, these genes exhibited an active (Sp9, Crabp2, and Tshz1) or poised/less repressed (Hs6st2 and Cck) epigenetic state of their proximal regulatory elements, consistent with their nuclear origin (Fig. 6E and figs. S9D and S10, C and D). Nucleus-specific epigenetic priming might explain the observed differential transcriptional responsiveness to Neurog2 of genes whose levels of transcription are indistinguishable across nuclei before reprogramming. Future genome-wide analysis will be required to reveal the general importance of epigenetically poised states in dictating the region-specific gene expression following reprogramming.

Fig. 3. [...] and astrocytes datasets of every thalamic nuclei. Color code according to significance of overlap. Right, bar plots representing the percentage of gene overlap between As and Ns from each thalamic nucleus. (E) Heatmap showing the overlapping DEGs between As and Ns in each nucleus. Each column represents a biological replicate, and the color code represents the z score normalized expression (up-regulated genes in yellow, down-regulated genes in purple). (F) Box plots showing expression levels of nuclei-specific shared genes between astrocytes and neurons in the distinct sensory-modality thalamic nuclei. TPM, transcripts per million. Data are plotted with box-and-whisker plots, which give the median, 25th and 75th percentiles, and range. Dots in (F) represent every single value. ***P < 0.0005.

Fig. 4. [...] (G) Quantification of the dispersion of clonally related neuronal and nonneuronal cells from mixed clones, in the different thalamic sensory nuclei (n = 130 clones from four electroporated animals). (H) Schema representing the specificity in the nuclei-dependent localization of clonal cells. Cells coming from the same progenitor are colored with the same color. Note that most clonally related cells respect the nuclei segregation and only few cells are dispersed. Data are plotted with box-and-whisker plots, which give the median, 25th and 75th percentiles, and range. Scale bars, 100 μm. ns, not significant; *P < 0.05, **P < 0.005, and ***P < 0.0005.

DISCUSSION
Using genome-wide analysis, we show that astrocytes of different brain regions actively transcribe genes that also correspond to regional genes in neurons. This remarkable relatedness between astrocytes and neurons from the same brain region correlates with their shared clonal origin, as shown for distinct sensory nuclei of the thalamus. Furthermore, region-specific molecular signatures create a strong bias intrinsic to astrocytes toward generating neurons of matching regional identity when reprogrammed by the proneural factor Neurog2. This latter finding is in line with reprogramming of cortical astrocytes into neurons with layer-specific properties in vivo (30), where a tight lineage relationship between starting and target cell is likely to exist. The transcriptional context of the starting cell might even account for acquisition of specific neuronal fates where region-specific determinants may be expressed more coincidently, such as fibroblast conversion into retinal photoreceptors (31).
Despite their common developmental origin, neurons and astrocytes constitute cell types easily distinguishable by their morphological and electrophysiological properties. However, our study reveals that these two cell types show an unexpected overlap in the expression of genes that confer regional identity. Such overlap can be found at the single-cell level and extends into adulthood. Among the shared genes, there was a substantial number of transcription factors, many of which play well-described roles in neuron subtype specification (e.g., Gbx2, Lef1, Fezf2, and Tbr1) (29, 32-34). The physiological function in astrocytes of the shared region-specific genes remains to be determined. Future experiments should decipher whether these genes may adopt different functions in astrocytes as compared to neurons or whether their shared expression might act as a code to facilitate region-specific interactions of astrocytes with their sibling neurons (35). While these scenarios are not mutually exclusive, the latter may provide an attractive mechanism by which neurons could modulate the spatial distribution of astrocytes (13).

Our clonal analyses reveal that neurons and sibling astrocytes originating from the same thalamic progenitor clone populate very similar territories, respecting boundaries among thalamic nuclei, extending earlier observations of the existence of nucleus-specific progenitor domains in the thalamus (26,27,36). Recent single-cell spatial transcriptomic mapping has revealed that in the cortex, astrocytes exhibit heterogeneity that does not follow neuronal layering (13). However, it has not been shown whether neurons and astrocytes populating the same neuronal layer are more likely to be clonally related than those of distinct layers. Nevertheless, cortical regional identity is clearly computed along the anteroposterior and mediolateral axes (19), and, indeed, our single-cell data analysis shows that gene expression profiles are shared by astrocytes and neurons along the anteroposterior and mediolateral axes; thus, this sharing may serve as a mechanism to impart cortical areal identity also to astrocytes, as observed in the thalamus. Most likely, different dimensions of gene expression patterns underlie the unexpected molecular and spatial heterogeneity of astrocytes in the CNS.

Fig. 5. [...] Experimental design for assessing the influence of the environment on the identity of the induced neurons. Isolated cortical or thalamic astrocytes were infected and then cocultured with thalamic or cortical astrocytes or neurons. (E) Immunostaining for the thalamic marker Ror in cortical or thalamic iNs (RFP+/Tuj1+) in the different conditions. (F) Quantification of the percentage of iNs generated from cortical or thalamic astrocytes that express vGlut2, Ror, Tbr1, or Ctip2 in control conditions or when mixed with astrocytes or neurons from the thalamus or the cortex, respectively (n = 6 to 14 independent cultures per condition). (G) Left, experimental design. Astrocytes from dLG, VPM, and MGv were isolated, cultured, and infected with Neurog2 retrovirus. Right, image of an iN from dLG astrocytes at 10 days post infection (dpi). (H) Reverse transcription (RT)-qPCR showing the expression of specific neuronal genes in the iNs after 10 dpi (n = 10 to 14 independent cultures per condition). Data are plotted with box-and-whisker plots, which give the median, 25th and 75th percentiles, and range. Dots in (C) represent every single value. Scale bars, 100 μm in (B) (insets, 25 μm) and 25 μm in (E) and (G). *P < 0.05, **P < 0.005, and ***P < 0.0005.

Fig. 6. [...] Astrocytes from the thalamus and cortex were isolated, and the expression levels of some region-specific genes were assessed by RT-qPCR or ChIP-qPCR. Box-and-whisker plots represent the basal expression levels of the studied genes in thalamic and cortical astrocytes (left axis), and dots show the means ± SEM of the epigenetic state of the promoter of those genes, in terms of the presence of two histone marks, H3K4me3 and H3K27me3 (right axis) (n = 12 to 23 independent ChIP samples per condition). The red dashed line indicates the point where H3K4me3 and H3K27me3 marks are present at the same level. (E) Box-and-whisker plots showing the H3K4me3 and H3K27me3 ratio in vitro (left axis) (n = 14 to 18 independent ChIP samples per condition) and the basal in vivo expression of neuronal specific genes in thalamic astrocytes from each nucleus (right axis). ns, not significant; *P < 0.05, **P < 0.005, and ***P < 0.0005.
It seems plausible that the shared region-specific gene expression is accounted for by epigenetic signatures inherited from a common progenitor and maintained throughout postmitotic development. Our data provide evidence for region-specific differences in the epigenetic state of regulatory elements of these genes in cortical and thalamic astrocytes, even up to the level of thalamic nuclear divisions. Conversely, these region-specific genes apparently escape the long-term epigenetic repression that occurs at neuronal gene loci at the developmental switch from neurogenesis to gliogenesis (37,38). The epigenetic configuration at region-specific genes might function as a latent mechanism to keep some neuronal expressed genes in a "poised state" in astrocytes, which may become activated by reprogramming factors such as shown here. The fact that epigenetic configurations are heritable through cell divisions (4,38,39) might confer astrocytes with a specific and long-lasting regional differentiation potential as may occur during injury-induced neurogenesis (40,41). Last, the fine-grained heterogeneity of astrocytes between and within brain regions [this study and (10, 13)] may provide a basis for reconstructing diseased brain circuits that require the generation of multiple neuron types (30,42), with a minimal number of molecular manipulations.

Mouse strains
All transgenic animals used in this study were maintained on ICR/CD-1, FVB/N-Tg, or C57BL/6J genetic backgrounds and genotyped by PCR. The day of the vaginal plug was stipulated as E0.5. The Gfap::Gfp line (16) (the Jackson Laboratory, stock number 003257) was in an FVB/N-Tg genetic background, the Gad67::Gfp line (43) was in C57BL/6J, and the R26 tdTomato Cre-dependent line (the Jackson Laboratory, stock number 007908) and the Gbx2 CreERT2/+ mouse line (18) were in an ICR/CD-1 genetic background. Tamoxifen induction of Cre recombinase in the double-mutant embryos (Gbx2 CreERT2/+ ::R26 tdTomato ) was performed as previously described (44). The Committee on Animal Research at the University Miguel Hernández approved all the animal procedures, which were carried out in compliance with Spanish and European Union regulations.

Isolation of astrocytes and neurons for RNA-seq
The brains (four brains were pooled for each sample) were extracted in ice-cold KREBS solution and cut in the vibratome in 300-μm slices, and cells were dissociated as in a previous publication (22). Thalamic nuclei (dLG, VPM, and MGv) and somatosensory cortex (S1) were dissected and pooled in cold dissociation medium [20 mM glucose, 0.8 mM kynurenic acid, 0.05 mM D,L-2-amino-5-phosphonovaleric acid (APV), penicillin (50 U/ml), streptomycin (0.05 mg/ml), 0.09 M Na2SO4, 0.03 M K2SO4, and 0.014 M MgCl2]. The tissue was transferred to sterile conditions and enzymatically digested in dissociation medium supplied with L-cysteine (0.16 mg/ml) and 70 U papain (Sigma-Aldrich) set to pH 7.35, at 37°C for 30 min with repeated shaking. The enzyme was then inhibited with dissociation medium containing ovomucoid (0.1 mg/ml) (Sigma-Aldrich) and bovine serum albumin (BSA) (0.1 mg/ml) set to pH 7.35, at room temperature. Tissue was transferred to iced Opti-MEM (Life Technologies) supplied with 20 mM glucose, 0.4 mM kynurenic acid, and 0.025 mM APV and mechanically dissociated until a single-cell suspension was obtained. Cells were concentrated by centrifugation at 850 rpm for 5 min and filtered through a cell strainer (BD Falcon). The genetically labeled live cells were separated based on green or red fluorescence intensity using fluorescence-activated cell sorting (FACSAria III, BD). FACS-purified cells were collected directly in lysis buffer of the RNeasy Micro Kit (Qiagen, no. 74004) that was used to recover total RNA according to the manufacturer's instructions. RNA quality for all samples was measured on an Agilent Bioanalyzer 2100 system. All samples with RNA Integrity Number (RIN) > 7 were used as input to library construction.

Library preparation and RNA-seq
Library construction and sequencing were performed at the CNAG-CRG (Centro Nacional de Análisis Genómico) genomics core facility (Barcelona, Spain). Briefly, cDNA multiplex libraries were prepared using SMARTer Ultra Low RNA Kit v4 (Takara, no. 634894) and NEBNext Ultra DNA Library Prep for Illumina according to the manufacturer's instructions (NEB, no. E7645). Libraries were sequenced together in a single flow cell on an Illumina HiSeq 2500 platform using v4 chemistry in 1 × 50 bp (base pair) single-end mode. A minimum of 25 million reads were generated from each library.
In the analysis of the datasets of cortical astrocytes and thalamic astrocytes and neurons generated for this study, significantly DEGs were identified using a statistical significance threshold [Benjamini-Hochberg (BH)-adjusted P value < 0.1] and an absolute log2 fold change (log2FC) > 0, with log2FC values shrunken using the adaptive t prior Bayesian shrinkage estimator "apeglm" (tables S1, S3, and S4) (47). To identify the most differentially enriched genes between cortical and thalamic neurons, we used data generated in this study for thalamic neurons (P0) and a publicly available dataset for cortical neurons (P1) from a previously published study (GSE63482) (20). Datasets from (20) consist of RNA-seq profiles of multiple classes of FACS-purified cortical neurons from ICR/CD-1 mice: callosal projecting neurons (CPN, n = 2), corticothalamic projecting neurons (CThPN, n = 2), and subcerebral projecting neurons (ScPN, n = 2) (20). Neuronal datasets from the cortex and thalamus were aligned from the raw sequence, and gene counts were generated using the same pipeline as indicated previously. Gene counts were normalized using the median of ratios method in the DESeq2 R package, and the ratio between gene counts (regularized logarithm transformation of the normalized counts) was used to identify the top 400 most differentially enriched genes between cortical and thalamic neurons. A hypergeometric test (one-sided Fisher's exact test) was performed to test independence between lists of enriched or significantly DEGs from neurons and astrocytes from different brain regions and to obtain estimated odds ratios. RNA-seq coverage tracks for selected genes were generated using Integrative Genomics Viewer (IGV) (v2.4.14) and plotted in a 5′ to 3′ direction. Hierarchical clustering analysis was performed using the "Manhattan" distance metric and the "Ward.2" clustering method to visualize significantly up-regulated and down-regulated genes.
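The overlap test between two gene lists can be illustrated with a minimal Python sketch. It assumes a shared gene universe and computes the upper-tail hypergeometric probability, which is equivalent to a one-sided Fisher's exact test. The function name, the toy gene identifiers, and the use of the sample (cross-product) odds ratio are assumptions made for illustration; statistical packages such as R report a conditional maximum likelihood estimate of the odds ratio, which can differ slightly.

```python
from math import comb

def hypergeom_enrichment(universe, list_a, list_b):
    """One-sided overlap test between two gene lists from a shared universe.

    Returns (overlap size, upper-tail P value, sample odds ratio).
    """
    universe = set(universe)
    a = set(list_a) & universe
    b = set(list_b) & universe
    n_total = len(universe)
    k = len(a & b)

    # P(X >= k) for X ~ Hypergeometric(N = n_total, K = |a|, n = |b|)
    tail = sum(
        comb(len(a), i) * comb(n_total - len(a), len(b) - i)
        for i in range(k, min(len(a), len(b)) + 1)
    )
    p_value = tail / comb(n_total, len(b))

    # Sample odds ratio from the 2 x 2 contingency table
    only_a = len(a) - k
    only_b = len(b) - k
    neither = n_total - k - only_a - only_b
    if only_a * only_b == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (k * neither) / (only_a * only_b)
    return k, p_value, odds_ratio

# Toy example with hypothetical gene identifiers
universe = [f"g{i}" for i in range(20)]
astro_list = [f"g{i}" for i in range(0, 5)]    # g0..g4
neuron_list = [f"g{i}" for i in range(3, 9)]   # g3..g8
overlap, p, odds = hypergeom_enrichment(universe, astro_list, neuron_list)
```

The choice of universe (for example, all genes passing an expression filter in both datasets) strongly affects the result and is not specified here.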
In the functional enrichment analysis of the datasets from astrocytes, a more restrictive filtering criterion was used to detect highly significant DEGs, based on simultaneous thresholds of BH-adjusted P value < 0.1 and absolute log2FC > 0.322. This analysis revealed 508 versus 444 DEGs enriched in the thalamus and cortex, respectively. The GO overrepresentation analysis and GSEA were performed using clusterProfiler (v3.10.1) (48). Enriched terms were considered significant at BH-adjusted P value cutoffs of < 0.01 and < 0.1 in the GO overrepresentation analysis and GSEA, respectively. The reference gene set used to perform the analysis was the C5 (GO Biological Process) collection from the Molecular Signatures Database (MSigDB) (v6.2).
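The two-threshold filter can be sketched as follows. An absolute log2FC of 0.322 corresponds to approximately a 1.25-fold change, because log2(1.25) ≈ 0.322. The sketch below is a plain Python illustration of BH adjustment followed by filtering; the function names and toy values are hypothetical, and the sketch does not reproduce DESeq2-specific steps such as independent filtering or log2FC shrinkage.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value to enforce monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, pvalues, log2fc, padj_cutoff=0.1, lfc_cutoff=0.322):
    """Keep genes passing both the adjusted P value and |log2FC| thresholds."""
    padj = bh_adjust(pvalues)
    return [
        gene
        for gene, q, lfc in zip(genes, padj, log2fc)
        if q < padj_cutoff and abs(lfc) > lfc_cutoff
    ]

# Toy example with hypothetical values
genes = ["a", "b", "c", "d"]
pvals = [0.01, 0.04, 0.03, 0.5]
lfcs = [1.0, 0.2, -0.5, 2.0]
padj = bh_adjust(pvals)
selected = select_degs(genes, pvals, lfcs)
```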

Bioinformatic analysis of the scRNA-seq
We analyzed a recently published scRNA-seq dataset to interrogate thalamic and cortical cellular heterogeneity (23). The sequence data are publicly available at the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under accession SRP135960 (23). Briefly, scRNA-seq datasets (postfiltered count matrices) for the thalamus and cortex were downloaded from the associated wiki and processed with the Seurat R package (v3.1.4) (49). First, we performed a quality control analysis, which confirmed that the data were of high quality. All cells had more than 600 detected molecules (UMIs), and the proportion of mitochondrial reads was below 5% for the vast majority of cells (see fig. S2, A and B). Next, data were preprocessed (log normalization and scaling) before performing linear dimensional reduction (PCA). A graph-based clustering approach using the top 30 principal components was used to identify cell populations (resolution was fixed to 0.8). The FindAllMarkers function with default parameters was used to identify gene markers for each cluster and to assign cell-type identity to clusters (see fig. S2, C and D).
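The per-cell quality control metrics described above can be illustrated with a short Python sketch. It assumes a nested dictionary of UMI counts and identifies mitochondrial genes by the "mt-" prefix of mouse gene symbols; the data structure, function names, and toy counts are hypothetical. The 600-UMI and 5% values mirror the figures reported above and are used here only as illustrative cutoffs, not as a filter applied in this study.

```python
def qc_metrics(counts, mito_prefix="mt-"):
    """counts: {cell_id: {gene: umi_count}} -> {cell_id: (total_umis, pct_mito)}."""
    metrics = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        mito = sum(n for gene, n in genes.items() if gene.startswith(mito_prefix))
        pct_mito = 100.0 * mito / total if total else 0.0
        metrics[cell] = (total, pct_mito)
    return metrics

def passes_qc(metric, min_umis=600, max_pct_mito=5.0):
    """True when a cell has more than min_umis UMIs and less than max_pct_mito % mito."""
    total, pct_mito = metric
    return total > min_umis and pct_mito < max_pct_mito

# Toy example with hypothetical counts
counts = {
    "cell1": {"Gfap": 700, "mt-Co1": 20},
    "cell2": {"Snap25": 500, "mt-Nd1": 50},
}
metrics = qc_metrics(counts)
kept = [cell for cell, m in metrics.items() if passes_qc(m)]
```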
Cortical and thalamic scRNA-seq datasets were subsequently integrated as previously described (50). The UMAP (Uniform Manifold Approximation and Projection) algorithm was used for nonlinear dimensionality reduction, visualization, and exploratory analysis of the datasets. Differential expression analyses between thalamic and cortical neurons and astrocytes were performed using the FindMarkers function based on the nonparametric Wilcoxon rank sum test with the following parameters: logFC.threshold = 0.1 and min.pct = 0. Genes with BH-adjusted P value < 0.1 were considered significantly differentially expressed (tables S2 and S5).

In utero electroporation of StarTrack vectors
For in utero electroporation, a previously described procedure was followed (51). Pregnant females (E11.5) were deeply anesthetized with isoflurane to perform laparotomies. The embryos were exposed, and the third ventricles of the embryonic brains were visualized through the uterus with an optic fiber light source. The combination of the plasmids of the StarTrack method at a final concentration of 2 µg/µl was mixed with 0.1% Fast Green (Sigma-Aldrich), as previously described (12,25). The plasmids used consisted of the coding sequence of six fluorescent proteins (EGFP, mCherry, mKusabira Orange, mTSapphire, mCerulean, and EYFP) subcloned under the regulation of the GFAP or UbC promoters to target specifically astrocytes or all cell types, respectively. Each reporter gene could be directed to the cytoplasm (PB-GFAP/UbC-XFP) or to the nucleus of the cell by fusing it with the H2B histone protein (PB-GFAP/UbC-H2B-XFP). Constructs were flanked by PiggyBac sequences to be inserted into the genome of the targeted cell by a PiggyBac transposase. The plasmids were injected into the third cerebral ventricle with an injector (Nanoliter 2010, WPI). For electroporation, five square electric pulses of 45 V and 50 ms were delivered through the uterus at 950-ms intervals using a square pulse electroporator (CUY21 Edit, NepaGene Co., Japan). The surgical incision was then closed, and embryos were allowed to develop until P8. In animals electroporated with the UbC-StarTrack combination, tamoxifen was administered at P1 as previously described (25) to remove nonintegrated copies of the electroporated plasmids through the Cre recombinase system.

Measurement of thalamic astrocytic clones
Images were acquired with an Olympus FV1000 confocal IX81 microscope/FV10-ASW software following previously defined settings (12). All the pictures were acquired with a 20× oil immersion objective and analyzed with ImageJ software. Only electroporated animals with labeled cells in the three first-order thalamic nuclei (dLG, VPM, and MGv) were used. Of these, only clones with at least three cells and more than one reporter were analyzed.
First, we assigned a binary code to every cell based on the presence or absence of each reporter protein in the cytoplasm and/or the cellular nuclei and the expression of the neuronal marker NeuN in order to distinguish neurons from glial cells. Once all the cells had been analyzed, they were grouped on the basis of their shared binary code, thereby identifying those cells that originated from the same progenitor. Then, we quantified the distribution (in %) of cells belonging to the same clone across the thalamic nuclei.
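The grouping of cells into clones by shared binary code can be illustrated with a minimal Python sketch. It assumes that each cell is represented as a tuple of reporter bits (1 = reporter present) together with the thalamic nucleus in which it was found; the data structure, function name, and three-bit toy codes are hypothetical, and the NeuN bit used to separate neurons from glia is omitted for brevity. The inclusion criteria (at least three cells and more than one reporter) follow those stated above.

```python
from collections import Counter, defaultdict

def clone_distribution(cells, min_cells=3, min_reporters=2):
    """cells: list of (reporter_bits, nucleus) -> {reporter_bits: {nucleus: percent}}."""
    clones = defaultdict(list)
    for code, nucleus in cells:
        clones[code].append(nucleus)  # identical codes imply a shared progenitor

    distribution = {}
    for code, nuclei in clones.items():
        # Keep clones with at least three cells and more than one reporter
        if len(nuclei) < min_cells or sum(code) < min_reporters:
            continue
        counts = Counter(nuclei)
        distribution[code] = {
            nucleus: 100.0 * n / len(nuclei) for nucleus, n in counts.items()
        }
    return distribution

# Toy example with hypothetical three-reporter codes
cells = [
    ((1, 1, 0), "dLG"), ((1, 1, 0), "dLG"), ((1, 1, 0), "VPM"), ((1, 1, 0), "dLG"),
    ((1, 0, 0), "MGv"), ((1, 0, 0), "MGv"), ((1, 0, 0), "MGv"),
    ((0, 1, 1), "VPM"),
]
result = clone_distribution(cells)
```

In this toy input, only the clone with code (1, 1, 0) satisfies both criteria; the single-reporter clone and the one-cell clone are excluded.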

Virus production
For the production of the retrovirus, Lenti-X 293T cells (catalog no. 632180, Clontech) were plated on 5- to 10-cm dishes. Encapsulation plasmids containing gag-pol and vsv-g sequences (provided by V. Borrell) were cotransfected with the plasmid of interest using LipoD293 (catalog no. SL100668, SignaGen). The medium was changed after 5 hours, and the virus was collected after 72 hours using Lenti-X concentrator (catalog no. 631231, Clontech).

In vivo viral and BrdU injections
Pups at P3 were anesthetized on ice and placed in a digital stereotaxic frame. The virus was injected using an injector (Nanoliter 2010, WPI) in the thalamus or cortex through a small skull incision. BrdU was injected intraperitoneally at 50 mg/kg immediately after viral injections from a stock solution (10 mg/ml).
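As a worked example of the dosing arithmetic, the injection volume follows from the dose (50 mg/kg) and the stock concentration (10 mg/ml), giving 5 µl per gram of body weight. The Python sketch below makes the unit conversion explicit; the function name and the 2-g pup weight are hypothetical values chosen for illustration, not measurements from this study.

```python
def brdu_volume_ul(weight_g, dose_mg_per_kg=50.0, stock_mg_per_ml=10.0):
    """Injection volume in microliters for a given body weight in grams."""
    dose_mg = dose_mg_per_kg * (weight_g / 1000.0)  # mg of BrdU needed
    volume_ml = dose_mg / stock_mg_per_ml           # ml of stock solution
    return volume_ml * 1000.0                       # convert ml to µl

# Hypothetical 2-g pup
volume = brdu_volume_ul(2.0)
```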

Histology
For immunofluorescence of reprogrammed neurons in vitro, cultures were fixed with 4% paraformaldehyde (PFA) in phosphate-buffered saline (PBS) (0.01 M) for 10 to 15 min at room temperature. Cultures were first incubated for 1 hour at room temperature in a blocking solution containing 2% BSA (Sigma-Aldrich) and 0.15% Triton X-100 (Sigma-Aldrich) in 0.01 M PBS. Subsequently, the cells were incubated overnight at 4°C with the primary antibodies listed in table S6. The cells were then rinsed in 0.01 M PBS and incubated for 2 hours at room temperature with the appropriate secondary antibodies listed in table S6. Counterstaining was performed with the fluorescent nuclear dye 4′,6-diamidino-2-phenylindole (DAPI) (Sigma-Aldrich).
For histology in postnatal brains, mice were perfused transcardially first with 0.01 M PBS and then with 4% PFA. Brains were kept in 4% PFA overnight, embedded in 3% agarose in 0.01 M PBS, and cut into 80-µm-thick slices on a vibratome (Leica). For Tbr1, Ctip2, Aldh1l1, Ror, and Lef1 antibodies, an antigen retrieval step with sodium citrate was performed. For BrdU detection, slices were first incubated with 2 N HCl and 0.5% Triton X-100 at 37°C for 30 min, followed by an incubation with borax buffer at room temperature. Slices were incubated for 1 hour at room temperature in a blocking solution containing 1% BSA, 2% donkey serum, 2% goat serum, and 0.4% Triton X-100 in 0.01 M PBS and subsequently incubated overnight at 4°C with primary antibodies. Slices were incubated for 2 hours at room temperature with the appropriate secondary antibodies, washed, incubated with DAPI, and mounted. Images were acquired with a Leica DFC550 camera mounted on a Leica DM5000B microscope, with an Olympus FV1000 confocal IX81 microscope/FV10-ASW software, or with a Zeiss confocal LSM880. Slices were washed four times with MABT and then revealed with TSA PLUS CYANINE 3 (Akoya, SKU NEL744001KT) diluted 1/500 in MABT. Once revealed, slices were washed with MABT and then immunofluorescence was performed as described above.

Purification of total RNA and quantitative real-time PCR
For specific isolation of reprogrammed astrocytes, a previously published method was followed (20) with some modifications for cultured cells. Astrocytes from the thalamus, cortex, dLG, VPM, and MGv were cultured and infected with Neurog2 retrovirus, and after 10 days in vitro, they were collected by applying trypsin/EDTA (Gibco) to the plate, resuspended in culture medium, and centrifuged. Reprogrammed astrocytes were fixed with 1% PFA for 10 min at 4°C, after which the PFA was quenched by adding 55 µl of 1.25 M glycine per 500 µl of PFA solution. Immunocytochemistry against Tuj1 and RFP was performed, and cells were separated (Tuj1+/RFP+ versus Tuj1−/RFP+) by a flow cytometer (BD FACSAria) based on their fluorescence (see scheme in fig. S8, C to E). Once the cells were collected, they were centrifuged and incubated for 3 hours at 50°C with lysis buffer, and their RNA was purified using TRIzol (Fisher) and resuspended in RNase-free water.
cDNA was obtained from 1 g of total RNA using the specific protocol for first-strand cDNA synthesis in two-step reverse transcription (RT)-PCR using the High-Capacity cDNA Reverse Transcription Kit (Fisher) and stored at −20°C. qPCR was performed in a StepOnePlus real-time PCR system (Applied Biosystems, Foster City, CA, USA) using the MicroAmp fast 96-well reaction plate (Applied Biosystems) and the Power SYBR Green PCR Master Mix (Applied Biosystems). The primers used for detecting the expression of the different genes are listed in table S7. A master mix was prepared for each primer set containing the appropriate volume of SYBR Green, primers, and template cDNA. All reactions were performed in triplicate. The amplification efficiency for each primer pair and the cycle threshold (Ct) were determined automatically by the StepOne Software, v2.2.2 (Applied Biosystems). Transcript levels were represented relative to the Gapdh signal, adjusting for the variability in cDNA library preparation.

Patch-clamp recordings of iNs
For the electrophysiological analysis, astrocytes were infected with a retrovirus encoding CAG-Neurog2-ires-TauGFP. After 3 weeks, cultures were transferred to the recording chamber and were perfused with standard artificial cerebrospinal fluid (aCSF) containing the following: 119 mM NaCl, 5 mM KCl, 1.3 mM MgSO4, 2.4 mM CaCl2, 1 mM NaH2PO4, 26 mM NaHCO3, and 11 mM glucose. The aCSF was perfused at a rate of 2.7 ml/min, continuously bubbled with a gas mixture of 95% O2 + 5% CO2, and warmed to 30° to 32°C.
Somatic whole-cell recordings were made under visual control using an upright microscope (Leica DM-LFSA) and a water immersion (20 or 40×) objective. The intracellular solution contained the following: 130 mM K-gluconate, 5 mM KCl, 5 mM NaCl, 0.2 mM EGTA, 10 mM Hepes, 4 mM Mg-ATP, and 0.4 mM Na-GTP, pH 7.2 adjusted with KOH; 285 to 295 mOsm. Recordings were obtained in current-clamp and/or voltage-clamp mode with a patch-clamp amplifier (MultiClamp 700A, Molecular Devices, USA). No correction was made for the pipette junction potential. Voltage and current signals were filtered at 2 to 4 kHz and digitized at 20 kHz with a 16-bit resolution analog to digital converter (Digidata 1550B, Axon Instruments). The generation and acquisition of pulses were controlled by pClamp 10.6 software (Axon Instruments). Patch pipettes were made from borosilicate glass [1.5 mm OD (outer diameter), 0.86 mm ID (inner diameter), with inner filament] and had a resistance of 4 to 7 megohms when filled. Neurons in which series resistance was >30 megohms were discarded. Quantification of intrinsic membrane properties and spontaneous neuronal activity was performed on Clampfit 10.7 (Axon Instruments). The presence of putative spontaneous excitatory postsynaptic currents (sEPSCs) was assessed in voltage clamp recordings at −70 mV.

ChIP for H3K4me3 and H3K27me3
ChIP assays were performed following a previously published protocol (55). Cultured astrocytes from the thalamus and cortex were collected after 1 week in vitro when confluence was reached, centrifuged, and resuspended to approximately 500,000 cells. Cells were fixed with 1% PFA in PBS for 10 min at room temperature and quenched with 55 µl of 1.25 M glycine per 500 µl of PFA solution with orbital shaking. After that, cells were lysed in 300 µl of SDS lysis buffer (0.5% SDS, 10 mM EDTA, and 50 mM tris-HCl) supplemented with protease inhibitor cocktail (Roche, 11836153001), sonicated for 10 min in a Diagenode Bioruptor Pico, precleared with 30 µl of washed Dynabeads (Invitrogen, 10003D), and diluted five times in ChIP IP buffer [20 mM Hepes, 0.2 M NaCl, 2 mM EDTA, 0.1% Na-DOC, 1% Triton X-100, and BSA (5 mg/ml)]. One percent of each sample was kept as input. Samples were divided into three tubes and incubated overnight at 4°C on a rotating wheel with 2.5 µg per tube of the anti-H3K4me3 (Sigma-Aldrich, 07-473), anti-H3K27me3 (Abcam, ab6002), or control IgG antibody. The next day, washed and saturated Dynabeads were added and incubated with the samples for 2 hours at 4°C. Dynabeads were washed five times with LiCl buffer (50 mM Hepes, 1 mM EDTA, 1% NP-40, 1% Na-DOC, and 0.5 M LiCl) and once with TE buffer (10 mM Tris-HCl and 1 mM EDTA). Antibody/chromatin complexes together with the inputs were eluted by adding 100 µl of elution buffer (50 mM NaHCO3 and 1% SDS), 10 µl of NaCl (5 M), and 1 µl of proteinase K (Sigma-Aldrich, 3115836001) to each tube and placed on a thermomixer, shaking at 1000 rpm for at least 2 hours at 60°C. Samples and inputs were decross-linked by heating for 15 min at 95°C. Both samples and inputs were treated with RNase A (Roche, 10109142001) for 30 min at 65°C, and the DNA was purified with phenol/chloroform and ethanol-precipitated. Primers used for detecting the immunoprecipitated genomic regions are listed in table S7.
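The way ChIP enrichment was quantified from the qPCR signal is not specified above. One common convention, shown here only as an assumption, is the percent-input method: the Ct of the 1% input is shifted by log2(100) to estimate the Ct of the full input, and the immunoprecipitated signal is expressed as a percentage of it, assuming 100% amplification efficiency. The function name and Ct values in this Python sketch are hypothetical.

```python
from math import log2

def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """ChIP signal as a percentage of total input chromatin.

    input_fraction is the share of the sample kept as input (1% here).
    """
    # Shift the input Ct to what 100% of the chromatin would have given
    ct_input_full = ct_input - log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_full - ct_ip)

# Hypothetical Ct values: the IP crosses threshold 3 cycles after the 1% input
enrichment = percent_input(ct_ip=28.0, ct_input=25.0)
```

In this toy case, the immunoprecipitated DNA amounts to one-eighth of the 1% input, that is, 0.125% of the total input.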

Primer design
For RNA expression analysis, the Primer3 and Blast tools from the NCBI webpage were used with the accession numbers of the coding sequences of the genes of interest. For ChIP experiments, we used the information obtained from the in silico analysis of Neurog2 binding sites and the open-source information of the ENCODE project. For primer design, regions in the promoters of candidate genes that included a putative binding site for Neurog2 and that were enriched in H3K4me3 and H3K27me3 signal were selected.

Quantification and statistical analysis
Statistical analysis was carried out in GraphPad Prism (v.6), Origin (v.8.0), and the R (v3.5.1, "Feather Spray") statistical computing and graphics platform. Data are presented as means ± SEM or with box-and-whisker plots, which give the median, 25th and 75th percentiles, and range. Statistical comparison between groups was performed using a paired or unpaired two-tailed Student's t test, or a two-tailed nonparametric Mann-Whitney U test when data failed a Kolmogorov-Smirnov or a Shapiro-Wilk normality test. For multiple comparison analysis, a one-way analysis of variance (ANOVA) test with Holm-Sidak's multiple comparisons test was used, and a Kruskal-Wallis test with Dunn's multiple comparisons test was used when data failed a Kolmogorov-Smirnov or a Shapiro-Wilk normality test. Simple effect analysis was performed when interaction was significant. P values < 0.05 were considered statistically significant and set as follows: *P < 0.05, **P < 0.005, and ***P < 0.0005. In the bioinformatic analysis, DEGs were identified using a statistical significance threshold (BH-adjusted P value < 0.1) and set as follows: *adj. P < 0.1, **adj. P < 0.01, and ***adj. P < 0.001. No statistical methods were used to predetermine the sample size, but our sample sizes are considered adequate for the experiments and consistent with the literature. The mice were not randomized. The investigators were blinded to sample identity.
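The two significance scales above can be summarized in a small Python helper that maps a P value to its annotation. The function name and the "ns" label for nonsignificant values are assumptions made for illustration; the cutoffs are those stated above for raw and BH-adjusted P values.

```python
def significance_stars(p, adjusted=False):
    """Map a P value to the star annotation used for raw or BH-adjusted values."""
    # Cutoffs ordered from most to least stringent
    cutoffs = (0.001, 0.01, 0.1) if adjusted else (0.0005, 0.005, 0.05)
    for stars, cutoff in zip(("***", "**", "*"), cutoffs):
        if p < cutoff:
            return stars
    return "ns"  # hypothetical label for nonsignificant results

raw_labels = [significance_stars(p) for p in (0.03, 0.004, 0.0001, 0.2)]
adj_labels = [significance_stars(p, adjusted=True) for p in (0.05, 0.005, 0.0005)]
```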