An Integrated Strategy for Analyzing the Unique Developmental Programs of Different Myoblast Subtypes

1 Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
2 Howard Hughes Medical Institute, Boston, Massachusetts, United States of America
3 Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America

Greg Gibson, Editor
North Carolina State University, United States of America

#Contributed equally.
* To whom correspondence should be addressed. E-mail: michelson/at/receptor.med.harvard.edu
¤a Current address: Centre de Recherche du Centre Hospitalier de l'Université Laval (CRCHUL), Québec, Canada
¤b Current address: Department of Dermatology, Massachusetts General Hospital, Charlestown, Massachusetts, United States of America
¤c Current address: Department of Biochemistry and Center of Excellence in Bioinformatics, State University of New York at Buffalo, Buffalo, New York, United States of America

Received November 21, 2005; Accepted December 28, 2005.

Copyright: © 2006 Estrada et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

An important but largely unmet challenge in understanding the mechanisms that govern the formation of specific organs is to decipher the complex and dynamic genetic programs exhibited by the diversity of cell types within the tissue of interest. Here, we use an integrated genetic, genomic, and computational strategy to comprehensively determine the molecular identities of distinct myoblast subpopulations within the Drosophila embryonic mesoderm at the time that cell fates are initially specified.
A compendium of gene expression profiles was generated for primary mesodermal cells purified by flow cytometry from appropriately staged wild-type embryos and from 12 genotypes in which myogenesis was selectively and predictably perturbed. A statistical meta-analysis of these pooled datasets—based on expected trends in gene expression and on the relative contribution of each genotype to the detection of known muscle genes—provisionally assigned hundreds of differentially expressed genes to particular myoblast subtypes. Whole embryo in situ hybridizations were then used to validate the majority of these predictions, thereby enabling true-positive detection rates to be estimated for the microarray data. This combined analysis reveals that myoblasts exhibit much greater gene expression heterogeneity and overall complexity than was previously appreciated. Moreover, it implicates the involvement of large numbers of uncharacterized, differentially expressed genes in myogenic specification and subsequent morphogenesis. These findings also underscore a requirement for considerable regulatory specificity for generating diverse myoblast identities. Finally, to illustrate how the developmental functions of newly identified myoblast genes can be efficiently surveyed, a rapid RNA interference assay that can be scored in living embryos was developed and applied to selected genes. This integrated strategy for examining embryonic gene expression and function provides a substantially expanded framework for further studies of this model developmental system. Synopsis Animal development requires cells in complex organs to acquire distinct identities. During the development of the body wall musculature of the fruit fly, a pool of apparently identical cells gives rise to two types of muscle precursors, both of which are required for the appearance of functioning muscles. These identities depend on broad programs of gene expression. 
The authors attempt to dissect the complements of expressed genes that define these two different cell types by integrating modern methods in genetics, genomics, and informatics. By purifying informative cells from normal embryos and mutants that perturb muscle development, assaying their genomewide gene expression programs, and combining experiments statistically, they have identified fivefold more founder-specific genes than were previously suspected to characterize this cell type. The expression patterns of hundreds of genes were examined in whole embryos to test the statistical predictions, permitting the authors to estimate how many more cell type–specific genes remain to be discovered. Finally, dozens of the genes highlighted by these methods were tested for direct involvement in muscle development, and several new players in this process are reported. The integrated strategy used here can be generalized for studying genetic programs in other complex tissues. Introduction Transcriptional regulation plays a central role in metazoan development by establishing cell-specific patterns of gene expression that represent coordinate responses to extrinsic signals and intrinsic programming [1,2]. Thus, detailed knowledge of the genes that are spatially and temporally coexpressed at the cellular level in a particular developmental context will not only provide insight into the logic of transcriptional networks but also define the downstream effectors of morphogenesis. Given the cellular diversity present in most tissues, it would be ideal to derive the entire genetic program of each individual cell type and to determine the response of each differentially expressed gene to perturbations of the pathways that regulate formation of that organ. Defining such cell-specific gene expression signatures and mapping the sequential steps involved in their generation are both essential to achieving a systems-level view of development [3,4]. 
Traditional studies have monitored only one or a few cell-type specific markers at a time using different genetic backgrounds to perturb the developmental process of interest. In many cases, such approaches have yielded sets of regulatory inputs and responses that provide the conceptual underpinnings for considering development in the broader terms of component interactions and network architecture [5,6]. However, to test the generality of hypotheses derived from the study of small numbers of genes, it is essential to acquire a comprehensive assessment of the gene expression changes occurring in response to a known set of developmental regulators. Elaborating an integrated and systematic experimental approach to identify and functionally characterize such genes and their cis-regulatory sequences in a metazoan model organism remains a significant and largely unsolved challenge. In yeast, pooled expression profiles derived for multiple genotypes and chemical treatments have proved extremely valuable for dissecting biological pathways [7]. In principle, it should be possible to generate equally illuminating expression profile compendia for the development of multicellular organisms. Large numbers of datasets have been combined in a few cases for this purpose [8,9], but these studies did not focus on a particular aspect of development. Here, we have used such a comprehensive approach to examine the molecular identities of myoblast subtypes in the Drosophila embryo, results that yield new information about the composition of the muscle regulatory network. Myogenesis initiates with the segregation of two types of myoblasts from the somatic mesoderm: founder cells (FCs) and fusion-competent myoblasts (FCMs) [10]. Each FC possesses a unique identity and seeds the formation of an individual myotube by fusing with the more homogeneous population of FCMs. Of the known early muscle-specific genes, some are specific to only one myoblast type, while others are expressed in both. 
Many of these genes encode transcription factors that are essential for myoblast specification [11–16]. Intercellular signals act in different combinations to promote the formation of FCs and FCMs [10,17]. This process is best understood for a subset of FCs that express even skipped (eve) [18–21]. Wingless (Wg, a Wnt family member) and Decapentaplegic (Dpp, a member of the bone morphogenetic protein superfamily) first cooperate to render a large domain of mesodermal cells competent to respond to a subsequent inductive signal mediated by two receptor tyrosine kinases (RTKs), an epidermal growth factor (EGF) receptor (EGFR) and the fibroblast growth factor (FGF) receptor (FGFR) encoded by heartless (htl). Localized RTK activation within the competence domain stimulates the Ras pathway and the formation of Eve-expressing equivalence groups [18]. Lateral inhibitory signaling by Notch then allows a single Eve progenitor to emerge from each equivalence group under the continued influence of Ras [19], with the remaining Notch-inhibited cells assuming an FCM identity characterized by expression of lame duck (lmd) [14–16]. Since FCs are derived by the asymmetric division of progenitors [22,23], the Ras pathway favors FC formation, while Notch promotes FCM development from mesodermal equivalence groups. Integration of the Wg, Dpp, and Ras pathways occurs through the direct convergent regulation of eve by the three corresponding signal-activated transcription factors bound to a specific enhancer in the context of two mesodermal selectors [24–26]. Thus, distinct myoblast identity codes are generated by the combinatorial functions of Wg, Dpp, EGF, FGF, and Notch signals. These signaling codes are in turn mirrored in transcriptional codes that induce the changes in gene expression that are characteristic of individual FCs and FCMs. 
Collectively, this knowledge provides the logical foundation for genomic and computational investigations of muscle gene transcriptional regulation in the Drosophila embryo. Gene expression profiling of the Drosophila embryonic mesoderm has been undertaken in several prior studies. In one approach, mutations in early dorsoventral patterning genes were used to eliminate or overproduce mesodermal cells, and genes whose expression is enriched in the mesoderm were identified by microarray analysis [14,27]. A modification of this approach in which the Ras or Notch pathway was constitutively activated in a Toll10b mutant—a genetic background that drastically disrupts gastrulation and converts the entire embryo to mesoderm—led to the identification of a small number of genes that are specific to FCs or FCMs [28]. However, the latter study was limited by several factors, including the complete lack of inductive ectoderm and its differentiated derivatives in Toll10b embryos, the absence of Dpp in these embryos, the disruption of normal cellular interactions within the overproduced mesoderm, independent validation of only a few microarray predictions so that a true-positive detection rate could not be reliably estimated, and the use of a cDNA microarray that represented only 40% of the genes in the entire Drosophila genome. It is likely, therefore, that many more FC and FCM genes remain to be discovered. To address this question, we designed a different strategy for analyzing cell type–specific genetic programs for a complex tissue that circumvents the previously encountered difficulties and is more generally applicable. This approach integrates genetic perturbations of development, purification of primary embryonic cells of interest, microarray-based genomewide transcriptional profiling, statistical meta-analysis of the pooled gene expression datasets, and large-scale validation by in situ hybridization of gene expression patterns predicted by the computational analysis. 
Applying this strategy, we identified and validated several hundred genes that are uniquely expressed in FCs, FCMs, or both myoblast types. Finally, we used in vivo RNA interference (RNAi) to rapidly assess the myogenic functions of several newly identified myoblast genes. In a separate but complementary effort, information derived from the present studies was applied to a new computational method for analyzing the relative contribution of individual transcription factor binding sites to combinatorial transcriptional codes (A. A. Philippakis, B. Busser, S. S. Gisselbrecht, F. S. He, B. Estrada, A. M. Michelson, and M. L. Bulyk, unpublished data). Taken together, the systematic strategy used here provides significant new insights into embryonic myogenesis and represents an integrated experimental framework that can be applied to related investigations in other developmental contexts.

Results

Purification of Mesodermal Cells by Flow Cytometry

To increase the sensitivity of detecting myoblast transcripts in microarray expression profiling experiments, we first developed a method to purify both wild-type and mutant cells of interest from whole Drosophila embryos. Green fluorescent protein (GFP) was targeted to the mesoderm using the Gal4-UAS technique, with twi-Gal4 as a specific driver and a UAS-GFP transgene as the reporter (Figure 1).
Embryos were collected, incubated to the stage during which FCs and FCMs are specified, and then gently dissociated to yield a single cell suspension. GFP-expressing and non–GFP-expressing cells were separated by fluorescence activated cell sorting (FACS), total cellular RNA was isolated from each population, and the RNA was labeled for hybridization to Affymetrix GeneChip arrays (Figure 1).

Identification of Genes with Enriched Expression in Wild-Type Mesodermal Cells

We first compared the RNA profiles for GFP-positive versus GFP-negative cells purified from wild-type embryos. Using the statistical methods detailed in Protocol S1, Analysis Method A, we identified 335 probe sets as having higher expression levels in GFP-positive cells than in the rest of the embryo. Of these, approximately 200 had not previously been described as having mesodermal expression. To validate these results, we undertook in situ hybridizations in wild-type embryos using probes corresponding to 207 genes enriched in the GFP-positive population (including some that had been described previously but had not been extensively characterized). Combining these results with data from the literature, we calculated a true-positive detection rate of 95.3% for genes enriched in GFP-expressing cells. Genes expressed in a wide variety of mesodermal derivatives were identified, including somatic and visceral muscle precursors, fat body, hemocytes, and heart (Figure S1 and Table S1). Having established the feasibility of expression profiling FACS-purified mesodermal cells, we designed further experiments to more completely characterize the expression programs of different myoblast subpopulations.

Prediction of Candidate Myoblast-Specific Genes from a Compendium of Mesodermal Gene Expression Profiles

A key feature of our experimental strategy is the use of specific genetic backgrounds to selectively perturb gene expression based on existing knowledge of relevant developmental pathways.
The intercellular signaling network involved in Drosophila FC and FCM development is shown in Figure 2.
A compendium of gene expression profiles specifically targeted to muscle development was generated for mesodermal cells purified from 12 genetic backgrounds (Figure 2).
To score the genes with respect to FC- or FCM-like expression response, we used a statistical metric (“T”) [34], which is a weighted sum of the t-statistics from each genotype versus wild-type comparison (Protocol S1, Analysis Method E). The weights in this sum were optimized to account for the differential sensitivity of the genotypes in detecting training sets of FC or FCM genes (Figure 3).

When all genes were ordered based on their FC and FCM T-scores, both training sets were preferentially located at the tops of their respective ranks (P < 10^-13 for FC genes and P < 10^-14 for FCM genes, using the Wilcoxon-Mann-Whitney U test; Table S2 and Figure 3).

From the targeted expression profile compendium, we predicted a total of 373 (q = 0.002) and 276 (q = 0.002) genes with FC- and FCM-like responses, respectively (Protocol S1, Analysis Method F; Figure S2B and S2D). After extensive follow-up using in situ hybridization, lists of validated FC, FCM, or FC + FCM genes were then queried for relative enrichment of Gene Ontology (GO) terms (Table S3). For FC genes, overrepresented molecular function categories include transcriptional regulation, transmembrane receptor protein kinase activity, cytoskeletal protein binding, and small GTPase regulatory/interacting proteins, with enrichment for biological processes such as cell surface receptor–linked signal transduction, cell adhesion, cell motility, small GTPase mediated signal transduction, and mesoderm cell fate specification. In contrast, the validated FC + FCM gene candidates are biased toward ribosome and protein biosynthesis. There were too few validated FCM genes to yield many statistically enriched GO terms, but the two that passed our cutoff criteria were muscle and mesoderm development.
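The scoring idea described above, a weighted sum of per-genotype t-statistics followed by ranking, can be illustrated with a minimal Python sketch. This is not the procedure of Protocol S1: the Welch form of the t-statistic, the genotype labels, the weights, and the gene names are all hypothetical choices made for illustration, and the weight optimization against training sets is not reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(mutant, wild_type):
    """Welch t-statistic for one gene: mutant replicates vs wild-type replicates.

    Assumption: the source does not state which t-statistic variant was used.
    """
    se2 = stdev(mutant) ** 2 / len(mutant) + stdev(wild_type) ** 2 / len(wild_type)
    return (mean(mutant) - mean(wild_type)) / sqrt(se2)


def t_score(t_by_genotype, weights):
    """Combined score: weighted sum of the per-genotype t-statistics."""
    return sum(weights[g] * t for g, t in t_by_genotype.items())


def rank_genes(t_stats, weights):
    """Return gene names ordered from highest to lowest combined score."""
    scores = {gene: t_score(ts, weights) for gene, ts in t_stats.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical weights for an FC-like score: a genotype expected to increase
# FC gene expression gets a positive weight, one expected to decrease it a
# negative weight. Real weights would be optimized on training genes.
FC_WEIGHTS = {"ras_act": 1.0, "notch_act": -0.5}

# Hypothetical per-genotype t-statistics (each genotype vs wild type).
T_STATS = {
    "geneA": {"ras_act": 4.0, "notch_act": -3.0},  # FC-like response
    "geneB": {"ras_act": -2.0, "notch_act": 2.5},  # FCM-like response
    "geneC": {"ras_act": 0.3, "notch_act": 0.1},   # little response
}

if __name__ == "__main__":
    # Triplicate expression values, matching the triplicate hybridizations.
    print(welch_t([3.0, 4.0, 5.0], [1.0, 2.0, 3.0]))
    print(rank_genes(T_STATS, FC_WEIGHTS))
```

Signed weights let genotypes with opposite expected effects on a cell type reinforce rather than cancel each other in the combined score.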
We next clustered the expression profiling data derived for all genotypes and found that both the training sets and subsequently identified FC and FCM genes segregate into two broad subclusters for each cell type (Figure 3).

Validation of Results Derived from the Targeted Expression Profile Compendium

To validate microarray meta-analysis predictions, in situ hybridizations were performed for large numbers of genes using embryos with informative genotypes. For example, since Ras gain-of-function and Dl loss-of-function overproduce FCs at the expense of FCMs [18,19,38,39], a gene specifically expressed in FCMs or FCs should have reduced or increased expression, respectively, in these genetic backgrounds (Figure 4).
To assess the accuracy of the meta-analysis, we examined how many true positives are found among the genes highly ranked as being expressed in each type of myoblast (Table S2). Of 213 randomly selected genes from among the top-ranked 373 FC candidates, 118 (55%) were validated as authentic FC genes, that is, actually expressed in founder cells by embryonic in situ hybridizations in the above-mentioned genetic backgrounds. When 123 of the predicted 276 FCM genes were similarly examined by in situ hybridization, 18 (15%) were found to have FCM-specific expression patterns, while an additional 40 (33%) were found to be expressed in both FCs and FCMs. Taken together, these findings suggest that, while FC gene predictions derived from the present experimental design are very accurate, the hypothesized specificity of the genetic manipulations for FCM genes is confounded by genes that are expressed in both myoblast types. Of note, this conclusion could only be derived from the large-scale in situ hybridization data obtained here, experiments that have not frequently been undertaken in other transcriptional profiling studies to validate microarray results. Using the present findings, it is apparent that a previous microarray-based study also had a significant false-positive rate of FCM gene prediction, although the authentic FC gene discovery rate in that case was comparably high. However, it is important to note that significantly fewer total gene numbers were detected in the earlier study for both myoblast classes [28] (see Table S1 for details). Pooling all of the currently available data, 160 FC and 51 FCM genes are known, of which 131 and 45, respectively, were identified and validated in the present studies. Extrapolating from our findings, we estimate that FCs and FCMs actually express a total of about 321 and 82 unique genes, respectively (see Protocol S1, Analysis Method F). 
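The validation percentages quoted above follow directly from the reported counts, as the short Python sketch below shows. The Wilson score interval is an addition for illustration only and is not part of the original analysis; the extrapolated totals (about 321 and 82 genes) come from Protocol S1, Analysis Method F, and are not reproduced here.

```python
from math import sqrt


def validated_fraction(validated, tested):
    """Fraction of tested candidates confirmed by in situ hybridization."""
    return validated / tested


def wilson_interval(validated, tested, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Added for illustration; the source reports point estimates only.
    """
    p = validated / tested
    denom = 1 + z ** 2 / tested
    center = (p + z ** 2 / (2 * tested)) / denom
    half = z * sqrt(p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)) / denom
    return center - half, center + half


# Counts taken from the text: (validated, tested).
COUNTS = {
    "FC genes among tested FC candidates": (118, 213),
    "FCM-specific genes among tested FCM candidates": (18, 123),
    "FC + FCM genes among tested FCM candidates": (40, 123),
}

if __name__ == "__main__":
    for label, (validated, tested) in COUNTS.items():
        low, high = wilson_interval(validated, tested)
        frac = validated_fraction(validated, tested)
        print(f"{label}: {frac:.1%} (interval {low:.1%} to {high:.1%})")
```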
Differential Regulation of FCM Genes by the Zinc Finger Transcription Factor, Lmd

Expression of the vast majority of newly identified FCM genes requires lmd, which encodes a transcription factor that is essential for FCM development [14–16] (Figure 4).
Functional Analysis of Newly Identified Myoblast Genes

To screen for the developmental functions of newly identified myoblast genes, we modified a whole embryo RNAi assay [40] to permit the rapid scoring of muscle patterning phenotypes. Double-stranded RNAs (dsRNAs) were injected into blastoderm embryos expressing a tau-GFP fusion protein under myosin promoter control, which enables the complete muscle pattern to be visualized after the embryos develop [41] (Figure 6).
Selected RNAi results are shown in Figure 6.

The live embryo RNAi assay can also be used to identify genes involved in muscle function. We found that the muscle pattern was entirely normal in embryos injected with CG2708 dsRNA, but these muscles never contracted, unlike those of age-matched control embryos (Video S1). CG2708 is expressed only in FCMs (Figure 4).

Finally, an RNAi phenotype was obtained for chickadee (chic), which encodes a Drosophila profilin homolog [47] and is expressed specifically in FCMs. RNAi for chic is associated with complete absence of cellularization at the blastoderm stage (data not shown), presumably due to dsRNA effects on both maternal and zygotic transcripts. Due to its maternal expression and essential involvement in oogenesis, it has not previously been possible to assess the early embryonic functions of chic using germline clonal analysis [48], underscoring another advantage of the RNAi approach used here.

Discussion

We have used an integrated strategy for systematically studying the development of a complex tissue by combining genetic perturbations of a particular biological process, computational analysis of a compendium of gene expression profiles that is targeted to the tissue by FACS purification of the cells of interest, large-scale validation of predicted gene expression patterns by whole embryo in situ hybridization, and RNAi-based functional studies of newly discovered genes. Specifically, we identified large numbers of genes that are coexpressed in different subsets of myoblasts by analyzing pooled microarray data obtained for embryonic mesodermal cells purified from multiple genetic backgrounds in which muscle development is selectively perturbed. A whole embryo RNAi assay then revealed the developmental functions of selected myoblast-specific genes.
Collectively, the present work contributes valuable information to a more detailed understanding of the regulatory network governing somatic myogenesis in the Drosophila embryo, provides a substantially expanded framework for future studies of this developmental process, and offers a unified experimental approach that can be applied to other systems. Transcriptional Profiling of Complex Tissues Cell-specific genetic programs must be delineated in order to fully understand how diverse cellular identities are established during tissue and organ formation. Previous studies have addressed various aspects of metazoan development by combining genetic and genomic methods [9,14,27,28,49–55]. While highly informative for temporal aspects of gene expression in whole animals [50], in revealing sex-biased transcription [53], or in yielding cell-specific wild-type expression profiles [49,51,54,55], such studies have not examined the global changes in gene expression that are associated with genetic manipulations of regulatory pathways affecting the tissue of interest. Mutants that perturb large numbers of cells arising from subdomains of an embryonic axis have been used to enrich for the detection of tissue-specific transcripts, a strategy that works best for early aspects of development [14,27,28]. However, this genetic approach complicates the analysis of later steps in organogenesis since tissue organization and intercellular communication are severely disrupted by these major patterning mutations [28]. Perturbation of a single regulatory pathway in whole embryos has also been used for the discovery of cell-specific genes, but efforts like this have been limited by very high false-positive detection rates because the signal from the cells of interest is diluted by the rest of the embryo [52]. The present approach provides two major advantages for determining the gene expression programs of separate cell types in a developing embryo. 
First, isolating the tissue of interest—even without purifying individual cell populations—substantially increases the sensitivity of microarray experiments. Second, perturbation of multiple convergent pathways significantly augments both the statistical and biological power of the microarray compendium to resolve cell-specific expression patterns. While independent replicas of the same genotype yield statistical power, use of multiple genotypes has the additional benefit of reducing systematic biases that may be associated with a single genetic manipulation. Indeed, we found that different genotypes have distinct capacities to detect FC versus FCM genes, suggesting that perturbing multiple pathways is a more effective means to query diverse cell types present in the isolated tissue. For instance, the overall sensitivity of the approach is reflected in the high FC meta-analysis rank obtained for eve (108), even though it is expressed in less than 1% of mesodermal cells. Purification of specific cells and the inclusion of multiple informative genotypes in the acquisition of genomewide expression data for a particular tissue—what we have termed a targeted expression profile compendium—provide additional information that has not been available from prior genomic studies of mesoderm development [14,27,28]. For example, a related microarray analysis of myoblast gene expression [28] predicted a total of only 33 FC and 48 FCM genes compared with 373 and 276, respectively, predicted here. Several important differences in experimental design can account for the disparate outcomes of the two approaches, including use of different numbers of genetic perturbations of FC and FCM development (two in the previous study versus 12 here), different microarray platforms representing dissimilar fractions of the genome, and the absence of Dpp as an FC determining signal in the embryos used in the earlier study [28]. 
In this regard, we found that Dpp contributes significantly to FC gene identification, so its inclusion in any experimental analysis of muscle development appears to be critical. Our findings emphasize the importance of independently validating microarray data and computational predictions of genes expressed in different cell populations. Whereas whole embryo in situ hybridizations revealed that the FC gene prediction rate was very high, the fraction of true positive FCM genes was considerably smaller when the same datasets were analyzed using a similar rationale and statistical methods. The in situ hybridization results further demonstrated that the observed difference in the accuracy of FC and FCM gene prediction rates is largely attributable to an unanticipated number of genes expressed in both myoblast types that, from the microarray data analysis alone, were incorrectly scored as FCM-specific genes. This last outcome most likely occurred because transcripts expressed in both FCs and FCMs followed an FCM-specific pattern in the genetic perturbation and microarray experiments owing to the fact that FCMs greatly outnumber FCs in the purified cell fraction. This issue notwithstanding, the integrated approach we used facilitated the efficient identification of several hundred genes having different myoblast-specific expression patterns while entailing quite manageable false positive detection rates. The transcriptional profiling strategy elaborated here offers an information-rich approach that can be applied to other model organisms and developmental processes. Indeed, because the present experiments employed a general mesodermal Gal4 driver, the existing compendium of expression profiles should be applicable to mesodermal derivatives other than somatic muscle. Consistent with this expectation, a preliminary meta-analysis using a relevant subset of the present data was effective in predicting genes with cardiac expression (SEC and AMM, unpublished results). 
The sensitivity and specificity of these analyses can be further optimized by using the most appropriate combination of mutants, and by selectively targeting GFP for cell purification. Perhaps most important, the collective expression data obtained from such experiments provide vast amounts of information about the various regulatory inputs to each identified gene and allow detailed molecular signatures to be derived for specific cells within a complex tissue. Unanticipated Complexity of the Drosophila Muscle Regulatory Network Muscle FCs are specified by the convergent inputs of multiple intercellular signals [10,17]. The differential expression of a few cell-specific markers has in the past suggested that individual FCs have distinct signaling responses, causing each to acquire a unique identity prior to its differentiation into a particular muscle. With the discovery of substantially more genes expressed in different FC subsets, the present work substantiates this hypothesis. Moreover, earlier studies anticipated that distinct but related transcriptional codes would be responsible for different patterns of FC gene expression [24,56]. This model is supported by recent computational and empirical analyses of candidate cis-regulatory modules associated with the FC genes newly identified here (A. A. Philippakis, B. Busser, S. S. Gisselbrecht, F. S. He, B. Estrada, A. M. Michelson, and M. L. Bulyk, unpublished data). In contrast to FCs, the FCM population has been thought to be relatively homogeneous [15,16], an idea that is not supported by our findings. Rather, this second myoblast class is quite heterogeneous, and the control of FCM gene expression—while having some common features—is not uniform. For example, although transcription of most FCM genes requires lmd, others are entirely lmd independent. 
Still other FCM genes exhibit regional differences in their responses to perturbations of Ras and Notch signaling, while some lmd-dependent genes are not expressed in all FCMs in which Lmd is found. Finally, a subset of FCM genes is differentially controlled by Ras, Notch, and Lmd in the somatic and visceral subdivisions of the mesoderm, even though both types of muscle arise through fusion of similar myoblasts [57]. FCs and FCMs were found to have gene expression signatures comprising large numbers of unique genes, as well as numerous shared transcripts. Whereas transcription factors, signal transduction components, and adhesion molecules are overrepresented in FCs, proteins associated with metabolic functions predominate in both myoblast classes. The prominent expression of regulatory genes in FCs is in agreement with prior evidence that these myoblasts contain specific determinants of muscle identity [18,42] and suggests that cell fusion plays an important role in the acquisition of unique genetic programs by individual myotubes. Functions of Newly Identified Myoblast Genes The specific functions of each myoblast type are further emphasized by our RNAi results. For example, sola—which encodes the Drosophila homolog of verprolin, an actin binding protein—is expressed only in FCMs and is essential for myoblast fusion. Moreover, profilin, another actin binding protein encoded by chic [48], is also restricted to FCMs. These findings imply a different function or mode of regulation of the actin cytoskeleton in FCMs as opposed to FCs during fusion. While the cytoskeleton has previously been implicated in myotube formation [58], an asymmetrically expressed cytoskeletal component has not been uncovered, further highlighting the unique nature of the cytoskeleton in these myoblasts. 
In contrast, RNAi directed against the FC-specific gene suel/CG17492 causes an early myospheroid phenotype in a subset of muscles, suggesting a defect in myotube pathfinding and/or in formation of stable epidermal attachments, functions characteristic of FCs [42]. Although whole-genome RNAi screens have proved to be highly informative for C. elegans and for cultured cells where efficient dsRNA delivery methods are available [59], they are technically much more difficult to apply to Drosophila embryos. Restricting a whole embryo RNAi screen to a list of genes having tissue-specific expression patterns offers a more efficient approach to such functional discovery. This concept can also be applied to large-scale RNAi analysis of mouse embryonic development. The experimental strategy presented here has provided substantial insight into the complexity of components involved in muscle development in the Drosophila embryo. Many of our conclusions could only be drawn by examining the large, interrelated datasets that comprise a targeted expression profile compendium. Other findings are derived from more traditional studies of single genes that nevertheless depended on genomewide approaches for their identification. Further analysis of our existing results, and expansion of this database by performing similar experiments with additional informative genotypes and with smaller subsets of purified cells, should yield even greater knowledge of the architecture and function of the myogenic network. Furthermore, application of this integrated set of approaches in other developmental contexts, both in Drosophila and in other model organisms, can offer a systems-level view of cell fate specification and morphogenesis that provides a wealth of hypotheses for further testing by genetic and biochemical methods. Materials and Methods Fly strains and genetics. 
The following Drosophila stocks were used to obtain both wild-type and genetically modified mesodermal cells expressing GFP: twi-Gal4 UAS-2EGFP [30], UAS-λtop (constitutively activated EGFR) [60], UAS-dof UAS-λ-htl (constitutively activated Heartless FGFR together with Downstream of FGFR/Heartbroken/Stumps) [61,62], UAS-Ras1Act (activated Ras) [18], UAS-pntP2VP16 (activated Pointed) [24], UAS-tkvQD (activated Thick veins) [63], UAS-arms10 (activated Armadillo) [64], UAS-arms10; UAS-Ras1Act, SG24 wgCX4/CyO, wgIG22 UAS-2EGFP, UAS-Nintra [65], twi-Gal4 lmd1/TM3 ftz-lacZ, UAS-2EGFP lmd2/TM3 ftz-lacZ, twi-Gal4 DlX/TM3 ftz-lacZ, and UAS-2EGFP DlX/TM3 ftz-lacZ. The following stocks were used to determine gene expression patterns in mutant backgrounds: twi-Gal4, UAS-Ras1Act, DlX/TM3 ftz-lacZ, and lmd1/TM3 ftz-lacZ. The enhancer trap line rp298lacZ was used to test for localization of gene expression to founder cells [32].

Fluorescence-activated sorting of cells from Drosophila embryos. Freshly laid embryos were collected and aged to stage 11, at which point a single cell suspension was prepared. Cells were separated into GFP-positive and -negative cell populations using a flow cytometer (see Protocol S1 for details).

Microarray experiments and data analysis. Total cellular RNA (2.5 to 3 μg) was labeled in one round of linear amplification and used for hybridization to a single Affymetrix GeneChip using standard methods recommended by the manufacturer (http://www.affymetrix.com/support/technical/manual/expression_manual.affx). Each RNA sample was independently labeled and hybridized in triplicate. A detailed description of all computational methods used for analyzing the expression data can be found in Protocol S1.

In situ hybridization and immunohistochemistry. Digoxigenin-labeled antisense RNA probes were synthesized using cDNA clones obtained from the Drosophila Gene Collection (DGC1 and DGC2, http://www.fruitfly.org/DGC/index.html).
For genes without an available cDNA, gene-specific PCR primers were designed. A microtiter plate method was used for parallel synthesis of multiple probes (http://www.fruitfly.org/about/methods/RNAinsitu.html). In calculating the true-positive detection rate for genes enriched in wild-type GFP-expressing cells, we considered as true positive every gene validated as having mesodermal expression by our in situ hybridizations or annotated as such in the BDGP in situ database or in the published literature (Table S1). A small number of genes included in this GFP-positive category were expressed in nonmesodermal cells that nevertheless expressed GFP at stage 11 under twi-Gal4 control (for example, due to GFP perdurance in cells of the endodermal and mesectodermal primordia, in which twi is expressed at earlier stages [unpublished data]). Antibody stainings were carried out as described [18]. Rabbit anti-Lmd (from H. Nguyen) was used at 1:1,000. Homozygous Dl or lmd mutant embryos were identified using a lacZ-marked TM3 balancer chromosome.

RNA interference assay. Gene segments for dsRNA synthesis were selected to be 300 to 700 bp in length, to be common to all predicted splice variants of the targeted gene, and to lack any stretch of 18 consecutive bp identical to any other predicted gene. These sequences were PCR-amplified from primary embryonic cDNA using primers that incorporated T7 promoters on both ends (primer sequences are available upon request). Purified PCR product was transcribed in vitro and purified using the MEGAscript RNAi kit (Ambion, Austin, Texas, United States), precipitated, resuspended, and diluted to 2 mg/ml in DEPC-treated 1× injection buffer [40]. Dechorionated MHC-tau-GFP embryos [41] were injected mid-ventrally during the syncytial blastoderm stage, then allowed to develop to stage 16 to 17 before assessment.
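The three segment selection criteria stated for the RNA interference assay (length of 300 to 700 bp, presence in every predicted splice variant, and no 18-bp run of identity to another predicted gene) can be illustrated with a minimal sketch. This is not the procedure actually used in the study, which is not described in further detail. All function names are hypothetical, sequences are assumed to be uppercase ACGT strings, and checking the reverse-complement strand of other genes is an added assumption motivated by the double-stranded nature of the reagent.

```python
# Illustrative sketch only: function names and the reverse-strand check are
# assumptions, not the authors' published procedure.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_offtarget_index(other_genes, k=18):
    """Collect every k-mer found on either strand of the non-target genes."""
    index = set()
    for gene in other_genes:
        index |= kmers(gene, k)
        index |= kmers(reverse_complement(gene), k)
    return index


def shares_kmer(segment, index, k=18):
    """True if the segment contains any k-mer present in the off-target index."""
    return not kmers(segment, k).isdisjoint(index)


def pick_segment(variants, other_genes, min_len=300, max_len=700, k=18):
    """Return the first (longest, leftmost) window of the first splice variant
    that occurs in every variant and shares no k-mer with any other gene,
    or None if no such window exists."""
    index = build_offtarget_index(other_genes, k)
    reference = variants[0]
    for length in range(max_len, min_len - 1, -1):
        for start in range(len(reference) - length + 1):
            segment = reference[start:start + length]
            if not all(segment in variant for variant in variants):
                continue  # not common to all predicted splice variants
            if shares_kmer(segment, index, k):
                continue  # has a k-bp run of identity to another gene
            return segment
    return None
```

Calling `pick_segment(variants, other_genes)` with the default arguments applies the 300 to 700 bp and 18-bp thresholds given in the text. The exhaustive window scan is chosen for clarity rather than speed; a genome-scale design would more plausibly rely on an alignment tool.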
Each gene was initially injected and scored blindly, with negative control (lacZ dsRNA) and positive control (mbc or blow dsRNA) injections performed in parallel. Only embryos that developed robust GFP expression and lacked obvious major morphological defects (typically 60% to 80% of those injected) were included in the analysis.

Figure S1: Embryonic Expression Patterns of Selected Genes Identified in Microarray Experiments as Being Enriched in Wild-Type Mesoderm

RNA in situ hybridization shows that validated mesodermally enriched genes are expressed in different populations of mesodermal cells at stage 11, including somatic and visceral muscle precursors (A, C, D–N, P and A, C–L, O, respectively), hemocytes (O), and cardiac primordium (D, E, I, and L–N). Arrowhead (I) indicates representative cardiac primordium; arrow (K) indicates visceral mesoderm; asterisk (O) indicates hemocytes.

(3.0 MB PDF)

Figure S2: Supporting Figures for FC and FCM Gene Meta-Analyses

(A and C) Bar plot showing the weight of each genotype in the meta-analysis to identify genes with FC- and FCM-like expression, respectively (Protocol S1, Analysis E). Error bars show the standard deviation of weights within the approximately 2,000 weight profiles used to calculate each average weight profile.

(B and D) Normalized median absolute deviation between the meta-analysis gene rank (x-axis) and individual genotype ranks (Protocol S1, Analysis F). The graph shows the average over all the genotypes, using the weights in (A) and (C), respectively. The black vertical line highlights the point at which the data cross the trend line (blue) derived from a smoothing function (see Protocol S1, Analysis Method F).
(152 KB PDF)

Table S1: Description of Mesodermally Enriched Genes; Validated FC, FCM, and FC + FCM Genes; and Results from In Situ Hybridization and RNAi Experiments

(337 KB XLS)

Table S2: Meta-Analysis of the Transcriptional Profiling Data: Ranking of All Affymetrix Probe Sets by FC- or FCM-Like Gene Expression Pattern

(5.9 MB XLS)

Table S3: Comparison of Top-Ranking FC and FCM Gene Lists in Terms of GO Functional Category Enrichment

(48 KB XLS)

Video S1: Inactivation of the FCM Gene CG2708 by Injection of dsRNA Renders Embryos (right) Immotile When Compared with Age-Matched Embryos Injected with an Inactive Control dsRNA (left)

Confocal images of GFP fluorescence were collected at 5-s intervals and presented at five frames per second.

(1.4 MB MOV)

Accession Numbers

Microarray data described in the text are available from the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo) with the accession number GSE3854. FlyBase (www.flybase.org) ID numbers for genes cited in the text are eve, FBgn0000606; wg, FBgn0004009; dpp, FBgn0000490; Egfr, FBgn0003731; htl, FBgn0010389; Ras85D, FBgn0003205; N, FBgn0004647; lmd, FBgn0039039; Tl, FBgn0003717; twi, FBgn0003900; duf, FBgn0028369; Dl, FBgn0000463; CG13503, FBgn0034695; CG17492, FBgn0032742; CG2708, FBgn0010812; chic, FBgn0000308; dof, FBgn0020299; ftz, FBgn0001077; blow, FBgn0004133; CG14207, FBgn0031037; CG10275, FBgn0032683; CG10641, FBgn0032731; NHP2, FBgn0029148; RpI135, FBgn0003278; sns, FBgn0024189; GFP, FBgn0014446; Gal4, FBgn0014445; and lacZ, FBgn0014447.
Acknowledgments

We thank Jim Skeath, Hanh Nguyen, and Elizabeth Chen for fly stocks and antibodies; Jun Lu for initial advice in preparing embryo cell suspensions; John Daley and Susan Lazo for expert assistance with cell sorting; Josh Bayes, Bryan McGowan, Chris Benway, Lien Phun, Meryl Gold, and Trent Rector for technical support; and Anthony Philippakis, Martha Bulyk, Norbert Perrimon, and Richard Maas for illuminating discussions and comments on the manuscript.

Abbreviations
Footnotes

Author contributions. BE, SEC, MSH, and AMM conceived and designed the experiments. BE, SSG, SM, LR, and BWB performed the experiments. BE, SEC, SSG, SM, LR, BWB, and AMM analyzed the data. BE, SEC, SSG, SM, LR, BWB, MSH, GMC, and AMM contributed reagents/materials/analysis tools. BE, SEC, and AMM wrote the paper.

Funding. Support for this work is derived from Howard Hughes Medical Institute (AMM), National Human Genome Research Institute (NHGRI) grant K22HG002489 (MSH), an NHGRI Centers of Excellence Genomic Science grant (GMC), the Pharmaceutical Research and Manufacturers of America Foundation Center of Excellence in Integration of Genomics & Informatics (SEC), Brigham and Women's Hospital Research Council (SEC), and National Institutes of Health grant NRSA F32 GM67483-01A1 (SEC).

Competing interests. The authors have declared that no competing interests exist.

Citation: Estrada B, Choe SE, Gisselbrecht SS, Michaud S, Raj L, et al. (2006) An integrated strategy for analyzing the unique developmental programs of different myoblast subtypes. PLoS Genet 2(2): e16.

References