Genetics. Sep 2010; 186(1): 405–410.
PMCID: PMC2940304

Global Genetic Robustness of the Alternative Splicing Machinery in Caenorhabditis elegans

Abstract

Alternative splicing is considered a major mechanism for creating multicellular diversity from a limited repertoire of genes. Here, we performed the first study of genetic variation controlling alternative splicing patterns by comprehensively identifying quantitative trait loci affecting the differential expression of transcript isoforms in a large recombinant inbred population of Caenorhabditis elegans, using a new generation of whole-genome very-high-density oligonucleotide microarrays. Using 60 experimental lines, we were able to detect 435 genes with substantial heritable variation, of which 36% were regulated at a distance (in trans). Nonetheless, we find only a very small number of examples of heritable variation in alternative splicing (22 transcripts), and most of these genes colocalize with the associated genomic loci. Our findings suggest that the regulatory mechanism of alternative splicing in C. elegans is robust toward genetic variation at the genome-wide scale, which is in striking contrast to earlier observations in humans.

ALTERNATIVE splicing of pre-mRNAs is part of gene regulation and a major mechanism for increasing the protein repertoire and the resulting phenotypic diversity. Recently, in individual cases variations in number and ratio of splice variants have also been found in Caenorhabditis elegans in different developmental stages (Barberan-Soler and Zahler 2008b), tissues (Kuroyanagi et al. 2007), and genotypes (Fischer et al. 2008). However, the smaller number of alternative splicing patterns (Kim et al. 2007) and their strong evolutionary conservation in C. elegans (Barberan-Soler and Zahler 2008a) have been interpreted as signifying a fundamental difference in the way that worms and vertebrates generate diversity from their genetic information. The relative rarity of alternative splicing and the high degree of stabilizing selection are seen as having parallels in the limited cellular complexity and highly conserved, rigid developmental programs (Zhao et al. 2008) in worms compared to humans. If this is a general trend, and not restricted to just individual cases of splicing, the conservation of splicing patterns should be reflected at the whole-genome level.

In this article we explore this question by extending the genetical genomics strategy (Jansen and Nap 2001) to the characterization of the genetic factors contributing to variations in alternative splicing in 60 C. elegans recombinant inbred line (RIL) strains. This powerful new strategy, also known as expression genetics (Schadt et al. 2003), has emerged in recent years as a versatile tool to study the genetic basis of gene expression by integrating transcriptomics and classical quantitative genetics (Mackay et al. 2009). In this approach, molecular profiling on a large population of densely genotyped individuals is used to map genomic loci that modulate gene expression. This leads to the identification of expression quantitative trait loci (eQTL), i.e., polymorphic genetic loci that cause heritable differences in mRNA concentration. Using high-resolution tiling microarrays we were able to extend this concept to the detection of genetic determinants of alternative splicing (as)QTL and to the detailed quantification of the genetic robustness of the alternative splicing machinery in C. elegans on a genome-wide scale.

MATERIALS AND METHODS

Worm samples, genotyping, and Affymetrix GeneChips:

We used C. elegans recombinant inbred lines that were generated from a cross of N2 and CB4856 and were genotyped by Li et al. (2006). Age-synchronized C. elegans was cultured at 24° and the total RNA was isolated from the late L3 stage using the Trizol method. The RNA was cleaned using the QIAGEN (Valencia, CA) RNeasy Micro RNA cleanup kit. Double-stranded cDNA synthesis was done with the Affymetrix GeneChip WT double-stranded cDNA synthesis kit. We cleaned the cDNA using the GeneChip Sample Cleanup Module also from Affymetrix. For fragmentation and labeling, the GeneChip WT double-stranded DNA terminal labeling kit was used. The concentrations of RNA and cDNA were measured with a Nanodrop. After the fragmentation we determined the fragment size on a Nusieve 3:1 agarose gel. mRNA was hybridized to Affymetrix 1.0 C. elegans tiling arrays (2.9 million probes on each array) and the hybridization was done by ServiceXS (Leiden, The Netherlands). Since polymorphisms in the probe region can lead to spurious local eQTL (Alberts et al. 2007), 80,903 probes with known SNP (including predicted SNP; WS195 release) were removed for subsequent analysis. Each probe is annotated as exonic, intronic, or intergenic, when the entire probe of 25 bp falls in one of the three regions, respectively. Probes spanning exon–intron boundaries are labeled as boundary probes.

Data analysis:

Preprocessing of raw data:

The raw gene expression data from 60 microarrays (one RIL per array) were base-2 log transformed and then quantile normalized. Subsequently, the normalized intensity data were corrected for batch effects using the linear model

$$y_i = \mu + B_i + e_i$$

where yi is the gene's intensity on the ith microarray (i = 1, …, 60), μ is the mean, Bi is the batch effect defined as the date of hybridization and measurement and treated as a categorical variable, and ei is the residual error.
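The preprocessing steps can be illustrated with a minimal sketch in plain Python. It is not the pipeline used in the study. The function names are hypothetical, ties are ignored in the quantile normalization, and returning the grand mean plus the residual as the batch-corrected value is an assumption, because the text does not state which terms of the model were retained.

```python
import math
from statistics import mean


def log2_transform(arrays):
    """Base-2 log transform; `arrays` holds one list of probe intensities per array."""
    return [[math.log2(v) for v in arr] for arr in arrays]


def quantile_normalize(arrays):
    """Map every array onto a common reference distribution (ties ignored)."""
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference value for rank k: mean of the k-th smallest value over all arrays.
    reference = [mean(s[k] for s in sorted_arrays) for k in range(n_probes)]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda k: a[k])
        out = [0.0] * n_probes
        for rank, k in enumerate(order):
            out[k] = reference[rank]
        normalized.append(out)
    return normalized


def remove_batch_effect(y, batch):
    """Fit y_i = mu + B_i + e_i for one probe; return mu + e_i (assumed output)."""
    mu = mean(y)
    batch_means = {
        b: mean(v for v, label in zip(y, batch) if label == b) for b in set(batch)
    }
    return [v - batch_means[label] + mu for v, label in zip(y, batch)]
```

For example, `remove_batch_effect([1, 2, 5, 6], ["a", "a", "b", "b"])` shifts both batches onto the grand mean of 3.5 while keeping the within-batch differences.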

Differential expression between genotypes (eQTL):

We used a robust and powerful statistical approach to associate microarray probe intensity with genotype data despite the widely different hybridization properties of individual probes. Instead of computing the significance of a statistical test, we evaluated a nonparametric effect size, Cliff's Δ (Cliff 1996), for each of the ~3 million probes at each genomic marker. For each probe on the array, the eQTL effect size is given by Cliff's nonparametric Δ-statistic

$$\Delta = \frac{\#(X_{i1} > X_{i2}) - \#(X_{i1} < X_{i2})}{n_1 n_2}$$

where n1 and n2 are the numbers of carriers of the N2 and the CB4856 allele, and #(Xi1 > Xi2) is the number of possible pairwise comparisons where the expression level of gene i in an N2 carrier is larger than in a CB4856 carrier. The genotype information of the 60 RILs was previously described (Li et al. 2006). For an individual probe, a value of Δ = 0.45 corresponds to a P-value = 0.001 in a Wilcoxon rank sum test (del Rosal et al. 2003).
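As an illustration, the effect size for a single probe at a single marker can be computed by direct pairwise comparison. This is a minimal sketch using the standard definition of Cliff's statistic (pairs where the N2 value is larger minus pairs where it is smaller, divided by the number of pairs). The function name and the toy intensities are hypothetical.

```python
def cliffs_delta(n2_values, cb_values):
    """Cliff's nonparametric effect size between two groups of intensities."""
    greater = sum(1 for a in n2_values for b in cb_values if a > b)
    less = sum(1 for a in n2_values for b in cb_values if a < b)
    return (greater - less) / (len(n2_values) * len(cb_values))


# Toy log2 intensities of one probe in N2 carriers and CB4856 carriers.
n2_carriers = [9.1, 8.7, 9.4, 8.9]
cb_carriers = [8.2, 8.8, 8.0]
effect = cliffs_delta(n2_carriers, cb_carriers)
```

The statistic ranges from −1 (every CB4856 carrier is higher) to +1 (every N2 carrier is higher), and it depends only on the ordering of the intensities, which is why it tolerates probe-specific hybridization behavior.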

As several positions in the genome show a strongly imbalanced genotype ratio (i.e., the number of RILs carrying the N2 allele is far larger than the number of RILs carrying the CB4856 allele at a particular locus), the corresponding threshold (Wilcoxon's U-value) for each marker at significance level P = 0.001 was obtained first, taking the locus-specific imbalance into account. Then, these values were converted into the corresponding threshold for the effect size (Cliff's Δ) on the basis of Δ = 2U/(n1n2) − 1 (del Rosal et al. 2003). The threshold of distorted genome regions is expected to be larger than that of balanced marker positions. These marker-dependent thresholds were applied in further analysis.
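The dependence of the threshold on genotype imbalance can be sketched as follows. The code builds the exact null distribution of U by dynamic programming, finds the critical U for a given significance level, and converts it to an effect-size threshold with 2U/(n1n2) − 1. Two assumptions are made here that the text does not specify: the test is treated as two sided, and ties are ignored. The function names are hypothetical.

```python
def u_null_counts(n1, n2):
    """Number of rank arrangements giving each value of U (0..n1*n2) under the null."""
    prev = [[1] for _ in range(n2 + 1)]  # zero group-1 samples: U is always 0
    for i in range(1, n1 + 1):
        cur = [[1]]  # zero group-2 samples: U is always 0
        for j in range(1, n2 + 1):
            counts = [0] * (i * j + 1)
            # Largest value is in group 1: it exceeds all j group-2 values.
            for u, c in enumerate(prev[j]):
                counts[u + j] += c
            # Largest value is in group 2: U is unchanged.
            for u, c in enumerate(cur[j - 1]):
                counts[u] += c
            cur.append(counts)
        prev = cur
    return prev[n2]


def effect_size_threshold(n1, n2, alpha=0.001):
    """Smallest effect size reaching two-sided significance alpha, or None."""
    counts = u_null_counts(n1, n2)
    total = sum(counts)
    tail, u_crit = 0, None
    for u in range(n1 * n2, -1, -1):
        tail += counts[u]
        if 2 * tail > alpha * total:  # two-sided P-value exceeds alpha
            break
        u_crit = u
    if u_crit is None:
        return None  # sample too small to reach this significance level
    return 2 * u_crit / (n1 * n2) - 1


balanced = effect_size_threshold(30, 30)  # 30 N2 carriers, 30 CB4856 carriers
distorted = effect_size_threshold(50, 10)  # strongly imbalanced marker
```

Under these assumptions the distorted marker requires a larger effect size than the balanced one, in line with the expectation stated above.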

Summarizing the eQTL effect for exons:

To increase the robustness of the procedure, the median effect size of probes within each exon was taken as representing the expression QTL effect size of this exon for each genomic marker. Subsequently, the eQTL profile at the marker with maximal summarized eQTL effect was obtained. To achieve a reliable estimate of eQTL effect size, only exons covered by more than three probes were considered here. Transcripts with a summarized eQTL effect larger than the threshold for at least one exon were declared as having a significant eQTL and were used for further analysis.
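The exon-level summary can be sketched in a few lines. The data layout (one list of per-marker effect sizes for each probe) and the function names are hypothetical, and taking the marker with the largest absolute median as the "maximal" effect is an assumption.

```python
from statistics import median


def summarize_exon(probe_effects, min_probes=4):
    """Return (peak marker index, median effect there), or None if too few probes.

    probe_effects: one list per probe, holding the effect size at each marker.
    """
    if len(probe_effects) < min_probes:  # "more than three probes"
        return None
    n_markers = len(probe_effects[0])
    profile = [median(p[m] for p in probe_effects) for m in range(n_markers)]
    peak = max(range(n_markers), key=lambda m: abs(profile[m]))
    return peak, profile[peak]


def transcript_has_eqtl(exons, thresholds):
    """True if any exon exceeds the marker-specific threshold at its peak marker."""
    for probe_effects in exons:
        summary = summarize_exon(probe_effects)
        if summary is None:
            continue
        peak, effect = summary
        if abs(effect) > thresholds[peak]:
            return True
    return False
```

Using the median rather than the mean keeps a single aberrant probe, for example one affected by an unknown polymorphism, from dominating the exon-level estimate.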

Classification of eQTL:

There are 435 transcripts with a significant eQTL in total. They were examined in greater detail and manually classified as shown in Figure 1. By visualizing the intensity level and eQTL size of the entire transcript, we first classified transcripts as having a consistent eQTL if all annotated exons show the same eQTL pattern at a threshold of Δ = 0.45 and there is no additional eQTL signal in the presumed intron regions. In addition, there are eQTL patterns that indicate the need for revised gene definitions (but no evidence for difference in splicing), which can be subdivided into five subcategories: (1) new exons (at least two consecutive intron probes showing a similar expression level and eQTL size as the exon probes of the gene), (2) new introns (at least two consecutive exon probes showing a clear decrease of expression level and eQTL size compared to the other exon probes of the gene), (3) intron inclusions (all probes corresponding to an intron showing the same expression and eQTL size as the exon probes), (4) exon extensions (at least two intron probes next to an exon showing similar expression levels and eQTL size as the adjacent exon), and (5) intron extensions (at least the first or the last two exon probes showing a decrease of expression level and eQTL size compared to the other exon probes of the gene). Most interestingly, there are also eQTL patterns that indicate potential heritable differences in splicing, i.e., genes showing alternative splicing QTL (Kwan et al. 2008). These can be subdivided into three classes according to the position of the alternatively spliced exon: cassette exon, alternative initiation, or alternative termination, where the expression level of the exon of interest in all cases follows an allele-dependent pattern. Transcripts showing evidence for multiple types of variation, e.g., having various exons with different patterns of heritable difference, were classified as complex cases. 
Heterogeneous cases contain transcripts showing very diverse eQTL patterns across probes and exons and belonging to none of the above-mentioned categories.

Figure 1.
Classification of genes showing heritable expression variation (eQTL). The 435 transcripts were classified into different groups according to their eQTL pattern: consistent eQTLs (brown) showing the same expression differences between the two genotypes ...

To validate the classification procedure, all classifications were performed independently by two researchers, and special cases were checked in more detail. A complete list of classifications is available in supporting information, Table S1 and the corresponding plots for all genes are available at www.wormplot.org.

Permutation:

A permutation approach was used to estimate the empirical false discovery rate for the detection of genetically regulated alternative splicing. We permuted the sample labels in the genotype matrix, which preserves both the correlation structure between traits and the correlation structure between markers; this makes the empirical procedure well suited to an unbiased estimate of significance under the multiple-dependence properties of the data (Breitling et al. 2008). To keep the computational burden within reasonable limits, the permuted data were reanalyzed only for the genes on chromosome IV, repeating the QTL detection and classification exactly as for the real data. On the basis of a total of 67,000 permuted instances of genes, we estimated the false discovery rate for genetically regulated alternative splicing to be <1%.
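The label permutation itself can be sketched as below. The same random reordering of samples is applied to every marker, so each RIL keeps its complete genotype vector and linkage between markers is preserved, while the link between genotypes and expression profiles is broken. The matrix layout and function name are hypothetical, and the downstream QTL detection and classification are not shown.

```python
import random


def permute_sample_labels(genotypes, rng):
    """Shuffle sample order identically across all markers.

    genotypes: one list per marker, holding the allele of each RIL.
    """
    n_samples = len(genotypes[0])
    order = list(range(n_samples))
    rng.shuffle(order)
    return [[marker[j] for j in order] for marker in genotypes]


genotypes = [
    ["N2", "CB", "N2", "CB", "N2"],  # marker 1
    ["N2", "CB", "CB", "CB", "N2"],  # marker 2 (linked to marker 1)
]
permuted = permute_sample_labels(genotypes, random.Random(1))
```

Because whole samples are moved rather than individual entries, the per-marker allele counts and the between-marker correlations in `permuted` are identical to those in `genotypes`.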

Deleted genes:

We validated our ability to detect heritable expression differences by examining published gene deletions in CB4856 worms (Maydan et al. 2007). These genes should show consistently variable expression according to the local genotype. Of 531 CB4856-deleted genes, ~10% (53 genes) are detected as differentially expressed in our experiment. All of these genes show consistent eQTL across all probes with larger expression in the N2 allele, well above our threshold. This confirms the sensitivity of our approach.

Comparison with a previous experiment:

As a further validation step, we compared the detected eQTL to those observed in an earlier study using cDNA microarrays (Li et al. 2006). Nearly half of the top 500 highly expressed genes (231 genes) are shared in the two experiments. The eQTL effect size also shows strong correlation (locally regulated QTL, r = 0.72; distantly regulated, r = 0.48). Several strong distant eQTL were found in both experiments including ZK488.6, F10D2.9 (fat-7), F56H6.5 (gmd-2), C38D9.2, T21E8.1 (pgp-6), C05A9.1 (pgp-5), and F15D4.5.

Quantitative changes in alternative splicing:

Generally, the genetic effect on the abundance of transcript isoforms can be quantitative rather than qualitative (shifts in isoform ratios, rather than on–off effects). We calculated the expected effect size for all possible shifts of isoform ratio, assuming that two isoforms differ only by the presence or the absence of one exon and that there is no overall expression difference (Figure 2). It turns out that the difference in abundance of transcript isoforms should be at least ~1.86-fold to be picked up in our study. This means that our method has sufficient power to identify quantitative changes in isoform ratio like 90:10 (allele 1) → 20:80 (allele 2) or 60:40 (allele 1) → 12:88 (allele 2).
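The two example shifts can be checked with simple arithmetic under the stated assumptions (two isoforms differing by one cassette exon, no overall expression difference). In that case the signal of the cassette exon is proportional to the fraction of isoform 1, so the fold change of the exon between alleles is the ratio of the two inclusion fractions. This sketch is only an illustration and is not the calculation behind Figure 2. Reading the ~1.86-fold limit as a limit on this exon-level fold change is an assumption, and the function name is hypothetical.

```python
import math


def cassette_exon_fold_change(ratio_allele1, ratio_allele2):
    """Fold change of cassette-exon abundance between two alleles.

    Each ratio is (isoform 1, isoform 2); only isoform 1 contains the exon.
    """
    inclusion1 = ratio_allele1[0] / sum(ratio_allele1)
    inclusion2 = ratio_allele2[0] / sum(ratio_allele2)
    high, low = max(inclusion1, inclusion2), min(inclusion1, inclusion2)
    if high == 0:
        return 1.0  # exon absent in both alleles: no change
    return math.inf if low == 0 else high / low


MIN_FOLD = 1.86  # detection limit quoted in the text (interpretation assumed)
shift_a = cassette_exon_fold_change((90, 10), (20, 80))
shift_b = cassette_exon_fold_change((60, 40), (12, 88))
detectable = [fold >= MIN_FOLD for fold in (shift_a, shift_b)]
```

Under this reading, both example shifts change the exon signal severalfold and therefore lie above the quoted limit.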

Figure 2.
Schematic illustration (A) and power of detection (B) for quantitative changes in alternative splicing. (A) We consider a transcript with two alternative splicing forms: the second exon is included in isoform 1 but excluded in isoform 2 (cassette exon). ...

RESULTS AND DISCUSSION

Here, we performed the first genome-wide analysis of genetic variation of alternative splicing in C. elegans using a comprehensive tiling microarray. We used 60 recombinant inbred lines of a cross between two very diverse strains, Bristol (N2) and Hawaii (CB4856), which have been genotyped using 121 markers (Li et al. 2006). By using tiling array data, with multiple probes targeting every exon of each gene, we obtained a more comprehensive and sensitive picture of heritable variation of gene expression than possible with previous technologies. It also allows us to dissect the genetic component for differences in isoform-specific gene expression. Thus we can detect asQTL, the genome regions controlling variation in isoform-specific expression. Two categories of asQTL can be distinguished, i.e., those that map in close vicinity to the gene itself (local) and those that map elsewhere in the genome (distant). Local activity can be explained, for example, by altered functional motifs in exonic splicing enhancers that will affect the splicing activity. The mechanism of distant regulation is often more complicated and can possibly be explained by a polymorphism in an auxiliary splicing factor (e.g., SR protein) that modulates the activity of the spliceosome. In this case we would expect to see a genetic master regulator at the locus of the splicing factor controlling isoform ratios for large groups of transcripts.

Using nonparametric effect size estimates, corrected for genotype imbalance (materials and methods) and corresponding to a P-value of 0.001 (Wilcoxon's test), we detected 435 genes with substantial heritable variation for at least one exon. The comparison of gene position and associated polymorphisms shows that most eQTL map in close proximity to the affected gene (local eQTL: 277 genes or 64%; Figure 3). There are 158 eQTL mapping to another chromosome (distant eQTL). Two hundred sixty-seven genes show higher expression in carriers of the N2 allele than in CB4856 carriers, including 53 cases of known gene deletions in the CB4856 strain (Maydan et al. 2007).

Figure 3.
Mapping location (A) and type (B) of heritable variation in gene expression. (A) Each dot represents a single transcript. The physical position of each transcript is indicated on the y-axis, and the position of the locus that is most strongly associated ...

A large majority of eQTL (319 or 70.4%) lead to a consistent differential expression across all exons of the affected gene. Interestingly, the genetic effects (eQTL size) of these consistent eQTL show a strong correlation (Spearman's ρ = 0.78) with a previous experiment using cDNA microarrays (Li et al. 2006). As shown in Figure 1, 8.7% of cases show evidence for a necessary refinement of existing gene definitions, predominantly by expanding known exons (plotted results for all genes are available at www.wormplot.org for a detailed examination). In contrast to the large number of consistent eQTL, we find only 22 genes that show evidence for genetic variation of alternative splicing, i.e., an exon-specific asQTL (Figure 4). This genome-wide evidence for the genetic robustness of the alternative splicing machinery is consistent with the earlier indication that individual alternative splicing events in C. elegans are highly conserved and hardly tolerate genetic variation (Barberan-Soler and Zahler 2008a). Note, however, that variation in alternative splicing events restricted to a specific cell or tissue type can be diluted in measurements on whole-worm mRNA. In addition, 77% of asQTL were found to be locally regulated. This agrees with recent findings that alternative splicing can be regulated without involvement of an auxiliary splicing factor, by cis-acting RNA sequences that can function as a splicing silencer (Yu et al. 2008).

Figure 4.
Expression intensity and eQTL effect per probe along the genome for selected genes. (A) Detecting consistent heritable differences in gene expression with high resolution. Nearly 300 probes cover the area of this gene, Y87G2A.5. Exon probes show consistently ...

Most of the reported asQTL detected in our study have strong genetic effects (qualitative on–off patterns) and we found only a few cases of subtle quantitative effects on alternative splicing. However, this does not mean that this on–off behavior is a general property of alternative splicing patterns, but rather that despite the large population used in this study, technical noise and biological variation might limit our ability to detect subtle shifts in isoform proportions. To detect more quantitative effects (Figure 2), more precise technology such as deep-sequencing would be required. Even then, reliable detection of changes in isoform proportions will depend on extremely large read numbers.

Our study provides the first genome-wide evidence supporting earlier hypotheses that in C. elegans the alternative splicing machinery exhibits a general genetic robustness, and that only a minor fraction of genes show heritable variation in splicing forms and relative abundance. This observation points to a profound difference in the regulation of the alternative splicing machinery compared to that in humans (Kwan et al. 2008), which parallels the differences in cellular diversity and developmental flexibility in the two species and has important consequences for interpreting future studies using C. elegans as a model organism for metazoan splicing.

Acknowledgments

This work was supported by European Union grant FP7 PANACEA 222936 and The Netherlands Organization for Scientific Research, grant NWO-86504001.

Notes

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.110.119677/DC1.

References

  • Alberts, R., P. Terpstra, Y. Li, R. Breitling, J. P. Nap et al., 2007. Sequence polymorphisms cause many false cis eQTLs. PLoS ONE 2 e622.
  • Barberan-Soler, S., and A. M. Zahler, 2008a. Alternative splicing and the steady-state ratios of mRNA isoforms generated by it are under strong stabilizing selection in Caenorhabditis elegans. Mol. Biol. Evol. 25 2431–2437.
  • Barberan-Soler, S., and A. M. Zahler, 2008b. Alternative splicing regulation during C. elegans development: splicing factors as regulated targets. PLoS Genet. 4 e1000001.
  • Breitling, R., Y. Li, B. M. Tesson, J. Fu, C. Wu et al., 2008. Genetical genomics: spotlight on QTL hotspots. PLoS Genet. 4 e1000232.
  • Cliff, D., 1996. Answering ordinal questions with ordinal data using ordinal statistics. Multivariate Behav. Res. 31 331–350.
  • del Rosal, A. B., C. San Luis and A. Sanchez-Bruno, 2003. Dominance statistics: a simulation study on the d statistic. Qual. Quant. 37 303–316.
  • Fischer, S. E., M. D. Butler, Q. Pan and G. Ruvkun, 2008. Trans-splicing in C. elegans generates the negative RNAi regulator ERI-6/7. Nature 455 491–496.
  • Jansen, R. C., and J. P. Nap, 2001. Genetical genomics: the added value from segregation. Trends Genet. 17 388–391.
  • Kim, E., A. Magen and G. Ast, 2007. Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 35 125–131.
  • Kuroyanagi, H., G. Ohno, S. Mitani and M. Hagiwara, 2007. The Fox-1 family and SUP-12 coordinately regulate tissue-specific alternative splicing in vivo. Mol. Cell. Biol. 27 8612–8621.
  • Kwan, T., D. Benovoy, C. Dias, S. Gurd, C. Provencher et al., 2008. Genome-wide analysis of transcript isoform variation in humans. Nat. Genet. 40 225–231.
  • Li, Y., O. A. Alvarez, E. W. Gutteling, M. Tijsterman, J. Fu et al., 2006. Mapping determinants of gene expression plasticity by genetical genomics in C. elegans. PLoS Genet. 2 e222.
  • Mackay, T. F., E. A. Stone and J. F. Ayroles, 2009. The genetics of quantitative traits: challenges and prospects. Nat. Rev. Genet. 10 565–577.
  • Maydan, J. S., S. Flibotte, M. L. Edgley, J. Lau, R. R. Selzer et al., 2007. Efficient high-resolution deletion discovery in Caenorhabditis elegans by array comparative genomic hybridization. Genome Res. 17 337–347.
  • Schadt, E. E., S. A. Monks, T. A. Drake, A. J. Lusis, N. Che et al., 2003. Genetics of gene expression surveyed in maize, mouse and man. Nature 422 297–302.
  • Yu, Y., P. A. Maroney, J. A. Denker, X. H. Zhang, O. Dybkov et al., 2008. Dynamic regulation of alternative splicing by silencers that modulate 5′ splice site competition. Cell 135 1224–1236.
  • Zhao, Z., T. J. Boyle, Z. Bao, J. I. Murray, B. Mericle et al., 2008. Comparative analysis of embryonic cell lineage between Caenorhabditis briggsae and Caenorhabditis elegans. Dev. Biol. 314 93–99.

Articles from Genetics are provided here courtesy of Genetics Society of America