J Lipid Res. Sep 2011; 52(9): 1672–1682.
PMCID: PMC3151687

Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice[S]

Abstract

To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of the parental strains and of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc27a5 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for the Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits.

Keywords: quantitative trait loci, gene expression, genetics

One of the major predictors of the development of coronary artery disease (CAD) is lipid levels, which are determined by a complex interaction of genetic and environmental factors. High levels of low density lipoprotein (LDL) cholesterol and triglycerides (TG) are associated with higher incidence of heart disease (1). Recently, considerable interest has been given to the results of genome-wide association studies of triglycerides and lipids in humans (2–5). However, these studies often do not control for environment and only explain about 10% of the overall lipid variation, indicating that additional genes involved in lipid metabolism are yet to be discovered (5).

Studies in inbred mice successfully accommodate both genetic and environmental issues. Because we have such a tightly controlled environment in our mouse rooms, any phenotypic variation in triglyceride levels among inbred strains must be attributed primarily to genetic variation. This makes quantitative trait loci (QTL) analysis in inbred mouse strains a powerful approach for identifying loci and genes regulating lipid levels. To date, our laboratory and others have identified more than 30 mouse triglyceride QTL (6, 7). We continue to improve the ways in which we convert these QTL into the identification of QTL genes (QTG). For example, since 2003 we have used a list of QTG criteria published by the Complex Trait Consortium (CTC) to identify causal QTL genes (8). These criteria are based on the premise that a QTG must carry a polymorphism between the parental strains of the mouse cross that affects either the structure/function of the gene (a nonsynonymous coding polymorphism) or the expression of the gene. Still, most of the CTC criteria involve in vitro and in vivo experiments and are not practical strategies when more than 100 genes are located under the QTL, a common characteristic of most QTL.

To improve and accelerate the process of gene identification, our laboratory and others developed a set of bioinformatic tools to help narrow QTL in the mouse and identify the most likely candidate gene(s) by adding support to each gene located within the QTL (9–12). As the availability and breadth of bioinformatic resources expand, so does the usefulness of these tools. The recent development of the mouse diversity genotyping array and imputation methods gives us access to the genome of a large number of mouse strains that were previously unavailable, allowing in-depth haplotype analysis and the identification of amino acid changes among these strains (13). In addition, a recent gene expression survey in 12 mouse inbred strains gives us the ability to identify differences in gene expression among these strains (14).

However, identifying a difference in expression between the parental strains of a mouse cross is not sufficient; the causative polymorphisms must be located under the QTL, meaning that the expression differences must be cis-regulated. Expression QTL (eQTL), the method of choice for identifying cis-regulated genes based on the gene expression profile of F2 mice, presents an ideal framework for estimating causality (15). The high cost of eQTL studies has limited expression microarray profiling in F2 crosses. Thus far, eQTL studies have been performed in a few F2 crosses in mice between inbred mouse strains and in congenic strains [C57BL6/J×DBA/2J (B6×D2) (16), C57BL6/J×C3H/HeJ (B6×C3H) (17), D2×AKR/J (18) and C57BL6/J×A/J (B6×AJ)] (S. W. Tsaih, personal communication) and in HG9 congenic mice (19). These powerful studies have helped identify new genes involved in complex traits. Here, we present an additional cross: MRL/MpJ×SM/J (MRL×SM).

In a QTL analysis between inbred mouse strains MRL and SM for triglyceride levels, we confirmed one QTL and identified three new QTL. We then applied our bioinformatic tools to narrow the QTL to just a few genes. We investigated databases for genotype information and for any potential functional amino acid change between the parental strains (MRL and SM). We also performed gene expression profile analysis from liver samples in the F2 mice. We used this information to identify cis-regulated genes that also correlate with triglycerides. Conditioning the gene expression on the phenotypes identified those QTL genes potentially causal to triglyceride variation. Our analysis confirms the value of our bioinformatic tools as an efficient model for identification of genes that regulate complex traits.

MATERIAL AND METHODS

Mice

MRL/MpJ (MRL) and SM/J (SM) mice were obtained from the Jackson Laboratory (Bar Harbor, ME). F1 mice were produced by intercrossing MRL females with SM males, while reciprocal F1 (RF1) mice were obtained by intercrossing SM females with MRL males. F1 mice were intercrossed by brother-sister mating to produce 282 F2 mice (135 females and 147 males). Parental strains, F1, RF1, and F2 mice were bred and housed in a climate-controlled, pathogen-free facility at the Jackson Laboratory with a 12:12 h light-dark cycle. F1, RF1, and F2 males and females were weaned at 21 days and fed chow diet (LabDiet® 5K52, PMI Nutritional International, Bentwood, MO). All experiments were approved by the Jackson Laboratory Animal Care and Use Committee.

Genotyping and phenotyping

Lipid measurements.

Mice were fasted for 4 h prior to retro-orbital bleeding at 8 weeks of age. Blood samples were collected with EDTA, and plasma was isolated by centrifugation within 2 h of the bleed. The plasma was frozen at –20°C for a week until measured. Triglycerides were measured using a Synchron CX Delta System (Beckman Coulter, Fullerton, CA).

Genotyping.

DNA was isolated and genotyped as described previously (20). Briefly, a total of 259 markers were genotyped, 258 with the Illumina platform comprising 760 single nucleotide polymorphisms (SNP) and one additional marker (rs33585432). Physical marker positions were determined in build 37 from the Mouse Genome Informatics (MGI) database, and sex-specific and averaged genetic positions were estimated from the newly calculated mouse genetic map (21).

Liver collection.

Livers in F2 mice were collected at 13 weeks of age as described previously (20). The mice were housed individually for 3 days prior to the tissue collection and fasted for 4 h on the day of the collection. Liver samples were preserved in RNAlater (Ambion, Applied Biosystems, Foster City, CA) and saved at –80°C prior to the gene expression study.

Sequencing.

We obtained MRL and SM genomic DNA from the Mouse DNA Resource at the Jackson Laboratory. We resequenced seven genomic loci embedding a polymorphism at a probe binding site and for which the Center for Genome Dynamics (CGD) database did not have genotypes available in MRL and SM (supplementary Table I). We used direct sequencing on the PCR products using Big Dye Terminator Cycle Sequencing Chemistry and the ABI 3700 Sequence Detection System (Applied Biosystems). Results were analyzed using Sequencher software (version 4.2).

Bioinformatic approach

Candidate genes.

The list of genes located under each QTL was downloaded from Ensembl in build 37. A gene qualified as a candidate if it i) was located within the 95% confidence interval (CI) of the QTL; ii) was located within a region that differed in haplotype between the parental strains; iii) segregated a damaging nonsynonymous coding polymorphism between MRL and SM; or iv) was differentially expressed between the two QTL strains (MRL and SM), cis-regulated, and correlated with the phenotype in the F2 mice. If a QTL was found specifically in males or females, the expression evidence (expression differences, cis eQTL, correlation, and causality modeling if the QTL was significant) was also investigated in that specific sex stratum. Candidate genes identified based on expression were subject to expression causality modeling if the QTL was significant. For the candidate genes showing causality based on expression differences, we searched the Ensembl database for polymorphisms located in probe binding sites. We determined the genotype of the SNP for MRL and SM using the CGD database at the Jackson Laboratory or by direct resequencing.

Haplotype analysis.

We used a haplotype approach to identify genes that differ between MRL and SM. For this purpose, we used the mouse diversity array database and its web tool available on the CGD website to identify the regions where MRL differs from SM using an interval of 1 bp. We then identified the genes within these regions. Because of the high density of the SNPs and the non-uniformity of the SNP distribution within the genome, we also included any gene located within 10 kb of these loci.
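The 10 kb flanking rule can be illustrated with a short sketch. This is not the CGD web tool; it is a minimal stand-alone example in which the function names, the tuple layout, and all coordinates are hypothetical. A gene is kept if its coordinates, extended by 10 kb on each side, overlap any region where the two strains differ.

```python
from typing import List, Tuple

Interval = Tuple[int, int]        # (start_bp, end_bp), inclusive
Gene = Tuple[str, int, int]       # (name, start_bp, end_bp)

PAD_BP = 10_000  # 10 kb flank described in the text


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def genes_near_differing_regions(genes: List[Gene],
                                 regions: List[Interval],
                                 pad: int = PAD_BP) -> List[str]:
    """Names of genes lying within `pad` bp of any strain-differing region."""
    kept = []
    for name, start, end in genes:
        padded = (max(0, start - pad), end + pad)
        if any(overlaps(padded, region) for region in regions):
            kept.append(name)
    return kept


# Hypothetical coordinates, for illustration only
regions = [(1_000_000, 1_050_000)]
genes = [
    ("geneA", 1_055_000, 1_060_000),  # 5 kb past the region: kept
    ("geneB", 1_100_000, 1_120_000),  # 50 kb past the region: dropped
]
print(genes_near_differing_regions(genes, regions))
```

Padding the gene rather than the region gives the same result and keeps the region list unchanged for other uses.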

Nonsynonymous coding polymorphisms.

We used the high-density imputed SNP database available on the CGD website to identify any nonsynonymous coding polymorphism between MRL and SM. We evaluated its potential functionality using the Sorts Intolerant From Tolerant (SIFT) tool (22). If the polymorphism was characterized as “damaging” or it led to a stop codon, we concluded that the amino acid change is functional.

Microarray analysis for liver gene expression.

Microarray data were processed as previously described (20). Briefly, RNA was hybridized to the Mouse Gene 1.0 ST microarray (1M) (Affymetrix, Santa Clara, CA). The data were processed using the R language/environment version 2.7.2 for data analyses. Quality control and quantile normalization (23) were performed with the affy V 1.20.0 and preprocessCore V1.6 packages from Bioconductor. The transcript analysis was performed with a custom CDF file (24) for Ensembl transcripts (ENST package V.11, 37,264 probesets) from the BrainArray (University of Michigan) website. There were 2,858 redundant probesets in the CDF file that were removed, producing a dataset with 34,406 probesets for subsequent analyses. Microarray analysis was performed in males and females of the parental strains MRL and SM (N = 3 for each category) and in the F2 mice (N = 282). Difference in expression between the parental strains was assessed in the parental analysis and in the F2 mice. Cis-QTL and correlation were estimated in the F2 mice as described below. Microarrays have been deposited in the Gene Expression Omnibus (GEO accession: GSE25322).
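For readers unfamiliar with quantile normalization, the idea can be sketched in a few lines. The analysis itself was run with Bioconductor packages in R; the Python function below is only an illustrative stand-in, with a hypothetical name and made-up intensities, and it breaks ties by original order rather than averaging them.

```python
from typing import List


def quantile_normalize(arrays: List[List[float]]) -> List[List[float]]:
    """Force every array (one list per microarray) onto a common distribution.

    Simplification: tied values are ranked by their original order.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices, smallest first
        out = [0.0] * n
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy intensities for two arrays and three probes (made-up numbers)
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization every array contains exactly the same set of values, so differences between arrays reflect only the rank of each probe.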

Statistical analysis

Data analysis.

Parental strains, F1, RF1, and F2 mice were compared with ANOVA for females and males separately (JMP 7.0; SAS Institute, Cary, NC). The data were transformed using a Van Der Waerden normal score (25).
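The Van Der Waerden transformation replaces each observation by the standard normal quantile of its rank divided by n + 1. The sketch below illustrates that definition with the Python standard library; the transformation in this study was done in JMP, and the function names and example values here are hypothetical.

```python
from statistics import NormalDist
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def van_der_waerden(values: List[float]) -> List[float]:
    """Normal scores: inverse normal CDF of rank / (n + 1)."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


# Made-up triglyceride values; the middle-ranked value maps to a score of 0
print(van_der_waerden([10.0, 30.0, 20.0]))
```

Dividing by n + 1 rather than n keeps every quantile strictly between 0 and 1, so the largest observation still receives a finite score.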

QTL analysis.

Linkage analysis was performed using R/qtl (v1.09-43) (26). We performed a three-step analysis. First, triglyceride level was analyzed for main-effect QTL using sex as an additive covariate to account for the difference in lipid level between males and females (model 1). Second, sex was added as an interactive covariate (model 2); the difference between the additive and interactive models provided a test for QTL by sex interaction (i.e., one genotype affects the trait in males but not in females) as shown in Refs. 27 and 28. We also performed QTL analysis in males and females separately. We used the sex-specific positions to run the QTL analysis, but we translated these positions in the averaged genetic position to ease the comparison with the combined sex analysis. Third, epistatic effects were investigated using the pairscan function of R/qtl that tests for interacting QTL. For the chromosomes (chr) that showed a potentially secondary QTL on the same chromosome, we compared the best model with one QTL to the best model with two QTL. If the logarithm of the odds (LOD) score difference between these models was greater than 2, we concluded that two QTL were present. Thresholds for significant (P < 0.05) and suggestive (P < 0.63) LOD scores were based on 1,000 permutations of the observed data for the autosomes and 17,940 permutations for the X chromosome (29). The Bayesian method was used to determine the 95% CI (30). Briefly, the interval is obtained by assuming 10^LOD is the true likelihood function, assuming a priori that the QTL is equally likely to be anywhere on the chromosome. The posterior density can be derived from these two assumptions. The Bayes credible interval is defined as the interval for which the posterior exceeds a given probability, in this case 0.95. All suggestive and significant QTL were assessed in a combined multilocus model, and the proportion of the lipid trait explained by these QTL was determined through regression analysis. 
The genotypes and phenotypes are publicly available at the CGD website.
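The Bayes credible interval described above can be sketched as follows. This is not the R/qtl implementation; it is a simplified illustration that assumes LOD scores on an evenly spaced grid along one chromosome, and the function name and all values are hypothetical.

```python
from typing import List, Tuple


def bayes_credible_interval(positions: List[float],
                            lods: List[float],
                            prob: float = 0.95) -> Tuple[float, float]:
    """Approximate Bayes credible interval from LOD scores on an even grid.

    Treats 10**LOD as the likelihood with a uniform prior over positions.
    """
    peak = max(lods)
    # 10**(LOD - peak) avoids overflow; the common factor cancels on normalizing
    weights = [10 ** (lod - peak) for lod in lods]
    total = sum(weights)
    posterior = [w / total for w in weights]
    # Add positions from most to least probable until the coverage is reached
    order = sorted(range(len(lods)), key=lambda i: posterior[i], reverse=True)
    covered = 0.0
    chosen = []
    for i in order:
        chosen.append(i)
        covered += posterior[i]
        if covered >= prob:
            break
    return positions[min(chosen)], positions[max(chosen)]


# Made-up LOD profile on a 2 cM grid
positions = [0, 2, 4, 6, 8]
lods = [1.0, 3.5, 4.0, 3.8, 1.2]
print(bayes_credible_interval(positions, lods))
```

On an unevenly spaced map each posterior value would also need to be weighted by the width of the interval it represents, which this sketch omits.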

Expression analysis.

To evaluate each transcript for cis expression, we performed eQTL analysis in the 282 F2 mice with sex as an additive covariate, and in females and males separately. We used the Haley-Knott method with a 2 cM interval. Thresholds for significant (P < 0.05) and suggestive (P < 0.63) LOD scores were based on 10,000 permutations of the observed data. We defined a cis-QTL as a transcript for which the peak of the QTL was located within 20 cM of the genetic location of the gene with a suggestive LOD score. Pearson correlation coefficient and significance were calculated between the level of expression of the transcripts and the phenotype in the entire cross (after adjusting for sex) and in males and females only. All analysis was performed at the transcript level but reported at the gene level using the transcript with the strongest cis-QTL. For the candidate genes showing causality based on expression differences, we also verified that the cis-QTL was not due to the presence of a polymorphism between MRL and SM at the probe binding site. We first searched Ensembl for polymorphisms and confirmed the alleles of MRL and SM in the CGD imputed database or by resequencing. If a segregating polymorphism was present, we performed cis-QTL analysis at the probe level and verified that the probe carrying the polymorphism was not solely responsible for the overall cis-QTL of the gene.
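The cis-QTL definition can be written as a simple rule. The sketch below is illustrative only: the class, field names, and example values are hypothetical, and the suggestive LOD threshold is passed in as a parameter because in practice it comes from the permutation test.

```python
from dataclasses import dataclass

CIS_WINDOW_CM = 20.0  # window used in the cis-QTL definition


@dataclass
class EqtlPeak:
    transcript: str
    gene_chr: str    # chromosome carrying the gene
    gene_cm: float   # genetic position of the gene (cM)
    peak_chr: str    # chromosome carrying the eQTL peak
    peak_cm: float   # genetic position of the eQTL peak (cM)
    lod: float       # LOD score at the peak


def is_cis(peak: EqtlPeak, suggestive_lod: float,
           window_cm: float = CIS_WINDOW_CM) -> bool:
    """True if the peak is on the gene's chromosome, within the window,
    and reaches at least the suggestive LOD threshold."""
    return (peak.peak_chr == peak.gene_chr
            and abs(peak.peak_cm - peak.gene_cm) <= window_cm
            and peak.lod >= suggestive_lod)


# Hypothetical transcripts; the threshold of 2.2 is a placeholder value
local = EqtlPeak("tx_local", "7", 10.0, "7", 9.2, 13.5)
distant = EqtlPeak("tx_distant", "7", 10.0, "15", 38.8, 6.0)
print(is_cis(local, suggestive_lod=2.2), is_cis(distant, suggestive_lod=2.2))
```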

Causal analysis.

Conditional genome scans can be used as a graphical modeling strategy to estimate causal relationships between traits (31, 32). We applied this strategy to the eQTL data to identify candidate genes that are causal (upstream) to the clinical trait (Y = triglyceride). In this approach, QTL mapping is performed with model 1 for the clinical trait (model 1: Y = β0 + β1Q + β2Sex + ϵ). This is compared with QTL mapping using a gene expression trait (X) as a covariate for the clinical trait (model 2: Y = β0 + β1Q + β2Sex + β3X + ϵ). A decrease in LOD score after conditioning on the gene expression trait (model 2) can be interpreted as the gene expression trait acting as a mediator of the QTL effect. The same strategy is then applied to the gene expression trait. QTL mapping for the candidate gene expression (model 3: X = β0 + β1Q + β2Sex + ϵ) is compared with QTL mapping for a model that includes the clinical trait as a covariate (model 4: X = β0 + β1Q + β2Sex + β3Y + ϵ). Genome scans from these models were compared to determine the extent to which the gene expression trait reacts to the clinical trait. In our application, a candidate gene in the QTL region was considered causal if two criteria were met: first, conditioning on the candidate gene transcript reduced the LOD score below the suggestive level (LOD < 2.2) for the clinical trait (model 2), and second, conditioning on the trait did not reduce the LOD score below the suggestive level for the gene transcript (model 4). This is a stringent criterion that will reveal the most causal candidates. Conditional linkage for the clinical traits was only applied on the QTL with significant LOD score (P < 0.05).
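The two-part decision rule can be expressed compactly. The function below is a minimal sketch, not the code used in the analysis; its name and parameters are hypothetical, and the suggestive threshold for the transcript is an assumed placeholder because only the trait threshold (LOD 2.2) is stated above.

```python
def is_causal_candidate(trait_lod_conditioned: float,
                        transcript_lod_conditioned: float,
                        trait_suggestive: float,
                        transcript_suggestive: float) -> bool:
    """Apply the two criteria for calling a transcript causal to the trait.

    trait_lod_conditioned: trait LOD after conditioning on expression (model 2)
    transcript_lod_conditioned: eQTL LOD after conditioning on the trait (model 4)
    """
    # Criterion 1: the trait QTL falls below the suggestive level
    mediates = trait_lod_conditioned < trait_suggestive
    # Criterion 2: the eQTL stays at or above its suggestive level
    not_reactive = transcript_lod_conditioned >= transcript_suggestive
    return mediates and not_reactive


# Illustrative values: trait LOD drops to 1.1, eQTL LOD stays at 11.2.
# transcript_suggestive=2.2 is an assumption, not a value from the study.
print(is_causal_candidate(1.1, 11.2, trait_suggestive=2.2, transcript_suggestive=2.2))
```

Requiring both conditions separates a transcript that mediates the QTL effect from one that merely responds to the trait.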

RESULTS

Lipid characteristics of the parental strains, F1, RF1, and F2 mice

Means and standard error of triglyceride levels are summarized in Table 1. No statistical difference in triglyceride levels was observed among the parental strains, F1, RF1, and F2 mice.

TABLE 1.
Triglyceride levels in the parental strains, F1s, RF1s, and F2s

Identification of genomic loci underlying triglyceride levels in the F2 mice

Genome-wide scans are represented in Fig. 1 and summarized in Table 2, with the 95% CI, LOD scores for the relevant model, closest marker, high allele strain at the locus, and mode of inheritance. We first added sex as an additive covariate and identified four main-effect QTL: one significant QTL on Chr15@38.8cM (Tgq35) and three suggestive QTL on Chr2@103.8cM, Chr7@10.1cM (Tgq34), and Chr17@33.2cM (Tgq1) (Fig. 1A and Table 2). Homozygous SM mice had higher triglyceride levels compared with homozygous MRL mice in a dominant and additive manner on Chr 7 and 15, respectively (Fig. 2). On Chr 17, homozygous MRL mice had higher triglycerides compared with homozygous SM mice (Fig. 2). On Chr 2, both homozygous MRL and SM had higher triglycerides compared with heterozygous mice (Fig. 2). We did not identify any significant QTL by sex interaction (Fig. 1B). However, we performed the QTL analysis in each sex separately (Fig. 1C, D and Table 2). We confirmed the QTL on Chr 7 and 17 in males and the QTL on Chr 15 in females. The QTL on Chr 2 was not replicated in the male- or female-only analysis, most likely due to the loss of statistical power by having fewer mice. Additionally, a new QTL was identified on Chr X in males, but the LOD score was low and may indicate a false-positive result. The Chr 7 QTL reached significance in the male-only QTL analysis, while the Chr 15 QTL was observed in the female-only QTL analysis, indicating that the allelic effect at these loci is stronger in one sex than the other. Overall, we were able to explain only 14.2% of the genetic variation in triglycerides, 18.2% in males and 13% in females (Table 3).

Fig. 1.
Genome-wide scan for triglycerides in 282 F2 mice. Analysis was performed with sex as an (A) additive covariate and (B) interactive covariate. The difference between both models (B) with a delta LOD > 2 indicates a QTL by sex interaction (dotted ...
Fig. 2.
Allele effect plot of the main effect QTL for triglycerides. Triglyceride levels are indicated as mean ±SD in mg/ml at each QTL, using the closest marker for each genotype (MM, homozygous MRL; MS, heterozygous; SS, SM homozygous).
TABLE 2.
Genome-wide QTL for triglycerides in the MRL/MpJ×SM/J F2 mice
TABLE 3.
Regression ANOVA for lipid traits in the F2 mice

Gene expression analysis in the parental and F2 mice

We performed microarray analysis in 3 males and 3 females of each parental strain (MRL and SM) as well as in 282 F2 mice. In the parental strains, 3,423 genes were differentially expressed at the significant level (P < 0.05) in males or females. In the 282 F2 mice, 2,840 genes were cis-regulated at the significant level (P < 0.05), and 1,411 additional genes were cis-regulated at the suggestive level (P < 0.63), while 3,399 and 3,150 genes were cis-regulated in males only and females only, respectively (2,095 and 1,907 at the significant level, respectively) (supplementary Table II). Overall, the expression of 631 genes was correlated with triglycerides in males and females together, 671 in males only, and 622 in females only. The genes showing the strongest correlation are indicated in supplementary Table III. Molecular evidence for QTL genes based on expression differences between the parental strains includes a gene that is differentially expressed between MRL and SM and whose expression is cis-regulated and correlated with triglyceride levels in the F2 mice. We identified 433 potential candidate genes in males and females together, 347 in males alone, and 201 in females alone. About 34% of the genes identified in the combined sex analysis were also identified in the male- or female-specific analysis. The genes that did not overlap reflected either a lower number of mice in the sex-specific analysis (if the gene was identified in the combined sex analysis but not in the male- or female-specific analysis, 27% of the genes), or a sex-specific cis-QTL or correlation (if the gene was identified in males or females but not in the combined sex analysis, 39% of the genes). Genes that were located under a significant QTL and showed molecular evidence of a QTG based on expression were subject to conditional linkage analysis.

Identification of Slc27a5 as a candidate gene for the Chr 7 triglycerides QTL (Tgq34)

The Chr 7 QTL was identified in the combined sex QTL analysis at the suggestive level (LOD = 2.2, 95% CI = 17.0-139.4 Mb) and in the male-specific analysis at the significant level (LOD = 4.1, 95% CI = 17.0-30.4 Mb) (Table 2), indicating that the allele effect is stronger in males than in females. We therefore examined this QTL in males only. Through haplotype analysis, we reduced the number of candidate genes from 660 to 395 (Fig. 3A). Among the candidate genes, we identified 3 genes with an amino acid change between MRL and SM characterized as damaging (Q386R in Psg29, pregnancy-specific glycoprotein 29; I27F in Igfl3, IGF-like family member 3; and E468G in Rasgrp4, RAS guanyl releasing protein 4) (Table 4). In addition, we identified 17 genes for which the expression was cis-regulated in the F2 males (N = 146), differentially expressed between MRL and SM males, and correlated with triglycerides in males (Table 4). We applied conditional expression analysis in males and identified 3 genes likely to be causal: Slc27a5 [solute carrier family 27 (fatty acid transporter, member 5)], Sae1 (SUMO1 activating enzyme subunit 1), and Cadm4 (cell adhesion molecule 4). The expression of all 3 genes was significantly different between MRL and SM, cis-regulated, and correlated with triglyceride levels. We verified that the cis-regulation of the gene expression was not due to the presence of a polymorphism at the probe binding sites in all 3 genes. No polymorphism was reported at any probe binding sites of Cadm4 in Ensembl. For Sae1 and Slc27a5, we identified several polymorphisms at the probe binding sites, and we performed QTL analysis at the probe level (supplementary Table IV). For Slc27a5, we identified 34 out of 35 probes cis-regulated; 31 of them did not carry a polymorphism segregating between MRL and SM (supplementary Table IV). For Sae1, we identified 19 out of 23 probes cis-regulated, none of which carried a segregating polymorphism. This indicated that the cis-regulation of Slc27a5 and Sae1 was not due to a polymorphism at the probe binding site. Among the 6 final candidate genes (Psg29, Igfl3, Rasgrp4, Cadm4, Sae1, and Slc27a5), only 1 (Slc27a5) was known to affect triglyceride level, and we compared our bioinformatic results to the published data (33). Expression of Slc27a5 was higher in F2 males carrying the SM allele compared with MRL allele (−2.34-fold change, P < 0.001), cis-regulated (eQTL on Chr7@9.2, LOD = 13.5), and positively correlated with triglycerides (r = +0.34, P < 0.001) (Table 4 and supplementary Table III). In addition, at the QTL, homozygous SM mice had higher triglycerides compared with homozygous MRL mice. These results fit with the published knockout mouse model for Slc27a5 that exhibits lower triglyceride levels (33). Conditioning triglycerides for the expression of Slc27a5 in males lowered the LOD score from 4.1 to 1.1, and conditioning the Slc27a5 eQTL for triglycerides did not lower the LOD score below the suggestive threshold (from 13.5 to 11.2) (Table 5). Although we cannot exclude that one of the other candidate genes may be responsible for the QTL, because the bioinformatic evidence from our cross fit the in vivo evidence from literature, we concluded that Slc27a5 is the triglyceride QTL gene for the Chr 7 triglyceride QTL.

Fig. 3.
Using the bioinformatic tools to narrow the significant QTL for triglycerides on Chr 7 (A), Chr 15 (B), and Chr 17 (C). Each panel shows the relevant bioinformatic tools used to reduce the confidence interval of the QTL (upper bar) with haplotype analysis ...
TABLE 4.
Candidate genes for triglyceride QTL on Chr 7, 15 and 17 in the MRL×SM F2 mice
TABLE 5.
Conditional linkage analysis for the candidate genes for each significant QTL

Candidate genes for the QTL on mouse Chr 15 (Tgq35)

Peroxisome proliferator activated receptor alpha (Ppara), located at 85.5 Mb right at the QTL peak, is known to be involved in triglyceride metabolism (34) and was a priori the most likely QTL gene. However, we failed to find any evidence supporting Ppara as a candidate gene in either the resequencing or expression studies. Ppara did not have any noncoding polymorphism between strains MRL and SM. Similarly, the expression studies did not fit the characteristics of the QTL. This QTL was found in females but not in males. Although the expression differed between the parental strains in males and females (+2.14-fold change MRL versus SM, P < 0.001 in males, and +1.46-fold change MRL versus SM, P = 0.026 in females), the expression in F2 mice showed that Ppara was cis-regulated only in males and not in females (supplementary Table V). The lack of cis-regulation in females indicates that the expression difference found between the parental strains cannot account for the QTL. In addition, the expression of Ppara was not correlated with triglycerides (P > 0.05) (supplementary Table V).

After failing to find any support for Ppara as a QTL gene, we investigated the QTL for new candidate genes by applying our bioinformatic approach. This QTL was observed in the combined sex (LOD = 4.1 at 82.8 Mb) and the female-only analyses (LOD = 4.1 at 78 Mb) at the significant level (Table 2). Through haplotype analysis, we reduced the number of candidate genes from 388 to 145 within the 18.3 Mb locus (Fig. 3B). Among these genes, we identified one candidate gene based on an amino acid change between MRL and SM characterized as damaging: Recql4 (recQ protein-like 4), L527M. We also identified 5 genes whose expression was strongly correlated with triglyceride levels in the F2 populations where the QTL was identified for males and females combined and females only (supplementary Table III): Polr3h, polymerase (RNA) III (DNA directed) polypeptide H; Tspo, translocator protein; Ttll12, tubulin tyrosine ligase-like family, member 12; and Cyp2d22 and Cyp2d26, cytochrome P450, family 2, subfamily d, polypeptides 22 and 26. The differential expression of these 5 genes between MRL and SM was found to be causal in the combined sex or in the female-only analyses (Tables 4 and 5). Two of the genes (Polr3h and Tspo) were not originally identified by haplotype analysis, but because of their strong correlation with triglyceride levels in the combined sex and female-only analyses, we suspect that a low SNP density in the database was responsible for their exclusion from the haplotype analysis. On the basis of the strong expression evidence, we therefore added these 2 genes to our list of candidate genes. We also determined that the expression differences between MRL and SM as well as the cis-regulation and correlation for all genes were not due to polymorphisms within any probe binding sites. Cyp2d22 did not have any reported polymorphisms within any probe binding site in Ensembl. For Cyp2d26, all 22 probes were cis-regulated, and no segregating SNPs within any probe binding site were identified (supplementary Table IV). For Polr3h, Ttll12 and Tspo, 19 out of 27 probes (70%), 18 out of 23 probes (78%), and 14 out of 25 probes (56%), respectively, were cis-regulated, and none of them carried a polymorphism that differed between MRL and SM (supplementary Table IV). Additional studies, such as congenic mice, will help determine which of these genes is the triglyceride QTL gene. None of these genes is known to affect triglyceride metabolism.

Additional candidate genes on Chr 17

The Chr 17 QTL was identified in the combined sex analysis (LOD = 3.2 at 64.6 Mb) and in males only (LOD = 2.3 at 74.1 Mb), both at the suggestive level (Table 2 and Fig. 3C). This QTL has previously been observed in an intercross between MRL and SJL. We did not have any information on which allele was the high or low triglyceride allele at this locus, but MRL is the common strain between both crosses. Therefore, we hypothesized that the same gene must be responsible for the QTL in both crosses. We reduced the number of candidate genes from 273 to 70 by haplotype analysis using the following criteria: MRL ≠ (SM = SJL). We identified four candidate genes at this locus. One is based on a segregating nonsynonymous polymorphism (4930564C03Rik, L150V), and three are based on expression differences between the parental strains, cis-regulation and correlation with triglycerides in the F2 mice: Ubr2 (ubiquitin protein ligase E3 component n-recognin 2), Treml4 (triggering receptor expressed on myeloid cells-like 4) and 2310039H08Rik (Fig. 3C and Table 4). We did not apply the conditional linkage approach because the QTL (Tgq1) was not significant. None of these genes is known to affect triglyceride metabolism.

DISCUSSION

In this study, we performed QTL mapping for triglycerides using an intercross between inbred mouse strains MRL and SM. We identified four QTL on Chrs 2, 7, 15, and 17. We then applied our mouse bioinformatic “toolbox” to identify candidate genes located under the significant QTL. Our bioinformatic toolbox is based on recommendations by the Complex Trait Consortium (CTC) (8). This powerful approach includes haplotype analysis and a search for the presence of nonsynonymous coding polymorphisms or differential expression between the parental strains (9, 10). To expand our “classic” bioinformatic toolbox, we also performed expression profiling in the F2 mice and applied conditional genome scan analysis. In F2 mice, eQTL data allowed us to determine i) cis-regulated genes, ii) significant correlations between gene expression and triglyceride level, and iii) genes for which the expression is likely to be causal to the QTL. The strength of our study was having all of these tools available, so that their results could be combined to improve discovery of the causal QTL gene.

The mouse bioinformatic toolbox, as described previously (9, 10), offers advantages that are not readily available for other animal models. The search for candidate genes, which consists of comparing the two parental strains, is straightforward and requires limited laboratory experiments. Databases and bioinformatic tools are publicly available to perform haplotype analysis and explore the presence of a nonsynonymous coding polymorphism and differential gene expression between the parental strains.

However, we recognize that our bioinformatic approach itself has limitations. First, while our search for genes carrying different haplotypes has been previously successful in identifying complex trait genes (35, 36), it may also miss a candidate gene due to the scarcity of SNPs genotyped at the locus, which we suspect happened for Tspo and Polr3h on Chr 15. Second, the large expression differences between the two parental strains or the strong cis-QTL could be due to the presence of a polymorphism at the probe binding site. In our study, we confirmed that this was not the case for our eight candidate genes, but we cannot exclude the possibility, especially if the cross involved a wild-derived strain (37).
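The probe binding site check mentioned above amounts to an interval-overlap test between known SNP positions and probe coordinates. The following Python sketch shows one way to express it; the gene labels, coordinates, and SNP identifiers are hypothetical, and the procedure actually used in the study is not described in this section.

```python
from typing import Dict, List, NamedTuple


class Probe(NamedTuple):
    gene: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


class Snp(NamedTuple):
    snp_id: str
    chrom: str
    pos: int


def probes_with_snps(probes: List[Probe], snps: List[Snp]) -> Dict[str, List[str]]:
    """Map each gene to the SNP ids that fall inside its probe interval."""
    hits: Dict[str, List[str]] = {}
    for probe in probes:
        overlapping = [s.snp_id for s in snps
                       if s.chrom == probe.chrom and probe.start <= s.pos <= probe.end]
        if overlapping:
            hits[probe.gene] = overlapping
    return hits


# Hypothetical coordinates for illustration only.
PROBES = [Probe("gene_a", "15", 1000, 1024), Probe("gene_b", "15", 5000, 5024)]
SNP_LIST = [Snp("snp_1", "15", 1010), Snp("snp_2", "7", 5010), Snp("snp_3", "15", 4999)]

print(probes_with_snps(PROBES, SNP_LIST))
```

A gene flagged by such a check would need its expression difference confirmed by a method that does not depend on the affected probe.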

Conditional causality modeling methods on their own are very powerful in narrowing the confidence interval of a QTL and identifying candidate genes for which an expression difference is responsible for the QTL (31, 32, 38). These studies require performing expression profiling in all F2 mice, thus increasing the cost of the microarray experiments. Therefore, only a few studies have been performed and published (16–18), and they often focused solely on differential gene expression as the cause of a QTL. However, an amino acid difference between the parental strains can also cause the QTL by influencing the structure or function of the protein (35, 39). In our study, we strengthened our approach by including this possibility: we screened the CGD SNP database for nonsynonymous coding polymorphisms segregating between MRL and SM and used SIFT to predict whether these changes are damaging. However, rare variants (present only in specific strains not commonly used, such as MRL) are likely to have been overlooked by using the CGD SNP database. New SNP databases based on next-generation sequencing of the entire mouse genome are being developed by the Sanger Institute, but currently they are available for only a few strains (not MRL and SM). This valuable resource will give access to the complete information of the mouse genome and help complete the picture of the underlying molecular evidence of a QTL between two inbred strains by identifying i) polymorphisms within probe binding sites that could lead to a false difference of expression between the parental strains, and ii) polymorphisms within the coding sequence that could lead to a difference in structure or function of the protein.
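The idea behind conditioning a QTL on gene expression can be sketched with a simplified single-marker regression in Python. This is not the structural modeling used in the cited studies: it assumes an additive 0/1/2 genotype coding, simulated data, and a LOD score computed as (n/2)·log10(RSS_null/RSS_full). If a transcript mediates the effect of the locus on the trait, adding its expression as a covariate should make the LOD score at the marker drop sharply, whereas an unrelated transcript should leave it nearly unchanged.

```python
import numpy as np


def rss(y: np.ndarray, design: np.ndarray) -> float:
    """Residual sum of squares from an ordinary least-squares fit of y on design."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)


def lod(y: np.ndarray, genotype: np.ndarray, covariate=None) -> float:
    """Single-marker LOD score, optionally conditioned on one covariate."""
    n = len(y)
    cols = [np.ones(n)]                      # intercept
    if covariate is not None:
        cols.append(covariate)               # covariate enters both models
    null_design = np.column_stack(cols)
    full_design = np.column_stack(cols + [genotype])
    return (n / 2.0) * float(np.log10(rss(y, null_design) / rss(y, full_design)))


# Simulated F2-like data (assumption): genotype -> mediator expression -> trait.
rng = np.random.default_rng(0)
n = 300
genotype = rng.choice([0.0, 1.0, 2.0], size=n, p=[0.25, 0.5, 0.25])
expr_mediator = genotype + rng.normal(scale=0.5, size=n)
expr_unrelated = rng.normal(size=n)
triglyceride = expr_mediator + rng.normal(scale=0.5, size=n)

lod_plain = lod(triglyceride, genotype)
lod_given_mediator = lod(triglyceride, genotype, expr_mediator)
lod_given_unrelated = lod(triglyceride, genotype, expr_unrelated)
print(lod_plain, lod_given_mediator, lod_given_unrelated)
```

A full analysis would use a two-degree-of-freedom intercross model and genome-wide permutation thresholds, for example through the covariate options of R/qtl (26, 30), rather than this single-marker approximation.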

Finally, the conditional modeling approach usually results in a short list of candidate genes. Unless additional in vivo work is performed, no gene can be determined as the QTL gene. In our study, we reduced each significant QTL to only a few genes. From the list of three candidate genes on Chr 7 found to be potentially causal based on expression (Slc27a5, Sae1, and Cadm4), we determined that Slc27a5 was the QTL gene in males based on a knockout mouse model for Slc27a5 that had previously shown a lower triglyceride level compared with controls (33). This is consistent with our expression data, in which lower expression of Slc27a5 was found in the low triglyceride strain. The other QTL were reduced to only a few genes, such as the six genes on Chr 15, with some likely candidates based on function. Additional work in vivo, however, will be necessary to determine which gene is the QTL gene. On Chr 15, we identified two genes from the P450 cytochrome gene family, Cyp2d22 and Cyp2d26, for which the expression is likely to be causal to the QTL. While these two genes have not been shown to be involved in triglyceride metabolism, another member from the same gene family has been: the knockout for Cyp19a1 shows increased triglyceride levels (40). We also identified Ttll12 as a potential candidate gene for the Chr 15 QTL. This gene has recently been shown to be involved in tubulin posttranslational modification and chromosomal ploidy and may contribute to the development of tumors in prostate cancer (41). None of the identified genes has a known role in triglyceride metabolism, and additional molecular and in vivo studies, such as congenic mice, must be used to determine which gene is the QTL gene.

To conclude, we identified new genomic loci regulating triglycerides in mice. Most of the QTL identified in this study are new. The development of advanced bioinformatic tools, expression QTL analysis, methods for causal inference, and large SNP databases will help to identify the causal genes for these QTL. Their discovery will provide potential new targets for drug development and lead to improved treatment for coronary artery disease.

Acknowledgments

The authors would like to thank Harry Whitmore for his help with mouse husbandry, Joanne Currer for editing the manuscript, and Jesse Hammer for graphical assistance.

Footnotes

Abbreviations:

CAD, coronary artery disease
CGD, Center for Genome Dynamics
Chr, chromosome
CI, confidence interval
CTC, Complex Trait Consortium
LOD, logarithm of the odds
na, not applicable
Ppara, peroxisome proliferator activated receptor alpha
QTG, QTL gene
QTL, quantitative trait loci
eQTL, expression QTL
RF1, reciprocal F1
SIFT, sorts intolerant from tolerant
SNP, single nucleotide polymorphism
TG, triglyceride

This work was supported by National Institutes of Health Grants HL-077796 and HL-081162 (B.P.), an American Heart Association post-doctoral fellowship (M.S.L.), and National Institutes of Health, National Cancer Institute Core Grant CA-034196 (Jackson Laboratory). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health or other granting agencies.

[S] The online version of this article (available at http://www.jlr.org) contains supplementary data in the form of five tables.

REFERENCES

1. Castelli W. P., Doyle J. T., Gordon T., Hames C. G., Hjortland M. C., Hulley S. B., Kagan A., Zukel W. J. 1977. HDL cholesterol and other lipids in coronary heart disease. The cooperative lipoprotein phenotyping study. Circulation. 55: 767–772 [PubMed]
2. Saxena R., Voight B. F., Lyssenko V., Burtt N. P., de Bakker P. I., Chen H., Roix J. J., Kathiresan S., Hirschhorn J. N., Daly M. J., et al. 2007. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 316: 1331–1336 [PubMed]
3. Kathiresan S., Melander O., Guiducci C., Surti A., Burtt N. P., Rieder M. J., Cooper G. M., Roos C., Voight B. F., Havulinna A. S., et al. 2008. Six new loci associated with blood low-density lipoprotein cholesterol, high-density lipoprotein cholesterol or triglycerides in humans. Nat. Genet. 40: 189–197 [PMC free article] [PubMed]
4. Kathiresan S., Willer C. J., Peloso G. M., Demissie S., Musunuru K., Schadt E. E., Kaplan L., Bennett D., Li Y., Tanaka T., et al. 2009. Common variants at 30 loci contribute to polygenic dyslipidemia. Nat. Genet. 41: 56–65 [PMC free article] [PubMed]
5. Teslovich T. M., Musunuru K., Smith A. V., Edmondson A. C., Stylianou I. M., Koseki M., Pirruccello J. P., Ripatti S., Chasman D. I., Willer C. J., et al. 2010. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 466: 707–713 [PMC free article] [PubMed]
6. Wang X., Paigen B. 2005. Genome-wide search for new genes controlling plasma lipid concentrations in mice and humans. Curr. Opin. Lipidol. 16: 127–137 [PubMed]
7. Stylianou I. M., Langley S. R., Walsh K., Chen Y., Revenu C., Paigen B. 2008. Differences in DBA/1J and DBA/2J reveal lipid QTL genes. J. Lipid Res. 49: 2402–2413 [PMC free article] [PubMed]
8. Abiola O., Angel J. M., Avner P., Bachmanov A. A., Belknap J. K., Bennett B., Blankenhorn E. P., Blizard D. A., Bolivar V., Brockmann G. A., et al. 2003. The nature and identification of quantitative trait loci: a community's view. Nat. Rev. Genet. 4: 911–916 [PMC free article] [PubMed]
9. DiPetrillo K., Wang X., Stylianou I. M., Paigen B. 2005. Bioinformatics toolbox for narrowing rodent quantitative trait loci. Trends Genet. 21: 683–692 [PubMed]
10. Burgess-Herbert S. L., Cox A., Tsaih S. W., Paigen B. 2008. Practical applications of the bioinformatics toolbox for narrowing quantitative trait loci. Genetics. 180: 2227–2235 [PMC free article] [PubMed]
11. Cozma D., Lukes L., Rouse J., Qiu T. H., Liu E. T., Hunter K. W. 2002. A bioinformatics-based strategy identifies c-Myc and Cdc25A as candidates for the Apmt mammary tumor latency modifiers. Genome Res. 12: 969–975 [PMC free article] [PubMed]
12. Flint J., Valdar W., Shifman S., Mott R. 2005. Strategies for mapping and cloning quantitative trait genes in rodents. Nat. Rev. Genet. 6: 271–286 [PubMed]
13. Szatkiewicz J. P., Beane G. L., Ding Y., Hutchins L., Pardo-Manuel de Villena F., Churchill G. A. 2008. An imputed genotype resource for the laboratory mouse. Mamm. Genome. 19: 199–208 [PMC free article] [PubMed]
14. Shockley K. R., Witmer D., Burgess-Herbert S. L., Paigen B., Churchill G. A. 2009. The effects of atherogenic diet on hepatic gene expression across mouse strains. Physiol. Genomics. 39: 172–182 [PMC free article] [PubMed]
15. Rockman M. V. 2008. Reverse engineering the genotype-phenotype map with natural genetic variation. Nature. 456: 738–744 [PubMed]
16. Mehrabian M., Allayee H., Stockton J., Lum P. Y., Drake T. A., Castellani L. W., Suh M., Armour C., Edwards S., Lamb J., et al. 2005. Integrating genotypic and expression data in a segregating mouse population to identify 5-lipoxygenase as a susceptibility gene for obesity and bone traits. Nat. Genet. 37: 1224–1233 [PubMed]
17. Cervino A. C., Li G., Edwards S., Zhu J., Laurie C., Tokiwa G., Lum P. Y., Wang S., Castellani L. W., Lusis A. J., et al. 2005. Integrating QTL and high-density SNP analyses in mice to identify Insig2 as a susceptibility gene for plasma cholesterol levels. Genomics. 86: 505–517 [PubMed]
18. Bhasin J. M., Chakrabarti E., Peng D. Q., Kulkarni A., Chen X., Smith J. D. 2008. Sex specific gene regulation and expression QTLs in mouse macrophages from a strain intercross. PLoS ONE. 3: e1435. [PMC free article] [PubMed]
19. Farber C. R., Aten J. E., Farber E. A., de Vera V., Gularte R., Islas-Trejo A., Wen P., Horvath S., Lucero M., Lusis A. J., et al. 2009. Genetic dissection of a major mouse obesity QTL (Carfhg2): integration of gene expression and causality modeling. Physiol. Genomics. 37: 294–302 [PMC free article] [PubMed]
20. Leduc M. S., Hageman R. S., Meng Q., Verdugo R. A., Tsaih S. W., Churchill G. A., Paigen B., Yuan R. 2010. Identification of genetic determinants of IGF-1 levels and longevity among mouse inbred strains. Aging Cell. 9: 823–836 [PMC free article] [PubMed]
21. Cox A., Ackert-Bicknell C. L., Dumont B. L., Ding Y., Bell J. T., Brockmann G. A., Wergedal J. E., Bult C., Paigen B., Flint J., et al. 2009. A new standard genetic map for the mouse. Genetics. 182: 1335–1344 [PMC free article] [PubMed]
22. Ng P. C., Henikoff S. 2001. Predicting deleterious amino acid substitutions. Genome Res. 11: 863–874 [PMC free article] [PubMed]
23. Bolstad B. M., Irizarry R. A., Astrand M., Speed T. P. 2003. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 19: 185–193 [PubMed]
24. Dai M., Wang P., Boyd A. D., Kostov G., Athey B., Jones E. G., Bunney W. E., Myers R. M., Speed T. P., Akil H., et al. 2005. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res. 33: e175. [PMC free article] [PubMed]
25. Lehmann E. L. 1975. Nonparametrics: Statistical Methods Based on Ranks. Holden-Day, San Francisco
26. Broman K. W., Wu H., Sen S., Churchill G. A. 2003. R/qtl: QTL mapping in experimental crosses. Bioinformatics. 19: 889–890 [PubMed]
27. Su Z., Tsaih S. W., Szatkiewicz J., Shen Y., Paigen B. 2008. Candidate genes for plasma triglyceride, FFA, and glucose revealed from an intercross between inbred mouse strains NZB/B1NJ and NZW/LacJ. J. Lipid Res. 49: 1500–1510 [PMC free article] [PubMed]
28. Korstanje R., Li R., Howard T., Kelmenson P., Marshall J., Paigen B., Churchill G. 2004. Influence of sex and diet on quantitative trait loci for HDL cholesterol levels in an SM/J by NZB/BlNJ intercross population. J. Lipid Res. 45: 881–888 [PubMed]
29. Broman K. W., Sen S., Owens S. E., Manichaikul A., Southard-Smith E. M., Churchill G. A. 2006. The X chromosome in quantitative trait locus mapping. Genetics. 174: 2151–2158 [PMC free article] [PubMed]
30. Broman K. W., Sen S. 2009. A Guide to QTL Mapping with R/qtl. Springer, NY
31. Li R., Tsaih S. W., Shockley K., Stylianou I. M., Wergedal J., Paigen B., Churchill G. A. 2006. Structural model analysis of multiple quantitative traits. PLoS Genet. 2: e114. [PMC free article] [PubMed]
32. Chaibub Neto E., Keller M. P., Attie A. D., Yandell B. S. 2010. Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. Ann. Appl. Stat. 4: 320–339 [PMC free article] [PubMed]
33. Doege H., Baillie R. A., Ortegon A. M., Tsang B., Wu Q., Punreddy S., Hirsch D., Watson N., Gimeno R. E., Stahl A. 2006. Targeted deletion of FATP5 reveals multiple functions in liver metabolism: alterations in hepatic lipid homeostasis. Gastroenterology. 130: 1245–1258 [PubMed]
34. Lee S. S., Pineau T., Drago J., Lee E. J., Owens J. W., Kroetz D. L., Fernandez-Salguero P. M., Westphal H., Gonzalez F. J. 1995. Targeted disruption of the alpha isoform of the peroxisome proliferator-activated receptor gene in mice results in abolishment of the pleiotropic effects of peroxisome proliferators. Mol. Cell. Biol. 15: 3012–3022 [PMC free article] [PubMed]
35. Wang X., Korstanje R., Higgins D., Paigen B. 2004. Haplotype analysis in multiple crosses to identify a QTL gene. Genome Res. 14: 1767–1772 [PMC free article] [PubMed]
36. Su Z., Wang X., Tsaih S. W., Zhang A., Cox A., Sheehan S., Paigen B. 2009. Genetic basis of HDL variation in 129/SvImJ and C57BL/6J mice: importance of testing candidate genes in targeted mutant mice. J. Lipid Res. 50: 116–125 [PMC free article] [PubMed]
37. Verdugo R. A., Farber C. R., Warden C. H., Medrano J. F. 2010. Serious limitations of the QTL/microarray approach for QTL gene discovery. BMC Biol. 8: 96. [PMC free article] [PubMed]
38. Farber C. R., van Nas A., Ghazalpour A., Aten J. E., Doss S., Sos B., Schadt E. E., Ingram-Drake L., Davis R. C., Horvath S., et al. 2009. An integrative genetics approach to identify candidate genes regulating BMD: combining linkage, gene expression, and association. J. Bone Miner. Res. 24: 105–116 [PMC free article] [PubMed]
39. Suto J. 2005. Apolipoprotein gene polymorphisms as cause of cholesterol QTLs in mice. J. Vet. Med. Sci. 67: 583–589 [PubMed]
40. Jones M. E., Thorburn A. W., Britt K. L., Hewitt K. N., Misso M. L., Wreford N. G., Proietto J., Oz O. K., Leury B. J., Robertson K. M., et al. 2001. Aromatase-deficient (ArKO) mice accumulate excess adipose tissue. J. Steroid Biochem. Mol. Biol. 79: 3–9 [PubMed]
41. Wasylyk C., Zambrano A., Zhao C., Brants J., Abecassis J., Schalken J. A., Rogatsch H., Schaefer G., Pycha A., Klocker H., et al. 2010. Tubulin tyrosine ligase like 12 links to prostate cancer through tubulin posttranslational modification and chromosome ploidy. Int. J. Cancer. 127: 2542–2553 [PubMed]
42. Gu L., Johnson M. W., Lusis A. J. 1999. Quantitative trait locus analysis of plasma lipoprotein levels in an autoimmune mouse model: interactions between lipoprotein metabolism, autoimmune disease, and atherogenesis. Arterioscler. Thromb. Vasc. Biol. 19: 442–453 [PubMed]
43. Srivastava A. K., Mohan S., Masinde G. L., Yu H., Baylink D. J. 2006. Identification of quantitative trait loci that regulate obesity and serum lipid levels in MRL/MpJ x SJL/J inbred mice. J. Lipid Res. 47: 123–133 [PubMed]
44. Maltais L. J., Blake J. A., Chu T., Lutz C. M., Eppig J. T., Jackson I. 2002. Rules and guidelines for mouse gene, allele, and mutation nomenclature: a condensed version. Genomics. 79: 471–474 [PubMed]

Articles from Journal of Lipid Research are provided here courtesy of American Society for Biochemistry and Molecular Biology