Am J Hum Genet. Apr 11, 2008; 82(4): 849–858.
Published online Apr 4, 2008. doi:  10.1016/j.ajhg.2008.01.018
PMCID: PMC2427263

On the Replication of Genetic Associations: Timing Can Be Everything!

Abstract

The failure of researchers to replicate genetic-association findings is most commonly attributed to insufficient statistical power, population stratification, or various forms of between-study heterogeneity or environmental influences.1 Here, we illustrate another potential cause for nonreplications that has so far not received much attention in the literature. We illustrate that the strength of a genetic effect can vary by age, causing “age-varying associations.” If not taken into account during the design and the analysis of a study, age-varying genetic associations can cause nonreplication. By using the 100K SNP scan of the Framingham Heart Study, we identified an age-varying association between a SNP in ROBO1 and obesity and hypothesized an age-gene interaction. This finding was followed up in eight independent samples comprising 13,584 individuals. The association was replicated in five of the eight studies, showing an age-dependent relationship (one-sided combined p = 3.92 × 10−9, combined p value from pediatric cohorts = 2.21 × 10−8, combined p value from adult cohorts = 0.00422). Furthermore, this study illustrates that it is difficult for cross-sectional study designs to detect age-varying associations. If the specifics of age- or time-varying genetic effects are not considered in the selection of both the follow-up samples and in the statistical analysis, important genetic associations may be missed.

Introduction

It is generally accepted that replication of identified associations in other populations is necessary to validate initial findings (that is, the differentiation of true-positive from false-positive results) and possibly generalize them to other populations. However, genetic-association studies have faced the challenge of inconsistent replication, both in the context of genome-wide association (GWA) studies and, more commonly, in the context of candidate-gene studies.2–7 Recent technological advances enabling large-scale SNP genotyping at relatively low cost, the development of several such platforms with polymorphism content of sufficient density, and the advent of powerful statistical methods for the analysis of such data have resulted in the successful implementation of GWA studies and mapping of complex-trait susceptibility loci in several diseases, including age-related macular degeneration, diabetes, obesity, and inflammatory bowel disease.8,9 The number of whole-genome scans on large cohorts is increasing rapidly, making replication imperative for the detection of valid associations. A 100K scan in the Framingham Heart Study (FHS) identified an association between the phenotype body mass index (BMI) and a common variant, a SNP with minor allele frequency of 0.30 located near the INSIG2 gene. In the original paper by Herbert et al.3 as well as in subsequent replication attempts by other groups, the original association between BMI and the SNP near INSIG2 was confirmed in some but not all cohorts, emphasizing both the complexity and the challenges for replications of association findings for complex traits.9–14

Difficulty in replicating genetic-association findings is most commonly attributed to insufficient statistical power, population stratification, or various forms of between-study heterogeneity, including differences in genetic ancestry (i.e., linkage-disequilibrium patterns), ascertainment schema (e.g., controls for different diseases are used to replicate an association for different phenotypes), and environmental influences.1 Here, we illustrate that genetic effects for complex traits can vary by age and that such an interaction can prevent replications if the age-varying character of an association is not taken into account in the selection of both the replication samples and the statistical analysis strategy. Although the importance of age-dependent genetic effects for BMI has been suggested,15 their incorporation into current genetic analyses is not common.

The previously mentioned association between obesity, measured as BMI (kg/m2), and a common genetic variant near INSIG2 was identified via a scan of 86,604 SNPs among 923 individuals in the FHS offspring cohort.9 Recently, genotyping of the entire FHS sample with 1322 individuals was completed on the 100K platform, and genetic information for an additional 399 subjects became available. We then reanalyzed the data, using the VanSteen Screening and Testing algorithm for family data.16 Given the availability of longitudinal BMI measurements in the FHS, we selected the family-based association test-principal components (FBAT-PC) approach, which is particularly suited to detect both genetic main effects and time-varying effects without having to make prior assumptions about the presence of either effect.17 Our reanalysis identified a second common genetic variant (rs1455832) at a different gene locus intronic to the gene roundabout axon guidance receptor, homolog 1 (ROBO1) (AHC [MIM 602430]) that achieved genome-wide significance (unadjusted p = 0.000624; Table 1). Because these data were longitudinal in nature (six BMI measurements per person), a profile plot that illustrates the effect of genotype on BMI over time was generated (Figure 1). This plot revealed a genetic effect between rs1455832 and BMI that varies by age. For the minor allele homozygote (CC), we observe a gene-age interaction associated with increased BMI that diminishes after age 45. We therefore selected as our replication hypothesis a gene-age interaction where the CC genotype is associated with increased BMI until the age of 45, at which point the effect diminishes.

Figure 1. Profile Plots of Body Mass Index by Age for CC and CT or TT Genotypes in FHS and CAMP
Table 1. Screening and Testing of SNPs for Association with Body Mass Index on the Basis of the 86,604 SNP Scan in 1,322 Subjects of the Framingham Heart Study: The Top Ten Powered SNPs

We then attempted to replicate the association in eight additional cohorts totaling 13,584 subjects. These cohorts included studies of unrelated individuals and families, different ethnicities, varying ascertainment conditions, and different age ranges (childhood, adult). The objective of the replication effort was the assessment of whether the age-varying effect that was observed in the FHS was study-specific or could be generalized to the other samples. Further, assuming that the age-varying effect was unknown before the replication analysis, we assessed whether the default statistical analysis of the sample would have detected an age-dependent effect.

Material and Methods

Replication Samples

We used data on 1322 participants from the FHS offspring cohort in this analysis. Details of the study design, recruitment, and follow up are explained in more detail elsewhere.18–20 In brief, this cohort consisted of the offspring and spouses of offspring of participants in the original FHS cohort, who have been followed prospectively since 1971, and their children. The mean and median ages at the first examination were 31.9 years and 31 years, respectively. Individuals were recruited for this study because they lived in the town of Framingham, MA, without regard to the presence of a particular disease. Obesity was documented in BMI, which is calculated by the division of weight in kilograms by height in meters squared: kg/m2. Measurements of BMI were taken six times throughout an average time span of 23.5 years. The second examination occurred approximately 8 years after the first examination, and subsequent examinations occurred on average every 4 years thereafter. A total of 116,204 SNPs were initially genotyped, of which 86,604 were used in the association analyses after the appropriate exclusions were made.9 The samples were genotyped with the Affymetrix GeneChip Human Mapping 100K SNP set. The genotype call rate was 99.1% with no discordance among replicate samples.

Family-based association analyses were performed with the FBAT-PC methodology implemented in PBAT on the six longitudinal measures of BMI.17 In the initial analysis, additive, dominant, and recessive models were all considered. The FBAT-PC methodology works optimally over multiple time points when the genetic effect varies over time and in the presence of environmental correlation. When constructing the overall phenotype, the FBAT-PC methodology up-weights the time points with heritability evidence in the phenotypes and down-weights the time points with little evidence of heritability in the phenotypes.

Eight independent study samples were used to replicate the initial findings. These study populations were selected to provide information across a range of ethnicities and age ranges. Replication populations include families of children diagnosed with asthma from the Central Valley of Costa Rica (Costa Rica); families of children diagnosed with asthma from a North American clinical trial (CAMP); a child-based sample from Athens, Greece (Gene Diet Attica Investigation on childhood obesity [GENDAI]); a population-based sample from Germany (KORA_S4); a population-based sample from Iceland (Iceland); and three cohorts of lean and obese people from Europe (European Sample), the United States (US sample), and the Merck pharmaceutical company (Merck).

Costa Rica

The rs1455832 polymorphism was genotyped in 424 nuclear families of children with asthma in the Central Valley of Costa Rica. Most of the approximately 2.85 million current residents of the Central Valley descend from approximately 4000 individuals counted in the census of 1697. Short screening questionnaires were sent to the parents of 9054 children ages 6 to 14 years enrolled in 95 schools in the Central Valley of Costa Rica from February of 2001 to March of 2005. Children were eligible for inclusion in the study of parent-child trios if they had physician-diagnosed asthma, at least two respiratory symptoms (wheezing, cough, or dyspnea) or a history of asthma attacks in the previous year, and high probability of having at least six great grandparents born in the Central Valley of Costa Rica (as determined by the study genealogist on the basis of the paternal and maternal last names of each of the child's parents).21 Phenotypic information collected from the subjects included measures of weight and height, from which BMI was calculated. This population was not ascertained on the basis of morphometric phenotypes. Genotyping was performed with the Illumina BeadStation 500G system.22 Genotyping completion rate was greater than 99.8% with no discordancies among replicate genotypes. Because of Mendelian inconsistencies, nine families were excluded from the analysis. In total, 409 families had sufficient SNP and BMI information to be included in the analysis. Written parental consent was obtained for participating children, for whom written assent was also obtained. The study was approved by the Institutional Review Boards of the Hospital Nacional de Niños (San José, Costa Rica) and Brigham and Women's Hospital (Boston, MA).

CAMP

The Childhood Asthma Management Program (CAMP) is a multicentered North American clinical trial evaluating the long-term effect of inhaled antiinflammatory medications in children with mild to moderate asthma. DNA samples were obtained from participating children and their parents as previously described.23–25 In brief, children ages 5 through 12 were enrolled between December 1993 and September 1995. Otherwise healthy children were eligible for enrollment if they had a diagnosis of asthma as defined by methacholine hyperreactivity (FEV1PC20 < 12.5 mg/ml) and at least one of the following criteria for 6 months in the year prior to enrollment: (1) asthma symptoms at least two times per week, (2) at least two usages per week of an inhaled bronchodilator, and (3) use of daily asthma medication. Measures of height and weight were collected at baseline and throughout the study. Because chronic inhaled corticosteroid use (one of the three treatment arms in the CAMP trial) is known to influence height (and in turn BMI),25 we restricted the statistical analysis to the baseline measure of BMI when we could not appropriately control for the treatment arm assignment in another way. When the treatment arm could be adjusted for properly, we used all repeated measurements. A total of 408 individuals of European descent with BMI and SNP information were used in the analysis. Genotyping was performed with the Sequenom system with a genotyping completion rate of 98%.26 Informed assent and consent were obtained from the study participants and their parents to collect DNA for genetic studies. The Institutional Review Board of the Brigham and Women's Hospital, as well as those of the other CAMP study centers, approved the study.

GENDAI

DNA samples were obtained from 1020 school-aged children in Athens, Greece, as previously described.27 The study group was composed of children between 9 and 14 years of age from 40 of 593 schools in selected areas of the Athens Attica region. Enrollment began in November 2005 and continued through 2006. The children were each examined by a physician and surveyed through parent questionnaires. Exclusions included lack of interest, refusal, chronic or acute illness, absence from school, relocation, or incomplete data. Data collected include anthropometrics, a fasting blood sample, 24 hr dietary and physical activity recall, food frequency questionnaire, and questionnaires on meal patterns, lifestyle, personal, and family characteristics. Genotyping was completed by an allele-specific primer extension of amplified products with detection by matrix-assisted laser desorption ionization time-of-flight mass spectroscopy with a Sequenom platform. Genotyping call rate was 99.6% on 796 samples with adequate DNA concentration, showing no consensus errors. The IRB of Children's Hospital Boston and of Harokopio University and the Greek Ministry of Education approved the study. Subjects with poor genotyping quality were excluded. In total, 796 subjects were included in the statistical analysis.

KORA_S4

This is the fourth population-based survey (S4) of the platform “Kooperative Gesundheitsforschung in der Region Augsburg” (KORA_S4). Details about the study, the BMI distribution and its dependence upon gender and age can be found elsewhere.28 In brief, this survey was conducted during the years 1999–2001. The sample was drawn to represent the population of the city of Augsburg and surrounding counties in a stratified fashion by 10 year age groups and by gender. Subjects were recruited via registry, were of German nationality, 25–75 years of age, 50% females, and mostly white Europeans (99%). Anthropometric measurements including weight and height were assessed according to the WHO MONICA protocol by trained staff.28 The association was analyzed in all subjects with genotype and phenotype available, yielding data on 3861 subjects. Genotyping was performed with MALDITOF MS (Sequenom, Mass ARRAY System, San Diego, CA).29 The genotype call rate was 95.84% with no discordancies among replicate samples. All study participants gave informed written consent according to the ethics committee of the Bavarian Medical Association and every attempt was made to ensure anonymity of the participants.

Iceland

DNA samples were obtained from a population-based group of 4865 Icelanders. Any possible relatedness between Icelanders was accounted for as described previously.10 The study group was composed of individuals who participated in studies of the genetic etiology of cardiovascular and metabolic diseases, and the majority of these subjects were recruited as unaffected relatives of subjects or as controls and did not have any history of metabolic or cardiovascular diseases. All participants in the study signed informed consent. All personal identifiers associated with tissue samples, clinical information, and genealogy were encrypted by the Data Protection Authority (DPA), with a third-party encryption system in which DPA maintains the code.30 The genotyping procedure has been previously described.31 Genotype call rate was 93.72%. Ethical approval for the present study was granted by the National Bioethics Committee (NBC reference number 01-033) and the Icelandic DPA.

The p values and confidence intervals were adjusted for relatedness of the Icelandic individuals with simulations as previously described.32 Genotypes for the causal SNP were simulated through the Icelandic Genealogy Database, based on more than 600,000 Icelanders, and the association test was performed. This procedure was repeated 50,000 times to generate an empirical distribution, from which the standard deviation under the null hypothesis of no association was generated and used to adjust both the p value and the confidence interval.
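One plausible reading of the adjustment described above is a rescaling of the observed test statistic by the spread of the statistics obtained under simulated null genotypes. The sketch below assumes that the simulated null statistics are already available as z scores (the genealogy-based genotype simulation itself is not shown), that the null mean is zero, and that a two-sided normal p value is wanted. The function name `adjust_for_relatedness` and the example numbers are hypothetical and are not taken from the study.

```python
import math
import statistics


def adjust_for_relatedness(observed_z, null_z_values):
    """Rescale an observed z score by the empirical null standard deviation.

    null_z_values: test statistics computed on genotypes simulated under
    no association (assumed input; the simulation is not implemented here).
    Returns (null_sd, adjusted_z, two_sided_p).
    """
    # Spread of the statistic under the null, measured around a mean of 0.
    null_sd = statistics.pstdev(null_z_values, mu=0.0)
    adjusted_z = observed_z / null_sd
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(adjusted_z) / math.sqrt(2.0))
    return null_sd, adjusted_z, p_value


# Toy input: a null distribution that is twice as wide as the standard normal.
null_sd, adjusted_z, p_value = adjust_for_relatedness(3.92, [-2.0, 2.0, -2.0, 2.0])
```

In this toy case the null standard deviation is 2, so an observed statistic of 3.92 is treated like 1.96. A confidence interval would be widened by the same factor.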

European and US Cohorts

The rs1455832 polymorphism was genotyped in obese and lean Caucasian subjects from Poland (EUROPEAN) and the United States (US) with data provided by Genomics Collaborative. Obese subjects were selected from among individuals in the 90th−97th percentile of the BMI distribution, and lean subjects were selected from the 5th–12th percentile. Such distributions were determined with information on gender, country of origin, and age by decade as previously described.2 The obese and lean subjects were matched by gender, decade of age, and country of origin of grandparents. Genotyping completion rates were 99.0% for the European sample and 99.6% for the US sample. There were a total of 701 cases and 331 controls in the European sample and 1218 cases and 624 controls in the US sample. Ages for these samples range from 31 to 70 years. Approval for this work was obtained from the IRB at Children's Hospital Boston.

Merck

The rs1455832 polymorphism was genotyped in a case-control study in which the cases had been recruited as part of a multicenter, double-blind, randomized, placebo-controlled study. Obese patients with BMI 30-43 kg/m2 and between the ages of 21 and 65 who met other entry criteria were eligible to participate. Only white patients were considered for the genetic study. Prior to randomization, there was a 2 week single-blind placebo run-in period. The BMI measures used in the genetic analysis were collected at the end of the run-in period and before any drug was administered as part of the trial. Genetic and phenotypic information was collected on a total of 275 cases. The controls were acquired from GHC from a databank of individuals selected and consented for scientific study. We selected and genotyped 547 lean, healthy, white controls equally split between males and females, with BMI less than 25 and age greater than 50. In addition, the controls were selected because they were free of all forms of diabetes, cardiovascular disease, and obesity, and they had no first-degree relatives with these diseases. All samples and patient data were handled in accordance with the policies and procedures approved by a Merck IRB.

Statistical Methods

The PBAT screening approach, as described previously, was implemented in the 100K analysis of the FHS data to minimize the multiple comparisons problem.9,16 In summary, the PBAT screening and testing algorithm follows four steps: (1) power is calculated for all SNPs with the conditional mean model, (2) the SNPs are rank ordered according to the power calculations, (3) the ten SNPs with the greatest power are selected, and (4) statistical analyses are performed on the selected subset of SNPs. We employed the PBAT screen while using FBAT-PC, which constructs one maximally heritable BMI component for each SNP from the six measures of BMI. Family-based association analysis is then applied to the univariate BMI phenotype.17 Golden Helix PBAT was used to perform these analyses.
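The screening-and-testing logic can be sketched in a few lines. This is a minimal illustration and not the PBAT implementation: the power estimates and the association test are supplied by the caller, the SNP names and numbers are hypothetical, and the Bonferroni threshold over the selected SNPs is an assumption about how significance is judged after screening.

```python
def screen_and_test(power_by_snp, p_value_for, n_selected=10, alpha=0.05):
    """Rank SNPs by estimated power and test only the top n_selected.

    power_by_snp: dict mapping SNP id to an estimated power (assumed input).
    p_value_for:  callable returning the association p value for a SNP id.
    Returns a dict mapping each tested SNP to (p_value, is_significant).
    """
    ranked = sorted(power_by_snp, key=power_by_snp.get, reverse=True)
    selected = ranked[:n_selected]
    if not selected:
        return {}
    # Assumption: correct only for the SNPs that are actually tested.
    threshold = alpha / len(selected)
    results = {}
    for snp in selected:
        p_value = p_value_for(snp)
        results[snp] = (p_value, p_value < threshold)
    return results


# Hypothetical power estimates and p values for three SNPs.
power = {"snp_a": 0.80, "snp_b": 0.10, "snp_c": 0.65}
p_values = {"snp_a": 0.0006, "snp_b": 0.0001, "snp_c": 0.20}
tested = screen_and_test(power, p_values.get, n_selected=2)
```

Note that `snp_b` is never tested despite its small p value, because it ranks last on power. Testing fewer SNPs is what reduces the multiple-comparisons burden.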

Replication Analyses

We performed two analyses for each replication cohort: (1) a “uniform-analysis” approach that is the same for all eight replication cohorts and tests for both a genetic main effect and a gene-age interaction and (2) a “design-specific analysis” of the genetic main effect alone. When we refer to the design-specific analysis, we mean the default analysis that would typically be performed for a given study design. For the uniform analysis, a standard linear regression adjusting for the covariates age and gender was run, with genotype and genotype-by-age interactions as the primary predictors of interest and BMI as the response variable. The linear model used for these analyses was BMI = β0 + β1·genotype + β2·age + β3·sex + β4·(age × genotype) + ε, where genotype = 1 if CC and genotype = 0 otherwise. Obesity affection status was used as an additional covariate for the case-control studies. In the CAMP study, a multivariate longitudinal regression analysis was performed in which the 14 BMI measures were the dependent variable. All of the independent variables listed above were included, with the addition of an indicator variable for asthma treatment arm, because CAMP was part of a clinical trial. In all models, the genotype was modeled in a recessive fashion to replicate the initial findings. F tests were used to evaluate the significance of both the main genetic effect and the age-by-genotype interaction term.
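To make the uniform-analysis model concrete, the sketch below fits the five coefficients by ordinary least squares through the normal equations, using only the Python standard library. It is an illustration under stated assumptions: the data are synthetic and noise-free, the function names are hypothetical, and the F tests, the affection-status covariate, and the longitudinal CAMP model are not implemented.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_uniform_model(bmi, genotype_cc, age, sex):
    """Least-squares fit of BMI on genotype, age, sex, and age x genotype.

    genotype_cc uses the recessive coding from the text: 1 if CC, else 0.
    """
    rows = [[1.0, g, a, s, a * g] for g, a, s in zip(genotype_cc, age, sex)]
    p = 5
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, bmi)) for i in range(p)]
    names = ["intercept", "genotype", "age", "sex", "age_x_genotype"]
    return dict(zip(names, solve_linear_system(xtx, xty)))


# Synthetic design: every combination of genotype, sex, and five ages.
design = [(g, float(a), s) for g in (0, 1) for s in (0, 1) for a in range(20, 45, 5)]
genotype = [g for g, _, _ in design]
age = [a for _, a, _ in design]
sex = [s for _, _, s in design]
# Noise-free BMI with a negative interaction: the CC effect shrinks with age.
bmi = [22.0 + 3.0 * g + 0.1 * a + 0.5 * s - 0.05 * a * g for g, a, s in design]
coef = fit_uniform_model(bmi, genotype, age, sex)
```

With these synthetic inputs the fit recovers the generating coefficients, including the negative `age_x_genotype` term that corresponds to a CC effect diminishing with age.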

Based on the results of this analysis, a study was considered either a replication or a nonreplication. If the parameter estimate for the gene-age interaction variable was significant (p < 0.05), associated with the recessive genotype (CC), and the parameter estimate was negative (indicating a decrease in BMI with older ages for the recessive genotype), the study was called a replication. Otherwise, it was referred to as a “nonreplication.” However, this created a dilemma for the pediatric cohorts (CAMP, Costa Rica, GENDAI) that were included in this replication attempt. The hypothesis of a gene-age interaction for subjects younger than 45 years was generated on the basis of the FHS, whose youngest subjects were 16–20 years old, whereas the pediatric cohorts covered an age range of 4–18 years. Because the age ranges of the pediatric cohorts and the FHS overlapped for only 2 years and a negative estimate for the gene-age interaction would have meant an increase in BMI with younger ages for the recessive genotype, a peak for the association with BMI must occur in the age range below 20 years. One could envision three hypotheses: (1) The peak age of association is at birth or a very early age, (2) the peak association age coincides with the ages of the pediatric cohorts, or (3) the peak association age is around 20, when the pediatric cohorts and the FHS overlap. Testing and distinguishing among all three hypotheses for the pediatric cohorts is technically challenging and requires larger sample sizes than those we have at hand. These three hypotheses have one thing in common that makes them consistent with the FHS findings: They predict higher BMI values for the recessive genotypes in the 15–20 year age range. We therefore considered a pediatric cohort to be a replication of the FHS finding if its interaction term was significant (p < 0.05) and the interaction had a positive parameter estimate associated with the recessive genotype (CC).

A Fisher's combined p value was calculated to determine the overall evidence for an age-by-genotype interaction for all models. Fisher developed a method for combining p values from k independent tests by summing over the log of the p value and multiplying the sum by −2. This statistic follows a chi-square distribution with 2k degrees of freedom. In this calculation, one-sided p values were used to account for scenarios in which the trend tended in the opposite direction.
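Fisher's method as described above can be written out directly. The sketch below uses only the standard library and relies on the closed form of the chi-square tail probability for an even number of degrees of freedom; the function name is hypothetical, and the example p values are illustrative rather than taken from the replication cohorts.

```python
import math


def fisher_combined_p(p_values):
    """Combine k independent p values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(log p)
    is referred to a chi-square distribution with 2k degrees of freedom.
    """
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # For 2k degrees of freedom the upper tail has the closed form
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


single = fisher_combined_p([0.05])[1]      # one test: the p value is unchanged
pair = fisher_combined_p([0.5, 0.5])[1]    # two unremarkable tests stay unremarkable
```

With one-sided p values, a cohort whose effect points in the opposite direction enters with a p value near 1 and therefore adds almost nothing to the statistic.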

Standard design-based analyses were performed for each of the replicate cohorts as follows. In the two family-based studies, Costa Rica and CAMP, we performed family-based association analysis of standardized BMI with age and sex as covariates, assuming a recessive genetic model, with Golden Helix PBAT. The standardization used in these analyses was recommended by the Center for Disease Control (CDC; see CDC-Defined BMI for Adolescents and Children in the Web Resources) and accounts for the effect of age and gender on BMI throughout childhood and adolescence. We restricted our analysis to the baseline measure of BMI in CAMP because one of the three treatment arms of the trial would have to be removed because of the known association of steroid use with BMI, and this would greatly reduce the sample size and therefore the power for this study. For the three case-control studies (European, US, and Merck), a logistic regression analysis of case-control status (obese and nonobese subjects) on rs1455832 was used. We performed the logistic regression analysis assuming a recessive model with a main genetic effect and adjusting for age and sex. Because the European and US samples were matched on age and gender, the addition of the variables as covariates contributed little to the analysis. In the population-based designs (KORA, Iceland, and GENDAI), we performed linear regression analysis of BMI on rs1455832 assuming a recessive genetic model and adjusting for age and sex. The Iceland analyses were performed with log-transformed BMI. No pooled analyses of the replication samples were performed because we did not have access to all of the individual genotype data.

Results

Preliminary Association of rs1455832 in the 100K FHS Scan

As previously reported,9 a total of 86,604 SNPs were screened for an association with BMI in the FHS. These SNPs were subsequently genotyped in an additional 399 individuals, totaling 1322 subjects. We then reanalyzed the data with the same methodology from the original paper; the results for the top ten powered SNPs are presented in Table 1. Of these, two SNPs reached genome-wide significance (unadjusted p value for rs7566605 = 0.002, unadjusted p value for rs1455832 = 0.0006). Details of the significant association between rs7566605 and obesity were described in the previous report.9 SNP rs1455832 is also common (minor allele frequency [MAF] = 0.271) and did not deviate from Hardy-Weinberg Equilibrium (HWE) (p = 0.27). SNP rs1455832 is located in intron 1 of the ROBO1 gene, which contains 30 exons. Of the additional 26 SNPs that were genotyped in the ROBO1 gene, 16 SNPs had allele frequencies that resulted in a large enough sample to evaluate genetic associations (the number of informative families greater than 20). The two SNPs adjacent to rs1455832 also had significant associations with BMI (p value for rs1455824 = 0.0092, p value for rs2311350 = 0.0181); however, this is not surprising because linkage disequilibrium exists between rs1455832 and both of these SNPs (r2 = 0.81). Details of the age-dependent relationship between BMI and the rs1455832 genotypes in the FHS cohort across a 50 year age range are presented in the first graph of Figure 1. In this analysis, we observed a strong age-varying genetic association in which the CC genotype was associated with increased BMI compared with the CT or TT genotypes until age 45 years, when this effect diminishes. Not surprisingly, the significance of the main genetic effect diminishes after an interaction term is added to the regression model, because this interaction term captures the age-dependent effect (Table 2).

Table 2. Findings in Replication Samples

Replication Studies in Eight Independent Samples

To validate this initial age-dependent association (i.e., age-by-genotype interaction with BMI) finding between rs1455832 and BMI, as well as assess its generalizability to other populations, we genotyped this variant and tested for evidence of association in eight independent populations. Descriptive summaries of the eight replication cohorts are provided in Table 3. Despite different ancestral histories, the allele frequency was similar across populations, ranging from 0.211 in Costa Rica to 0.289 in Iceland. Genotype distributions were in Hardy-Weinberg equilibrium in all eight populations.

Table 3. Descriptive Statistics on Replication Cohorts

For the uniform analysis across all replication studies, we performed a population-based linear regression analysis with an age-by-genotype interaction term, with a recessive model. The analysis was restricted to individuals younger than 45.

A summary of the main genetic effect and gene-age interaction for the age-varying effect on BMI is provided in Table 2. In all replication studies, with the exception of CAMP, the main genetic effect was not significant when an age-by-genotype interaction term was included in the model. In CAMP, the CC genotype was associated with an increased BMI compared with the CT or TT genotype. As illustrated in the second graph in Figure 1, this effect is not evident in the early age ranges but is apparent after age 10. This graph illustrates the relationship between BMI and the rs1455832 genotypes in the CAMP cohort across a 10 year age range, where 14 repeated measures of BMI were used in a multivariate regression model for each individual.

Replications of the FHS finding must meet three criteria: (1) The interaction term must be significant (α = 0.05), (2) the overall associated genotype must be consistent with the FHS (i.e., during the time frame in which we hypothesize an effect, the CC genotype is associated with an increased BMI), and (3) the interaction parameter estimate must be positive for pediatric cohorts or negative for adult cohorts. Five of the eight cohorts met these criteria: US, European, Merck, CAMP, and Costa Rica. Significant age-by-genotype interactions were observed in three adult cohorts whose age ranges match the FHS: US, European, and Merck. In these studies, the parameter estimate for the interaction variable is slightly negative, which is in the same direction and approximately the same magnitude of the effect observed in the FHS. This is to be expected because their age ranges are similar to those of the initial study. Of the pediatric cohorts, CAMP and Costa Rica replicated the FHS findings. As expected for very young ages, under the alternative hypothesis of an age-dependent genetic effect, the interaction parameter estimate for both of these studies was positive, which is in the opposite direction from the adult cohorts. For all five replications, the CC genotype was associated with an increased BMI (p < 0.05) during the time frame of interest. In sum, these studies reflect a stronger increase in BMI for the recessive genotype at younger ages for the pediatric cohorts and the diminishing effect of the recessive genotype in the adult cohorts. As discussed above, these five studies are considered valid replications of the initial finding under the alternative hypothesis of an age-dependent genetic effect.
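The three replication criteria amount to a simple decision rule, sketched below. The function and argument names are hypothetical, and the inputs in the example calls are illustrative values, not estimates from Table 2.

```python
def is_replication(interaction_p, interaction_estimate, cc_increases_bmi,
                   pediatric, alpha=0.05):
    """Apply the three replication criteria to one cohort.

    interaction_p:        p value of the age-by-genotype interaction term.
    interaction_estimate: parameter estimate of that interaction term.
    cc_increases_bmi:     True if CC is associated with increased BMI
                          in the hypothesized time frame.
    pediatric:            True for pediatric cohorts, False for adult cohorts.
    """
    # (1) The interaction term must be significant.
    if interaction_p >= alpha:
        return False
    # (2) The associated genotype must be consistent with the FHS.
    if not cc_increases_bmi:
        return False
    # (3) Positive estimate in pediatric cohorts, negative in adult cohorts.
    if pediatric:
        return interaction_estimate > 0
    return interaction_estimate < 0


adult_ok = is_replication(0.01, -0.04, True, pediatric=False)
pediatric_ok = is_replication(0.001, 0.30, True, pediatric=True)
wrong_sign = is_replication(0.01, 0.04, True, pediatric=False)
not_significant = is_replication(0.91, -0.04, True, pediatric=False)
```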

Despite the lack of evidence for interaction in the Iceland, KORA_S4, and GENDAI cohorts (Iceland p = 0.910, KORA_S4 p = 0.968, GENDAI p = 0.851), the one-sided Fisher's combined p value for the age-dependent relationship in all replicate samples was highly significant (p = 3.92 × 10−9). For the pediatric and adult cohorts separately, the combined p values were 2.21 × 10−8 and 0.00422, respectively. The GENDAI, KORA_S4, and Iceland cohorts do not contribute positively to these calculations because the direction of the association was opposite. For the computation of the combined p value, we accounted for the effect in the opposite direction by using one-sided p values. Because Fisher's method of combining p values is sensitive to highly significant results (which we have in the CAMP sample), we recalculated the combined p value eight times, each time removing one of the replication studies. The overall significance of the result did not depend on the findings from any individual replication sample because the combined p value remained significant at an α level of 0.01 when any single replication sample was removed from the analysis. It is apparent that the results from the CAMP study are highly significant and thus highly influential on the overall findings. However, if the CAMP study were completely removed from the analysis, all results and conclusions would remain the same, with the only difference being the magnitude of the combined p value (p = 0.0037). To ensure that the data from the CAMP study are not spurious, we thoroughly checked the genotyping and analysis methods used. We found the quality of these data to be excellent and therefore have kept this sample in the paper.
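The leave-one-out sensitivity check described above can be sketched as follows: the combined p value is recomputed once per study, each time with that study removed. The study names and p values below are hypothetical placeholders, and the helper is a compact restatement of Fisher's method rather than the software used in the study.

```python
import math


def fisher_p(p_values):
    """Fisher's combined p value for k independent p values (2k df)."""
    half = -sum(math.log(p) for p in p_values)  # half of the chi-square statistic
    term, total = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half / i
        total += term
    return math.exp(-half) * total


def leave_one_out(p_by_study):
    """Map each study to the combined p value obtained without it."""
    results = {}
    for name in p_by_study:
        remaining = [p for other, p in p_by_study.items() if other != name]
        results[name] = fisher_p(remaining)
    return results


# Hypothetical one-sided p values; one study dominates the combined evidence.
p_by_study = {"study_a": 0.01, "study_b": 0.5, "study_c": 0.5}
sensitivity = leave_one_out(p_by_study)
```

In this toy example, removing `study_a` leaves a combined p value of about 0.60, which shows how a single highly significant study can drive the result. This is the scenario the eight recalculations in the text were designed to rule out.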

In addition to the uniform analysis approach we used for all studies, standard study-design-specific analyses were also conducted. These analyses were focused only on genetic main effects. The analyses were performed for the family-based, case-control, and population-based cohorts separately. These findings are summarized in Table 4. With Golden Helix PBAT for the family-based designs, the association between standardized BMI and rs1455832 was of borderline statistical significance (p = 0.06) in 63 informative families in Costa Rica and tended in the risk direction. This association was found to be significant in the CAMP cohort with 96 informative families (p = 0.017). Therefore, the main genetic effect was detected in both of these child and/or adolescent samples. Logistic regression analyses did not result in a significant main genetic effect for any of the case-control studies. In the KORA_S4 cohort, there was no association between the genotype and BMI among all subjects (p = 0.848). In the Iceland cohort, the genetic effect was also not significant (p = 0.596). No significant association was observed between genotype and BMI in the GENDAI cohort (p = 0.679).

Table 4
Findings in Replication Samples for the Main Genetic Effect: Analysis by Study Design

Discussion

We initially observed an age-varying (i.e., age-by-genotype interaction) association between BMI and rs1455832 in the FHS cohort, where the CC genotype was associated with increased BMI early in life and the association diminished over time. The initial finding was arrived at with the FBAT-PC approach, which is particularly suited to detect both genetic main effects and time-dependent effects. We sought to replicate the age-varying association in eight independent replication samples. The age-by-rs1455832 interaction with the same directional effect was observed in five of the eight replicate samples with statistically significant results. These replication samples each had varying demographic properties, ascertainment conditions, and study designs. Notably, the genetic association was not detectable in all populations (three samples had no significant age-by-rs1455832 interaction); however, the combined p value from all of the replication samples indicates a significant overall association. It is also important to note that, in all of the replication samples except CAMP, the genetic association effect would not have been detected had we only been testing for a main genetic effect and not an age-by-rs1455832 interaction.

One important aspect in identifying associations of interest is selecting replication samples in the age range of the effect that was initially observed. Therefore, if we are simply interested in evaluating the observed main genetic effect, samples in which all individuals are less than 30 years of age are ideal. If the cohort includes individuals younger and older than 30 years of age and an interaction term is not incorporated into the analysis, the genetic effect may be missed. In cohorts whose members are all younger than 30 years, we would expect to observe the main genetic effect. The two family-based replication samples, CAMP and Costa Rica, identified the genetic association with at least marginal significance (α < 0.10), whether or not the time-dependent effect was considered. This is probably because both of these studies recruited children and adolescents and are therefore well-powered to detect the association. The profile plot for the CAMP study suggests that the genetic effect is less clear at very young ages (less than 10 years). This might point toward an effect starting at or after puberty, and it might be speculated that hormone status plays a role. In contrast to the two family-based studies, the GENDAI cohort does not show the main genetic effect or the interaction. This could be attributable to the small age range in GENDAI; most subjects were either age 11 or 12, making it very difficult to detect a gene-age interaction.
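Why a mixed-age cohort can hide the effect is easy to see under a simple linear model with an age-by-genotype term, BMI = b0 + bG·G + bA·age + bGxA·G·age, in which the genotype effect at a given age is bG + bGxA·age. The Python sketch below is an assumption-laden illustration: the linear form, the function names, the coefficient values, and the age lists are all hypothetical and are not estimates from any of the cohorts. The coefficients were chosen only so that the effect reaches zero at age 30, echoing the threshold mentioned in the text.

```python
from statistics import mean


def genotype_effect_at_age(beta_g: float, beta_gxa: float, age: float) -> float:
    """Genotype effect on BMI at a given age under the assumed linear interaction model."""
    return beta_g + beta_gxa * age


def pooled_main_effect(beta_g: float, beta_gxa: float, ages: list) -> float:
    """Approximate main effect seen when the interaction term is omitted.

    If genotype is independent of age, a main-effects-only fit recovers
    roughly the genotype effect evaluated at the cohort's mean age.
    """
    return genotype_effect_at_age(beta_g, beta_gxa, mean(ages))


# Hypothetical coefficients: the effect shrinks with age and is zero at age 30.
BETA_G = 1.5
BETA_GXA = -0.05

young_cohort = [12, 15, 18, 22]   # hypothetical ages, all below 30
mixed_cohort = [15, 25, 35, 45]   # hypothetical ages spanning 30

print(pooled_main_effect(BETA_G, BETA_GXA, young_cohort))
print(pooled_main_effect(BETA_G, BETA_GXA, mixed_cohort))
```

In this toy setting the young cohort shows a clearly positive pooled effect, whereas in the mixed cohort the positive effect below age 30 and the negative effect above it cancel, so a main-effects-only analysis would see essentially nothing.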

We also observed that the findings are sensitive to how strongly the sample was ascertained for obesity. A big advantage of population-based studies (FHS, KORA_S4, Iceland, and GENDAI) is that their phenotypic range is not restricted through the ascertainment condition, making them ideal samples in which to test for gene-age effects. Although the CAMP and Costa Rica samples were ascertained for asthma, which has a weak association with obesity, the asthma cases are mild enough and the asthma-obesity relationship tenuous enough to conclude that neither of these samples was ascertained for obesity. Therefore FHS, KORA_S4, Iceland, GENDAI, CAMP, and Costa Rica can essentially be thought of as unbiased cohorts for BMI. When looking for an age-dependent relationship between a disease and SNP, the analysis of these cohorts is straightforward. In contrast, the analytic strategy to detect interaction with age is less clear for the Merck, European, and US samples, all of which were ascertained for obesity. When the data are ascertained by the case-control status, the phenotypic variation is reduced so substantially that the interaction effect is obscured. Therefore, case-control studies are better suited to detect main effects, but their design is limited to detect interactions.

These analyses also illustrate that longitudinal phenotypic measures provide more power to detect modest age-related effects in genetic studies. Age-dependent effects can easily be missed if only one time point is collected, because the data collection may not occur when the genetic effect is strongest. Longitudinal measures are also a powerful way to increase the power of genetic studies, which is important when the genetic effects are modest. In our analyses, the two strongest effects were observed with the FHS and CAMP cohorts, both of which had many repeated-measures of BMI at different ages. Population-based cohorts like the FHS, with longitudinal data during the proper time period, seem to be optimal for identifying age-varying genetic effects. Although it is true that such a model can be estimated with cohort or case-control data at a single time point, the statistical power of such a model to detect significant genetic associations, particularly gene-time interactions, is much higher when longitudinal data are available.

A potential cause for the observed age-related genetic effects could be period effects; however, given the relatively consistent findings across several samples collected at different time points (some of them more than 40 years apart) and in different geographical locations (US, Europe, Iceland), it is unlikely that the age-by-genotype interaction is an artifact of a cohort effect.

ROBO1 encodes a receptor that is a member of the neural cell adhesion molecule family and is highly conserved. High tissue expression is found in brain, cochlea, mouth, adipose, and embryonic tissue (UniGene Hs.13640). In humans, the gene maps to chromosome 3p12, and inactivating mutations were found in lung, breast, and kidney cancer. Genetic variations within the ROBO1 gene were reported to be associated with dyslexia.33 We can only speculate whether ROBO1 plays a role in obesity. ROBO1 is expressed in adipose tissue, though its function in this tissue is completely unknown.

As important findings arise from the increasing number of genetic-association studies and subsequent replication attempts become more common, it is vital that replication attempts carefully examine specific attributes of the initial study. Among the necessary considerations are possible age-dependent effects in the initial association findings. In a common disease such as obesity, population-based samples are better suited than highly ascertained studies to replicate the initial findings. In conclusion, an age-genotype interaction has caused apparent inconsistencies in the attempts to replicate the finding of an association between rs1455832 and obesity in the FHS. This highlights the pitfalls of replication in cohorts with ascertainment conditions and age ranges different from those of the original study. Population-based samples with longitudinal data have greater statistical power to detect genetic effects that vary in strength by age. Given that current genotyping technology has made genotyping affordable in large-scale population-based studies, such cohorts seem to be ideal for both the discovery of new genetic effects and their replication.

Web Resources

The URLs for data presented herein are as follows:

Acknowledgments

Each of the replication cohorts would like to acknowledge the specific funding and individual efforts that made these studies possible. Therefore, the acknowledgements are separated by replication cohort.

FHS: These data were made available by the Framingham Heart Study of the National Heart Lung and Blood Institute (NHLBI) of the National Institutes of Health (NIH) and Boston University School of Medicine. This work was supported by the National Heart, Lung, and Blood Institute's Framingham Heart Study (Contract Number N01-HC-25195). The manuscript was not prepared in collaboration with investigators of the FHS and does not necessarily reflect the opinions or views of the FHS, Boston University, or NHLBI.

KORA_S4: This study was supported within the German National Genomic Research Network (NGFN) by the Federal Ministry of Education and Research (BMBF) and the Munich Center of Health Sciences of the Ludwig-Maximilians-University, Germany. The KORA research platform (KORA, Cooperative Research in the Region of Augsburg) was initiated and financed by the GSF-National Research Centre for Environment and Health, which is funded by the BMBF and by the State of Bavaria.

Iceland: Ethical approval for the present study was granted by the National Bioethics Committee (NBC, reference number 01-033) and the Icelandic Data Protection Authority (DPA).

Costa Rica: The Costa Rican study was approved by the Institutional Review Boards of Brigham and Women's Hospital and the Hospital Nacional de Niños in San José (Costa Rica). The Costa Rican study was funded by grants HL066289 and HL04370 from the NIH.

CAMP: The Institutional Review Board of Brigham and Women's Hospital, as well as those of the other CAMP study centers, approved this study. Informed assent and consent were obtained from study participants and their parents to collect DNA for genetic studies. We thank all the CAMP families for their enthusiastic participation in this study. We acknowledge the CAMP investigators and research team for collection of CAMP Genetic Ancillary Study data. CAMP was supported by contracts N01-HR-16044, 16045, 16046, 16047, 16048, 16049, 16050, 16051, and 16052 from the National Heart, Lung and Blood Institute. The CAMP Genetics Ancillary Study is supported by U01 HL65899 (STW PI) and PO1HL083069 (STW PI) from the National Heart Lung and Blood Institute. All work on data from the CAMP Genetics Ancillary Study was conducted at the Channing Laboratory, Brigham and Women's Hospital under appropriate CAMP policies and human subjects protections.

GENDAI: Funding for the implementation of GENDAI was received from Coca-Cola Hellas. We would like to thank Mary Yannakoulia for her contribution to this paper.

References

1. Ioannidis J.P., Ntzani E.E., Trikalinos T.A., Contopoulos-Ioannidis D.G. Replication validity of genetic association studies. Nat. Genet. 2001;29:306–309. [PubMed]
2. Lyon H.N., Florez J.C., Bersaglieri T., Saxena R., Winckler W., Almgren P., Lindblad U., Tuomi T., Gaudet D., Zhu X. Common variants in the ENPP1 gene are not reproducibly associated with diabetes or obesity. Diabetes. 2006;55:3180–3184. [PubMed]
3. Heo M., Leibel R.L., Boyer B.B., Chung W.K., Koulu M., Karvonen M.K., Pesonen U., Rissanen A., Laakso M., Uusitupa M.I. Pooling analysis of genetic data: The association of leptin receptor (LEPR) polymorphisms with variables related to human adiposity. Genetics. 2001;159:1163–1178. [PMC free article] [PubMed]
4. Heo M., Leibel R.L., Fontaine K.R., Boyer B.B., Chung W.K., Koulu M., Karvonen M.K., Pesonen U., Rissanen A., Laakso M. A meta-analytic investigation of linkage and association of common leptin receptor (LEPR) polymorphisms with body mass index and waist circumference. Int. J. Obes. Relat. Metab. Disord. 2002;26:640–646. [PubMed]
5. Jellema A., Zeegers M.P., Feskens E.J., Dagnelie P.C., Mensink R.P. Gly972Arg variant in the insulin receptor substrate-1 gene and association with Type 2 diabetes: A meta-analysis of 27 studies. Diabetologia. 2003;46:990–995. [PubMed]
6. Geller F., Reichwald K., Dempfle A., Illig T., Vollmert C., Herpertz S., Siffert W., Platzer M., Hess C., Gudermann T. Melanocortin-4 receptor gene variant I103 is negatively associated with obesity. Am. J. Hum. Genet. 2004;74:572–581. [PMC free article] [PubMed]
7. Swarbrick M.M., Waldenmaier B., Pennacchio L.A., Lind D.L., Cavazos M.M., Geller F., Merriman R., Ustaszewska A., Malloy M., Scherag A. Lack of support for the association between GAD2 polymorphisms and severe human obesity. PLoS Biol. 2005;3:e315. [PMC free article] [PubMed]
8. Duerr R.H., Taylor K.D., Brant S.R., Rioux J.D., Silverberg M.S., Daly M.J., Steinhart A.H., Abraham C., Regueiro M., Griffiths A. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science. 2006;314:1461–1463. [PubMed]
9. Herbert A., Gerry N.P., McQueen M.B., Heid I.M., Pfeufer A., Illig T., Wichmann H.E., Meitinger T., Hunter D., Hu F.B. A common genetic variant is associated with adult and childhood obesity. Science. 2006;312:279–283. [PubMed]
10. Lyon H.N., Emilsson V., Hinney A., Heid I.M., Lasky-Su J., Zhu X., Thorleifsson G., Gunnarsdottir S., Walters G.B., Thorsteinsdottir U. The association of a SNP upstream of INSIG2 with body mass index is reproduced in several but not all cohorts. PLoS Genet. 2007;3:e61. [PMC free article] [PubMed]
11. Rosskopf D., Bornhorst A., Rimmbach C., Schwahn C., Kayser A., Kruger A., Tessmann G., Geissler I., Kroemer H.K., Volzke H. Comment on “A common genetic variant is associated with adult and childhood obesity” Science. 2007;315:187. author reply 187. [PubMed]
12. Loos R.J., Barroso I., O'Rahilly S., Wareham N.J. Comment on “A common genetic variant is associated with adult and childhood obesity” Science. 2007;315:187. author reply 187. [PMC free article] [PubMed]
13. Dina C., Meyre D., Samson C., Tichet J., Marre M., Jouret B., Charles M.A., Balkau B., Froguel P. Comment on “A common genetic variant is associated with adult and childhood obesity” Science. 2007;315:187. author reply 187. [PubMed]
14. Hall D.H., Rahman T., Avery P.J., Keavney B. INSIG-2 promoter polymorphism and obesity related phenotypes: Association study in 1428 members of 248 families. BMC Med. Genet. 2006;7:83. [PMC free article] [PubMed]
15. Gorlova O.Y., Amos C.I., Wang N.W., Shete S., Turner S.T., Boerwinkle E. Genetic linkage and imprinting effects on body mass index in children and young adults. Eur. J. Hum. Genet. 2003;11:425–432. [PubMed]
16. Van Steen K., McQueen M.B., Herbert A., Raby B., Lyon H., Demeo D.L., Murphy A., Su J., Datta S., Rosenow C. Genomic screening and replication using the same data set in family-based association testing. Nat. Genet. 2005;37:683–691. [PubMed]
17. Lange C., Van Steen K., Andrew T., Lyon H., DeMeo D.L., Raby B., Murphy A., Silverman E.K., MacGregor A., Weiss S.T., Laird N.M. A family-based association test for repeatedly measured quantitative traits adjusting for unknown environmental and/or polygenic effects. Stat. Appl. Genet. Mol. Biol. 2004;3:Article17. [PubMed]
18. Kannel W.B. Review of recent Framingham study hypertension research. Curr. Hypertens. Rep. 2000;2:239–240. [PubMed]
19. Feinleib M., Kannel W.B., Garrison R.J., McNamara P.M., Castelli W.P. The Framingham Offspring Study. Design and preliminary data. Prev. Med. 1975;4:518–525. [PubMed]
20. Kannel W.B., Feinleib M., McNamara P.M., Garrison R.J., Castelli W.P. An investigation of coronary heart disease in families. The Framingham offspring study. Am. J. Epidemiol. 1979;110:281–290. [PubMed]
21. Escamilla M.A., Spesny M., Reus V.I., Gallegos A., Meza L., Molina J., Sandkuijl L.A., Fournier E., Leon P.E., Smith L.B., Freimer N.B. Use of linkage disequilibrium approaches to map genes for bipolar disorder in the Costa Rican population. Am. J. Med. Genet. 1996;67:244–253. [PubMed]
22. Oliphant A., Barker D.L., Stuelpnagel J.R., Chee M.S. BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques. 2002;56–8(Suppl):60–61. [PubMed]
23. Lyon H., Lange C., Lake S., Silverman E.K., Randolph A.G., Kwiatkowski D., Raby B.A., Lazarus R., Weiland K.M., Laird N., Weiss S.T. IL10 gene polymorphisms are associated with asthma phenotypes in children. Genet. Epidemiol. 2004;26:155–165. [PMC free article] [PubMed]
24. The Childhood Asthma Management Program Research Group The Childhood Asthma Management Program (CAMP): Design, rationale, and methods. Control. Clin. Trials. 1999;20:91–120. [PubMed]
25. The Childhood Asthma Management Program Research Group Long-term effects of budesonide or nedocromil in children with asthma. N. Engl. J. Med. 2000;343:1054–1063. [PubMed]
26. Tang K., Fu D., Kotter S., Cotter R.J., Cantor C.R., Koster H. Matrix-assisted laser desorption/ionization mass spectrometry of immobilized duplex DNA probes. Nucleic Acids Res. 1995;23:3126–3131. [PMC free article] [PubMed]
27. Papoutsakis C., Vidra N.V., Hatzopoulou I., Tzirkalli M., Farmaki A.E., Evagelidaki E., Kapravelou G., Kontele I.G., Skenderi K.P., Yannakoulia M., Dedoussis G.V. The Gene-Diet Attica investigation on childhood obesity (GENDAI): Overview of the study design. Clin. Chem. Lab. Med. 2007;45:309–315. [PubMed]
28. Heid I.M., Vollmert C., Hinney A., Doring A., Geller F., Lowel H., Wichmann H.E., Illig T., Hebebrand J., Kronenberg F. Association of the 103I MC4R allele with decreased body mass in 7937 participants of two population based surveys. J. Med. Genet. 2005;42:e21. [PMC free article] [PubMed]
29. Buetow K.H., Edmonson M., MacDonald R., Clifford R., Yip P., Kelley J., Little D.P., Strasberg R., Koester H., Cantor C.R., Braun A. High-throughput development and characterization of a genomewide collection of gene-based single nucleotide polymorphism markers by chip-based matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Proc. Natl. Acad. Sci. USA. 2001;98:581–584. [PMC free article] [PubMed]
30. Gulcher J.R., Stefansson K. The Icelandic Healthcare Database and informed consent. N. Engl. J. Med. 2000;342:1827–1830. [PubMed]
31. Kutyavin I.V., Milesi D., Belousov Y., Podyminogin M., Vorobiev A., Gorn V., Lukhtanov E.A., Vermeulen N.M., Mahoney W. A novel endonuclease IV post-PCR genotyping system. Nucleic Acids Res. 2006;34:e128. [PMC free article] [PubMed]
32. Stefansson H., Helgason A., Thorleifsson G., Steinthorsdottir V., Masson G., Barnard J., Baker A., Jonasdottir A., Ingason A., Gudnadottir V.G. A common inversion under selection in Europeans. Nat. Genet. 2005;37:129–137. [PubMed]
33. Hannula-Jouppi K., Kaminen-Ahola N., Taipale M., Eklund R., Nopola-Hemmi J., Kaariainen H., Kere J. The axon guidance receptor gene ROBO1 is a candidate gene for developmental dyslexia. PLoS Genet. 2005;1:e50. [PMC free article] [PubMed]

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics
