
Study Description

Note: This substudy phs000667 CHARGE-S Cardiovascular Health Study contains harmonized phenotype data, whole genome, exome, targeted sequencing and exome chip data of a subset of the CHS cohort selected for NHLBI's CHARGE-S CONSORTIUM Project. Summary level phenotypes of the Cardiovascular Health Cohort study participants may be viewed at the top-level study page, phs000287 Cardiovascular Health Study (CHS) Cohort. Individual level phenotype data and molecular data for the top-level Cardiovascular Health Study and its substudies are available by requesting Authorized Access to the Cardiovascular Health Study (CHS) Cohort study phs000287.

Genome-wide association studies (GWAS) have successfully localized multiple loci containing common variations influencing coronary heart disease and its risk factors, but in most cases neither the gene underlying disease susceptibility nor the spectrum of candidate functional variants has been identified. Building on GWAS for NHLBI-diseases: the U.S. CHARGE consortium (the CHARGE sequencing (CHARGE-S) consortium) is a collaborative effort to leverage existing population, laboratory and computational resources to identify susceptibility genes underlying genome-wide significant and well-replicated GWAS findings for heart, lung and blood diseases and their risk factors. The sequencing approach was funded by NHLBI with funds provided by the American Recovery and Reinvestment Act of 2009 (ARRA). The U.S. CHARGE consortium consists of multiple large population-based longitudinal cohort studies, including the Atherosclerosis Risk in Communities (ARIC) Study (N=15,792), the Cardiovascular Health Study (CHS) (N=5,888), and the Framingham Heart Study (FHS) (N=14,428).

The study has taken a two-pronged approach to following up GWAS. First, regional capture targeted sequencing was performed in genomic regions influencing 15 phenotypes to localize causal variants that are responsible for the GWAS signal. The phenotypes examined were atrial fibrillation, blood pressure, body mass index (BMI), bone mineral density, C-reactive protein (CRP), carotid intima-media thickness (IMT), echocardiography, electrocardiogram PR and QRS interval, fasting insulin, hematocrit, pleiotropy, pulmonary function, retinal venule diameter, and stroke. A case-cohort study design was used in which a common reference sample was selected from all three cohorts at baseline. The cohort random sample included 2,000 individuals composed of 1,000 participants from the ARIC study, 500 participants from FHS, and 500 participants from CHS in a 1:1 gender ratio. The comparison groups were either selected cases for discrete phenotypes, or participants drawn from the top and/or bottom tail of the distribution for quantitative phenotypes. The size of each comparison group was 200 individuals. Approximately 2 Mb of the genome was sequenced for the targeted loci.
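The composition of the cohort random sample can be illustrated with a short sketch. This is only an illustration of a stratified draw with the stated cohort quotas and 1:1 gender ratio; the function name, the record keys (`id`, `cohort`, `sex`), and the synthetic roster are assumptions and do not come from the study.

```python
import random

# Cohort quotas for the common reference sample, as stated in the text.
QUOTAS = {"ARIC": 1000, "FHS": 500, "CHS": 500}

def draw_reference_sample(participants, quotas=QUOTAS, seed=0):
    """Draw a random sample per cohort, half women and half men.

    `participants` is a list of dicts with hypothetical keys
    'id', 'cohort', and 'sex' ('F' or 'M').
    """
    rng = random.Random(seed)
    sample = []
    for cohort, quota in quotas.items():
        for sex in ("F", "M"):
            pool = [p for p in participants
                    if p["cohort"] == cohort and p["sex"] == sex]
            # Half of each cohort's quota comes from each sex.
            sample.extend(rng.sample(pool, quota // 2))
    return sample

# Synthetic, made-up roster purely for illustration.
roster = [
    {"id": f"{cohort}-{i}", "cohort": cohort, "sex": "F" if i % 2 else "M"}
    for cohort, size in {"ARIC": 3000, "FHS": 2000, "CHS": 1500}.items()
    for i in range(size)
]
reference = draw_reference_sample(roster)
```

Fixing the seed makes the draw reproducible; the actual sampling procedure used by the cohorts is not described in this document.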

Second, whole exome capture sequencing and low-pass whole genome sequencing were completed for the cohort random sample and 7 phenotypes for which there were more than 3 GWAS signals in coding regions to detect novel rare and common variants. The phenotypes investigated by whole exome sequencing were age at menopause, electrocardiogram QT interval, fasting blood glucose, fibrinogen level, renal function, Stamler-Kannel-like extremes of risk factors, and waist-to-hip ratio.

Follow-up genotyping using the Illumina HumanExome BeadChip has also been completed. Additional information including variant annotation is available at: http://www.chargeconsortium.com/main/exomechip.

All sequencing was performed at the Human Genome Sequencing Center at the Baylor College of Medicine, and all genotyping with the HumanExome BeadChip was performed at Cedars-Sinai Medical Center. This study contains the Cardiovascular Health Study (CHS) study subset of CHARGE-S. Additional data from CHARGE-S is also available via dbGaP.

  • Study Design:
    • Prospective Longitudinal Cohort
  • Study Type:
    • Case-Cohort
    • Cohort
Study Inclusion/Exclusion Criteria

The inclusion criteria for selection of participants for the CHARGE-S study were full informed consent, sufficient DNA for sequencing, self-reported ethnicity as non-Hispanic white, no first degree relatives within a subgroup of individuals selected for an extreme trait, and availability of genotyping results from the appropriate phenotype-specific GWAS. Participants from at least one of the three cohorts represented in the CHARGE-S study were included in each phenotype group. Inclusion and exclusion criteria for the individual phenotypes are described below:

A. Targeted sequencing

  1. Atrial fibrillation: Two hundred (200) subjects from the Massachusetts General Hospital Atrial Fibrillation study with early-onset atrial fibrillation occurring before 66 years of age were selected for sequencing. Participants with evidence of structural heart disease as assessed by echocardiography were excluded.
  2. Blood pressure: One hundred (100) individuals were selected from the ARIC study, 50 from FHS, and 50 from CHS from both extremes of the standardized residuals of systolic blood pressure and diastolic blood pressure after adjustment for age, age², BMI, and study site if applicable. The regression was stratified by sex, and an equal number of individuals from both sexes were chosen for sequencing. Where data from multiple examinations were available, data from the first clinical visit available for an individual were used. For participants taking antihypertensive medication, systolic and diastolic blood pressure were adjusted by adding 10 mm Hg and 5 mm Hg, respectively. The following individuals were excluded: those taking antihypertensive medication, when selecting subjects for the lower tail of the trait; those with a history of heart failure prior to measurement of blood pressure; those whose systolic blood pressure was < 60 mm Hg or diastolic blood pressure < 20 mm Hg; and those whose BMI was more than 4 standard deviations from the mean.
  3. Body mass index (BMI): Two hundred (200) unrelated individuals including 100 participants from the ARIC study, 50 from CHS, and 50 from FHS were sequenced from the high tail of the distribution for BMI based on age- and sex-adjusted residuals. In FHS, subjects were greater than 25 years of age. In the ARIC study and CHS, there were no age restrictions. In all studies, individuals were excluded if BMI < 18.5 kg/m².
  4. Bone mineral density: One hundred (100) individuals were selected for targeted sequencing from CHS and 100 were chosen from FHS who had extremely low femoral neck bone mineral density (FN BMD) with approximately twice as many women as men. The selection of participants was based on using FN BMD T-score (number of standard deviations below young normal values) < -2 and Z-score < -1.5.

    After the original CHARGE-S sequencing, the musculoskeletal working group received funding to perform targeted sequencing of all CHARGE-S working group loci in an additional sample of Framingham participants. The 325 samples were selected to have low FN BMD using the following sequential criteria until all 325 samples were selected:

    1. T-score < -2 (for both FN BMD and lumbar spine BMD (LS BMD)) and Z-score < -2 (for both FN BMD and LS BMD)
    2. FN BMD T-score < -2 and FN BMD Z-score < -2
    3. FN BMD T-score < -2 and FN BMD Z-score < -1.5
    4. FN BMD T-score < -1.5 and FN BMD Z-score < -1.5
    5. FN BMD T-score < -1.0 and FN BMD Z-score < -1.5
    6. FN BMD T-score < -1.0 and FN BMD Z-score < -0.5

    For criteria #2 to #6, individuals were excluded if they had LS BMD Z-score > 0.

    Of the 325 samples, 300 were sequenced, and the ratio of women to men was approximately 2.5:1, since the GWAS findings that generated these candidate genes came from a sample with approximately the same ratio of women to men. Thus there were 81 men and 219 women included in the sample.

  5. C-reactive protein (CRP) level: One hundred (100) individuals from the ARIC study, 50 from CHS, and 50 from FHS with the highest CRP residuals were chosen from a sex-stratified sample after adjustment for age, hormone therapy, study site, BMI, and lipid therapy. Participants with residuals greater than four times the standard deviation were excluded.
  6. Carotid intima-media thickness (IMT): Participants were selected for sequencing from the high tail of the common carotid IMT distribution. The study sample included 100 subjects from the ARIC study, 50 subjects from CHS, and 50 subjects from FHS, with an equal number of men and women.
  7. Echocardiography (left ventricular diastolic dimension): Fifty (50) unrelated males and 50 unrelated females (n=100) from the highest end of the trait distribution in CHS and FHS were sampled for sequencing after adjustment for age, height, weight, and study site if applicable.
  8. ECG (electrocardiogram) PR interval: Two hundred (200) subjects from the upper tail of the PR trait distribution based on residuals of a model with PR interval as the dependent variable and age, sex, study center, BMI, and height as the independent variables were selected for sequencing, including 50 men with the highest residuals and 50 women with the highest residuals in the ARIC study, 50 participants from CHS, and 50 participants from FHS. Individuals with a history of atrial fibrillation at baseline; extreme PR interval (< 80 or > 320); pacemaker or defibrillator; Wolff-Parkinson-White (WPW) syndrome; third degree AV block; history of heart failure or myocardial infarction; use of digoxin or class I or class III antiarrhythmic blocking medication; or who were missing covariates used for adjustment were excluded.
  9. ECG (electrocardiogram) QRS interval: Two hundred (200) subjects were sequenced from the upper tail of the QRS trait distribution including 50 men and 50 women in the ARIC study, 50 individuals from CHS and 50 participants in FHS after applying exclusions. Individuals with atrial fibrillation; history of myocardial infarction or congestive heart failure; a QRS interval > 120; Wolff-Parkinson-White (WPW) syndrome; implantation of a pacemaker, and use of class I and class III antiarrhythmic blocking medication were excluded.
  10. Fasting insulin: Two hundred (200) subjects were sampled from the high tail of the distribution including 100 individuals in the ARIC study, 50 participants in CHS and 50 participants in FHS. Individuals with known diabetes; who were treated for diabetes; or those with a fasting glucose > 7 mmol/L, were excluded. The ARIC study and FHS applied a further exclusion of non-fasting individuals. Participants who were missing hemoglobin A1c values were also excluded in the ARIC study, and subjects with type 1 diabetes were excluded in FHS. Selection was gender-specific.
  11. Hematocrit: Two hundred (200) individuals were selected from the lower tail of the hematocrit distribution including 100 ARIC study participants, 50 CHS participants, and 50 FHS participants. A 50:50 gender ratio was maintained. The residuals from linear regression of hematocrit as a continuous trait with adjustment for age, sex, and study site for multicenter cohorts were calculated for each of the three cohorts. Individuals with hematocrits within 3 standard deviations of the sample mean for each cohort were included in the analysis. Individuals with known malignancies; who smoked; or who had renal failure were excluded.
  12. Pleiotropy: Pleiotropy, or the influence of a single gene on multiple traits, can be defined operationally as evidence that a region or locus containing one to many genes displays strong associations (p < 5 × 10⁻⁸) with 2 or more traits in multiple genome-wide association studies. Using novel bioinformatics tools that allow cross-trait queries to identify and visualize associations in regions that show a high degree of overlap across traits, 44 regions related to cardiovascular disease were selected for sequencing studies. These regions were assessed in all participants from the ARIC study, CHS, and FHS who were selected for targeted sequencing.
  13. Pulmonary function: Severe cases of chronic obstructive pulmonary disease (COPD) were selected based on forced expiratory volume in the first second (FEV1) that was less than 65% of the predicted value, and its ratio to forced vital capacity (FEV1/FVC) that was less than the lower limit of 'normal', based on NHANES III prediction equations. A random sample of 200 subjects was selected for sequencing among those who met the severe COPD definition at visits 1 and 2 in the ARIC study, and who had non-missing covariate data.
  14. Retinal venule diameter: Individuals were selected for sequencing from the highest quartile of the trait distribution adjusted for age and sex from the ARIC study (n=166) and CHS (n=34). All participants had retinal photography and retinal arteriolar and venular caliber measured from computer software using standardized protocols.
  15. Stroke: Stroke was defined as a focal neurological deficit of presumed vascular cause with sudden onset and lasting for at least 24 hours or until death if the participant died less than 24 hours after the onset of symptoms. Participants with incident ischemic stroke based on clinical and imaging criteria excluding cardioembolic events were eligible for selection. This phenotype, corresponding to both large and small artery atherothrombotic strokes, yielded the largest hazard ratio in the CHARGE meta-analysis. From among individuals meeting these criteria, the individuals with the earliest strokes with onset past the age of 65, and equal numbers of men and women were selected in numbers proportional to the size of the participating cohorts: 80, 70, and 50 individuals from the ARIC study, CHS, and FHS, respectively.
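The sequential criteria used for the additional Framingham bone mineral density sample (item 4 above) amount to filling a quota tier by tier, strictest criterion first. The sketch below is one reading of that procedure, not the working group's actual code: the record keys (`fn_t`, `fn_z`, `ls_t`, `ls_z`), the function names, and the within-tier ordering (input order) are assumptions, since the text does not say how individuals were ordered within a criterion.

```python
# (FN T-score cutoff, FN Z-score cutoff) for criteria 2 to 6 in the text.
FN_TIERS = [(-2.0, -2.0), (-2.0, -1.5), (-1.5, -1.5), (-1.0, -1.5), (-1.0, -0.5)]

def meets_tier(p, tier):
    """Return True if participant `p` satisfies criterion number `tier` (1-6)."""
    if tier == 1:
        # Criterion 1: both sites below -2 on both T- and Z-scores.
        return all(p[k] < -2.0 for k in ("fn_t", "ls_t", "fn_z", "ls_z"))
    t_cut, z_cut = FN_TIERS[tier - 2]
    # Criteria 2-6 also exclude anyone with LS BMD Z-score > 0.
    return p["fn_t"] < t_cut and p["fn_z"] < z_cut and p["ls_z"] <= 0

def select_low_bmd(participants, quota=325):
    """Fill the quota tier by tier, strictest criterion first."""
    selected, chosen_ids = [], set()
    for tier in range(1, 7):
        for p in participants:
            if len(selected) == quota:
                return selected
            if p["id"] not in chosen_ids and meets_tier(p, tier):
                selected.append({**p, "tier": tier})
                chosen_ids.add(p["id"])
    return selected

# Made-up scores for illustration only.
people = [
    {"id": "a", "fn_t": -2.5, "ls_t": -2.4, "fn_z": -2.2, "ls_z": -2.1},  # criterion 1
    {"id": "b", "fn_t": -2.1, "ls_t": -0.5, "fn_z": -1.6, "ls_z": -0.2},  # criterion 3
    {"id": "c", "fn_t": -2.3, "ls_t": 0.8, "fn_z": -2.1, "ls_z": 0.4},    # excluded: LS Z > 0
    {"id": "d", "fn_t": -1.2, "ls_t": -1.0, "fn_z": -0.7, "ls_z": -0.3},  # criterion 6
]
picked = select_low_bmd(people, quota=3)
```

Recording the tier alongside each selected participant makes it easy to see how far down the list of criteria the selection had to go to reach the quota.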

B. Exome sequencing

  1. ECG (electrocardiogram) QT interval: QT interval measures were adjusted for age and RR interval, and the highest standardized residuals were used as the basis of selection of 100 participants from the ARIC study, 50 participants from CHS, and 50 participants from FHS. Individuals who had a bundle branch block; QRS interval > 120; atrial fibrillation; pacemaker activity, or who were using QT prolonging medication were excluded. Selection was gender-specific.
  2. Fasting glucose: Two hundred (200) subjects were sampled from the high tail of the native distribution of fasting blood glucose including 100 individuals in the ARIC study, 50 participants in CHS, and 50 participants in FHS. Individuals with known diabetes; who were treated for diabetes; or those with a fasting glucose > 7 mmol/L were excluded. The ARIC study and FHS applied a further exclusion of non-fasting individuals. Participants who were missing hemoglobin A1c values were also excluded in the ARIC study, and subjects with type 1 diabetes were excluded in FHS. Selection was gender-specific.
  3. Menopause: One hundred (100) women in the ARIC study, 50 in CHS, and 50 in FHS were sequenced from the lower tail of the natural menopause distribution. Age at natural menopause was defined on the basis of self-reported age at the last menstrual period, excluding those reporting any menstrual cycle in the previous 12 months, and including only natural menopause cases (i.e., cases due to hysterectomy, chemotherapy, oophorectomy, or unknown causes, were excluded). Early menopause was defined as natural menopause occurring between 35 and 44 years of age. Each cohort chose subjects at random from that tail.
  4. Plasma fibrinogen level: Participants with the highest fibrinogen levels based on residuals that were stratified by sex and adjusted for age and clinic site were sampled for sequencing. The top 25 individuals were selected for each sex from CHS and FHS, and the top 50 individuals for each sex were chosen from the ARIC study.
  5. Renal function: Each of the three cohorts selected their best chronic kidney disease (CKD) cases. A total of 200 subjects were chosen based on low glomerular filtration rate estimated by serum creatinine (eGFRcrea). In CHS, CKD was defined as eGFR < 60 mL/min/1.73 m² based on a single measurement at the baseline visit (n=50). ARIC (n=100) and FHS (n=50) used a cumulative definition of CKD based on measurements at several study visits. In the ARIC study, subjects were chosen based on low eGFR at visits 1, 2, and 4 if they were not included in the cohort random sample or other case groups. At each visit, individuals with eGFRcrea > 60 were excluded, and then a total of 100 participants with the lowest eGFR values at each visit were selected stratified by gender. In FHS, the cumulative prevalence of CKD was defined as low eGFRcrea at both the earlier examination cycle (15th for the Original Cohort and 2nd for the Offspring Cohort) and the later examination cycle (24th for the Original Cohort and 7th for the Offspring Cohort), or if diagnosed at the later examination cycles.
  6. Stamler-Kannel: The extremes of the Stamler-Kannel design were chosen by generating residuals for the following traits adjusted for age, age², sex, BMI, BMI², and study site: systolic blood pressure, diastolic blood pressure, triglycerides, total cholesterol, HDL cholesterol, glucose, and insulin. A principal components analysis was then fitted for these phenotypes and standardized across the three cohorts. Individuals with the highest and lowest values of the first principal component were selected for sequencing.
  7. Waist-to-hip ratio adjusted for BMI: Two hundred (200) unrelated participants including 100 individuals from the ARIC study, and 50 subjects each from CHS and FHS were sequenced from each of the tails of the distribution for waist-to-hip ratio adjusted for BMI. In FHS, subjects were greater than 25 years of age and less than 65 years of age. In the ARIC study and CHS, there were no age restrictions. In all studies, individuals were excluded if BMI < 18.5 kg/m².
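Several of the phenotype groups above (for example plasma fibrinogen level) were chosen by ranking covariate-adjusted residuals separately within each sex. A minimal sketch of that idea follows. It is deliberately simplified: it adjusts for age only with an ordinary least-squares line, whereas the study also adjusted for covariates such as clinic site; the function names, record keys, and data values are hypothetical.

```python
from statistics import mean

def residuals_on_age(rows):
    """Residuals from a least-squares fit of value ~ age within `rows`."""
    ages = [r["age"] for r in rows]
    values = [r["value"] for r in rows]
    a_bar, v_bar = mean(ages), mean(values)
    sxx = sum((a - a_bar) ** 2 for a in ages)
    sxy = sum((a - a_bar) * (v - v_bar) for a, v in zip(ages, values))
    slope = sxy / sxx
    intercept = v_bar - slope * a_bar
    return [v - (intercept + slope * a) for a, v in zip(ages, values)]

def top_residuals_by_sex(rows, n_per_sex):
    """Pick the n highest age-adjusted residuals separately for each sex."""
    chosen = []
    for sex in ("F", "M"):
        stratum = [r for r in rows if r["sex"] == sex]
        resid = residuals_on_age(stratum)
        # Rank by residual only, so the participant dicts are never compared.
        ranked = sorted(zip(resid, stratum), key=lambda pair: pair[0], reverse=True)
        chosen.extend(person for _, person in ranked[:n_per_sex])
    return chosen

# Made-up measurements for illustration only.
rows = [
    {"id": "F1", "sex": "F", "age": 50, "value": 300},
    {"id": "F2", "sex": "F", "age": 60, "value": 350},
    {"id": "F3", "sex": "F", "age": 70, "value": 320},
    {"id": "F4", "sex": "F", "age": 80, "value": 330},
    {"id": "M1", "sex": "M", "age": 50, "value": 280},
    {"id": "M2", "sex": "M", "age": 60, "value": 290},
    {"id": "M3", "sex": "M", "age": 70, "value": 340},
    {"id": "M4", "sex": "M", "age": 80, "value": 310},
]
top = top_residuals_by_sex(rows, n_per_sex=1)
```

Fitting the regression within each sex means a participant is compared only against others of the same sex, which is what keeps the selected group balanced when the same number is taken from each stratum.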

C. Disease 2020: Large-Scale Sequencing and Analysis Center Initiated Projects/Baylor College of Medicine

    Disease 2020, a new strategic framework for the NHGRI Sequencing Program, was introduced with the goal of leveraging recent advances in genomic technology to systematically define the genetic basis of human disease and maximize the impact of genomic medicine. In response to this initiative, the Large-Scale Sequencing and Analysis Centers including the Baylor College of Medicine Human Genome Sequencing Center (HGSC), were invited to propose demonstration projects known as Center Initiated Projects (CIPs). Two CIPs were designed in accordance with the emphasis of the common disease component of Disease 2020 on 1) leading causes of morbidity and mortality for which a significant proportion of the heritability is still unexplained, and 2) availability of appropriately consented DNA samples from individuals enrolled in large deeply phenotyped cohort studies. Samples from the Cardiovascular Health Study (CHS) were included in the CIP described below:

  1. Rare and Common Variants Contribute to Age-Related Change in Brain Morphology and Cognitive Decline: CHS

    Genetic analysis of heritable endophenotypes, such as cognitive function and brain morphology that may be closer to the underlying disease pathophysiology and more directly related to gene expression, may help to uncover loci that increase susceptibility to Alzheimer's disease before diagnostic criteria are met. To evaluate the contribution of rare and low frequency coding variants to changes in cognition and brain morphology, exome sequencing was performed in 2,905 European-American participants from the ARIC Study, CHS and FHS. Selection of study subjects was carried out within each of the three cohorts. During the third ARIC examination, 966 European American and 960 African American participants aged 55 and older were invited to undergo cranial magnetic resonance imaging (MRI). Three validated neurocognitive tests chosen to represent different domains (Delayed Word Recall Test, verbal memory; Word Fluency Test, executive function; and Digit Symbol Substitution Test, processing speed) were administered to the entire cohort at the second and fourth ARIC clinic visits. In 2004-2006, 1,025 study participants underwent a second cerebral MRI examination and additional cognitive testing. Brain images were graded both semi-quantitatively (at visit 3 and at the Brain MRI follow-up visit) and quantitatively (follow-up visit only) for hippocampal volume. Participants from FHS were selected (n = 939) who had at least two MRI scans to assess hippocampal volume an average of six years apart. All FHS participants also had a 45-minute cognitive assessment at the same time as each MRI including a test of delayed recall (Logical Memory I adapted from the Original Wechsler Memory Scale). Individuals with at least one MRI to evaluate hippocampal volume (n = 835) were chosen from CHS. Among these CHS participants, 252 had hippocampal volume measured using a quantitative MRI scan, and 824 had data available for 6-year change in scores for the Digit Symbol Substitution Test. An additional group of 165 persons without hippocampal volume measurements but who had taken the Digit Symbol Substitution Test twice six years apart were also included. Meta-analysis strategies can be used to combine results for the three cohorts. This study contains the CHS study subset of the CIP. Additional data from ARIC and FHS is also available via dbGaP.
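The text does not say which meta-analysis strategy is intended for combining the three cohorts. As one common possibility, a fixed-effect inverse-variance weighted meta-analysis can be sketched as follows; the choice of method, the function name, and the per-cohort effect sizes are all assumptions made for illustration.

```python
import math

def fixed_effect_meta(estimates):
    """Inverse-variance weighted pooled effect.

    `estimates` is a list of (beta, standard_error) pairs, one per cohort.
    Returns the pooled effect and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Made-up per-cohort effect sizes (beta, SE) for illustration only.
cohort_results = [(0.10, 0.05), (0.20, 0.10), (0.15, 0.10)]
pooled_beta, pooled_se = fixed_effect_meta(cohort_results)
```

Each cohort's estimate is weighted by the inverse of its variance, so the cohort with the smallest standard error contributes most to the pooled effect.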

Molecular Data
Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment
Exome Sequencing | Illumina | HiSeq 2000 | N/A | N/A |
Whole Genome Sequencing | Illumina | HiSeq 2000 | N/A | N/A |
Targeted Regional Capture Sequencing | Applied Biosystems | SOLiD4 | N/A | N/A |
Whole Exome Genotyping | Illumina | HumanExome-12v1_A | N/A | N/A |
Study History

The U.S. CHARGE consortium consists of three large population-based longitudinal cohort studies, including the Atherosclerosis Risk in Communities (ARIC) Study, the Cardiovascular Health Study (CHS), and the Framingham Heart Study (FHS). The Cardiovascular Health Study (CHS) is described below:

    The Cardiovascular Health Study (CHS) is an NHLBI-funded observational study of risk factors for cardiovascular disease in adults 65 years or older conducted across four field centers. The original predominantly Caucasian cohort of 5,201 persons was recruited in 1989-1990 from random samples of the Medicare eligibility lists and an additional 687 African-Americans were enrolled subsequently for a total sample of 5,888. Starting in 1989, and continuing through 1999, participants underwent annual extensive clinical examinations. Follow-up for events remains ongoing through the present.