- Study Description
Important Links and Information
- ADSP data notices
Request access via Authorized Access
- Instructions for requestors
- Addendum to instructions: Application process for ADSP Sequencing Data
- Data Use Certification (DUC) Agreement
- Talking Glossary of Genetic Terms
The overarching goals of the Alzheimer's Disease Sequencing Project (ADSP) are to: (1) identify new genomic variants contributing to increased risk of developing Alzheimer's Disease (AD), (2) identify new genomic variants contributing to protection against developing AD, and (3) provide insight as to why individuals with known risk factor variants escape from developing AD. These factors will be studied in multi-ethnic populations in order to identify new pathways for disease prevention. Such a study of human genomic variation and its relationship to health and disease requires examination of a large number of study participants and needs to capture information about common and rare variants (both single nucleotide and copy number) in well phenotyped individuals.
Using existing samples from NIH funded and other studies, three NHGRI funded Large Scale Sequencing and Analysis Centers (LSAC) - Broad, Baylor, and Washington University - produced the DNA sequence data. Variant call data are being made available to the scientific community through NIH-approved data repositories. Statistical analysis of the sequence data is anticipated to identify new genetic risk and protective factors. The ADSP will conduct and facilitate analysis of sequence data to extend previous discoveries that may ultimately result in new directions for AD therapeutics. Analysis of ADSP data will be done in two phases.
The Discovery Phase analysis (2014-2018) is funded under PAR-12-183. The entire Discovery dataset contains whole-genome sequencing data on 584 subjects from 113 families, and pedigree data for > 4000 subjects; whole-exome sequencing data on 5096 cases and 4965 controls; and whole-exome sequencing data on an additional 853 subjects (682 cases [510 non-Hispanic, 172 Hispanic] and 171 Hispanic controls) from families that are multiply affected with AD.
The Replication Phase (2016-2021) analysis will be funded under RFA-AG-16-001 and RFA-AG-16-002 and is expected to include a combination of genotyping and sequencing approaches on at least 30,000 subjects. Targeted sequencing will be done by the LSACs.
- The first ADSP data release occurred on November 25, 2013. It included the whole-genome sequencing data in BAM file format on 410 individuals.
- The second ADSP data release occurred on March 31, 2014, and included the whole-genome sequencing data in BAM file format for an additional 168 individuals.
- The third ADSP data release occurred on November 3, 2014, and included whole-exome sequencing data in BAM file format for 10,939 individuals.
- The fourth ADSP data release occurred on February 13, 2015, and included revised ethnic data for subjects with whole-exome sequencing data.
- The fifth ADSP data release occurred on July 13, 2015, and included whole-genome genotypes and updated phenotypes as well as changes to pedigree structures and sample IDs.
- The sixth ADSP data release occurred on December 8, 2015, and included whole-exome genotypes and updated phenotypes as well as changes to subject IDs.
This seventh ADSP data release on April 12, 2016 includes:
(1) WES and WGS SNV VCF files
(2) WES and WGS Indel PLINK files
ADSP Data Available through dbGaP:
| Data type | ADSP - Whole Genome Sequencing | ADSP - Whole Exome Sequencing | Comments |
|---|---|---|---|
| DNA-Seq (BAM) | n=578 | n=10913 | Sequence data available (plus n=38 replications w/out genotype data) |
| Concordant SNV Genotypes (PLINK format) | N/A | n=10913 | QC'ed genotypes that are concordant between the Atlas (Baylor's) and GATK (Broad's) calling pipelines (a subset of the consensus genotype set) |
| Consensus Genotypes (PLINK and VCF format) | n=578 | n=10913 | QC'ed genotypes that are concordant between the Atlas and GATK pipelines as well as those that were called uniquely by Atlas or GATK |
| Concordant Indel Genotypes (PLINK format) | n=578 | n=10913 | QC'ed genotypes that are concordant between the Atlas and GATK calling pipelines |
| Phenotype Data | n=4735 | n=10913 | Data of n=53 phenotype variables available (plus administrative data), including APOE genotype. WGS phenotypes include data of connecting family members. |
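The relationship between the concordant and consensus genotype sets described above can be illustrated with a minimal Python sketch. This is not the ADSP QC protocol, which is not specified here; the function names, the `(sample_id, variant_id)` keying, and the toy genotype calls are all hypothetical. The sketch also assumes that calls made by both pipelines with differing genotypes are excluded from both sets, which the description above does not state.

```python
from typing import Dict, Tuple

# Hypothetical representation: (sample_id, variant_id) -> genotype string.
Calls = Dict[Tuple[str, str], str]


def concordant(atlas: Calls, gatk: Calls) -> Calls:
    """Genotypes that both pipelines called identically."""
    return {key: gt for key, gt in atlas.items() if gatk.get(key) == gt}


def consensus(atlas: Calls, gatk: Calls) -> Calls:
    """Concordant genotypes plus those called by only one pipeline.

    Assumption: genotypes called by both pipelines but with different
    values are left out.
    """
    merged = concordant(atlas, gatk)
    merged.update({k: gt for k, gt in atlas.items() if k not in gatk})
    merged.update({k: gt for k, gt in gatk.items() if k not in atlas})
    return merged


# Toy calls, invented for illustration only.
atlas_calls = {("S1", "v1"): "0/1", ("S1", "v2"): "1/1", ("S2", "v1"): "0/0"}
gatk_calls = {("S1", "v1"): "0/1", ("S1", "v2"): "0/1", ("S2", "v3"): "0/1"}

concordant_set = concordant(atlas_calls, gatk_calls)
consensus_set = consensus(atlas_calls, gatk_calls)
```

Under these assumptions the concordant set is always a subset of the consensus set, matching the "subset of the consensus genotype set" note in the table.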
Please use the release notes provided by dbGaP to obtain detailed information about study release updates.
The ADSP data portal provides a customized interface for users to quickly identify and retrieve files by covariates, phenotypes, and data properties such as sequencing facility or coverage. For more information about the ADSP study and the data portal, please visit https://www.niagads.org/adsp/.
- Authorized Access
- Publicly Available Data (Public ftp)
- Study Inclusion/Exclusion Criteria
The samples for the ADSP have been selected from well-characterized cohorts of individuals characterized for AD diagnosis as well as having known AD genetic risk factors. Investigators in the ADSP will obtain from the NIH-approved data repositories: (1) quality control checked and 'cleaned' sequence data; (2) information on the composition of the study cohorts (e.g. case-control, family-based, and epidemiology cohorts); (3) descriptions of the study cohorts included in the study; and (4) accompanying phenotypic information such as age at disease onset, self-reported race/ethnicity, gender, diagnostic status, and cognitive measures. Here, 'quality control checked and cleaned' means that a set of routine checks has been performed on sample information, phenotype, and GWAS data to ensure that the sequence data are of high quality and ready for downstream genetic analysis, that likely sources of false positives have been ruled out, and that outlier samples which may skew project-level analyses have been identified. The ADSP will determine what additional information, if any, is needed by its members to facilitate the project.
- Molecular Data
| Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment |
|---|---|---|---|---|---|
| Whole Genome Sequencing | Illumina | HiSeq 2000 | N/A | N/A | |
| Whole Exome Sequencing | Illumina | HiSeq 2000 | N/A | N/A | |
- Study History
On February 7, 2012, a new Presidential Initiative was announced to fight Alzheimer's Disease (AD). As part of this effort, the National Human Genome Research Institute (NHGRI) was asked by the Director of the National Institutes of Health (NIH) to use $25M already committed to its Large-Scale Sequencing Centers (LSSC) for genomic studies in AD. The NIH director asked the National Institute on Aging (NIA) and the NHGRI to work together to develop and execute a large scale sequencing project to analyze the genomes of a large number of well characterized individuals in order to identify a broad range of AD risk and protective gene variants, with the ultimate goal of facilitating the identification of new pathways for therapeutic approaches and prevention. The analysis will also provide insight as to why individuals with known risk factor genes escape from developing AD. The project, developed jointly by NIA and NHGRI, is called the Alzheimer's Disease Sequencing Project (ADSP).
- Selected publications
- Diseases/Traits Related to Study (MeSH terms)
- Primary Phenotype: Alzheimer Disease
- Authorized Data Access Requests
- Study Attribution
- Richard A. Gibbs, PhD. Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
- Eric S. Lander, PhD. Broad Institute, Boston, MA, USA.
- Richard K. Wilson, PhD. The Genome Institute, Washington University, St. Louis, MO, USA.
- Gerard D. Schellenberg, PhD. Alzheimer's Disease Genetics Consortium (ADGC), University of Pennsylvania, Philadelphia, PA, USA.
- Sudha Seshadri, MD. Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium (CHARGE), Boston University, Boston, MA, USA.
- U24 AG041689. The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA.
- U01 AG032984. Alzheimer's Disease Genetics Consortium, University of Pennsylvania, Philadelphia, PA, USA.
- R01 AG033193. Cohorts for Heart and Aging Research in Genomic Epidemiology, Boston University, Boston, MA, USA.
- U24 AG021886. National Cell Repository for Alzheimer's Disease, Indiana University, Bloomington, IN, USA.
Funding Sources for Sequencing
- U54HG003079. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
- U54HG003273. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
- U54HG003076. National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
- Principal Investigators