
Study Description

The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. TCGA is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which are both part of the National Institutes of Health, U.S. Department of Health and Human Services.

TCGA projects are organized by cancer type or subtype. Current projects are glioblastoma multiforme (GBM), squamous lung carcinoma, and ovarian serous cystadenocarcinoma.

TCGA project primary genomic sequencing datasets (controlled-access) and limited phenotype data (open-access) are available from this site. Comprehensive access to TCGA datasets, e.g. gene expression, copy number variation and full clinical information, is available via the TCGA Data Portal.

  • Study Weblinks:
  • Study Design:
    • Tumor vs. Matched-Normal
  • Study Type:
    • Tumor vs. Matched-Normal
  • Number of study subjects that have individual-level data available through Authorized Access:
Authorized Access
Publicly Available Data (Public ftp)

Connect to the public download site. The site contains release notes and manifests. The site also contains data dictionaries, variable summaries, documents, and truncated analyses, whenever available.

Study Inclusion/Exclusion Criteria

The Cancer Genome Atlas (TCGA) utilizes a strict set of criteria for inclusion into the study due to the rigorous and comprehensive nature of the work being performed. Collected tumors and normal matched control samples are curated and processed by the Biospecimen Core Resource, a centralized site that reviews sample data and processes all samples to ensure consistent pathology assessment and generation of molecular analytes (DNA and RNA) using standard, optimized protocols.

TCGA is focusing on primary untreated tumors that were snap frozen upon collection. All tumors must have a matched normal sample from the same patient. In many cases, the matched normal is a sample of the patient's blood.

Once at the Biospecimen Core Resource (BCR), all samples are subjected to a quality control protocol before they are accepted for full analysis in the TCGA pipeline. Each sample is reviewed by a pathologist to ensure the diagnosis is accurate and that the sample meets inclusion criteria. Specifically, TCGA requires that samples be composed of at least 80% tumor nuclei and have less than 20-30% necrotic tissue. Once the sample passes the pathology review, nucleic acids are isolated and genotyping is performed to ensure each tumor sample is properly associated with the correct normal tissue. An important goal in establishing this central resource is to ensure that molecular analytes (i.e. DNA and RNA) extracted from tissue samples are of consistent and high quality. These analytes, in turn, undergo a molecular quality control process before they are distributed to TCGA Cancer Genome Characterization Centers and Genome Sequencing Centers for genomic analysis.
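The pathology thresholds described above can be sketched as a simple filter. This is only an illustration, not the BCR's actual software: the `Sample` record, its field names, and the `passes_pathology_review` function are hypothetical. Because the text gives the necrosis limit as a range (20-30%), the cutoff is left as a parameter, and the default of 20% is an assumption that takes the stricter end of that range.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record of one tumor sample after pathology review."""
    sample_id: str
    tumor_nuclei_fraction: float  # fraction of nuclei that are tumor, 0.0-1.0
    necrosis_fraction: float      # fraction of necrotic tissue, 0.0-1.0
    has_matched_normal: bool      # matched normal from the same patient
    primary_untreated: bool       # primary tumor, no prior treatment
    snap_frozen: bool             # snap frozen upon collection


def passes_pathology_review(sample: Sample,
                            min_tumor_nuclei: float = 0.80,
                            max_necrosis: float = 0.20) -> bool:
    """Return True if the sample meets the stated inclusion criteria.

    max_necrosis is an assumed default; the source states only
    "less than 20-30%", so callers may pass 0.30 instead.
    """
    return (
        sample.primary_untreated
        and sample.snap_frozen
        and sample.has_matched_normal
        and sample.tumor_nuclei_fraction >= min_tumor_nuclei
        and sample.necrosis_fraction < max_necrosis
    )


if __name__ == "__main__":
    candidate = Sample("example-01", 0.85, 0.25, True, True, True)
    # Fails under the stricter assumed cutoff, passes under the looser one.
    print(passes_pathology_review(candidate))
    print(passes_pathology_review(candidate, max_necrosis=0.30))
```

Keeping the thresholds as parameters makes the ambiguity in the necrosis limit explicit instead of hiding it in a constant.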

All samples in TCGA have been collected and utilized following strict human subjects protection guidelines, informed consent and IRB review of protocols.

See TCGA program website at http://cancergenome.nih.gov/.

Molecular Data
Type | Source | Platform | Number of Oligos/SNPs | SNP Batch Id | Comment
Whole Genome Sequencing | Illumina | Genome Analyzer II | N/A | N/A |
Whole Exome Sequencing | Roche | 454 GS FLX Titanium | N/A | N/A |
Exome Sequencing | Applied Biosystems | SOLiD | N/A | N/A |