
Substudies
phs000527.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): Burkitt Lymphoma Genome Sequencing Project (BLGSP)
phs000528.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): HIV+ Tumor Molecular Characterization Project - Cervical Cancer (HTMCP - CC)
phs000529.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): HIV+ Tumor Molecular Characterization Project - Diffuse Large B-Cell Lymphoma (HTMCP - DLBCL)
phs000530.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): HIV+ Tumor Molecular Characterization Project - Lung Cancer (HTMCP - LC)
phs000531.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): Medulloblastoma
phs000532.v18.p6 : National Cancer Institute (NCI) Cancer Genome Characterization Initiative (CGCI): Non-Hodgkin Lymphoma

Study Description

The Office of Cancer Genomics at the National Cancer Institute sponsored a series of studies as part of the Cancer Genome Characterization Initiative (CGCI) to assess novel emerging sequencing technologies in cancer. The CGCI program included comprehensive characterization of the genetic aberrations found in different pediatric and/or adult tumors.

CGCI characterized a number of B-cell non-Hodgkin lymphomas (including diffuse large B-cell lymphoma (DLBCL) from patients with and without HIV infection, follicular lymphoma (FL), and adult and pediatric Burkitt lymphomas) and additional HIV-associated tumors (including DLBCL, lung, and cervical cancers). All data from these projects is available at the NCI Genomic Data Commons.

Individual project descriptions are available by disease on the substudy pages (links are included at the top of this page). Brief summaries are as follows:

  • Non-Hodgkin Lymphoma (NHL) - CGCI investigators probed genomic alterations more deeply than had previously been possible by using state-of-the-art RNA sequencing (mRNA-seq) and whole genome shotgun sequencing (WGS) coupled with leading-edge bioinformatics, data management and analysis approaches. The project sequenced tumor DNA and/or RNA from 117 NHL tumor samples and 10 cell lines. This included the genomes or exomes of 1 follicular lymphoma (FL) and 13 diffuse large B-cell lymphoma (DLBCL) cases, all with matched constitutional DNA sequenced to comparable depths, as well as mRNA-seq of 92 DLBCL, 12 FL and 8 B-cell NHL cases with other histologies and 10 DLBCL-derived cell lines. The DLBCL cases and cell lines are from the two major subtypes of DLBCL: germinal center B-cell (GCB) and activated B-cell (ABC).

  • HIV+ Tumor Molecular Characterization Project (HTMCP) - This project was a joint effort of the Office of Cancer Genomics (OCG) and the Office of HIV and AIDS Malignancy (OHAM). Its goals were to characterize HIV-associated cancers (obtained from HIV-infected patients) and compare them to the same types of cancers from patients without HIV infection. Investigators performed 30X genome sequencing of 100 cases of paired tumor and germline DNA, along with transcriptome sequencing in each of 3 types of HIV+ tumors (DLBCL, lung and cervical cancers). These platforms allow characterization of mutations in both coding and non-coding genomic regions, gene expression, and genomic alterations (including translocations, insertions and deletions). Comparing tumors of cancer patients with and without HIV infection provides insight into the potential function of this virus in certain cancers.

  • Burkitt Lymphoma Genome Sequencing Project (BLGSP) - This project was a collaborative effort between the National Cancer Institute and the Foundation for Burkitt Lymphoma Research to develop a databank of the many alterations found in Burkitt lymphoma (BL), an uncommon type of non-Hodgkin lymphoma that occurs most often in children and young adults. The goal of the BLGSP was to explore potential genetic changes in patients with BL that could lead to better prevention, detection and treatment of the cancer. The project characterized the alterations of the tumors' genomes (with matched normal as control) and transcriptomes by sequencing the DNA and RNA of each case. Using the data generated, the ultimate goals of the project were to discover the molecular changes that are present in BL patients and then determine how those changes correlate with treatment regimen and outcome.

CGCI data is accessible at the NCI's Genomic Data Commons (GDC) via the GDC's Data Portal and from each of the CGCI Publication Pages at the GDC (see the "Supplemental Links" section of any CGCI publication's "Publication Information and Associated Data Files" page at the GDC). Available datasets include raw sequencing data, datasets generated by the original CGCI research teams (fully annotated clinical information and higher-level/analyzed molecular characterization data), and higher-level data generated by the GDC. Please see the CGCI Use and Publication Guidelines for updated details on the sharing of any CGCI substudy data, including how to cite CGCI.

To learn more about the CGCI studies, please visit the CGCI Program website.

Study Inclusion/Exclusion Criteria

All specimens and all clinical and laboratory data gathered for this project meet the strict set of criteria established by The Cancer Genome Atlas (TCGA). In particular, the following specific criteria are met.

  1. The focus is on primary untreated tumors that were snap frozen upon tissue resection.
  2. All samples are collected and utilized following strict human subjects protection guidelines, informed consent, and IRB-reviewed protocols.
  3. Whenever possible, clinical data are gathered prospectively and stored in continuously updated electronic format using a standard relational database (MS Access) employing caDSR compliant terminology and from which the data can be easily exported.

Additional information on specimen inclusion and exclusion criteria for the specific tumor types investigated as part of CGCI can be found on the CGCI Program website and within referenced publications for this initiative.

Study History

Cancer is a genetic disease. Alterations at the DNA level drive the cellular changes that are hallmarks of cancer, including aberrant cell division and survival. Historically, genetic causes of cancer were studied by analysis of one or a few genes at a time. More recently, however, novel high-throughput technologies have provided unprecedented capabilities to examine the cancer genome. These technologies allow systematic characterization of genetic and epigenetic alterations, enabling investigators to identify the underlying genetic changes found in cancer. The CGCI incorporates multiple approaches for genomic characterization, including exome sequencing and transcriptome analysis using next-generation sequencing. To encourage collaboration and leverage the collective knowledge and innovation of the entire cancer research community, all data collected will be publicly available through databases supported by the National Institutes of Health and National Cancer Institute.

All CGCI data is available at the NCI Genomic Data Commons. Please visit the CGCI Program website for additional information on CGCI projects.

Study Attribution
  • Principal Investigator (NHL)
    • Marco A. Marra, PhD. British Columbia Cancer Agency Genome Sciences Centre, Vancouver, BC, Canada.
  • Principal Investigator (Medulloblastoma)
    • Victor Velculescu, MD, PhD. Sidney Kimmel Center, The Johns Hopkins University, Baltimore, MD, USA.
  • Principal Investigator (BLGSP, HTMCP)
    • Daniela S. Gerhard, PhD. National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.