
Study Description

The Gabriella Miller Kids First Pediatric Research Program (Kids First) is a trans-NIH effort initiated in response to the 2014 Gabriella Miller Kids First Research Act and supported by the NIH Common Fund. This program focuses on gene discovery in pediatric cancers and structural birth defects and the development of the Gabriella Miller Kids First Pediatric Data Resource (Kids First Data Resource). All of the genomic and phenotypic data from this study are accessible through dbGaP. The data is also available at the Kids First Portal, where other Kids First datasets can also be accessed in the cloud for data analysis, data visualization, collaboration and interoperability, open to all researchers and developers.

Myelomeningocele (aka meningomyelocele, MM) is the most severe form of spina bifida, a neural tube defect (NTD) in humans and the most common CNS birth defect. MM is considered a genetically complex disease and occurs in 3.72 per 10,000 live US births. It is partly preventable with prenatal folate, but the genetic basis and the mechanisms by which folate works to reduce disease incidence remain obscure. MM is associated nearly uniformly with prenatal hydrocephalus and the Arnold-Chiari malformation, as well as paraplegia and lifelong neuromotor disability. The genes for several rare syndromic forms of NTDs are known, but the causes for the majority with sporadic MM remain unknown.

Despite the importance of MM, most previous research has been limited to targeted sequencing and association studies of folate metabolism genes, or very small-scale exome sequencing. We hypothesize that de novo mutations (DNMs) produce likely gene disrupting (LGD) events that contribute to MM risk. Using conservative estimates of 50-100 recurrently mutated discoverable genes contributing to risk, and our preliminary data demonstrating an excess of LGD DNMs in MM compared with control individuals, we estimate that with a cohort size of 1000 trios, we should uncover 5-20 new recurrently mutated genes underlying MM, with minimal false discovery. With this in mind, we formed the Spina Bifida Sequencing Consortium and established a platform for data and sample sharing. Preliminary analysis of our first batch of 100 trios analyzed by WGS from GMKF suggests a wealth of important gene mutations. In collaboration with the US Spina Bifida Association, we have embarked on recruitment of an additional cohort of 400 new simplex MM trios, consented the trios to allow for data sharing, and performed detailed sample quality control. We also have preliminary data showing that dried bloodspot DNA from trios performs comparably to whole blood DNA, so we will be happy to swap saliva for bloodspot recruitment at NIH’s preference. This cohort is now half-way assembled, with the remainder to be ascertained in the next 6 months. We have established a workflow for de novo SNP/INDEL/SV detection from WGS and have ample computer storage and nodes to see the project to completion. We also plan to continue recruitment into the future with the goal of 2000 trios in the next 5 years.
We propose a detailed bioinformatics workflow to identify gene mutations within a statistical framework, considering detailed scRNA expression profiling from developing mammalian neural tube, and have developed a robust functional validation workflow using Xenopus and mouse gene targeting. Our project has the potential to uncover a host of causes for this most common of the CNS birth defects, paving the way for future breakthroughs in detection, treatment, and prevention.

Study Inclusion/Exclusion Criteria

1. Family type: All samples are from trios (father, mother, affected child) where saliva or blood samples were obtained on both biological parents and the affected individual. These criteria were established to ensure that de novo mutations could be identified.

2. Diagnosis: For each trio, the child has myelomeningocele (MM) that was diagnosed prior to or at the time of birth, with paralysis below the level of the spinal lesion. Almost all cases have coexisting hydrocephalus that required a ventriculoperitoneal shunt at birth. The only exceptions are cases that had fetal surgery to repair the myelomeningocele, because studies show that fetal surgery can reduce the risk of hydrocephalus [1]. This criterion was established to limit phenotypic variability and to ensure that all affected individuals have a uniformly severe form of neural tube defect. Our criteria represent the most severe form of neural tube defect that is compatible with long-term survival.

3. Demographics: Demographics including age and sex are presented for each affected child. In our cohort, most cases are between 1 and 20 years of age. Most parents are between 30 and 60 years of age. We maintain data on the year and location of birth so that we can track whether folate fortification was instituted by the time the child was conceived. Because the number of expected de novo mutations in an offspring correlates with the parental age at the time of conception [2], we track the age of the parents at the time of conception to allow correction for the number of expected de novo mutations.

4. Clinical information: The location of the spinal lesion, medications, seizure history, ambulatory information and health status were collected on each case, including notable pregnancy events, trauma and delivery history. Most cases were identified through the national Spina Bifida Association, through Facebook, or through outreach to Spina Bifida clinics. Cases were selected from among those that responded to a study invitation based upon our entry criteria. We have the ability to recontact all cases if needed, and to gather additional information if required in the future.

5. Family history: Family medical history relevant to spina bifida is collected, including history of neural tube defects, hydrocephalus, spina bifida occulta, or other birth defects.
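The trio requirement in criterion 1 exists because a de novo call depends on having genotypes for both biological parents. The following is a minimal sketch of that logic, not the study's actual pipeline: the function name, the allele-count encoding, and the example variants are all hypothetical, and a real workflow would also apply read-depth and quality filters.

```python
from typing import Optional

def is_candidate_de_novo(father: Optional[int],
                         mother: Optional[int],
                         child: Optional[int]) -> bool:
    """Return True when the child carries an alternate allele absent in both parents.

    Genotypes are alternate-allele counts (0, 1, or 2); None means no call.
    This encoding is an illustrative assumption, not the study's format.
    """
    # Without a call for all three trio members, de novo status cannot be assessed.
    if father is None or mother is None or child is None:
        return False
    return father == 0 and mother == 0 and child >= 1

# Hypothetical calls: (variant id, father, mother, child)
trio_calls = [
    ("var1", 0, 0, 1),     # absent in both parents, present in child
    ("var2", 0, 1, 1),     # inherited from the mother
    ("var3", 0, None, 1),  # mother not genotyped, so status is unknown
]

candidates = [name for name, f, m, c in trio_calls if is_candidate_de_novo(f, m, c)]
print(candidates)
```

Only the first hypothetical variant passes. The third shows why incomplete trios are excluded: without the missing parent, an inherited variant cannot be ruled out.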

Study History

We initiated this project in 2015 by recruiting patients locally from Rady Children’s Hospital and accessing existing cohorts of MM trios from prior CDC-funded ascertainment efforts at Duke and U Texas, growing the cohort to ~300 trios. We created the Spina Bifida Sequencing Consortium, with the goal of collaborative discovery, and recruited additional trios from around the country. Most of these prior samples were blood-derived, a subset of which were sequenced at Rady Children’s using philanthropy funds. Approximately half of these trios were conceived without folic acid supplementation, allowing us to stratify the cohort into two groups. The first 806 trios underwent WES analysis, and the data are presented below. In 2019 GMKF awarded the Consortium 1000 WGS spots, but during the process of submission, two of the Consortium home institutions decided that the prior consent was not compatible with the broad data sharing of GMKF. We therefore substituted trios from later-recruited patients, whose samples were predominantly saliva in origin and thus required greater sequencing depth to achieve 30x human genome read depth. Meanwhile, the samples that were removed from GMKF had WES performed using philanthropy funds. The original GMKF 1000 WGS spots thus allowed WGS on 166 trios (i.e., approximately 500 individuals, each sequenced with two WGS spots to account for bacterial contamination), now under analysis.
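The spot allocation described above can be checked with a short calculation. This is a sketch of the arithmetic only; the constant names are illustrative, and the numbers are those stated in the text (1000 spots, two spots per individual, three individuals per trio).

```python
WGS_SPOTS = 1000           # spots awarded by GMKF
SPOTS_PER_INDIVIDUAL = 2   # doubled to offset bacterial reads in saliva-derived DNA
INDIVIDUALS_PER_TRIO = 3   # father, mother, affected child

# Each individual consumes two spots, so 1000 spots cover 500 individuals.
individuals_covered = WGS_SPOTS // SPOTS_PER_INDIVIDUAL

# Only complete trios are useful for de novo detection.
complete_trios = individuals_covered // INDIVIDUALS_PER_TRIO
individuals_in_trios = complete_trios * INDIVIDUALS_PER_TRIO

print(individuals_covered, complete_trios, individuals_in_trios)
```

The budget covers 500 individuals, which yields 166 complete trios (498 individuals), consistent with the figures reported above.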

Since then, we have broadened our recruitment methods to focus on social media and direct patient outreach, and set a goal of 2000 trios over the next 5 years. For logistical reasons, we have transitioned to recruitment by mail using either saliva sampling or blood spot sampling. The saliva sampling is straightforward and very few trios decline at the saliva sampling stage.

Blood spot sampling is a relatively new option, made possible by the newly available Illumina ‘on-bead Tagmentation’ Tn5 transposase WGS library prep. While it is offered commercially, none of the GMKF centers have yet adopted the technology. After discussion with GMKF NIH leadership, we were given permission to run a pilot Tagmentation project and were matched with HudsonAlpha, with the goal of comparing quality from blood-derived DNA extracted with standard salt extraction vs. blood spotted onto Whatman filter paper and left at room temperature for several months. Analysis of 2 samples from each of 3 individuals revealed that bloodspots functioned just as well as blood-extracted DNA. This is exciting because bloodspots can now be a viable method for WGS, without the drawback of bacterial contamination from saliva. Adoption of this method by GMKF could allow more cost-efficient and streamlined operations. At NIH’s preference, future recruitment could use bloodspots.

In last year’s application, we shared results from our first 179 WES trios, describing a large excess of LGD DNMs in patients compared with controls, and a greater burden of LGD DNMs in folate-exposed than non-exposed trios. Rather than repeat this information, with the current application we share analysis of our full WES cohort of 806 trios, which have been more rigorously analyzed with best practices and joint calling with a control cohort of 732 trios (Fig. 3a). Considering first DNMs, we found no differences in synonymous missense in either the ‘all gene’ or ‘constrained gene’ (pLI >0.9) sets, but a dramatic difference in LGD, de novo missense (D-Mis), and the sum of LGD + D-Mis, for both gene sets (Fig. 3b). Relative risk (RR) for ‘all genes’ was 1.42-to-1.98, meaning that roughly half of mutations in this gene set would be expected to occur by chance, whereas for ‘constrained genes’ RR was 1.73-to-5.55, meaning that fewer such mutations in MM patients are expected by chance. Thus, about half of LGD mutations contribute to disease, and with a cohort of ~1000 trios, we should identify 20-25 recurrently mutated genes. All DNMs underwent orthogonal validation using PCR amplification, revealing a 96% validation rate. We found that DNMs detected by WES contribute to approximately 22% of MM risk, and identified a total of 203 damaging MM candidate genes, of which 8 are recurrently mutated.
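One way to read the relative risk ranges above is through the standard epidemiological attributable fraction, (RR - 1) / RR, whose complement 1 / RR is the share of case mutations expected by chance. The text does not say how its "roughly half" estimate was derived, so applying this formula here is an assumption; the function name is illustrative and only the RR values come from the text.

```python
def attributable_fraction(rr: float) -> float:
    """Fraction of case mutations in excess of chance, given relative risk rr.

    Uses the standard formula (rr - 1) / rr; the remainder, 1 / rr, is the
    share expected by chance. Its use here is an assumption about the text.
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    return (rr - 1.0) / rr

# RR bounds quoted in the text for each gene set
rr_ranges = {
    "all genes": (1.42, 1.98),
    "constrained genes": (1.73, 5.55),
}

for gene_set, (low, high) in rr_ranges.items():
    print(f"{gene_set}: {attributable_fraction(low):.2f} to {attributable_fraction(high):.2f} "
          f"of mutations in excess of chance")
```

Under this reading, the upper bound for ‘all genes’ (RR 1.98) corresponds to about half of mutations being in excess of chance, and the upper bound for ‘constrained genes’ (RR 5.55) to about four in five.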

We have also made good progress on WGS analysis of the trios sequenced to date, considering that data were returned only 3 months ago. Applying a best-practice WGS analysis workflow, we have so far identified 83 total de novo LGD exonic and splice variants, a rate approximately twice that of WES, likely resulting from greater ability to interpret splice variants with SpliceAI (see below). Moreover, 5 of the genes identified by WGS were also on the WES list and can now be considered recurrently mutated. We also have 20 DNMs occurring in ultra-conserved non-coding regions proximal to candidate genes. In the near future we will process data to assess for SVs, LINEs and STR expansions, consider mosaic mutations, and assess inherited variants. We conclude that WGS is superior to WES in its ability to assess the full range of variants that contribute to disease.

Study Attribution
  • Principal Investigator
    • Joseph G. Gleeson. University of California, San Diego, San Diego, California, USA.
  • Funding Source
    • X01HD100672-01. National Institutes of Health, Bethesda, Maryland, USA.
    • P01HD104436. National Institutes of Health, Bethesda, Maryland, USA.