Nat Commun. 2018 Dec 5;9(1):5199. doi: 10.1038/s41467-018-07349-w.

A semi-supervised approach for predicting cell-type specific functional consequences of non-coding variation using MPRAs.

Author information

1 Department of Biostatistics, Columbia University, New York, NY 10032, USA.
2 Department of Statistics, Columbia University, New York, NY 10027, USA.
3 Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
4 Department of Biostatistics, Columbia University, New York, NY 10032, USA. ii2135@cumc.columbia.edu.

Abstract

Predicting the functional consequences of genetic variants in non-coding regions is a challenging problem. We propose a semi-supervised approach, GenoNet, that jointly uses experimentally confirmed regulatory variants (labeled variants), millions of unlabeled variants genome-wide, and more than a thousand cell/tissue-type-specific epigenetic annotations to predict the functional consequences of non-coding variants. Applying it to several experimental datasets, we demonstrate that the proposed method significantly improves prediction accuracy over existing functional prediction methods at the tissue/cell-type level, and especially at the organism level. Importantly, we illustrate how GenoNet scores can help in fine-mapping at GWAS loci and in the discovery of disease-associated genes in sequencing studies. As more comprehensive lists of experimentally validated variants become available over the next few years, semi-supervised methods like GenoNet can provide increasingly accurate functional predictions for variants genome-wide and across a variety of cell/tissue types.

PMID: 30518757
PMCID: PMC6281617
DOI: 10.1038/s41467-018-07349-w
[Indexed for MEDLINE]
Free PMC Article
