BMC Bioinformatics. 2016 Sep 13;17(1):364. doi: 10.1186/s12859-016-1238-8.

A haplotype-based normalization technique for the analysis and detection of allele specific expression.

Author information

1. CHU Sainte Justine Research Centre, Department of Pediatrics, Faculty of Medicine, Universite de Montreal, 3175 Chemin de la Cote Sainte Catherine, Montreal, QC, Canada. alan.j.hodgkinson@gmail.com.
2. Department of Medical and Molecular Genetics, Guy's Hospital, King's College London, London, SE1 9RT, UK. alan.j.hodgkinson@gmail.com.
3. CHU Sainte Justine Research Centre, Department of Pediatrics, Faculty of Medicine, Universite de Montreal, 3175 Chemin de la Cote Sainte Catherine, Montreal, QC, Canada.
4. Ontario Institute of Cancer Research, Toronto, ON, Canada.
5. Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.

Abstract

BACKGROUND:

Allele specific expression (ASE) has become an important phenotype: it is used to detect cis-regulatory variation, nonsense mediated decay and imprinting in the personal genome, and has been used both to identify disease loci and to assess the penetrance of damaging alleles. Detecting ASE with high throughput technologies relies on aligning short-read sequencing data, a process with inherent biases, and fast, accurate methods for detecting ASE are still needed given the unprecedented growth of sequencing information in big data projects.

RESULTS:

Here, we present a new approach to normalize RNA sequencing data in order to call ASE events with high precision in a short time-frame. Using simulated datasets, we find that our approach dramatically improves reference allele quantification at heterozygous sites compared with default mapping methods. It also performs well compared with existing techniques for ASE detection, such as filtering methods and mapping to parental genomes, without the need for complex and time-consuming manipulation. Finally, by sequencing the exomes and transcriptomes of 96 well-phenotyped individuals of the CARTaGENE cohort, we characterise the levels of ASE across individuals and find a significant association between smoking and the proportion of sites undergoing ASE within the genome.

CONCLUSIONS:

The correct treatment and analysis of RNA sequencing data are vital to control for mapping biases and detect genuine ASE signals. We show that normalising RNA sequencing information after mapping can be used to identify biologically relevant signals in personal genomes.
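
To make the idea of calling ASE at a heterozygous site concrete, the following minimal sketch computes the reference allele fraction from read counts and applies an exact two-sided binomial test against an expected reference fraction. This is a generic illustration, not the haplotype-based normalization described in this paper. The function names and the `expected_ref` parameter are hypothetical and chosen only for this example.

```python
from math import comb


def ref_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a heterozygous site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int,
                         expected_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for the observed reference read count.

    expected_ref is the reference fraction expected when there is no ASE
    (hypothetical parameter; 0.5 assumes no mapping bias at all).
    """
    n = ref_reads + alt_reads
    # Probability of every possible reference read count under the null.
    pmf = [comb(n, k) * expected_ref ** k * (1 - expected_ref) ** (n - k)
           for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are no more likely than the observed one.
    p = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    print(ref_fraction(15, 5))
    print(binomial_two_sided_p(15, 5))
```

Raising `expected_ref` slightly above 0.5 is one simple way to account for a reference mapping bias in such a test. The value to use would have to be estimated from the data and is not given in this abstract.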

KEYWORDS:

Allele specific expression; Normalization; RNA sequencing

PMID: 27618913
PMCID: PMC5020486
DOI: 10.1186/s12859-016-1238-8
[Indexed for MEDLINE]
Free PMC Article
