|
|
|
Status |
Public on Feb 18, 2015 |
Title |
Hi-C, Thymus STL001, replicate one |
Sample type |
SRA |
|
|
Source name |
Thymus STL001
|
Organism |
Homo sapiens |
Characteristics |
tissue: Thymus
|
Treatment protocol |
None
|
Growth protocol |
Tissue samples were obtained from deceased donors at the time of organ procurement at the Barnes-Jewish Hospital (St. Louis, USA). Samples were flash frozen and pulverized prior to formaldehyde cross-linking. Research consent was obtained from the family, and this study was approved by Mid-American Transplant Services.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Hi-C experiments were conducted using HindIII as described in a previous publication (Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289-93 (2009)). Sequencing libraries were constructed according to the same publication.
|
|
|
Library strategy |
OTHER |
Library source |
genomic |
Library selection |
other |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
Sample 1
|
Data processing |
library strategy: HaploSeq
fastq: Illumina's HiSeq Control Software
For Hi-C read alignment, we aligned Hi-C reads to the hg18 (human) genome. We masked any bases in the genome that were genotyped as SNPs in the individual genome; these bases were masked to "N" in order to reduce reference-bias mapping artifacts. Hi-C reads were aligned as single-end reads using Novoalign. After mapping was finished, read pairs were reconstructed from the single reads using an in-house pipeline. Unmapped reads were filtered out and PCR duplicate reads were removed. Haplotypes were generated from the final aligned BAM file using the HapCUT algorithm, which has been described previously (Bansal and Bafna, Bioinformatics 24, i153-159, 2008).
The final processed haplotypes were generated after removing local biases in the following three steps. First, alignment biases were removed by aligning simulated reads spanning the surrounding variant locations. If the difference between alleles was more than 5%, those variant loci were considered subject to an inherent mapping bias. Second, we removed alleles located in copy number variable regions and in allele-biased copy number variable regions by comparing the coverage between the two alleles based on WGS data. Any variant with coverage more than three standard deviations above the mean of each haplotype was excluded. Any variant showing biased WGS coverage between the two alleles was also excluded (binomial test, p-value threshold 0.05 after Benjamini correction). Lastly, we removed variants that were erroneously called heterozygous during genotyping. We calculated the probability that each heterozygous variant was actually homozygous from the likelihood of observing the coverage on each allele in whole-genome sequencing. Only heterozygous SNPs with an FDR of less than 0.5% were included in downstream analysis. HaploSeq generates two haplotypes for each chromosome, one for the maternal allele and one for the paternal allele.
One allele is named P1 (parent1) and the other is named P2 (parent2), since we do not have information regarding the parent of origin in each donor genome. For chr9, haplotypes are generated separately for each chromosome arm. The chrX haplotypes of STL002 were independently generated based on the hg19 genome build.
Genome_build: hg18
Supplementary_files_format_and_content: The processed haplotypes for the individual genome ("*_haps.vcf") are available in VCF format.
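The coverage-based filters in the data processing notes (coverage more than three standard deviations above the haplotype mean, and biased allelic coverage under a binomial test with Benjamini correction) can be illustrated with a minimal sketch. This is not the submitters' pipeline. The function names, the input layout (variant id mapped to per-allele WGS read counts), the choice of the Benjamini-Hochberg procedure, the null allele fraction of 0.5, and the rule "exclude when the adjusted p-value is below 0.05" are all assumptions made for illustration.

```python
from math import exp, lgamma, log
from statistics import mean, pstdev


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: total mass of outcomes no likelier than k."""
    if n == 0:
        return 1.0
    # Log-space PMF avoids overflow for deep coverage.
    pmf = [
        exp(lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))
        for i in range(n + 1)
    ]
    cutoff = pmf[k] * (1 + 1e-7)  # tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def benjamini_hochberg(pvals: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_variants(cov: dict, alpha: float = 0.05, n_sd: float = 3.0) -> list:
    """Keep variants that pass both coverage filters.

    cov maps a variant id to (reads on allele 1, reads on allele 2);
    this layout is hypothetical, not taken from the record.
    """
    ids = list(cov)
    if not ids:
        return []
    hap1 = [cov[v][0] for v in ids]
    hap2 = [cov[v][1] for v in ids]
    # Upper coverage limit per haplotype: mean + n_sd standard deviations.
    limit1 = mean(hap1) + n_sd * pstdev(hap1)
    limit2 = mean(hap2) + n_sd * pstdev(hap2)
    # Allelic balance test against an assumed 50/50 expectation.
    pvals = [binom_two_sided_p(a, a + b) for a, b in zip(hap1, hap2)]
    qvals = benjamini_hochberg(pvals)
    return [
        v for v, a, b, q in zip(ids, hap1, hap2, qvals)
        if a <= limit1 and b <= limit2 and q >= alpha
    ]
```

For example, calling `filter_variants` on twenty variants with balanced (15, 15) coverage plus one at (300, 300) and one at (28, 2) drops the last two: the first for excess coverage, the second for allelic imbalance.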
|
|
|
Submission date |
Jun 24, 2014 |
Last update date |
May 15, 2019 |
Contact name |
Inkyung Jung |
E-mail(s) |
ijungkaist@gmail.com
|
Organization name |
KAIST
|
Department |
Biological Sciences
|
Street address |
KAIST, 291 Daehak-ro, Yuseong-gu
|
City |
Daejeon |
ZIP/Postal code |
34141 |
Country |
South Korea |
|
|
Platform ID |
GPL11154 |
Series (1) |
GSE58752 |
Integrative analysis of haplotype-resolved epigenomes across human tissues |
|
Relations |
BioSample |
SAMN02898030 |
SRA |
SRX641264 |
Supplementary file | Size | Download | File type/resource |
GSM1419083_STL001_haps.vcf.gz | 13.9 Mb | (ftp)(http) | VCF |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
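As a rough sketch of how the processed haplotype file could be read, the following parses phased genotypes from a single-sample VCF. It assumes the file follows the standard VCF layout with a `GT` entry in the FORMAT column and phased calls written with `|`; the actual contents of `GSM1419083_STL001_haps.vcf.gz` were not inspected here. The function name is hypothetical, and treating the allele left of `|` as P1 and the allele right of it as P2 is also an assumption.

```python
import gzip


def read_phased_genotypes(path):
    """Yield (chrom, pos, allele_left, allele_right) for each phased site.

    Assumes a single-sample VCF; unphased or missing calls are skipped.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # header and metadata lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 10:
                continue  # no FORMAT/sample columns
            chrom, pos, _id, ref, alt = fields[:5]
            keys = fields[8].split(":")
            values = fields[9].split(":")
            if "GT" not in keys:
                continue
            gt = values[keys.index("GT")]
            if "|" not in gt:
                continue  # unphased genotype such as 0/1
            left, right = gt.split("|", 1)
            if not (left.isdigit() and right.isdigit()):
                continue  # missing allele such as "."
            alleles = [ref] + alt.split(",")
            if int(left) >= len(alleles) or int(right) >= len(alleles):
                continue  # allele index not defined on this line
            yield chrom, int(pos), alleles[int(left)], alleles[int(right)]
```

Under the stated assumption, the third tuple element would correspond to the P1 haplotype and the fourth to P2.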
|
|
|
|
|