Results: 5

1.
Figure 3

Figure 3. Insertion allele frequency distribution. From: Characterization of Missing Human Genome Sequences and Copy-number Polymorphic Insertions.

The frequency of the insertion allele is shown for 189 loci that are fitted to distinct copy numbers and are consistent with a simple autosomal insertion-deletion variant. Values are shown for all 28 individuals (black bars) and separately for each HapMap population as indicated.

Jeffrey M. Kidd, et al. Nat Methods. ;7(5):365-371.
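
As an illustration of the quantity plotted in Figure 3, the insertion allele frequency at a simple autosomal insertion-deletion locus can be derived from integer copy-number calls. The sketch below assumes diploid individuals, so a copy number of 0, 1, or 2 equals the number of insertion alleles carried; the function name and the input layout are hypothetical and are not taken from the paper.

```python
def insertion_allele_frequency(copy_numbers):
    """Return the insertion allele frequency for one locus.

    copy_numbers: per-individual integer copy numbers (0, 1, or 2).
    Assumes a diploid autosomal locus, so each individual carries 2 alleles.
    """
    if not copy_numbers:
        raise ValueError("need at least one genotype")
    if any(cn not in (0, 1, 2) for cn in copy_numbers):
        raise ValueError("copy numbers must be 0, 1, or 2")
    # Insertion alleles observed divided by total alleles sampled.
    return sum(copy_numbers) / (2 * len(copy_numbers))


if __name__ == "__main__":
    # Hypothetical genotypes for four individuals.
    print(insertion_allele_frequency([0, 1, 2, 2]))
```

Per-population values, as in the figure, would follow from calling the function once for each population's subset of individuals.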
2.
Figure 4

Figure 4. Annotation of conserved and functional elements. From: Characterization of Missing Human Genome Sequences and Copy-number Polymorphic Insertions.

(a) The complete sequence of an OEA clone carrying 29 kbp of novel sequence is compared by miropeats to the reference genome. We identify a 95-bp conserved element within this sequence (green rectangles) as defined by a GERP analysis of 8 species (see Online Methods). A multiple sequence alignment of one of these conserved elements (black arrow) is highlighted. (b) A novel exon is predicted within the sequence of a 4.3-kbp insertion based on comparison with the PECAM1 transcript (NM_000442.3), as shown in blue. This alternate exon is supported by RNA-seq data and corresponds to a conserved element identified by alignment comparisons.

Jeffrey M. Kidd, et al. Nat Methods. ;7(5):365-371.
3.
Figure 1

Figure 1. Copy-number polymorphism of novel insertions. From: Characterization of Missing Human Genome Sequences and Copy-number Polymorphic Insertions.

ArrayCGH intensity data are displayed for novel sequences ordered along (a) chromosome 5 and (b) chromosome 14 based on anchored map locations (build35 coordinates, UCSC). Copy-number gains (orange) and losses (blue) are shown relative to the reference sample (NA15510). Each column in the heat map represents a probe on the array, and each row represents a sample ordered and separated (yellow lines) by corresponding HapMap population (CEU, CHB, JPT and YRI). The bottom row depicts a reference self-self hybridization as control. The red brackets group multiple contigs into loci that generally show a consistent hybridization pattern by arrayCGH.

Jeffrey M. Kidd, et al. Nat Methods. ;7(5):365-371.
4.
Figure 5

Figure 5. Genotyping sequenced variants through unique k-mer matches. From: Characterization of Missing Human Genome Sequences and Copy-number Polymorphic Insertions.

(a) Unique diagnostic k-mer sequences were identified for each variant using sequence-resolved breakpoints. For the deletion breakpoint, k-mers were required to have a single match to the reference genome and no matches to the fosmid sequences. For the insertion breakpoints, k-mers were required to have no matches to the genome and a single match to the fosmid. In order to be uniquely identifiable, a variant must have at least one deletion k-mer and at least one insertion k-mer that meet these criteria. (b) Effect of k-mer length and search stringency on the ability to uniquely identify a variant. 71% (108/152) of the sequenced sites are uniquely identifiable with the criteria of k=36 and one substitution, while 97% (147/152) are assayable if the k-mer length is increased to 100 bp. (c) A comparison of genotypes determined using arrayCGH and breakpoint k-mer matching is depicted for sample NA18507. The search database consists of unique 36-mers (one substitution). Genotypes for 54 variants were successfully determined by both arrayCGH and breakpoint k-mer matching. Partitioning the breakpoint scores into distinct genotypes at 0.5 and 1.5 (red lines) results in 94.3% genotype agreement between the two methods. (d) Effect of sequence coverage on breakpoint k-mer genotyping. The number of variants genotyped (at least one matching read, solid line, left axis) and the percent agreement with arrayCGH results (dashed line, right axis) are shown at various sequence coverage levels (1–42X).

Jeffrey M. Kidd, et al. Nat Methods. ;7(5):365-371.
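
The selection rule in panel (a) and the score partitioning in panel (c) can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses exact string matching only (the caption's search also allows one substitution), treats the reference and fosmid as short in-memory strings, and assumes the breakpoint score is already on a scale where the cut points 0.5 and 1.5 apply. The function names `kmers`, `diagnostic_kmers`, and `genotype_from_score` are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def diagnostic_kmers(breakpoint_seq, present_in, absent_from, k):
    """K-mers spanning a breakpoint that occur exactly once in `present_in`
    and never in `absent_from` (exact matches only; an assumption).

    Deletion breakpoint:  present_in=reference, absent_from=fosmid.
    Insertion breakpoint: present_in=fosmid,    absent_from=reference.
    """
    present_counts = Counter(kmers(present_in, k))
    absent_set = set(kmers(absent_from, k))
    return [
        km for km in kmers(breakpoint_seq, k)
        if present_counts[km] == 1 and km not in absent_set
    ]


def genotype_from_score(score, low=0.5, high=1.5):
    """Partition a breakpoint score into genotype 0, 1, or 2 at the cut points."""
    if score < low:
        return 0
    if score < high:
        return 1
    return 2


if __name__ == "__main__":
    # Toy sequences (hypothetical): the fosmid carries a TTT insertion.
    reference = "ACGTACGGTT"
    fosmid = "ACGTATTTCGGTT"
    print(diagnostic_kmers("GTACGG", reference, fosmid, 4))  # deletion k-mers
    print(diagnostic_kmers("GTATTT", fosmid, reference, 4))  # insertion k-mers
    print([genotype_from_score(s) for s in (0.2, 1.0, 1.9)])
```

Under the caption's criteria, a variant would count as uniquely identifiable only when both calls to `diagnostic_kmers` return at least one k-mer.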
5.
Figure 2

Figure 2. Sequencing and genotyping insertions. From: Characterization of Missing Human Genome Sequences and Copy-number Polymorphic Insertions.

(a) The complete sequence of a clone (AC205876) carrying a 4.8-kbp novel insertion sequence is compared to the corresponding segment from chromosome 20 using miropeats (black lines connect segments of matching sequence; colored arrows correspond to common repeats; green: LINEs; purple: SINEs; orange: LTR elements; pink: DNA elements). The magenta lines denote the insertion breakpoints. The brown boxes correspond to the mapped position of three assembled novel sequence contigs. (b) ArrayCGH hybridization results represented as a heat map suggest that the deletion is fixed in CEU and CHB populations. The brown-red lines correspond to the three sequence contigs depicted in part (a) and are represented by 16, 15, and 18 arrayCGH probes, respectively. The median log2 ratios (c) and single channel intensities (d) are shown for all probes matching AC205876. Note that the reference (blue bars) channel shows similar intensity across hybridizations. For this example the reference sample is inferred to have a copy number of 1. The signals form three distinct clusters that are assigned integer copy-number states of 0, 1, and 2. The dotted red, green, and blue lines correspond to the median intensities of each defined cluster. Using these genotypes an FST of 0.70 is calculated for this insertion. (e–h) A second example as described above depicting a 3.9-kbp insertion (AC216083) within the first intron of the LCT (lactase) gene (red boxes represent exons as indicated).

Jeffrey M. Kidd, et al. Nat Methods. ;7(5):365-371.
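
The caption reports an FST computed from copy-number genotypes but does not state which estimator was used. As a rough illustration only, the sketch below computes Wright's fixation index from expected heterozygosities, FST = (HT - HS) / HT, for a biallelic diploid locus, weighting populations by sample size. It is not presented as the authors' method; the function name and input layout are hypothetical.

```python
def wright_fst(populations):
    """Wright's FST = (HT - HS) / HT for one biallelic diploid locus.

    populations: list of lists of per-individual copy numbers (0, 1, or 2),
    one inner list per population. Populations are weighted by sample size
    (an assumption; other estimators weight and correct differently).
    """
    sizes = [len(pop) for pop in populations]
    if not sizes or any(n == 0 for n in sizes):
        raise ValueError("every population needs at least one genotype")
    total = sum(sizes)
    # Insertion allele frequency within each population.
    freqs = [sum(pop) / (2 * n) for pop, n in zip(populations, sizes)]
    # Mean expected heterozygosity within populations.
    h_s = sum(n * 2 * p * (1 - p) for p, n in zip(freqs, sizes)) / total
    # Expected heterozygosity of the pooled sample.
    p_bar = sum(n * p for p, n in zip(freqs, sizes)) / total
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        raise ValueError("locus is monomorphic; FST is undefined")
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical genotypes: insertion absent in one population, fixed in the other.
    print(wright_fst([[0, 0], [2, 2]]))
```

Values near 1 indicate that the populations carry different alleles almost exclusively, while values near 0 indicate similar allele frequencies across populations.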