Figure 2. Clustering of Lineage-Specific Gene Copy Number Variations, Segmental Duplications, and Sequence Gaps
BLAT analysis results for the HLS, LS, and OR_CASE gene sets were plotted along each chromosome (Build 35) using a modified version of the Genotator annotation browser [77]. HLS and LS refer to genes identified by Fortna et al. [34] that showed aCGH-predicted gene copy number changes specific to the human lineage (HLS) or to one or more great ape lineages (LS), respectively. OR_CASE refers to genes for which the aCGH-predicted copy number in human differs from that in one or more great ape lineages. All available ESTs were downloaded from GenBank for each IMAGE clone in the HLS, LS, and OR_CASE datasets. These ESTs were then aligned to the human genome (Build 35) using a locally installed version of BLAT. All BLAT hits with a score greater than 200 and a percent identity greater than 90% were kept for further analysis. The retained hits were then reduced to one hit per gene so that isoforms would not produce multiple hits for the same gene. The LS dataset was split into subgroups to indicate orangutan, gorilla, and bonobo plus chimpanzee copy number differences. For each LS subgroup, all differences (gains and losses combined) are plotted, as are copy number gains alone (indicated by a “+”) and copy number losses alone (indicated by a “−”). In addition, the WSSD and SDD annotations [23] were downloaded from UCSC (http://genome.ucsc.edu/) and plotted to illustrate the locations of recent (<40 Mya) segmental duplications in the human genome. Also included are annotations of the known sequence gaps and an ideogram showing the location of the centromere (red) and the Giemsa staining patterns. Data for Chromosomes 1, 9, and 16 are shown. Data for all chromosomes can be found in Supplementary Figure S1.
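The hit-filtering step described in the caption (score greater than 200, percent identity greater than 90%, one hit per gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `BlatHit` record, its field names, and the `filter_hits` function are hypothetical, and the caption does not state how the single hit per gene was chosen, so keeping the highest-scoring hit is an assumption made here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlatHit:
    """Hypothetical record for one EST-to-genome BLAT hit."""
    gene: str            # gene the IMAGE clone / EST belongs to
    est: str             # EST accession (illustrative values only)
    chrom: str           # chromosome of the hit on Build 35
    start: int           # start coordinate of the hit
    score: int           # BLAT score
    pct_identity: float  # percent identity of the alignment


def filter_hits(hits, min_score=200, min_identity=90.0):
    """Keep hits above both thresholds, then one hit per gene.

    Assumption: when several hits pass for the same gene, the
    highest-scoring one is kept. The caption does not specify the rule.
    """
    best = {}
    for hit in hits:
        # Both thresholds are strict ("greater than") as in the caption.
        if hit.score <= min_score or hit.pct_identity <= min_identity:
            continue
        current = best.get(hit.gene)
        if current is None or hit.score > current.score:
            best[hit.gene] = hit
    # Order by chromosome name and start position for plotting.
    return sorted(best.values(), key=lambda h: (h.chrom, h.start))


# Made-up example data, not taken from the study.
example_hits = [
    BlatHit("GENE_A", "est1", "chr1", 100, 250, 95.0),
    BlatHit("GENE_A", "est2", "chr1", 5000, 400, 98.0),   # better hit, same gene
    BlatHit("GENE_B", "est3", "chr9", 200, 150, 99.0),    # score too low
    BlatHit("GENE_C", "est4", "chr16", 300, 300, 90.0),   # identity not > 90
]
kept = filter_hits(example_hits)
```

With the made-up data above, only the second `GENE_A` hit survives: the `GENE_B` hit fails the score threshold and the `GENE_C` hit fails the strict identity threshold.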