Am J Hum Genet. Aug 2005; 77(2): 252–264.
Published online Jun 28, 2005. doi:  10.1086/432588
PMCID: PMC1224528

Identification of Risk and Age-at-Onset Genes on Chromosome 1p in Parkinson Disease

Abstract

We previously reported a linkage region on chromosome 1p (LOD = 3.41) for genes controlling age at onset (AAO) in Parkinson disease (PD). This region overlaps with the previously reported PARK10 locus. To identify the gene(s) associated with AAO and risk of PD in this region, we first applied a genomic convergence approach that combined gene expression and linkage data. No significant results were found. Second, we performed association mapping across a 19.2-Mb region centered under the AAO linkage peak. We used an iterative association mapping approach, initially genotyping single-nucleotide polymorphisms at an average spacing of 100 kb and then increasing the density of markers as needed. Using the overall data set of 267 multiplex families, we identified six associated genes in the region, but further screening of a subset of 83 families linked to the chromosome 1 locus identified only two genes significantly associated with AAO in PD: the γ subunit of the translation initiation factor EIF2B gene (EIF2B3), which was more significant in the linked subset, and the ubiquitin-specific protease 24 gene (USP24). Unexpectedly, the human immunodeficiency virus enhancer-binding protein 3 gene (HIVEP3) was found to be associated with risk for PD. We used several criteria to define significant results in the presence of multiple testing, including criteria derived from a novel cluster approach. The known or putative functions of these genes fit well with the current suspected pathogenic mechanisms of PD, and these genes thus show great potential as candidates for the PARK10 locus.

Introduction

Parkinson disease (PD [MIM 168600]) is the second most common neurodegenerative disorder and is characterized clinically by rigidity, bradykinesia, and resting tremor. PD has been shown to have a genetic component contributing to the disorder, including mutations in several genes (e.g., α-synuclein, Parkin, UCHL1, LRRK2, and PINK1) that lead to rare Mendelian forms of the disease (Polymeropoulos et al. 1997; Kitada et al. 1998; Leroy et al. 1998; Paisan-Ruiz et al. 2004; Valente et al. 2004).

The age at which an individual first manifests symptoms of the disease (i.e., age at onset [AAO]) also appears to be genetically controlled (Destefano et al. 2002; Li et al. 2002), but only a few genes (e.g., APOE, GSTO1/2, and IDE) that affect AAO have been identified (Karamohamed et al. 2003; Li et al. 2003, 2004; Blomqvist et al. 2004). We previously conducted a genomic screen to identify chromosomal regions harboring genes influencing AAO in PD and found significant evidence of linkage on chromosome 1p (LOD = 3.41) (Li et al. 2002). Interestingly, this AAO linkage peak is essentially identical to a risk linkage peak for PD (at locus PARK10) reported by Hicks et al. (2002) in their study of an Icelandic population (fig. 1), which provides strong evidence that there is a major PD gene in this region.

Figure  1
Linkage curves. The lower line depicts a chromosome 1 AAO linkage curve (Li et al. 2002), and the upper line shows the overlapping PD risk linkage peak derived from the Icelandic study (Hicks et al. 2002). The vertical lines indicate the 1-LOD score region ...

Several approaches have been undertaken to identify susceptibility genes located in linkage regions. One is the candidate-gene approach. However, because of our limited understanding of the biological systems involved in PD, this approach has not provided much insight into which genes to select. We proposed a second approach, which we termed “genomic convergence,” whereby genes that lie under a linkage peak and are differentially expressed between cases and controls in a relevant tissue (e.g., substantia nigra of the midbrain for PD) are tested for association (Hauser et al. 2003). This approach was successful in identifying the glutathione S-transferase omega-1 and -2 complex (GSTO1/2), located in the chromosome 10q linkage region for AAO, as having an influence on AAO in Alzheimer disease and PD (Li et al. 2003). The possible role of GSTO1/2 in modifying the AAO of Alzheimer disease has been recently corroborated by others (Kolsch et al. 2004). A third approach relies on genotyping a dense set of SNPs across a linkage region and subsequently testing those SNPs for association with the trait of interest. However, this process can become extremely costly for a large number of markers and samples.

In this article, we present the application of the genomic convergence approach combined with a process we term “iterative association mapping” to screen a dense map of SNPs in the region that were 1 LOD score down from the peak (hereafter, the 1-LOD score region) of the chromosome 1p linkage peak. In this region, there are 199 Ensembl genes (National Center for Biotechnology Information [NCBI] build 35) and 4,924 SNPs with a minor-allele frequency (MAF) >10% in the white population. Our proposed approach allows us to extract valuable information on the basis of reasonable genotyping efforts. Using this approach, we have identified several genes that show association with AAO and, surprisingly, one gene that shows association with risk for PD.

Material and Methods

Patients and Families

Affected individuals and family members were collected by the Morris K. Udall Parkinson Disease Research Center of Excellence (PDRCE) located within the Duke Center for Human Genetics (DCHG) and by the 13 centers of the Parkinson Disease Genetics Collaboration (Scott et al. 2001). A standard clinical evaluation involves neurological examination, including evaluation with the Unified Parkinson’s Disease Rating Scale (Goetz et al. 2003). A rigorous clinical assessment was performed by all participating clinicians to provide a clear diagnosis of PD and to exclude any individuals who displayed atypical features of parkinsonism (Hubble et al. 1999; Scott et al. 2001). Individuals categorized as “affected” showed at least two of the cardinal signs of PD (resting tremor, bradykinesia, and rigidity). AAO for affected individuals was defined as the age at which an affected individual first noticed one of the cardinal signs of PD. Participants categorized as “unaffected” demonstrated no signs of the disease, and participants categorized as “unclear” showed only one cardinal sign and/or atypical features. All participants signed informed consent forms prior to blood and data collection. Institutional review boards at each participating center approved the study protocols and consent forms.

The data set consists of multiplex (n=267) and singleton (n=361) white families. We defined singleton and multiplex families on the basis of the total number of parent-child triads and discordant sib pairs (DSPs) in the family that can contribute to the association test. Singleton families have only one group (either a triad or a DSP) contributing to the association test—that is, they have one affected individual—and either the parent (affected or unaffected) or the unaffected sibling is sampled in addition to the affected individual. Multiplex families have at least two groups (triads or DSPs) contributing to the association test—that is, they have at least two affected siblings sampled in the family. Families with Parkin mutation carriers were excluded from this study. The multiplex data set includes 609 affected individuals (average AAO ± SD = 61.0 ± 11.6 years; range 14–90 years; 58.8% males) and 666 unaffected individuals (42.8% males). The singleton families include 391 affected individuals (average AAO ± SD = 55.5 ± 13.0 years; range 15–85 years; 69% males) and 356 unaffected individuals (42.7% males).

DNA Extraction and Genotyping

DNA samples were prepared and stored by the DCHG DNA bank core. Genomic DNA was extracted from whole blood by use of the PureGene system (Gentra Systems Autopure LS). A total of 284 SNPs (table 1) were genotyped with the use of Applied Biosystems (ABI) Assays-on-Demand (AoDs) or Assays-by-Design (AbDs) or with the use of primers and probes designed using ABI Primer Express 2.0 software. The SNPs were chosen first on the basis of their location (e.g., SNPs were initially chosen with an average distance of 100 kb between them) and then on the basis of frequency to capture a wide range of frequencies among all selected SNPs. The TaqMan allelic discrimination assay was used to genotype all SNPs. PCR amplification was performed in 5-μl reactions (comprising 2.6 ng of dried DNA, 1× TaqMan universal PCR master mix [ABI], and 1× genotyping mix for AoDs and AbDs or, for self-designed assays, 900 nM of each primer and 200 nM of each probe). PCR was performed using the GeneAmp PCR system 9700 thermocycler (ABI), with a 40-cycle program (1 cycle at 95°C for 10 min, followed by 40 cycles at 95°C for 15 s and at Tm for 1 min, where Tm is 60°C for AoDs and AbDs and ranges from 58°C to 64°C for self-designed assays). The fluorescence generated during the PCR amplification was read using the ABI Prism 7900HT sequence detection system and was analyzed with SDS software (ABI).

Table 1
SNPs Analyzed[Note]

Stringent quality-control measures were taken to ensure data consistency. Internal controls consisted of 24 duplicated individuals per 384-well plate. In addition, two samples from CEPH were plated eight times per plate to assure plate-to-plate consistency. All genotypers were blinded to these internal controls. Quality-control samples were compared at the DCHG Data Coordinating Center. Data were stored and managed by the PEDIGENE system (Haynes et al. 1995). To pass quality control, genotyping plates must have retained a 100% match for quality-control samples and must have at least 95% overall efficiency.

Candidate Genes Derived from the Genomic Convergence Approach

Two independent gene expression studies on human midbrain tissues from patients with PD and normal controls, by use of microarray and serial analysis of gene expression (SAGE) technologies, were conducted as part of current projects of the Duke PDRCE (Hauser et al. 2003; Noureddine et al. 2005). By combining these two studies, we found six genes that were significantly differentially expressed between patients with PD and controls and that mapped to the chromosome 1p AAO linkage region (table 2). In the present study, we tested SNPs in these six genes for association with risk and AAO in PD.

Table 2
Genes that Map to the Chromosome 1p AAO Linkage Peak, Differentially Expressed in Patients with PD versus Controls in Microarray and SAGE Experiments

Iterative Association Mapping

We developed a second approach, iterative association mapping, to identify candidate genes in a linkage region. The overall concept is to reduce the number of SNPs genotyped while maximizing the chance of discovering a significant association. SNPs are first chosen at 100-kb intervals and then tested for association with traits of interest, which, in this case, are risk and AAO in PD. If no significant association is detected, the marker-to-marker distance is decreased by one-half each time (to 50 kb, 25 kb, etc.), until a significant association result is found. When a significant association is detected, additional SNPs in the surrounding region are then tested on the basis of known linkage disequilibrium (LD) patterns or physical iteration in the surrounding region of the associated SNP if no previous LD patterns are available.
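The halving scheme described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the pipeline used in the study: `test_association` is a hypothetical callback standing in for whatever association test is run on the markers at the chosen positions, and `min_spacing_bp` is an assumed stopping point that the text does not specify.

```python
from typing import Callable, List, Tuple


def marker_positions(start_bp: int, end_bp: int, spacing_bp: int) -> List[int]:
    """Evenly spaced target positions from start_bp through end_bp."""
    return list(range(start_bp, end_bp + 1, spacing_bp))


def iterative_mapping(
    start_bp: int,
    end_bp: int,
    test_association: Callable[[List[int]], List[int]],  # hypothetical callback
    initial_spacing_bp: int = 100_000,
    min_spacing_bp: int = 12_500,  # assumed stopping point, not from the paper
) -> Tuple[int, List[int]]:
    """Halve the marker spacing until the test reports a significant position.

    Returns the last spacing tried and the significant positions found at
    that spacing (an empty list if nothing was found).
    """
    spacing = initial_spacing_bp
    tried = spacing
    while spacing >= min_spacing_bp:
        tried = spacing
        hits = test_association(marker_positions(start_bp, end_bp, spacing))
        if hits:
            return spacing, hits
        spacing //= 2  # 100 kb -> 50 kb -> 25 kb -> ...
    return tried, []


# Toy stand-in for a real association test: a single position is "associated".
def toy_test(positions: List[int]) -> List[int]:
    return [p for p in positions if p == 40_450_000]


result = iterative_mapping(40_400_000, 40_600_000, toy_test)
```

In the toy run, the associated position is missed by the 100-kb grid and is picked up once the spacing is halved. The follow-up step, in which SNPs around a hit are added according to LD patterns, is not modeled here.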

Statistical Analyses

All SNPs were tested for Hardy-Weinberg equilibrium (HWE) and LD in the affected group (comprising one affected individual from each family) and the unaffected group (comprising one unaffected individual from each family). An exact test implemented in the Genetic Data Analysis program was used to test HWE, for which 3,200 permutations were performed to estimate the empirical P value for each marker (Zaykin et al. 1995). The Graphical Overview of Linkage Disequilibrium (GOLD) package was used to calculate LD (as measured by the Pearson correlation coefficient, r2, and Lewontin’s standardized disequilibrium coefficient, D′) between pairs of SNPs (Abecasis and Cookson 2000). Both r2 and D′ range from 0 (no LD) to 1 (perfect LD). However, there is no clear definition to use for interpretation of intermediate LD values. Here, we chose an arbitrary cutoff and considered two markers to be in strong LD if r2>0.60 or D′>0.90.
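For readers less familiar with the two LD measures, the following Python sketch computes them for a pair of biallelic SNPs from the textbook definitions. It assumes that haplotype and allele frequencies are already known, whereas a package such as GOLD must estimate them from genotype data; the function names and the example frequencies are illustrative only.

```python
from typing import Tuple


def ld_measures(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def strong_ld(d_prime: float, r2: float) -> bool:
    """Cutoff used in this study: r2 > 0.60 or D' > 0.90."""
    return r2 > 0.60 or d_prime > 0.90


# Two common SNPs whose alleles always travel together.
common = ld_measures(p_ab=0.5, p_a=0.5, p_b=0.5)
# A rare allele (5%) that always sits on the same background as a common allele.
rare = ld_measures(p_ab=0.05, p_a=0.05, p_b=0.5)
```

The second example shows why the two measures can disagree: a rare allele confined to one background gives D′ = 1 but a small r2, so the pair counts as being in strong LD under the D′ criterion alone.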

AAO was treated as a quantitative trait. We used both the orthogonal model (OM) (Abecasis et al. 2000) and the Monks-Kaplan method (MKM) (Monks and Kaplan 2000) implemented in the Linkage Disequilibrium Analyses for Quantitative and Discrete Traits (QTDT) program (see Web Resources) to test the association between markers and AAO. The MKM not only provides an association signal but also detects the direction of association—that is, positive association for allele A is declared when the majority of allele A carriers have an AAO higher than the average AAO. In addition to obtaining nominal P values, we also performed 10,000 permutation tests to obtain an empirical P value for each marker on the basis of the MKM. The global significance level was derived from permutation tests.
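The idea behind an empirical P value can be illustrated with a generic permutation test in Python. This is a deliberately simplified sketch: it treats individuals as unrelated and uses a difference in mean AAO between carriers and noncarriers as the statistic, whereas the permutation procedure in QTDT works within families on the MKM statistic. The function names, the add-one estimator, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean
from typing import List, Sequence


def mean_difference(carrier: Sequence[int], trait: Sequence[float]) -> float:
    """Absolute difference in mean trait between carriers (1) and noncarriers (0)."""
    with_allele = [t for c, t in zip(carrier, trait) if c == 1]
    without_allele = [t for c, t in zip(carrier, trait) if c == 0]
    return abs(mean(with_allele) - mean(without_allele))


def empirical_p(
    carrier: Sequence[int],
    trait: Sequence[float],
    n_perm: int = 10_000,
    seed: int = 0,
) -> float:
    """Share of trait shuffles whose statistic is at least as large as observed."""
    rng = random.Random(seed)
    observed = mean_difference(carrier, trait)
    shuffled: List[float] = list(trait)
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between genotype and trait
        if mean_difference(carrier, shuffled) >= observed:
            as_extreme += 1
    # Add-one estimator (an assumed convention) so the result is never exactly 0.
    return (as_extreme + 1) / (n_perm + 1)


# Toy data: carriers have a clearly later onset than noncarriers.
carrier = [0] * 10 + [1] * 10
aao = [50.0 + i for i in range(10)] + [70.0 + i for i in range(10)]
p_toy = empirical_p(carrier, aao, n_perm=2_000)
```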

We performed haplotype analysis for genes with significant markers. Prior to the haplotype analysis, we identified tagging SNPs (tagSNPs) for each gene by using the ldSelect program (see Web Resources) (Carlson et al. 2004). The ldSelect program generates groups of markers in LD on the basis of a given threshold for r2. These groups are referred to as “LD bins.” A tagSNP is then selected from each LD bin. To perform the haplotype association analysis for AAO on the tagSNPs, we first used the FBAT –o option (Laird et al. 2000) to estimate the optimal offset of the AAO for each tagSNP. We then performed the HBAT –e option (Horvath et al. 2004) on the adjusted AAO data (subtracting from AAO the average optimal offset estimate), to test the association between haplotypes and AAO. When the number of tagSNPs is large, the computational time is substantial and the haplotype frequencies tend to be small, which makes interpretation difficult, even if significant P values are found. Therefore, we limited our haplotype computations to five tagSNPs. For genes with more than five tagSNPs, we analyzed all possible combinations of five tagSNPs.
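The binning and subset steps can be sketched as follows. The greedy routine is a simplified stand-in for ldSelect, not a reimplementation of it: it repeatedly picks the SNP with the most partners at or above an r2 threshold and bins them together. The threshold default of 0.6, the dictionary layout for pairwise r2, and the toy values are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Tuple


def greedy_ld_bins(
    snps: Sequence[int],
    r2: Dict[FrozenSet[int], float],
    threshold: float = 0.6,  # assumed value; the text only says "a given threshold"
) -> List[Tuple[int, List[int]]]:
    """Group SNPs into LD bins; return (tagSNP, bin members) for each bin."""
    remaining = list(snps)
    bins = []
    while remaining:
        def partners(s: int) -> List[int]:
            return [t for t in remaining
                    if t != s and r2.get(frozenset((s, t)), 0.0) >= threshold]

        # The SNP that tags the most remaining SNPs becomes the next tagSNP.
        tag = max(remaining, key=lambda s: len(partners(s)))
        members = [tag] + partners(tag)
        bins.append((tag, members))
        remaining = [s for s in remaining if s not in members]
    return bins


def five_tag_subsets(tag_snps: Sequence[int], k: int = 5) -> List[Tuple[int, ...]]:
    """All k-SNP subsets to analyze when a gene has more than k tagSNPs."""
    return list(combinations(tag_snps, k))


# Toy pairwise r2 values for four hypothetical SNPs.
toy_r2 = {frozenset((1, 2)): 0.9, frozenset((2, 3)): 0.8, frozenset((1, 3)): 0.3}
toy_bins = greedy_ld_bins([1, 2, 3, 4], toy_r2)

# A gene with eight tagSNPs yields 56 five-SNP subsets to test.
subsets = five_tag_subsets([13, 14, 15, 16, 17, 19, 20, 21])
```

The subset count grows quickly with the number of tagSNPs, which is one practical reason for capping each haplotype analysis at five markers.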

The pedigree disequilibrium test (PDT) (Martin et al. 2000, 2003) was used to determine the association between markers and PD risk. Two PDT statistics were used: the PDT-sum statistic, for allelic effects, and the GenoPDT statistic, for genotypic effects. We also performed haplotype analysis on the risk genes detected by the PDT. The approach of selecting tagSNPs was as described above. We used the HBAT –e option to test the haplotype association between a set of tagSNPs and PD.

Several criteria were used to determine the final levels of significance in the presence of multiple comparisons. First, a significance level of P≤.05 was used for evaluating the initial set of markers with 100-kb spacing. Second, a cluster approach (described below) was used to generate a significance level for further iterations. This approach requires that two or more markers, which have an r2 correlation <0.6, be significant within a cluster of SNPs. Finally, at least one marker in the candidate gene or region needs to meet the global significance level derived from the permutation test.

Assume a total of N markers with low LD (r2<0.6) across the region of interest and x markers located in each cluster, which leads to y clusters (y=N/x). We hypothesized that a cluster would be significant only if two markers within the cluster are significant. We can formulate the probability (αc) that 1 of the y clusters is significant as a function of the probability that a marker is significant (where α is the significance level of a marker):

[Equation not reproduced: αc expressed as a function of α, x, and y.]

By restricting the significance level of a cluster to be αc, we can compute the probability that a marker is significant. In other words, the probability that two markers within a cluster are significant at the level of α will result in the probability αc that one cluster is significant. Clearly, α decreases when the number of significant markers within a cluster decreases or when αc, the significance level of a cluster, decreases. The calculation of the global significance level is described above.
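The formula itself is not reproduced in this text, so the following Python sketch gives one reading of the verbal definition rather than the published equation. It assumes that markers behave independently under the null hypothesis, that a cluster counts as significant when at least two of its x markers are significant at level α, and that αc is the chance that at least one of the y clusters is significant. The target αc = .05 is also an assumption.

```python
def cluster_level(alpha: float, x: int, y: int) -> float:
    """alpha_c: chance that at least one of y clusters has >= 2 of x markers significant."""
    p_fewer_than_two = (1 - alpha) ** x + x * alpha * (1 - alpha) ** (x - 1)
    p_cluster = 1 - p_fewer_than_two
    return 1 - (1 - p_cluster) ** y


def marker_level(alpha_c: float, x: int, y: int, tol: float = 1e-12) -> float:
    """Solve cluster_level(alpha, x, y) == alpha_c for alpha by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cluster_level(mid, x, y) < alpha_c:
            lo = mid  # too stringent: the per-marker level can be relaxed
        else:
            hi = mid
    return (lo + hi) / 2


# N = 210 low-LD markers in clusters of x = 7 gives y = 30 clusters.
alpha = marker_level(alpha_c=0.05, x=7, y=210 // 7)
```

Under these assumptions, the per-marker level for 210 markers in clusters of 7 comes out close to .01. Requiring only one significant marker per cluster would force a much smaller α, which matches the qualitative behavior described above.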

The multiplex families used in this study include 167 families that were studied in the previous AAO linkage study (hereafter called “the linkage data set”) (Li et al. 2002). We performed SOLAR (Almasy and Blangero 1998) PEDLOD analysis with our previously found chromosome 1 peak marker (D1S2134), to obtain family-specific LOD scores for the 167 families. We then stratified the linkage data set into positive- and negative-linkage subsets on the basis of the family-specific LOD scores. The genes significantly associated with AAO in the overall data set were also tested for association with AAO by use of the MKM in the positive- and negative-linkage subsets. We did not use the OM, because it requires a normal distribution for the quantitative trait of interest, which is a problem for these small stratified data sets.

mRNA Analysis of USP24

Total RNA was isolated from human midbrain tissue and was reverse transcribed using poly-dT primers to generate a cDNA library. Primers used to amplify fragments of the USP24 transcript were designed using Primer3 (see Web Resources). (The sequences are available on request.) We generated several PCR products of the expected size from the cDNA library and sequenced them. The exon-intron structure of the complete USP24 transcript was deduced from genomic alignment of the overlapping RT-PCR fragments.

Results

Identification of the Linkage Subsets of Families

The SOLAR PEDLOD analysis of D1S2134 identified 83 families with positive LOD scores (i.e., with positive linkage) and 84 with negative LOD scores (i.e., with negative linkage) from the linkage data set (Li et al. 2002). Throughout the present study, we performed association analyses with the overall PD data set as well as with these two stratified linkage subsets.

Genomic Convergence

We identified two differentially expressed genes from a previous microarray study (Hauser et al. 2005) and four from a SAGE study (Noureddine et al. 2005) that mapped to our chromosome 1p AAO linkage region (table 2). We analyzed SNPs (table 1) in each of these six genes, using the PD multiplex data set. The LD patterns of these six genes (pairwise r2 values) are shown in tables 3–8.

Table 3
Pairwise Pearson Correlation Coefficient (r2) for ATP6V0B[Note]
Table 4
Pairwise Pearson Correlation Coefficient (r2) for UQCRH[Note]
Table 5
Pairwise Pearson Correlation Coefficient (r2) for RNF11[Note]
Table 6
Pairwise Pearson Correlation Coefficient (r2) for C1orf8[Note]
Table 7
Pairwise Pearson Correlation Coefficient (r2) for TTC4[Note]
Table 8
Pairwise Pearson Correlation Coefficient (r2) for PPAP2B[Note]

The exclusion of a gene as a candidate from an association study is not always straightforward. The degree of confidence with which one excludes a gene from association is based on the depth of the search. One measure is at the level of LD defined by the current HapMap data set. Because we began genotyping our data set prior to the availability of the HapMap data set, and because we genotyped as many SNPs with as wide a variety of frequencies as possible from what was available in public (NCBI) and private (ABI) databases, some of our markers are not in the HapMap data set. To evaluate whether we have sufficiently covered each gene, we compared our SNP coverage of each gene with that available in the current HapMap data. The number of LD bins identified on the basis of HapMap SNPs with an MAF >10% is as follows: one LD bin each for ATP6V0B, UQCRH, and C1orf8; two for TTC4; three for RNF11; and 12 for PPAP2B. Overall, our SNPs included the HapMap tagSNPs in all genes except RNF11 and PPAP2B; we missed one HapMap tagSNP in RNF11 and covered only two HapMap tagSNPs (of seven genotyped SNPs) in PPAP2B.

None of these genes show significant association with PD risk, and only SNP 193 in C1orf8 showed significant association with AAO in PD (fig. 2). The association of SNP 193 was not verified in the positive-linkage subset.

Figure  2
Results of single-locus association tests of AAO in PD. Two methods were used to assess association with AAO in the overall PD data set: the MKM (triangles) and the OM (diamonds). The SNP numbers of the significant polymorphisms (i.e., those with P≤.01) ...

ELAVL4

The embryonic-lethal, abnormal vision, Drosophila-like 4 gene (ELAVL4) encodes for a neuron-specific RNA-binding protein. This gene was studied as a biological candidate gene through an ongoing project of the Duke PDRCE (Antic and Keene 1997). Two polymorphisms (SNPs 136 and 143) were previously found to be significantly associated with AAO in PD (Noureddine et al. 2005). However, these markers were not found to reach significant P values in the positive-linkage subset in the present study.

Iterative Association Mapping and LD

The initial association map consisted of 200 SNPs (1 SNP genotyped, on average, every 100 kb) in the 1-LOD score region (40.4–59.2 Mb on NCBI build 34) (fig. 1). With additional genotyping in the regions of interest, the average SNP density in our final association map was 1 marker every 66 kb, with a total of 284 SNPs genotyped. The MAFs of the SNPs varied from 0.03 to 0.50 (median and average of 0.29). All but 20 (7%) of the SNPs were in HWE in both the affected and unaffected samples at a P=.05 level (table 1). The genotype distributions of these 20 SNPs were reexamined by a technician in the laboratory and were tested for HWE again. The results remained the same. Given a 5% random chance of obtaining markers not in HWE, the 7% frequency detected in our project is within a reasonable range. Furthermore, it is important to note that the MKM and the PDT do not require HWE.
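The argument that 20 deviating SNPs out of 284 is within a reasonable range can be checked roughly with a binomial tail probability. The Python sketch below assumes a single test per SNP with a 5% false-positive rate and independence between SNPs. Both assumptions are simplifications, since each SNP was tested in two groups and neighboring SNPs are correlated.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_snps, n_deviating, level = 284, 20, 0.05

# Number of SNPs expected to fail the HWE test by chance alone.
expected = n_snps * level

# Chance of seeing 20 or more failures if every SNP is truly in HWE.
tail = binomial_upper_tail(n_deviating, n_snps, level)
```

Under these assumptions, about 14 SNPs are expected to deviate by chance, and the tail probability indicates how unusual 20 or more deviations would be.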

Figure 3 depicts the overall LD structure in this region (for SNPs that did not deviate from HWE, since LD calculations assume equilibrium) in the unaffected group. A similar LD pattern was observed in the affected group. LD is mostly restricted to intragenic areas, with no extensive LD for long stretches of DNA, or across distant loci for the majority of polymorphisms. Only SNPs with a low MAF (recent SNPs) show high values of D′ with most neighboring SNPs.

Figure  3
LD plot. The pairwise LD (as measured by the Pearson correlation coefficient, r2, and Lewontin’s standardized disequilibrium coefficient, D′) in the group without PD, between all 264 markers in HWE, is depicted. The top-right triangle ...

To obtain a P value for the cluster analysis, 210 markers were identified whose r2 was <0.6 for LD. Using these 210 markers and assuming 7 markers within each cluster, we derived a significance level of .01 for each marker. In addition, we obtained a global significance level of .001. Figures 2 and 4 depict the nominal P values of all 284 markers in −log10 scale. Among the first 200 SNPs studied (from the 100-kb map), evidence for association with AAO was found by either the OM or the MKM in the genes for the eukaryotic translation initiation factor EIF2B3 (for SNP 63, P=.009 by the OM and .0004 by the MKM), the testis-specific protein kinase 2 (TESK2 [for SNP 76, P=.008 by the MKM]), hypothetical protein FLJ14442 (for SNP 117, P=.01 by the MKM), and the ubiquitin-specific protease 24 (USP24 [for SNP 220, P=.004 by the OM]) (figs. 2 and 5). All these markers have empirical P values by the permutation test that are slightly lower than the nominal P values. For example, the empirical P value for SNP 63 in EIF2B3 was .0002. Evidence of association with risk for PD by use of the multiplex data set was found only in the HIV type 1 enhancer-binding protein 3 gene (HIVEP3), for SNPs 13 (P=.008) and 19 (P=.004) (fig. 4). We proceeded to increase the SNP density in these genes.

Figure  4
Results of single-locus association tests of risk for PD. The allelic association with PD was tested using the PDT (squares), and the genotypic association was tested using the GenoPDT (circles) with the overall PD data set. The SNP numbers of the ...
Figure  5
Characterization of USP24L, with mRNA and predicted protein sequence of the USP24L transcript. The protein sequence in bold corresponds to the overlap with ...

TESK2 and FLJ14442

Additional SNPs (SNPs 72, 74, and 75 in TESK2 and SNPs 116, 118, 120, 122, and 124 in FLJ14442) were genotyped, to a final average density of 1 marker per 29 kb for TESK2 and 1 marker per 51 kb for FLJ14442. Although we detected two sets of cluster markers for AAO association, no markers were significant after correction for multiple testing, nor did they show evidence of association in the positive-linkage subset.

EIF2B3

Ten additional SNPs (SNPs 57–62 and 64–67) were genotyped in the EIF2B3 gene (size 136 kb), leading to a final average density of 1 marker per 12 kb (fig. 2). Interestingly, several markers that were close to significance in the overall data set became significantly associated with AAO in the positive-linkage subset (table 9), despite the fact that the subset is only one-third (83 families) of the total sample. Therefore, at least two clusters of markers in low LD (r2<0.6) (SNPs 59–61 and 62–64) are strongly associated with AAO in this gene. More interestingly, SNPs 62–64 are still significant after correction for multiple testing (P<.001).

Table 9
Summary of P Values Obtained by the OM and MKM for SNPs in EIF2B3 and USP24, in the Overall, Positive-Linkage, and Negative-Linkage Data Sets[Note]

Five tagSNPs (SNPs 59, 60, and 64–66) were found in EIF2B3. Haplotype analysis with these five tagSNPs and the use of the overall PD data set produced two haplotypes significantly associated with AAO: C-C-G-T-G (haplotype frequency 17.2%; P=.002) and A-C-A-T-G (haplotype frequency 15.2%; P=.002) (table 10). These two haplotypes showed P values comparable to what we detected for SNP 64 alone (P=.01 by the OM and .0001 by the MKM).

Table 10
Summary of Haplotypes Showing Significant Association with AAO in the Overall PD Data Set[Note]

USP24 and AK127075

In total, we genotyped 14 SNPs (SNPs 218–231) with ~17-kb spacing in the region from USP24 to the cDNA FLJ45132 clone BRAWH3037979 (GenBank accession number AK127075), a region in which seven SNPs (SNPs 220–222, 224, 227, 230, and 231) are significantly associated with AAO (P<.01) (fig. 2). The most significant marker was SNP 227, with P=.0006 by the OM and .007 by the MKM.

In silico, several lines of evidence suggested that the annotated USP24 gene in NCBI build 34 (as defined by the mRNA for KIAA1057 protein [GenBank accession number AB028980]) may actually be a truncated version of the full-length USP24 transcript. The 5′ end of the AB028980 transcript (exons 1–11) matches the 3′ end of the AK127075 mRNA (exons 25–35), and the human THC1877380 transcript from the TIGR Human Gene Index overlaps both genes. GenScan predicts the existence of the NT_032977.390 mRNA (composed of the AB028980 and AK127075 mRNAs and 12 additional exons at the 5′ end), and there is a cluster of human overlapping spliced ESTs (e.g., GenBank accession numbers BM458550, AW853346, and CD687922) that support the existence of a longer USP24 transcript. Furthermore, the mouse AK045043 significantly overlaps with this cluster of ESTs but has two additional distant exons at the 5′ end. The putative first exon is supported by the FirstEF program prediction, contains an ATG start codon with sequences conforming to a Kozak consensus ([A/G]CC ATG G), has a nearby CpG island, and is close to predicted promoter sequences; all of which strongly reinforce the idea that it encodes the first exon of the larger USP24 ORF. This gene produces a predicted mRNA of ~8 kb.

To evaluate the existence of this larger USP24 transcript, termed “USP24L,” we used strategically positioned primers to amplify overlapping transcript fragments from a human midbrain cDNA library. We obtained RT-PCR products of the expected sizes, and direct sequencing of these products confirmed the existence of USP24L. Using the BLAST tool implemented at the University of California–Santa Cruz Web site, we aligned the experimentally amplified composite cDNA with the genomic sequence. The sequence of our USP24L transcript carried more exons than the GenScan NT_032977.390 and XM_371254 predictions, some of which are supported by human or mouse ESTs. All splice junctions followed the canonical AG/GT rule. The composite cDNA is predicted to encode a protein of 2,590 aa (fig. 5), distributed over 69 exons and spanning 146 kb of genomic sequence (54904635–55050704 bp in chromosome 1). The LD block observed from SNP 216 through SNP 231 (fig. 3), which encompasses USP24L and flanking regulatory sequences only, also supports the predicted size of USP24L.

Since the SNPs significantly associated with AAO in this region completely span USP24L, and since strong LD exists throughout USP24L but not with neighboring genes, we concluded that the association originates from USP24L itself. Three LD bins were found in this region on the basis of 14 SNPs genotyped in this study (SNPs 218–231). The seven SNPs significantly associated with AAO were, in fact, originating from two LD bins. The first LD bin is formed by SNPs 220, 221, 224, and 230 (maximum P=.007), and the second is formed by SNPs 222, 227, and 231 (maximum P=.003), which implies that there are two independent polymorphisms in USP24L that have significant effect on AAO. Although none of the SNPs in USP24L were significantly associated in either the positive- or negative-linkage subsets by the MKM, SNPs 221, 224, and 230 were close to significant (.05<P<.06) in the positive-linkage subset (table 9).

Three tagSNPs (SNPs 218, 219, and 227) were identified in USP24. Two haplotypes, C-T-T (frequency 62.6%; P=.003) and C-T-C (frequency 19.9%; P=.026), were found to be significantly associated with AAO (table 10). Overall, these haplotypes in USP24 did not provide any more information on the association with AAO than did SNP 227 alone.

HIVEP3

A total of nine markers in this gene were genotyped, with a final average density of 1 marker every 45 kb. The new SNPs failed to reveal any further significant association with risk for development of PD (fig. 4). However, SNP 12 was close to significant in both the allelic (P=.058) and the genotypic (P=.057) association tests, and SNP 18 (P=.059) was close to significant in the PDT, since it is in relatively high LD with SNP 19 (r2=0.75 in the unaffected group). To test for association of SNPs 13 and 19 in a second independent data set, we genotyped these two markers in the PD singleton data set. We did not find evidence of association of these SNPs in the singleton data set alone. However, both markers showed stronger significant association in the combined multiplex and singleton data set (P=.006 for SNP 13; P=.002 for SNP 19) than in the multiplex data set. Clearly, some singleton families also contribute to the association of these two markers.

We identified eight tagSNPs (SNPs 13–17 and 19–21) in HIVEP3. Haplotype analyses based on five tagSNPs revealed the best results by use of tagSNPs 13, 15, 17, 19, and 21, in which a rare A-G-T-G-C haplotype (frequency 2.1%) was significantly associated with risk for PD (P=.003) (table 10). HIVEP3 is a relatively large gene (408 kb), and very low levels of LD were observed among the SNPs genotyped (fig. 3). Because SNPs 13 and 19 are not in LD (r2=0; D′=0.02), their associations provide two independent lines of evidence for the involvement of this gene in controlling risk for development of PD.

Discussion

The present study was motivated by the results of our previous genomic screen for AAO in PD, in which we reported a linkage region on chromosome 1p near D1S2134 (LOD = 3.41) (Li et al. 2002). In the present study, we present a systematic approach, termed “iterative association mapping,” to identify susceptibility genes and genetic modifiers in a linkage region. This methodology has the advantage of being unbiased by any preconceived ideas about the pathogenic mechanisms of a disease (as in candidate-gene studies). In addition, our analysis strategies include single-locus association tests in the overall set and in the positive- and negative-linkage subsets, as well as haplotype association analysis based on tagSNPs in the overall data set.

Because a large number of SNPs were tested in this study, we wished to correct for multiple testing while maintaining an appropriate threshold to screen for potential areas of association without eliminating any potential candidates. The Bonferroni correction is too conservative and would become exclusionary at a point when we want to avoid missing any potential associations. One can prioritize genes on the basis of the order of P values or use the global significance level derived from the permutation test, but either method may exclude too many potential leads; thus, these options do not fit the purpose of the first few iterations. Therefore, we added an intermediate criterion for analysis, since we considered the presence of multiple significant markers in low LD within a regional cluster to be more important than sporadic results across the region. The concept of this method is relatively straightforward: if multiple comparisons lead to significant SNPs only by chance, then these false-positive SNPs (if we assume, for the moment, that all SNPs in high LD have the same measure) should be randomly distributed across the physical region being tested. That is, there is no reason for them to be clustered together physically if they are significant only due to chance. Thus, we are seeking two SNPs with a defined level of significance that lie within a small physical region and that have a correlation that is low enough (r²<0.6) that the significant associations of each individual marker with AAO are not likely to be the result of measuring the same chance event. This approach allows us to apply a significance threshold lower, and therefore more stringent, than the conventional nominal level and to take into account the locations of the significant markers.
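The cluster criterion described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure as implemented in the study: the function name `find_clusters`, the 100-kb window (`max_dist_bp`), and the example data are hypothetical, since the text does not define the size of a "small physical region". Only the r² cutoff of 0.6 comes from the text.

```python
from itertools import combinations


def find_clusters(snps, r2, alpha=0.01, max_dist_bp=100_000, r2_max=0.6):
    """Return pairs of SNP ids that satisfy the cluster criterion.

    snps: list of (snp_id, position_bp, p_value) tuples
    r2:   dict mapping frozenset({id_a, id_b}) to the pairwise r^2
    alpha, max_dist_bp: assumed threshold and window (hypothetical values)
    """
    # Keep only markers that pass the chosen significance threshold
    hits = [s for s in snps if s[2] <= alpha]
    clusters = []
    for (id_a, pos_a, _), (id_b, pos_b, _) in combinations(hits, 2):
        close = abs(pos_a - pos_b) <= max_dist_bp
        # A missing r^2 is treated as 1.0, so unknown pairs are rejected
        independent = r2.get(frozenset((id_a, id_b)), 1.0) < r2_max
        if close and independent:
            clusters.append((id_a, id_b))
    return clusters


# Hypothetical markers: s1 and s2 are both significant, lie 39 kb apart,
# and are in low LD; s4 is significant but far away.
snps = [("s1", 1_000, 0.004), ("s2", 40_000, 0.008),
        ("s3", 45_000, 0.2), ("s4", 900_000, 0.002)]
r2 = {frozenset(("s1", "s2")): 0.3}
print(find_clusters(snps, r2))  # [('s1', 's2')]
```

An isolated significant marker such as `s4` is not reported, which mirrors the reasoning that chance findings should be scattered across the region rather than clustered.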

EIF2B3 ranks as the most significant AAO gene in this region. Two clusters of markers in this gene were significantly associated with AAO in the overall set and positive-linkage subsets. We also detected two clusters of markers in USP24 that are significantly associated with AAO at both significance levels P=.01 and P=.001. However, the association evidence was not as strong as it was for EIF2B3, because of less significant findings in the positive-linkage subset. We therefore would consider USP24 to be the second most significant AAO gene in the region for further follow-up. Finally, HIVEP3 is the only gene found in this region that is associated with risk for development of PD.

The finding of multiple associated genes under the peak was somewhat unexpected. If one assumes that not all the statistically significant genes found here are biologically important in PD, is there a way to prioritize them for further study? Conceptually, since linkage analysis localized the initial peak (Li et al. 2002), the associations we identified should be “responsible” for the linkage. Thus, we identified those families contributing to the chromosome 1 linkage localization and examined that subset for association. However, by reducing the sample size to one-third (only 83 families had positive LOD scores at marker D1S2134), one would expect that the P values of the associated SNPs would become less significant, on the basis of power alone. But, in reducing the sample size, we also expect to render our sample more homogeneous and therefore to increase the significance for the true susceptibility polymorphisms. Interestingly, the most significant polymorphism in EIF2B3 remained equally significant despite the sample size decrease, whereas two polymorphisms in EIF2B3 (SNPs 59 and 61) that were close to significant in the overall data set became more significant in the positive-linkage subset. This implicates EIF2B3 in controlling the AAO of PD. The ability to subdivide the data on the basis of linkage also demonstrates one of the additional strengths of family-based association data.
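The subsetting step described above can be sketched in a few lines. This is a minimal illustration, assuming a per-family LOD score at marker D1S2134 is already available; the function name `split_by_linkage`, the family identifiers, and the scores are hypothetical, and assigning families with a LOD score of exactly zero to the negative-linkage subset is an assumption.

```python
def split_by_linkage(family_lod):
    """Split families into positive- and negative-linkage subsets.

    family_lod: dict mapping a family id to its LOD score at the linkage
                marker (hypothetical input)
    """
    # Families with LOD > 0 contribute to the linkage peak
    positive = sorted(f for f, lod in family_lod.items() if lod > 0)
    # Assumption: LOD <= 0 goes to the negative-linkage subset
    negative = sorted(f for f, lod in family_lod.items() if lod <= 0)
    return positive, negative


# Hypothetical LOD scores for four families
lods = {"fam01": 0.42, "fam02": -0.10, "fam03": 0.05, "fam04": 0.0}
positive, negative = split_by_linkage(lods)
print(positive)  # ['fam01', 'fam03']
print(negative)  # ['fam02', 'fam04']
```

Association tests would then be rerun within each subset, trading the loss of power from the smaller sample against the gain in homogeneity.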

EIF2B3 is the γ subunit of the heteropentamer EIF2B (which includes α, β, γ, δ, and ε subunits). The translation initiation factor EIF2B catalyzes the exchange of guanine nucleotides on the initiation factor EIF2, which itself mediates the binding of the initiator Met-tRNA to the 40S ribosomal subunit during translation initiation. EIF2B is important because it regulates global rates of protein synthesis, particularly when the cell is under mild cellular stress. Protein synthesis is generally decreased during periods of cellular stress, to lower the amount of detrimental unfolded and damaged proteins that can be toxic to the cell (van der Knaap et al. 2002). Interestingly, mutations in EIF2B cause vanishing white-matter disease (VWM [MIM 603896]), an autosomal recessive disorder characterized by cerebellar ataxia, spasticity, inconstant optic atrophy, and a relatively mild mental decline. The early onset of this disease reflects the hypothetical maximum expression levels of EIF2B-β, -γ, -δ, and -ε during embryonic development and the lower levels associated with aging (Inamura et al. 2003). It is well known that mild head trauma or fever is highly correlated with rapid clinical decline in these patients. van der Knaap et al. (2002) suggested that this clinical deterioration is the result of the failure of EIF2B in the critical role of regulating protein synthesis under mild cellular stress. Furthermore, the observed phenotypic variation in patients with identical EIF2B mutations suggests that genetic polymorphisms may influence the mutation’s effect (van der Knaap et al. 2002). Thus, the biological activity of this gene fits well with the current ideas about cellular stress having a major role in PD.

USP24, the second most significant AAO gene, is a member of the family of the ubiquitin-specific proteases that remove polyubiquitin from target proteins, rescuing them from degradation by the proteasome. Because genes involved in the proteolytic pathway and aggregation of proteins (Parkin and α-synuclein) contribute to PD pathology, USP24 appears to be an excellent biological candidate gene for controlling AAO in PD. We identified several polymorphisms in USP24 significantly associated with AAO, one of which (SNP 220) is nonsynonymous (alanine→valine). The effect of this polymorphism on protein function is not currently known.

Unlike EIF2B3 and USP24, HIVEP3 was found to be associated with the risk for development of PD. HIVEP3 is a member of the HIV enhancer-binding protein gene family, which encodes large zinc-finger proteins that regulate transcription via the κB enhancer motif (Allen et al. 2002). This motif is an important element controlling the transcription of viral genes and many cellular genes that are involved in immunity, cell-cycle regulation, and inflammation. As we reported elsewhere, GSTO1 is associated with AAO in PD (Li et al. 2003) and also possibly plays a role in inflammation during the pathogenesis of PD, because of its involvement in the posttranslational modification of the inflammatory cytokine interleukin-1β (Laliberte et al. 2003). The mouse homolog of HIVEP3, the κ recognition component (KRC), participates in the signal transduction pathway leading from the tumor necrosis factor receptor to gene activation and may play a critical role in inflammatory and apoptotic responses (Oukka et al. 2002). Interestingly, patients with HIV have been reported to have decreased levels of dopamine but normal levels of other neurotransmitters, which suggests selective and profound loss of dopamine neurons (Lopez et al. 1999).

As mentioned above, our linkage peak for AAO on chromosome 1 overlaps surprisingly closely with the risk linkage peak from the Hicks et al. (2002) study of an Icelandic population. It is interesting to note that we did not detect linkage signals on chromosome 1 in our genomic screen (Scott et al. 2001). However, this may be because of the genetically heterogeneous nature of our study population, compared with the population of Iceland.

In summary, through the use of a genomic iterative association mapping technique, we have identified three novel and biologically pertinent genes associated with risk or AAO in PD. We also demonstrate the potential usefulness of using linked subsets to further prioritize genes when presented with multiple associated genes in the same region. We are currently examining the biological role of these genes in modifying AAO and risk in PD.

Acknowledgments

We are grateful to all of the families whose participation made this project possible. We thank the members of the PD Genetics collaboration (Martha A. Nance, Ray L. Watts, Jean P. Hubble, William C. Koller, Kelly Lyons, Rajesh Pahwa, Matthew B. Stern, Amy Colcher, Bradley C. Hiner, Joseph Jankovic, William G. Ondo, Fred H. Allen Jr., Christopher G. Goetz, Gary W. Small, Donna Masterman, Frank Mastaglia, and Jonathan L. Haines) who contributed multiplex families to the study. We also thank the core personnel at the Center for Human Genetics, Duke University Medical Center, and the members of the clinical core of the Duke PDRCE. This research was supported by National Institutes of Health/National Institute of Neurological Disorders and Stroke grants R01 NS311530-10 and P50-NS-039764. S.A.O. received funding from a postdoctoral fellowship (BPD/5733/2001) granted by the Portuguese Science and Technology Foundation.

Web Resources

Accession numbers and URLs for data presented herein are as follows:

GenBank, http://www.ncbi.nlm.nih.gov/Genbank/ (for cDNA FLJ45132 clone BRAWH3037979 [accession number AK127075], the mRNA for KIAA1057 protein [accession number AB028980], and ESTs [accession numbers BM458550, AW853346, and CD687922])
GenScan, http://www.ncbi.nlm.nih.gov (for accession numbers NT_032977.390 and XM_371254)
Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim/ (for PD and VWM)
TIGR Human Gene Index, http://www.tigr.org/tdb/hgi/

References

Abecasis GR, Cardon LR, Cookson WO (2000) A general test of association for quantitative traits in nuclear families. Am J Hum Genet 66:279–292
Abecasis GR, Cookson WO (2000) GOLD—graphical overview of linkage disequilibrium. Bioinformatics 16:182–183
Allen CE, Mak CH, Wu LC (2002) The κB transcriptional enhancer motif and signal sequences of V(D)J recombination are targets for the zinc finger protein HIVEP3/KRC: a site selection amplification binding study. BMC Immunol 3:10
Almasy L, Blangero J (1998) Multipoint quantitative-trait linkage analysis in general pedigrees. Am J Hum Genet 62:1198–1211
Antic D, Keene JD (1997) Embryonic lethal abnormal visual RNA-binding proteins involved in growth, differentiation, and posttranscriptional gene expression. Am J Hum Genet 61:273–278
Blomqvist ME, Silburn PA, Buchanan DD, Andreasen N, Blennow K, Pedersen NL, Brookes AJ, Mellick GD, Prince JA (2004) Sequence variation in the proximity of IDE may impact age at onset of both Parkinson disease and Alzheimer disease. Neurogenetics 5:115–119
Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA (2004) Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 74:106–120
Destefano AL, Lew MF, Golbe LI, Mark MH, Lazzarini AM, Guttman M, Montgomery E, et al (2002) PARK3 influences age at onset in Parkinson disease: a genome scan in the GenePD study. Am J Hum Genet 70:1089–1095
Goetz CG, Poewe W, Rascol O, Sampaio C, Stebbins GT, Fahn S, Lang AE, Martinez-Martin P, Tilley B, Van Hilten B, Cleczka C, Seidl L (2003) The Unified Parkinson’s Disease Rating Scale (UPDRS): status and recommendations. Mov Disord 18:738–750
Hauser MA, Li YJ, Takeuchi S, Walters R, Noureddine M, Maready M, Darden T, Hulette C, Martin E, Hauser E, Xu H, Schmechel D, Stenger JE, Dietrich F, Vance J (2003) Genomic convergence: identifying candidate genes for Parkinson’s disease by combining serial analysis of gene expression and genetic linkage. Hum Mol Genet 12:671–677
Hauser MA, Li YJ, Xu H, Stenger J, Noureddine M, Shao Y, Gullans S, Scherzer CR, Jensen R, McLaurin A, Scott B, Jewett RM, Hulette C, Schmechel DE, Vance JM (2005) Expression profiling of substantia nigra in Parkinson, PSP, and FTDP-17. Arch Neurol 62:917–921
Haynes C, Speer MC, Peedin M, Roses AD, Haines JL, Vance JM, Pericak-Vance MA (1995) PEDIGENE: a comprehensive data management system to facilitate efficient and rapid disease gene mapping. Am J Hum Genet Suppl 57:A193
Hicks AA, Petursson H, Jonsson T, Stefansson H, Johannsdottir HS, Sainz J, Frigge ML, Kong A, Gulcher JR, Stefansson K, Sveinbjornsdottir S (2002) A susceptibility gene for late-onset idiopathic Parkinson’s disease. Ann Neurol 52:549–555
Horvath S, Xu X, Lake SL, Silverman EK, Weiss ST, Laird NM (2004) Family-based tests for associating haplotypes with general phenotype data: application to asthma genetics. Genet Epidemiol 26:61–69
Hubble JP, Weeks CC, Nance M, Watts RL, Koller WC, Stern MB, Colcher A, Ondo W, Jankovic J, Goetz C, Pappart E, Deane-Glaxo WPGC, Stajich JM, Scott BL, Vance JM, Pericak-Vance MA (1999) Parkinson’s disease: clinical features in sibships. Neurology Suppl 52:A13
Inamura N, Nawa H, Takei N (2003) Developmental changes of eukaryotic initiation factor 2B subunits in rat hippocampus. Neurosci Lett 346:117–119
Karamohamed S, Destefano AL, Wilk JB, Shoemaker CM, Golbe LI, Mark MH, Lazzarini AM, et al (2003) A haplotype at the PARK3 locus influences onset age for Parkinson’s disease: the GenePD study. Neurology 61:1557–1561
Kitada T, Asakawa S, Hattori N, Matsumine H, Yamamura Y, Minoshima S, Yokochi M, Mizuno Y, Shimizu N (1998) Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Nature 392:605–608
Kolsch H, Linnebank M, Lutjohann D, Jessen F, Wullner U, Harbrecht U, Thelen KM, Kreis M, Hentschel F, Schulz A, von Bergmann K, Maier W, Heun R (2004) Polymorphisms in glutathione S-transferase omega-1 and AD, vascular dementia, and stroke. Neurology 63:2255–2260
Laird NM, Horvath S, Xu X (2000) Implementing a unified approach to family-based tests of association. Genet Epidemiol Suppl 1 19:S36–S42
Laliberte RE, Perregaux DG, Hoth LR, Rosner PJ, Jordan CK, Peese KM, Eggler JF, Dombroski MA, Geoghegan KF, Gabel CA (2003) Glutathione S-transferase omega 1-1 is a target of cytokine release inhibitory drugs and may be responsible for their effect on interleukin-1β posttranslational processing. J Biol Chem 278:16567–16578
Leroy E, Anastasopoulos D, Konitsiotis S, Lavedan C, Polymeropoulos MH (1998) Deletions in the Parkin gene and genetic heterogeneity in a Greek family with early onset Parkinson’s disease. Hum Genet 103:424–427
Li YJ, Hauser MA, Scott WK, Martin ER, Booze MW, Qin XJ, Walter JW, Nance MA, Hubble JP, Koller WC, Pahwa R, Stern MB, Hiner CB, Jankovic J, Goetz CG, Small GW, Mastaglia F, Haines JL, Pericak-Vance MA, Vance JA (2004) Apolipoprotein E controls the risk and age at onset of Parkinson disease. Neurology 62:2005–2009
Li YJ, Oliveira SA, Xu P, Martin ER, Stenger JE, Scherzer CR, Hauser MA, et al (2003) Glutathione S-transferase omega-1 modifies age-at-onset of Alzheimer disease and Parkinson disease. Hum Mol Genet 12:3259–3267
Li YJ, Scott WK, Hedges DJ, Zhang F, Gaskell PC, Nance MA, Watts RL, et al (2002) Age at onset in two common neurodegenerative diseases is genetically controlled. Am J Hum Genet 70:985–993
Lopez OL, Smith G, Meltzer CC, Becker JT (1999) Dopamine systems in human immunodeficiency virus-associated dementia. Neuropsychiatry Neuropsychol Behav Neurol 12:184–192
Martin ER, Bass MP, Gilbert JR, Pericak-Vance MA, Hauser ER (2003) Genotype-based association test for general pedigrees: the genotype-PDT. Genet Epidemiol 25:203–213
Martin ER, Monks SA, Warren LL, Kaplan NL (2000) A test for linkage and association in general pedigrees: the pedigree disequilibrium test. Am J Hum Genet 67:146–154
Monks SA, Kaplan NL (2000) Removing the sampling restrictions from family-based tests of association for a quantitative-trait locus. Am J Hum Genet 66:576–592
Noureddine MA, Li YJ, van der Walt JM, Walters R, Jewett RM, Xu H, Wang T, Walter JW, Scott BL, Hulette C, Schmechel DE, Stenger J, Dietrich F, Vance JM, Hauser MA (2005) Genomic convergence to identify candidate genes for Parkinson disease: SAGE analysis of the substantia nigra. Mov Disord (electronically published June 17; available at: http://www3.interscience.wiley.com/cgi-bin/abstract/110541328) (accessed 22 June 2005)
Oukka M, Kim ST, Lugo G, Sun J, Wu LC, Glimcher LH (2002) A mammalian homolog of Drosophila schnurri, KRC, regulates TNF receptor-driven responses and interacts with TRAF2. Mol Cell 9:121–131
Paisan-Ruiz C, Jain S, Evans EW, Gilks WP, Simon J, van der BM, de Munain AL, Aparicio S, Gil AM, Khan N, Johnson J, Martinez JR, Nicholl D, Carrera IM, Pena AS, de Silva R, Lees A, Marti-Masso JF, Perez-Tur J, Wood NW, Singleton AB (2004) Cloning of the gene containing mutations that cause PARK8-linked Parkinson’s disease. Neuron 44:595–600
Polymeropoulos MH, Lavedan C, Leroy E, Ide SE, Dehejia A, Dutra A, Pike B, Root H, Rubenstein J, Boyer R, Stenroos ES, Chandrasekharappa S, Athanassiadou A, Papapetropoulos T, Johnson WG, Lazzarini AM, Duvoisin RC, Di Iorio G, Golbe LI, Nussbaum RL (1997) Mutation in the α-synuclein gene identified in families with Parkinson’s disease. Science 276:2045–2047
Scott WK, Nance MA, Watts RL, Hubble JP, Koller WC, Lyons K, Pahwa R, et al (2001) Complete genomic screen in Parkinson disease: evidence for multiple genes. JAMA 286:2239–2244
Valente EM, Abou-Sleiman PM, Caputo V, Muqit MM, Harvey K, Gispert S, Ali Z, Del Turco D, Bentivoglio AR, Healy DG, Albanese A, Nussbaum R, Gonzalez-Maldonado R, Deller T, Salvi S, Cortelli P, Gilks WP, Latchman DS, Harvey RJ, Dallapiccola B, Auburger G, Wood NW (2004) Hereditary early-onset Parkinson’s disease caused by mutations in PINK1. Science 304:1158–1160
van der Knaap MS, Leegwater PA, Konst AA, Visser A, Naidu S, Oudejans CB, Schutgens RB, Pronk JC (2002) Mutations in each of the five subunits of translation initiation factor eIF2B can cause leukoencephalopathy with vanishing white matter. Ann Neurol 51:264–270
Zaykin D, Zhivotovsky L, Weir BS (1995) Exact tests for association between alleles at arbitrary numbers of loci. Genetica 96:169–178

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics