Am J Hum Genet. Feb 2002; 70(2): 384–398.
Published online Jan 4, 2002.
PMCID: PMC384915

A Genomewide Scan Identifies Two Novel Loci Involved in Specific Language Impairment*

The SLI Consortium*


Approximately 4% of English-speaking children are affected by specific language impairment (SLI), a disorder in the development of language skills despite adequate opportunity and normal intelligence. Several studies have indicated the importance of genetic factors in SLI; a positive family history confers an increased risk of development, and concordance in monozygotic twins consistently exceeds that in dizygotic twins. However, like many behavioral traits, SLI is assumed to be genetically complex, with several loci contributing to the overall risk. We have compiled 98 families drawn from epidemiological and clinical populations, all with probands whose standard language scores fall ≥1.5 SD below the mean for their age. Systematic genomewide quantitative-trait–locus analysis of three language-related measures (i.e., the Clinical Evaluation of Language Fundamentals–Revised [CELF-R] receptive and expressive scales and the nonword repetition [NWR] test) yielded two regions, one on chromosome 16 and one on 19, that both had maximum LOD scores of 3.55. Simulations suggest that, of these two multipoint results, the NWR linkage to chromosome 16q is the more significant, with empirical P values reaching 10⁻⁵, under both Haseman-Elston (HE) analysis (LOD score 3.55; P=.00003) and variance-components (VC) analysis (LOD score 2.57; P=.00008). Single-point analyses provided further support for involvement of this locus, with three markers, under the peak of linkage, yielding LOD scores >1.9. The 19q locus was linked to the CELF-R expressive-language score and exceeds the threshold for suggestive linkage under all types of analysis performed—multipoint HE analysis (LOD score 3.55; empirical P=.00004) and VC (LOD score 2.84; empirical P=.00027) and single-point HE analysis (LOD score 2.49) and VC (LOD score 2.22). Furthermore, both the clinical and epidemiological samples showed independent evidence of linkage on both chromosome 16q and chromosome 19q, indicating that these may represent universally important loci in SLI and, thus, general risk factors for language impairment.


Specific language impairment (SLI) is diagnosed in children who exhibit significant language deficits despite adequate educational opportunity and normal nonverbal intelligence. A diagnosis is made after the presence of other conditions—such as mental retardation, autism, hearing loss, cleft palate, and neurological disorders (e.g., cerebral palsy) that may give rise to language impairments—has been ruled out (Tomblin et al. 1996). Children with SLI differ in the degree to which they have problems articulating speech sounds, expressing themselves verbally, and comprehending the speech of others. Accordingly, SLI is broadly classified into three subtypes: phonological disorder, expressive-language disorder, and mixed expressive- and receptive-language disorder. However, the validity of this subtyping has been questioned, and, instead, it has been proposed that the variability in the profile of deficits may reflect variation in the severity of the underlying disorder.

Although there have been many epidemiological studies of SLI, differences in methodological approaches, in diagnostic criteria, and in category thresholds often render direct comparisons between investigations impractical. The majority of mainstream studies estimate the prevalence among English-speaking pre–primary-school children to be 2%–7% (Law et al. 1998). A substantial proportion of these children are reported to experience severe and persistent language difficulties, which are often associated with additional social, educational, behavioral, and psychological problems (Cantwell and Baker 1987; Beitchman et al. 1994; Snowling et al. 2001). Despite the differences in study design, it is worth noting that most investigations agree on the importance of genetic factors in the development of SLI and that many have demonstrated a strong familial aggregation of cases of language impairment (Bishop and Edmundson 1986; Neils and Aram 1986; Tallal et al. 1989). In a recent review, Stromswold (1998) reported that, across seven family studies, the prevalence of SLI in family members of probands was 24%–78% (mean 46%), compared with 3%–46% (mean 18%) in the control groups.

In addition, twin studies consistently have indicated a significant increase in MZ concordance rates compared with DZ concordance rates (Lewis and Thompson 1992; Bishop et al. 1995; Tomblin and Buckwalter 1998), suggesting that much of the reported familial aggregation can be attributed to genetic influences. Tomblin and Buckwalter (1998) studied 120 twin pairs in which affected individuals were defined as having both a composite language score (computed from four measures of receptive language and four measures of expressive language) 1 SD below that expected for their ages and a nonverbal IQ (i.e., performance IQ [PIQ]) >70. Using this sample, they demonstrated an MZ concordance rate of 96% and a DZ concordance rate of 69%. Bishop et al. (1995) studied a set of 90 same-sex twin pairs, all with at least one twin affected by a developmental-speech or -language disorder. Using a strict definition of language impairment (i.e., a discrepancy of ≥20 points between verbal and nonverbal abilities), they found a male-male MZ concordance of 70% and a male-male DZ concordance of 46%. Relaxation of the diagnostic criteria to include those cotwins who either lacked a large discrepancy between their (low) verbal skills and nonverbal ability or had a history of speech and language problems resulted in heightened MZ:DZ concordance rates of 92%:62% for male-male twins and 100%:56% for female-female twins. In an extension of this twin-pair study, individuals were subclassified according to the type of disorder that they displayed. Of the four subgroups formed—articulation with or without receptive disorder, articulation and expressive disorder with or without receptive disorder, expressive disorder with or without receptive disorder, and only receptive disorder—those which included children with expressive impairments (i.e., the second and third of these subgroups) showed probandwise MZ:DZ concordance rates close to 100%:50%. In contrast, those with only receptive disorders (i.e., the fourth subgroup) showed little evidence of genetic influence, having a probandwise MZ:DZ concordance rate of 71%:75% (Bishop et al. 1995).

Further support for a genetic etiology in language disorders is provided by estimates of heritabilities of quantitative measures of language-related components. In a series of investigations, Bishop et al. (1995, 1996, 1999) used the DeFries-Fulker method to demonstrate significant heritabilities in several psychometric language measures in families affected by SLI. Many of these measures were comparable to those used in the current study and showed levels of heritability close to 1.0. These include tests of receptive syntactic-language abilities (e.g., the Test for the Reception of Grammar) and tests of expressive-language skills (e.g., the Clinical Evaluation of Language Fundamentals–Revised [CELF-R] repeating-sentences subtest), as well as tests that examine specific processes thought to be important in language acquisition (e.g., tests of nonword repetition). Interestingly, although many language-related traits were shown to be highly heritable, when the discrepancy scores between these traits and PIQ were considered, no significant heritability was seen.

Although there is ample evidence to indicate that genes may play a significant role in the determination of absolute language abilities, family studies have failed to detect any clear cosegregation between phenotype and genotype, and most conclude that the genetic basis is likely to be complex (Bishop et al. 1995). This phenotypic and genotypic complexity has essentially precluded the use of traditional parametric approaches in the genetic mapping of SLI, with one exception. Family KE is a unique three-generation pedigree documented to have a severe speech-and-language disorder that follows an autosomal dominant pattern of inheritance. Investigation of this family and their monogenic trait led to the localization of the SPCH1 locus to chromosome 7q (Fisher et al. 1998) and, ultimately, to the identification of the first gene to be implicated in speech and language development—FOXP2 (Lai et al. 2001) (MIM 606354). The FOX genes encode a large family of transcription factors, all of which possess a winged-helix, or forkhead-box (fox), domain. Lai et al. (2001) have demonstrated that the language impairment in family KE cosegregates with a point mutation in the fox domain of FOXP2. They have suggested that the phenotype might result from haploinsufficiency of FOXP2 at a key stage of embryogenesis, which causes abnormalities in the development of neural structures important for speech and language. Clearly, the FOX family represent good candidate genes for SLI; however, their role in the etiology of more common forms of language impairment has yet to be evaluated.

Recent methodological advances have enabled the application of model-free nonparametric approaches to complex disorders, by use of large collections of small nuclear families and analysis of quantitative traits. In the current study, we present the results of the first systematic quantitative-trait locus (QTL)–based genomewide screen for SLI. We use three quantitative measures of different aspects of language abilities and implicate two novel locations, on chromosomes 16 and 19, neither of which coincides with any region previously associated with language impairment. The refinement of the regions reported here may allow the identification of causal genetic factors in cases of SLI and thus aid in the clarification of the etiological mechanisms underlying this disorder.

Subjects and Methods

Recruitment

Two centers recruited 473 individuals (including a total of 219 sib pairs) from 98 families. The Newcomen Centre at Guy’s Hospital, London, diagnosed and referred a clinically based sample, and the Cambridge Language and Speech Project (CLASP) provided families drawn from an ongoing epidemiological study.

The cases selected at Guy’s Hospital were identified through three special schools for language disorders and through Afasic, a support organization for people with developmental and language impairments; thus, these individuals can be considered as representing a self-referred sample of children with persistent language problems needing special schooling and are not representative of the total population in the community. Ethical permission was given by the Guy's and St. Thomas' Trust ethics committee.

CLASP is a community-based longitudinal investigation of speech and language difficulties. The children recruited into the study were initially ascertained during their 3d year of life. A three-stage procedure was employed for case identification, and a standard age design was used to control for divergence between developmental stage and chronological age. Accordingly, at age 36 mo, the population was first defined by means of a questionnaire; then, at age 39 mo, this sample was screened, in more detail, for language difficulties; and, finally, at age 45 mo, age screen–positive cases were assessed in depth. When the children reached 8 years of age, they and their siblings were assessed by the CELF-R and Wechsler Scales of Intelligence–Third UK Edition (WISC-III [Wechsler 1992]), and buccal-DNA samples were collected in families of SLI cases. A detailed description of the ascertainment procedure and sample is available from Stott et al. (in press).

In both the Guy’s Hospital sample and the Cambridge sample, probands were selected who, either currently or in the past, had language skills ≥1.5 SD below the normative mean for their chronological age, on the receptive and/or expressive scales of the CELF-R battery (Semel et al. 1992). Any proband or sibling found to have a PIQ <80 was excluded from the genome screen. Additional exclusion criteria included MZ twinning, chronic illness requiring multiple hospital visits or admissions, deafness, an ICD-10/DSM-IV diagnosis of childhood autism, English being a second language, care provision by local authorities, and known neurological disorders. In the Guy’s Hospital sample, those families with chromosome abnormalities, including fragile X, were excluded by cytogenetic testing. A summary of the genome-screen sample is shown in table 1.

Table 1
Number of Families and Sib Pairs in the Total Genome-Screen Sample, the Guy’s Hospital Sample, and the Cambridge Sample

Whole-blood or buccal-swab samples were collected from probands and all available siblings and parents, regardless of language ability. DNA was extracted by means of standard protocols, and all buccal-swab DNA samples were preamplified by a preamplification extension protocol (PEP). The PEP technique involves the random amplification of genomic DNA, using a pool of random 15-mer primers, and results in a 50–100-fold increase in template DNA for subsequent microsatellite amplification (Zhang et al. 1992). Prior to the genome screen, this approach was verified, across 20 primers, in a series of 27 controls. All controls showed comparable amplification of both genomic DNA and PEP DNA, and no evidence of preferential preamplification of specific alleles was seen (data not shown).

Phenotypic Measures

Three language measures were assessed for the genome screen: expressive and receptive language skills were scored by CELF-R, and a test of nonword repetition was used as a marker of phonological short-term memory. All three measures showed significant levels of familiality in the genome-screen sample (data not shown). No parental phenotype data were used, since the linkage analysis uses only information from sib-pair phenotype data.


CELF-R is a clinical tool widely used for the identification, diagnosis, and follow-up evaluation of language disorders in school-age children. The battery is split into receptive and expressive scales, which can be combined to provide a composite language score. Each scale consists of three subtests designed to be primarily receptive or expressive in nature. The exact combination of individual tests used depends on the age of the subject. Additive raw scores from each segment are then transformed to derive a standardized receptive language score (RLS) and an expressive language score (ELS), each with mean 100 and SD 15 in the general-population calibration sample (Semel et al. 1992). The CELF-R tests are generally considered to give a broad overview of a child’s general language abilities and are valid for children of age 5–17 years.

Nonword Repetition (NWR)

It has been proposed that children with SLI have language-learning difficulties due to a deficit in working memory. This means that the amount of time during which they are able to hold unfamiliar phonological forms in their short-term memory is insufficient to allow in-depth processing and transfer to the long-term memory (Baddeley and Wilson 1993; Baddeley et al. 1998). To test the capacity that phonological working memory has for novel speech sounds, Gathercole et al. (1994) have developed a measure of NWR. In this test, subjects are required to repeat tape-recorded nonsensical words of increasing length and complexity (e.g., “brufid” and “contramponist”). Studies show that individuals with current language impairments, as well as those who, during early childhood, had language difficulties that later resolved, perform poorly on this test (Gathercole et al. 1994; Bishop et al. 1999). All available children of age 7.5–18 years were tested by the NWR test.

All individuals in the Guy’s Hospital sample completed the published version of the children’s test of NWR (Gathercole et al. 1994); however, all individuals in the Cambridge sample were examined by a prepublication revision of this test. Although both tasks are similar in administration, and although some words are common to both tests, it was evident that the published standardization introduced flooring effects, which resulted in an undesirable skewing of the distribution of scores. For this reason, as well as to allow combination of the NWR scores across the two samples, both versions of the NWR test were administered to 111 subjects (age 4.8–53.6 years) from both samples, and a between-test-regression calibration coefficient was determined. Raw scores correlated 0.89 (P<.001) and were linearly related across the entire range, the relationship being the same for both the adults and the children. A linear-regression calibration equation gave raw scores from the prepublished form of the test that were 0.658 (standard error [SE] 0.009) times the raw score from the published test. Raw scores for the Guy’s Hospital sample were therefore multiplied by this factor, to make them comparable to the raw scores for the Cambridge sample. Standard scores for a British population were then obtained by use of norms extended, by S. E. Gathercole (personal communication), for older children.
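The calibration step described above can be illustrated with a minimal sketch. It assumes a least-squares fit through the origin (the text reports a single multiplicative factor but does not say whether an intercept was fitted), and the paired scores are invented for illustration. Only the factor 0.658 comes from the text; the function names are hypothetical.

```python
# Hypothetical sketch of a proportional calibration between two test versions.
# Assumption: regression through the origin; the paper does not state this.

def slope_through_origin(published, prepublished):
    """Least-squares slope b for the model prepublished = b * published."""
    sxy = sum(x * y for x, y in zip(published, prepublished))
    sxx = sum(x * x for x in published)
    return sxy / sxx


def rescale(raw_scores, factor=0.658):
    """Put published-test raw scores on the prepublication-test scale."""
    return [factor * score for score in raw_scores]


# Invented paired raw scores from subjects who took both versions.
published_demo = [10.0, 20.0, 30.0]
prepublished_demo = [6.5, 13.0, 20.0]
b_demo = slope_through_origin(published_demo, prepublished_demo)

# Invented published-test scores, rescaled with the factor reported in the text.
rescaled_demo = rescale([20.0, 35.0])
```

A single multiplicative factor keeps the rank order of scores unchanged, which is why the rescaled scores can still be converted to standard scores with one set of norms.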


IQ was assessed by WISC-III (Wechsler 1992). This is a battery of tests that yield measures of verbal IQ and PIQ. The verbal scale comprises tests of comprehension, vocabulary, and abstract reasoning, whereas completion of the performance tasks relies primarily on visual and constructional clues (e.g., mazes, symbol arrangement, and abstract visual problem solving). Verbal IQ and PIQ can then be combined to give a full-scale IQ. The WISC-III requires no reading or writing of words. All children found to have a PIQ <80 were excluded from the study.

Cohort Statistics

A total of 252 children (153 males and 99 females), ages 5–19 years (mean 9.4 years; SD 3.04 years) were assessed, as described above, for CELF-R expressive-language score (by ELS), CELF-R receptive-language score (by RLS), nonword repetition (by NWR), and PIQ. In this sample, which includes unaffected siblings, we found that, although the average level of PIQ was consistent with that of the general population (i.e., mean 100), the means of all language-based measurements fell below the expected mean of 100 (table 2). Thus, the sample selected for the genome screen may be considered to represent a collection of children whose developmental problems are largely language specific.

Table 2
Descriptive Statistics for Each Genome-Screen Phenotype—for the Total Genome-Screen Sample, the Guy’s Hospital Sample, and the Cambridge Sample

Comparisons between proband and cosib groups indicated that the probands generally demonstrated language ratings lower than those in the complementary cosibs (table 3). However, although the cosib language scores showed some regression toward the mean, they all remained below that expected (table 3). This is attributable to the high number (~34%) of siblings who also displayed signs of language impairment. In the clinical sample (i.e., that from Guy’s Hospital; see the “Recruitment” subsection), 52 (37%) of the children were attending either a special language unit or a special school or had been placed, with a statement of special educational needs, in a mainstream school.

Table 3
Descriptive Statistics for Each Genome-Screen Phenotype for the Total Genome-Screen Sample, for Probands and for Cosibs

Data Transformation

The data in table 2 present evidence that the Guy’s Hospital sample and the Cambridge sample, although both drawn from the general population of children with SLI, are significantly different in the magnitude of severity of their disorders. This is attributable to the fact that, although the diagnostic criteria applied to both samples were identical, the Guy’s Hospital sample represents a clinical, severely affected sample, whereas the Cambridge sample represents a more mainstream, epidemiologically selected sample. In order to combine the two samples for variance-components (VC) analysis, which creates a model around a single mean, all phenotypes were standardized to a Z-score, Z=(x-μ)/σ, where x is the attained score, μ is the mean, and σ is the SD; note that the mean and SD are taken from each group separately. Conversion of the language scores in this manner produces a distribution with a single mean while preserving the variances of the original samples and thus allows a single analysis of the two samples, in the VC model. The standardized scores are hereafter denoted as “RLStrans,” “ELStrans,” and “NWRtrans” and were used for the combined analysis of both samples, for the genome screen. Correlations between RLStrans, ELStrans, and NWRtrans are given in table 4.
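The within-group standardization Z=(x-μ)/σ can be sketched as follows. This is a minimal illustration, not the authors' code: the scores are invented, the variable names are hypothetical, and the use of the population SD (rather than the sample SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def standardize_within_group(scores):
    """Convert scores to Z = (x - mu) / sigma using this group's own mean and SD."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population SD; an assumption, not stated in the text
    return [(x - mu) / sigma for x in scores]


# Invented standard scores for two groups with different mean severity.
clinical_demo = [70.0, 75.0, 80.0, 85.0]
community_demo = [85.0, 90.0, 95.0, 100.0]

# Each group is standardized separately, then the two are pooled.
clinical_z = standardize_within_group(clinical_demo)
community_z = standardize_within_group(community_demo)
combined_z = clinical_z + community_z
```

Because each group is centered on its own mean, the pooled scores share a single mean of zero, which is the property required by a model built around one mean.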

Table 4
Correlations between Phenotypes—in the Total Genome-Screen Sample, the Guy’s Hospital Sample, and the Cambridge Sample

Genotyping and Data Handling

All 473 individuals were genotyped for 400 highly polymorphic dinucleotide-repeat microsatellite markers, taken from the ABI PRISM LMS2-MD10 panels (Applied Biosystems). PCR reactions were performed in 96-well Costar (Thermowell) plates on MJ Research PTC-225 thermocyclers. The fluorescent labeling of primers, with 6-FAM, HEX, and NED phosphoramidites (Applied Biosystems), allowed both the pooling of panels of PCR products and, by means of ABI 373A and 377 sequencers (Applied Biosystems), their subsequent separation and detection on 5% polyacrylamide gels.

Data were extracted from gels by GENESCAN software (version 3.1) and were passed into the GENOTYPER program (version 2.0) for automated allele calling and manual genotype verification (Reed et al. 1994). Raw allele-size data were checked for inconsistencies, by GAS software (version 2.0) (A. Young, personal communication). Marker-allele frequencies were estimated within RECODE (version 1.4) (D. Weeks, personal communication), and Mega2 (version 2.2) (Mukhopadhyay et al. 1999; also see the Division of Statistical Genetics, Department of Human Genetics, University of Pittsburgh web site) was used for the creation of linkage files in a GENEHUNTER 2.0 (Kruglyak et al. 1996; also see the Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/pub/software/genehunter” web site) package. The Discovery Manager system (Genomica) was used for the storage of genotypic data.

Prior to statistical analyses, two data-verification steps were performed. Marker haplotypes were generated in a GENEHUNTER 2.0 (Kruglyak et al. 1996; also see the Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/pub/software/genehunter” web site) package, and all chromosomes showing an excessive number of recombination events were reexamined at the genotype level. Corrected data were then run through SIBMED (sibpair mutation and error detection) (Douglas et al. 2000; also see the Center for Statistical Genetics, University of Michigan web site), to identify possible genotyping errors or mutations. SIBMED uses a hidden Markov model to calculate posterior error probabilities for each sib-pair/marker combination, given all the available marker data, an assumed genotype-error rate (set at 1%), and a known genetic map. All genotypes highlighted by SIBMED were excluded from subsequent analyses.

Sex-averaged marker maps were from the Cooperative Human Linkage Center (see the CHLC Genetics Maps web site) and were supplemented with data from Généthon (Dib et al. 1996).

Information-content maps were produced for each chromosome, in a MAPMAKER/SIBS (version 2.0) (Kruglyak and Lander 1995; also see the Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/distribution/software/sibs” web site) package and were used to determine the markers used in a second round of genotyping, involving 100 microsatellites taken from the Généthon map (Dib et al. 1996) and from the ABI PRISM LMS2-HD5 panels (Applied Biosystems). This additional wave of markers allowed the elimination of gaps in both marker density and information. Final marker density was estimated as being <8 cM, for all chromosomes.

Linkage Analysis

The Haseman-Elston (HE) method (Haseman and Elston 1972) and the VC method (Amos 1994; Pratt et al. 2000) were used within a GENEHUNTER 2.0 (Kruglyak et al. 1996; also see the Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/pub/software/genehunter” web site) package, to calculate—by means of the ELStrans, RLStrans, and NWRtrans scores, as quantitative measures of language ability—both single-point and multipoint LOD scores for all autosomes. Additional multipoint HE and VC analyses were subsequently performed, with the WISC-III (Wechsler 1992) measure of PIQ, for all areas that showed suggestive linkage to a language trait.

GENEHUNTER 2.0 implements a traditional HE regression of squared phenotype differences ($D^2$) on estimated identity-by-descent (IBD) sharing ($\hat{\pi}$), for each sib pair, at a given genetic locus. At a QTL, the squared difference for sib pair $i$ ($D_i^2$) is expected to be negatively correlated with the proportion of alleles shared IBD at that locus ($\hat{\pi}_i$) (Haseman and Elston 1972).

For families with more than two children, all possible sib pairings were included in the HE analysis. No weighting of multiple sib pairs was used. Although this unweighted approach has been suggested to lead to false inflation of significance, as a result of dependence between pairs (Hodge 1984), simulations described below indicate that this is not the case in our data set (fig. 1A).
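The idea behind the unweighted HE analysis can be sketched in a few lines. This is an illustrative ordinary least-squares regression of squared sib-pair trait differences on IBD sharing, not a reproduction of GENEHUNTER 2.0 (which also estimates IBD sharing from marker data and converts the fit to a LOD score). The function names and the three data points are invented.

```python
from itertools import combinations


def all_sib_pairs(sibship):
    """Return every unordered pairing of siblings in one family (no weighting)."""
    return list(combinations(sibship, 2))


def he_slope(pairs):
    """OLS slope of squared trait difference on estimated IBD sharing.

    pairs: iterable of (trait_sib1, trait_sib2, ibd_proportion) tuples.
    A negative slope is the pattern expected at a trait-influencing locus.
    """
    pairs = list(pairs)
    d2 = [(a - b) ** 2 for a, b, _ in pairs]
    ibd = [p for _, _, p in pairs]
    n = len(pairs)
    mean_ibd = sum(ibd) / n
    mean_d2 = sum(d2) / n
    sxy = sum((p - mean_ibd) * (d - mean_d2) for p, d in zip(ibd, d2))
    sxx = sum((p - mean_ibd) ** 2 for p in ibd)
    return sxy / sxx


# A family with three children contributes three sib pairs.
pairs_in_family = all_sib_pairs(["sib1", "sib2", "sib3"])

# Invented data: pairs that share more alleles IBD have more similar trait values.
demo_pairs = [(0.0, 2.0, 0.0), (0.0, 1.0, 0.5), (0.0, 0.0, 1.0)]
demo_slope = he_slope(demo_pairs)
```

Pairs drawn from the same sibship are not independent, which is the concern the simulations are intended to address.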

Figure 1
LOD-score–significance distributions for each measure used in the genome screen. The thicker black lines show the theoretical probability for any given LOD score, under the appropriate analyses; the colored lines show the phenotype-specific empirical ...

The VC method derives two maximum-likelihood models, both of which dissect the trait variability between siblings into major-gene ($\sigma_a^2$), polygenic ($\sigma_p^2$), and environmental ($\sigma_e^2$) variance components. Under the null hypothesis, it is assumed that there is no major-gene effect (i.e., $\sigma_a^2=0$), and in the alternative model the major-gene effect is unrestricted (i.e., $\sigma_a^2\neq 0$). Comparison of the likelihood of these two models results in a likelihood-ratio estimate, and the theoretical significance of linkage effect can be assessed by a standard $\chi^2$ test (Amos 1994). Empirical estimates of the significance of all VC results were derived by means of simulations, as described below. VC analysis was performed with a single mean and no dominance variance. No adjustment was made for multiple phenotypes. Regions of linkage were identified as those which, under all four types of analysis performed, exceeded thresholds for “suggestive” linkage that have been proposed by Lander and Kruglyak (1995).
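The link between a likelihood-ratio statistic, a LOD score, and a theoretical pointwise P value can be sketched as below. The conversion LOD = LRT / (2 ln 10) is standard. The one-sided P value (half the upper tail of a chi-square distribution with 1 df, on the grounds that the variance component cannot be negative) is an assumption made here for illustration and is not stated in the text. The function names are hypothetical.

```python
import math


def lrt_to_lod(lrt):
    """Convert a likelihood-ratio statistic (2 * ln likelihood ratio) to a LOD score."""
    return lrt / (2.0 * math.log(10.0))


def lod_to_pointwise_p(lod):
    """Theoretical pointwise P value for a LOD score.

    Assumption: one-sided test, so P is half the upper tail of chi-square (1 df).
    For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    lrt = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


# Theoretical P at the "suggestive" LOD threshold cited in the text.
p_suggestive = lod_to_pointwise_p(2.2)
```

Under these assumptions, a LOD score of 2.2 corresponds to a pointwise P of roughly 7 × 10⁻⁴. Theoretical values of this kind are what the simulation-based empirical P values are compared against.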

X Chromosome

In the absence of a multipoint sex-linked VC method, linkage to the X chromosome was assessed by HE analysis only. Linkage analyses were performed within a MAPMAKER/SIBS (version 2.0) (Kruglyak and Lander 1995; also see the Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/distribution/software/sibs” web site) package, under an HE algorithm comparable to that used by GENEHUNTER 2.0, described above. In X-linked analysis, MAPMAKER/SIBS uses only male-male pairs.

Simulations

Deviations from assumptions made by both of the linkage methods described above (i.e., VC and HE) can lead to unpredictable variations in the relationship between nominal P values and LOD scores, resulting in both type I and type II errors. The VC method supposes the multivariate normality of data, and the unweighted HE method assumes statistical independence of all sib pairings in families with multiple sibships. We adjusted for any divergence from these assumptions, by performing simulations for each phenotype. This allowed an estimation of the empirical pointwise significance of LOD scores.

Pedigree structure and phenotype data were maintained for each family in the genome screen, and SIMULATE (J. Terwilliger, personal communication; also see the Rockefeller University “/software/simulate” web site) was used to generate random genotypes for a single marker with four equally frequent alleles (75% heterozygosity) within this framework. A total of 100,000 replications were run, and linkage was assessed for each, by both the VC approach and the unweighted HE approach.

As demonstrated by Fisher et al. (2002), the resulting LOD-score-significance distributions (fig. 1) can be taken to approximate that found at each point of a typical multipoint situation (where ~70%–80% of IBD information is extracted) and are therefore generally applicable for estimation of the pointwise significance of linkage peaks. Note that these empirical P values, although adjusted to account for measure-specific deviations from normality, still yield only pointwise estimates of significance; they are not adjusted to account for genomewide scanning.
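Two of the quantities used in the simulations reduce to simple arithmetic, sketched below. The genotype simulation itself (performed with SIMULATE) is not reproduced. The null LOD scores are a stand-in list built to contain eight exceedances in 100,000 replicates, matching the count later reported for the VC analysis of NWRtrans. The function names are hypothetical.

```python
def heterozygosity(allele_freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for the given allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def empirical_p(null_lods, observed_lod):
    """Fraction of simulated null replicates whose LOD score exceeds the observed one."""
    exceed = sum(1 for lod in null_lods if lod > observed_lod)
    return exceed / len(null_lods)


# A marker with four equally frequent alleles.
het_demo = heterozygosity([0.25, 0.25, 0.25, 0.25])

# Stand-in null distribution: 8 replicates above the observed LOD, 99,992 below it.
null_demo = [3.0] * 8 + [0.0] * 99992
p_demo = empirical_p(null_demo, 2.57)
```

With 100,000 replicates, the smallest nonzero empirical P value that can be resolved is 10⁻⁵, which bounds the precision of the reported pointwise estimates.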

Results

We found that, under our ascertainment criteria (i.e., a single language score >1.5 SD below that expected for age), 34.4% of siblings of probands could be classified as affected. If we assume a population prevalence of 4%, this gives a sibling risk ratio of 8.6 in the families that we studied.
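The sibling risk ratio quoted above is the sibling recurrence rate divided by the population prevalence. A minimal sketch of the arithmetic, using only the two figures given in the text (the function name is hypothetical):

```python
def sibling_risk_ratio(sibling_rate, population_prevalence):
    """Risk to siblings of probands relative to the general-population risk."""
    return sibling_rate / population_prevalence


# 34.4% of siblings classified as affected; assumed population prevalence of 4%.
lambda_s = sibling_risk_ratio(0.344, 0.04)
```

The ratio depends directly on the assumed prevalence: halving the prevalence would double the estimate.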

Genomewide QTL analysis highlighted two prominent areas of linkage—one on chromosome 16 and one on chromosome 19 (figs. 2 and 3). Although several other regions were found to have LOD scores >1.0 (table 5), only the loci on chromosomes 16 and 19 exceeded the threshold (i.e., LOD score ≥2.2) that Lander and Kruglyak (1995) have proposed as being indicative of “suggestive” linkage. Furthermore, they did so under all four types of analysis performed (i.e., multipoint HE and VC and single-point HE and VC) (table 5).

Figure 2
Genomewide plot of HE linkage to three language-related measures under multipoint HE analysis. Abbreviations for language measures are as in the Subjects and Methods section. The X-axis shows cumulative distance (in Haldane cM); chromosome numbers are ...

Figure 3
Suggestive linkage to chromosomes 16 and 19. The X-axis shows positions of the markers typed; a 10-cM (Haldane) bar is given for reference. A, Linkage to chromosome 16, under both the HE method and the VC method, for NWRtrans. For ELStrans and RLStrans, ...

Table 5
LOD Scores >1.0

The locus on chromosome 16 was linked to the NWRtrans-measured trait and spans ~40 cM of 16q, from D16S515 to D16S520. Although the maximum LOD score (MLS) that HE yielded for this region reached 3.55, the VC analysis yielded a somewhat lower MLS, 2.57. However, empirical-probability distributions drawn from simulated data indicated a general deflation of VC LOD scores for the NWRtrans-measured trait (fig. 1B). In 100,000 simulations, a VC LOD score >2.57 was seen only eight times (i.e., pointwise empirical P=.00008); the VC result is thus consistent with the HE result (empirical P=.00003) and verges on the threshold (i.e., P=.00002) that Lander and Kruglyak (1995) have proposed as being indicative of “significant” linkage. Furthermore, chromosome 16 yielded the most significant single-point result for the genome (D16S516; LOD score 2.77), and single-point LOD scores >1.5 were seen for a cluster of three markers directly under the peak of linkage: markers D16S516 (2.77), D16S3040 (2.24), and D16S3091 (1.95) (table 5).

The locus on chromosome 19 was linked to the ELStrans-measured trait and covers ~30 cM of 19q, from D19S220 to D19S418 (fig. 3). This QTL was evident under both the HE and the VC multipoint analyses (HE LOD score 3.55; VC LOD score 2.84) and was supported by single-point analysis, in which two adjacent markers showed linkage to the ELStrans-measured trait, with LOD scores >1.5 (D19S908 [LOD score 2.49] and D19S902 [LOD score 1.74]) (table 5). Simulations indicated that the ELStrans measure behaves as predicted by theory and, therefore, that empirical P values can be taken as being representative of nominal P values for linkages to the ELStrans-measured trait and, hence, to the locus on chromosome 19 (fig. 1).

In both the chromosome 16 region of linkage and the chromosome 19 region of linkage, the LOD score for PIQ was never >0.15 (data not shown), indicating that both of these loci are likely to reflect language-specific influences, as opposed to general-intelligence effects.

To clarify the contribution that each group made to the two linkage peaks, we divided the genome-screen sample into its constituent Guy’s Hospital and Cambridge samples and reanalyzed chromosomes 16 and 19. Figure 4 shows that the Guy's Hospital sample and the Cambridge sample contribute equally to both peaks of linkage.

Figure 4
Linkage to chromosomes 16 and 19, based on the Guy’s Hospital sample and the Cambridge sample. The format of the graph is as in figure 3. A, Linkage to chromosome 16, in the combined genome-screen sample, the Guy’s Hospital sample, and the Cambridge ...

Interestingly, we found no evidence for linkage to chromosome 7q, the location of both SPCH1 (the family-KE linkage) (MIM 602081) and AUTS1 (the autism chromosome 7 linkage) (MIM 209850). At D7S486, the peak (LOD score 6.22) of linkage in family KE, our single-point LOD score remained <0.001, for all three phenotypes.

Our sample contained a male:female ratio of ~3:2, which is consistent with the male predominance reported in previous studies (Stevenson and Richman 1976). However, we found no strong evidence for a major sex-specific locus in the HE analysis of the X chromosome. Although, for ELStrans, a LOD score of 1.30 was found on Xp, LOD scores for all other measures remained <0.5, across the entire X chromosome.


We have reported here the first molecular genetic study of typical SLI. We implicate two novel loci, on chromosomes 16 and 19, that are found to influence language-related traits. Evidence for these QTLs has been drawn from four complementary analyses (i.e., multipoint HE and VC and single-point HE and VC), and both loci have been shown to be relevant in the Guy's Hospital sample and the Cambridge sample.

One important feature of this study is the use of quantitative measures of generalized language abilities. The lack of consensus as to the etiological basis of SLI often makes the derivation of a consistent qualitative affection status unfeasible. The use of quantitative traits circumvents this issue and, in complex cognitive disorders, has been demonstrated to provide a suitable means of investigation of underlying genetic effects (Cardon et al. 1994; Fisher et al. 1999; Gayán et al. 1999). A quantitative-trait approach does, however, create its own issues, perhaps the most pertinent of which is the selection of phenotypes for the appraisal of disorder severity. In the diagnosis of SLI, both ICD-10 and DSM-IV guidelines require a substantial discrepancy between PIQ and verbal abilities. Although the enforcement of discrepancy scores acts to aid the elimination of general IQ effects, these scores are generally felt to result in an overly restrictive phenotype, which is susceptible to compound errors. Also relevant to the current study is the finding that discrepancy scores show only a minimal level of heritability and, hence, may not reflect the underlying genetic influences involved in SLI (Bishop et al. 1995). For the genome screen, we therefore chose to employ broad phenotype batteries, alongside a single specific measurement of phonological short-term memory. All three traits have been demonstrated to be significantly heritable and good predictors of language abilities (Semel et al. 1992; Bishop et al. 1995, 1999).

The importance of phenotype selection has been confirmed by the results of the genome screen in the current study. Intriguingly, only minimal linkage is seen to measures of receptive language abilities—the strongest RLStrans result was seen for chromosome 2q and peaked at 1.52—a result consistent with the previously reported lack of probandwise concordance between twins with only receptive language impairments (Bishop et al. 1995). In contrast, measurements of expressive language skills and phonological short-term memory, both of which have been demonstrated to be subject to strong genetic influence (Bishop et al. 1995, 1999), yielded the two most significant linkage results in the genome screen. These two loci provide the only areas of suggestive linkage in the entire genome. The background level was generally low, with few regions yielding LOD scores >1.5 (table 5).

Furthermore, although all three phenotypes were found to be moderately correlated in our sample (table 4), at both peaks of linkage a discordance between all traits was apparent. Linkages to chromosomes 16 and 19 were seen to be specific to NWRtrans and ELStrans, respectively, with no corresponding peaks seen for the other measures (see fig. 3). However, studies of dyslexia (Grigorenko et al. 1997; Fisher et al. 1999) indicate that the dissection of a complex trait in such a simple manner is not always appropriate and that inferences relating specific loci to distinct components of language impairment should be viewed with caution (Fisher et al. 1999).

In genome screens for complex traits, it is not uncommon to see a shift between the original peak and replication peaks or to find linkage to phenotypes other than that originally reported (Cardon et al. 1994; Grigorenko et al. 1997; International Molecular Genetic Study of Autism Consortium 1998; Fagerheim et al. 1999; Fisher et al. 1999; Gayán et al. 1999; Philippe et al. 1999; Risch et al. 1999). Thus, the independent reproduction of the loci on chromosomes 16 and 19, in both the Guy's Hospital sample and the Cambridge sample, was particularly striking. The observation of linkage—in exactly the same region and to the same phenotypes—across two separate groups with such different origins provides further endorsement for the QTLs reported here.

The only previously reported linkage to SLI is that of the SPCH1 region on chromosome 7q in family KE (Fisher et al. 1998). Our genome screen shows no evidence for linkage to this area, indicating that it is unlikely to play a significant role in cases of typical SLI. However, given the heterogeneity of the disorder, it remains possible that a subset of individuals in the current study may harbor mutations in FOXP2, and it should be stressed that mutation analysis of the gene will be necessary to assess the full impact of this locus in our subjects with SLI.

The overlap between the SPCH1 and AUTS1 chromosome 7 linkages (International Molecular Genetic Study of Autism Consortium 1998; Lai et al. 2000) has fueled much debate with regard to the relationship between the genetic and phenotypic overlap of SLI and autism (Folstein and Mankoski 2000; Vincent et al. 2000; Warburton et al. 2000). Although some autistic children may develop language that is normal in terms of vocabulary, grammar, and phonology, they invariably encounter difficulties with the use of language in a social context (i.e., pragmatic language). It is estimated that one-third of autistic children never develop speech at all (Rapin 1997). In addition, the prevalence of autism in the siblings of children affected by SLI has often been reported to be increased compared with that in the general population (3%:0.1%) (Hafeman and Tomblin 1999). A recent genome screen for loci involved in autism has implicated a 19q region that is coincident with the chromosome 19 peak reported here (Lui et al. 2001). However, the autistic sample that showed the greatest evidence of linkage to this 19q locus (MLS 1.70) was from the narrow diagnostic group (i.e., that excluding children who may overlap into the SLI spectrum). We found no additional evidence for linkage to any other major loci (i.e., on chromosomes 2, 7, and 15) that previous studies had found to be associated with autism (reviewed by Lamb et al. 2000).

Another disorder that shows significant comorbidity with SLI is dyslexia (Bishop and Adams 1990; Catts 1993). The strong links of both dyslexia and SLI with phonological impairments have often led to the speculation that language impairments and reading disabilities may represent different manifestations of similar neurological deficits (Snowling et al. 2000). Relatives of probands affected by dyslexia experience an increased risk of language impairment (Gallagher et al. 2000), whereas studies of children selected for language impairments often report a high incidence of literacy problems (Tallal et al. 1989). However, the exact relationship between the two disorders remains undetermined. We found no evidence for linkage to regions of chromosomes 2, 6, 15, or 18, the regions previously implicated by genetic mapping studies of dyslexia (Cardon et al. 1994; Grigorenko et al. 1997; Fisher et al. 1999, 2002; Gayán et al. 1999). Further independent studies involving large sample sizes will be necessary to elucidate any common genetic mechanisms underlying the phenotypic overlaps between SLI and both autism and dyslexia.

QTL genome screens such as the one reported here are crucial to the study of disorders such as SLI, since they neither make prior assumptions about the basis of the disease nor target specific chromosomal regions for analysis. The current study has provided an overview of the whole genome with respect to language-related phenotypes and has highlighted two loci that appear to have a significant genetic effect on the development of SLI. This work represents the first major step in the clarification of the genetic mechanisms behind SLI, which may lead to a better understanding of the processes involved in language acquisition while also facilitating better diagnosis and treatment of individuals with language impairments.


We would like to thank all the families who have participated in the study, as well as the professionals who continue to make this study possible. We thank Michael Rutter, for initial discussions on the genetics of SLI; Lon Cardon, Dan Weeks, and Clyde Francks, for their statistical advice; all members of the Monaco lab, for their support and advice during the past 3 years; Jonathan Roiser, for his PEP work; all at the Newcomen and CLASP centers, for their involvement in the project; and Jane Addison, Claire Poppy, Deborah Jones, Tilly Storr, and Til Utting-Brown, for their assistance with data collection and management. All laboratory work and the collection of Guy’s families were funded by The Wellcome Trust. CLASP is funded by The Wellcome Trust, British Telecom, The Isaac Newton Trust, a National Health Service (NHS) Anglia & Oxford Regional R&D Strategic Investment Award, and an NHS Eastern Region R&D Training Fellowship Award. D. F. Newbury is funded by a Medical Research Council Studentship, and A. P. Monaco and D. V. M. Bishop are Wellcome Trust Principal Fellows.


Organizations (members) of the SLI Consortium are as follows: The Wellcome Trust Centre for Human Genetics (D. F. Newbury, J. D. Cleak, Y. Ishikawa-Brush, A. J. Marlow, S. E. Fisher, and A. P. Monaco), CLASP (C. M. Stott, M. J. Merricks, I. M. Goodyer, and P. F. Bolton), Newcomen Centre, Guy’s Hospital (L. Jannoun, V. Slonims, and G. Baird), School of Epidemiology & Health Science, The University of Manchester (A. Pickles), Department of Experimental Psychology, University of Oxford (D. V. M. Bishop), Human Communication and Deafness, School of Education, The University of Manchester (G. Conti-Ramsden), and Department of Child Health, Aberdeen University (P. J. Helms).


* Members of the consortium are listed in the Appendix.

Electronic-Database Information

Accession numbers and URLs for data in this article are as follows:

Center for Statistical Genetics, University of Michigan, http://www.sph.umich.edu/statgen/software (for SIBMED)
Division of Statistical Genetics, Department of Human Genetics, University of Pittsburgh, http://watson.hgen.pitt.edu/mega2.html (for Mega2 version 2.2)
Online Mendelian Inheritance in Man (OMIM), http://www.ncbi.nlm.nih.gov/Omim (for AUTS1 [MIM 209850], FOXP2 [MIM 606354], and SPCH1 [MIM 602081])
Rockefeller University “/software/simulate” web site, ftp://linkage.rockefeller.edu/software/simulate (for SIMULATE)
Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/distribution/software/sibs” web site, ftp://ftp-genome.wi.mit.edu/distribution/software/sibs (for MAPMAKER/SIBS)
Whitehead Institute for Biomedical Research/MIT Center for Genome Research “/pub/software/genehunter” web site, http://www-genome.wi.mit.edu/ftp/pub/software/genehunter/ (for GENEHUNTER 2.0)


Amos CI (1994) Robust variance-components approach for assessing genetic linkage in pedigrees. Am J Hum Genet 54:535–543
Baddeley A, Gathercole S, Papagno C (1998) The phonological loop as a language learning device. Psychol Rev 1:158–173
Baddeley A, Wilson BA (1993) A developmental deficit in short-term phonological memory: implications for language and reading. Memory 1:65–78
Beitchman JH, Brownlie EB, Inglis A, Wild J, Mathews R, Schachter D, Kroll R, Martin S, Ferguson B, Lancee W (1994) Seven-year follow-up of speech/language-impaired and control children: speech/language stability and outcome. J Am Acad Child Adolesc Psychiatry 33:1322–1330
Bishop DVM, Adams C (1990) A prospective study of the relationship between specific language impairment, phonological disorders and reading retardation. J Child Psychol Psychiatry 31:1027–1050
Bishop DVM, Bishop SJ, Bright P, James C, Delaney T, Tallal P (1999) Different origin of auditory and phonological processing problems in children with language impairment: evidence from a twin study. J Speech Lang Hear Res 42:155–168
Bishop DV, Edmundson A (1986) Is otitis media a major cause of specific developmental language disorders? Br J Disord Commun 21:321–338
Bishop DV, North T, Donlan C (1995) Genetic basis of specific language impairment: evidence from a twin study. Dev Med Child Neurol 37:56–71
Bishop DV, North T, Donlan C (1996) Nonword repetition as a behavioural marker for inherited language impairment: evidence from a twin study. J Child Psychol Psychiatry 37:391–403
Cantwell DP, Baker L (1987) Prevalence and type of psychiatric disorder and developmental disorders in three speech and language groups. J Commun Disord 20:151–160
Cardon LR, Smith SD, Fulker DW, Kimberling WJ, Pennington BF, DeFries JC (1994) Quantitative trait locus for reading disability on chromosome 6. Science 266:276–279
Catts HW (1993) The relationship between speech-language impairments and reading disabilities. J Speech Hear Res 36:948–958
Dib C, Faure S, Fizames C, Samson D, Drouot N, Vignal A, Millasseau P, Marc S, Hazan J, Seboun E, Lathrop M, Gyapay G, Morissette J, Weissenbach J (1996) A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380:152–154
Douglas JA, Boehnke M, Lange K (2000) A multipoint method for detecting genotyping errors and mutations in sibling-pair linkage data. Am J Hum Genet 66:1287–1297
Fagerheim T, Raeymaekers P, Tonnessen FE, Pedersen M, Tranebjaerg L, Lubs HA (1999) A new gene (DYX3) for dyslexia is located on chromosome 2. J Med Genet 36:664–669
Fisher SE, Francks C, Marlow AJ, MacPhie IL, Newbury DF, Cardon LR, Ishikawa-Brush Y, Richardson AJ, Talcott JB, Gayán J, Olson RK, Pennington BF, Smith SD, DeFries JC, Stein JF, Monaco AP (2002) Independent genome-wide scans identify a chromosome 18 quantitative-trait locus influencing dyslexia. Nat Genet 30 (in press)
Fisher SE, Marlow AJ, Lamb J, Maestrini E, Williams DF, Richardson AJ, Weeks DE, Stein JF, Monaco AP (1999) A quantitative-trait locus on chromosome 6p influences different aspects of developmental dyslexia. Am J Hum Genet 64:146–156
Fisher SE, Vargha-Khadem F, Watkins KE, Monaco AP, Pembrey ME (1998) Localisation of a gene implicated in a severe speech and language disorder. Nat Genet 18:168–170
Folstein SE, Mankoski RE (2000) Chromosome 7q: where autism meets language disorder? Am J Hum Genet 67:278–281
Gallagher A, Frith U, Snowling MJ (2000) Precursors of literacy delay among children at genetic risk of dyslexia. J Child Psychol Psychiatry 41:202–213
Gathercole SE, Willis C, Baddeley AD, Emslie H (1994) The children’s test of nonword repetition: a test of phonological working memory. Memory 2:103–127
Gayán J, Smith SD, Cherny SS, Cardon LR, Fulker DW, Brower AM, Olson RK, Pennington BF, DeFries JC (1999) Quantitative-trait locus for specific language and reading deficits on chromosome 6p. Am J Hum Genet 64:157–164
Grigorenko EL, Wood FB, Meyer MS, Hart LA, Speed WC, Shuster A, Pauls DL (1997) Susceptibility loci for distinct components of developmental dyslexia on chromosomes 6 and 15. Am J Hum Genet 60:27–39
Hafeman L, Tomblin B (1999) Autistic behaviours in the siblings of children with specific language impairment. Mol Psychiatry Suppl 4:S14
Haseman JK, Elston RC (1972) The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 2:3–19
Hodge SE (1984) The information contained in multiple sibling pairs. Genet Epidemiol 1:109–122
International Molecular Genetic Study of Autism Consortium (1998) A full genome screen for autism with evidence for linkage to a region on chromosome 7q. Hum Mol Genet 7:571–578
Kruglyak L, Daly MJ, Reeve-Daly MP, Lander ES (1996) Parametric and nonparametric linkage analysis: a unified multipoint approach. Am J Hum Genet 58:1347–1363
Kruglyak L, Lander ES (1995) Complete multipoint sib-pair analysis of qualitative and quantitative traits. Am J Hum Genet 57:439–454
Lai CSL, Fisher SE, Hurst JA, Levy ER, Hodgson S, Fox M, Jeremiah S, Povey S, Jamison DC, Green ED, Vargha-Khadem F, Monaco AP (2000) The SPCH1 region on human 7q31: genomic characterization of the critical interval and localization of translocations associated with speech and language disorder. Am J Hum Genet 67:357–368
Lai CSL, Fisher SE, Hurst JA, Vargha-Khadem F, Monaco AP (2001) A novel forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413:465–466
Lamb JA, Moore J, Bailey A, Monaco AP (2000) Autism: recent molecular advances. Hum Mol Genet 9:861–868
Lander ES, Kruglyak L (1995) Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet 11:241–247
Law J, Boyle J, Harris F, Harkness A, Nye C (1998) Screening for speech and language delay: a systematic review of the literature. Health Technol Assess 2:1–184
Lewis BA, Thompson LA (1992) A study of developmental speech and language disorders in twins. J Speech Hear Res 35:1086–1094
Lui J, Nyholt DR, Magnussen P, Parano E, Pavone P, Geschwind D, Lord C, Iversen P, Hoh J, the Autism Genetic Resource Exchange Consortium, Ott J, Gillam TC (2001) A genomewide screen for autism susceptibility loci. Am J Hum Genet 69:327–340
Mukhopadhyay N, Almasy L, Schroeder M, Mulvihill WP, Weeks DE (1999) Mega2, a data-handling program for facilitating genetic linkage and association analyses. Am J Hum Genet Suppl 65:A436
Neils J, Aram DM (1986) Family history of children with developmental language disorders. Percept Mot Skills 63:655–658
Philippe A, Martinez M, Guilloud-Bataille M, Gillberg C, Rastam M, Sponheim E, Coleman E, Zappella M, Aschauer H, van Malldergerme L, Penet C, Feingold J, Brice A, Leboyer M, the Paris Autism Research International Sibpair Study (1999) Genome-wide scan for autism susceptibility genes. Hum Mol Genet 8:805–812
Pratt SC, Daly MJ, Kruglyak L (2000) Exact multipoint quantitative-trait linkage analysis in pedigrees by variance components. Am J Hum Genet 66:1153–1157
Rapin I (1997) Current concepts: autism. N Engl J Med 337:97–104
Reed PW, Davies JL, Copeman JB, Bennet ST, Palmer SM, Pritchard LE, Gough SC, et al (1994) Chromosome-specific microsatellite sets for fluorescence-based, semi-automated genome mapping. Nat Genet 7:390–395
Risch N, Spiker D, Lotspeich L, Nouri N, Hinds D, Hallmayer J, Kalaydjieva L, et al (1999) A genomic screen of autism: evidence for a multilocus etiology. Am J Hum Genet 65:493–507
Semel EM, Wiig EH, Secord W (1992) Clinical Evaluation of Language Fundamentals–Revised. Psychological Corporation, San Antonio
Snowling MJ, Adams JW, Bishop DV, Stothard SE (2001) Educational attainments of school leavers with a preschool history of speech-language impairments. Int J Lang Commun Disord 36:173–183
Snowling M, Bishop DV, Stothard SE (2000) Is preschool language impairment a risk factor for dyslexia in adolescence? J Child Psychol Psychiatry 41:587–600
Stevenson J, Richman N (1976) The prevalence of language delay in a population of three-year-old children and its association with general retardation. Dev Med Child Neurol 18:431–441
Stott CM, Merricks MJ, Bolton PF, Goodyer IM. Screening for speech and language disorders: the reliability, validity and accuracy of the general language screen. Int J Lang Commun Disord (in press)
Stromswold K (1998) Genetics of spoken language disorders. Hum Biol 70:297–324
Tallal P, Ross R, Curtiss S (1989) Familial aggregation in specific language impairment. J Speech Hear Disord 54:167–173
Tomblin JB, Buckwalter PR (1998) Heritability of poor language achievement among twins. J Speech Lang Hear Res 41:188–199
Tomblin JB, Records NL, Zhang X (1996) A system for the diagnosis of specific language impairment in kindergarten children. J Speech Hear Res 39:1284–1294
Vincent JB, Herbrick J-A, Gurling HMD, Bolton PF, Roberts W, Scherer SW (2000) Identification of a novel gene on chromosome 7q31 that is interrupted by a translocation breakpoint in an autistic individual. Am J Hum Genet 67:510–514
Warburton P, Baird G, Chen W, Morris K, Jacobs BW, Hodgson S, Docherty Z (2000) Support for linkage of autism and specific language impairment to 7q3 from two chromosome rearrangements involving band 7q31. Am J Med Genet 96:228–234
Wechsler D (1992) Wechsler Intelligence Scale for Children–Third UK Edition. Psychological Corporation, London
Zhang L, Cui X, Schmitt K, Hubert R, Navidi W, Arnheim N (1992) Whole genome amplification from a single cell: implications for genetic analysis. Proc Natl Acad Sci USA 89:5847–5851

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics