Cortex. 2014 Jun;55:182-91. doi: 10.1016/j.cortex.2013.12.004. Epub 2013 Dec 20.

Category fluency, latent semantic analysis and schizophrenia: a candidate gene approach.

Author information

Neuropsychiatric Genetics Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James Hospital, Dublin, Ireland.
Psychiatry Research Group, Department of Clinical Medicine, University of Tromsø, Norway; Norwegian Centre for Integrated Care and Telemedicine (NST), University Hospital of North Norway, Tromsø, Norway.
Pearson Knowledge Technologies, Boulder, CO, USA; Institute for Cognitive Science, University of Colorado, Boulder, CO, USA.
Pearson Knowledge Technologies, Boulder, CO, USA.
Clinical Brain Disorders Branch, National Institute of Mental Health/NIH, Bethesda, MD, USA.
Clinical Brain Disorders Branch, National Institute of Mental Health/NIH, Bethesda, MD, USA; Lieber Institute for Brain Development, Baltimore, MD, USA; Departments of Psychiatry, Neurology, Neuroscience and The Institute of Genomic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.



Category fluency is a widely used task that relies on multiple neurocognitive processes and is a sensitive assay of cortical dysfunction, including in schizophrenia. The test requires naming as many words belonging to a given category (e.g., animals) as possible within a short period of time. The core metrics are the overall number of words produced and the number of errors, namely non-members generated for a target category. We combine a computational linguistic approach with a candidate gene approach to examine the genetic architecture of this traditional fluency measure.


In addition to the standard metric of overall word count, we applied a computational approach to semantics, Latent Semantic Analysis (LSA), to analyse the clustering pattern of the categories generated, as it likely reflects the search in memory for meanings. Since fluency performance probably also recruits verbal learning and recall processes, we included two standard measures of these processes: the Wechsler Memory Scale and the California Verbal Learning Test (CVLT). To explore the genetic architecture of traditional and LSA-derived fluency measures, we employed a candidate gene approach focused on SNPs with known function that were available from a recent genome-wide association study (GWAS) of schizophrenia. The selected candidate genes were associated with language and speech, verbal learning and recall processes, and processing speed. A total of 39 coding SNPs were included for analysis in 665 subjects.
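
To make the idea of an LSA-derived clustering measure concrete, the following is a minimal sketch, not the procedure used in the study. It assumes each word already has a semantic vector (in real LSA these come from a singular value decomposition of a term-by-document matrix), and it scores a response list by the cosine similarity between consecutive words. The toy vectors, the function names and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math

# Toy 3-dimensional "semantic" vectors, invented for illustration only.
# Real LSA vectors would be derived from a large text corpus.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.0],
    "shark": [0.1, 0.9, 0.1],
    "whale": [0.2, 0.8, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarities(words, vectors):
    """Similarity of each consecutive pair of words that have a vector."""
    known = [w for w in words if w in vectors]
    return [cosine(vectors[a], vectors[b]) for a, b in zip(known, known[1:])]


def count_switches(similarities, threshold=0.5):
    """Count transitions whose similarity falls below a (hypothetical) threshold."""
    return sum(1 for s in similarities if s < threshold)


# A hypothetical animal-fluency response: two land pets, then two sea animals.
response = ["dog", "cat", "shark", "whale"]
sims = pairwise_similarities(response, TOY_VECTORS)
switches = count_switches(sims)
```

In this toy response the similarity is high within each pair of related animals and drops at the move from "cat" to "shark", so one cluster switch is counted. Summary values of this kind (mean similarity, number of switches) are examples of what a content-based fluency metric could look like alongside the raw word count.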


Given the modest sample size, the results should be regarded as exploratory and preliminary. Nevertheless, the data clearly illustrate how extracting the meaning from participants' responses, by analysing the actual content of words, generates useful and neurocognitively viable metrics. We discuss three replicated SNPs in the genes ZNF804A, DISC1 and KIAA0319, as well as the potential for computational analyses of linguistic and textual data in other genomics tasks.


Cognition; Gene; Latent semantic analysis; Schizophrenia; Verbal learning and recall

[Indexed for MEDLINE]
Free PMC Article