Behav Res Methods. 2019 Oct;51(5):2152-2179. doi: 10.3758/s13428-019-01282-6.

LADEC: The Large Database of English Compounds.

Author information

1. Department of Psychology, University of Alberta, Edmonton, Alberta, Canada. cgagne@ualberta.ca.
2. Department of Psychology, University of Alberta, Edmonton, Alberta, Canada.
3. Department of Linguistics and Languages, McMaster University, Hamilton, Ontario, Canada.

Abstract

The Large Database of English Compounds (LADEC) consists of over 8,000 English words that can be parsed into two constituents that are free morphemes, making it the largest existing database specifically for use in research on compound words. Both monomorphemic (e.g., wheel) and multimorphemic (e.g., teacher) constituents were used. The items were selected from a range of sources, including CELEX, the English Lexicon Project, the British Lexicon Project, the British National Corpus, and WordNet, and were hand-coded as compounds (e.g., snowball). Participants rated each compound in terms of how predictable its meaning is from its parts, as well as the extent to which each constituent retains its meaning in the compound. In addition, we obtained linguistic characteristics that might influence compound processing (e.g., frequency, family size, and bigram frequency). To show the usefulness of the database in investigating compound processing, we conducted a number of analyses that showed that compound processing is consistently affected by semantic transparency, as well as by many of the other variables included in LADEC. We also showed that the effects of the variables associated with the two constituents are not symmetric. In short, LADEC provides the opportunity for researchers to investigate a number of questions about compounds that have not been possible to investigate in the past, due to the lack of sufficiently large and robust datasets. In addition to directly allowing researchers to test hypotheses using the information included in LADEC, the database will contribute to future compound research by allowing better stimulus selection and matching.

KEYWORDS: Bigram frequency; Compound words; Family size; Morphology; Psycholinguistics; Semantic transparency; Sentiment

PMID: 31347038
DOI: 10.3758/s13428-019-01282-6
