Sampling Assumptions Affect Use of Indirect Negative Evidence in Language Learning

PLoS One. 2016 Jun 16;11(6):e0156597. doi: 10.1371/journal.pone.0156597. eCollection 2016.

Abstract

A classic debate in cognitive science revolves around understanding how children learn complex linguistic patterns, such as restrictions on verb alternations and contractions, without negative evidence. Recently, probabilistic models of language learning have been applied to this problem, framing it as statistical inference from a random sample of sentences. These probabilistic models predict that learners should be sensitive to the way in which sentences are sampled. There are two main types of sampling assumptions that can operate in language learning: strong and weak sampling. Strong sampling, as assumed by probabilistic models, treats the learning input as drawn from a distribution of grammatical samples from the underlying language and aims to learn this distribution. Thus, under strong sampling, the absence of a sentence construction from the input provides evidence that it has low or zero probability of grammaticality. Weak sampling does not make assumptions about the distribution from which the input is drawn, and thus the absence of a construction from the input is not used as evidence of its ungrammaticality. We demonstrate in a series of artificial language learning experiments that adults can produce behavior consistent with both sets of sampling assumptions, depending on how the learning problem is presented. These results suggest that people use information about the way in which linguistic input is sampled to guide their learning.
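
The contrast between the two sampling assumptions can be illustrated with a toy Bayesian calculation. The sketch below is not the model used in the study; it assumes a hypothetical learner choosing between a "restricted" grammar that permits one construction and a "general" grammar that permits two, and it assumes that under strong sampling each permitted construction is equally likely to be produced. The function name, the two-grammar setup, and the uniform prior are all illustrative assumptions.

```python
from fractions import Fraction


def posterior_restricted(n_observations, sampling, prior_restricted=Fraction(1, 2)):
    """Posterior probability of the restricted grammar (toy illustration).

    Assumed setup, not taken from the paper: the learner has seen
    n_observations sentences, all using construction A and none using
    construction B. The restricted grammar permits only A; the general
    grammar permits both A and B.
    """
    if sampling == "strong":
        # Strong sampling: sentences are drawn from the grammar itself,
        # here assumed uniform over the constructions it permits.
        like_restricted = Fraction(1, 1) ** n_observations
        like_general = Fraction(1, 2) ** n_observations
    elif sampling == "weak":
        # Weak sampling: no assumption about how the input was generated,
        # so every grammar consistent with the data gets the same likelihood.
        like_restricted = Fraction(1)
        like_general = Fraction(1)
    else:
        raise ValueError("sampling must be 'strong' or 'weak'")

    numerator = prior_restricted * like_restricted
    denominator = numerator + (1 - prior_restricted) * like_general
    return numerator / denominator


if __name__ == "__main__":
    for n in (0, 1, 3, 10):
        strong = posterior_restricted(n, "strong")
        weak = posterior_restricted(n, "weak")
        print(f"n={n:2d}  strong={float(strong):.4f}  weak={float(weak):.4f}")
```

Under these assumptions, the strong-sampling learner grows more confident in the restricted grammar with every sentence that fails to use construction B, whereas the weak-sampling learner's belief stays at the prior, because the absence of B carries no information.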

MeSH terms

  • Adult
  • Comprehension / physiology*
  • Female
  • Humans
  • Language Development
  • Language*
  • Learning / physiology*
  • Linguistics / methods
  • Linguistics / statistics & numerical data*
  • Male
  • Models, Statistical*

Grants and funding

This work was supported by grant number RES-000-22-3275 from the Economic and Social Research Council (http://www.esrc.ac.uk/) awarded to AH and grant number SES-0631518 from the National Science Foundation (http://www.nsf.gov/) awarded to TG. The funders were not involved in the study in any way.