PLoS One. 2014 Nov 19;9(11):e113114. doi: 10.1371/journal.pone.0113114. eCollection 2014.

Diffusion of lexical change in social media.

Author information

1. School of Interactive Computing, Georgia Institute of Technology, Atlanta, Georgia, United States of America.
2. School of Computer Science, University of Massachusetts, Amherst, Massachusetts, United States of America.
3. School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America.

Abstract

Computer-mediated communication is driving fundamental changes in the nature of written language. We investigate these changes by statistical analysis of a dataset comprising 107 million Twitter messages (authored by 2.7 million unique user accounts). Using a latent vector autoregressive model to aggregate across thousands of words, we identify high-level patterns in diffusion of linguistic change over the United States. Our model is robust to unpredictable changes in Twitter's sampling rate, and provides a probabilistic characterization of the relationship of macro-scale linguistic influence to a set of demographic and geographic predictors. The results of this analysis offer support for prior arguments that focus on geographical proximity and population size. However, demographic similarity - especially with regard to race - plays an even more central role, as cities with similar racial demographics are far more likely to share linguistic influence. Rather than moving towards a single unified "netspeak" dialect, language evolution in computer-mediated communication reproduces existing fault lines in spoken American English.

PMID: 25409166
PMCID: PMC4237389
DOI: 10.1371/journal.pone.0113114
[Indexed for MEDLINE]
Free PMC Article
