Please use this identifier to cite or link to this item: http://dx.doi.org/10.23668/psycharchives.3079
Title: The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions
Authors: Lavi-Rotbain, Ori
Arnon, Inbal
Issue Date: Jun-2020
Publisher: PsychArchives
Abstract: One of the striking commonalities between languages is the way word frequencies are distributed. Across languages, word frequencies follow a Zipfian distribution, showing a power-law relation between a word's frequency and its rank (Zipf, 1949). Intuitively, this means that languages have relatively few high-frequency words and many low-frequency ones. While such distributions have been studied extensively, little work has explored the learnability consequences of the greater predictability of words in them. Here, we propose that such distributions confer a learnability advantage for word segmentation, a foundational aspect of language acquisition. We capture the greater predictability of words using the information-theoretic notion of efficiency, which tells us how predictable a distribution is relative to a uniform one. We first use corpus analyses to show that child-directed speech is similarly predictable across fifteen different languages. We then experimentally investigate the impact of distribution predictability on children and adults. We show that word segmentation is uniquely facilitated at the predictability levels found in language, compared both with uniform distributions and with skewed distributions that are less predictable than those of natural language. We further show that distribution predictability impacts learning more than distribution shape, and that learning is not improved further in distributions more predictable than natural language. These novel findings illustrate learners' sensitivity to the overall predictability of the linguistic environment; suggest that the predictability levels found in language provide an optimal environment for learning; and point to the possible role of cognitive pressures in the emergence and propensity of such distributions in language.
URI: https://hdl.handle.net/20.500.12034/2693.2
http://dx.doi.org/10.23668/psycharchives.3079
Citation: Lavi-Rotbain, O., & Arnon, I. (2020). The learnability consequences of Zipfian distributions: Word Segmentation is Facilitated in More Predictable Distributions. PsychArchives. https://doi.org/10.23668/PSYCHARCHIVES.3079
Appears in Collections: Preprint

Files in This Item:
File | Description | Size | Format
Zipfian-cognitive-advantage-psycharchive.pdf | | 567,99 kB | Adobe PDF

Version History
Version Item Date Summary
2 10.23668/psycharchives.3079 2020-06-16 14:35:44.448 This version was created to correct errors in the graphs of the original version.
1 10.23668/psycharchives.3075 2020-06-09 16:29:57.0

This item is licensed under a Creative Commons License.