Chaotic Evolution in an Unbiased Learner

Adam Albright

Abstract

A premise of evolutionary (or channel-based) explanations of phonological typology is that isolated misproductions or miscategorizations may cause the incoming speech signal to deviate from the speaker's original intent. For example, target /np/ may be recovered as [mp] due to articulatory overlap and the difficulty of distinguishing coarticulated [np] from [mp]. Over time, these deviations are assumed to create statistically significant patterns corresponding to crosslinguistically common processes, which may then be acquired and reinforced by learners even in the absence of intrinsic biases towards typologically common patterns. Numerous studies have investigated whether humans behave like unbiased learners (infants: Seidl & Buckley 2005; Gerken & Bollt, in press; adults: Pycha et al. 2003; Wilson 2003, 2006; Koo & Cole 2006; Finley & Badecker 2007). Rather less attention has been paid to an important prior question: is a series of individual channel events (misproductions and misperceptions) actually sufficient to create the statistical patterns that are observed typologically? Concretely, given an initial state with a difficult contrast such as [np] vs. [mp], would misperceptions of /np/ as [mp] gradually accumulate to create a pattern of nasal place assimilation? In this paper, the author reports a series of computational simulations designed to address this question. An unbiased inductive learner was used to investigate what patterns might arise in languages partway through a phonetically motivated change. The properties of languages at a stage with a 3:1 preference for [mp] were explored by generating 1,000 artificial lexicons, each containing 50 words with nasal+[p] clusters (37-38 [mp] words, 12-13 [np] words). In all of these languages, there is a strong (75%) tendency for labial nasals before [p] (nasal place assimilation). The question of interest is whether there are even stronger statistical patterns, due to coincidences elsewhere in the word.
To test this, the author submitted all 1,000 languages to an inductive model of phonological constraint discovery, which compares words that share a particular property (such as [n] or [m]) to determine the best predictors in the surrounding phonological context (such as a following [p]). In 678 of the 1,000 languages, the algorithm found specific contexts that were more reliable predictors of nasal place than the general pre-[p] pattern. Thus, it appears that a powerful unbiased learner may find patterns that "side-track" natural changes, diverting them to create unnatural distributions. The author discusses several learning biases that would prevent this effect and that may be important in deriving the observed phonological typology.
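
The setup described above can be illustrated with a toy sketch. This is not the author's model of constraint discovery; it is a minimal stand-in that assumes a hypothetical CVNpV word shape, an invented segment inventory, and a simple reliability count (the share of words in a context that have the majority nasal), with an arbitrary minimum of five supporting words. It only shows how accidental contexts can outscore the baseline 3:1 pattern in a small lexicon.

```python
import random
from collections import Counter, defaultdict

# Hypothetical inventory and word shape (C V N p V); not taken from the paper.
CONSONANTS = list("tkslr")
VOWELS = list("aeiou")

def make_lexicon(rng, n_words=50):
    """Build one artificial lexicon with a roughly 3:1 preference for [mp]."""
    n_labial = rng.choice([37, 38])  # 37-38 [mp] words, the rest [np]
    nasals = ["m"] * n_labial + ["n"] * (n_words - n_labial)
    rng.shuffle(nasals)
    return [
        (rng.choice(CONSONANTS), rng.choice(VOWELS), nasal, "p", rng.choice(VOWELS))
        for nasal in nasals
    ]

def reliable_contexts(lexicon, min_support=5):
    """Return the baseline reliability and any contexts that beat it.

    Reliability of a context = share of its words with the majority nasal.
    min_support is an assumed threshold, chosen only for illustration.
    """
    nasal_counts = Counter(word[2] for word in lexicon)
    baseline = max(nasal_counts.values()) / len(lexicon)

    by_context = defaultdict(Counter)
    for onset, v1, nasal, _, v2 in lexicon:
        for context in (("onset", onset), ("v1", v1), ("v2", v2)):
            by_context[context][nasal] += 1

    found = []
    for context, counts in by_context.items():
        total = sum(counts.values())
        nasal, hits = counts.most_common(1)[0]
        if total >= min_support and hits / total > baseline:
            found.append((context, nasal, hits, total))
    return baseline, found

def fraction_sidetracked(n_languages=1000, seed=0):
    """Share of random lexicons with at least one context beating the baseline."""
    rng = random.Random(seed)
    hits = sum(bool(reliable_contexts(make_lexicon(rng))[1]) for _ in range(n_languages))
    return hits / n_languages
```

Calling `fraction_sidetracked()` gives the proportion of toy lexicons in which some accidental context predicts nasal place better than the following [p] alone. The figure depends entirely on the assumed inventory and threshold and should not be compared with the 678/1000 result reported above.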

Paper 1813

Published in