
Paper 1091

Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish
Mark Davies
Pages 106-115


A fundamental issue facing language learners and teachers is how to maximize vocabulary acquisition by focusing on the words that the learner is most likely to encounter. This paper addresses that question using data from the forthcoming Routledge Frequency Dictionary of Spanish. This is the first large-scale frequency dictionary of Spanish in more than forty years, and the first to be based on a large corpus (20 million words) with sub-corpora of equivalent size for spoken Spanish, fiction, and non-fiction. The corpus was tagged and lemmatized, and the 6000 most frequent lemmas were then selected on the basis of overall frequency, range, and dispersion throughout the corpus. The data indicate that learners who have mastered approximately 4000 lemmas will be able to recognize about 90% of all tokens in spoken Spanish, whereas approximately 7000 lemmas are needed for 90% coverage in fiction, and 8000 lemmas for non-fiction.
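
The coverage figures above can be read as a simple ratio: the share of running tokens in a text whose lemma falls within the top N entries of a ranked lemma list. The following is a minimal sketch of that calculation, not the procedure used in the paper. The function names `coverage` and `lemmas_needed` are hypothetical, the toy data is invented for illustration, and the input is assumed to be already lemmatized.

```python
from collections import Counter


def coverage(ranked_lemmas, lemma_tokens, top_n):
    """Fraction of tokens whose lemma is among the first top_n ranked lemmas."""
    if not lemma_tokens:
        return 0.0
    known = set(ranked_lemmas[:top_n])
    hits = sum(1 for lemma in lemma_tokens if lemma in known)
    return hits / len(lemma_tokens)


def lemmas_needed(ranked_lemmas, lemma_tokens, target=0.90):
    """Smallest N such that the top N ranked lemmas reach the target coverage.

    Returns None if the ranked list never reaches the target.
    """
    total = len(lemma_tokens)
    if total == 0:
        return None
    counts = Counter(lemma_tokens)
    covered = 0
    for n, lemma in enumerate(ranked_lemmas, start=1):
        covered += counts.get(lemma, 0)  # tokens gained by learning this lemma
        if covered / total >= target:
            return n
    return None


# Toy example (invented data): a ranked list and a ten-token lemmatized text.
ranked = ["el", "de", "ser", "casa"]
tokens = ["el", "casa", "ser", "de", "el", "perro", "el", "de", "ser", "el"]

print(coverage(ranked, tokens, 2))
print(lemmas_needed(ranked, tokens, 0.90))
```

Under this reading, running `lemmas_needed` against separate spoken, fiction, and non-fiction samples with one shared ranked list would give one threshold per register, which is the kind of comparison the abstract reports.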

Published in

Selected Proceedings of the 7th Hispanic Linguistics Symposium
edited by David Eddington