Cascadilla Proceedings Project: Paper 1091 Abstract



Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish
Mark Davies
Pages 106-115

A fundamental challenge for language learners and teachers is to make vocabulary acquisition as efficient as possible by focusing on the words that the learner is most likely to encounter. This paper offers insight into this issue, based on data from the forthcoming Routledge Frequency Dictionary of Spanish. This dictionary is the first large-scale frequency dictionary of Spanish in more than forty years, and the first to be based on a large corpus (20 million words) with equally sized sub-corpora of spoken Spanish, fiction, and non-fiction. The corpus was tagged and lemmatized, and the 6000 most frequent lemmas were then selected on the basis of overall frequency, range, and dispersion throughout the corpus. The data indicate that learners who have mastered approximately 4000 lemmas will be able to recognize about 90% of all tokens in spoken Spanish, whereas approximately 7000 lemmas are needed for 90% coverage in fiction, and 8000 lemmas for non-fiction.
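As a rough illustration of what "text coverage" means here, the sketch below counts how many of the most frequent lemmas are needed before their combined token count reaches a target share of a text. It is only a minimal sketch under stated assumptions: the function name `lemmas_needed` and the toy token list are hypothetical, the input is assumed to be already lemmatized, and the ranking uses raw frequency alone, whereas the dictionary described above also takes range and dispersion into account.

```python
from collections import Counter
from itertools import accumulate


def lemmas_needed(lemma_tokens, target=0.90):
    """Return how many top-frequency lemmas cover `target` of all tokens.

    `lemma_tokens` is assumed to be a sequence of already lemmatized tokens.
    This ranks lemmas by raw frequency only (a simplification).
    """
    counts = Counter(lemma_tokens)
    total = sum(counts.values())
    # Frequencies from most to least frequent, accumulated into running totals.
    freqs = sorted(counts.values(), reverse=True)
    for rank, running in enumerate(accumulate(freqs), start=1):
        if running / total >= target:
            return rank
    # Reached only for empty input (returns 0) or a target above 1.0.
    return len(freqs)


# Toy, invented data: 10 tokens spread over 4 lemmas (not from the corpus).
toy_tokens = ["ser"] * 5 + ["de"] * 3 + ["casa"] + ["libro"]
needed = lemmas_needed(toy_tokens, target=0.80)
```

With the toy list, the two most frequent lemmas account for 8 of the 10 tokens, so `needed` is 2 at an 80% target. Applied separately to spoken, fiction, and non-fiction samples, the same calculation would yield the kind of per-register comparison the abstract reports.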



Published in:
Selected Proceedings of the 7th Hispanic Linguistics Symposium
edited by David Eddington


ISBN 978-1-57473-403-4 library binding
v + 202 pages
publication date: 2005
published by Cascadilla Proceedings Project, Somerville, MA, USA

Printed edition: $230.00



Copyright © 2009 Cascadilla Proceedings Project. All rights reserved. To request permission to copy any elements from our pages, or to send comments or questions about our pages, please write to webmaster@cascadilla.com and make sure to provide the URL of the particular page.