Learning words, word frequency, graded readers and more.

To me the major task in language learning is the acquisition of vocabulary. If this is done through massive listening and reading, it will naturally bring with it a constantly improving familiarity with the language, making it easier and easier to understand grammar explanations, and eventually making it possible to learn to express oneself in writing or orally. Vocabulary is the key, in my view. The more vocabulary we learn, the more we can acquire. In vocabulary, the rich get richer and the poor stay poor.

Michael Lewis, with his “lexical approach”, was one of the early proponents of the primacy of vocabulary. Here is a good summary of this approach by David Overton. Lewis stresses the importance of chunks of language: groupings of words, or collocations, that the native speaker naturally throws together. He proposes using exercises to increase the learner’s awareness of these.

This is about the only place I disagree with Lewis. I prefer to read and listen, without any exercises. In my experience, when I start learning a language, I am more interested in individual words. I need them to make any sense of what I am reading. I need lots of words. I live with the fact that the combination of the words does not always make sense, and that I do not always understand the colloquial phrases or chunks. I just keep reading and listening until my overall familiarity with the vocabulary and the language reaches a point where I am able to start focusing on chunks, collocations, etc., mining them from my reading and listening. At LingQ this means that I start saving more phrases. I am at that phase now in Czech. The saving or learning of chunks and collocations is particularly important in the transition from input to output, from passive knowledge to active use.

Another popular area of study relating to vocabulary acquisition is the issue of word frequency. It is often stated that we should focus on learning the most common 2,000 or so words of a language (English is usually the example used) since these account for 80% or so of most contexts. This is where graded readers are usually recommended.

Specialists in vocabulary acquisition like Paul Nation and Batia Laufer have calculated that you need 3,000 word families to feel even somewhat comfortable reading, and 5,000 to be comfortable in most situations. This is based on the assumption that you should be reading texts with 98% known words. Many learners like to read graded readers which use simplified language with a low percentage of uncommon words.
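To make the 98% figure concrete, here is a minimal sketch of how the share of known words in a text could be measured. It assumes coverage is counted over running words (every occurrence counts), and the function name `known_word_coverage` and the simple tokenizer are my own illustrative choices, not how LingQ or any researcher actually computes it.

```python
import re


def known_word_coverage(text, known_words):
    """Return the percentage of running words in `text` found in `known_words`.

    Assumption: words are compared in lowercase, and every occurrence of a
    word counts (token coverage), so common words weigh more than rare ones.
    """
    # Runs of letters, optionally with one internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return 100.0 * known / len(tokens)


# Hypothetical example: five of the six running words are known.
sample = "The cat sat on the mat"
vocabulary = {"the", "cat", "on", "mat"}
coverage = known_word_coverage(sample, vocabulary)
print(f"{coverage:.1f}% known, {100 - coverage:.1f}% new")
```

By this measure a text meets the 98% threshold only when no more than one running word in fifty is unknown.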

I am not a proponent of this approach past the very first month or so. I feel that in whatever I read I will encounter the most common words often enough to learn them. However, if I stick to only 2% new words I will take forever to build the vocabulary I need.

With online dictionaries, and other programs that assist with the acquisition of vocabulary, and with the availability of audio to help the struggling reader, I recommend that learners attack difficult texts as soon as possible. This was the approach of famous Hungarian polyglot Kato Lomb and I heartily agree.

When I import a text into LingQ from a Czech newspaper today, I usually find only about 15-20% new words. It was 60% when I started two months ago. I am reading and listening to Karel Capek’s delightful notes on a visit to England, and there are about 30% new words. But in both cases this includes a lot of names, so the actual number of new words is quite a bit less. I don’t know what my true vocabulary is in Czech, but I would imagine it is 6,000-7,000 words, if not more. The LingQ system tells me that I know just under 16,000 words and have saved just under 13,000 words (LingQs). This is after two and a half months. If I were using graded readers I would have far fewer words at this point. I would probably be closer to being able to take part in a simple dialogue, however. But I am not motivated to do that. I will start talking in a few months when I am better able to understand the normal conversation of a native speaker, and for that I will need lots of words.

One last point is the distinction between word families and individual words. The words find, finds, finding and findings are counted as four separate words at LingQ, although they belong to a single word family. According to Nation and Laufer, 3,000 word families are equivalent to about 5,000 individual words in English, 5,000 families to about 8,000 words, and so on. In English, then, a LingQ word count should be divided by roughly 1.6 to estimate word families. In other, more inflected languages the LingQ count needs to be divided by more.
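The conversion is simple arithmetic, sketched below. The English divisor of 1.6 comes from the figures above; the function name is hypothetical, and the divisor of 2.5 used for Czech is purely an illustrative guess, since no figure for inflected languages is given here.

```python
def estimate_word_families(word_count, divisor=1.6):
    """Estimate word families from a count of individual word forms.

    The default divisor of 1.6 is the rough English ratio quoted above.
    More inflected languages need a larger divisor, which must be guessed.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return round(word_count / divisor)


# English: 8,000 individual words correspond to about 5,000 word families.
print(estimate_word_families(8000))

# Czech, with an assumed (not sourced) divisor of 2.5 for heavier inflection.
print(estimate_word_families(16000, divisor=2.5))
```

The result is only as good as the divisor, so for any language other than English it should be read as a rough order of magnitude.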

One more thing: if you would like to test your word level in English, try this website.

3 comments on “Learning words, word frequency, graded readers and more.”

    Try Google. Here is one answer from Google:

    Word families are groups of words that have a common feature or pattern – they have some of the same combinations of letters in them and a similar sound. For example, at, cat, hat, and fat are a family of words with the “at” sound and letter combination in common.


Dear Steve, I would like to know how long in particular would it take to achieve native-like fluency and pronunciation by lots of exposure and native interaction? Remember I am a young teen.
