The inescapable case for extensive reading
Rob Waring, Notre Dame Seishin University, Okayama, Japan
In his article, Dr. Rob Waring discusses the necessity for Extensive Reading and Extensive Listening in all language programs. The article reviews recent vocabulary research and shows that learners need to meet massive amounts of language to learn not only single words but also their collocations, register and so forth. The article demonstrates that neither intentional learning nor course books (especially linear-based ones) can cover the vast volume of text the learners need to meet without Extensive Reading. He shows that learners need to gain their own sense of language and that this cannot be gained from learning discrete language points alone; rather, it must, and can only, come from massive exposure in tandem with course books.
This paper puts forward the idea that graded reading, or extensive reading, is a completely indispensable part of any language program, if not all language programs.
1. The amount of language to be learnt
Let us first look at the vocabulary. We know from vocabulary research that English is made up of a very few extremely common words which comprise the bulk of the language. In written text, we know that about 2000 word families cover about 85-90% of the running words in general texts and that 50% of any text will be function words (Nation 2001). We also know that to read a native novel, a newspaper or a magazine with 98% vocabulary coverage, a learner would need to know about 8000 word families. But how should these words be learnt? And what do we mean by “learning”?
One of the few things language researchers can agree about is that learners can learn words from reading provided the reading is comprehensible. They may, though, disagree over the uptake rates and the types of texts to be used. Determining uptake rates is a vital component in the overall picture of vocabulary learning because these rates affect how much text learners need to meet, and over what time period the learning should take place.
One of the main factors affecting learnability is the ratio of unknown to known words in a text. The denser a text is (the more unknown words it has), the less likely it is that incidental learning can occur. Liu Na and Nation (1985) and Hu and Nation (1999) suggest that the optimal known-word coverage rate is about 95-99% for there to be a good chance that learning can take place.
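To make these coverage figures concrete, the following minimal Python sketch converts a known-word coverage percentage into the average number of running words per unknown word. The function name is hypothetical and introduced only for illustration, and the sketch assumes that unknown words are spread evenly through the text, which real texts do not guarantee.

```python
def unknown_word_interval(coverage_percent: int) -> float:
    """Average number of running words per unknown word at a given coverage.

    Illustrative assumption: unknown words are evenly spread through the text.
    """
    unknown_percent = 100 - coverage_percent
    if unknown_percent <= 0:
        raise ValueError("coverage must be below 100% for unknown words to occur")
    return 100 / unknown_percent


# The two ends of the 95-99% range, plus the 98% figure often cited for reading.
for coverage in (95, 98, 99):
    interval = unknown_word_interval(coverage)
    print(f"{coverage}% coverage: about 1 unknown word in every {interval:.0f}")
```

On this reading, 95% coverage corresponds to one unknown word in every 20 running words, and 99% coverage to one in every 100.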
Laufer (1989), Nation (2001) and many others have shown that unless we have about 98-99% coverage of the other words in the text, the chance that an unknown word will be learnt is minimal. This means that there should be at most one new word in 40, or 1 in 50, for the right conditions for incidental vocabulary learning. The coverage figures for learning from listening appear to be even higher due to the transitory nature of listening (Brown – Waring – Donkaewbua 2008).
Uptake rates also depend on the opportunities for learning, that is, the number of times an unknown word appears in a given text and how closely spaced its occurrences are, so that knowledge can be retained in memory before it is lost. It is pertinent to look at the opportunity that learners have for learning from natural text because this can tell us how words are spaced in the language. Moreover, these data, combined with the uptake rates stated above, can help us determine whether incidental learning of vocabulary from reading is efficient enough to be a major vocabulary learning strategy.
Table 1 shows the frequency at which words occur in a 50 million word sub-corpus (both written and spoken) of the British National Corpus (BNC) of English.
The most frequent word in English (the) covers 5.839% of any general English text (i.e. it occurs once in every 17 words) (see (1) in the table). The 2000th most frequent word in English covers 0.00432% of any general English text and occurs once every 23,103 words (2). Note that when the learner meets the 2000th most frequent word in English, this means that all the previous 1999 words have also been met at least once.
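The relation between a word's coverage and how often it recurs is a simple reciprocal, sketched below in Python. The function name is hypothetical, and the only inputs are the two percentages quoted above. The result for the 2000th word comes out close to, but not exactly at, the 23,103 reported, presumably because the published percentage is rounded.

```python
def words_per_occurrence(coverage_percent: float) -> float:
    """Average number of running words between occurrences of a word.

    A word covering p% of running text occurs about once every 100 / p words.
    """
    if coverage_percent <= 0:
        raise ValueError("coverage must be positive")
    return 100 / coverage_percent


the_interval = words_per_occurrence(5.839)          # the most frequent word, "the"
word_2000_interval = words_per_occurrence(0.00432)  # the 2000th most frequent word

print(f"'the' occurs about once every {the_interval:.1f} running words")
print(f"the 2000th word occurs about once every {word_2000_interval:,.0f} running words")
```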
If we set the uptake threshold whereby a word becomes “learnt” at 10 recurrences, 85,329 words need to be read to “learn” all the 1000 most frequent words in English (3). To “learn” all the 500 most frequent words in English at an uptake threshold of 20 times, 80,732 words need to be read (4), and 2.6 million words need to be read to meet the most frequent 5000 words at 20 recurrences (5).
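One way to read these figures, offered here as an assumption rather than as the method behind Table 1, is that the text needed is the recurrence interval of the rarest target word multiplied by the uptake threshold, since every more frequent word will have been met at least as often by then. The Python sketch below applies that rule to the 2000th word's interval of 23,103 and works backwards from the three totals quoted above. The function names are hypothetical, and the 2000-word total is an illustrative calculation, not a figure from the table.

```python
def text_needed(words_per_occurrence: float, meetings: int) -> float:
    """Running words to read so the rarest target word is met `meetings` times.

    Assumes the word recurs at an even average interval (a simplification).
    """
    return words_per_occurrence * meetings


def implied_interval(total_words: float, meetings: int) -> float:
    """Recurrence interval implied by a quoted total and uptake threshold."""
    return total_words / meetings


# Illustrative only: the 2000th word occurs once every 23,103 words.
print(f"2000 words at 10 meetings: {text_needed(23_103, 10):,.0f} running words")

# Intervals implied by the totals quoted in the text.
print(f"1000th word: once every {implied_interval(85_329, 10):,.0f} words")
print(f"500th word:  once every {implied_interval(80_732, 20):,.0f} words")
print(f"5000th word: once every {implied_interval(2_600_000, 20):,.0f} words")
```

Under the same assumption, doubling the uptake threshold doubles the reading required, which is why the choice of threshold matters so much to the overall estimate.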
Many researchers argue that learners can build a huge vocabulary simply from reading. However, even at a threshold of 10 meetings for learning to occur, Table 1 clearly shows that a huge amount of text needs to be met to facilitate the learning of vocabulary incidentally from reading. It also shows that as one’s vocabulary level increases, there is a huge increase in the amount of text that needs to be read in order to meet unknown words because each new or partially-learnt word is met more and more infrequently.
Considerable evidence (e.g. Nation 2001, Waring – Takaki 2003) suggests that our brains do not learn things all in one go. We are destined to forget many things we learn, and recent knowledge is especially fragile. We also tend to pick up complex things like language in small incremental pieces rather than in whole chunks. We know, for example, that it takes between 10 and 30, or even 50 or more, receptive meetings of a word for the form (spelling or sound) of an average word to be connected to its meaning (Waring, forthcoming).
The BNC data in Table 1 are for word families based on type. In other words, the data assume that meeting any of the family members 20 times (use, then uselessness, then user) means the whole family will be learnt after those 20 meetings. This is obviously a gross simplification as many derivations are easy to learn (wind/windy or teach/teacher), whereas others are complex and acquired late (govern/ungovernable or excuse/inexcusable). Moreover, the analysis does not account for polywords, nor the thousands of lexical chunks and set phrases such as I’d rather not; If it were up to me, I’d…; We got a quick bite to eat; What’s the matter?; The best thing to do is … and so on. Nor does it take into account polysemy (multiple meaning senses of words), phrasal verbs, idioms and metaphor because the analysis was done by type. All these need to be learnt in addition to the single words.
Nor does Table 1 take into account the volume of text needed to learn collocations and colligations. If we assume that the learning of a meaning and its form is a precondition for the learning of its collocations (we need to know calm and sea to know the collocation calm sea), we can conclude that these ‘deeper’ aspects of the learning of a word will take far longer than just learning the word as a single unit, i.e. its form-meaning connection only. But how many collocations does each word have, on average? Here is a sample of some of the main collocations and colligations for the very common word idea (taken from Hill – Lewis 1997).
Verb collocations of Idea. e.g. abandon an idea
abandon, absorb, accept, adjust to, advocate, amplify, advance, back, be against, be committed/dedicated/drawn to, be obsessed with, be struck by, borrow, cherish, clarify, cling to, come out/up with, confirm, conjure up, consider, contemplate, convey, debate, debunk, defend, demonstrate, develop, deny, dismiss, dispel, disprove, distort, drop …………………….
These are just a small part of the verb collocations and colligations of one word – idea. And most of them were not given. This list only goes up to the letter d and there are about 100 more! And that doesn’t count the adjective uses (e.g. an abstract idea, an appealing idea, an arresting idea and so on) of which there are also several dozen. Not all words have this number of collocational partners and no one would suggest that learners need to know them all. Learners do, however, need to know a good proportion of these to even approach native-like control and fluency over a given word and its collocations. Thus the vocabulary task becomes even more arduous than that painted in Table 1.
The density of a text is a property of the learner, not the text itself. Thus a given text could be easy for one learner but impossibly hard for another. The above clearly suggests that EFL learners who are trying to read fluently (extensively) but have not yet reached an advanced level (i.e. they know fewer than 5000 word families) should meet language which has been controlled and simplified so they are not overwhelmed by dense texts that prevent them from reading fluently. L1 texts (especially literary texts) are typically very dense lexically, which makes them difficult to read and learn from, and almost impossible to read fluently, for all but the most highly advanced learners of English. Native texts that contain a high proportion of unknown words make the reading slow and intensive and change the reading task into a linguistic (study) one rather than one for building fluency. This is not necessarily bad, but learners should be aware that unless they read a lot, they will not have the opportunity to meet the unknown words they need to strengthen their partially-known vocabularies. Therefore, EFL learners need to use graded readers initially to help even out the density issues by systematizing the vocabulary load. Only when the learners can cope with more advanced texts should they be exposed to them. Nevertheless, the volume of text that needs to be met is immense and far beyond that of most normal courses. What this means is that far more than the one book a week at the learner’s level recommended by Nation and Wang (1999) will be required.
Unless the volume of reading is increased, it is likely that any partial knowledge of a given word will be lost from memory, especially as each individual occurrence of words above this level appears so randomly and unpredictably in ungraded text. These data together suggest that it is unlikely that much learning will occur from reading alone above the 3000 word level unless several thousand words are read per day.
To this point we have examined the vocabulary task at hand. If we now turn to the grammar, we can see a similarly massive task ahead of our learners. These examples of the present perfect tense, in its various guises, mask various forms and cannot be seen in the same way words can be, as the tense is abstract which makes it even harder to acquire.
The tense appears with differing subjects and objects, as both yes/no and wh- question forms, in the negative as well as declarative. It can be active or passive, continuous or simple, with have or has and that does not count the myriad regular and irregular past participle forms and the short answer forms. There are about 75 different possible variations of the form of the present perfect tense – and that does not count the different uses...
We have a fairly good idea about the uptake rates for words, but what about grammatical features? It is sad to say that after an exhaustive search for the uptake rates of grammatical features, it appears that in the whole history of language research there is no data at all. None. This is amazing given that the vast majority of language courses taught today have a grammatical focus at least in part. How can we, as an industry, create courses and write learning materials without at least some idea of how frequently grammatical items need to be met for learning to occur? That said, it is clear that it typically takes several years after learners have been introduced to language features before they finally feel comfortable enough with them to start to use them at all, let alone correctly.
The above would seem to be a damning indictment of the benefit of incidental learning from fluent reading because it could be said that the time expended on the reading might be more fruitfully spent on intentional learning.
Indeed, recent research (Nozaki 2007) has shown that direct and intentional learning of vocabulary is faster than incidental learning (i.e. from reading)... Nozaki found that the words met with word cards were not only learnt 16 times faster (words per hour of study), but were also retained longer than words learnt incidentally from reading.
Additionally, a case study of a single learner by Mukoyama (2004) showed that 30 minutes a day of learning Korean-Japanese word pairs for 30 days led to 640 words being attempted and partially learnt. At the end of 30 days, 468 words were learnt (all the words were tested by L1-L2 translation), two months later 395 words were still known, and at 7 months 310 words were retained, all without any further meetings. These two studies together clearly show the power of intentional learning over incidental learning.
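The Mukoyama figures can be restated as retention rates and a rough learning rate per hour of study. The Python sketch below uses only the numbers quoted above; the variable and function names are hypothetical, and the words-per-hour figure assumes that exactly 30 minutes were spent on each of the 30 days.

```python
ATTEMPTED = 640
STUDY_HOURS = 0.5 * 30  # assumes exactly 30 minutes a day for 30 days

# Words known at each test point, as reported in the text.
known_words = {
    "end of 30 days": 468,
    "2 months later": 395,
    "7 months": 310,
}


def retention_rate(known: int, attempted: int = ATTEMPTED) -> float:
    """Share of the attempted words still known at a test point."""
    return known / attempted


for label, known in known_words.items():
    print(f"{label}: {known} words, {retention_rate(known):.0%} of those attempted")

words_per_hour = known_words["end of 30 days"] / STUDY_HOURS
print(f"about {words_per_hour:.1f} words learnt per hour of study")
```

On these figures, roughly 73% of the attempted words were known at the end of the study period, 62% two months later and 48% at seven months, from about 15 hours of study in total.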
One might easily conclude from the above that we should not ask learners to learn vocabulary incidentally from reading, but rather adopt a systematic and intensive approach to direct vocabulary learning such as with word cards. One might even go further to conclude that by doing so, learners would not need to “waste” time reading, because they can learn faster from intentional learning and free up valuable class / learning time. However, this would be a grave mistake and a fundamentally flawed conclusion because language learning is far more complex than the extremely simplistic picture given above.
To really know a word well, learners need to know not only meanings and spellings, but the nuances of its meanings, its register, whether it is more commonly used for speaking or writing, which discourse categories it is usually found in, as well as its collocations and colligations, among many other things. The above studies see words as single stand-alone objects rather than words that co-exist and are co-learnt (and forgotten) with other words. They vastly underestimate what might be learnt because they only look at a partial, though very important, picture of word learning – the learning of single meanings.
One might be tempted to suggest, given the rather slow rate at which vocabulary is learnt incidentally from reading, that the multiple meanings, colligations, collocations, register, pragmatic values and so forth could be learnt intentionally. While this may be possible in theory and even in practice, we then have to ask where the material to do this with is. Where are the books that systematically teach this “deeper” vocabulary knowledge beyond the form-meaning relationship (collocation etc.) and recycle it dozens or hundreds of times for even the 1000 most frequent words? A few books exist, but they offer little more than a random selection of a few choice collocations, whereas, as we have seen, learners need vastly more. In short, these materials do not exist. Even if they did, it would take a monumental amount of motivation to plough through such books intentionally, and I doubt that many learners, if any, have this stamina.
No learner has the time to methodically go through and learn all the above. No course book, or course, can possibly hope to teach even a tiny fraction of them. There is too much to do. But our course books were not designed to teach all of this. Let us look at what course books and courses are typically designed to do. Our course books concentrate on introducing new language items, each appearing in a new unit or lesson, with new topics all the time.
The structure of course books, and of linear courses in general, shows us that they are not concerned with deepening knowledge of a given form, only with introducing it or giving minimal practice in it beyond a token review unit or test. They do not concentrate on the revisiting, recycling and revising necessary for acquisition. The assumption underlying most courses and course books is that our learners have “met” or “done that now” and we do not need to go back to it, so we can move on. Adopting this default view of language teaching (that “teaching equals learning” implicit in these materials) is a massive mistake if that is all we do because it undersells what our learners need – which is massive language practice with the things taught in course books but under the right conditions.
These data suggest that course books do not, and cannot by their very design, provide the recycling of vocabulary needed for acquisition. This should not in any way be seen as an attack on course books. Course books are very useful and powerful, but because of their design they can only do half the job. They are good at introducing new language features in a linear way, but are not good at recycling this language and are poor at building depth of knowledge. If learners only use course books and endless intensive reading books, they will not be able to pick up their own sense of how the language works until very late in their careers (i.e. until they have met the language enough times).
This, we can suspect, is one of the reasons teachers and learners alike complain that even after several years of English education, many learners cannot make even simple sentences: they can get 100% on grammar and reading tests but can hardly say a word in anything other than faltering English. The reason for this should now be clear. Simply put, they did not meet enough language to fully learn what they had been taught. Their knowledge is abstract, and stays abstract, because it was taught abstractly: course books and courses tend to break down the language into teachable units. This atomistic knowledge is useful for tests of discrete knowledge (e.g. selecting a tense from choices or completing gap fills) because it was learnt discretely. However, because their knowledge is held discretely, it is no wonder that when the learners are called upon to use it in speaking or writing, they don’t know how to put their discrete knowledge together fluently.
So, how are the learners going to deepen their knowledge if they do not have time to learn these things intentionally, and our course books do not re-visit the features / words they teach? Where is the recycling of language we need for real learning? The answer lies with graded or extensive reading used in tandem with a taught course such as the course book shown above. The two must work together. The course book would introduce and give minimal practice in the language features and vocabulary, while the reading of graded readers consolidates, strengthens and deepens that knowledge.
Therefore to gain fluent control over the language, the learners also must meet these items in real contexts to see how they work together, to see how they fit together. In other words learners must get a “sense” or “feeling” for how the language works. This sense of language can only come from meeting the language very often and by seeing it work in actual language use (i.e. from their reading or listening). This depth of knowledge gives learners the depth of language awareness and confidence to feel comfortable with the language that will enable them to speak or write. And this exposure comes from graded readers and extensive reading and extensive listening.
An oft-asked question is “Why can’t my learners speak? They’ve been learning English for years now. I teach them things, but they just don’t use them. It’s so frustrating.” Learners will only speak when they are ready to. That is, they will speak once they feel comfortable enough using the language feature or word.
But where does this comfort come from? It comes from experience with the language. The more times they meet a word, a phrase or a grammatical feature, the more chance it has to enter their comfort zone and the greater the chance that it will become available for production. It is no wonder, then, that research into extensive reading shows gains for speaking from extensive reading alone (e.g. Mason – Krashen 1997).