Hi,
I just wonder whether there is a list of the 20,000 most frequently used English words. I think my vocabulary is around 15,000 words, and I have no problem reading English literature in general. However, from time to time I meet unfamiliar words, so I think maybe I should simply spend some time memorizing another couple of thousand words. I searched the Internet but couldn't find exactly what I want (most word lists I've found contain only 1,000~5,000 words).
Thanks,
Xing

Lorge Thorndike - Most common 30,000 words. Used to develop vocabulary section of IQ test 1954
See article below.
Vocabulary Resources for Material Writers
From The Materials Writers Newsletter
The Newsletter of the Materials Writers' National Special Interest Group of the Japan Association of Language Teachers

Vol. IV, No. 3, October 1996
John Bauman
Enterprise Training Group
Material written for ESL students needs to use somewhat simplified vocabulary
and structure if it is to be accessible to lower and intermediate level students. In terms of vocabulary, a writer can try to "keep it simple" while
writing, but a more rigorous approach is to compare a text with a list of words
prepared for this purpose. A variety of lists of words are available, as well as
different ways to use them. In this article, I will briefly list and describe
some lists. I'll also discuss a program that will analyze a text and give some
links for further exploration of this topic on the internet. Links to sites
mentioned are given in the "Web Links" section at the end of this article.
Teaching and Learning Vocabulary (Nation 1990) contains a good general discussion of this topic. Nation doesn't hesitate to quantify the issue. His
model of an ideal vocabulary teaching sequence starts with the most frequent
2,000 words, which he calls general service vocabulary. Everybody needs to know
these words; they make up about 87% of an average written text. After this
point, general frequency becomes less useful as a guide to what words to teach.
Students are better off studying a list of words specific to their field of
interest or need, if one can be found. For the student aiming at English-language higher education, Nation's 800 word University Word List is
appropriate. After this, the remaining vocabulary of English is of too little
frequency to merit direct study. Skills such as analyzing word parts, context
guessing, etc. can be taught.
The number of different words used will depend on the level of the text. Writers
of material for ESL learners also have to decide which words to use, or, in a
larger sense, to which population of words they should restrict themselves. Here
a list becomes necessary. Many have been developed over the years. The following
remain relevant.
The General Service List
The General Service List (GSL) (West 1953) is the specific list of 2,000 words
that Nation refers to when he writes about the "first 2,000 words." It's based
on written texts, it's old, and it's not in frequency order, though frequency
numbers are given. The source of the frequency information is even earlier than
the publication date, being derived from Thorndike and Lorge (1944). But the
list was not compiled based on frequency alone. It was created to be an ideal
vocabulary for ESL students to start out with. Through the 1970s, a lot of
material, particularly graded readers, was based on this list. Even today, much
of this material is sold and used. The GSL is out of print, and somewhat out of
favor. The list is available as a component of the Vocabprofile program described below and, in a slightly different form, on this web page.

Thorndike and Lorge
The Teacher's Word Book of 30,000 Words (Thorndike and Lorge, 1944) was created
as a resource for elementary and high school teachers in the United States. It
is still frequently cited, though computer-produced corpora have largely replaced it as an authority on the frequency of words. For example, it's the
source of the words above the 2,000 word level in the vocabulary test in Nation
(1990). It's old: it's based on a compilation of pre-WW2, non-computerized word
counts totaling about 18 million written words. As published, it's not in
frequency order, but frequency ranks are given for each word.
The University Word List
The University Word List (UWL) (in Nation, 1990) is a list of academic vocabulary
composed of about 800 words. It's designed for students who plan to study in an
English-language college or university. Essentially, it's the most common 800
words in academic texts, excluding the 2,000 words of the GSL. This list is
structurally linked to the GSL. A student who studies the GSL, followed by the
UWL, will find no repetition of words. The list is divided into 11 parts. Part
one has the greatest frequency and range, part 2 next, etc. This list is also a
component of the Vocabprofile program.
The Brown Corpus
The Brown Corpus (Francis and Kucera, 1982) is the earliest computerized study
of English vocabulary. It is an analysis of 1 million words published in the
United States in 1961. It's also kind of old, but it's more consistent in its
definition of "word" (as a lemma) than the earlier lists. The 1982 publication,
which includes both alphabetical and frequency order lists of the words, is a
very useful resource.
The LOB Corpus
The LOB Corpus (Hofland and Johansson, 1982) is a study of 1 million words of
British text published in 1961. It was designed to be a British counterpart to
the Brown corpus.
The Cambridge English Lexicon
The Cambridge English Lexicon (CEL) (Hindmarsh, 1980) is a list of 4470 words,
prepared with reference to the GSL, Thorndike and Lorge, Brown, other sources,
and the author's experience as an ESL teacher and material developer. Each item
is graded from 1 to 5. The most useful aspect of the list is that the different
meanings of the words are also graded on the same scale. Only the CEL and the
GSL give separate information on the different meanings of common words (though,
of course, dictionaries do also). The GSL gives actual frequency numbers for the
different meanings, but the age of the data and the fact that it was gathered by
hand may make the CEL a more reliable source for an indication of the relative
importance to students of different meanings of words. The grading in the CEL is
not based solely on frequency.
Modern Corpora
These days, much is heard about corpora from dictionary publishers, who all
boast about the enormous corpora that their learner dictionaries are based on.
The British publishers are particularly enthusiastic about this, using either
the CoBuild corpus or the British National Corpus (BNC) as a source of lexicographic information. Both of these corpora contain more than 100 million
words. Limited access to them is possible through the internet; see the links on
the Collocations Homepage listed below. Depending on your purpose, it may be
more useful to access these corpora in pre-digested form through the dictionaries based on them. A lemmatized frequency list of the BNC has been
prepared by Adam Kilgarriff and is available for FTP.
Vocabprofile
Vocabprofile is a freeware program for PCs that will compare a given text with
any properly formatted list. Up to three lists can be used at a time. The output will
report what percent of the words in the text are on each of the lists. It will
also print the text with the words marked to indicate which list they are on, or
if they aren't on a list. Vocabprofile is available for FTP at the URL below.
The three lists that come with the program are the first 1,000 words of the GSL,
the second 1,000 words of the GSL and the UWL.
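The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Vocabprofile's actual code: the function names `tokenize` and `profile`, the tokenizing rule, the `word[label]` marking format and the tiny word sets are all invented stand-ins, and the sets are not the real GSL or UWL.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def profile(text, word_lists):
    """Compare a text with named word lists.

    word_lists maps a label to a set of lowercase words. Each token is
    credited to the first list that contains it, otherwise to "off-list".
    Returns the percentage of tokens per label and the marked-up text.
    """
    tokens = tokenize(text)
    counts = Counter()
    marked = []
    for token in tokens:
        label = next(
            (name for name, words in word_lists.items() if token in words),
            "off-list",
        )
        counts[label] += 1
        marked.append(f"{token}[{label}]")
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    labels = list(word_lists) + ["off-list"]
    percentages = {name: 100.0 * counts[name] / total for name in labels}
    return percentages, " ".join(marked)

# Tiny made-up lists for demonstration only (assumption: NOT the real lists).
lists = {
    "first-1000": {"the", "of", "a", "is", "word", "list"},
    "second-1000": {"frequency"},
    "academic": {"analyze", "data"},
}

percentages, marked = profile("The frequency of a word is data", lists)
print(percentages)
print(marked)
```

In this toy run, five of the seven tokens fall on the first list and one each on the other two, so nothing is reported as off-list. With real lists loaded from files, the same loop would give the per-list percentages and the marked text that the article describes.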
Concluding Remarks
None of these resources is ideal. Thorndike and Lorge and the GSL are old, old
enough that the English of today surely differs significantly. However, the core
vocabulary of English changes more slowly, so at the frequency level of the
first 2,000 words this may be less of a problem. The GSL offers some advantages
as a standard. It was specifically designed as a teaching vocabulary list. It
has a long history of use, both in teaching materials and in second language
acquisition research. A program to compare it with a given text is readily
available. Of the lists above, only the CEL was also compiled for the purpose of
facilitating the creation of teaching materials. It's more modern than the GSL,
but appears to have had less impact. It is not conveniently available for
computerized text comparison.
The Brown Corpus, the LOB Corpus and the lemmatized list from the BNC are useful
because they give the lists in frequency order. This allows a population of
words to be defined much more precisely, and individual words to be compared
with each other. But these lists were prepared for linguistic research, not
teachers. They're lists of lemmas, which means that words are listed more than
once if they can act as more than one part of speech. Some derived forms are
also considered as separate lemmas, such as comparative and superlative forms of
adjectives. These factors affect both the frequency rankings of words and the
number of words that appear on a list. In other words, a list of 1,000 words
taken from the GSL or CEL would contain more than 1,000 lemmas. These corpus-based lists need substantial adjustment to make them appropriate as
vocabulary standards. These adjustments have already been made to the GSL and
CEL.
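The effect of counting lemmas rather than teaching-list headwords can be seen in a small Python sketch. Everything here is hypothetical: the counts are invented for illustration and come from no real corpus, and the `headword_of` mapping and the `collapse` helper are assumptions about how one might merge entries, not the method used to build the GSL or CEL.

```python
from collections import defaultdict

# Invented toy counts, for illustration only (not from any real corpus).
# Each entry is (lemma, part of speech, frequency).
lemma_counts = [
    ("work", "noun", 400),
    ("work", "verb", 350),
    ("good", "adjective", 600),
    ("better", "adjective", 150),
    ("time", "noun", 700),
]

# Hypothetical mapping from a derived form to the headword a teaching
# list might file it under.
headword_of = {"better": "good"}

def collapse(entries, headword_of):
    """Sum the counts of all lemma entries that share one headword."""
    totals = defaultdict(int)
    for lemma, _pos, count in entries:
        totals[headword_of.get(lemma, lemma)] += count
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

by_lemma = sorted(lemma_counts, key=lambda entry: -entry[2])
by_headword = collapse(lemma_counts, headword_of)

print(len(by_lemma), "lemma entries:", by_lemma)
print(len(by_headword), "headwords:", by_headword)
```

With these made-up numbers, five lemma entries collapse into three headwords, and "time" drops from first place in the lemma ranking to last place in the headword ranking. That is the kind of shift in both list length and rank order that the adjustment involves.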
An author of EFL material has many vocabulary options available. I hope this
discussion of resources is useful and that the bibliography and the internet
sites below will be helpful in finding the items that will serve your specific
needs.
Links to sites mentioned
Adam Kilgarriff
http://www.itri.brighton.ac.uk/~Adam.Kilgarriff/
Links to his lemmatized, frequency order version of the BNC are here.
John Higgins
http://www.marlodge.supanet.com/index.html
Here you can find Vocabprofile as well as links to other programs.
Bibliography
Francis, W.N. and Kucera, H. (1982). Frequency Analysis of English Usage. Houghton Mifflin, Boston
Hindmarsh, R. (1980). Cambridge English Lexicon. Cambridge University Press,
Cambridge
Hofland, K. and Johansson, S. (1982). Word Frequencies in British and American
English. NAVF, Bergen
Nation, I.S.P. (1990). Teaching and Learning Vocabulary. Newbury House, New York
Thorndike, E.L. and Lorge, I. (1944). The Teacher's Word Book of 30,000 Words.
Teachers College, Columbia University, New York
West, M. (1953). A General Service List of English Words. Longman, London

Hmmm, I wonder is there anywhere on the web you can test your vocabulary, like answering a quiz and getting an approximate word count as a result? Might be interesting to try.
CV

I believe one of the commissions of the EU has developed lists of core words and functions that are essential at various levels of language acquisition for all the major EU languages. They are designed to help prepare textbooks and as guidelines for foreign language examinations in the different EU countries. I'm not certain whether they are available on-line - I'll check during the next few days.
However, at your level of competence I don't think memorising a few thousand random words is the solution to your "problem" - I don't really think it IS a problem. I think it would be far more useful to read a large quantity of general literature (and specialist literature in your various areas of interest) and note the words you don't know. This will enable you to develop your own list of vocabulary that is useful for you to learn.
Regards, Einde O'Callaghan
Einde,
Are you talking about the Common European Framework?

If you are, then it is a rather vague list of 'can do' statements: a functional view of the language. People in some countries have taken it further for the different languages, but I don't know how far. And in any case, functional language like this is apparently sometimes hard to pin down. I heard of a Collins COBUILD experiment in which native speakers were recorded in conversations that should bring up the language of 'recommending' and 'advice' (talking about visiting a holiday place). Never once did the speakers use 'should'!
Xing Qiu,
Why don't you just read a bit more in English? It's surely more fun than learning an abstract list of words, especially because you get more information about how the word is used. Words don't usually like being alone: they hang out with the other words, contexts and register they belong with.
Some learners I know make marks in their dictionary every time they look up a word when they are reading: if a word has three or more marks, they probably need to spend a bit more time trying to remember it. And there are plenty of books now that use simple English, for example the Penguin Readers, or the Oxford Bookworms.

Good luck!
Jan
Einde, Are you talking about the Common European Framework?

I think that's what they're called.
If you are, then it is a rather vague list of 'can do' statements: a functional view of the language. ... up the language of 'recommending' and 'advice' (talking about visiting a holiday place). Never once did the speakers use 'should'!

I'm sure I saw lists of English WORDS, not just functions, while attending a presentation of a new series of books that were based on this framework.
Regards, Einde O'Callaghan
P.S. I agree with your suggestion about reading and dictionary work.
Here in Greece, it's been the Common European Framework this and the CEF that for the last two years. I paid forty bucks for the commission report and still haven't gotten past page five.
Many, many publishers have jumped on the CEF bandwagon and promised that their books are based on it. The recognized experts then disagree and point out circuitously that the publishers are simple thieves bent on commercial success and devil-take-the-hindmost (new information, indeed). Quite frankly, for all practical purposes, it seems to be a lot of dignified talk and no action, in the classic European tradition...
The "can-do statements" might as well be called "I-think-I-can, I-think-I-can statements," for all the real-world effectiveness we've seen to date.