Data

[OC] Vocabulary size at each English proficiency level

December 9, 2025

34 Comments

thegodzilla25 on December 9, 2025 9:21 pm

Cool test! Took 2 mins and I learnt some things!
Zigxy on December 9, 2025 9:23 pm

I feel like part of the spread has to do with the original language of the user.

Someone who natively speaks a Germanic or Latin language is probably going to know quite a lot of Germanic or Latin words, respectively, although their overall grasp of the language might not be great. Conversely, someone whose native language is unrelated might need to have studied for a long time to match that vocab depth, but would have a much better grasp of other areas.
BiBoFieTo on December 9, 2025 9:32 pm

Took the test. It was really interesting. A few times it made me question my sanity because of the fake words.

It correctly identified me as a native speaker.
DameKumquat on December 9, 2025 9:35 pm

Phew, I have native level English!

Nice test – will it be available in other languages?
QuantumIce8 on December 9, 2025 9:35 pm

Cool test and data! One observation: the output word count from the test is unreadable in dark mode (Android, Firefox). The dark blue text is almost the same as the dark grey background.
Few-Interview-1996 on December 9, 2025 9:39 pm

Re: Your test. Yes, I do know the meaning of the word enceinte. It just doesn’t happen to be English. :p
highlyeducated_idiot on December 9, 2025 9:43 pm

Excellent little app you have there. Good job!
PristineAnt9 on December 9, 2025 9:44 pm

Can you fix the German test? It always freezes on the last word and I desperately need to know how bad I am at German.

Also thank you, lots of fun!
samuelazers on December 9, 2025 9:48 pm

What if they have a native vocabulary but a heavy accent, or make grammar mistakes?
makkerker on December 9, 2025 9:50 pm

It is not size that matters but how you use it
zombiecalypse on December 9, 2025 9:52 pm

I’m glad I scored above the median (?) native speaker, because I’m pretty sure I’d do a lot worse in my native language
warnerbolanos on December 9, 2025 9:54 pm

The German test gets stuck on the last word.
Enuntiatrix on December 9, 2025 9:56 pm

https://preview.redd.it/5jehow00396g1.png?width=720&format=png&auto=webp&s=4c7ea0529b69d3186bdd745a271da090579ff4fc

Very nice. I’m a non-native speaker, but I started with English in school 20 years ago. Perhaps the only subject I ever needed IRL, to be honest.
PHealthy on December 9, 2025 9:58 pm

You should do this test but for risk literacy
akurgo on December 9, 2025 10:03 pm

The test is really well made. I’m C1 it seems. There are so many words that I’ve read and heard countless times, but don’t know the exact meaning of. For example, I will typically understand a sentence with words like “embellish” or “egregious” in it without really knowing the word, and so I don’t bother looking it up. Maybe I should bother.
tka4nik on December 9, 2025 10:10 pm

Nice work, and very cool test!

~~Someone already mentioned that for some languages, the last word (if the result is non-trivial, as in if you didn’t press all “don’t know”) freezes up and doesn’t show the results. Can confirm the bug for Russian as well~~

https://preview.redd.it/uacjw1ce596g1.png?width=715&format=png&auto=webp&s=572a9d9ad4423f45f30df3fdbf4cb0a7ce7817e0

Seems like you’ve already fixed the bug, good job!!
turb0_encapsulator on December 9, 2025 10:14 pm

Interesting. I am honestly surprised that the distribution curve isn’t larger for native speakers. Perhaps that means it isn’t so hard to raise someone’s reading level. I am at 90th percentile despite only knowing 23.5% more words than the average person.
n4s0 on December 9, 2025 10:16 pm

This is pretty cool. Thanks!
ChengliChengbao on December 9, 2025 10:20 pm

I'm a native speaker and I got C1

amazing
GreyGoblin on December 9, 2025 10:28 pm

God damn these stupid violin plots!

What exactly are the Y axis units between B1 & B2? What’s the difference between green points above B1 and those below that line?

A histogram if modality is important, a box and whiskers if it’s not.

Yeah yeah, those won’t look ‘as detailed’… But that’s just it: you’re not adding detail to data, you’re adding noise to art.

/Rant
hansrotec on December 9, 2025 10:29 pm

Avoided the fake words and got the definitions correct. A few of those fake words, as others have said, had me questioning myself and other words. I may start using them to see if I can get one or two going in a friend group.
thespermthatsurvived on December 9, 2025 10:35 pm

Cool stuff!! What did you use for the dataviz if I may ask?
thebowlman on December 9, 2025 10:35 pm

What is the difference between C2 and Native?
Rafa_50 on December 9, 2025 10:35 pm

Great test, I do feel like some of the options when it asks you to define a word are a bit weird, but it might be just due to alternative meanings or me being dumb.
Devilnaht on December 9, 2025 10:41 pm

Very interesting! It aligns reasonably well with what I’ve read before on the vocabulary size per CEFR level, although with a somewhat smoother curve (also, A1 seems quite a bit higher than expected). If you’re curious, you can find a non-paywalled link to the paper that their definition of a word family is based on here: [https://www.lextutor.ca/morpho/fam_affix/bauer_nation_1993.pdf](https://www.lextutor.ca/morpho/fam_affix/bauer_nation_1993.pdf).

An interesting thought is that the productive vocabulary growth in real terms is probably a good deal larger than this suggests; as you progress in a language, you not only recognize more word families, but you’re able to use more members of the word families you already know. For instance, the Paul Nation article there gives 16 different words within the single word family “develop”. Eyeballing it, an A1 speaker might only be able to productively use maybe 3-4 of them, whereas a native speaker would be able to use all or nearly all. So while the above may show that a native speaker knows “about 10 times as many words” as an A1 speaker, I wouldn’t be surprised if the active vocabulary of a native speaker were 20 or 30 times larger.
Oneforallandbeyondd on December 9, 2025 10:44 pm

Best A2 is stronger than worst C2? Hehe. Great system that is.
JJBrazman on December 9, 2025 10:47 pm

Thanks for the fun test! One note, in dark mode the final result is almost unreadable because it’s dark blue against a black background. And that’s what I’ll blame for my score being lower than I’d like!
Schuesselpflanze on December 9, 2025 10:52 pm

I took the test in German and English.

The German one is a little wacky because it didn’t use the capitalization rules
Vorschrift on December 9, 2025 11:12 pm

I…. C2. Believe you not?
Sensitive-Reaction32 on December 9, 2025 11:12 pm

I’m classed in the C2 category. I’m a native English speaker, but I don’t know the meaning of many words (just know they exist), so I’m not entirely surprised.
TheBigBo-Peep on December 9, 2025 11:16 pm

Really well done

Thought I was hot stuff but nope, 48% vs Native speakers (classified C2, 15300)

That said, I was very honest (and found all 10 fake words) so I suspect some people are being a bit generous. I suspect the median person isn’t taking this test either 🙂
Ariel90x on December 9, 2025 11:25 pm

https://preview.redd.it/bis74smqi96g1.jpeg?width=583&format=pjpg&auto=webp&s=29f3c1aca40ceae5f737044fc86b3ee0c0de099a

I’m Italian, I studied Latin and German, and IMO this test is broken for someone like me, since most of the hard words are either Germanic or from Latin/French.
polypolip on December 9, 2025 11:32 pm

Nice data and fun test. One remark regarding the test – at least for Polish it gave weird options as answers, like for “intruz” / intruder, I’m guessing the answer was “gość” / guest probably because intruder is an unwanted guest, but that’s a really bad way to put it if it’s missing the adjective.
Malorn44 on December 9, 2025 11:40 pm

Would be interested in seeing this for Japanese
