How many of the 170k English words do you know?

https://vocabowl-870366514258.us-west1.run.app/
90•abnry•2h ago

Comments

dtagames•2h ago
This was fun! The progression seems logical.

I scored 71,000.

slices•1h ago
75k here but a few of the later ones were lucky guesses.
cubano•1h ago
Yes...exactly the same here although the guesses often had some grounding in the root of the word.
itsamario•2h ago
I know maybe 20-30. I'm aware of maybe a few thousand.

I use the language to understand not get an effect

goldenarm•2h ago
It's hilarious that most of these words are French
the_lonely_phon•1h ago
Depends: is bratwurst a German word or an English one? You will be hard pressed to find an American that doesn’t know the word and what it means. You can buy them at just about any grocery store and they are a staple of many restaurants.

At some point the word becomes both. Sourced from its mother language and maybe even still meaning the same thing in both, but no less an English word than any other at this point.

mordechai9000•1h ago
It also had "weltschmerz" in the list, but I think I have only ever heard "ennui" used in English. They are both foreign words, but I would not have thought of weltschmerz as a loan word. Then again, maybe I am not reading the right texts.
rhdunn•1h ago
Norman French due to the Norman invasion of 1066 resulting in Old English evolving into Middle English. You can see that in the words for animals vs meats (cow and boef/beef, sheep and mutton, etc.) where the Germanic people raised the sheep and the Norman aristocracy ate them.

A lot of the more common and simpler words are Germanic, as is the grammar (e.g. compound words like cupboard).

classified•1h ago
English also has a ridiculously high fraction of Latin too.
pessimizer•1h ago
Not from Latin but through French - the direct use of Latin in English is generally restricted to technical jargon and legal terms (that English often also share with the French.)

Latin isn't really any sort of parent to Old English afaik, even though the Romans ran Britain for a while.

yorwba•2h ago
There is a typo in "Hippopotomonstrosesquippedaliophobia," it should be "Hippopotomonstrosesquipedaliophobia" instead. (Also, it breaks the layout.)
classified•1h ago
I bet that "p" just bounced out of pure spite.
summarybot•1h ago
Let the ironic screaming at the sight of this word commence!
bobson381•1h ago
also interrobang is rendered as bang-interro (!?) when it should be interro (?) then bang (!) -> (?!)
spelufo•1h ago
do you really think so?!

I think bang-interro just didn't sound as nice and that's probably why it is called an interrobang.

egypturnash•59m ago
No, it should be rendered with the proper Unicode: U+203D ‽
zamadatix•58m ago
There isn't a "correct" way to incorrectly render the interrobang as 2 separate characters. The name was never supposed to suggest a certain ordering instead of just being both at the same time. The name "interrobang" just sounded better than "exclamaquest" (or any of the alternatives Type Talks readers submitted).
fritzo•1h ago
Feature request: fewer clicks. It should be one click per question
TheJoeMan•1h ago
I'd suggest a "toast" would suffice for the correct answers. Proceed to the next question when correct, with a "next" button when incorrect.
ortusdux•1h ago
Keyboard shortcuts would be nice as well. When I saw it was 100 questions I bailed.
pastel8739•1h ago
I wish the option was just “yes I know this word” or “no I don’t”. Reading the definitions takes too long for so many words
yorwba•1h ago
A different interaction design is used by https://testyourvocab.com : just a list of words with a checkbox for each. But it might encourage overconfidence. Before their acquisition by Preply, they also had an interesting blog with statistical analysis: https://web.archive.org/web/20210724115604/http://testyourvo...

The two tests give me widely different results, probably because the sampled words aren't perfectly representative and so the results should have huge error bars to account for this sampling error.

jstanley•1h ago
Cool idea, am working through.

It's annoying that you need to click 3 times per question, and the buttons are in 2 different places.

Maybe would be better to just let me click the answer I want and then instantly show me the next question?

Also who is Sandi?

rhdunn•1h ago
Sandi Toksvig, the current host of the BBC program QI (Quite Interesting), previously hosted by Stephen Fry. She's also been on a number of other BBC TV and radio shows.
gilleain•1h ago
I suspect Sandi Toksvig, one of the hosts of QI. One of the 'success' messages is "quite interesting!".

No offence meant to anyone, but the whole exercise feels very QI: superficial 'understanding' of a large range of things (for example words) without much of a connection between these words.

cm2012•1h ago
Fun fact: there's a test you can do called wordsum which correlates extremely highly, like .71, to IQ. It's just asking you 10 vocabulary questions. It turns out knowing advanced vocabulary correlates really well to IQ.
summarybot•1h ago
I don't know if I can get behind .71 implying "correlates really well" ... that's the issue I had recently with talking with GPT, it was evaluating my logical reasoning ability based on the vocabulary I was employing. You don't need fancy words to be intelligent.
Laurel1234•1h ago
Pretty fun.

I suggest skipping the submit button and just showing it's correct when pressing and moving on after a sec or so. Having to click on submit twice really breaks the flow.

Also in all the words I tried I noticed out of the 4 options one is the correct one, another is the opposite of the correct one, and the other 2 are random stuff. You can basically skip any option whose antonym isn't present as well.

RicoElectrico•1h ago
It estimated 74k words for me, but I feel this might be inflated; much of the time when I didn't know the answer, I could vibe guess it just as you did. The distractor answers weren't convincing enough. For starters, when an answer was based on deconstructing the word into common English words, that ruled it out. After all, if it was, then it wouldn't have been obscure.

A tangent: writing distractors for multiple choice questions is hard. From the exams I know (excluding those whose nature precludes it, such as those based on calculation or rote memorization) the only one that does this brutally well is LEK (Polish medical graduate exam). It's nigh impossible to vibe guess it at more than random chance for someone outside the field.

_diyar•1h ago
in casual use you might also be able to guess it from context, so i think it’s a wash
datsci_est_2015•1h ago
Yeah I also got exactly 74k. Stuff like “xylologist” I guessed had to do with vegetation because of “xylem”, whereas xylophone player was too on the nose. Then again, maybe knowing xylem in the first place makes 74k reasonable.
trevwebdev•1h ago
Interesting, I don't have the time to go through 100 though and having to click on answer and then mouse down to continue is a slog.
archildress•1h ago
Nice tool - would love it if I could press a number on the keyboard to select and rapidly move through them.
kiaofz•1h ago
These should maybe be checked through. Many are the second or third definitions, and some even reference the word in the definition, e.g. Lethargic: exhibiting lethargy
notsylver•1h ago
It seems like the right answer is usually the longest of the choices, I managed to get a few just by picking the longest. It would also be nice if there was an "I don't know" instead of guessing and skewing the results by getting it right, though maybe that's accounted for
orrito•1h ago
These were likely all AI generated, or at least the alternatives were. I made an app a while ago as well, and afterwards realized AI often wanted to make a very covering answer for the correct one, making it often longer than the others, thus defeating the idea of the quiz in the process.
EstanislaoStan•1h ago
Yeah this is AI slop I don't like..
thenthenthen•1h ago
Also surprisingly mostly the first or last option (might be bias)
thenthenthen•39m ago
Hahahhaha i got 62k points by just choosing the longest definitions. Great observation!
latexr•10m ago
> It seems like the right answer is usually the longest of the choices

You are correct. I tested that hypothesis about a dozen times and it seems that if you always pick the longest you’ll get it right somewhere in the high 70s to mid 80s. For anyone interested in testing for themselves, open the website to the first question then run this in the console (not going to spend time optimising it, it works well enough for the purpose):

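  let loopCount = 0

  const loop = setInterval(() => {
    // Re-query on every use, since the page may swap its buttons between steps
    const buttons = () => Array.from(document.querySelectorAll("button"))

    // Of the first four buttons (the answer options), click the one with the longest text
    buttons().slice(0, 4).reduce((long, curr) => curr.textContent.length > long.textContent.length ? curr : long).click()
    // Click the last button twice (presumably submit, then continue)
    setTimeout(() => buttons().at(-1).click(), 100)
    setTimeout(() => buttons().at(-1).click(), 200)

    // Stop after the 100 questions of one run
    loopCount++
    if (loopCount === 100) clearInterval(loop)
  }, 500)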
jrrv•1h ago
Presumably it's a random batch of words since you can run the test again. I wonder how much the word selection affects the outcome. I got 66,750 with 20/20/15/17/14.

I'm curious how the difficulty is chosen because "obfuscate" was included in the hardest difficulty but I would not consider that to be a difficult word.

Also I found that some of the definitions were not completely correct.

rhdunn•1h ago
It could be based on things like word frequency. I'd expect obfuscate/obfuscation to be less common outside of programming and RPGs (Vampire the Masquerade).
fl4regun•1h ago
apparently 54,000. Seems like it is including even fictional words though in this test (like from fiction novels). Ironically I scored higher on the expert words (18/20) than the "advanced" words (11/20)
apical_dendrite•1h ago
plenty of words and phrases originate from fiction

quixotic, scrooge, shangri-la, Uncle Tom, gargantuan, kafkaesque, blurb, milquetoast

and words like cyberspace were first used in fiction

once real people use them, they stop being fictional words

krustyburger•1h ago
Kafkaesque doesn’t originate directly from fiction like your other examples any more than a word like Dickensian does.
triceratops•57m ago
Well it does and it doesn't. It wouldn't be a word if Franz Kafka hadn't written any fiction. Same for Dickensian.
fl4regun•37m ago
The word was "Brobdingnagian", which apparently means "giant", from the book, Brobdingnag, published in the 1700s. I know all of the words you listed, even if I don't know the books they came from, on the other hand, I've never heard anyone use "Brobdingnagian" and I've never heard of the book it came from either.
croisillon•1h ago
i remember such a link from July 2011 but i could only find this one, which is a bit different

https://news.ycombinator.com/item?id=2806377

yorwba•1h ago
How did you manage to remember the exact month? https://news.ycombinator.com/from?site=testyourvocab.com
WesleyJohnson•1h ago
59,400 - It said I'm a person of few words. It also recommended I read a dictionary. I feel some kind of way about that. :D

Fun!

hmokiguess•1h ago
why use many word when few word do trick
sd9•1h ago
Interesting concept, but 100 words is really quite a lot to get through... It's tiresome trudging through the easy words at the start, and I never got to see the interesting words before getting bored.

I've seen other systems like this calibrate far more quickly by assigning a sort of score and confidence behind the scenes. Confidence starts out low and increases over time - correct/incorrect answers rapidly adjust score at the beginning, then things settle down.

In practice this means you get a sequence of increasingly uncommon words initially, until you get one wrong, then you drop back to something easier until you start getting things right again, and eventually circle around words at your level.

Also - too many clicks per word. It's low stakes, just let me click the definition once and I'll live if I misclick (or add an undo button).
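
The calibration sd9 describes resembles a classic up/down staircase: big jumps in difficulty at first, smaller ones once the answers start flipping between right and wrong. Here is a minimal sketch of that idea, not the quiz's actual logic. `makeStaircase`, the 0 to 100 difficulty scale, the starting step of 16 and the simulated player are all assumptions made for illustration.

```javascript
// Hypothetical staircase: difficulty runs from 0 (easiest) to maxLevel.
function makeStaircase(maxLevel, initialStep = 16) {
  let level = 0;           // difficulty of the next question
  let step = initialStep;  // how far one answer moves the level
  let lastCorrect = null;  // previous answer, used to detect reversals

  return {
    get level() { return level; },
    get step() { return step; },
    // Record one answer and return the difficulty for the next question.
    record(correct) {
      // A reversal (right after wrong, or wrong after right) means we have
      // overshot, so halve the step. This stands in for growing confidence.
      if (lastCorrect !== null && correct !== lastCorrect) {
        step = Math.max(1, Math.floor(step / 2));
      }
      lastCorrect = correct;
      const moved = level + (correct ? step : -step);
      level = Math.min(maxLevel, Math.max(0, moved));
      return level;
    },
  };
}

// Simulated player who knows every word up to difficulty 37 and nothing above.
const quiz = makeStaircase(100);
const trueLevel = 37;
for (let i = 0; i < 20; i++) {
  quiz.record(quiz.level <= trueLevel);
}
console.log(`settled near level ${quiz.level} with step ${quiz.step}`);
```

With an idealized player like this one, the level climbs quickly, overshoots, and then oscillates around the player's threshold, so far fewer than 100 questions are needed. A real quiz would also have to tolerate lucky guesses and misclicks.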

datsci_est_2015•1h ago
> Also - too many clicks per word. It's low stakes, just let me click the definition once and I'll live if I misclick.

This, and accept that people will have incorrect input and build it into the confidence. Even the smartest person in the world sometimes makes clerical errors, or has the wrong neuron fire at the wrong moment.

thenthenthen•1h ago
Moly holy the clicking is too much 3 clicks that could be one :O
conradludgate•1h ago
300* that could be 100*
latexr•1h ago
> Also - too many clicks per word.

They’re also too far away. I’m on a laptop and I have to keep moving the cursor up and down just to confirm. Give each option a letter or number and let me press it to choose the answer¹.

¹ There is (was?) some service for forms which does that and it works quite well. I think it was Typeform, but I just opened the website to check and—of course—it’s now just plastered with mentions of AI so I lost interest in verifying.

mcbetz•1h ago
This reminds me of a learning resource that I can't find again: you start with an assessment of how many words you know and then you get new words in context with every session (and maybe some spaced repetition). It was mostly from newspaper articles and catered for every level of English. It was a website (ca 2013), not an app. Any ideas?
Johnny_Bonk•1h ago
I like this but it should be all operable with keyboard to be faster, i.e. up/down and 1234 for options, and if it's right you just move on, maybe show synonyms in the success UI.
bluecalm•1h ago
67900

English is not my native language. I get my vocabulary from browsing the Internet. There is no way I know that many words.

kortex•1h ago
Super fun, got 70,250. Friends have always lightly ribbed me for having to go home and look up words i've used. Those remaining 100k words must be really obscure.

One suggestion would be more convincing decoy choices, some were pretty silly. But I have no idea how they come up with them.

ak_111•1h ago
Open any technical textbook in an area slightly outside your domain and you will quickly disabuse yourself of the notion that majority of words are obscure. Most complex words are just technical/jargon not archaic or forgotten.
analog8374•1h ago
this is a test for willingness to put up with the whole 100. It says something.

3 clicks per is what gives it away. and the little compliments. and that it's 100 questions

dakolli•1h ago
Cool concept. but...

Vibe coders need to be forced to spend one day learning basic CSS before they're allowed to use an LLM to make a website and the internet would be a lot more pleasant as we move forward with slopification.. It doesn't have to be sloppy, and doesn't take all that much studying to at least be able to steer an llm in the right direction to make something look nice. At this point everything is just the same 3 colors and a centered flex column with weird spacing.

sim04ful•1h ago
I notice that the concept related to the right answer sometimes has an opposite counterpart.
HaloZero•1h ago
I wish it had keyboard shortcuts, it's a bit of a sludge to click through twice.

Got 64,650: 20/19/17/18/12 (the intermediate one was a dumb mistake)

ItsBob•1h ago
Apparently I know 70,000 words... I got 90 out of 100 and it thinks I'm Stephen Fry!
popey•1h ago
That was a nice diversion. I got 76,750.
2bird3•1h ago
All the 3 incorrect answers are just indirect opposites of the correct one. Quite easy to determine which is correct, even without knowing the word
fcatalan•1h ago
71050, not bad for a non native speaker I guess. I missed 9/100.

But to be honest many that might catch out a native speaker are just the Spanish/French/Latin word, so it was too easy in a way.

femto•1h ago
I got 97/100 (80.5k) by picking the answer that has no relation to the word. Most of the incorrect answers bore some relation to the word, whether that be phonetic or a similarity to a root word.
mpeg•1h ago
Yeah I got 75k~ and did something similar ... most of the expert and grandmaster ones had at least 1 or 2 obvious incorrect answers, then it was a 50/50 so I usually went for the thing that sounded either closer to the root of the word or completely left-field

Anything up to expert was obvious

WithinReason•1h ago
Also, just pick the longest answer :)
blatherard•1h ago
It might be nice if you could unlock a "hard mode" or the ability to skip the first 1-3 levels after a first run. I scored a little over 81K and considered playing again because I like quizzes, but doing another batch of (to me) easy words seemed like a waste.
asdfasgasdgasdg•1h ago
Not a very good test. Too easy to guess many of the words, and the words seem to follow a theme. For example my list had five or six that had to do with speaking too much or too little (verbose, lugubrious, and a few others in that vein). And many easy words were placed late in the test (e.g. zeitgeist, facetious being in the expert and grand master categories?).

And it didn't even tell me at the end how many words I know!

There is a similar variant of such a test where you just go down a list of words of increasing obscurity, ticking the ones you are familiar with. If you do this once or twice, you can get a fairly good estimate of the actual number of words you know.

zaik•1h ago
That sounds like a good application of Item Response Theory (IRT).
jdiff•1h ago
78,250 is way more than I expected. I sure don't feel like I know 78,000 words.
NickHoff•1h ago
I enjoyed some of the incorrect options. For "Debilitate" one of the options was "Remove a bill from the tab".
JauntyHatAngle•1h ago
That was fun. Bit confused by the result because it says I was "wow are you stephen fry?" Which I assume meant I did decent. (72K).

But then below it said "you are a man of few words".

I take it the latter is just because I've only done the test once? But it's mixed messaging on first attempt I think.

Joe_Cool•1h ago
Combined with the factoid it features under "how is this calculated":

    However, most native speakers have an active vocabulary between 15,000 and 35,000 words.
We must be geniuses, lol.
welshwelsh•36m ago
That tracks. Active vocabulary means the set of words that someone knows well enough to actually use in their speech or writing.

That's always going to be smaller than the set of words for which a person can choose the correct definition out of four options.

sowbug•10m ago
Maybe "few words" means your larger vocabulary lets you use a single word to represent a concept that someone else would need several words to say. But the conversation ends up longer when the other person asks you to define the obscure word you just used.
amarant•1h ago
Fun game! I did worse than many others here, only 69.9k estimated words. But then English is my second language, so I'm pretty pleased with the result!
moron4hire•1h ago
Lethargic had an option "having the quality of lethargy".
tennfown•1h ago
Gaikwar - which I was able to guess was a former Indian state seems irrelevant as an “English” word especially given it seems to derive from a name that I have to assume is native to the region.
alistaira•1h ago
For those interested in the nature of the later, harder words but not willing to work through the earlier sets, here are the ones from my run:

Level 0: Core Basics
Abundant, Baffle, Candid, Dwell, Emerge, Frugal, Generic, Hinder, Impartial, Jovial, Knack, Lucid, Meager, Naive, Obsolete, Peculiar, Quench, Refute, Seldom, Tedious, Unique, Valid, Wary, Yearn, Zeal, Adequate, Barren, Coarse, Diligent, Esteem, Fickle, Gloom, Hoax, Ignite, Jolt, Keen, Linger, Mend, Numb, Omit, Pledge, Quota, Rural, Soothe, Toxic, Urge, Vow, Witty, Yield.

Level 1: Intermediate
Acumen, Benevolent, Complacent, Dilapidated, Eloquent, Fabricate, Gregarious, Hypothetical, Imminent, Juxtapose, Lethargic, Meticulous, Nostalgia, Oblivious, Pragmatic, Reiterate, Scrutinize, Tentative, Ubiquitous, Verbose, Wane, Aesthetic, Bolster, Candor, Defer, Elicit, Furtive, Glut, Heed, Impeccable, Lament, Modicum, Notorious, Opulent, Plausible, Resilient, Stagnant, Trivial, Viable, Zenith.

Level 2: Advanced
Alleviate, Breviary, Cacophony, Deferential, Ephemeral, Fastidious, Garrulous, Harangue, Iconoclast, Juggernaut, Laconic, Magnanimous, Nefarious, Obsequious, Paradigm, Recalcitrant, Sanguine, Taciturn, Ubiquity, Vacillate, Winsome, Zephyr, Abase, Banal, Capricious, Debilitate, Ebullient, Facetious, Gaikwar, Hackneyed, Idiosyncrasy, Jargon, Kindle, Labyrinth, Maverick, Narcissism, Ostracize, Palliate, Quagmire, Rancorous, Sagacity, Tantamount.

Level 3: Expert
Abstemious, Bellicose, Chicanery, Deleterious, Enervate, Fatuous, Gauche, Hegemony, Inculcate, Jejune, Kowtow, Lugubrious, Mawkish, Nonsectarian, Obdurate, Pernicious, Quotidian, Recapitulate, Supercilious, Tempestuous, Unctuous, Vehement, Winnow, Xenophobe, Ziggurat, Acquiesce, Bombastic, Circumlocution, Desultory, Equinox, Fiduciary, Gerrymandering, Hubris, Incognito, Kinetic, Loquacious, Metamorphosis, Nihilism, Orthography, Precipitous, Quasar, Reparation, Soliloquy.

Level 4: Grandmaster (The Obscure)
Accoutrement, Brobdingnagian, Crepuscular, Defenestrate, Equanimity, Flibbertigibbet, Grandiloquent, Hippopotomonstrosesquippedaliophobia, Ineffable, Jingoism, Kerfuffle, Logorrhea, Mellifluous, Obfuscate, Panacea, Quixotic, Rococo, Sesquipedalian, Tergiversate, Ultracrepidarian, Vicissitude, Weltschmerz, Xeric, Yclept, Zeitgeist, Absquatulate, Bumbershoot, Callipygian, Dord, Ergophobia, Fartlek, Gobbledygook, Houghmagandy, Interrobang, Kakistocracy, Lollygag, Mumpsimus, Nudiustertian, Omphaloskepsis, Pogonotrophy, Quire, Ratoon, Snollygoster, Tittynope, Ucalegon, Vagitus, Widdershins, Xylopolist, Yarborough, Zenzizenzizenzic.

grey-area•1h ago
Got a bit boring then suddenly very hard with some really esoteric words at the end in the ‘grandmaster’ level. It’d be nice if it got progressively harder without levels.

Some definitions were not great and alternatives a little silly at times but on the whole seemed pretty accurate.

Also probably needs calibrated as 96/100 was projected to 77k words, what would the estimate be for 100/100?

yousif_123123•1h ago
This was fun! And it told me I know 55k words which made me a little happy.

I'm not sure exactly how you did this, but I think you asked an LLM to come up with the wrong options. Two things to consider:

1. While the LLM can give good options, they won't always be hard to guess. I wonder if instead you can have the LLM generate very close words (or skip using an LLM entirely) and put those as the options.

2. If you will generate options with an LLM, make sure you are mindful of its inability to shuffle things around. The correct answer was overwhelmingly the first or second option in the list. You should ask the model to give the options in a uniform order (say from true meaning then decreasing amount of replayability), then manually shuffle them so that the probability of the correct answer landing on any one option (A, B, C or D) is always 25%.
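
The manual shuffle suggested in the second point could look roughly like this: a Fisher-Yates shuffle applied after generation, so the position of the correct answer does not depend on the order the model produced. This is only a sketch. `buildQuestion`, the question shape and the sample definitions are made up for illustration and say nothing about how the site stores its data.

```javascript
// Fisher-Yates shuffle: returns a copy of `items` in uniformly random order.
function shuffle(items, random = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1)); // pick from the unshuffled part
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Hypothetical question builder: the generator always lists the correct
// definition first, and the shuffle decides where it ends up on screen.
function buildQuestion(word, correct, distractors) {
  const options = shuffle([correct, ...distractors]);
  return { word, options, answerIndex: options.indexOf(correct) };
}

const question = buildQuestion("laconic", "using very few words", [
  "extremely talkative",
  "relating to lakes",
  "slow to anger",
]);
console.log(question);
```

Shuffling in code rather than asking the model to randomize keeps each of the four positions equally likely. It does nothing about the other tell people noticed in the thread, the correct answer being the longest one.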

ronbenton•1h ago
Some felt too easily guessable. Too many joke answers maybe?
ekjhgkejhgk•1h ago
I was doing well until I got to grandmaster.

Then I was doing poorly in grandmaster, until I realized you can ace grandmaster by just picking the longest explanation every time.

pgraf•1h ago
Really interesting, but I would love to be able to express honestly when I just guessed. This way the result would be much more scientifically sound. Four answers have a 25% chance of random correctness, which is a bit high in my opinion. I think either adding a "I don't know" or a confidence level (Known/educated guess/wild guess) would help.
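
For a sense of how much that 25% matters, here is the standard correction for guessing as a small sketch. It assumes every unknown word is answered by a uniformly random pick among the options, which this thread suggests is too generous to the quiz, since several people guessed far better than chance. `correctForGuessing` is an illustrative name, not something the site exposes.

```javascript
// If a taker truly knows a fraction p of the words and guesses the rest at
// random among k options, the expected observed score is p + (1 - p) / k.
// Solving for p gives the estimate below.
function correctForGuessing(correctCount, total, optionsPerQuestion = 4) {
  const chance = 1 / optionsPerQuestion;
  const observed = correctCount / total;
  // Clamp at 0: scoring below chance should not produce a negative estimate.
  return Math.max(0, (observed - chance) / (1 - chance));
}

// A 90/100 run with four options per question.
const known = correctForGuessing(90, 100);
console.log(`estimated fraction actually known: ${known.toFixed(3)}`);
```

An "I don't know" button or a confidence level, as suggested, would make this correction unnecessary because guesses would no longer be mixed in with known words.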
naishoya•1h ago
77,250 words: "Unbelievable. Are you actually Stephen Fry in disguise?"

I do concur that a refined collection of incorrect proposed responses which includes selections among terms with semantic proximity, conflated synonyms and plausible morphology could refine the accuracy of evaluations; and if the test was intended to bestow authentic assessments of lexicographical capability this would in all probability become an efficacious approach, but as a simply presentable quiz for folks with sesquipedalian proclivities I was not unduly discomfited by anything moreso than the extraneous clicks leading to and following the display of dichotomous determinations.

kubb•1h ago
Same here (72 750) but it doesn't feel right. I'm not a native speaker and I was able to guess some of them via elimination or cognates.

I'd say I know 10 000 words tops.

grey-area•53m ago
You may know more words than you think, many are shared with French and other Romance languages, particularly the more esoteric ones (see what I did there?). Taking another recherché example: palimpsest - very similar in English, French, Greek.
collabs•1h ago
I got 70,750 which is much higher than I expected. The early words were obvious. However, a lot of the later questions I could only answer because they were multiple choice. If I had to actually come up with a definition, I suspect my score would be much lower.
WithinReason•1h ago
81,250 97/100 without being a native speaker. Although truth be told only because I figured out how to guess well.
itvision•1h ago
Scientific Estimate: 36,250. Nah, I'm far worse.

Probably not too bad for a person whose native language is not English.

fp64•1h ago
When there are two options that describe exactly the opposite of each other, it will be one of them. That reduced the fun a bit, but then again, for some words I understood what they were dealing with, but not whether positively or negatively.
franciscop•1h ago
Only got 63,150 words. Considering English is the 3rd language I learned, I think I did pretty well.
EstanislaoStan•1h ago
Literally when I got to advanced and beyond just picking the longer and more complicated looking answer was the right one. I think this test is extremely flawed.
nickcw•1h ago
I have a copy of the shorter Oxford English Dictionary from 1970 which I inherited. It is two massive volumes and is only shorter in comparison to the full dictionary which is 12 volumes (more in more modern editions).

My shorter OED contains 163,000 words (compared to the 600,000 words of the longer).

According to this site I know 71,000 words... Let's test that against the OED. I should have about a 43% chance of knowing a word picked at random.

In my totally scientific test (ha) I chose 50 words at random from the OED and discovered I knew 29 of them for a score of 58% which is more than two sigma from 43%, thus disproving the hypothesis.

I forgot what that was now, but it was a fun experiment.
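
The "two sigma" figure can be reproduced with a normal approximation to the binomial, using only the numbers from the comment: 71,000 words known out of 163,000 in the dictionary, and 29 of 50 sampled words recognized. This is a back-of-the-envelope sketch. `binomialZ` is an illustrative helper, and the approximation is rough at a sample size of 50.

```javascript
// z-score of an observed proportion against a hypothesized proportion p,
// using the normal approximation to the binomial distribution.
function binomialZ(successes, trials, p) {
  const observed = successes / trials;
  const standardError = Math.sqrt((p * (1 - p)) / trials);
  return (observed - p) / standardError;
}

// Hypothesis from the quiz result: 71,000 known words out of 163,000.
const hypothesized = 71000 / 163000;
// Sample from the dictionary: 29 known words out of 50 picked at random.
const z = binomialZ(29, 50, hypothesized);
console.log(`hypothesized p = ${hypothesized.toFixed(3)}, z = ${z.toFixed(2)}`);
```

Under these assumptions the gap comes out a little above two standard errors, which matches the comment's claim, though only just.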

pclmulqdq•59m ago
I also got something around 70-80k with 95/100 correct words (I don't know or use most of these words, but the later sections have a lot of words with Greek or Latin origin, which made them easy to guess). One of my wrong words was a misclick in the first section, which I think dragged down the estimate quite a lot. You may have done something similar. I assume they use a simple formula where early misses cost you a lot and late misses cost you very little.
srean•58m ago
Neat way to validate.

Your method of sampling could be improved further, unfortunately at the expense of ease of use. If the dictionary was sorted according to difficulty, then you could use stratified sampling.

I comment on the related aspects here.

https://news.ycombinator.com/item?id=48599769

cake-rusk•1h ago
Apparently I am Stephen Fry in disguise :D

My score: 78,000 words, 20/20/19/18/18.

waterpowder•1h ago
69,250 (91/100) - I think being French helped a lot for the most complex words, as they're basically the same!
walthamstow•1h ago
76250, or 93/100. Native English speaker from London. Some of the last 10 words were seriously obscure.

Are accoutrement and ziggurat really English words? Accoutrement is even pronounced as French!

stavros•1h ago
Depending on what you consider an "English" word, anywhere from 0% to 100% of words are English words. I've definitely seen accoutrement and ziggurat in English, and quite often.
walthamstow•1h ago
Of course, the line is very blurry. I've used accoutrement(s) in English many times, but I've never considered myself to be speaking English when I use it. It's like joie de vivre or c'est la vie.
stavros•55m ago
What about "rendezvous", or "etiquette", or "RSVP", cliche, nuance, etc? Do you consider those French or English?

As you say, the line is very very blurry.

walthamstow•24m ago
Rendezvous and cliche yes. Nuance, etiquette, RSVP no. It's instinctive so I can't explain, but maybe because rendezvous and cliche require using French pronunciation. On this I think you could find more differing opinions than there are possible answers.
jcattle•1h ago
there's also https://www.myvocab.info/en

From what I can tell they actually have somewhat more robust science behind their algorithm (and far fewer questions to answer)

Jordan-117•56m ago
This one's much better. Shorter, faster, adapts to one's level, gives an out for being unsure, largely doesn't bother with definitions (except the occasional verification challenge), and even mixes in some fake words to ensure you're not BS-ing.
spelufo•1h ago
Nice. I want one in Spanish so I can compare results.
srean•1h ago
In addition to how much fun it was, it has potential pedagogic value for teaching sampling based estimation.

It would have paired well with an exposition of vanilla Monte Carlo and the benefits of stratified sampling.

Although stratified sampling is good, one can do better in this case by using adaptive sampling, where one uses a runtime (Bayesian) estimate of vocabulary to maximize information gain per question -- preferentially sample from those strata where the current stratum-specific estimate has higher variance.
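
A minimal sketch of that idea, under stated assumptions: each stratum gets a Beta(1, 1) prior on the fraction of its words you know, and the next question is drawn from the stratum whose contribution to the total estimate currently has the largest variance. This greedy rule is a simple stand-in for a real information-gain criterion, and `AdaptiveEstimator` and its methods are hypothetical names, not anything the site uses.

```python
def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

class AdaptiveEstimator:
    def __init__(self, band_sizes):
        self.sizes = list(band_sizes)
        # Beta(1, 1) is a uniform prior on the known fraction of each stratum.
        self.alpha = [1] * len(self.sizes)
        self.beta = [1] * len(self.sizes)

    def next_stratum(self):
        """Index of the stratum whose word-count estimate is most uncertain."""
        scores = [
            n * n * beta_variance(a, b)  # Var(n * p) = n^2 * Var(p)
            for n, a, b in zip(self.sizes, self.alpha, self.beta)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def record(self, stratum, correct):
        """Update the posterior for one answered question."""
        if correct:
            self.alpha[stratum] += 1
        else:
            self.beta[stratum] += 1

    def estimate(self):
        """Posterior-mean vocabulary size summed over all strata."""
        return sum(
            n * a / (a + b)
            for n, a, b in zip(self.sizes, self.alpha, self.beta)
        )
```

Large strata dominate early because their uncertainty is scaled by the square of the band size, so questions drift toward the small bands only once the big ones are pinned down.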

Joe_Cool•1h ago
Getting "Obfuscate" as #99 and "Quixotic" as #100 made me feel exorbitantly smart.
alentred•1h ago
Good fun! At first I was scared of having to answer 100 questions, but when the words got more sophisticated it turned out to be more engaging. Also, the result is good for self-esteem! :) Many thanks to the author!

I wonder if the test is calibrated to the fact that some answers are just well guessed? I am not a native English speaker, but I speak 3 languages overall and have basic notions in Latin, and I have to admit it helped a lot in "deciphering" a few words that I didn't know at all. And in at least 2 cases I just guessed correctly.

Findecanor•1h ago
I got an estimate of 70,550, from a score of 87/100 (20/18/16/17/16). Not native English speaker.

I suppose the words must be weighted, because other people in the thread with more correct words got a not much higher estimate.

steve_adams_86•51m ago
Strange. I got a lower estimate despite getting more correct than you and getting more grandmaster words.

Admittedly I had to guess several. It’s kind of an etymological deduction and estimation game at times.

naishoya•13m ago
There's no need to suppose:

From the website with just one more click - like one more wafer thin mint.

<snip> According to the Oxford English Dictionary (Second Edition), there are approximately 171,476 words in current use.

However, most native speakers have an active vocabulary between 15,000 and 35,000 words.

The Algorithm

We use Stratified Sampling. Instead of testing random words, we divide the language into 5 distinct difficulty bands based on frequency of use:

    1. Core Basics: ~3,000 words
    2. Intermediate: ~7,000 words
    3. Advanced: ~10,000 words
    4. Expert: ~25,000 words
    5. The Obscure: ~40,000+ words
Calculation

"If you answer 2 out of 3 'Intermediate' questions correctly, we estimate you know roughly 66% of the 7,000 words in that band."

Total Score = Σ (Accuracy in Band × Band Size) </clip>
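
The quoted formula can be sketched in a few lines. Two assumptions are baked in: the last band's "40,000+" is treated as exactly 40,000, and each band has 20 questions (inferred from the per-band breakdowns posted in this thread). The names `BANDS` and `estimate_vocabulary` are illustrative, not taken from the site's code.

```python
BANDS = [
    ("Core Basics", 3_000),
    ("Intermediate", 7_000),
    ("Advanced", 10_000),
    ("Expert", 25_000),
    ("The Obscure", 40_000),  # assumption: "40,000+" treated as exactly 40,000
]

def estimate_vocabulary(correct_per_band, questions_per_band=20):
    """Total Score = sum over bands of (accuracy in band * band size)."""
    total = 0.0
    for (_, size), correct in zip(BANDS, correct_per_band):
        total += size * correct / questions_per_band
    return round(total)

print(estimate_vocabulary([20, 20, 19, 18, 18]))  # breakdown posted with a 78,000 score
print(estimate_vocabulary([20, 18, 16, 17, 16]))  # breakdown posted with a 70,550 score
print(estimate_vocabulary([20, 20, 20, 20, 20]))  # ceiling under these assumptions
```

Under these assumptions the sketch reproduces the scores posted alongside those breakdowns, and a perfect run tops out at 85,000, the sum of the band sizes, rather than anywhere near the 171,476 figure. It also shows why a miss in the last band costs 2,000 words while a miss in the first costs only 150.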

stavros•1h ago
I got 98 words right and it estimated I know 82k words. That's less than half the quoted 170k number, so what would it have estimated at 99 or at 100?
egypturnash•1h ago
“You mastered 98 new words!

THE VERDICT

You are a person of few words, or perhaps just a mysterious one. Quite intriguing.”

—- This sounds more like a cute assessment of only getting two words right. And what do you mean “new words”? It wasn’t until eighty-odd words in that I actually got a word I didn’t know and had to guess by ruling out multiple-choice options.

steve_adams_86•54m ago
Nice work. I only got 90. It also summarized that as though I might learn English one day. Kind of an odd result. I’m not offended, just confused.
egypturnash•3m ago
vibe-coded index into the list of comments is backwards I guess
Glyptodon•1h ago
Some of the definitions offered are slightly short of what I expect. Like for "Obsequious" it offers "obedient to an excessive or servile degree" which isn't wrong, but it misses the expression of a sort of noisy eagerness in that servility.
thenthenthen•57m ago
Yeah, some definitions are super weird or overly specific, like ‘yield’ > ‘a specific amount of agricultural produce’ (iirc, ymmv)
HyperL0gi•1h ago
UX suggestion to make going through this much faster:

1. Frame each option with one key (1,2,3,4). The user presses 2 to select the second option

2. Let the user change options if they want until they press Enter. Enter submits the answer.

3. Once submitted, another Enter brings the next one

NateEag•57m ago
As a fluent native speaker who has read thousands of books and sometimes reads dictionary entries for fun, a number of these definitions are actually slightly off.

"Verbose," for instance, is defined as "Using more words than are needed."

That's not exactly wrong, but it's kind of misleading. "Verbose" explicitly means using a large pile of words, drowning the reader in far more words than are strictly necessary.

"More words than are needed" could be as limited as "used a three-word construction in a sentence where it could have been one."

There are many more like this.

Please, I beg all of you - don't use LLMs to generate linguistic slop that claims to be linguistic education.

I weep for the world that is to come.

mattas•54m ago
I had no idea there was an English word specifically to describe throwing someone out of a window. Defenestrate.
chromatin•53m ago
The UX is awful - I bailed out at 25/100 JUST IN LEVEL ONE (BASICS)

Might I suggest adaptive difficulty? After getting 10, 15, 20 correct in a row it should scale up the difficulty immediately, rather than waiting for 100 in the basic level 1...
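
A rough sketch of the streak-based promotion suggested above. The threshold of 10 and the five levels are placeholders, and `StreakLadder` is a hypothetical name; nothing here reflects how the site actually works.

```python
class StreakLadder:
    """Move up one difficulty level after a run of consecutive correct answers."""

    def __init__(self, levels=5, promote_after=10):
        self.levels = levels
        self.promote_after = promote_after
        self.level = 1
        self.streak = 0

    def answer(self, correct):
        """Record one answer and return the level for the next question."""
        if correct:
            self.streak += 1
            if self.streak >= self.promote_after and self.level < self.levels:
                self.level += 1
                self.streak = 0  # start a fresh streak at the new level
        else:
            self.streak = 0      # any miss resets the run
        return self.level
```

A fuller version would also demote after repeated misses, so that a lucky streak does not strand the user at a level that is too hard.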

scary-size•27m ago
Check button hidden under the URL bar thing in safari, progress bar hidden when scrolling check button in view. In between endless whitespace.
juancn•52m ago
The triple click is annoying.

I mean, select the word, then press check, then press continue.

It could be one single click and move to the next, show me my last result at the same time you ask me for the next one.

dbingham•52m ago
If the goal is to actually calculate how many words we know, then you should include an "I don't know" option. Sure, some people will choose to guess to inflate their score, but some of us will be honest because we legitimately want to know our scores.

If you force me to guess, then I'm going to guess. Not only does that give me a 25% chance of getting it right at random, but as others have pointed out, it is very hard to make a multiple choice question that isn't guessable by an astute enough test taker. I think I knew 80 - 85 of those words, but I scored 97, because those questions were very guessable.
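
To put numbers on that, here is a small sketch assuming 100 questions with 4 options each. Both function names are made up for illustration.

```python
def expected_score(known, total=100, options=4):
    """Expected correct answers if every unknown word is a blind guess."""
    return known + (total - known) / options

def implied_guess_rate(score, known, total=100):
    """Success rate on unknown words needed to reach the observed score."""
    return (score - known) / (total - known)

for known in (80, 85):
    print(known, expected_score(known), implied_guess_rate(97, known))
```

Blind guessing from 80 to 85 known words would be expected to land around 85 to 89 correct. Reaching 97 implies a hit rate of roughly 80% to 85% on the unknown words, far above the 25% baseline, which is the sense in which the questions are guessable.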

Also, reiterating everyone else's comments with respect to the UX needing fewer clicks, and also the definitions not being exact or precise in many cases.

alkyon•48m ago
I only got 4 wrong as a non-native speaker. Okay, I'm widely read in English, but among LLM-generated definitions it's just too easy to spot the right one.
vova_hn2•46m ago
Got 59,800, Performance Breakdown:

Core Basics 19/20

Intermediate 17/20

Advanced 19/20

Expert 14/20

Grandmaster 12/20

I guess, it's not too bad for a non-native speaker.

Minor feedback:

1. The correct answer for "Lethargic" is "Affected by lethargy". I think, definitions should not use words that share common root with the defined word, because:

a. it makes guessing too easy

b. it basically becomes a circular definition which is meaningless

2. Options almost always include 1 correct answer, 1 direct opposite and 2 completely random. Once you learn to recognise it, you can easily rule out 2 random options and have a 50/50 guess.

philipwhiuk•45m ago
The four options were generally:

* Correct word
* Opposite definition
* Another word's definition
* Opposite of that word's definition

Which massively reduces the difficulty

cainxinth•39m ago
79k. Missed three from the last group: Vagitus, Yarborough, and Quire.
sceptic123•35m ago
Yarborough is _also_ an English town so I should have got one more
extra88•27m ago
Same. Also, proper names should be excluded entirely; the only "Advanced" one I got wrong was a place name.
sireat•31m ago
This is rather like SAT from 35 years ago.

Same strategies apply for guessing the unknown, especially with a modicum (it was on the test!) of Latin knowledge.

Strange that pretty much everyone here is getting 70k estimates (93/100 for me).

Feels a bit high at least for me as a non-native speaker.

I got 2 words I knew wrong, and guessed about 5 unknown words correctly. Those were bizarre repetitive words I've never seen before.

I remember doing a similar test from a reputable university about 10-15 years ago also in an app format and only got about 30k estimate.

zeristor•21m ago
This is something that could be done for other languages, word lists are easy.

I’m not sure how you’d gauge what knowing each word would indicate.

Also adequate options, that sound plausible.

zulux•1h ago
In order to stunt on the poors, English borrowed a fair amount of Latin and Greek directly - especially in law, philosophy, and the sciences.
wongarsu•1h ago
English has this weird dichotomy where most of the words in a typical sentence are Germanic, while most of the words in the dictionary are French.

Fun fact: according to a quick count by AI using web search, the previous sentence contains 21 words of Germanic origin, 2 of Latin origin, 2 of Greek origin and 1 of French origin. Also the etymology of the word Germanic is Latin, while that of the word French is Germanic

graemep•1h ago
They are not. Quite a few have Latin roots and the like that corresponding French words share.
pessimizer•1h ago
Approximately 0.0% of those came into English through Latin, while around 100% came through Norman French.
grey-area•1h ago
Latin was commonly spoken at one time and used for religion and scientific discourse for even longer.
I_am_tiberius•1h ago
French English speakers usually have quite a good vocabulary. Getting to the point of speaking English is a milestone that's quite difficult for French speakers, though.
triceratops•1h ago
English is the PHP of human languages.
GeoAtreides•18m ago
I'm not sure PHP deserved that...
bobson381•15m ago
Huh, interesting. I retract my previous statement! I'd love to read about this if you have a source.
•
1h ago
Yeah I guessed that one right because xylophone player sounded like a trap.

I don't understand how they rank words though, some extremely common words like xenophobia were ranked as high as much more obscure ones.

rationalist•1h ago
66k for me, but I didn't get that word, instead I got ones like Hippopotomonstrosesquippedaliophobia, Flibbertigibbet, and Brobdingnagian... which the latter two interestingly do show up in my keyboard's word completion suggestions.
fittingopposite•1h ago
Haha. Yeah I figured Xylo- (wood) + sth. related to mono-poly so wood-seller made sense. Never have heard of this word before
pclmulqdq•52m ago
I think the test was vibe coded, because a xylologist is someone who studies wood, not someone who sells wood. I am not sure if "xylologist" was the exact word, though.

xylo- = wood; -logy = study

Indeed from M-W: "a branch of dendrology dealing with the gross and the minute structure of wood"

onionisafruit•1h ago
It would have been nice to have an “i don’t know” button. Instead I decided to select the first option for words I didn’t know instead of trying to figure them out. Although when I got to the final group I couldn’t resist trying to figure them out. It estimated 61k for me.
vova_hn2•39m ago
> A tangent: writing distractors for multiple choice questions is hard.

In case of online quiz you can have a "competition" between distractors:

1. start by having many more distractors than needed and pick randomly

2. for each measure the probability of it getting clicked (clicks/times it's shown)

3. show the most frequently clicked distractors more often
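
The three steps above could look roughly like this. It is a sketch, not a tuned design: `DistractorPool` is a hypothetical name, and the add-one smoothing is an added assumption so that distractors nobody has seen yet still get a chance to be shown.

```python
import random

class DistractorPool:
    """Track how often each distractor fools people and favour the effective ones."""

    def __init__(self, distractors):
        self.shown = {d: 0 for d in distractors}
        self.clicked = {d: 0 for d in distractors}

    def click_rate(self, d):
        # Add-one smoothing (an assumption): unseen distractors start at 0.5.
        return (self.clicked[d] + 1) / (self.shown[d] + 2)

    def pick(self, k, rng=random):
        """Choose k distinct distractors, weighted by smoothed click rate."""
        pool = list(self.shown)
        chosen = []
        for _ in range(min(k, len(pool))):
            weights = [self.click_rate(d) for d in pool]
            d = rng.choices(pool, weights=weights, k=1)[0]
            chosen.append(d)
            pool.remove(d)  # sample without replacement
        return chosen

    def record(self, shown, clicked=None):
        """Log one question: which distractors were shown and which was clicked, if any."""
        for d in shown:
            self.shown[d] += 1
        if clicked is not None:
            self.clicked[clicked] += 1
```

Sampling by weight rather than always taking the top few keeps some exploration going, so a distractor that got unlucky early is not retired for good.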

mpeg•1h ago
It'd also be a lot less awkward to go through 100 words if it had keyboard shortcuts (1-4 for the words, enter to submit) and if they fixed the layout shift jank
goodmythical•34m ago
wouldn't even let me tab to submit, you had to click, tab through each following option, then to submit, but then you had to tab again to confirm the submission!
vova_hn2•43m ago
> I suggest skipping the submit button and just showing it's correct when pressing and moving on after a sec or so.

Having an answer counted as incorrect, just because I've accidentally touched the screen of the phone? I would absolutely hate that.

analog8374•1h ago
it's intentional. therefore testing vocab isn't the point.

I'm guessing it's testing our susceptibility to machine-generated compliments

latexr•1h ago
> it's intentional.

What is?

> I'm guessing it's testing our susceptibility to machine-generated compliments

I fail to see the point. For one, the compliments aren’t particularly good or interesting; for another, I didn’t even read them (I just went back to check after your comment), I simply clicked when seeing green.

analog8374•1h ago
too many clicks per word. and the distance between click points. that's intentional.

well the point would be to see how susceptible you are to that. They're figuring out where your cost vs reward tipping point is.

philipwhiuk•47m ago
There's a small handful, mostly QI-inspired.
latexr•5m ago
I think you’re reading too much into it. I think it’s just a common design pattern that was copied and is clearly optimised for mobile, where the distance doesn’t matter that much.

Anyway, if they were running metrics on that they just became useless because I automated responding to it a bunch of times.

https://news.ycombinator.com/item?id=48598586#48600403

sandworm101•1h ago
100 is too many? That's two or three minutes at most.

I would suggest a bias in this test towards reading. More than a couple are words i know but rarely see in print. But maybe im too much a fan of british TV so i hear many of their words without seeing them written down.

sd9•1h ago
Did you actually do 100 words? It wasn't two or three minutes. With good UX, sure. But I wasn't getting through 1 word per second.
DC-3•1h ago
It also doesn't get hard enough. Also way too many of the words are just words about long words, or the tendency to be verbose.
thenthenthen•1h ago
Lol. Yeah. Non native here but gave up at about 50 words. Too many words, too easy. And my English SUCKS
alentred•59m ago
> It also doesn't get hard enough

Oh come on! Like you really knew what "Hippopotomonstrosesquippedaliophobia" is?

iugtmkbdfil834•49m ago
:D I did better than expected, but I did miss that one. I learned some fun ones.
philipwhiuk•48m ago
It does get hard enough but only in the very last fraction.

Zenzizenzizenzic for example.

sowbug•21m ago
Plus a scroll on mobile because the submit button is below the fold, though it seems to stay in the right place after the first scroll.
cyanydeez•1m ago
yeah, it should just be click->next;

I got tired after 8 words, looked at how many I'm supposed to know and gave up.

It'd be improved with statistical analysis; just progressively get harder and try to guess. If you wanted to gamify, you could update the stats after each answer.