Show HN: I Built Paul Graham's Intellectual Captcha Idea

27•nowflux•2h ago

PG has posted about improving social networks using something like an "intellectual CAPTCHA" many times [1][2][3][4] - "Make users pass a test on basic concepts like the distinction between necessary and sufficient conditions before they can tweet."

I felt the same way. So I built one using a mix of simple math, logic, and Twitter/X Community Noted posts. Try sample questions here - https://mentwire.com/sample - without signing up.

- Invites are temporarily open to HN users.

- Onboarding test + one daily question before accessing feed, post or reply.

- Post authors are anonymous until upvoted or downvoted, forcing evaluation of content on merit.

- Face ID (on-device only) to post/reply, pangram checks for AI text.

Sourcing good questions turned out to be much harder than I thought. If you have suggestions to scale this, I would love to hear them. Eventually, this could be gated across disciplines/topics to get a competence × interest graph instead of the pure interest graph of today's social networks.

[1] https://x.com/paulg/status/1235949761359904768 [2] https://x.com/paulg/status/1576517990182359040 [3] https://x.com/paulg/status/1514979883948126209 [4] https://x.com/paulg/status/1505842647319126016

Repost from https://news.ycombinator.com/item?id=47577829. This link contains a full quiz and links directly to a sample to try without signing up.

Comments

philipkglass•2h ago

At least one of the test questions was just a screen shot from a tweet. It was difficult to read. I'd suggest extracting text from screen shots with OCR. Apple has built-in functionality for this on their operating systems with Live Text. There are strong open source systems based on small vision language models for this, too. The one I have been recommending lately is GLM-OCR:

https://github.com/zai-org/GLM-OCR

It's fast and can run even on low-resource computers.

---

Does this CAPTCHA actually resist computers? I didn't try feeding the questions I got to an LLM, but my sense is that current frontier models could probably pass all of these too. Making generated text pass the pangram test is simple enough for someone actually writing a bot to spin up automated accounts.

tripplyons•2h ago

I think it's more about resisting some humans than it is about resisting machines.

_alternator_•2h ago

Two mild concerns: first, I missed one and it told me I didn’t miss any at the end.

Second, some of the logic problems have flawed premises (eg All licensed pilots must pass a medical exam. Jake is a licensed pilot, therefore Jake passed a medical exam.) If you see the flaw in the premise (it assumes no fraud) then the conclusion does not follow.

I’m not sure you’re going to be able to actually improve human discourse this way. The idea that it’s ‘irrationality’ that’s the source of Xitter’s problems is far too shallow to really make a change.

rogual•2h ago

I took the pilot one as an abstract logic type question where you're supposed to assume the premise is true, so I said yes and the page said I was right, because that's a "valid logical deduction" or something.

Then there was another question in the same format that said "if you study hard enough you'll pass the exam. You didn't pass, so you didn't study hard enough." So I thought, oh, another logic one, and said yes to that one too, but the page was like, "not quite! You might fail for other reasons!"

alwa•1h ago

And if, as OP says, it’s necessity and sufficiency we’re testing—whether or not there were also other reasons contributing to your exam failure, wouldn’t failing that one necessary condition be sufficient to fail the outcome?

wlkr•1h ago

Yes, I also assumed an imperfect system with cases of fraud for the medical exam question and was quite surprised by the overly simplistic response.

Windchaser•30m ago

> If you see the flaw in the premise (it assumes no fraud) then the conclusion does not follow.

Right. Or he could've been grandfathered in.

But more basically: this is logically valid, but not logically sound. These are two different ways in which something may be "true" or "false", and in this format, it's not completely clear, soundness vs validity. Based on context clues like the absurd premise of pilots -> medical exam, I assumed validity, but it's still a weird format.
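The validity point in this sub-thread can be checked mechanically. Below is a minimal sketch, assuming the quiz items map onto the standard argument forms the way the commenters quote them: the pilot question as modus ponens, and the "study hard" question as modus tollens. The helper names (`is_valid`, `implies`) are made up for illustration. A truth table only tests validity (the conclusion follows from the premises); it says nothing about soundness (whether the premises are true in the real world, e.g. fraud or grandfathering).

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'a -> b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """An argument form over (p, q) is valid if no truth assignment makes
    every premise true while the conclusion is false."""
    for p, q in product([True, False], repeat=2):
        if all(prem(p, q) for prem in premises) and not conclusion(p, q):
            return False  # found a counterexample
    return True


# Modus ponens: P -> Q, P, therefore Q (the pilot question as quoted)
modus_ponens = is_valid([implies, lambda p, q: p], lambda p, q: q)

# Modus tollens: P -> Q, not Q, therefore not P (the exam question as quoted)
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)

# Affirming the consequent: P -> Q, Q, therefore P
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)

# Denying the antecedent: P -> Q, not P, therefore not Q
denying_antecedent = is_valid([implies, lambda p, q: not p], lambda p, q: not q)

print("modus ponens valid:", modus_ponens)                   # True
print("modus tollens valid:", modus_tollens)                 # True
print("affirming the consequent valid:", affirming_consequent)  # False
print("denying the antecedent valid:", denying_antecedent)   # False
```

Under this reading, both quoted questions are valid as pure forms, so marking one "yes" and the other "not quite" only makes sense if the second is being judged on real-world soundness instead.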

snissn•2h ago

This is weird political propaganda. The first post misrepresented annual costs of housing.

dwroberts•1h ago

Yeah this was my immediate impression as well. The assertion that homing every homeless person is the same as putting them up in unique per-person accommodation that ends up costing 65 billion is absurd.

littlestymaar•1h ago

The first one is utterly stupid.

Housing is a very complex issue that goes well beyond the sheer cost of the housing unit.

Do I think “solving homelessness” is easy with $10B? No. Does the calculation made in the answer make any sense? Absolutely not.

southerntofu•1h ago

Came here to say this. It takes political will, but money is certainly not the issue. It's just like with healthcare, free healthcare for all costs much less per person than privatized healthcare like they have in the USA. It's just a matter of designing the system properly so the money doesn't get siphoned off by parasite investors.

adamm255•1h ago

All the references about Data Center water use were from one guy too.

metalliqaz•2h ago

it's a really nice idea, but of course completely antithetical to the business model of modern social media platforms. So, it will never go anywhere. HN might be the only locale with any real numbers that I could see actually using it. Even BlueSky I think could never risk something like this.

as an interesting thought experiment, consider the questions that TruthSocial would put in. would an average unsophisticated user be able to tell the difference between your product and a hopelessly biased version such as that? they would support the correct answers with their own misinformation. Would it be just another schism of reality?

voodooEntity•2h ago

Funny thing, had to laugh :)

airza•2h ago

I opened it, it told me it was impossible to build a house in california for less than 350K, i closed it

littlestymaar•1h ago

Worse: the author probably assumes it's $350k per year since they are comparing to a yearly expense.

Intellectual captcha™

next_xibalba•1h ago

Same. And I'm not even focused on whether this is a reasonable number or not. The quoted tweet also says "But our politicians would rather spend that on genocide." And I'm asked to evaluate whether this is "accurate" with a thumbs up or thumbs down. (According to Mentwire, it is not accurate). So I'm evaluating both the cost of housing the homeless, but also whether politicians would rather fund genocide. So, this seems like it is not really an intellectual CAPTCHA, but rather an ideological CAPTCHA.

And just to disclose my biases, I would tend to believe that $350k is an absurdly high figure and that politicians are obviously not holding a vote where they are forced to choose between ending homelessness and funding genocide. But I believe that people who disagree with me can be considered intelligent and not "too dumb to pass an intellectual CAPTCHA".

ferfumarma•1h ago

Seriously. And not even to build a house, but to make a single person not homeless. Give me a break.

riffraff•1h ago

Perhaps the test was that if you finish the test you haven't passed it.

zarzavat•55m ago

Let's assume that the tweet is proposing to spend $10 billion per year to end homelessness in the entire US, since it contrasts it with genocide which is clearly a national objective not a local one.

A quick Google gives on the order of 1 million homeless people in the US. That's $10k per person per year which is the correct order of magnitude for the price of housing someone.

I believe OP missed the "per year" in the tweet that's why they are comparing to house prices rather than the yearly cost of housing, which is obviously much smaller because houses last longer than 1 year.
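The arithmetic in this comment can be written out as a quick sanity check. This is only a sketch using the figures quoted in the thread ($10 billion per year, roughly 1 million people, and the $350k one-time figure other commenters attribute to the quiz answer). The 30-year lifetime used for amortization is a purely hypothetical number chosen for illustration, and the function names are made up.

```python
def cost_per_person_per_year(annual_budget: float, people: int) -> float:
    """Spread an annual budget evenly across a population."""
    return annual_budget / people


def amortized_annual_cost(one_time_cost: float, years: int) -> float:
    """Spread a one-time cost evenly over the years the asset is used."""
    return one_time_cost / years


annual_budget = 10_000_000_000    # $10B per year, as the tweet is described in the thread
homeless_population = 1_000_000   # rough order of magnitude quoted in the comment
per_person = cost_per_person_per_year(annual_budget, homeless_population)

unit_cost = 350_000               # one-time per-unit figure attributed to the quiz answer
assumed_lifetime_years = 30       # hypothetical assumption, not from the thread
amortized = amortized_annual_cost(unit_cost, assumed_lifetime_years)

print(f"Budget per person per year: ${per_person:,.0f}")         # $10,000
print(f"Unit cost amortized over 30 years: ${amortized:,.0f}")  # $11,667
```

With these inputs the yearly budget per person and the amortized yearly unit cost land in the same order of magnitude, which is the point the comment makes about comparing a yearly expense to a one-time price.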

tomasphan•1h ago

I answered 8/10 correctly but mostly on instinct, for example betting that the Trump tweet is misleading. Opus 4.6 got 9/10 correct. You might need an internal time limit (don't show the user) and some strawberry questions.

throw_winblows•1h ago

Intellectual?

Centigonal•1h ago

One issue with this is that it mixes hypothetical formal logic style problems (where there are clear, inflexible rules) with real life examples (where group membership/traits, cost estimation, and causal attribution are less clear) without always disambiguating which one is which. Fun quiz though!

strangattractor•1h ago

Yes - I was given a syllogism that was logically correct but the truthiness of the premises was wrong or misleading in the real world.

All pilots need a medical exam to have a license.

John is a pilot.

John has had a medical exam.

Pilots can be licensed without a medical exam. It is illegal for them to fly without a valid medical but the 2 are separate issues. Also LSA pilots do not need a medical.

dogleash•1h ago

> "Make users pass a test on basic concepts like the distinction between necessary and sufficient conditions before they can tweet."

If twitter ever became what he says he wants, he'd quit using it within a month. He already has the option to close twitter and seek out experts' writing. Why is he choosing to bask in the emotions generated by people being wrong on twitter?

It's like listening to a friend complain about twitter being "full of" content that you rarely/never see on your feed. Nah, that's their algorithm and they just told you exactly who they are.

blamestross•1h ago

Reminds me of IQ tests I took as a kid.

"Finish the sequence" with 4 options and "no pattern" as the choices.

It becomes "what does the moderately intelligent person who wrote the test think counts as a pattern", not the intended exercise at all. There were never enough samples to even guess at a real pattern in them.

gertop•1h ago

> Try sample questions here without signing up

It's very gracious of you to let us fill captchas without signing up first.

anon115•1h ago

With fine iterations I think it will get there. The idea is there, I see it.

littlestymaar•1h ago

You can rename it “conservative circle jerk captcha” and it would be more accurate.

notsound•1h ago

I got a 10 out of 10 because I've seen these strawmen in center-right arguments before. Definitely promotes thinking inside the box; the homelessness question presupposes the most expensive solution (buying the homeless homes) in opposition to annual costs that would probably go down over time. I doubt both figures.

tomjen3•1h ago

I love the idea of this. I want it to succeed.

But most of the questions I got... They weren't very good and not just because I got them wrong - I got a bunch of them right that I shouldn't have.

For example, the one about homelessness, where it ends with a guy saying our politicians would rather use the money for genocide.

I downvoted the statement for that reason, got told my vote was correct, and then it turned out it was correct only because of the first part of it.

I think you're trying to import statements automatically and I fear that won't work. I also fear that you're just going to get crap, to be honest. And your social network doesn't deserve that.

I think your best bet is to look at the kind of questions asked on the LSAT, and just do a bunch of essentially IQ and general-knowledge questions. Take the input from Twitter as inspiration, use it as a template and it might work.

One thing you might consider is wanting to filter out people who can't see past their own political agenda.

You can do that by making enough questions so that you're sure to catch people, no matter what they believe on all the hot issues of the day. This probably isn't as hard as it sounds, there's only going to be seven or eight hot issues.

You pick three of them and you should be pretty certain that you will cover the entire spectrum. So for example, you could make sure to include pro LGBT, pro abortion and pro guns. You would catch most people on that and then you should exclude them if they cannot see past their blindness.

I hope you make this work, the world needs it.

