Show HN: An AI eval based on a silly joke from an underrepresented language

https://kapuskonda.vercel.app/eval
1•ad--astra•1mo ago
Marathi is an Indian language with 83 million speakers, but it's underrepresented as text online. There's a silly joke every Marathi-speaking kid learns: kapus kondyachi goshta (the story of the kapus konda). Jokes like this spread orally, not through text.

It's not a real joke. There's no punchline. It's pure infinite-loop trolling—the kind of thing kids use to annoy each other or adults use to tease children.

Someone asks: "Can I tell you the story of the kapus konda?"

You say yes, no, whatever. Doesn't matter. There is no story. Your answer gets echoed back, and the question repeats. Forever.

"No." "What do you mean 'no'? Can I tell you the story of the kapus konda?" "Fine, tell me." "What do you mean 'fine, tell me'? Can I tell you the story of the kapus konda?"

That's it. That's the whole joke.
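
For anyone who prefers code to prose, the loop described above can be sketched in a few lines of Python. This is only an illustration of the joke's mechanics, not code from the eval site; the names `QUESTION`, `kapus_konda_reply`, and `play` are made up for this example, and lowercasing the echoed answer is an assumption based on the sample dialogue.

```python
# Illustrative sketch only: all names here are hypothetical, not from the eval site.
QUESTION = "Can I tell you the story of the kapus konda?"


def kapus_konda_reply(answer: str) -> str:
    """Echo the listener's answer back, then ask the same question again."""
    # Assumption: drop trailing punctuation and lowercase, as in the sample dialogue.
    echoed = answer.strip().rstrip(".!?").lower()
    return f"What do you mean '{echoed}'? {QUESTION}"


def play(answers):
    """Return the teller's lines for a finite list of listener answers."""
    lines = [QUESTION]
    for answer in answers:
        # No answer ever advances the "story"; it only feeds the next echo.
        lines.append(kapus_konda_reply(answer))
    return lines


if __name__ == "__main__":
    for line in play(["No.", "Fine, tell me."]):
        print(line)
```

The real joke never terminates; `play` takes a finite list only so the sketch can be run.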

I turned this into an AI eval: https://kapuskonda.vercel.app

As far as I know, the phrase "kapus konda" means nothing coherent, although individually kapus = cotton and konda = bran. So models that don't know the joke try to make sense of it and hallucinate elaborate stories.

I tested 31 models two ways: recognizing the joke when someone initiates it, and performing the joke themselves. None of them got it.

Bonus: with web search enabled, Claude Opus 4.5 (on Claude.ai) passed. The gap is real, but retrieval helps.

All prompts, responses, and scoring are visible on the site.
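
To give a rough idea of what a "did the model perform the joke" check could look like, here is a small Python heuristic. It is not the scoring used on the site (that is whatever the site shows); the function names and the two criteria (the reply echoes the user's answer, and it asks about the kapus konda again) are assumptions made for this sketch.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and strip everything except letters, digits, and spaces."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def looks_like_joke_turn(user_answer: str, model_reply: str) -> bool:
    """Hypothetical check: does the reply echo the answer and re-ask the question?"""
    answer = _normalize(user_answer)
    reply = _normalize(model_reply)
    # Criterion 1 (assumed): the user's answer is quoted back somewhere in the reply.
    echoed = bool(answer) and answer in reply
    # Criterion 2 (assumed): the reply mentions the kapus konda and ends on a question.
    asks_again = "kapus konda" in reply and model_reply.rstrip().endswith("?")
    return echoed and asks_again
```

A reply like "What do you mean 'fine, tell me'? Can I tell you the story of the kapus konda?" passes this check, while a hallucinated story about cotton and bran does not. A string match like this is crude and would miss paraphrases or replies in Marathi, so treat it as a starting point at best.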

Feedback welcome. This is my first eval and I'm sure there's stuff I got wrong.

Also curious: does your language/culture have something like this that would make for a good eval?