
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
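For anyone wondering what "reasoning steps based on A* search" could look like as training data, here is a minimal sketch. It is not the paper's code: the grid encoding, the "create"/"close" trace format, and the names `astar_with_trace`, `path_is_valid`, and `trace_is_consistent` are all assumptions made for illustration. The point is only that the final answer (the path) and the intermediate steps (the trace) can be checked separately, so one can be right while the other is wrong.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid of '.' (free) and '#' (wall) cells.

    Returns (trace, path). The trace is a list of
    ("create" | "close", cell, g, h) tuples (a hypothetical format);
    the path is a list of cells from start to goal, or None.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return trace, None


def path_is_valid(grid, start, goal, path):
    """Check the final answer only: endpoints, free cells, unit steps."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if any(grid[r][c] == "#" for r, c in path):
        return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def trace_is_consistent(trace):
    """Check the steps only: a cell must be created before it is closed."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif cell not in created:
            return False
    return True
```

With these two checkers, a (trace, path) pair can be graded on each axis independently. Scrambling the trace while leaving the path untouched gives a sample with a correct answer and invalid steps, which is roughly the kind of training data the summary describes.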
