The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
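
To make the "check both the final answers and the reasoning steps" idea concrete, here is a rough sketch of what validating a plan separately from its search trace might look like. It is not the paper's code: the grid maze, the "create"/"close" trace format, and the function names `astar_with_trace`, `trace_is_valid`, and `plan_is_valid` are all assumptions made for illustration.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid ('#' is a wall); return (plan, trace).

    The trace is a list of (op, cell) tuples. "create" means the cell was
    pushed onto the open list; "close" means it was expanded.
    This trace format is an assumption, not the paper's exact encoding.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    open_heap = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#" or nxt in closed:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                parent[nxt] = cell
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt))
    return None, trace


def trace_is_valid(grid, start, trace):
    """Check that a trace could have come from a real search on this grid."""
    created, closed = set(), set()
    last_closed = None
    for op, (r, c) in trace:
        in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not in_bounds or grid[r][c] == "#":
            return False
        if op == "create":
            if last_closed is None:
                if (r, c) != start:
                    return False  # only the start may be created first
            elif abs(r - last_closed[0]) + abs(c - last_closed[1]) != 1:
                return False  # must neighbor the most recently closed cell
            created.add((r, c))
        elif op == "close":
            if (r, c) not in created or (r, c) in closed:
                return False  # cannot close an unseen or already closed cell
            closed.add((r, c))
            last_closed = (r, c)
        else:
            return False
    return True


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer alone: start to goal, no walls, unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    if any(grid[r][c] == "#" for r, c in plan):
        return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(plan, plan[1:]))


grid = ["..#", "...", "#.."]
start, goal = (0, 0), (2, 2)
plan, trace = astar_with_trace(grid, start, goal)
corrupted = trace[::-1]  # same plan, but the trace no longer describes a search

print(plan_is_valid(grid, start, goal, plan))  # the answer checks out
print(trace_is_valid(grid, start, trace))      # the genuine trace checks out
print(trace_is_valid(grid, start, corrupted))  # the reversed trace does not
```

The point of keeping the two checks separate is that a correct plan can sit next to a trace that fails validation, which is the kind of mismatch the summary above describes.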
