
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
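
For anyone who wants to see the distinction the summary draws between "the answer is right" and "the steps are right", here is a rough sketch, not the paper's code. It assumes a small grid maze as the task (the summary above does not say what the problems are), and the trace format of `("create" | "close", cell, g, h)` tuples is made up for illustration, as are all the function names. It runs A*, records a trace, and then checks the final plan and the trace separately.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid; '#' marks a wall.

    Returns (trace, plan). The trace format is a hypothetical one chosen
    for this sketch: ("create" | "close", cell, g, h) tuples.
    """
    def h(cell):
        # Manhattan distance to the goal (admissible on a unit-cost grid).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return trace, plan[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng, h((nr, nc))))
    return trace, None


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only: a wall-free path of unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


def trace_is_consistent(trace):
    """Check the steps only: a cell must be created before it is closed."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif cell not in created:
            return False
    return True


grid = ["..#",
        "...",
        "#.."]
trace, plan = astar_with_trace(grid, (0, 0), (2, 2))
scrambled = trace[::-1]  # same tokens, but no longer a valid A* run
print(plan_is_valid(grid, (0, 0), (2, 2), plan))
print(trace_is_consistent(trace), trace_is_consistent(scrambled))
```

The point of keeping the two checks separate is that a training example can pair a valid plan with a scrambled trace, which is roughly the kind of "right answer, wrong steps" data the summary describes.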
