The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
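
To make the setup in the summary concrete, here is a minimal sketch of what "checking both the final answer and the reasoning steps" could look like when the steps come from A* search. Everything in it is an assumption for illustration: the grid, the `create`/`close` event format, and the helper names `astar_with_trace`, `plan_is_valid`, and `trace_is_valid` are hypothetical and are not taken from the paper.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    The trace is a list of ("create" | "close", cell) events. This event
    format is a hypothetical choice made for this sketch.
    """
    def h(cell):  # Manhattan distance, admissible on a 4-connected grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry, already expanded
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                parent[nxt] = cell
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt))
    return None, trace

def plan_is_valid(grid, path, start, goal):
    """Check the final answer: unit steps from start to goal, avoiding walls."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == 1:
            return False
    return True

def trace_is_valid(trace):
    """Check the steps: a cell is closed at most once, and only after creation."""
    created, closed = set(), set()
    for op, cell in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell in created and cell not in closed:
            closed.add(cell)
        else:
            return False
    return True

# Toy maze (1 = wall); the only route from top-left to bottom-left goes around.
GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(GRID, (0, 0), (2, 0))
corrupted = list(reversed(trace))  # same events, nonsensical order

print(plan_is_valid(GRID, path, (0, 0), (2, 0)))  # the answer checks out
print(trace_is_valid(trace))                      # and so do the real steps
print(trace_is_valid(corrupted))                  # the scrambled steps do not
```

The point of keeping the two checks separate is that the same valid plan can be paired with either the genuine trace or the scrambled one, which is roughly the kind of mismatch the summary says the authors looked for.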
