
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
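For anyone who wants a concrete picture of the setup the summary describes, here is a minimal sketch, not the paper's code. It assumes a small 4-connected grid maze, a Manhattan-distance heuristic, and illustrative "create"/"close" trace operations; the names `astar_with_trace` and `plan_is_valid` are hypothetical. The solver emits both a final plan and a step-by-step trace, and the plan is then checked on its own, without looking at the trace at all.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (plan, trace). The trace is a list of (op, cell, cost)
    tuples; the op names are illustrative, not a fixed format.
    """
    def h(cell):
        # Manhattan distance to the goal (admissible on this grid).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            # Walk parent pointers back to the start.
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng))
    return None, trace


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only; the trace plays no part here."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for r, c in plan:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == 1:
            return False
    # Consecutive cells must be orthogonal neighbours.
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(plan, plan[1:]))


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
plan, trace = astar_with_trace(grid, (0, 0), (2, 0))
print(plan_is_valid(grid, (0, 0), (2, 0), plan), len(trace))
```

Because `plan_is_valid` never reads `trace`, a trace could be shuffled or replaced with noise and the answer would still pass the check. That separation between checking the steps and checking the answer is what the summary is pointing at.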
