The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
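
To make the setup in the summary above more concrete, here is a rough sketch of how an A* search can emit a step-by-step trace alongside its final plan, and how a "correct" training example might differ from one whose trace has nothing to do with the problem. Everything specific here is an assumption for illustration: the grid encoding, the `create`/`close` trace entries, the function name `astar_with_trace`, and the idea of building the mismatched example by borrowing a trace from a different maze. None of it is taken from the paper's actual data format.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid:  list of strings; '#' is a wall, any other character is free.
    trace: list of (op, cell, g, h) tuples, op being "create" or "close".
           This trace format is hypothetical, chosen only for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent links back to the start to recover the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng, h(nxt)))
    return None, trace  # goal unreachable


# Two small mazes (hypothetical examples).
grid_a = ["..#", "...", "#.."]
grid_b = ["...", ".#.", "..."]
path_a, trace_a = astar_with_trace(grid_a, (0, 0), (2, 2))
path_b, trace_b = astar_with_trace(grid_b, (0, 0), (2, 2))

# "Correct" example: problem, its own trace, its own plan.
correct_example = (grid_a, trace_a, path_a)
# Mismatched example: same problem and plan, but a trace from another maze.
swapped_example = (grid_a, trace_b, path_a)
```

Because the trace is produced by a known algorithm, it can be checked mechanically against the maze it claims to describe. That is what makes it possible to ask whether a trained model's intermediate steps are valid separately from whether its final answer is.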

Codeberg is why developers are broke

https://sharemygit.com/
1•onesandofgrain•2m ago•1 comments

AI Fails at 96% of (General Work) Jobs (New Study)

https://www.youtube.com/watch?v=z3kaLM8Oj4o
2•swolpers•2m ago•1 comments

A chatbot's worst enemy is page refresh

https://zknill.io/posts/chatbots-worst-enemy-is-page-refresh/
1•zknill•5m ago•0 comments

How Michael Abrash doubled Quake framerate

https://fabiensanglard.net/quake_asm_optimizations/
1•Audiophilip•6m ago•0 comments

Show HN: Claude Remote – control Claude Code on your Mac from your phone

1•ChilinAI•6m ago•0 comments

Organising Entryway Clutter with a Double Wardrobe with Drawers

https://dreamhomestore.co.uk/collections/wardrobes
1•Stevencoles89•8m ago•1 comments

Kintsugi

https://events.sonarsource.com/kintsugi/
1•handfuloflight•8m ago•0 comments

A reinforcement learning agent that learns to play Kung Fu Master

https://shantanugoel.com/2026/02/15/teach-machines-kungfu/
1•devnonymous•9m ago•0 comments

The TRAP – The wasted opportunities of the Orbán era [video]

https://www.youtube.com/watch?v=9NQEcLIiOpM
1•r_sz•12m ago•0 comments

Show HN: OpenCode Upgrade Skill: Automating Updates

1•ekadet•14m ago•0 comments

Foursquare scrapped engineering manager titles

https://sfstandard.com/2026/02/03/foursquare-scrapped-engineering-manager-titles/
1•walterbell•15m ago•0 comments

Interpreting OCapN Principles in Cloud-Native Agentic AI Architectures

https://serefayar.substack.com/p/interpreting-ocapn-principles-in-cloud-native-agentic-ai
1•serefayar•17m ago•0 comments

Making a product that Marl loves

https://invertedpassion.com/making-a-product-that-marl-loves/
1•twapi•17m ago•0 comments

Memoirs from the old web: IE's crazy content rating system

https://www.devever.net/~hl/pics
1•Diti•18m ago•0 comments

Qwen3.5: Towards Native Multimodal Agents

https://qwen.ai/blog?id=qwen3.5
3•danielhanchen•20m ago•2 comments

Show HN: OpenClaw – An OS for AI agents that do work

https://github.com/mupengi-bot/mupengism
1•mupengism•21m ago•0 comments

ERAO – Ask questions in plain English over your database or files

https://erao.digital
1•jorjinio•22m ago•1 comments

Singapore says China-backed hackers targeted its four largest phone companies

https://techcrunch.com/2026/02/10/singapore-china-backed-hackers-targeted-largest-phone-companies...
3•JeanKage•23m ago•0 comments

Fluxer: Free, open source instant messaging and VoIP platform

https://github.com/fluxerapp/fluxer
2•thunderbong•24m ago•0 comments

The AI Advantage Established Companies Have over Startups

https://www.context-link.ai/blog/hidden-ai-advantage-established-companies
1•oliaukus•24m ago•0 comments

Show HN: We rebuilt Flood-It in Bun/vanilla JavaScript, and added a Maze mode

1•ekremkrc•26m ago•1 comments

Show HN: Dominake – A domino puzzle where 5×6 grids are impossible

1•UnclonedMath•30m ago•0 comments

Show HN: Train AI Agents to Write Better Playwright Tests

https://testdino.com/blog/playwright-skill/
2•tanmay001•34m ago•0 comments

Friends Might Be Sharing Your Number with ChatGPT Contacts Sync

https://www.pcmag.com/news/watch-out-your-friends-might-be-sharing-your-number-with-chatgpt?test_...
1•walterbell•34m ago•0 comments

A Wave of Unexplained Bot Traffic Is Sweeping the Web

https://www.wired.com/story/made-in-china-niche-websites-are-seeing-a-surge-of-mysterious-traffic...
2•JeanKage•36m ago•0 comments

Show HN: AISeedream5 – a simple web UI for Seedream 5.0 image

https://aiseedream5.org/
1•xuyanmei•37m ago•0 comments

Show HN: 0211 – Go from zero to eleven in any topic with F1-style gear shifting

1•ekadet•38m ago•0 comments

Pi Coding Agent

https://pi.dev/
2•tin7in•43m ago•0 comments

Who Opened the Door?

https://chaosguru.substack.com/p/who-opened-the-door
2•BerislavLopac•45m ago•1 comments

Experiments with CodeMirror: Building a code review tool

https://aziis98.com/blog/codemirror-review-tool/
1•aziis98•46m ago•0 comments