
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
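To make the setup in the summary concrete, here is a rough sketch of what "training on A* traces" versus "training on unrelated traces" could look like as data. It is not the paper's code. The grid mazes, the "create"/"close" event names, and the helpers `astar_with_trace`, `is_valid_path`, and `make_example` are all assumptions made for illustration. Swapping in the trace of a different maze is just one simple way to get reasoning steps that have nothing to do with the problem being solved.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and log what the search did.

    grid is a list of strings where '#' marks a wall. Returns (path, trace);
    trace is a list of ("create" | "close", cell) events in search order.
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#" or nxt in closed:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt))
    return None, trace


def is_valid_path(grid, path, start, goal):
    """Check the final answer on its own, ignoring the trace entirely."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def make_example(problem, trace, answer):
    # Hypothetical training record: problem, intermediate tokens, final answer.
    return {"prompt": problem, "trace": trace, "answer": answer}


maze_a = ["..#",
          "..#",
          "..."]
maze_b = ["...",
          ".#.",
          "..."]
start, goal = (0, 0), (2, 2)

path_a, trace_a = astar_with_trace(maze_a, start, goal)
path_b, trace_b = astar_with_trace(maze_b, start, goal)

# "Correct" example: maze A with its own A* trace and answer.
correct = make_example(maze_a, trace_a, path_a)
# "Swapped" example: maze A and its correct answer, but maze B's trace.
swapped = make_example(maze_a, trace_b, path_a)

print(is_valid_path(maze_a, swapped["answer"], start, goal))
print(swapped["trace"] == correct["trace"])
```

The point of the sketch is that the answer can be checked independently of the trace, so a record can carry a valid answer next to a trace that does not describe how that answer was found. That is the kind of mismatch the summary says the models were trained on.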

Redis Patterns for Coding Agents

https://redis.antirez.com/
1•ingve•1m ago•0 comments

Show HN: CANomaly-LSTM – Detecting CAN bus anomalies with deep learning

https://github.com/Yigtwxx/CANomaly-LSTM
1•Yigtwx•1m ago•1 comments

The JWKS Setup for Robust JWT Validation in Asp.net 10

https://www.aaronpina.com/the-ultimate-jwks-setup-for-robust-jwt-validation-in-asp-net-10/
1•aaronpina•1m ago•0 comments

Flightradar24 for Ships

https://atlas.flexport.com/
1•chromy•1m ago•0 comments

Show HN: OneCamp – Self-Hosted Slack/Asana/Zoom/Notion Alternative

1•akashc777•2m ago•0 comments

Biggest day of Claude app downloads in history, by far

https://xcancel.com/SashaKaletsky/status/2027987508500316571
1•doener•3m ago•0 comments

Show HN: Free tool to see what keywords any website ranks for

https://champsignal.com/tools/competitor-keyword-finder
1•maximedupre•4m ago•1 comments

PHP on the Desktop: BosonPHP for Ultra-High Performance Native Applications

https://lionel-peramo.com/posts/php-desktop-native-applications-bosonphp/
1•ulrischa•6m ago•0 comments

OpenAI details layered protections in US defense department pact

https://www.reuters.com/business/media-telecom/openai-details-layered-protections-us-defense-depa...
1•giuliomagnifico•6m ago•0 comments

Welcoming Elizabeth Barron as the New Executive Director of the PHP Foundation

https://thephp.foundation/blog/2026/02/27/welcoming-elizabeth-barron-new-executive-director/
1•ulrischa•6m ago•0 comments

Who Owns Your ATProto Identity? Hint: It's Probably Not You

https://kevinak.se/blog/who-actually-owns-your-atproto-identity-hint-its-probably-not-you
1•kevinak•7m ago•0 comments

Why does C have the best file API?

https://maurycyz.com/misc/c_files/
1•ulrischa•7m ago•0 comments

Making Claude Beep: A Dive into Hooks with Claude Code

https://www.drewhyde.io/blog/claude-code-beep-hooks
1•Andrewryanhyde•11m ago•0 comments

The Cathode Ray Tube site

https://www.crtsite.com/didactic-crt.html
1•joebig•11m ago•0 comments

Giving Claude a Parent: Multi-Model Code Review via MCP

https://www.drewhyde.io/blog/codex-mcp-claude-code
1•Andrewryanhyde•11m ago•0 comments

Show HN: ParseHive – AI-powered invoice data extraction for Windows and Mac

https://parsehive.app
1•misha_dev•12m ago•0 comments

Show HN: RAG-Enterprise – 100% local RAG system for enterprise documents

https://github.com/I3K-IT/RAG-Enterprise
1•primoco•13m ago•1 comments

Wordle's new number game rival

https://the67numbergame.github.io/
1•_snory•18m ago•1 comments

ChatGPT Recommends Claude

https://xcancel.com/deedydas/status/2028030521973125617?s=20
1•doener•20m ago•0 comments

Emacs is shell root but no schwag?

https://shop.fsf.org/
1•krry•20m ago•1 comments

Google Killed the Rent-a-Domain Era

https://growtika.com/blog/publisher-affiliate-collapse
1•Growtika•21m ago•1 comments

Show HN: Nummi – AI companion with memory and daily guidance

https://www.nummi.ai/download
1•ab-abg•25m ago•1 comments

Show HN: Practicing Interview with AI

https://sungatae.com/posts/interviewshark/
1•visujosh•25m ago•0 comments

Give AI agents a real browser, watch them live via WebRTC

https://github.com/lowjax-com/vscreen
1•lowjax•25m ago•1 comments

Brain's "RAM" and "Hard Drive"

1•0ut0flin3•25m ago•1 comments

Show HN: Aide – Opinionated, deterministic code editing for AI agents

https://github.com/avataristvan/a-i-d-e
1•avataristvan•27m ago•0 comments

4,500 Physicians Agree (About Bacon)

https://machielreyneke.com/blog/persuasion/
1•machielrey•29m ago•0 comments

Antarctica just saw the fastest glacier collapse ever recorded

https://www.sciencedaily.com/releases/2026/02/260226042454.htm
2•yusufaytas•32m ago•0 comments

Ws – Keep Claude Code's context visible in your terminal

https://github.com/n-filatov/ws
1•notwhalee•33m ago•1 comments

Show HN: ZcoreAI – Z-score regression channel screener

https://www.zcoreai.com/
1•tchantchov•35m ago•0 comments