The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps. Surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
