The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
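
For readers unfamiliar with the setup described in the summary, here is a minimal sketch of what "reasoning steps based on A* search" could look like: a solver that emits both a final plan and a step-by-step trace, so the two can be checked independently. The grid encoding, the `astar_with_trace` name, and the `("create" | "close", cell, g, h)` trace format are assumptions made for illustration; the thread does not specify the paper's actual trace format or task.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (plan, trace).

    grid  : list of equal-length strings; '#' marks a wall (assumed encoding).
    trace : list of (op, cell, g, h) tuples, op is "create" or "close".
            This trace format is hypothetical, chosen only for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk parent pointers back to the start to recover the plan.
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable


if __name__ == "__main__":
    plan, trace = astar_with_trace(["...", ".#.", "..."], (0, 0), (2, 2))
    print("plan:", plan)
    for step in trace:
        print(step)
```

In a setup like this, the plan is the "final answer" and the trace is the "intermediate tokens". Because both come from a known algorithm, a model's output can be scored separately on whether its plan is valid and whether its trace is a legal A* execution, which is the kind of comparison the summary describes.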
