The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
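
For anyone wondering what a "reasoning path based on A* search" could look like in practice, here is a minimal sketch. It is not the paper's code: the grid maze, the `astar_with_trace` name, and the `create`/`close` step labels are all assumptions made for illustration. It runs A* on a small grid and records each search step, so the final path (the "answer") and the step log (the "intermediate tokens") can be inspected separately.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step cost (0 = free, 1 = wall).

    Returns (path, trace). The trace format is a hypothetical stand-in
    for the kind of step-by-step search log a model could be trained on.
    """
    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent pointers back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    parent[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng, h(nxt)))
    return None, trace  # goal unreachable

# Illustrative maze: a wall forces a detour around the right side.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
print(path)
for step in trace:
    print(step)
```

With a log like this, a model's generated steps could in principle be checked against the real search, independently of whether its final path is valid. That is roughly the comparison the summary above describes.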
