The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
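For anyone who wants to see what "a trace plus a final answer from A* search" could look like, here is a minimal sketch. It is not the paper's code: the grid, the function names (`astar_with_trace`, `plan_is_valid`) and the trace format are all made up for illustration. The only point is that the final plan can be checked on its own, separately from the list of intermediate steps the search produced.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid; returns (path, trace).

    grid is a list of strings where '#' marks a wall (hypothetical format).
    trace lists nodes in the order they are expanded, a stand-in for the
    "intermediate tokens" discussed above.
    """
    def h(p):
        # Manhattan distance: admissible on a 4-connected unit-cost grid.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]
    parent = {start: None}
    cost = {start: 0}
    trace = []
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > cost[node]:
            continue  # stale queue entry, a cheaper route was found later
        trace.append(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#":
                ng = g + 1
                if ng < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = ng
                    parent[(nr, nc)] = node
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None, trace


def plan_is_valid(grid, path, start, goal):
    """Check the final answer without looking at the trace at all."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


# Toy example: a wall in column 2 forces a detour through the bottom row.
GRID = ["..#.",
        "..#.",
        "...."]
START, GOAL = (0, 0), (0, 3)
path, trace = astar_with_trace(GRID, START, GOAL)
print("plan valid:", plan_is_valid(GRID, path, START, GOAL))
print("plan length:", len(path), "expanded nodes:", len(trace))
```

Because `plan_is_valid` never reads `trace`, a correct plan and a correct trace are two separate questions, which is the distinction the summary above is drawing.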
