The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
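To make the setup in the summary concrete, here is a minimal sketch of how an A* solver can emit both a step-by-step trace and a final answer, so the two can be checked separately. Everything specific here is an assumption for illustration and not taken from the paper: the grid maze, the `("create" | "close", cell, cost, heuristic)` trace format, and the helper names `astar_with_trace`, `plan_is_valid`, and `trace_is_valid`.

```python
import heapq

# Hypothetical 3x3 maze: '.' is free, '#' is a wall.
GRID = ["..#",
        "..#",
        "..."]


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (trace, plan).

    The trace is a list of (op, cell, cost, heuristic) tuples, a made-up
    format standing in for the "reasoning steps" discussed above.
    """
    def h(cell):
        # Manhattan distance, admissible on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return trace, plan[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = new_cost
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
                trace.append(("create", (nr, nc), new_cost, h((nr, nc))))
    return trace, None


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only: a wall-free path of unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return grid[start[0]][start[1]] != "#"


def trace_is_valid(trace):
    """Check the steps only: a cell must be created before it is closed."""
    created = set()
    for op, cell, _cost, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True
```

Calling `astar_with_trace(GRID, (0, 0), (2, 2))` gives a trace and a plan that pass both checks. Reversing the trace and keeping the same plan gives a pair where `plan_is_valid` still passes but `trace_is_valid` fails. That is a toy version of the "right answer, wrong steps" case the summary describes. The trace check here is deliberately simple and is only meant to show that the two checks are independent.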
