The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
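To make the setup in that summary a bit more concrete, here is a minimal sketch of how an A* solver can emit both a final plan and a step-by-step trace, with each one checked separately. It is only an illustration: the grid world, the `("create" | "close", cell)` trace format, and the names `astar_with_trace`, `plan_is_valid`, and `trace_is_valid` are my own assumptions, not the paper's actual data format or code.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list
    of ("create" | "close", cell) events (a hypothetical format).
    """
    def h(cell):
        # Manhattan distance: admissible for unit-cost grid moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    cost = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            if nxt not in cost or g + 1 < cost[nxt]:
                cost[nxt] = g + 1
                parent[nxt] = cell
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt))
    return None, trace


def plan_is_valid(grid, start, goal, path):
    """Check the final answer alone: endpoints, no walls, unit steps."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def trace_is_valid(trace):
    """Check the trace alone: a cell is closed once, after being created."""
    created, closed = set(), set()
    for op, cell in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell in created and cell not in closed:
            closed.add(cell)
        else:
            return False
    return True


grid = ["..#", "...", "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(plan_is_valid(grid, (0, 0), (2, 2), path), trace_is_valid(trace))
```

Because the two checks are independent, a correct `path` can be paired with a scrambled `trace` and `plan_is_valid` will still pass while `trace_is_valid` fails. That mismatch between a right answer and an invalid trace is the kind of case the summary describes.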
