The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
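To make the "check the answer and the steps separately" idea concrete, here is a rough sketch of what an A* trace and a final answer could look like. This is not the paper's code: the grid maze, the function names (`astar_with_trace`, `is_valid_path`) and the trace format (the list of cells in the order A* expands them) are all assumptions made for illustration.

```python
import heapq

# Hypothetical toy maze: '.' is free, '#' is a wall.
GRID = [
    "....",
    ".##.",
    "....",
]


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step cost.

    Returns (path, trace): path is the final answer (list of cells from
    start to goal, or None), trace is every cell in expansion order.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    cost = {start: 0}
    closed = set()
    trace = []
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(cell)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            if nxt not in cost or g + 1 < cost[nxt]:
                cost[nxt] = g + 1
                parent[nxt] = cell
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None, trace


def is_valid_path(grid, path, start, goal):
    """Check the final answer only; the trace plays no part in this check."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for r, c in path:
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] == "#":
            return False
    # Consecutive cells must be exactly one step apart.
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))
```

Calling `astar_with_trace(GRID, (0, 0), (2, 3))` gives back both a path and a trace. Since `is_valid_path` never looks at the trace, an answer can pass that check even when the steps shown alongside it are wrong or random, which is the gap the summary describes.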
