The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
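
For anyone wondering what "reasoning steps based on A* search" could look like in practice, here is a rough sketch. It runs A* on a small grid, records each search step as a trace, and then checks the final path and the trace separately. The grid encoding, the "create"/"close" event format, and every function name are my own assumptions for illustration; this is not the paper's code or data format.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step costs; returns (trace, path).

    grid is a list of strings where '#' marks a wall. The trace is a list of
    (op, cell, g, h) events; this event format is an illustrative assumption.
    """
    def h(cell):
        # Manhattan distance to the goal.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return trace, None


def path_is_valid(grid, path, start, goal):
    """Check the final answer on its own, ignoring the trace."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


def trace_is_consistent(trace):
    """Check the trace on its own: a cell must be created before it is closed."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif cell not in created:
            return False
    return True
```

The point of keeping the two checks separate is that they can disagree: a scrambled trace fails `trace_is_consistent` while the path it came with still passes `path_is_valid`, which is the kind of mismatch the summary describes.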

Micron Is Exiting Its "Crucial" Consumer Business

https://www.servethehome.com/ai-data-center-markets-are-so-big-that-micron-is-exiting-its-crucial...
1•dannyobrien•3m ago•0 comments

Top Journal Retracts Study Predicting Catastrophic Climate Toll

https://www.nytimes.com/2025/12/03/business/economy/study-climate-damage-retracted.html
3•apparent•5m ago•1 comments

The Not So Short Introduction to LaTeX [pdf]

https://tobi.oetiker.ch/lshort/lshort.pdf
1•teleforce•7m ago•0 comments

The LLM Evaluation Guidebook

https://huggingface.co/spaces/OpenEvals/evaluation-guidebook
2•aratahikaru5•11m ago•0 comments

Tor: What we've learned from fighting censorship in Iran and Russia

https://blog.torproject.org/staying-ahead-of-censors-2025/
1•iamnothere•11m ago•0 comments

Ask HN: Which merge tool do you use?

1•axismundi•11m ago•0 comments

Hairstyle try-on landing pages without training your own model

https://www.ailabtools.com/docs/ai-portrait/effects/hairstyle-editor-pro
1•SkrKing•11m ago•0 comments

Drive with "SpongeBob" on Waze

https://blog.google/waze/waze-spongebob/
2•gnabgib•14m ago•0 comments

Alma Telescope engineering logs show spectral normalization is deleting outliers

https://zenodo.org/records/17808349
1•ryanbeem•14m ago•1 comments

Ask HN: Anyone writing code from scratch or mostly doing architecting and LLM?

1•mattfrommars•16m ago•0 comments

Linux 6.19 Goes Ahead and Enables Microsoft C Extensions Support

https://www.phoronix.com/news/Linux-6.19-Enables-MS-Ext
2•mikece•17m ago•0 comments

Lessons learned from the Rust Vision Doc process

https://blog.rust-lang.org/2025/12/03/lessons-learned-from-the-rust-vision-doc-process/
1•mikece•19m ago•0 comments

Framebuffer Modifiers Part 1

https://bwidawsk.net/blog/2021/2/modifiers/
1•jakogut•20m ago•0 comments

President DJT Said He Just Legalized Cheaper, Smaller 'Cute' Kei Cars in America

https://www.theautopian.com/president-trump-said-he-just-legalized-cheaper-smaller-cute-kei-cars-...
2•schmuckonwheels•21m ago•0 comments

From design patterns to category theory (2017)

https://blog.ploeh.dk/2017/10/04/from-design-patterns-to-category-theory/
1•bramadityaw•22m ago•0 comments

Transactions for Ghostty

https://hcb.hackclub.com/ghostty/transactions
1•susam•24m ago•0 comments

Trump Launches Largest Environmental Rollback in U.S. History

https://oilprice.com/Energy/Energy-General/Trump-Launches-Largest-Environmental-Rollback-in-US-Hi...
3•testrun•25m ago•0 comments

STL Therapy v3.1: Absolute Eradication Strain

https://zenodo.org/records/17809368
1•Trinhchuong•27m ago•1 comments

More than 82,000 tires recalled for lengthy identification number, NHTSA says

https://www.ksat.com/news/local/2025/12/02/more-than-82000-tires-recalled-for-lengthy-identificat...
1•jshprentz•28m ago•1 comments

Show HN: Canvas and Agent: Cursor and Canvas makes a baby

https://canvas-agent.vercel.app/
1•lout332•28m ago•0 comments

Abstract Interpretation in the Toy Optimizer

https://bernsteinbear.com/blog/toy-abstract-interpretation/
3•ChadNauseam•32m ago•0 comments

Half of Linux Users Stick with X11, Despite Years of Wayland Being Forced

https://www.youtube.com/watch?v=18A1mrULu4U
1•snvzz•34m ago•0 comments

President DJT Appears to Approve Kei Cars for the USA

https://www.roadandtrack.com/news/a69623655/president-donald-trump-kei-cars-usa/
1•schmuckonwheels•34m ago•0 comments

ESP32-Powered PPG Signal Acquisition: Open-Source Hardware and Software

https://www.mdpi.com/2813-6640/3/4/15
2•PaulHoule•35m ago•0 comments

Show HN: Elevate – A minimal, privacy-first new tab with a collapsible HN drawer

https://elevate-tab.com/
1•shoarek•36m ago•0 comments

Human Input to Computer Systems: Theories, Techniques and Technology (2011 WIP)

https://billbuxton.com/inputManuscript.html
1•leoc•38m ago•0 comments

Apex GPU: Run CUDA Apps on AMD GPUs Without Recompilation

https://github.com/kentstone84/APEX-GPU
11•ArchitectAI•38m ago•7 comments

Show HN: Seedream 4.5 – High-Consistency AI Image Generation for Creators

https://www.seedream4.net/seedream-4-5
1•lu794377•42m ago•0 comments

Why WinQuake exists and how it works

https://fabiensanglard.net/winquake/index.html
3•wicket•43m ago•0 comments

Apple’s head of user interface design, Alan Dye, will join Meta

https://www.cnbc.com/2025/12/03/liquid-glass-alan-dye-leaving-apple.html
24•Noaidi•48m ago•7 comments