The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."

Data Centers Are a 'Gold Rush' for Construction Workers

https://www.wsj.com/business/data-centers-are-a-gold-rush-for-construction-workers-6e3c5ce0
1•JumpCrisscross•1m ago•0 comments

Show HN: Thermodynamic Alignment Forces Gemini Thinking into "Burn Protocol"

https://github.com/CodeIncept1111/Sovereign-Stack
1•CodeIncept1111•4m ago•1 comments

18 Months of Events Fit on Four Floppy Disks

https://docs.eventsourcingdb.io/blog/2025/12/01/18-months-of-events-fit-on-four-floppy-disks/
1•goloroden•4m ago•0 comments

How to run phones while being struck by suicide drones

https://nasa.cx/hn/posts/how-to-run-hundreds-of-phones-while-being-struck-by-suicide-drones/
6•nasaok•4m ago•0 comments

PowerShell's curl can run JavaScript [video]

https://www.youtube.com/watch?v=KJKnEd6_WlI
1•mathiasdpx•5m ago•0 comments

Particle Physicists Detect 'Magic' at the Large Hadron Collider

https://www.quantamagazine.org/particle-physicists-detect-magic-at-the-large-hadron-collider-2025...
1•tzury•7m ago•0 comments

White House gives Maduro ultimatum as U.S. moves toward land operations

https://www.miamiherald.com/news/nation-world/world/americas/venezuela/article313261442.html
2•clanky•8m ago•0 comments

Switzerland votes decisively against inheritance tax

https://www.economist.com/europe/2025/11/30/switzerland-votes-decisively-against-inheritance-tax
1•vinni2•10m ago•0 comments

Show HN: Generate Storyboards with Nano Banana from the CLI

https://github.com/kierangilliam/storyboard
2•kierangill•10m ago•0 comments

Who Will Observe the Observability? eBPF Performance at Scale

https://blog.zmalik.dev/p/who-will-observe-the-observability
1•tanelpoder•10m ago•0 comments

Why Staff+ Hiring Is a Different Game

https://medium.com/@yves.greijn_19041/fcb10ed6e880
1•hunglee2•11m ago•0 comments

Volkswagen can now build EVs in China, claiming it can cut costs by up to 50%

https://electrek.co/2025/11/25/volkswagen-build-evs-china-cut-costs-by-50/
2•ilamont•12m ago•1 comments

Plans for MySQL Vector Support and a MySQL Binlog Server

https://www.percona.com/blog/building-the-future-of-mysql-announcing-plans-for-mysql-vector-suppo...
1•tanelpoder•12m ago•0 comments

Show HN: GoodQuestions – a tiny site of genuinely good, human-curated questions

https://goodquestions.qzz.io/
1•juliakzl_•13m ago•0 comments

Lightyear.fm – radio waves far from Earth

https://lightyear.fm/
1•memalign•14m ago•1 comments

Waze but Built for Tesla

https://old.reddit.com/r/TeslaLounge/comments/1p9x9zk/i_created_a_better_inbrowser_tesla_waze_map...
1•ryanvogel•14m ago•0 comments

Norway's $2T Wealth Fund Has Become an Election Football

https://www.bloomberg.com/news/articles/2025-09-04/norway-election-trump-ally-takes-on-world-s-bi...
1•alephnerd•26m ago•0 comments

Building the Perfect Linux PC with Linus Torvalds

https://youtu.be/mfv0V1SxbNA?si=ASyHL7YiMtdOCVen
7•tiernano•28m ago•0 comments

Hacking on the ReMarkable 2

https://sgt.hootr.club/blog/hacking-on-the-remarkable-2/
2•todsacerdoti•38m ago•0 comments

By my count, Linux has 11% of the desktop market. Here's how I got that number

https://www.zdnet.com/article/why-people-keep-flocking-to-linux-in-2025-and-its-not-just-to-escap...
13•breve•40m ago•1 comments

Subversion beats Perforce in handling large files, and it's not even close

https://www.liamfoot.com/subversion-beats-perforce-in-handling-large-files-and-its-not-even-close
2•prmph•43m ago•1 comments

Kv.js: Advanced in-memory caching for JavaScript

https://www.npmjs.com/package/@heyputer/kv.js
1•ent101•46m ago•0 comments

Reverse Engineering the Next.js Job Interview Malware (Hidden in Next.config.js)

https://dzentota.medium.com/reverse-engineering-the-next-js-job-interview-malware-targeting-lastp...
2•dzentota•46m ago•1 comments

Oxylipins from Soybean Oil Driving Obesity

https://www.jlr.org/article/S0022-2275(25)00195-6/fulltext
1•Noaidi•47m ago•0 comments

Dangerous Streets: Using ML to Prioritize Cyclist Safety

https://joshfonseca.com/blogs/dangerous-streets
2•m-hodges•47m ago•0 comments

$1000 bounty to add a feature to coolify

https://github.com/coollabsio/coolify/issues/7423
3•jimmydin7•48m ago•0 comments

Golden Dome (orbital weapon system)

https://en.wikipedia.org/wiki/Golden_Dome_(missile_defense_system)
2•exomonk•51m ago•0 comments

GhidrAssist and GhidrAssistMCP LLM plugins reached v1.0

2•jtang613•51m ago•0 comments

Training Foundation Models on a Full-Stack AMD Platform

https://arxiv.org/abs/2511.17127
1•ngaut•52m ago•0 comments

Can bigger-is-better 'scaling laws' keep AI improving forever?

https://theconversation.com/can-bigger-is-better-scaling-laws-keep-ai-improving-forever-history-s...
6•devonnull•53m ago•0 comments