The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
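
For anyone who wants to picture the setup in that summary, here is a minimal sketch of an A* solver that emits both a final answer (a path) and a step-by-step trace, so the two can be checked separately. The grid-maze domain, the `("create" | "close", cell, g, h)` trace format, and the names `astar_with_trace` and `path_is_valid` are assumptions made for illustration; they are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid ('#' marks a wall).

    Returns (path, trace). The trace is a list of (op, cell, g, h)
    tuples in a hypothetical format: "create" when a cell is pushed
    onto the open list, "close" when it is expanded.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def path_is_valid(grid, path, start, goal):
    """Check the final answer on its own, ignoring the trace entirely."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    # Every step must move exactly one cell horizontally or vertically.
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))
```

Because `path_is_valid` never looks at the trace, a solver could in principle return a valid path alongside a trace that does not correspond to any real A* run, which is the kind of mismatch the summary describes.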
