The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
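For anyone wondering what a "reasoning path based on A* search" could look like as training data, here is a minimal sketch. It runs A* on a small grid and logs every node it creates or closes, so both the final path and the intermediate steps can be checked afterwards. The grid encoding, the `astar_with_trace` name, and the `("create" | "close", cell, g, h)` trace format are my own assumptions for illustration, not the paper's actual format.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and record a step-by-step trace.

    grid is a list of equal-length strings where '#' marks a wall.
    Returns (path, trace); path is None if the goal is unreachable.
    The trace format is hypothetical, chosen only for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_g = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    trace = [("create", start, 0, h(start))]
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


grid = ["..#",
        "...",
        "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path)
for step in trace:
    print(step)
```

Because the trace comes from a known algorithm, a model's generated steps can be compared against it separately from the final answer, which as far as I can tell is the kind of check the summary is describing.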

Building a hill-climbing machine: Launching seven new MAI models

https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/
1•taquangkhoi•4m ago•0 comments

Tay AI Chatbot

https://en.wikipedia.org/wiki/Tay_(chatbot)
2•Jimmc414•6m ago•1 comments

Dark Software Factories Are Cool. What Comes After Them Is More Interesting

https://flummadiddle.bearblog.dev/dark-software-factories/
1•_doctor_love•6m ago•0 comments

Animation Vocabulary

https://animations.dev/vocabulary
1•itzlambda•6m ago•0 comments

OpenAI new privacy policy to include info about ads in ChatGPT

1•mmarian•14m ago•1 comments

AI hiring algorithms reject Black, Asian job seekers at higher rates

https://www.theregister.com/ai-ml/2026/05/27/ai-hiring-algorithms-reject-black-asian-job-seekers-...
1•erehweb•14m ago•0 comments

'Dumbass' criminal breaks the 'first rule of ransomware club'

https://www.theregister.com/cyber-crime/2026/06/02/dumbass-criminal-breaks-the-first-rule-of-rans...
2•Cider9986•15m ago•0 comments

GPU Forecasters: Language Models as Selective Surrogates for Kernel Optimization

https://arxiv.org/abs/2605.31464
1•matt_d•15m ago•0 comments

Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation

https://arxiv.org/abs/2606.01629
1•berlianta•18m ago•0 comments

Daily Harvest sued after gallbladders removed after people consumed its product

https://www.cnn.com/2022/07/01/tech/daily-harvest-recall-lawsuits
2•JumpCrisscross•19m ago•0 comments

Can AI Do Intelligence Analysis? Apparently Not

https://blog.predictivedefense.io/p/can-ai-do-intelligence-analysis-apparently
1•beatrobot•26m ago•0 comments

One Equation. Thirty Binaries. Zero Agents

https://github.com/silentnoisehun/Bio-Binaries
1•silentnoisehun•26m ago•1 comments

Database-Centric Architecture

https://en.wikipedia.org/wiki/Database-centric_architecture
1•teleforce•30m ago•0 comments

Trump's Takeover of the American Regulatory Machine

https://www.wsj.com/politics/policy/trump-takeover-regulators-130b57a3
2•KnuthIsGod•32m ago•0 comments

Americans Are Leaving the U.S. in Record Numbers

https://www.wsj.com/podcasts/the-journal/americans-are-leaving-the-us-in-record-numbers/f2ae7db5-...
4•KnuthIsGod•33m ago•1 comments

Ask HN: How do people secure their Linux computer?

2•foo12bar•34m ago•2 comments

Community usage metrics and cost analytics for Claude Code subscriptions

https://meter.vsits.co/
1•sea-gold•35m ago•1 comments

Optimized Point Addition Circuits for Elliptic Curve Discrete Logarithms [pdf]

https://arxiv.org/abs/2606.02235
2•aburan28•36m ago•0 comments

Show HN: 3GPP Spec Manager – A GUI app to track and download 3GPP specifications

https://github.com/chsung/3gpp-spec-manager
2•tughvn•41m ago•1 comments

Implicit.js, a way to program 3D models with mathematical functions

https://www.implicit.sh/
1•softservo•42m ago•1 comments

Does Llms.txt Replace Sitemap.xml

https://docsalot.dev/blog/llms-txt-vs-sitemap-xml
1•fazkan•44m ago•0 comments

Platypus – create native Mac applications from command line scripts

https://github.com/sveinbjornt/Platypus
3•gregsadetsky•44m ago•1 comments

Nvidia to spend $150B a year in Taiwan, 'epicentre' of AI revolution

https://www.reuters.com/world/asia-pacific/nvidia-ceo-says-taiwan-is-epicentre-ai-revolution-2026...
1•JumpCrisscross•44m ago•0 comments

C64 OS – Ready for Internet Action – C64 OS steps it up [video]

https://www.youtube.com/watch?v=9TmJMBHrg7A
1•amichail•44m ago•0 comments

The Effort to Build Ukraine's Ground Robot Arsenal

https://www.twz.com/news-features/inside-the-effort-to-build-ukraines-ground-robot-arsenal
1•JumpCrisscross•48m ago•0 comments

Microsoft's Project Solara is an Android OS designed for agents instead of apps

https://arstechnica.com/gadgets/2026/06/microsofts-project-solara-is-an-android-os-designed-for-a...
1•thunderbong•51m ago•0 comments

A whale of a deal: Paramount's takeover of Warner Bros

https://www.reuters.com/graphics/WARNER-BROS-DIS-MA/PARAMOUNT-SKYDAN/byprngedkpe/
3•giuliomagnifico•51m ago•0 comments

Slow Tools

https://www.quarter--mile.com/Slow-Tools
3•ogundipeore•53m ago•0 comments

Global EV Outlook 2026: Growing sales amid an energy crisis [pdf]

https://iea.blob.core.windows.net/assets/3718cf37-fac6-4ee2-aeb0-1546e6222cfc/GlobalEVOutlook2026...
1•toomuchtodo•55m ago•1 comments

Ransomecare.io a tabletop journey where everything sucks

https://ransomecare.io/value
2•splintersio•56m ago•1 comments