The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
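
For anyone unfamiliar with the setup described in the summary, here is a minimal sketch of what "a reasoning trace from A* search plus a final answer" could look like, and how the answer can be checked without looking at the trace. The grid world, the `create`/`close` operation names, and both function names are assumptions made for illustration; the paper's actual trace format and tasks may differ.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and record every search step.

    grid is a list of equal-length strings where '#' marks a wall.
    Returns (trace, path): trace is a list of (op, cell, g, h) tuples
    and path is a list of cells from start to goal, or None.
    The 'create'/'close' op names are hypothetical.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent pointers back to the start to build the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if not in_bounds or grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return trace, None


def is_valid_plan(grid, start, goal, path):
    """Check the final answer alone; the trace is never consulted."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        in_bounds = 0 <= r2 < len(grid) and 0 <= c2 < len(grid[0])
        if not in_bounds or grid[r2][c2] == "#":
            return False
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return True


# Small hypothetical maze: '#' cells are walls.
grid = ["..#",
        "...",
        "#.."]
trace, path = astar_with_trace(grid, (0, 0), (2, 2))
print(is_valid_plan(grid, (0, 0), (2, 2), path))
```

Because `is_valid_plan` never reads the trace, answer correctness and trace correctness are separate measurements. That separation is what makes it possible to ask whether a model can produce a valid plan while emitting a trace that does not correspond to any real A* run.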

China Targeted Downing Street Phones for Years

https://www.theregister.com/2026/01/27/chinalinked_hackers_accused_of_yearslong/
1•qwertyuiop_•1m ago•0 comments

Show HN: Boston, the Videogame

https://bostonvideogame.com/
1•piratebroadcast•1m ago•0 comments

Show HN: Pegasus3301 – a Cicada 3301–inspired online puzzle game

https://pegasus3301.com/
1•Perseus_•2m ago•0 comments

Loops

https://loops.video/about
1•doener•3m ago•0 comments

We Cut Token Usage by 83% and Still Hit 90%+ Retrieval Precision

https://www.byterover.dev/blog/maintain-90-retrieval-precision-with-83-token-reduction-file-based...
3•Arindam1729•6m ago•0 comments

LibrePCB 2.0 with new slint.rs based UI

https://librepcb.org/blog/2026-01-28_release_2.0.0/
1•rnestler•7m ago•0 comments

Claude Code Tips

https://agenticcoding.substack.com/p/32-claude-code-tips-from-basics-to
1•ykev•7m ago•0 comments

America: Re-Becoming Exceptional Again

https://gist.github.com/avkcode/c819ae510da669da1d933556fc96ada7
1•KyleVlaros•8m ago•0 comments

Pillbugs Are Getting Top Dollar Online. Poachers Have Noticed

https://www.nytimes.com/2026/01/28/climate/isopods-pillbugs-for-sale-online.html
2•leephillips•10m ago•0 comments

Stop Using Pseudo-Types

https://f2r.github.io/en/stop-using-pseudo-types.html
1•speckx•11m ago•0 comments

Ollama RLM Influenced App

https://jimliddle.github.io/Ollama-RLM-Analyzer/
1•JAL_UK•11m ago•1 comment

Russia's 1.2M casualties in Ukraine dwarf all its conflicts since WW2

https://www.cnn.com/2026/01/28/europe/russia-ukraine-casualties-csis-report-intl-hnk-ml
2•RickJWagner•14m ago•1 comment

Tell your audience what your blog posts are about as early as possible

https://www.marginalia.nu/log/a_129_finding_audience/
2•marginalia_nu•16m ago•0 comments

Software is forking into content and utility. Nothing in between survives

https://moldandyeast.substack.com/p/memes-and-machines
1•rmrmrm•17m ago•0 comments

People Hate Data Centers, So the Industry Is Spending Millions to Rebrand Them

https://www.motherjones.com/politics/2026/01/data-centers-public-opposition-industry-advertising-...
1•cdrnsf•18m ago•0 comments

Online Gambling Paradox: Cryptographic Verification and Behavioral Harm

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6065213
1•7777777phil•19m ago•0 comments

AI agent skills are scattered everywhere, so I indexed 10k

https://ai-skills.io/
1•edvaldodfreitas•20m ago•0 comments

The Refragmentation

https://paulgraham.com/re.html
1•Anon84•27m ago•0 comments

Amazon confirms 16,000 job cuts after accidental email

https://www.bbc.co.uk/news/articles/cx2ywzxlxnlo
8•c-oreills•28m ago•0 comments

Show HN: Prism.Tools – Now 100% Offline capable

1•BLGardner•28m ago•0 comments

The Productivity Ceiling of AI Coding Tools

https://pushtoprod.substack.com/p/stop-babysitting-your-ai-coding-agents
1•sciurus•29m ago•0 comments

My Ridiculously Robust Photo Management System (Immich Edition)

https://jaisenmathai.com/articles/my-ridiculously-robust-photo-management-system-immich-edition/
1•jmathai•29m ago•0 comments

What a Password Spray Attack Can Teach You About CIAM Integration Needs (2025)

https://ciamweekly.substack.com/p/what-a-password-spray-attack-can
1•mooreds•30m ago•0 comments

Dozens of CDC vaccination databases have been frozen under RFK Jr.

https://arstechnica.com/health/2026/01/rfk-jr-lets-cdc-vaccination-data-rot-dozens-of-databases-f...
3•oldnetguy•30m ago•0 comments

SkyPilot at Shopify: Multi-cloud GPUs without the pain (2026)

https://shopify.engineering/skypilot
1•tosh•31m ago•0 comments

The Seduction (and Folly) of Rollups, Points, and (Most) Time Tracking

https://cutlefish.substack.com/p/tbm-403-the-seduction-and-folly-of
1•mooreds•31m ago•0 comments

ImageFK

https://imagefk.com
1•zhouhua•31m ago•0 comments

AmiGUS sound card on my Amiga 3000

https://www.epsilonsworld.com/2026/01/amigus-sound-card-on-my-amiga-3000.html
1•doener•32m ago•0 comments

Ask HN: Hardware Setup for a Beginner Tinkerer

1•aneeqdhk•33m ago•0 comments

Show HN: FLUX.2 Klein – Sub-Second AI Image Generation – 4B and 9B Models

https://flux2klein.co
1•evon0231•33m ago•0 comments