The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
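To make the setup in the summary concrete, here is a minimal sketch of what "a reasoning path plus a final answer, both from A* search" could look like, and how each can be checked separately. Everything specific here is an assumption for illustration and not taken from the paper: the 4-connected grid maze, the Manhattan heuristic, the "create"/"close" step labels, and the function names `astar_with_trace`, `is_valid_path`, and `trace_is_consistent`.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is the list of search steps,
    standing in for the "intermediate tokens"; the path is the answer.
    """
    def h(cell):
        # Manhattan distance (assumed heuristic, admissible on this grid).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            ng = g + 1
            if ng < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def is_valid_path(grid, path, start, goal):
    """Check the final answer only: endpoints, free cells, unit moves."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if any(grid[r][c] == 1 for r, c in path):
        return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def trace_is_consistent(trace):
    """Check the steps only: a cell may be closed only after it was created."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True


maze = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path, trace = astar_with_trace(maze, (0, 0), (2, 0))
print(is_valid_path(maze, path, (0, 0), (2, 0)), trace_is_consistent(trace))
```

Because the two checks are independent, a model's output can pass `is_valid_path` while failing `trace_is_consistent`, which is the kind of mismatch between answer and steps that the summary describes.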

The Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

https://arxiv.org/abs/2606.02437
1•Anon84•1m ago•0 comments

A way to exclude sensitive files issue still open for OpenAI Codex

https://github.com/openai/codex/issues/2847
2•pikseladam•4m ago•1 comments

Starbucks Is One of the Largest Banks

https://www.msn.com/en-us/money/companies/starbucks-is-secretly-one-of-the-world-s-largest-banks/...
2•ColinWright•6m ago•0 comments

Guy in his basement creates a drug to treat Alzheimer's disease using AI

https://twitter.com/DouglasYaoDY/status/2070904914050797582
3•binyu•8m ago•0 comments

NASA tests AI medic for astronauts too far from Earth to call a doctor

https://www.theregister.com/ai-and-ml/2026/06/27/nasa-tests-ai-medic-for-astronauts-too-far-from-...
2•LorenDB•8m ago•0 comments

Show HN: Sambee – browser-based file manager for SMB shares and local drives

https://sambee.net
1•helgek•9m ago•0 comments

Warp Point: A curated webring of video game websites

https://www.warppoint.games/
1•mysterydip•9m ago•0 comments

The crowd-funded Porsche 911

https://project996.fun
2•mhavelka77•11m ago•2 comments

Show HN: Parseflow – Extract data from any document. Entirely on your Mac

https://www.parseflow.io/
1•devtanna•17m ago•0 comments

Beyond symbolic algebra with quantum picturalism

https://www.frontiersin.org/journals/cognition/articles/10.3389/fcogn.2026.1790789/full
2•mathgenius•17m ago•0 comments

Denoising Voice Recordings On-Device

https://www.duration.ai/blog/denoising-voice-recordings-on-device
1•sudb•20m ago•0 comments

Ask HN: Are OTA updates for native iOS/Swift apps allowed?

1•jackappdev•22m ago•0 comments

Cypherpunk Library

https://www.cypherpunklibrary.com/collection
2•bookofjoe•23m ago•0 comments

Nearly Three-Quarters of Dutch Responses to EU Tobacco Rules Were AI-Generated

https://pointer.kro-ncrv.nl/meerderheid-nederlandse-inspraak-op-strengere-eu-tabakswet-afkomstig-...
2•stefanvdw1•27m ago•0 comments

Ask HN: If someone invested $100k now for your startup, how would you spend it?

4•aurenvale•32m ago•0 comments

Tldr.fail – buggy servers break PQ KEX compatibility in TLS

https://tldr.fail/
1•basilikum•34m ago•0 comments

Kids Act Would Require Age Checks to Get Online

https://www.eff.org/deeplinks/2026/06/kids-act-would-require-age-checks-get-online
2•bilsbie•35m ago•0 comments

CORS Explained in Plain English

https://sanyamserver.online/posts/cors/
2•RickJWagner•35m ago•0 comments

More evidence of life on Mars but still no life

https://www.cbc.ca/radio/quirks/more-evidence-of-life-on-mars-but-still-no-life-1.7649645
11•pseudolus•36m ago•2 comments

From Prompts to Loops: Building Autonomous Coding Agents

https://animeshgaitonde.medium.com/from-prompts-to-loops-building-autonomous-coding-agents-6135bf...
1•animesh371g•38m ago•0 comments

Expect Claude Fable 5 to Be Turned Back on in a Matter of Days, Report Says

https://gizmodo.com/expect-claude-fable-5-to-be-turned-back-on-in-a-matter-of-days-report-says-20...
2•HiroProtagonist•38m ago•0 comments

Beyond Functional Programming: The Verse Programming Language (2022) [pdf]

https://simon.peytonjones.org/assets/pdfs/haskell-exchange-22.pdf
1•tosh•38m ago•0 comments

France records around 1k additional deaths amid extreme heat wave

https://apnews.com/article/europe-heat-temperature-records-france-deaths-germany-61f444317600cf1b...
2•geox•38m ago•0 comments

Almavivo – The On-Device Health Platform

https://almavivo.com
2•morog•43m ago•0 comments

Policy Pulse – Issue #21 – Week of June 27, 2026

https://blog.disclose.io/policy-pulse-issue-21-week-of-june-27-2026/
1•jruohonen•45m ago•0 comments

Show HN: Warren – run isolated instances of any CLI tool (no containers,no root)

https://github.com/swadhinbiswas/warren
1•0xER•46m ago•0 comments

Shadcn/UI components that can be used without react

https://basecoatui.com/
2•buckwheatmilk•47m ago•1 comments

Imagine Telling Someone in 1999

https://twitter.com/JesseTinsley/status/2070306180543500530
1•ksec•47m ago•0 comments

Show HN: Genius AI Detector

https://geniusaidetector.com/
2•Rudism•48m ago•0 comments

GhostGrid drift detection and edge tamper evidence via Ed25519

https://ghostgrid.dannygc.cloud/
1•aisoverighn•49m ago•0 comments