The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
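
To make the setup in that summary concrete, here is a minimal sketch of what "checking the answer and the reasoning steps separately" could look like. It is an illustration only, not the paper's code: the grid world, the `create`/`close` trace events, and the function names `astar_with_trace`, `plan_is_valid`, and `trace_is_consistent` are all assumptions made for this example.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a
    hypothetical stand-in for intermediate tokens: a list of
    ("create" | "close", cell, g, h) events.
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            if g + 1 < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = g + 1
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
                trace.append(("create", (nr, nc), g + 1, h((nr, nc))))
    return None, trace


def plan_is_valid(grid, start, goal, path):
    """Check the final answer on its own, ignoring the trace."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    # Every step must move exactly one cell horizontally or vertically.
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))


def trace_is_consistent(trace):
    """Check the trace on its own: a cell must be created before it is closed."""
    created = set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True
```

Because the two checks are independent, a correct plan can be paired with a scrambled trace (for instance `trace[::-1]`) and still pass `plan_is_valid` while failing `trace_is_consistent`, which is the kind of mismatch the summary describes.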
