Is chain-of-thought AI reasoning a mirage?

https://www.seangoedecke.com/real-reasoning/
19•ingve•1h ago

Comments

NitpickLawyer•1h ago
Finally! A good take on that paper. I saw that arstechnica article posted everywhere, and most of the comments are full of confirmation bias, and almost all of them miss the fine print: it was tested on a 4-layer-deep toy model. It's nice to read a post that actually digs deeper and offers perspectives on what might be a good finding vs. what just warrants more research.
sempron64•30m ago
Betteridge's Law of Headlines.

https://en.m.wikipedia.org/wiki/Betteridge's_law_of_headline...

robviren•20m ago
I feel it is interesting but not what would be ideal. I really think if the models could be less linear and process over time in latent space, you'd get something much more akin to thought. I've messed around with attaching reservoirs at each layer using hooks, with interesting results (mainly overfitting), but it feels like such a limitation to have all model context/memory stuck as tokens when latent space is where the richer interaction lives. Would love to see more done where thought over time mattered and the model could almost mull over the question a bit before being obligated to crank out tokens. Not an easy problem, but interesting.
dkersten•4m ago
Agree! I’m not an AI engineer or researcher, but it always struck me as odd that we would serialise the 100B or whatever parameters of latent space down to a maximum of 1M tokens and back for every step.
mentalgear•15m ago
> Whether AI reasoning is “real” reasoning or just a mirage can be an interesting question, but it is primarily a philosophical question. It depends on having a clear definition of what “real” reasoning is, exactly.

It's pretty easy: causal reasoning. Causal, not merely statistical correlation, which is all LLMs do, with or without "CoT".

naasking•9m ago
Define causal reasoning?
glial•8m ago
Correct me if I'm wrong, but I'm not sure it's so simple. LLMs are called causal models in the sense that earlier tokens "cause" later tokens; that is, later tokens are causally dependent on what the earlier tokens are.
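
To make the "earlier tokens cause later tokens" sense concrete, here is a minimal sketch of a causal (lower-triangular) attention mask in plain Python. The function names are made up for illustration and no particular library's API is implied.

```python
def causal_mask(n):
    """Return an n x n mask where row i may look at columns 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]


def visible_context(tokens, position):
    """Tokens that the computation at `position` is allowed to depend on."""
    mask = causal_mask(len(tokens))
    return [tok for tok, allowed in zip(tokens, mask[position]) if allowed]


tokens = ["A", "B", "C", "D"]
# Position 2 ("C") can depend on "A", "B", and itself, but never on "D".
print(visible_context(tokens, 2))
```

This is "causal" only in the sense of ordering: it constrains which inputs can influence an output, which is a different notion from an interventional causal model.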

If you mean deterministic rather than probabilistic, even Pearl-style causal models are probabilistic.

I think the author is circling around the idea that their idea of reasoning is to produce statements in a formal system: to have a set of axioms, a set of production rules, and to generate new strings/sentences/theorems using those rules. This approach is how math is formalized. It allows us to extrapolate - make new "theorems" or constructions that weren't in the "training set".
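
As a rough illustration of "axioms plus production rules generate new theorems", here is a sketch using Hofstadter's MIU system. That system is my choice of example, not something the article or the paper uses, and the helper names are hypothetical.

```python
def successors(s):
    """Apply each MIU production rule to string s once, in every possible place."""
    out = set()
    if s.endswith("I"):                      # Rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):                    # Rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):              # Rule 3: III -> U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):              # Rule 4: UU -> (deleted)
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def theorems(axiom="MI", steps=2):
    """Collect every string derivable from the axiom in at most `steps` rule applications."""
    known = {axiom}
    frontier = {axiom}
    for _ in range(steps):
        frontier = {t for s in frontier for t in successors(s)} - known
        known |= frontier
    return known


print(sorted(theorems("MI", 2)))
```

Every string it prints is "new" in the sense that only the axiom was given; the rules do the extrapolating.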

empath75•13m ago
One thing that LLMs have exposed is how much of a house of cards all of our definitions of "human mind"-adjacent concepts are. We have a single example in all of reality of a being that thinks like we do, and so all of our definitions of thinking are inextricably tied with "how humans think", and now we have an entity that does things which seem to be very like how we think, but not _exactly like it_, and a lot of our definitions don't seem to work any more:

Reasoning, thinking, knowing, feeling, understanding, etc.

Or at the very least, our rubrics and heuristics for determining if someone (thing) thinks, feels, knows, etc, no longer work. And in particular, people create tests for those things thinking that they understand what they are testing for, when _most human beings_ would also fail those tests.

I think a _lot_ of really foundational work needs to be done on clearly defining a lot of these terms and putting them on a sounder basis before we can really move forward on saying whether machines can do those things.

gdbsjjdn•3m ago
Congratulations, you've invented philosophy.
naasking•10m ago
> Because reasoning tasks require choosing between several different options. “A B C D [M1] -> B C D E” isn’t reasoning, it’s computation, because it has no mechanism for thinking “oh, I went down the wrong track, let me try something else”. That’s why the most important token in AI reasoning models is “Wait”. In fact, you can control how long a reasoning model thinks by arbitrarily appending “Wait” to the chain-of-thought. Actual reasoning models change direction all the time, but this paper’s toy example is structurally incapable of it.

I think this is the most important critique that undercuts the paper's claims. I'm less convinced by the other point. I think backtracking and/or parallel search is something future papers should definitely look at in smaller models.

The article is definitely also correct about the overreaching, broad philosophical claims that seem common when discussing AI and reasoning.
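
A small sketch of the contrast drawn in the quoted passage, under two assumptions of mine: that the quoted `[M1]` example is a shift-by-one over the alphabet (which is what "A B C D -> B C D E" looks like), and that a reasoning model can be stood in for by a hypothetical `generate(text)` callable. Neither is a real paper or library API.

```python
def shift_letters(tokens, k=1):
    """Deterministic toy transformation: move each uppercase letter k places forward."""
    return [chr((ord(t) - ord("A") + k) % 26 + ord("A")) for t in tokens]


def extend_thinking(generate, prompt, extra_rounds=2):
    """Lengthen a chain of thought by appending "Wait," and asking for more text.

    `generate` is a hypothetical stand-in for a model call: it takes the text
    so far and returns a continuation string.
    """
    trace = generate(prompt)
    for _ in range(extra_rounds):
        trace += " Wait,"
        trace += generate(prompt + trace)
    return trace


# The toy task is a single fixed computation with nothing to reconsider.
print(shift_letters(["A", "B", "C", "D"]))

# With a stub "model", each appended "Wait," forces one more round of output.
print(extend_thinking(lambda text: " step", "Q:", extra_rounds=2))
```

The first function has no point at which it could change direction; the second only has one because the caller keeps reopening the trace.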
