
The Illusion of Thinking: A Reality Check on AI Reasoning

https://leotsem.com/blog/the-illusion-of-thinking/
21•leotsem•7mo ago

Comments

leotsem•7mo ago
Apple’s recent paper on the limits of AI reasoning is an uncomfortable but important read.

Instead of relying on standard benchmarks, the authors designed controlled environments—like Tower of Hanoi and River Crossing puzzles—to test how models handle increasing compositional complexity. The results: performance doesn’t taper off, it collapses. And even when the models fail, they continue to produce fluent, structured reasoning traces that sound convincing but fall apart logically.

If you’re building on top of LLMs or reasoning-augmented models, it’s well worth a look.
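The appeal of puzzle environments like Tower of Hanoi is that correctness is mechanically checkable, so there is no benchmark contamination to argue about. As a rough illustration (my own sketch, not the paper's actual evaluation harness), a checker for a proposed Hanoi move sequence is only a few lines:

```python
# Sketch of a mechanical validity check for a Tower of Hanoi solution.
# This is illustrative only; the paper's actual harness is not shown.
def is_valid_solution(n, moves, src="A", dst="C", aux="B"):
    """Return True if `moves` (a list of (from, to) peg pairs) legally
    transfers n disks from src to dst."""
    pegs = {src: list(range(n, 0, -1)), aux: [], dst: []}
    for frm, to in moves:
        if not pegs[frm]:
            return False                       # moving from an empty peg
        if pegs[to] and pegs[to][-1] < pegs[frm][-1]:
            return False                       # larger disk onto smaller
        pegs[to].append(pegs[frm].pop())
    return pegs[dst] == list(range(n, 0, -1))  # all disks on the target peg

# The optimal 7-move solution for 3 disks passes; an illegal sequence fails.
optimal_3 = [("A", "C"), ("A", "B"), ("C", "B"), ("A", "C"),
             ("B", "A"), ("B", "C"), ("A", "C")]
print(is_valid_solution(3, optimal_3))            # True
print(is_valid_solution(3, [("A", "C")] * 2))     # False
```

Because verification is this cheap, the puzzle's disk count can be scaled up to measure exactly where output validity collapses.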

salviati•7mo ago
If you ask me to solve increasingly difficult Tower of Hanoi problems, I don't expect to be good at it. Neither would I expect a fellow human to be. So, on that basis, should we question our own intelligence?

I heard about that paper through an "AI explained" video [0], so I might be biased, but I agree with that video that the Apple paper is "meh" at best: it points out LLM limitations that are hardly a surprise.

[0] https://www.youtube.com/watch?v=wPBD6wTap7g

vincnetas•7mo ago
Probably the difference between you and AI is that you would acknowledge that it's too difficult for you, rather than bullshit your way through.
saithound•7mo ago
That's _exactly_ what the LLM did: the article's authors decided to count that as a failure.
vincnetas•7mo ago
Hm, I was only reading TFA, not the research paper. But TFA mentions this:

  Perhaps the most unsettling finding is what failure looks like. Even when models are completely wrong, they sound persuasive. The reasoning is fluent, the explanations are structured, and the conclusions are confidently delivered. But the logic doesn’t hold.
rcarmo•7mo ago
That sounds a lot like a salesperson. And yes, there is a human tendency to twist reasoning to make the written word look polished, and I don’t think LLM training has fixed that bias.
ForHackernews•7mo ago
Curious about the use of the word "uncomfortable" -- for people working on AI who thought that LLM or L"R"Ms were a path to AGI?

To me, that paper was reassuring that I wasn't taking crazy pills. I've worked with these tools to produce code, and they routinely make mistakes that no thinking entity (yes, I've worked with some dimwitted junior devs) ever would. Yes, they are powerful and useful tools, but they're not "thinking" in any meaningful sense (defined here as rigorously determining an algorithm and applying it correctly).

archon1410•7mo ago
The blog itself reads as if it was written by an LLM. (e.g. "This isn't about X, it's about Y." "... is timely ..." "X isn't Y".)

Weird.

And it has been discussed to death already:

Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems) [https://www.lesswrong.com/posts/5uw26uDdFbFQgKzih/beware-gen...]

Seven replies to the viral Apple reasoning paper and why they fall short [https://news.ycombinator.com/item?id=44278403]

antirez•7mo ago
The chain of thought is not where a model's reasoning capabilities live: models have reasoning capabilities that are part of next-token inference. What CoT does is search/sample the model's space of representations and notions in order to "ground" the final reply, putting into the context window, explicitly, all the related knowledge and ideas the model possesses about the question.

It is absolutely obvious that algorithmic problems like the Tower of Hanoi can't benefit from sampling. Algorithmic puzzles are also a convenient domain for the paper's authors because they are mechanically verifiable, but they are very far from what we want models to do, and from what models are good at. Models would solve such problems by implementing the algorithm in Python and calling a tool to execute it; that is how they can most easily solve them.

Moreover, in most benchmarks CoT improves LLM performance a lot, because sampling helps immensely in producing a better reply. So this paper's negative result runs against a very broad body of experience of CoT being a powerful tool for LLMs, simply because most benchmarks operate in domains where sampling is very useful.

In short, the Apple paper mostly says things that were already obvious: it is as if they set out to reach a negative result. It was already a widespread view that CoT can't perform algorithmic work by concatenating tokens, except in the most trivial cases. Yet it helps a lot when existing knowledge and ideas (inside the model) need to be combined to produce a better reply.
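On the tool-use point: the program a model would need to emit is tiny. A sketch of the standard recursive solution, as hypothetical tool code (not taken from the paper):

```python
# Standard recursive Tower of Hanoi solver -- the kind of short program
# a model could emit and run via a code-execution tool, rather than
# enumerating every move token-by-token in its chain of thought.
def hanoi(n, src="A", dst="C", aux="B", moves=None):
    """Return the list of (from, to) moves transferring n disks src -> dst."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, src, aux, dst, moves)   # move n-1 disks out of the way
    moves.append((src, dst))             # move the largest disk
    hanoi(n - 1, aux, dst, src, moves)   # stack n-1 disks back on top
    return moves

# The algorithm is trivial; only the output grows exponentially (2^n - 1 moves).
print(len(hanoi(3)))   # 7
print(len(hanoi(10)))  # 1023
```

The exponential solution length is exactly what makes spelling out every move in a context window painful, while executing the same algorithm as code stays cheap.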

pyman•7mo ago
What they're saying is that pattern-matching isn't the path to AGI. Humans and AI can both solve the Tower of Hanoi, but once the number of disks goes up, we both struggle.

Apple's point is that if we want to build something smarter than us, we need to look at intelligence and reasoning from a different angle.

rcarmo•7mo ago
Exploring how to consistently arrive at a negative result is still a valid research goal. I don't think we've had enough of that kind of research regarding LLMs; everything is so positive that it defies basic statistics…
jsnell•7mo ago
This paper, rebuttals, and rebuttals to rebuttals have been on HN repeatedly over the last couple of weeks (including literally now). At this point a summary of the original paper doesn't seem like it's adding much.

E.g.

https://news.ycombinator.com/item?id=44203562

https://news.ycombinator.com/item?id=44221900

https://news.ycombinator.com/item?id=44234626

https://news.ycombinator.com/item?id=44278403

https://news.ycombinator.com/item?id=44286086

crowie•7mo ago
This might be a dumb question, and will inevitably showcase my ignorance in this field, but I'll risk it: why can't an AI, at a certain level, execute algorithms whose solutions have been known to work for a very long time? The solution to the Tower of Hanoi problem is known, and it doesn't take much computational power to produce the result. What stops the models examined in the paper from executing such algorithms and collecting the solutions, the way a human programmer would? Do they get sidetracked along the way by the sheer number of tokens?
pyman•7mo ago
If humanity moves to Mars one day and leaves behind all the AI servers running on solar power, then comes back a billion years later, the AI would still be saying the same things. Why? Because no matter how powerful it is, AI doesn't evolve or grow on its own.
crowie•7mo ago
Gotcha, but I didn't mean it that way. Problems like the case-study ones don't need a revolutionary or original answer that would require growth; they can be solved with old solutions, which I'd assume are embedded in some way in these models' training data. Yes, the scope of the problem is bigger, but the correct answer should still come down to a correct implementation of the known algorithm. What I'm asking is what causes the hindrance that prevents these AIs from performing adequately on old problems with old solutions.
ryandvm•7mo ago
I like your thought experiment and I think you're correct, but that's because we never gave it the physical possibility of a feedback loop (a.k.a. evolution).

I think if you added a step where the LLMs tweak their own build process and redeploy, your experiment would have wildly different results.

Yizahi•7mo ago
The so-called "reasoning" of LLM programs is really a sham, and the authors of those programs sometimes expose it themselves. Take, for example, Anthropic's article on Claude's "reasoning". In the math section, they ask the program to add two numbers and then ask it to describe, step by step, how it did so. The LLM generates a human-style procedure, because that's what it copied from the training data, while the actual mechanism by which the LLM adds numbers is vastly different.

Basically, the so-called "reasoning" is just the generation of additional intermediary output that resembles real reasoning without being it.

https://transformer-circuits.pub/2025/attribution-graphs/bio...

rsynnott•7mo ago
> Apple’s new paper, The Illusion of Thinking, quietly released ahead of WWDC 2025, challenges many of the assumptions we’ve come to rely on in the LLM space.

So... wait, were people _really_ assuming that these things were reasoning? Why? Like, because the marketing said so? I had the idea that that was generally viewed as puffery; obviously they're not reasoning.

It's an interesting paper, but its outcome is completely unsurprising. What would have been surprising is if it had shown something different.

> Perhaps the most unsettling finding is what failure looks like. Even when models are completely wrong, they sound persuasive.

Again... This has been a fairly well-known problem with LLMs since GPT-3 or so. I'm not sure why anyone would find it unsettling at this point; they're confident-sounding bullshit engines.