Large Language Model Reasoning Failures

https://arxiv.org/abs/2602.06176

5•T-A•2h ago

Comments

chrisjj•1h ago

The only reasoning failures here are in the minds of humans gulled into expecting chatbot reasoning ability.

sergiomattei•9m ago

Papers like these are a much-needed bucket of ice water. We anthropomorphize these systems too much.

Skimming through the conclusions and results, the authors find that LLMs exhibit failures across many axes we'd consider demonstrative of AGI: moral reasoning, simple things like counting that a toddler can do, and so on. They're just not human, and you can reasonably hypothesize that most of these failures stem from their nature as next-token predictors that happen to usually do what you want.

So, if you've got OpenClaw running and think you've got Jarvis from Iron Man, this is probably a good read to ground yourself.

Note there's a GitHub repo compiling these failures from the authors: https://github.com/Peiyang-Song/Awesome-LLM-Reasoning-Failur...

Armatron

https://armatron.vercel.app/

1•thomasfromcdnjs•5m ago•0 comments

Sanders warns US has no clue about speed and scale of coming AI revolution

https://www.theguardian.com/us-news/2026/feb/21/ai-revolution-bernie-sanders-warning

1•thunderbong•5m ago•0 comments

PostHog's 404 Page

https://posthog.com/skdjghdjkfhgkjhdfg

1•howToTestFE•6m ago•1 comment

Perplexity Pro promo subscription suspended without explanation?

1•aanno•9m ago•0 comments

Indian food delivery giant leaks location metadata, food preferences to strangers

https://medium.com/@jatin.b.rx3/how-a-zomato-feature-enables-stalking-which-they-call-working-as-...

1•jatin-dot-py•10m ago•0 comments

Building a language that people want

https://blog.merigoux.fr/en/2026/02/19/building-proper-pl.html

1•art-w•10m ago•0 comments

Palantir Captured the UK Ministry of Defence

https://www.ft.com/content/5207928a-13e8-4832-8c6f-2e78740c16c9

2•macleginn•12m ago•0 comments

Code has always been the easy part

https://laughingmeme.org/2026/02/09/code-has-always-been-the-easy-part.html

2•Ozzie_osman•13m ago•0 comments

What Happened to Software Is Happening to Finance and Accounting

https://doempke.com/what-happened-to-software-is-happening-to-finance-and-accounting/

2•robk•14m ago•0 comments

Rare Blood Clots After Certain Covid Vaccines

https://www.mcgill.ca/oss/article/covid-19-medical-health-and-nutrition-technology/rare-blood-clo...

3•cyrc•16m ago•0 comments

Show HN: Shellspec – DSL to Test CLIs

https://github.com/itsfarseen/shellspec

1•itsfarseen-1•17m ago•0 comments

Ask HN: Best file format for AI reports output?

1•azkalam•28m ago•0 comments

A distributed queue in a single JSON file on object storage

https://turbopuffer.com/blog/object-storage-queue

3•Sirupsen•28m ago•0 comments

Show HN: Go Implementation of Systemd Time

https://gitlab.com/allddd/go-systemd-time

1•allddd•29m ago•0 comments

Joseph Weizenbaum's Hackerkritik

https://sdf.org/~pkal/src+etc/hacker-kritik.html

1•pkal•31m ago•0 comments

Relooted, a game where you take back stolen African artifacts from museums

https://www.theguardian.com/games/2026/feb/21/south-african-video-game-artefacts-western-museums

1•atombender•37m ago•0 comments

'Psychological torture': Spanish tenants fight back against housing 'harassment'

https://www.theguardian.com/world/2026/feb/21/spanish-tenants-fight-back-against-housing-harassme...

1•Geekette•37m ago•0 comments

Launch of Dozy

https://www.dozy.site/

1•david-kelen•38m ago•1 comment

Moltbook-CLI – crates.io: Rust Package Registry

https://crates.io/crates/moltbook-cli

1•kelexine•46m ago•0 comments

Show HN: RealDeed PropPass – Own Indian Real Estate Digitally from INR 10k

https://realdeed.in/

1•oxfpr555•47m ago•0 comments

Google Is Exploring Ways to Use Its Financial Might to Take on Nvidia

https://www.wsj.com/tech/ai/google-is-exploring-ways-to-use-its-financial-might-to-take-on-nvidia...

3•JumpCrisscross•48m ago•0 comments

Lindenmayer Systems

https://justinpombrio.net/2026/02/16/l-systems.html

1•birdculture•52m ago•0 comments

Baby chick study challenges a theory of how humans evolved language

https://www.scientificamerican.com/article/baby-chicks-pass-the-bouba-kiki-test-challenging-a-the...

1•atombender•55m ago•0 comments

Search and analyze documents from the DOJ Epstein Files release with local LLM

https://github.com/artmedlar/epstein-files-analyzer

2•edward•55m ago•0 comments

Show HN: Sketchdown – Wireframes as code, like Mermaid but for UI mockups

https://www.sketchdown.dev/

1•alexxozo2•56m ago•1 comment

Searching for Birds

https://SearchingForBirds.VisualCinnamon.com/

1•the-mitr•58m ago•0 comments

Opening Day at the Maternal Center of Excellence

https://pihsierraleone.org/news/opening-day-maternal-center-excellence

1•boriskourt•58m ago•0 comments

Show HN: Tastefinder – swipe-based movie and TV recommendations

https://tastefinder.io/

1•tastefinder_io•1h ago•0 comments

Andrej Karpathy talks about "Claws"

https://simonwillison.net/2026/Feb/21/claws/

56•helloplanets•1h ago•42 comments

Building Modern Databases with the FDAP Stack

https://gotopia.tech/articles/412/building-modern-databases-with-the-fdap-stack

1•mpweiher•1h ago•0 comments