Mirage Reasoning: The Illusion of Visual Understanding

https://arxiv.org/abs/2603.21687
3•MrBuddyCasino•1h ago

Comments

MrBuddyCasino•1h ago
tl;dr: AI scores highly on a medical imaging test even though it hasn’t seen the image.

Details: https://x.com/euanashley/status/2037993596956328108

fernly•5m ago
From the Conclusion section of the PDF[1],

"Multimodal AI systems are increasingly deployed on the assumption that their benchmark performance reflects genuine visual understanding. Our results fundamentally challenge these assumptions. Across every model-benchmark pair tested, the accuracy that frontier models achieved without any access to images exceeded the additional accuracy they gained when images were provided. Moreover, a text-only 3-billion-parameter model, trained solely on question-answer pairs stripped of images, outperformed all frontier multimodal systems and human radiologists on a held-out chest radiology benchmark. Taken together, these results demonstrate that high benchmark accuracy does not reliably indicate visual understanding."

Basically, the models are so good at extracting clues from the question text and extrapolating from them that they answer _as if_ they had an image to view. With confidence, of course.

[1] https://arxiv.org/pdf/2603.21687
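
To make the comparison in the quoted conclusion concrete, here is a minimal sketch of the arithmetic as I read it: the accuracy a model reaches with no image versus the extra accuracy it gains once the image is supplied. The function names and the numbers are my own illustration and are not taken from the paper.

```python
def image_gain(acc_without_image: float, acc_with_image: float) -> float:
    """Additional accuracy attributable to actually seeing the image."""
    return acc_with_image - acc_without_image


def text_dominates(acc_without_image: float, acc_with_image: float) -> bool:
    """True when the no-image accuracy exceeds the gain from adding images."""
    return acc_without_image > image_gain(acc_without_image, acc_with_image)


# Hypothetical accuracies for illustration only; NOT figures from the paper.
no_image, with_image = 0.60, 0.75
print(round(image_gain(no_image, with_image), 2))
print(text_dominates(no_image, with_image))
```

With these made-up values the model gets most of its score from the question text alone, which is the pattern the conclusion describes for every model-benchmark pair tested.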

Typing and Keyboards

https://lzon.ca/posts/series/grateful/typing-and-keyboards/
1•jpmitchell•35s ago•0 comments

They bought an SF laundromat for passive income. Then problems

https://www.sfgate.com/local/article/the-laundry-hub-sf-22063097.php
1•paulpauper•1m ago•0 comments

How to Find Your Personal Optimal Diet

https://www.exfatloss.com/p/how-to-find-your-personal-optimal
1•paulpauper•2m ago•0 comments

Show HN: Unlisted – Daily job alerts from 30 low-competition sources

https://unlisted.shelter.money
1•dreamsandcode•2m ago•0 comments

Toyota CEO Warns Top Suppliers: 'Unless Things Change, We Will Not Survive'

https://www.autonews.com/toyota/an-toyota-suppliers-koji-sato-kenta-kon-warning-boost-productivit...
1•ilamont•2m ago•1 comments

Bypassing the DOM to Mathematically Deep-Fry MP4s and Images (Rust/WASM)

https://theglitch.ing/
1•helba-ai•4m ago•0 comments

Why Socializing Loses to Alcohol in Addiction

https://neurosciencenews.com/alcohol-bias-anterior-insula-30223/
1•gnabgib•5m ago•0 comments

Show HN: DeepRepo – AI architecture diagrams from GitHub repos

https://deeprepo.dev
2•uwais12•5m ago•0 comments

Ahead of Its Mega-IPO, SpaceX Reminds Investors Disruption Is Coming

https://www.barrons.com/articles/mega-ipo-elon-muskspacex-disruption-spectrum-abe285a6?mod=goog_f...
1•gmays•9m ago•0 comments

Ask HN: Settings.json is an insane design choice for OpenClaw?

2•tpurves•9m ago•0 comments

Monaspace 1.400 introduces support for Cyrillic, Greek, and Vietnamese

https://github.com/githubnext/monaspace/releases/tag/v1.400
1•harmonics•9m ago•0 comments

UEdu – Student writing mental health signals with 40 psycholinguistic features

https://github.com/harold-wang-dev/uedu
1•ueduvan•12m ago•0 comments

Autoresearch for Integer Factorization

https://github.com/iliazintchenko/agent-factoring
2•chaisan•13m ago•0 comments

ChatGPT, Claude, Gemini, and Grok are all bad at crediting news outlets

https://www.niemanlab.org/2026/03/chatgpt-claude-gemini-and-grok-are-all-bad-at-crediting-news-ou...
1•PretzelFisch•17m ago•0 comments

AI Is Not About to Become Sentient

https://quillette.com/2026/03/27/ai-is-not-about-to-become-sentient-moltbook-openclaw/
5•measurablefunc•18m ago•0 comments

Legacy PC design misery (2009)

https://mjg59.livejournal.com/118098.html
3•birdculture•20m ago•1 comments

Show HN: I made a free list of 100 places to promote your SaaS

https://launchdirectories.com
3•rosennn•20m ago•0 comments

Show HN: AgentLens – Chrome DevTools for AI Agents (open-source, self-hosted)

https://github.com/tranhoangtu-it/agentlens
1•tranhoangtu•20m ago•0 comments

I built a tool to prove you don't need a GPU upgrade

https://best-gpu.com/upgrade.php
1•Nebyl•21m ago•1 comments

Spatial Audio Notifications for Multi Window Claude Code –> Claudio

https://github.com/FlorisFok/Claudio
2•FlorisFok•28m ago•1 comments

The Augmentation of Doug Engelbart

https://www.youtube.com/watch?v=_7ZtISeGyCY
1•larve•30m ago•0 comments

The Mind Layer: Minds, Not Brains

https://metaversus.substack.com/p/level-13-the-mind-layer
1•ryanfoo•30m ago•0 comments

Show HN: Timezone App – Visual meeting scheduler for distributed teams

https://timezoneapp.co/
3•choogi•32m ago•0 comments

100% Interception of Multi-Turn Jailbreaks on GPT-4o-Mini and Gemini

https://zenodo.org/records/19314889
1•mthree•35m ago•0 comments

AI software for smart glasses wins £1M prize for helping people with dementia

https://www.theguardian.com/society/2026/mar/18/ai-smart-glasses-1m-prize-technology-dementia
5•ohjeez•36m ago•0 comments

Show HN: TNTStack – Monorepo template for cross-platform apps (Tauri+Next.js)

https://tntstack.odest.dev/en
1•odest•36m ago•1 comments

Stripe withheld $85,000 from our EU platform – no legal basis given

7•MelkerWendelbo•36m ago•3 comments

Show HN: In-Browser Video Calls

https://just-call.app/
1•ddoronin•37m ago•0 comments

Did the obesity epidemic start with sugar? Or start with vitamins?

https://twitter.com/CraigBrockie/status/2038288653781438909
1•bilsbie•40m ago•0 comments

An open-source tool for designing homes using AI

https://github.com/bayllama/homemaker
2•graphllama•49m ago•0 comments