
The Path to Medical Superintelligence

https://microsoft.ai/new/the-path-to-medical-superintelligence/
10•brandonb•7mo ago

Comments

PaulHoule•7mo ago
I was doing a comparative analysis of the acquisition strategies of various "big tech" firms and was a little startled to find I had missed Microsoft's 2022 acquisition of Nuance, largely for its speech recognition systems aimed at the medical sector:

https://news.microsoft.com/source/2022/03/04/microsoft-compl...

gm678•7mo ago
> Microsoft AI Diagnostic Orchestrator (MAI-DxO) correctly diagnoses up to 85% of NEJM case proceedings, a rate more than four times higher than a group of experienced physicians.

> Clinicians in our study worked without access to colleagues, textbooks, or even generative AI, which may feature in their normal clinical practice.

1. As I understand it, it's very common for doctors to fall back on reference material in their practice, especially for the most complex cases. If all access to resources was cut off (as the second quote seems to imply), the comparison seems somewhat unfair.

2. What were the publication dates of the case records? I can't find this information, and it makes a difference if the NEJM case studies were in the LLMs' training data.

miraculixx•7mo ago
Exactly. The study has been set up to produce this exact result. They essentially limited the human doctors to bare essentials, on specialist cases(!), while providing the LLMs with all sorts of help, including discussion among several AIs.

That's like giving one group of students a strict closed-book exam, while another group can take the test as a group exercise and access any material they like, then claiming that closed-book exams lead to worse outcomes.

In a nutshell the study is just slop designed to get attention. The headline result is what they really want people to hear, and that's all the media will be repeating.

miraculixx•7mo ago
As any AI researcher knows, if you have a model that does 4x better than the naive baseline (the humans, in this case), you are likely looking at overfitting, not real-life performance. This study is just slop, and you can tell by the mere fact that they did not submit a paper, but just published a PR article.
LargoLasskhyfv•7mo ago
They didn't? What am I looking at, then?

https://arxiv.org/abs/2506.22405

This appears when you click on 'View Publication' in the article near the end, right before Q&A.

brandonb•7mo ago
In the paper, they say they used the most recent 56 cases (from 2024–2025) as a holdout set. The majority of those cases happened after the o4 training cutoff of May 31, 2024.
miraculixx•7mo ago
Are these 56 cases distinct from all other cases in the data?
FlyingLawnmower•7mo ago
Yes. They are about entirely different patient reports.