Medical AI benchmarks are broken – we're building a community-driven alternative

https://medicalsphere.ai/
1•medicalsphere•16h ago

Comments

medicalsphere•16h ago
We built Medical Sphere, a public platform for objective evaluation of AI models in healthcare. You can think of it as LM Arena, but purpose-built for the medical domain, which we believe is large and specific enough to warrant its own dedicated evaluation effort.

Instead of static multiple-choice benchmarks, we let people upload real medical cases (text + images), optionally anonymously, run frontier models (open and closed) side-by-side, and compare where they agree, fail, or hallucinate, all for free.

We also verify medical professionals who want to participate and connect them with research and industry collaboration opportunities.

We’ll be releasing open benchmarks frequently, with PHI/PII carefully redacted; the first releases are already out.

The goal is to enable transparent, real-world benchmarks and a data flywheel for more trustworthy medical AI. Live and free to use. Would love feedback from folks working on AI evaluation, healthcare ML, or safety.

Gemini tops leaderboard on research math problems

https://epoch.ai/frontiermath
1•gsf_emergency_6•7m ago•0 comments

LOL

https://www.digitalocean.com/community/tutorials/how-to-create-a-new-user-and-grant-permissions-i...
1•devpbrilius•8m ago•0 comments

We Built Another Object Storage (and Why It's Different)

https://fractalbits.com/blog/why-we-built-another-object-storage/
5•fractalbits•12m ago•1 comments

How to review AI generated PRs

https://thoughtbot.com/blog/how-to-review-ai-generated-prs
1•ben_s•12m ago•0 comments

Take the web for a fresh spin with GenTabs, built with Gemini 3

https://blog.google/technology/google-labs/gentabs-gemini-3/
1•doener•13m ago•0 comments

Philosophy Lectures on YouTube

https://jaredhenderson.substack.com/p/the-best-philosophy-lectures-on-youtube
1•synthetictask•14m ago•0 comments

Could America win the AI race but lose the war?

https://www.ft.com/content/12581344-6e37-45a0-a9d5-e3d6a9f8d9ba
1•pseudolus•22m ago•1 comments

Our First Contact with Aliens Will Be Their Last Words – Eschatian Hypothesis [video]

https://www.youtube.com/watch?v=jSlbplt7GhA
1•hammadmajid•29m ago•0 comments

Germany's train service is one of Europe's worst. How did it get so bad?

https://www.npr.org/2025/12/12/g-s1-100794/germany-train-rail-deutsche-bahn
5•pseudolus•32m ago•0 comments

Toppleware

https://hannibalglaser.dk/toppleware
1•closingreunion•32m ago•0 comments

Show HN: What are your thoughts on this?

https://www.npmjs.com/package/dotenv-diff
1•chrilleweb•33m ago•0 comments

Math Predicting the Death of Nations

https://www.youtube.com/watch?v=B5cMfyFqKmM
1•aranw•35m ago•0 comments

YouTube's CEO limits his kids' social media use – other tech bosses do the same

https://www.cnbc.com/2025/12/13/youtubes-ceo-is-latest-tech-boss-limiting-his-kids-social-media-u...
4•pseudolus•37m ago•3 comments

Aethism, God and Startups

https://vednig.medium.com/aethism-god-and-startups-ed135031af98
1•vednig•37m ago•0 comments

Texas Attorney General Sues Epic for Gatekeeping Medical Data

https://www.texasattorneygeneral.gov/news/releases/attorney-general-ken-paxton-sues-major-medical...
2•jasoneisen•40m ago•0 comments

Using Linear Algebra to Predict a Non-Linear Pendulum

https://chillphysicsenjoyer.substack.com/p/using-linear-algebra-to-predict-a
3•crescit_eundo•41m ago•0 comments

RIP American Tech Dominance

https://www.theatlantic.com/economy/2025/12/trumps-china-ai-chips/685235/
1•breve•41m ago•0 comments

Bohra Cuisine: A Pinch of Salt and Desserts First

https://www.thedawoodibohras.com/bohra-cuisine-a-pinch-of-salt-and-desserts-first/
1•thunderbong•42m ago•0 comments

Recent GeoServer Vulnerability Exploited in Attacks

https://www.securityweek.com/recent-geoserver-vulnerability-exploited-in-attacks/
1•Bender•49m ago•0 comments

OpenAI built an AI coding agent and uses it to improve the agent itself

https://arstechnica.com/ai/2025/12/how-openai-is-using-gpt-5-codex-to-improve-the-ai-tool-itself/
1•Bender•49m ago•0 comments

Online gaming escaped Australia's social media ban; critics say it's just as addictive

https://www.bbc.co.uk/news/articles/c93w90kqgv9o
1•n1b0m•50m ago•0 comments

Browser Benchmark Test

https://browserbench.org/Speedometer3.1/
1•Bender•52m ago•0 comments

Slide Rule

https://en.wikipedia.org/wiki/Slide_rule
1•tosh•55m ago•0 comments

Gut microbial imbalance can impact memory, says study

https://www.newindianexpress.com/states/kerala/2025/Dec/12/gut-microbial-imbalance-can-impact-mem...
3•sundarurfriend•57m ago•0 comments

Broadcom tumbles 11% after earnings as AI trade sells off

https://www.cnbc.com/2025/12/12/broadcom-tumbles-10percent-after-earnings-as-ai-trade-sells-off-....
2•kristianp•1h ago•1 comments

1.5M Plastic Bottles Are Turned into Clothing Every Day (2024) [video]

https://www.youtube.com/watch?v=FChEek0NSOI
2•mgh2•1h ago•1 comments

Show HN: The Lost World 2030 – A full comic built with HTML+CSS

https://www.google.com/search?q=the+lost+world+2030+&sca_esv=8c8f0d3927118de8&sxsrf=AE3TifNm7oDHP...
2•OSCAR-ORO•1h ago•0 comments

The Number That Turned Sideways

https://zuriby.github.io/math.github.io/the-number-that-turned-sideways.html
1•tzury•1h ago•0 comments

Retroviral insertions contributed to the divergence of human and chimp brains

https://www.biorxiv.org/content/10.64898/2025.12.12.693858v1
1•XzetaU8•1h ago•0 comments

I Fed 24 Years of My Blog Posts to a Markov Model

https://susam.net/fed-24-years-of-posts-to-markov-model.html
2•susam•1h ago•0 comments