We ran a small visual benchmark [1] comparing GPT, Gemini, Claude, and our new visual agent Orion [2] on a handful of tasks: object detection, segmentation, OCR, image/video generation, and multi-step visual reasoning.
The surprising part: models that ace benchmarks often fail on seemingly trivial visual tasks, while others succeed in unexpected places. We show concrete examples, side-by-side outputs, and how each model breaks when chaining multiple visual steps.
We go into more detail in our technical whitepaper [3]. Play around with Orion for free here [4].
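To make "chaining multiple visual steps" concrete, here is a rough sketch of the kind of task we mean. The names and helper structure below are illustrative only, not the actual Orion or vlm.run API; the point is that each step consumes the previous step's output, so one weak link breaks the whole chain.

    # Hypothetical sketch of a chained visual task (names are illustrative,
    # not the real Orion API). Each step feeds its output into the next.
    from dataclasses import dataclass

    @dataclass
    class Step:
        name: str
        prompt: str

    CHAIN = [
        Step("detect", "Find every receipt in the photo and return bounding boxes."),
        Step("ocr",    "Read the line items and totals inside each detected box."),
        Step("reason", "Do the line items sum to the printed total? Answer yes or no."),
    ]

    def run_chain(model_call, image_bytes: bytes) -> list[str]:
        # model_call(prompt, image, context) -> str is whatever model is under test.
        context: list[str] = []
        for step in CHAIN:
            out = model_call(step.prompt, image_bytes, context)
            context.append(f"{step.name}: {out}")  # feed prior outputs forward
        return context

A model that nails single-shot detection or OCR in isolation can still fall apart here, because an error in an early step compounds through every later one.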
[1] Showdown: https://chat.vlm.run/showdown
[2] Learn about Orion: https://vlm.run/orion
[3] Technical whitepaper: https://vlm.run/orion/whitepaper
[4] Chat with Orion: https://chat.vlm.run/
Happy to answer questions or dig into specific cases in the comments.