frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Πfs – The Data-Free Filesystem

https://github.com/philipl/pifs
1•ravenical•3m ago•0 comments

Go-busybox: A sandboxable port of busybox for AI agents

https://github.com/rcarmo/go-busybox
1•rcarmo•4m ago•0 comments

Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery [pdf]

https://research.nvidia.com/labs/nemotron/files/NVFP4-QAD-Report.pdf
1•gmays•4m ago•0 comments

xAI Merger Poses Bigger Threat to OpenAI, Anthropic

https://www.bloomberg.com/news/newsletters/2026-02-03/musk-s-xai-merger-poses-bigger-threat-to-op...
1•andsoitis•5m ago•0 comments

Atlas Airborne (Boston Dynamics and RAI Institute) [video]

https://www.youtube.com/watch?v=UNorxwlZlFk
1•lysace•6m ago•0 comments

Zen Tools

http://postmake.io/zen-list
1•Malfunction92•8m ago•0 comments

Is the Detachment in the Room? – Agents, Cruelty, and Empathy

https://hailey.at/posts/3mear2n7v3k2r
1•carnevalem•8m ago•0 comments

The purpose of Continuous Integration is to fail

https://blog.nix-ci.com/post/2026-02-05_the-purpose-of-ci-is-to-fail
1•zdw•10m ago•0 comments

Apfelstrudel: Live coding music environment with AI agent chat

https://github.com/rcarmo/apfelstrudel
1•rcarmo•11m ago•0 comments

What Is Stoicism?

https://stoacentral.com/guides/what-is-stoicism
3•0xmattf•12m ago•0 comments

What happens when a neighborhood is built around a farm

https://grist.org/cities/what-happens-when-a-neighborhood-is-built-around-a-farm/
1•Brajeshwar•12m ago•0 comments

Every major galaxy is speeding away from the Milky Way, except one

https://www.livescience.com/space/cosmology/every-major-galaxy-is-speeding-away-from-the-milky-wa...
2•Brajeshwar•12m ago•0 comments

Extreme Inequality Presages the Revolt Against It

https://www.noemamag.com/extreme-inequality-presages-the-revolt-against-it/
2•Brajeshwar•12m ago•0 comments

There's no such thing as "tech" (Ten years later)

1•dtjb•13m ago•0 comments

What Really Killed Flash Player: A Six-Year Campaign of Deliberate Platform Work

https://medium.com/@aglaforge/what-really-killed-flash-player-a-six-year-campaign-of-deliberate-p...
1•jbegley•14m ago•0 comments

Ask HN: Anyone orchestrating multiple AI coding agents in parallel?

1•buildingwdavid•15m ago•0 comments

Show HN: Knowledge-Bank

https://github.com/gabrywu-public/knowledge-bank
1•gabrywu•21m ago•0 comments

Show HN: The Codeverse Hub Linux

https://github.com/TheCodeVerseHub/CodeVerseLinuxDistro
3•sinisterMage•22m ago•2 comments

Take a trip to Japan's Dododo Land, the most irritating place on Earth

https://soranews24.com/2026/02/07/take-a-trip-to-japans-dododo-land-the-most-irritating-place-on-...
2•zdw•22m ago•0 comments

British drivers over 70 to face eye tests every three years

https://www.bbc.com/news/articles/c205nxy0p31o
25•bookofjoe•22m ago•9 comments

BookTalk: A Reading Companion That Captures Your Voice

https://github.com/bramses/BookTalk
1•_bramses•23m ago•0 comments

Is AI "good" yet? – tracking HN's sentiment on AI coding

https://www.is-ai-good-yet.com/#home
3•ilyaizen•24m ago•1 comments

Show HN: Amdb – Tree-sitter based memory for AI agents (Rust)

https://github.com/BETAER-08/amdb
1•try_betaer•25m ago•0 comments

OpenClaw Partners with VirusTotal for Skill Security

https://openclaw.ai/blog/virustotal-partnership
2•anhxuan•25m ago•0 comments

Show HN: Seedance 2.0 Release

https://seedancy2.com/
2•funnycoding•25m ago•0 comments

Leisure Suit Larry's Al Lowe on model trains, funny deaths and Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
1•thelok•25m ago•0 comments

Towards Self-Driving Codebases

https://cursor.com/blog/self-driving-codebases
1•edwinarbus•26m ago•0 comments

VCF West: Whirlwind Software Restoration – Guy Fedorkow [video]

https://www.youtube.com/watch?v=YLoXodz1N9A
1•stmw•27m ago•1 comments

Show HN: COGext – A minimalist, open-source system monitor for Chrome (<550KB)

https://github.com/tchoa91/cog-ext
1•tchoa91•27m ago•1 comments

FOSDEM 26 – My Hallway Track Takeaways

https://sluongng.substack.com/p/fosdem-26-my-hallway-track-takeaways
1•birdculture•28m ago•0 comments
Open in hackernews

Tell HN: Google increased existing finetuned model latency by 5x

13•deaux•2mo ago
Since 5 days ago, the latency of our Finetuned 2.5 Flash models has suddenly jumped by 5x. For those less familiar, such finetuned models are often used to get close to the performance of a big model at one specific task with much less latency and cost. This means they're usually used for realtime, production use cases that see a lot of use and where you want to respond to the user quickly. Otherwise, finetuning generally isn't worth it. Many spend a few thousand dollars (at a minimum) on finetuning a model for one such task.

Five days ago, Google released Nano Banana Pro (Gemini 3.0 Image Preview) to the world. And since five days ago, the latency of our existing finetuned models has suddenly quintupled. We've talked with other startups who also make use of finetuned 2.5 Flash models, and they're seeing the exact same, even those in different regions. Obviously this has a big impact on all of our products.

From Google's side, nothing but silence, and this is talking about paid support. The reply to the initial support ticket is a request for basic information that has already been provided in that ticket or is trivially obvious. Since then, it's been more than 48 hours of nothingness.

Of course the timing could be a pure coincidence - though we've never seen any such latency instability before - but we can all see what's most likely here; Nano Banana Pro and Gemini 3 Preview consuming a huge amount of compute, and they're simply sacrificing finetuned model output for those. It's impossible to take them seriously for business use after this, who knows what they'll do next time. For all their faults, OpenAI have been a bastion of stability, despite being the most B2C-focused of all the frontier model providers. Google with Vertex claims to be all about enterprise and then breaks product of their business customers to get consumers their Ghibli images 1% faster. They've surely gotten plenty of tickets about this, and given Google's engineering, they must have automated monitoring that catches such a huge latency increase immediately. Temporary outages are understandable and happen everywhere, see AWS and Cloudflare recently, but 5+ days - if they even fix it - of 5x latency is effectively a 5+ day outage of a service.

I'm posting this mostly as a warning to other startups here to not rely on Google Vertex for user-facing model needs going forward.

Comments

jpau•2mo ago
Hey we're also a Vertex tuning customer in a similar spot. We're seeing other capacity issues, although not a leap in latency. Can you DM me? I'd love to trade notes. https://x.com/hellofromjames
deaux•2mo ago
Not a verified X user, but happy to exchange here or elsewhere. The latency leap is still the same for us. We're on us-west1 but reports are that it's similar on at least us-central1 if not elsewhere. We simply can't use the finetuned models in prod any more due to this, but whenever we run our automated tests with them, including today, the latency is still there. We haven't seen issues on non-finetuned models.