
Tell HN: Google increased existing finetuned model latency by 5x

13•deaux•2mo ago
Since five days ago, the latency of our finetuned 2.5 Flash models has suddenly jumped by 5x. For those less familiar: finetuned models like these are often used to get close to the performance of a big model on one specific task, with much lower latency and cost. That means they're usually used for realtime, high-traffic production use cases where you want to respond to the user quickly. Otherwise, finetuning generally isn't worth it. Many spend a few thousand dollars (at a minimum) on finetuning a model for one such task.

Five days ago, Google released Nano Banana Pro (Gemini 3.0 Image Preview) to the world. Since that same day, the latency of our existing finetuned models has quintupled. We've talked with other startups that also use finetuned 2.5 Flash models, and they're seeing exactly the same thing, even in different regions. Obviously this has a big impact on all of our products.

From Google's side, nothing but silence, and this is with paid support. The reply to our initial support ticket was a request for basic information that had already been provided in that ticket or is trivially obvious. Since then, it's been more than 48 hours of nothing.

Of course the timing could be pure coincidence, though we've never seen any such latency instability before. But we can all see what's most likely here: Nano Banana Pro and Gemini 3 Preview are consuming a huge amount of compute, and Google is simply sacrificing finetuned model performance for them. It's impossible to take them seriously for business use after this; who knows what they'll do next time.

For all their faults, OpenAI have been a bastion of stability, despite being the most B2C-focused of all the frontier model providers. Google with Vertex claims to be all about enterprise, and then breaks its business customers' products to get consumers their Ghibli images 1% faster. They've surely gotten plenty of tickets about this, and given Google's engineering, they must have automated monitoring that catches such a huge latency increase immediately. Temporary outages are understandable and happen everywhere (see AWS and Cloudflare recently), but 5+ days of 5x latency, if they even fix it, is effectively a 5+ day outage of the service.

I'm posting this mostly as a warning to other startups here to not rely on Google Vertex for user-facing model needs going forward.

Comments

jpau•2mo ago
Hey, we're also a Vertex tuning customer in a similar spot. We're seeing other capacity issues, although not a jump in latency. Can you DM me? I'd love to trade notes. https://x.com/hellofromjames
deaux•2mo ago
I'm not a verified X user, but happy to exchange notes here or elsewhere. The latency jump is still the same for us. We're on us-west1, but reports are that it's similar on at least us-central1, if not elsewhere. We simply can't use the finetuned models in prod any more because of this, and whenever we run our automated tests against them, including today, the latency is still there. We haven't seen issues on non-finetuned models.