Tell HN: Google increased existing finetuned model latency by 5x

2•deaux•1h ago
Five days ago, the latency of our finetuned 2.5 Flash models suddenly jumped 5x. For those less familiar, finetuned models like these are often used to get close to a big model's performance on one specific task at much lower latency and cost. That means they usually serve realtime, high-traffic production use cases where you want to respond to the user quickly; otherwise, finetuning generally isn't worth it. Many teams spend a few thousand dollars (at a minimum) on finetuning a model for one such task.

Five days ago, Google released Nano Banana Pro (Gemini 3.0 Image Preview) to the world. And since five days ago, the latency of our existing finetuned models has quintupled. We've talked with other startups that also use finetuned 2.5 Flash models, and they're seeing exactly the same thing, even those in different regions. Obviously this has a big impact on all of our products.

From Google's side, nothing but silence, and this is with paid support. The only reply to the initial support ticket was a request for basic information that had already been provided in that ticket or is trivially obvious. Since then, it's been more than 48 hours of nothingness.

Of course the timing could be a pure coincidence - though we've never seen any such latency instability before - but we can all see what's most likely here: Nano Banana Pro and Gemini 3 Preview are consuming a huge amount of compute, and Google is simply sacrificing finetuned model output for them. It's impossible to take them seriously for business use after this; who knows what they'll do next time? For all their faults, OpenAI have been a bastion of stability, despite being the most B2C-focused of all the frontier model providers. Google with Vertex claims to be all about enterprise and then breaks its business customers' products to get consumers their Ghibli images 1% faster. They've surely gotten plenty of tickets about this, and given Google's engineering, they must have automated monitoring that catches such a huge latency increase immediately. Temporary outages are understandable and happen everywhere, see AWS and Cloudflare recently, but 5+ days of 5x latency - if they even fix it - is effectively a 5+ day outage of the service.

I'm posting this mostly as a warning to other startups here to not rely on Google Vertex for user-facing model needs going forward.

NZ's draft science curriculum favours rote learning over critical thinking

https://theconversation.com/nzs-draft-science-curriculum-favours-rote-learning-over-critical-thin...
3•billybuckwheat•1m ago•0 comments

U-turn: Google wants to bring JPEG XL back to Chrome

https://www.heise.de/en/news/U-turn-Google-wants-to-bring-JPEG-XL-back-to-Chrome-11089880.html
1•peterwyatt-pdfa•2m ago•1 comments

Jeff Dean on Important AI Trends [video]

https://www.youtube.com/watch?v=AnTw_t21ayE
1•todsacerdoti•4m ago•0 comments

What a CTO should know about tech

https://deadsimpletech.com/blog/cto_tech_capabilities
1•mirawelner•6m ago•0 comments

A Software Engineer's Guide to Agentic Software Development

https://brittanyellich.com/agentic-software-development/
1•overcommitted•7m ago•0 comments

Alphabet in Motion: An ABC Pop-Up Book about Typography

https://www.kellianderson.com/books/alphabetinmotion.html
2•bhattisatish•8m ago•1 comments

The Druridge Bay Ruin [video]

https://www.youtube.com/watch?v=mCceufLwJxU
2•DoreenMichele•12m ago•0 comments

Memories of .us

https://computer.rip/2025-11-11-dot-us.html
1•todsacerdoti•14m ago•0 comments

How I talk to whales

https://www.nytimes.com/2025/11/23/opinion/whale-language-ai.html
2•flabber•14m ago•0 comments

Show HN: I wrote my lecture notes in Typst

https://github.com/zhengnanli/ss-notes
2•subtlemuffins•15m ago•0 comments

Turbine Transport Transformer

https://mitxela.com/projects/turbine_transport_transformer
1•mhb•15m ago•0 comments

Kubricks' 2001: One Man's Incredible Odyssey (2015)

http://nzpetesmatteshot.blogspot.com/2015/01/kubricks-2001-one-mans-incredible.html
1•exvi•15m ago•0 comments

Mind-altering 'brain weapons' no longer only science fiction, say researchers

https://www.theguardian.com/world/2025/nov/22/mind-altering-brain-weapons-no-longer-only-science-...
1•zdw•16m ago•0 comments

Magicians of the Miniature (2014)

http://nzpetesmatteshot.blogspot.com/2014/12/magicians-of-miniature.html
1•exvi•17m ago•0 comments

I built a $19 forensic ATS scanner because Jobscan costs $50/mo

https://www.interviewghost.us/
1•ryanpedram•18m ago•1 comments

Video posted by Garry Tan shows suspect who robbed his friend of $11M in crypto

https://www.sfchronicle.com/crime/article/sf-cryptocurrency-robbery-21203804.php
2•markerz•18m ago•0 comments

Show HN: I built a CLI to use devcontainers without VS Code

https://github.com/UPwith-me/Container-Maker
2•DEVINHE111•20m ago•0 comments

Mitigating Application Resource Overload with Targeted Task Cancellation

http://muratbuffalo.blogspot.com/2025/11/mitigating-application-resource.html
1•zdw•21m ago•0 comments

Unpaid Labor Allegations Cast Shadow over Naver WEBTOON's Market Dominance

https://www.animenewsnetwork.com/feature/2025-11-05/unpaid-labor-allegations-cast-shadow-over-nav...
2•PaulHoule•21m ago•0 comments

Through the Looking Glass: The Traditional Glass Shot Matte Painting (2016)

http://nzpetesmatteshot.blogspot.com/2016/08/through-looking-glass-traditional-glass.html
1•exvi•21m ago•0 comments

Eggroll: Novel general-purpose machine learning algorithm provides 100x speed

https://eshyperscale.github.io/
2•felineflock•23m ago•0 comments

Astrl – a free AI-powered Khan Academy for self-guided learning

https://tryastrl.com/
1•jjwilkin•37m ago•1 comments

We're Stuck in an Infinite Loop of Terrible Tech

https://timyc.substack.com/p/were-stuck-in-an-infinite-loop-of
3•TimDotC•38m ago•1 comments

An Auto Holy Grail: Motors That Don't Rely on Chinese Rare Earths

https://www.nytimes.com/2025/11/24/business/automakers-rare-earth-minerals-magnets.html
1•mmooss•39m ago•0 comments

Anthropic introduces cheaper, more powerful, more efficient Opus 4.5 model

https://arstechnica.com/ai/2025/11/anthropic-introduces-opus-4-5-cuts-api-pricing-and-enables-muc...
1•jnord•41m ago•1 comments

Humanoid robot walked 66 miles in 3 days, right into the Guinness World Records

https://www.cbsnews.com/news/china-humanoid-robot-agibot-a2-walks-66-miles-guinness-world-records/
1•satonakamoto•41m ago•1 comments

Jakarta overtakes Tokyo as largest city, according to UN

https://www.abc.net.au/news/2025-11-25/jakarta-overtakes-tokyo-as-worlds-largest-city/106049122
2•Gaishan•42m ago•1 comments

Endogenous Automation Will Hit You

https://lydianottingham.substack.com/p/endogenous-automation-will-hit-you
1•eatitraw•44m ago•1 comments

Revolut hits $75B valuation

https://news.crunchbase.com/fintech/revolut-valuation-spikes-secondary-share-sale/
2•rudderdev•45m ago•3 comments

Beddel: Secure, Declarative, and Extensible Agent Runtimes

https://github.com/botanarede/beddel-alpha
1•mesenga•45m ago•1 comments