Gemini 3.5 Flash

https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
142•spectraldrift•2h ago
https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flas...

Comments

f311a•2h ago
$9/1M output
explosion-s•1h ago
I wonder if this is because it's a larger model or maybe just because they can? Although with the latest Deepseek it's really tough to compete pricing wise. Inference speed and integration (e.g. Antigravity) might be their only hope here
hydra-f•10m ago
It has to be a larger model, wouldn't make much sense otherwise. That isn't to say the price isn't artificially increased as well

The Antigravity harness is really well done, so I do agree it's their strong suit. Can't say the same about gemini-cli (though it has a really nice interface)

Would still choose Deepseek for the price

alexdns•1h ago
It's Gemini 3.5 Flash
nerdalytics•1h ago
Yeah, Google chose a misleading title for the blog post.
jader201•18m ago
> Today, we’re introducing Gemini 3.5, our latest family of models combining frontier intelligence with action. This represents a major leap forward in building more capable, intelligent agents. We’re kicking off the series by releasing 3.5 Flash.
swe_dima•1h ago
Flash family but costs like a Pro. $9 vs $12 for output.
asar•1h ago
$1.5/m input tokens $9/m output tokens

6x the price of 3.1 flash lite

himata4113•1h ago
I don't think input/output pricing matters, 90% of the cost is cache. $0.15 is pretty good, but still very expensive.
minimaxir•1h ago
10% of input pricing is standard especially compared to competition.
himata4113•1h ago
yah, which means that the input cost is the only value that should be paid attention to at the end + the cache discount (x10). If google would start offering x20 discount it would make it twice as cheap while input and output stayed the same.
wolttam•1h ago
It depends on the use-case. yes, 90% of cost is cache in agentic coding scenarios (actually 95% in my experience). But not when the model reasons for 200k+ tokens before answering a complex problem.
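A rough sketch of why the answer depends on the workload, using the Gemini 3.5 Flash prices quoted in this thread ($1.50 input, $0.15 cached, $9 output per 1M tokens). The two workloads are hypothetical, and the sketch assumes reasoning tokens are billed at the output rate.

```python
# Prices in USD per 1M tokens, as quoted in this thread for Gemini 3.5 Flash.
INPUT_PRICE = 1.50
CACHED_PRICE = 0.15
OUTPUT_PRICE = 9.00


def cost_breakdown(fresh_input: int, cached_input: int, output: int) -> dict[str, float]:
    """Return the dollar cost of each token category for one workload."""
    per_million = 1_000_000
    return {
        "fresh_input": fresh_input * INPUT_PRICE / per_million,
        "cached_input": cached_input * CACHED_PRICE / per_million,
        "output": output * OUTPUT_PRICE / per_million,
    }


def share(breakdown: dict[str, float], key: str) -> float:
    """Fraction of the total bill attributable to one category."""
    total = sum(breakdown.values())
    return breakdown[key] / total if total else 0.0


# Hypothetical agentic session: 200 turns, each re-reading a 150k-token cached
# prefix, adding 1k fresh input tokens and emitting 500 output tokens.
agentic = cost_breakdown(fresh_input=200 * 1_000, cached_input=200 * 150_000, output=200 * 500)

# Hypothetical long-reasoning request: small prompt, 200k tokens of output
# (assumption: thinking tokens are billed as output).
reasoning = cost_breakdown(fresh_input=5_000, cached_input=0, output=200_000)

print(f"agentic: cached share = {share(agentic, 'cached_input'):.0%}")
print(f"reasoning: output share = {share(reasoning, 'output'):.0%}")
```

With these made-up numbers the cached prefix dominates the agentic bill, while the long-reasoning bill is almost entirely output; the exact split moves with prefix size and turn count.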
himata4113•1h ago
gemini models solve a problem in 80% less tokens so that's something to think about.
johaugum•21m ago
Source?
__jl__•1h ago
In our experience, caching is not very reliable with google. We always get random cache misses that don't happen with other providers. We find OpenAI, Anthropic and Fireworks (which we use a lot) all have higher cache hit rates. So it's not only about the cost of cached tokens but also what kind of cache hit rate you get.
svachalek•30m ago
In my experience Google is the most flaky in general, which is surprising considering the rock solid history of their search and other products. Just more likely not to respond at all, to give a response out of left field, to handle the same error in 12 different ways randomly (a rainbow of HTTP status codes and error messages), etc etc.
simonw•20m ago
Gemini caching is confusing though:

  $0.15 / million tokens
  $1.00 / 1,000,000 tokens per hour (storage price)
I much prefer the OpenAI/DeepSeek way of pricing caching where you don't have to think about storage price at all - you pay for cached tokens if you reuse the same prefix within a (loosely defined) time period.
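To make the two-part pricing concrete, here is a minimal sketch that combines the per-read price and the hourly storage price quoted above, and compares them with resending the prefix as ordinary input at $1.50 per 1M tokens (the figure quoted elsewhere in the thread). How the charges are actually combined on an invoice is an assumption here, and the cost of the initial cache write is ignored.

```python
CACHED_READ_PRICE = 0.15  # USD per 1M cached tokens read (from the comment)
STORAGE_PRICE = 1.00      # USD per 1M tokens per hour of storage (from the comment)
INPUT_PRICE = 1.50        # USD per 1M uncached input tokens (quoted elsewhere in the thread)


def explicit_cache_cost(prefix_tokens: int, reuses: int, hours_stored: float) -> float:
    """Cost of reading a cached prefix `reuses` times plus keeping it stored."""
    millions = prefix_tokens / 1_000_000
    return millions * (reuses * CACHED_READ_PRICE + hours_stored * STORAGE_PRICE)


def uncached_cost(prefix_tokens: int, reuses: int) -> float:
    """Cost of resending the same prefix as ordinary input every time."""
    return prefix_tokens / 1_000_000 * reuses * INPUT_PRICE


def break_even_reuses(hours_stored: float) -> float:
    """Reuses at which cached reads plus storage equal plain input pricing."""
    return hours_stored * STORAGE_PRICE / (INPUT_PRICE - CACHED_READ_PRICE)


prefix = 1_000_000
print(f"cached, 10 reuses in 1h: ${explicit_cache_cost(prefix, 10, 1.0):.2f}")
print(f"uncached, 10 sends:      ${uncached_cost(prefix, 10):.2f}")
print(f"break-even reuses per stored hour: {break_even_reuses(1.0):.2f}")
```

Under these assumptions the storage fee matters mainly for prefixes that sit idle: the fewer reuses per stored hour, the more the hourly charge eats into the read discount.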
John7878781•1h ago
[deleted]
stri8ed•1h ago
Output cost is 3x from Gemini 3 flash.
iwhalen•1h ago
I wonder why they didn't discuss price in the post?

Compare to the GPT-5.5 announcement: https://openai.com/index/introducing-gpt-5-5/

WarmWash•1h ago
I haven't used 3.5 at all yet, but previous Gemini (and Gemma) models are by far more token-light per task than any other model.

Cost per task is a more productive measure, but obviously a more difficult one to benchmark.

Aunche•35m ago
"Flash-Lite" is a different product from "Flash", which is more expensive. They couldn't be more confusing with their naming though, especially since they have 3.1 Pro and not 3.1 Flash non-lite.
himata4113•1h ago
Engineers at google have publicly stated that the models are too big and are far from their potential. Glad they're being proven right with every release.

They continue to focus on smaller models while openai and anthropic are increasing compute requirements for their SOTA models.

stri8ed•1h ago
Given the cost increase associated with this model, and previous model releases, I think the size is trending upwards, not down.
himata4113•1h ago
The speed says otherwise. I think they're increasing costs since they want to start seeing ROI.
JanSt•1h ago
Those are (mostly) new, faster TPUs
himata4113•1h ago
The latest TPUs appear to reach 800tok/s rather than the advertised 300tok/s.
maipen•1h ago
Don’t let that fool you. Google will have SOTA models as big as or even bigger than their competitors.

They are just refining their current models while they finish training the next generation.

They will all come out at about the same time. Anthropic, OpenAi, Google, xAI

ACCount37•1h ago
Anthropic has been sitting on Mythos for a while now. I guess they don't feel pressured to fuck it ship it until anyone else gets a 10T to work.
Sevii•1h ago
It's doubtful they have the compute to make mythos publicly available even after the SpaceX datacenter deal. And why sell it publicly if people are still willing to pay for Opus 4.7?
outside1234•1h ago
I suspect that Mythos doesn't have a business model that works
throwa356262•45m ago
According to people who have access to Mythos, it is slightly worse than GPT-5.5-xhigh. At least for security tasks.

Hold on, I think this claim needs some hard data. Here you go gentlemen:

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...

ACCount37•20m ago
That claim keeps getting contradicted hard by other parties, who say Mythos beats 5.5 resoundingly on both autonomous search and discovery and creation of complex exploit chains.

There might be a harness difference, but also, this CTF-type benchmark might not capture the capability difference fully.

aesthesia•17m ago
See the later post testing a newer Mythos checkpoint, though: https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber...
abirch•20m ago
Anthropic can sell Mythos to Fortune 500 companies and bypass the average user. I'm not sure how much is hype but I see things like this https://blog.cloudflare.com/cyber-frontier-models/
howdareme•1h ago
Google’s pro models are almost certainly bigger than Openai’s lol
fikama•25m ago
Why would that be? I am curious why do you think that.
ActorNightly•2m ago
Because TPUs are more efficient, and it's cheaper for them to field them in higher quantity since they own the chip.
Jabbles•41m ago
> Engineers at google have publicly stated that the models are too big and are far from their potential

Can you link to a source?

Dinux•25m ago
Source please, because I don't believe that for one second
golfer•1h ago
Here's the benchmark scoreboard they published:

https://storage.googleapis.com/gweb-uniblog-publish-prod/ori...

mixtureoftakes•1h ago
benchmarks look REALLY good, the price hike is big but it also beats sonnet 4.6 in every discipline?
SXX•1h ago

  > Create animated SVG of a frog on a boat rowing through jungle river. Single page self contained HTML page with SVG
3.5 Flash: Thinking Medium - 7516 tokens

https://gistpreview.github.io/?5c9858fd2057e678b55d563d9bff0...

3.5 Flash: Thinking High - 7280 tokens

https://gistpreview.github.io/?1cab3d70064349d08cf5952cdc165...

3.1 Pro - 28,258 tokens

https://gistpreview.github.io/?6bf3da2f80487608b9525bce53018...

3.1 took 3 minutes of thinking to generate, but it's the only one that got animated movement.
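A quick back-of-the-envelope cost per task for these three runs, as a sketch. It assumes the token counts above are output tokens billed at the output rates quoted elsewhere in the thread ($9 per 1M for 3.5 Flash, $12 per 1M for 3.1 Pro) and ignores the short input prompt.

```python
# (label, tokens reported in the comment, USD per 1M output tokens)
RUNS = [
    ("3.5 Flash, Thinking Medium", 7_516, 9.00),
    ("3.5 Flash, Thinking High", 7_280, 9.00),
    ("3.1 Pro", 28_258, 12.00),
]


def task_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of one run, counting output tokens only."""
    return tokens * price_per_million / 1_000_000


for label, tokens, price in RUNS:
    print(f"{label}: ${task_cost(tokens, price):.4f}")
```

On this single prompt the per-task gap comes mostly from the token count rather than the per-token price, which is the "cost per task" point made earlier in the thread. One prompt is not a benchmark, of course.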

abi•1h ago
Your links are broken FYI.
John7878781•1h ago
They work for me.
TacticalCoder•1h ago
They do work here too.
captn3m0•1h ago
All three links animate for me.
NitpickLawyer•1h ago
I think they mean the boat is moving. In the flash ones the paddles are animated but the boat is stationary for me.
codazoda•1h ago
The boat moves in all three for me
Fishkins•56m ago
The boat itself rocks, but do you see the background changing to indicate the boat is progressing through the environment? I only see that in the 3.1 Pro example. I believe that's what the OP meant.
Manuel_D•49m ago
I think this illustrates the problem with OP's prompt. If the goal is specifically to implement a scrolling background, this should have been in the prompt.
SXX•27m ago
Yup. My bad. It was just the first idea that came to my mind, since I enjoy visually comparing each new release with unique prompts.
wslh•1h ago
Can you try with a more complex story such as "three little pigs"? I tried but it created a storybook instead of the SVG animation. I am looking to partially imitate Godogen [1][2] which is really great, even for animations.

[1] https://github.com/htdt/godogen

[2] https://drive.google.com/file/d/1ozZmWcSwieZQG0muYjbj7Xjhhlz...

SXX•1h ago
Gemini 3.1 Flash Lite Thinking High - 2,526 tokens:

https://gistpreview.github.io/?3496285c5dac5ba10ebbc0b201a1a...

Gemini 2.5 Pro - 5,325 tokens:

https://gistpreview.github.io/?cc5e0fefeaaffecd228c16c95e736...

Gemini 2.5 Flash - 7,556 tokens:

https://gistpreview.github.io/?263d6058fe526a62b8f270f0620ec...

SXX•33m ago
Gemma 4 E4B it via Edge Gallery on pixel phone:

https://gistpreview.github.io/?da742884e5e830ce71ee4db877519...

OFC this is just for fun, but nevertheless gave me working code on first try.

abtinf•1h ago
hesamation/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF @ Q6_K

8112 tokens @ 52.97 TPS, 0.85s TTFT

https://gistpreview.github.io/?7bdefff99aca89d1bc12405323bd4...

Full session: https://gist.github.com/abtinf/7bdefff99aca89d1bc12405323bd4...

Generated with LM Studio on a Macbook Pro M2 Max

https://huggingface.co/hesamation/Qwen3.6-35B-A3B-Claude-4.6...

SXX•30m ago
Well, honestly this is quite impressive compared to 3.1 Flash Lite and 2.5 Pro. Considering that 2.5 Pro is actually quite good at generating massive amounts of code one shot.
franze•34m ago
Opus 4.7

https://claude.ai/public/artifacts/128ebe5a-add7-406a-9bce-6...

cesarvarela•1h ago
Add Flash to the title, please.
meetpateltech•1h ago
edited it.
benbencodes•1h ago
Pricing is now live on ai.google.dev/pricing:

Gemini 3.5 Flash: $0.75 input / $4.50 output per 1M tokens, 1M context window. The output price explicitly "includes thinking tokens" — which is why it's higher than a typical flash-class model.

For comparison within the Gemini lineup:
- Gemini 2.5 Flash: $0.30 / $2.50
- Gemini 3.1 Flash-Lite: $0.25 / $1.50
- Gemini 3.1 Pro Preview: $2.00 / $12.00

So 3.5 Flash is ~2.5x more expensive input vs 2.5 Flash. The pricing and "including thinking tokens" framing position it as a reasoning-capable flash model rather than just a pure speed optimization.

conorh•1h ago
I think you have your pricing wrong there, Gemini 3.5 flash is $1.50 input and $9 output.
mchusma•1h ago
Okay, it's kind of somewhere between haiku and sonnet level pricing, at somewhere between sonnet and opus level performance. It's a great option. I was hoping to see opus class intelligence at haiku level pricing out of google, and this is close to that!
mchusma•1h ago
Never mind, after looking at more benchmarks, seems closer to sonnet level intelligence at slightly lower cost. Speed is great for latency sensitive applications, but if this was 1/2 the cost it would have been priced to win.

If this is the big model release out of google, it's a disappointment.

jpau•1h ago
Standard pricing is showing for me as $1.50 / $9.

(I suspect you're viewing the "flex" pricing).

lyjackal•1h ago
You’re quoting the batch pricing. On demand is $1.50 per M input and $9 per M output. This is effectively comparable cost to Gemini 2.5 Pro in a flash tier model
ls_stats•1h ago
You are seeing batch inference, standard inference is $1.5/$9. I was excited until I saw that price.
Tiberium•57m ago
Please delete/edit your AI-written and factually wrong post.
MallocVoidstar•3m ago
In addition to people pointing out your LLM got the pricing wrong,

> The pricing and "including thinking tokens" framing position it as a reasoning-capable flash model rather than just a pure speed optimization

Every Gemini model starting with 2.5 has been a reasoning model.

aliljet•1h ago
Is there a good benchmark tracking hallucinations? The models are all incredibly good now, even the open ones, and my hope is that the rate of hallucinations is something that's falling off in concert with larger and larger context lengths.
Sevii•1h ago
I haven't been bothered by hallucinations in premier models since early last year. Still see it in smaller local models though.
aliljet•1h ago
I'm really running into this deep at the edges of content creation. Take, for example, a need to generate some kind of legal work. The cost of painstakingly checking and rechecking each case cited is reducing the value of these frontier models immensely.

Coding, however, is solved like magic. Easier to add tests, to be fair.

throawayonthe•1h ago
well there is https://artificialanalysis.ai/evaluations/omniscience
goldenarm•40m ago
It's a gibberish input detection benchmark, and does not measure output hallucinations.
yieldcrv•1h ago
if last year's models were the ones people got familiar with in late 2022, hallucinations would be an underrepresented rumor, there would be no articles about it because it's so rare. overconfident lawyers wouldn't have messed up dockets in court with fake case law, in other domains that move faster, sources would be only partially outdated with agentic search and mcp servers filling in the gaps

AI psychosis would be the problem people talk about more, not just outright agreement but subtle ways of making you feel confident in your ideas. "yes, buy that domain name buy these other ones for defensibility"

(the domain name is dumb and completely unmarketable)

jampekka•54m ago
The models still hallucinate bad when called via APIs, especially if web search is not enabled. Gemini hallucinates quite frequently even with the app and search enabled. More recent (e.g. ChatGPT 5.x and Deepseek v4) prompts/harnesses search very aggressively, which does greatly mitigate hallucinations.
majso•1h ago
maybe something like this? https://petergpt.github.io/bullshit-benchmark/viewer/index.v...
WarmWash•1h ago
People complain about them incessantly, but I can almost never get people to actually post receipts. Every provider allows sharing chats, and anyone can share a prompt that reliably produces hallucinations.

More often than not, people are using images in responses that go awry. Which is fair, the models are sold as multi-modal, but image analysis is still at gpt-4.0 text-analysis levels.

Also knowledge cutoff issues, where people forget the models exist months to a year or more in the past.

saberience•57m ago
I see hallucinations ALL the time. It's only obvious when you're prompting about a subject you know well.

And when I say all the time, I mean it, and this is for Opus 4.7 Adaptive.

I often have to say, please do searches and cite sources, because if it doesn't, it will confidently give me wrong or outdated information.

If you're often asking questions about a topic that's not in your specialist knowledge you won't notice them.

droidjj•26m ago
Hallucination is also much better controlled in the context of agentic coding because outputs can be validated by running the code (or linters/LSP). I almost never notice hallucinations when I’m coding with AI, but when using AI for legal work (my real job) it hallucinates constantly and perniciously because the hallucinations are subtle—e.g., making up a crucial fact about a real case.
rjh29•28m ago
"People complain about them incessantly, but I can almost never get people to actually post receipts."

...my chats are all pretty long and involve personal conversations, or I've deleted them. It's a lot to ask for someone to post receipts. The number of complaints is enough data.

No matter how big the model is there will be edge cases where it has no data or is out of date. In these cases it just makes stuff up. You can detect it yourself by looking for words like usually or often when it states facts, e.g. "the mall often has a Starbucks." I asked it about a Genshin Impact character released in June 2025 and it consistently interpreted the name (Aino) as my player character because Aino wasn't in its data.

Honestly I'm surprised you haven't encountered it if you're using it more than casually. Pro is much better but not perfect.

hibikir•14m ago
I see constant hallucination in claude code when using specific tooling: it thinks it knows aws cli, for instance, but there are some flags that don't exist that it attempts to use all the time in 4.6 and 4.7. When asked about it, it says that yes, the flag doesn't exist in that command, but it exists in a different command (which it does), and yet it still attempts to use it without extra info.

Claude also believes it knows how AWS' KMS works, quite confidently, while getting things wrong. I have a separate "this is how KMS replication actually works" file just to deal with its misconceptions.

For gemini, I typically use it to query information from large corpuses, but it often web searches and hallucinates instead of reading the actual corpus. On a book series, it will hallucinate chapters and events which, while reasonable and plausible, do not exist. "Go look at the files and see if your reference is correct" shows that it's not correct, and it's a mandatory step. That doesn't prevent hallucination, though; it only makes sure you catch it after the fact, just like a method in a class that doesn't exist gets found out by the compiler. The LLM still hallucinated it.
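For generated Python, the "catch it after the fact" step can be partly mechanized. The sketch below only checks that referenced names exist on a module; `missing_attributes` is a hypothetical helper, and `json.to_string` is a made-up name standing in for a hallucinated function. It says nothing about whether an existing function behaves the way the model claimed.

```python
import importlib


def missing_attributes(module_name: str, names: list[str]) -> list[str]:
    """Return the names that do not exist on the imported module."""
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


# `json.dumps` and `json.loads` are real; `json.to_string` is invented here
# to play the role of a hallucinated reference.
print(missing_attributes("json", ["dumps", "loads", "to_string"]))
# prints ['to_string']
```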

hamdingers•7m ago
I can reliably produce hallucinations with this genre of prompt: "write a script that does <simple task> with <well known but not too-well-known API>." Even the frontier models will hallucinate the perfect API endpoint that does exactly what I want, regardless of if it exists.

The fix is easy enough though, a line in my global AGENTS.md instructing agents to search/ask for documentation before working on API integrations.

asdfasgasdgasdg•6m ago
https://gemini.google.com/share/9cd8ca68025a

I was trying to understand a game I've been playing, The Last Spell. I asked it for a tier list of omens -- which ones the community considers most important. At least a few of the names it posts are hallucinated ("omen of the sun" does not exist, and the omens that give extra gold are "savings," "fortune," and "great wealth").

Obviously not a critical use case but issues like this do keep me on my toes regarding whether the thing is working at all. I should ask 3.5 flash to do the same job.

FergusArgyll•53m ago
As long as the model uses web search, they almost never hallucinate anymore. The fast models (haiku, gpt-instant, flash) still sometimes have the problem where they don't search before answering so they can hallucinate
goldenarm•35m ago
I've seen chatGPT and Gemini hallucinate even from web search; it's better but not sufficient
bakugo•1h ago
Triple the price of the last Flash model ($3 -> $9 per 1M output). Quickly approaching Sonnet prices.

Feels like the AI pricing noose is tightening sooner rather than later.

eis•1h ago
3.5 Flash was more expensive than 3.1 Pro to run the Artificial Analysis test suite. $1551 for 3.5 Flash [0] vs $892 for 3.1 Pro [1]. That's 74% more cost while ranking lower. It's 2.5x as fast but I don't think the bang for the buck is there anymore like it was with 3.0 Flash. I'm a bit bummed out to be honest.

I did not expect such a huge (3x) price increase from 3.0 Flash and I bet many people will not just blindly upgrade as the value proposition is widely different.

One interesting point to note is that Google marked the model as Stable in contrast to nearly everything else being perpetually set as Preview.

[0] https://artificialanalysis.ai/models/gemini-3-5-flash

[1] https://artificialanalysis.ai/models/gemini-3-1-pro-preview

ls_stats•1h ago
>3.5 Flash was more expensive than 3.1 Pro to run the Artificial Analysis test suite

That's everything I needed to know.

ekojs•1h ago
Seems like the only good thing about 3.5 Flash is its speed. Not cost-competitive or benchmark-leading by any means.
mijoharas•52m ago
That's what I came here to check. Last model release they only put it into preview[0] at first.

Does that mean this model is production ready?

[0] https://news.ycombinator.com/item?id=47076484

pingou•14m ago
How do they calculate that?

3.1 has 57M output tokens from Intelligence Index, 3.5 Flash has 73M, so not a lot more, and 3.5 is a bit cheaper, I don't get how 3.5 can be 74% more expensive.
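The arithmetic behind the question can be laid out as follows. This sketch uses only figures from this thread: the suite totals from the parent comment, the output token counts above, and the $9 and $12 per 1M output prices. It does not know how Artificial Analysis actually computes its total, so the "remainder" is simply whatever output tokens alone fail to explain.

```python
flash_cost, pro_cost = 1551, 892               # USD to run the suite (parent comment)
flash_out, pro_out = 73_000_000, 57_000_000    # output tokens (this comment)
flash_price, pro_price = 9.00, 12.00           # USD per 1M output tokens (quoted in the thread)

# Relative increase in total suite cost.
increase = (flash_cost - pro_cost) / pro_cost

# What the output tokens alone would cost at list price.
flash_output_only = flash_out * flash_price / 1_000_000
pro_output_only = pro_out * pro_price / 1_000_000

print(f"reported increase: {increase:.0%}")
print(f"output-only estimate: 3.5 Flash ${flash_output_only:.0f}, 3.1 Pro ${pro_output_only:.0f}")
print(f"remainder: 3.5 Flash ${flash_cost - flash_output_only:.0f}, "
      f"3.1 Pro ${pro_cost - pro_output_only:.0f}")
```

So the 74% is just the ratio of the two reported totals. Output tokens at list price come out roughly equal for the two models, which means the difference would have to come from something else in the bill (input tokens, caching, or another charge); the thread does not settle which.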

nightski•1h ago
AI being a product is not the future. It's more like an operating system that deserves to be open and free (aka Linux). Unless that happens we are in for a very dystopian future. I wish I had the intelligence, resources and/or connections to try and make that happen.
lugu•3m ago
What we need today is a standard local API (think of it as a POSIX extension), so that each desktop app that needs AI to enhance a feature can simply call that. This way, those apps will need to handle the case where AI is not available. This will empower users.
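As a sketch of what "handle the case where AI is not available" could look like from an app's side: everything below is hypothetical, including the `LocalAI` interface and `find_local_ai`, since no such standard exists in the thread or elsewhere as far as this example assumes.

```python
from typing import Optional, Protocol


class LocalAI(Protocol):
    """Hypothetical system-provided interface; not an existing standard."""

    def complete(self, prompt: str) -> str: ...


def find_local_ai() -> Optional[LocalAI]:
    """Stand-in for OS-level discovery. Returns None when no provider is installed."""
    return None


def suggest_title(text: str, ai: Optional[LocalAI] = None) -> str:
    """Use the AI provider when present, otherwise fall back to a plain heuristic."""
    if ai is not None:
        return ai.complete(f"Suggest a short title for: {text}").strip()
    # Fallback path: first line, truncated. The feature degrades but still works.
    first_line = text.strip().splitlines()[0] if text.strip() else "Untitled"
    return first_line[:60]


class EchoAI:
    """Toy provider used only to exercise the AI path."""

    def complete(self, prompt: str) -> str:
        return "Toy title"


notes = "Meeting notes\nDiscussed pricing."
print(suggest_title(notes, find_local_ai()))  # no provider: heuristic fallback
print(suggest_title(notes, EchoAI()))         # provider present: AI path
```

The design point is that the AI call is an optional enhancement behind one interface, so the app ships a working non-AI path by construction.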
HardCodedBias•1h ago
Oh boy.

GDM is making (or has been backed into a corner into making) the bet that high throughput, low latency, low capability models are the path forward.

That probably works for vibe coded apps by non-practitioners.

I suspect that practitioners/professionals will wait longer for better results.

brokencode•1h ago
Where do you see that it’s low capability?

And Google is trying to make something affordable enough for a mass market, ad-supported audience.

They aren’t hyper focused on enterprise like Anthropic is. And that’s okay. There’s room for different players in different markets.

OsrsNeedsf2P•1h ago
Beats 3.1 Pro for price per token, but artificial analysis is showing it's dumber per token and costs more overall
sauwan•57m ago
Yeah, bummer. I was very excited for this release, but this killed it.
droidjj•45m ago
The pricing is an issue.
golfer•40m ago
Arena.ai is saying "Gemini 3.5 Flash’s pricing shifts the Pareto frontier in Text. 8 models from GoogleDeepMind dominate the Text Arena Pareto curve where only 4 labs are represented for top performance in their price tiers."

https://x.com/arena/status/2056793180998361233

s3p•1h ago
Yikes. I think the concept of a 'flash' model is changing, no? Google used to market this as its lower-intelligence, faster, cheaper option. I appreciate that they are delivering on both of those, but personally I would appreciate if they could create an incremental knowledge improvement while holding price steady. Fortune 500 companies have to make their money I guess.
2001zhaozhao•23m ago
I think flash just means "fast" now
noelsusman•57m ago
The Artificial Analysis benchmark results are pretty underwhelming. Roughly the same "intelligence" as MiMo-V2.5-Pro for over 3x the cost. We'll have to see how that translates to actual usage but it's not a great sign.
hydra-f•18m ago
That really depends on whether they have similar parameter counts, doesn't it? Unless you know that, the comparison is just strange
merb•54m ago
Still no new processor version for document ai https://docs.cloud.google.com/document-ai/docs/release-notes that is so weird. (Customer extractor)

It’s not possible to uptrain on preview releases and it did not get that much love for a while.

warthog•51m ago
GPT-5.5 on the benchmarks still seems to perform better than this

Plus the vibe of the gemini models is so weird, particularly when it comes to tool calling

At this point I kinda need them to shock me to make the switch

simianwords•50m ago
No one talking about how this flash Beats Pro? Imagine what 3.5 pro looks like?

Also concerned about Gemini models being benchmaxxed generally

NitpickLawyer•31m ago
> concerned about Gemini models being benchmaxxed generally

I would say they are the least benchmaxxed out of all the top labs, for coding. They've always been behind opus/gpt-xhigh for agentic stuff (mostly because of poor tool use), but in raw coding tasks and ability to take a paper/blog/idea and implement it, they've been punching above their benchmarks ever since 2.5. I would still take 2.5 over all the "chinese model beats opus" if I could run that locally, tbh.

hubraumhugo•50m ago
Just updated my HN Wrapped project with it and it does well on my totally unscientific LLM humor benchmark: https://hn-wrapped.kadoa.com
npn•46m ago
The price is crazy.

And I guess Gemini 3.5 pro will have the pricing increment, too. 12 x 5 = 60?

It seems like google does want us to use Chinese models.

GodelNumbering•46m ago
Per million input/output tokens:

Gemini 2.5 flash: $0.30/$2.50

Gemini 3.0 flash preview: $0.50/$3.00

Gemini 3.5 flash: $1.50/$9.00

Interesting pricing direction. I don't think we have ever seen a 3x price increase for the immediate next same-sized model (and lol @ 3 only ever getting a preview).

3.5 flash costs similar to Gemini 2.5 pro which was $1.25/$10
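The multipliers implied by that list can be computed directly. This is a small sketch using only the prices given in the comment above, with the model labels as written there.

```python
# USD per 1M tokens (input, output), as listed in the comment above.
PRICES = {
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 3.0 Flash preview": (0.50, 3.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
}


def ratio(newer: str, older: str) -> tuple[float, float]:
    """Return (input multiplier, output multiplier) of `newer` relative to `older`."""
    new_in, new_out = PRICES[newer]
    old_in, old_out = PRICES[older]
    return new_in / old_in, new_out / old_out


for older in ("Gemini 3.0 Flash preview", "Gemini 2.5 Flash", "Gemini 2.5 Pro"):
    in_x, out_x = ratio("Gemini 3.5 Flash", older)
    print(f"3.5 Flash vs {older}: {in_x:.2f}x input, {out_x:.2f}x output")
```

Against 3.0 Flash both prices are 3x; against 2.5 Flash the input price rises more steeply than the output price; against 2.5 Pro, input is slightly higher and output slightly lower, which matches the "costs similar to Gemini 2.5 pro" remark.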

dbbk•37m ago
I don't think they're really comparable. Seems they created the Flash-Lite tier to take the spot of the old Flash models.
GodelNumbering•32m ago
No, 2.5 had both flash and flash lite.
rudedogg•37m ago
If Google is actually getting cheaper inference than everyone else with their TPUs, this smells like trouble to me. Maybe serving LLMs at a profit is proving difficult.

Or maybe they think because their benchmarks are good they can ramp up the prices. Seems like they don’t have the market share to justify a move like that yet to me.

IncreasePosts•32m ago
Maybe the margins are just very large for Google because they predict so much demand for 3.5?
GodelNumbering•29m ago
This combined with locally runnable models getting pretty good recently (e.g. Qwen 3.6) tells me that it's time to seriously consider local dev setup again
MASNeo•12m ago
Besides the cost you get the control, transparency and ability to identify small language models or LoRA you want to serve even more cost effective.
tempaccount420•13m ago
This is not priced at inference cost.

My guess: it's the price at which they make more money than if they rent the TPUs to other companies.

The Gemini team has had trouble securing enough TPUs for their users' needs. They struggle with load and their rate limits are really bad. Maybe at a higher price, they have a better chance at getting more TPUs assigned?

fnordsensei•34m ago
3.5 flash is listed as stable rather than preview, or am I misreading?

https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flas...

GodelNumbering•33m ago
ah I mistakenly wrote preview
dr_dshiv•32m ago
3.1 flash lite — $0.25/$1.50 — plus insanely fast.

3.1 flash lite isn’t quite as good as 3 flash preview (which is the most incredible cheap model… I really love it) — but 3.1 is half the price and the insane speed opens up different use cases.

For comparison, Opus models are $5/$25

doginasuit•27m ago
They probably never intended to keep serving cheap models. This is a natural way to introduce the squeeze, now that they have people who built services on their API. It makes a lot of sense to have an abstraction layer where the provider doesn't matter. If you are working in Kotlin, Koog is excellent.
ilia-a•21m ago
Yeah, it is a massive jump in price, hardly a "Flash" model anymore... I wonder if they'll release flash lite or something with a bit more affordable price point.
LetsGetTechnicl•19m ago
Gen AI is unprofitable, especially at the insanely cheap rates they've been offering to get people in the door. So expect more increases in the future.
GaggiX•7m ago
If you don't need SOTA or near SOTA there are plenty of dirt cheap models, just look at Gemma 4 31B on Openrouter.
hei-lima•9m ago
We need another "Deepseek moment" or else it will become impossible for the regular dude to use AI. It will become something that only big companies can afford.
irthomasthomas•5m ago
And they are using this to power search answers?
photonair•4m ago
In general, Gemini flash is still relatively cheaper compared to the "mini" version of the other big 2. However, I agree that newer versions seem to have a multiple-X price increase (similar to the new ChatGPT) and we certainly need competition from the open source models to keep these guys in check with pricing.
llmslave•39m ago
Conspiracy theory:

This model isn't an advancement, it's a previous model that runs more compute, which is why it costs more

npn•36m ago
Nah, it costs what you are willing to pay.
golfer•39m ago
Arena.ai:

> Gemini 3.5 Flash’s pricing shifts the Pareto frontier in Text. 8 models from GoogleDeepMind dominate the Text Arena Pareto curve where only 4 labs are represented for top performance in their price tiers.

https://x.com/arena/status/2056793180998361233

andrewstuart•38m ago
The benchmark that matters - can it actually program as well as Claude code.

If not then I’m not using it.

Cancelled my account 3 months ago, only Claude code level capability would bring me back.

reconnecting•26m ago
Knowledge cutoff: January 2025

Latest update: May 2026

I have a very bad feeling about this lag.

hosel•18m ago
Can you explain what you mean?
nemomarx•15m ago
It might indicate core model training and pre training is really slowing down?
mixtureoftakes•3m ago
also parsing is harder + so much more of the new data is being generated by ai itself.

still the cutoff is very much concerning and inconvenient

stan_kirdey•19m ago
EXPENSIVE ._.
MASNeo•15m ago
Well, available for Gemini means these days that half the time they are “Receiving a lot of requests right now.” and so sorry they couldn’t complete the task. Luckily the model supports long time horizons because that’s what’s needed. /me likes Gemini a lot just wishing Google would add the compute!
simonw•14m ago
The pelican is a lot: https://github.com/simonw/llm-gemini/issues/133#issuecomment...

Not a great bicycle though, it forgot the bar between the pedals and the back wheel and weirdly tangled the other bars.

Expensive too - that pelican cost 13 cents: https://www.llm-prices.com/#it=11&ot=14403&sel=gemini-3.5-fl...

hedgehog•11m ago
That pelican looks like it's in Miami for a crypto conference.
hydra-f•6m ago
Same old issue with Gemini models trying to "enrich" everything
ralusek•12m ago
Those prices, what a disappointment.
mackross•7m ago
The antigravity teamwork-preview doesn't work for me -- upgraded to ultra, installed antigravity 2, ran teamwork-preview, keeps failing: "You have exhausted your capacity on this model. Your quota will reset after 0s."
jdw64•4m ago
Honestly, I feel like the new Gemini 3.5 Flash is a failure. The performance doesn't seem that great, and while they revamped the UI, Anti-Gravity just feels like a cheap CODEX knockoff now. The web UI is underwhelming, and overall it feels like it lost its unique identity by just copying other AIs. It’s a flop in both performance and price point. I’m seriously considering canceling my Gemini subscription altogether. Using Chinese AI models might actually be a better option at this point
