- Anyone actually able to use it? I get 'You've reached your rate limit. Please try again later'. (That said, I don't have a paid plan, but I've always had pretty much unlimited access to 2.5 pro)
[Edit: working for me now in ai studio]
I would like to try the model, wondering if it's worth setting up billing or waiting. At the moment trying to use it in AI Studio (on the Free tier) just gives me "Failed to generate content, quota exceeded: you have reached the limit of requests today for this model. Please try again tomorrow."
Edit: Now that I have access to Gemini 3 preview, I've compared the results of the same one shot prompts on the gemini app's 2.5 canvas vs 3 AI studio and they're very similar. I think the rumor of a stealth launch might be true.
How come I can't even see prices without logging in... are they doing regional pricing?
"Failed to generate content, quota exceeded: you have reached the limit of requests today for this model. Please try again tomorrow."
"You've reached your rate limit. Please try again later."
Update: as of 3:33 PM UTC, Tuesday, November 18, 2025, it seems to be enabled.
when i signed up for billing via cloud console and entered my credit card, i got $300 "free credits".
i haven't thrown a difficult problem at gemini 3 pro yet, but i'm sure i got to see it in some of the A/B tests in aistudio for a while. i could not tell which model was clearly better; one was always more succinct and i liked its "style", but they usually offered about the same solution.
I reckon it's an API key thing... you can more explicitly select a "paid API key" in AI Studio now.
> gemini-3-pro-preview-ais-applets
> gemini-3-pro-preview
For comparison:
Gemini 2.5 Pro was $1.25/M for input and $10/M for output.
Gemini 1.5 Pro was $1.25/M for input and $5/M for output.
They have started lock-in with Studio. I would say they are still in market-penetration mode, but stakeholders want to see a path to profit, so the price skimming is starting; it's just the beginning.
2026: cure cancer
Standard Context (≤ 200K tokens)
Input: $2.00 vs $1.25 (Gemini 3 Pro input is 60% more expensive vs 2.5)
Output: $12.00 vs $10.00 (Gemini 3 Pro output is 20% more expensive vs 2.5)
Long Context (> 200K tokens)
Input: $4.00 vs $2.50 (same +60%)
Output: $18.00 vs $15.00 (same +20%)
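A quick sanity check of the percentage deltas quoted above (prices are USD per million tokens, as listed in this thread):

```python
# Per-1M-token prices as quoted above (USD).
PRICES = {
    "gemini-3-pro":   {"std_in": 2.00, "std_out": 12.00, "long_in": 4.00, "long_out": 18.00},
    "gemini-2.5-pro": {"std_in": 1.25, "std_out": 10.00, "long_in": 2.50, "long_out": 15.00},
}

def pct_increase(new: float, old: float) -> float:
    """Percentage increase of `new` over `old`."""
    return (new - old) / old * 100

for tier in ("std_in", "std_out", "long_in", "long_out"):
    delta = pct_increase(PRICES["gemini-3-pro"][tier], PRICES["gemini-2.5-pro"][tier])
    print(f"{tier}: +{delta:.0f}%")  # std/long input: +60%, std/long output: +20%
```

The math checks out: both input tiers went up 60% and both output tiers went up 20%.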
For reference: gpt-5.1-thinking passes, gpt-5.1-instant fails, gpt-5-thinking fails, gpt-5-instant fails, sonnet-4.5 passes, opus-4.1 passes (lesser claude models fail).
This is a reminder that benchmarks are meaningless – you should always curate your own out-of-sample benchmarks. A lot of people are going to say "wow, look how much they jumped in x, y, and z benchmark" and start extrapolating about society and what this means for others. Meanwhile... I'm still wondering how they're still getting this problem wrong.
I just chatted with gemini-3-pro-preview about an idea I had and I'm glad that I did. I will definitely come back to it.
IMHO, the current batch of free and free-ish models are all perfectly adequate for my uses, which are mostly coding, troubleshooting, and learning/research.
This is an amazing time to be alive and the AI bubble doomers that are costing me some gains RN can F-Off!
it's easy to focus on what they can't do
Yeah I have my own set of tests and the results are a bit unsettling in the sense that sometimes older models outperform newer ones. Moreover, they change even if officially the model doesn't change. This is especially true of Gemini 2.5 pro that was performing much better on the same tests several months ago vs. now.
GPT-4/o3 might be the best we will ever have
No they’re not. Maybe you mean to say they don’t tell the whole story or have their limitations, which has always been the case.
> my fairly basic python benchmark
I suspect your definition of "basic" may not be consensus. GPT-5 Thinking is a strong model for basic coding, and it'd be interesting to see a simple Python task it reliably fails at.
Is the only thing that prevents a benchmark from being meaningful publicity?
I think my benchmark is well designed. It's well designed because it's a generalization of a problem I've consistently had with LLMs on my code. Insofar as it encapsulates my coding preferences and communication style, it's the proper benchmark for me.
My benchmark says that I will stick with Codex CLI for the foreseeable future.
"Create me a SVG of a pelican riding on a bicycle"
So impressive it makes you wonder if someone has noticed it being used as a benchmark prompt.
https://pbs.twimg.com/media/G6CFG6jXAAA1p0I?format=jpg&name=...
Also, the full document:
https://archive.org/details/gemini-3-pro-model-card/page/n3/...
The questions AND the answers are public.
If the LLM manages through reasoning OR memory to repeat back the answer then they win.
The scores represent the % of correct answers they recalled.
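As described above, the score is just the fraction of published answers the model repeats back (whether by reasoning or by memorization). A minimal sketch of that scoring, with exact-match comparison as an illustrative assumption:

```python
def recall_score(expected: dict, produced: dict) -> float:
    """Fraction of questions where the model's answer matches the
    published answer (exact match, case/whitespace-insensitive)."""
    correct = sum(
        1 for q, a in expected.items()
        if produced.get(q, "").strip().lower() == a.strip().lower()
    )
    return correct / len(expected)

# Hypothetical example: two public Q/A pairs, one recalled correctly.
answers = {"q1": "Paris", "q2": "42"}
outputs = {"q1": "paris", "q2": "41"}
print(recall_score(answers, outputs))  # 0.5
```

The point of the complaint stands either way: when both questions and answers are public, a high score can't distinguish reasoning from recall.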
The person also claims that with thinking on, the gap narrows considerably.
We'll probably have 3rd party benchmarks in a couple of days.
- GPT-5 medium is the best
- GPT-5.1 falls right between Gemini 2.5 Pro and GPT-5 but it’s quite a bit faster
Really wonder how well Gemini 3 will perform
Temperature continues to be gated to maximum of 0.2, and there's still the hidden top_k of 64 that you can't turn off.
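For anyone unfamiliar with the setting: top_k restricts sampling to the k highest-scoring tokens before the softmax, so a hidden top_k of 64 means the tail of the distribution is silently discarded no matter what you set elsewhere. A minimal illustrative sketch of the filtering step (not Google's implementation):

```python
import math

def top_k_filter(logits, k):
    """Keep only the k largest logits; mask the rest to -inf so
    softmax assigns them zero probability."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]

def softmax(logits):
    m = max(x for x in logits if x != -math.inf)
    exps = [math.exp(x - m) if x != -math.inf else 0.0 for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5, -1.0, -3.0]
probs = softmax(top_k_filter(logits, k=2))
# Only the two largest logits keep nonzero probability;
# the other three tokens can never be sampled.
```

With k=64 on a vocabulary of ~250K tokens, everything outside the top 64 candidates is unreachable, which is why not being able to turn it off matters for tasks that want a flatter distribution.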
I love the google AI studio, but I hate it too for not enabling a whole host of advanced features. So many mixed feelings, so many unanswered questions, so many frustrating UI decisions on a tool that is ostensibly aimed at prosumers...
Gemini 3 however, while still failing, at least recognized the 5th leg, but thought the dog was... well endowed. The 5th leg, however, is clearly a leg, despite being where you would expect the dog's member to be. I'll give it half credit for at least recognizing that there was something there.
Still though, there is a lot of work that needs to be done on getting these models to properly "see" images.
https://old.reddit.com/r/wallstreetbets/comments/1oz6gjp/new...
It's transcribing the meeting but hallucinates badly... both in fast and thinking mode. Fast mode only transcribed about a fifth of the meeting before saying it's done. Thinking mode completely changed the topic and made up ENTIRE conversations. Gemini 2.5 actually transcribed it decently, with just occasional missteps when people talked over each other.
I'm concerned.