Gemini 3 Flash – Everything you need to know

https://artificialanalysis.ai/articles/gemini-3-flash-everything-you-need-to-know
1•Topfi•1h ago

Comments

Topfi•1h ago
That worst-in-class hallucination rate, coupled with a massive output token count that makes the benchmark run more expensive than models such as Haiku 4.5 despite a cheaper per-million-token cost, is really disappointing. Both align with some personal testing of mine, not to mention the initial experience I commented on yesterday in the announcement thread.

I have a hard time understanding the significant positive sentiment, considering how strongly the performance I am seeing deviates from the published benchmark results. 3 Flash is almost Grok-level in this regard, which is very disappointing for Google. Speed and cost are not an edge either: Kimi K2, for example, does not overly abuse its reasoning budget, so it comes out cheaper in real-world testing and reliably hits the same or higher throughput, depending on the provider. Maybe I am underestimating how many users' real-life use cases involve solving ARC-AGI games or answering questions from public databases that are impossible to keep out of the training data...

Scroll down to "Cost to Run Artificial Analysis Intelligence Index" for a per-run cost comparison between 3 Flash, Kimi K2 Thinking, and Haiku 4.5, with 3 Flash being almost twice as expensive as Haiku 4.5: https://artificialanalysis.ai/?models=gemini-3-flash-reasoni...
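
To make the per-token versus per-run point concrete, here is a rough sketch of the arithmetic. The token counts and prices are made-up placeholders, not measured figures for 3 Flash, Haiku 4.5, or any other model, and `run_cost_usd` is just a hypothetical helper name. It also ignores input tokens and caching for simplicity.

```python
def run_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost of one benchmark run from total output tokens and the per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Placeholder numbers for illustration only, NOT real prices or token counts.
# Model A: cheaper per token, but emits far more output (reasoning) tokens.
model_a = run_cost_usd(output_tokens=100_000_000, price_per_million_usd=3.0)
# Model B: pricier per token, but much terser.
model_b = run_cost_usd(output_tokens=40_000_000, price_per_million_usd=5.0)

print(f"Model A run cost: ${model_a:.2f}")
print(f"Model B run cost: ${model_b:.2f}")
```

With these placeholder inputs, the model with the lower per-token price still ends up with the higher bill for the run, which is the effect the per-run chart captures and a per-million-token price list hides.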