I still think ultimately (and somewhat sadly) Google will win the AI race due to its engineering talent and the sheer amount of data it has (and Android integration potential).
It may well be that they also didn't have a product culture as an organization, but were willing to experiment or let small teams do so.
It's still a lesson, but maybe a different one.
With organizational scale it becomes harder and harder to launch experiments under the brand. Red tape increases, outside scrutiny increases. Retaining the ability to do that is difficult.
Google does experiment a fair bit (including in AI; NotebookLM and its podcast feature are, I think, a standout example of trying to see what sticks) but they also tend to hide their experiments in developer portals nowadays, which makes it difficult to get a signal from a general consumer audience.
I feel like Google tried to solve for this with their `withgoogle.com` domain, and it just ends up being confusing or, worse still, frustrating when you see something awesome and then nothing ever comes of it.
Google's AI offering is a complete nightmare to use. Three different APIs, at least two different subscriptions, documentation that uses them interchangeably.
For Gemini's API it's often much simpler to actually pay OpenRouter the 5% surcharge to BYOK than deal with it all.
I still can't use my Google AI Pro account with gemini-cli..
Fair criticism that it took someone else to make something of the tech that Google initially invented, but Google is furiously experimenting with all their active products since Sundar's "code red" memo.
This is evident in Android and the Pixel lineup, which could be my favorite phone if not for some of the most baffling and frustrating decisions, which lead to a weirdly disjointed app experience (compared to something like iOS's first-party tools).
Like removing location-based reminders from Google Tasks, for some reason? There's still no built-in automation like Apple Shortcuts. Keep can still do location-based reminders, but it's a notes app, so which am I supposed to use, Google Tasks or Keep? Well, Gemini adds reminders to Google Tasks and not Keep, even if I wanted to use Keep primarily.
If they just spent some time polishing and integrating these tools, and added some of their ML magic, they'd blow Apple out of the water.
All of Google's tech is cool and interesting from a tech standpoint, but it's not well integrated for a full consumer experience.
It would be weird to release that as a serious company. They tried making a deliberately-wacky chatbot but it was not fun.
Letting OpenAI release it first was the right move.
I remember making a dead simple SvelteKit website back in the ChatGPT 3 days. It was good, it was mind-blowing, and I was proud of it.
The only interactivity was a button which would go from one color to another, and it would then lead to a PDF.
If I am going to be honest, the UI was genuinely good. It was great tho and still gives me more nostalgia and good vibes than current models. Em-dashes weren't that common in ChatGPT 3 iirc, but I have genuinely forgotten what it was like to talk to it.
It was one of the first things I tried when Claude Code went GA:
One of the biggest issues holding Gemini back, IMO, compared to the competitors.
Many LLMs are still plagued by "it's easier to reset the conversation than to unfuck the conversation", but Gemini 2.5 is among the worst.
simonw•1h ago
A few more in this genre:
https://x.com/cannn064/status/1973818263168852146 - "Make a SVG of a PlayStation 4 controller"
https://x.com/cannn064/status/1973415142302830878 "Create a single, self-contained HTML5 file that mimics a macOS Sonoma-style desktop: translucent menu bar with live clock, magnifying dock, draggable/resizable windows, and a dynamic wallpaper. No external assets; use inline SVG for icons."
https://x.com/synthwavedd/status/1973405539708056022 "Write full HTML, CSS and Javascript for a very realistic page on Apple's website for the new iPhone 18"
I've not seen it myself so I'm not sure how confident they are that it's Gemini 3.0.
ceejayoz•55m ago
Is this supposed to be a good example?
It looks like something I'd put together, and you don't want me doing design work.
ajcp•37m ago
diggan•32m ago
The only thing I've found to give me some sort of quantitative idea of how good a new model is, is my own private benchmarks. It doesn't cover everything I want to use LLMs for, and only has 20-30 tests per "category", but at least I'm 99% sure it isn't in the training datasets.
simonw•29m ago
I would be so entertained if I found out an AI lab had wasted their time cheating on my dumb benchmark!
ajcp•23m ago
Cue intro: "The gang wastes their time cheating on a dumb benchmark"
Imustaskforhelp•8m ago
But now I am worried that since you have shared that you do the SVG of an X riding a Y thing, maybe these models will try to cheat on the whole SVG of X riding Y thing instead of hyper-focusing on the pelican.
So now I suppose you might need to come up with an entirely new thing though :)
ajcp•21m ago
latemedium•4m ago