The "AI Overview" is often sufficient and is served very quickly. (Sometimes nearly instant. I assume Google is caching responses for common searches).
"Deep Mode" is just one click away. And the responses are much, much faster. A question that might take 10 or 15 seconds in ChatGPT (with the default GPT5) takes <1 second to first token with Google. And then remaining tokens stream in at a noticeably faster rate.
Is Google just throwing more hardware at this than OpenAI?
Or playing other tricks to look faster? (E.g., using a smaller, faster, non-reasoning model to serve the opening of the response while a slower reasoning model works on the more detailed remainder.)
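That second trick is easy to prototype. A rough Python sketch with stand-in model functions (fast_model and reasoning_model are made up, not any real API): stream the small model's tokens immediately, then hand off once the slow model's answer lands.

    # Sketch of the "fast model first" handoff, with fake models.
    import asyncio

    async def fast_model(prompt: str):
        # Stand-in for a small, non-reasoning model: low latency, shallow answer.
        for tok in f"Quick take on '{prompt}': ...".split():
            await asyncio.sleep(0.05)   # fast time-to-first-token
            yield tok

    async def reasoning_model(prompt: str) -> str:
        await asyncio.sleep(2.0)        # long "thinking" delay before any output
        return f"Detailed, reasoned answer to '{prompt}'."

    async def respond(prompt: str):
        deep = asyncio.create_task(reasoning_model(prompt))
        async for tok in fast_model(prompt):
            print(tok, end=" ", flush=True)
            if deep.done():             # hand off as soon as the deep answer lands
                break
        print("\n" + await deep)

    asyncio.run(respond("why is the sky blue"))

Either way the user sees sub-second time-to-first-token; only the tail of the response waits on the reasoning model.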
Web search tool calls are much faster too, presumably backed by Google's nearly 30 years of web-search infrastructure.