
Ask HN: How does ChatGPT decide which websites to recommend?

5•nworley•21h ago
For years, SEO has meant optimizing for Google’s crawler.

But increasingly, discovery seems to be happening somewhere else:

- ChatGPT
- Claude
- Perplexity
- AI-powered search and assistants

These systems don’t “rank pages” the same way search engines do. They select sources, summarize them, and recommend them directly.

What surprised me while digging into this:

- AI models actively fetch pages from sites (sometimes user-triggered, sometimes system-driven)
- Certain pages get repeatedly accessed by AI while others never do
- Mentions and recommendations seem to correlate more with contextual coverage and source authority than traditional keyword targeting

The problem is that this entire layer is invisible to most builders.

Analytics tools show humans. SEO tools show Google. But AI traffic, fetches, and mentions are basically a black box.

I started thinking about this shift as: GEO (Generative Engine Optimization) or AEO (Answer Engine Optimization)

Not as buzzwords, but as a real change in who we’re optimizing for.

To understand it better, I ended up building a small internal tool (LLMSignal) just to observe:

- when AI systems touch a site
- which pages they read
- when a brand shows up in AI responses
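To give a flavor of what the first of those signals looks like in practice: "when AI systems touch a site" can be approximated from ordinary access logs. A minimal sketch, assuming nginx/Apache combined log format; the user-agent substrings below are the crawler names the vendors have published (GPTBot, ClaudeBot, PerplexityBot, etc.), but they change over time, so verify against current docs:

```python
import re

# Publicly documented AI crawler user-agent substrings (these change;
# verify against each vendor's current documentation).
AI_AGENTS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot",
             "ClaudeBot", "PerplexityBot"]

# Combined log format: ... "GET /path HTTP/1.1" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[^"]*" \d+ \d+ "[^"]*" "(?P<ua>[^"]*)"'
)

def ai_fetches(log_lines):
    """Yield (path, agent) for each request made by a known AI user agent."""
    for line in log_lines:
        m = LOG_RE.search(line)
        if not m:
            continue
        for agent in AI_AGENTS:
            if agent in m.group("ua"):
                yield m.group("path"), agent
                break
```

Grep-level stuff, but it already answers "which pages do AI systems actually read, and how often."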

The biggest takeaway so far: If AI is becoming a front door to the internet, most sites have no idea whether that door even opens for them.

Curious how others here are thinking about:

- optimizing for AI vs search
- whether SEO will adapt or be replaced
- how much visibility builders should even want into AI systems

Not trying to sell anything — genuinely interested in how people here see this evolving.

Comments

theorchid•15h ago
This is especially important when launching new SaaS projects. Google does not trust new domains for the first 6-12 months. But if you publish information about your project on other sites, the AI will recommend your site in its responses. Just post a few times on Reddit, and in a week, GPT will be giving out links to your SaaS product.

AI doesn't need exact low-frequency or high-frequency keywords like SEO does. AI is good at understanding user queries and giving out the right SaaS that solves the user's problem. You don't need to create a blog on your website and try to rank it in search engines. It is enough to post articles on other websites with information about your project.
nworley•11h ago
This matches a lot of what I’ve been seeing too.

What stood out to me is that AI seems far less concerned with domain age than Google is. If there’s enough contextual discussion around a product (e.g. Reddit threads, blog posts, docs, comparisons), then AI models seem willing to surface it surprisingly early.

That said, what I’m still trying to understand is consistency. I’ve seen cases where a product gets recommended heavily for a week, then effectively disappears unless that external context keeps getting reinforced.

So it feels less like “rank once and you’re good” (SEO) and more like “stay present in the conversation.” Almost closer to reputation management than classic content marketing.

Curious if you’ve seen the same thing, especially around how long external mentions keep influencing AI recommendations before they decay.

theorchid•15h ago
However, there is a lack of information when a user opens your website after interacting with AI.

Google Search Console shows the user's query if the query is popular enough and your website appears in the search results. Bing shows all queries, even unpopular ones, as long as your website appears in the results.

But if AI recommends your website when answering people's questions, you cannot find out what questions the user asked, how many times your website was shown, or in what position. You can see the UTM tag in your website analytics (for example, ChatGPT adds a utm_source parameter), but that is the maximum amount of information available to you. And if a user discussed a question with AI, got only your brand name, and then found your site through a search engine, you won't be able to tell that AI's advice led them to you.
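A minimal sketch of how far that gets you, with hypothetical utm_source values and referrer hosts (the exact strings each assistant appends vary, so check them against your own analytics):

```python
from urllib.parse import urlparse, parse_qs

# Illustrative values only: e.g. ChatGPT has been seen appending
# utm_source=chatgpt.com to outbound links. Verify in your own logs.
AI_UTM_SOURCES = {"chatgpt.com", "perplexity", "copilot"}
AI_REFERRER_HOSTS = {"chatgpt.com", "perplexity.ai", "claude.ai"}

def classify_visit(landing_url, referrer=""):
    """Return 'ai-direct', 'ai-referral', or 'unknown' for one pageview."""
    qs = parse_qs(urlparse(landing_url).query)
    if any(src in AI_UTM_SOURCES for src in qs.get("utm_source", [])):
        return "ai-direct"      # the assistant tagged the link itself
    host = urlparse(referrer).hostname or ""
    if any(host == h or host.endswith("." + h) for h in AI_REFERRER_HOSTS):
        return "ai-referral"    # browser sent an AI-host referrer
    return "unknown"            # includes the brand-name-then-search case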

nworley•10h ago
This is exactly what set me off trying to figure out the visibility gap.

What’s strange is that we’re moving into a world where recommendations matter more than clicks, but attribution still assumes a traditional search funnel. By the time someone lands on your site, the most important decision may have already happened upstream, and you have no idea it did.

The UTM case you mentioned is a good example: it only captures direct "AI to site" clicks, but misses scenarios where AI influences the decision indirectly (brand mention → later search → visit). From the site’s perspective, though, that traffic looks indistinguishable from organic search. It makes me wonder whether we’ll need a completely new mental model for attribution here. Perhaps less about “what query drove this visit” and more about “where did trust originate.”

Not sure what the right solution is yet, but it feels like we’re flying blind during a pretty major shift in how people discover things.

quiqueqs•9h ago
This is why most of these AI search visibility tools focus on tracking many possible prompts at once. LLMs give 0 insight into what users are actually asking, so the only thing you can do is put yourself in the user’s shoes and try to guess what they might prompt.

Disclaimer: I've built a tool in this space (Cartesiano.ai), and this view mostly comes from seeing how noisy product mentions are in practice. Even for market-leading brands, a single prompt can produce different recommendations day to day, which makes me suspect LLMs are also introducing some amount of entropy into product recommendations (?)

nworley•2h ago
I don’t think there’s a clean solution yet, but I’m not convinced brute-force prompt enumeration scales either, given how much randomness is baked in. I guess that’s why I’ve started thinking about this less as prompt tracking and more as signal aggregation over time: looking at repeat fetches, recurring mentions, and which pages/models seem to converge on the same sources. It doesn’t tell you what the user asked, but it can hint at whether your product is becoming a defensible reference versus a lucky mention.
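To make "signal aggregation" concrete, here's a toy sketch with made-up helper names, assuming you've already sampled each prompt repeatedly over time and stored the responses:

```python
def mention_rates(samples, brand):
    """samples: {prompt: [response_text, ...]} from repeated runs of the
    same prompts. Returns {prompt: fraction of runs mentioning brand}."""
    rates = {}
    for prompt, responses in samples.items():
        hits = sum(brand.lower() in r.lower() for r in responses)
        rates[prompt] = hits / len(responses) if responses else 0.0
    return rates

def stable_mentions(samples, brand, threshold=0.8):
    """Prompts where the brand shows up consistently, not as a one-off.
    The threshold is arbitrary; tune it against your own noise floor."""
    return [p for p, r in mention_rates(samples, brand).items() if r >= threshold]
```

Substring matching on brand names is crude (it misses paraphrases and aliases), but even this separates "recommended in 9 of 10 runs" from "appeared once last Tuesday," which is the distinction that matters given the day-to-day entropy you describe.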

As someone who’s built a tool in this space yourself, have you seen any patterns that cut through the noise? Or is entropy just something we have to design around?

Disclaimer: I've built a tool in this space as well (llmsignal.app)

marcwajsberg•9h ago
The attribution point is huge: the “decision” can happen in the model’s answer, and your analytics only see the last hop.

A practical mental model for recommendations is less “ranking” and more confidence:

- Does the model have enough context to map your product to a problem?
- Are there independent mentions (docs, comparisons, forum threads) that look earned vs manufactured?
- Is there procedural detail that makes it easy to justify recommending you (“here’s the workflow / constraints / outcomes”)?

For builders, a good AEO baseline is:

- Publish a strong docs/use-case page that answers “when should I use this vs alternatives?”
- Seed real-world context by participating in existing discussions (HN/Reddit/etc.) with genuine problem-solving and specifics.
- Track influence with repeatable prompt tests + lightweight surveys (“how did you hear about us?”) since last-click won’t capture it.

It feels like early SEO again: less perfect instrumentation, more building the clearest and most defensible reference for your category.

allinonetools_•6h ago
I have noticed something similar while building small tools — AI recommendations seem to favor clarity and “can this answer a real task fast” over classic SEO signals. Pages that explain what they do plainly and work without friction show up more often than heavily optimized ones.
nworley•2h ago
You bring up something I've been trying to figure out as well. It feels like AI favors pages that give a direct, honest answer and then make that answer immediately available. Plain, readable HTML with no friction seems to perform well. If the intent is obvious at fetch time, that seems to matter more than how “optimized” the page is. It feels less like SEO and more like “can this page be understood immediately.”

If the initial response is basically <div id="root"></div> + a big JS bundle, you’re betting the crawler will execute it, and I’m not convinced they consistently do. Curious if you’ve run into that too? Have you seen AI recommendations skew toward SSR/static pages vs client-rendered apps, even when the content is technically “there” once the JS runs?
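A rough heuristic for the client-rendered-shell case: check how much visible text the raw initial HTML contains before any JS runs. This is a sketch with made-up thresholds, not what any crawler actually does:

```python
import re

def visible_text_ratio(html):
    """Rough share of the raw HTML that is visible text: drop script/style
    blocks and tags, then compare remaining text length to total length.
    A client-rendered shell (<div id="root"></div> + bundle) scores near zero."""
    body = re.sub(r"(?is)<(script|style)[^>]*>.*?</\1>", " ", html)
    text = " ".join(re.sub(r"(?s)<[^>]+>", " ", body).split())
    return len(text) / max(len(html), 1)

def looks_crawler_readable(html, min_ratio=0.05, min_chars=200):
    """Heuristic: is there enough pre-rendered text for a non-JS fetcher?
    Both thresholds are arbitrary starting points."""
    ratio = visible_text_ratio(html)
    return ratio >= min_ratio and ratio * len(html) >= min_chars
```

Running this against your own pages (curl the URL, feed the body in) at least tells you what a fetch-time-only reader would see, which is the bet you're describing.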
raw_anon_1111•5h ago
Even if I did know, the last thing the world needs is for SEO folks to figure out how to game LLMs if they haven’t already.

SEO has made web search unusable and practitioners are the scum of the earth.

But more practically, as Raymond Chen said: if every app could figure out how to keep its window always on top, what good would it do? The same goes for SEO.

nworley•2h ago
Right there with you: SEO has evolved to a place that incentivizes a lot of bad behavior, and the end result made search worse for everyone. I’m personally less interested in “gaming” LLMs than in understanding what they already do. From my side, this feels closer to observability than optimization: trying to see whether AI systems are even reading or understanding a site, not how to trick them into ranking something low quality.

The Raymond Chen analogy brings up something interesting. If everyone forces themselves on top, the signal collapses. My hope is that AI systems end up rewarding genuinely useful, well-explained things rather than creating another arms race... but I’m not naive about how incentives tend to play out.

A huge concern of mine has been the introduction of ads. Once ads enter LLM responses, it’s hard not to ask whether we’re just rebuilding the same incentive structure that broke search in the first place.
