Launch HN: Sitefire (YC W26) – Automating actions to improve AI visibility

19•vincko•1h ago

Hi HN! We're Vincent and Jochen from sitefire (https://sitefire.ai). Our platform makes it easy for brands to improve their visibility in AI search.

We’ve been working together for years and have backgrounds in RL/optimization at Stanford and software engineering. We came to this idea after speaking with marketing teams who were seeing declining traffic due to Google’s AI Overviews and didn’t know what to do.

This space can feel esoteric. Many case studies, few actual studies. Constant battle against myths (e.g. you need a llms.txt vs. you don't need a llms.txt) and "GEO hacks". We try to be more data-driven. And we try to be more bold and build a system that not only monitors, but actually improves traffic from AI search.

While Google performs a single search, AI search engines expand the user prompt into 3-10 fan-out queries. The sourced pages are ranked using a classified algorithm similar to Reciprocal Rank Fusion (RRF). Finally, the LLMs skim the pages and decide what snippets to cite. Our goal is making sure brands have the right content that makes it through this funnel.
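
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the textbook version, not the actual ranking used by any AI search engine (that is not public). The URLs, the `reciprocal_rank_fusion` helper, and the constant `k=60` (a commonly used default) are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of URLs into one ranking.

    Each URL scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical search results for three fan-out queries of one prompt.
fanout_results = [
    ["a.example/pricing", "b.example/review", "c.example/blog"],
    ["b.example/review", "a.example/pricing", "d.example/guide"],
    ["b.example/review", "c.example/blog", "a.example/pricing"],
]

fused = reciprocal_rank_fusion(fanout_results)
print(fused)
```

A page that shows up near the top for several fan-out queries beats a page that ranks first for only one, which is why coverage across the fan-out matters.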

Here is how sitefire works:

- The user defines a set of prompts they want to monitor. These are synthetic prompts - we generate them based on SEO keywords and their monthly search volume.

- We submit these prompts to ChatGPT, Gemini, Google AI Mode, etc. on a daily basis and capture the answers. We extract fan-out queries, sourced pages, citations, and brand mentions.

- For each topic, our agents analyze which web pages are sourced and cited the most, and why. They also consider similar pages that you already have.

- Based on the diagnosis, our content agents draft improvements or create new pages, and push them directly to the client’s CMS.

- We integrate with the client’s network logs and Google Analytics to monitor the increase in AI bot requests and human referrals to their page.
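
To illustrate the last step, here is a rough sketch of counting AI bot requests per day from web server access logs. It assumes the common "combined" log format and matches on a short, incomplete list of user-agent tokens; the token list, the `count_ai_bot_requests` helper, and the sample log lines are illustrative assumptions, not how sitefire's integration works.

```python
import re
from collections import Counter

# Assumed, incomplete list of crawler user-agent tokens.
AI_BOT_TOKENS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")

# Combined log format: ip - - [day:time zone] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'\[(?P<day>[^:]+):[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def count_ai_bot_requests(lines):
    """Return a Counter keyed by (day, bot token)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        user_agent = match.group("ua")
        for token in AI_BOT_TOKENS:
            if token in user_agent:
                counts[(match.group("day"), token)] += 1
                break
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.5 - - [01/Jan/2026:10:00:00 +0000] "GET /blog/a HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2026:10:05:00 +0000] "GET /blog/b HTTP/1.1" 200 734 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [01/Jan/2026:11:00:00 +0000] "GET /pricing HTTP/1.1" 200 901 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.2 - - [01/Jan/2026:11:30:00 +0000] "GET / HTTP/1.1" 200 1200 "-" "Mozilla/5.0 (Macintosh) Safari/605.1.15"',
]

bot_counts = count_ai_bot_requests(sample_log)
for (day, token), n in sorted(bot_counts.items()):
    print(day, token, n)
```

User-agent strings can be spoofed, so a real setup would likely also verify the published IP ranges of each crawler.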

This system is continuously updated, so it always shows which content works, and how to adapt the existing sitemap. For one client that used sitefire to optimize their blog, the AI-optimized articles increased their AI bot requests from ~200/day to ~570/day within ten days.

A risk we recognize is that AI-generated content is filling brands’ websites with slop. Whilst it’s still early days and we don’t claim to have figured everything out yet, our intention is to mitigate this by focusing the content on specific, unique information: real product capabilities, real pricing, honest comparisons. The clients still review every page before it goes live, so they can ensure the content is true to their brand.

Some clients use our platform themselves. For others we act more like an agency, automating steps as we go. The goal is for sitefire to run mostly on its own, with clients approving changes via Slack, Claude or their CMS.

Here's a video demo: https://screen.studio/share/fw7VQQak

If you'd like to try what we've built so far, sign up at https://sitefire.ai.

Comments

yunyu•1h ago

What do you guys do differently than Profound or Airops?

debarshri•1h ago

Add peec to that list.

vincko•1h ago

True, it is very competitive.

Our view on Peec is that it is an analytics solution. They recently did launch an actions feature. But they do not take any actions (yet). Creating content takes a lot of resources. And agencies are expensive.

As an analytics solution it is a good option.

methyl•5m ago

And Surfer, the OG content optimization platform.

vincko•1h ago

That's a super valid question, we get it a lot. There are a lot of overlaps.

In our view Profound and Airops are aimed at existing marketing teams. Our goal is to be more hands-off, so you don't need a team. With many of our clients we act more like an agency, communicating via Slack and automating step by step. That's the experience we want to create. We aren't there yet though.

Gobhanu•1h ago

how do you track where users are coming from?

vincko•1h ago

We currently simply integrate with your Google Analytics and filter by Source. This tends to be a lower bound, since it's not always set correctly. Coming from some of the native apps, users might be categorized as direct visitors.

There are other data sources we want to enable in the future like Cloudflare.
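
As a rough illustration of the "filter by Source" idea, assuming you have exported session rows with a `source` field from your analytics tool: the domain list, field name, and sample rows below are hypothetical, and the result is a lower bound for the reason given above.

```python
# Assumed, incomplete list of referrer domains for AI assistants.
AI_SOURCES = {"chatgpt.com", "perplexity.ai", "gemini.google.com", "copilot.microsoft.com"}


def classify_source(source):
    """Classify a session source as 'ai', 'direct', or 'other'."""
    s = source.strip().lower()
    if s in ("", "(direct)"):
        return "direct"
    # Match the domain itself or any subdomain of it.
    if any(s == domain or s.endswith("." + domain) for domain in AI_SOURCES):
        return "ai"
    return "other"


# Made-up exported rows for illustration only.
sessions = [
    {"source": "chatgpt.com"},
    {"source": "(direct)"},
    {"source": "google"},
    {"source": "www.perplexity.ai"},
]

ai_sessions = sum(1 for row in sessions if classify_source(row["source"]) == "ai")
print(f"AI-referred sessions (lower bound): {ai_sessions}")
```

Visits from native apps that drop the referrer land in the "direct" bucket, so they are not counted here.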

ceejayoz•1h ago

Ugh. The worst of SEO, but a bunch more of it? Noooooo.

vincko•1h ago

I get it, there is a lot of worry about slop.

We think about it like this: all of these agents will be most useful to users if they provide valuable answers. So they will be looking for valuable content for grounding their answer.

There are exploits, you can overfit on whatever they currently use as an objective function. But those tend to be temporary. So in the long run, valuable content will win. That's what we aim to create. It's a fine line.

ceejayoz•1h ago

> all of these agents will be most useful to users if they provide valuable answers

This is a bald assertion.

vincko•1h ago

Do you doubt the statement on how to maximize usefulness? Or do you mean that the companies behind the models might not optimize (exclusively) for usefulness to the user?

I do share doubts about the latter.

ceejayoz•1h ago

> Do you doubt the statement on how to maximize usefulness?

Yes; the customer here is the site using it, not Google end users, who'll tend to accept whatever's the top search result even if it's deeply wrong or complete slop.

The wellbeing of search users isn't really the priority here, right?

vincko•19m ago

Yes, that is correct. We help the brands, not the end user.

Let me try to rephrase the line of thinking:

To maximize value to the end user, the models generally aim to be helpful. The companies building these models are incentivized to make the model use helpful content.

Our goal is to be aligned with their objective function long term. And that incentivizes us to create helpful content.

Not all of this is a given. We don't know for sure how it will play out. There will always be ways to game the system. But we think those will get fixed over time.

a13n•1h ago

Please don't override the browser's default scroll behavior. It's so jarring and basically never a good idea.

vincko•1h ago

Thank you for the feedback. We'll launch our new site soon where this is fixed.

onecommit•1h ago

How do models currently assess the quality and accuracy/veracity of content when recommending products? What do the providers do to avoid a situation where more content === more traffic? Would love to see links to relevant research on this, if you have them. Much success to you, and I appreciate your awareness of the AI slop risk.

vincko•30m ago

There is the preselection, which depends on the fan-out queries the model comes up with and the content's performance across those queries on the search index.

After that, the content is actually assessed by the model. This paper tried different strategies to improve performance for this last step: https://arxiv.org/pdf/2311.09735. Adding statistics, sources, and original data are all strategies that we apply.

In classic SEO, creating more and more content leads to "cannibalization". Generally this hurts performance of all overlapping content so much that it is not worth it.

vahar•34m ago

Regarding the topic of ambient agents, what’s the impact of your product? It’s hard for me to imagine the impact, but I guess it must be a necessity if we have ambient agents to get discovered at all, right? Nice to see a player from Europe on the market too!

vincko•5m ago

Do you mean agents not answering short specific user prompts?

For those types of agents, prompt tracking is less accurate since the context of the queries is so large. But it's still relevant to understand what web searches they tend to perform and if you do show up in those.

That's another reason why we want to integrate other data sources, especially network logs.
