My article is an architecture breakdown of how Exa AI built web search that's better than Google, plus a napkin-math estimate of the bare minimum cost of building web-scale search today.
Please check it out if you're curious about:
- How modern AI search engines like Exa, Perplexity, and Parallel Web Systems operate under the hood.
- Learning napkin-math-style estimation (a technique popularized by legends like Jeff Dean)
- How vector compression tricks like matryoshka embeddings + binary quantization change the economics of billion-scale search
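
To give a feel for that last point, here is a rough napkin-math sketch of raw vector storage. The corpus size (1 billion vectors), the 1024-dimension float32 baseline, the truncation to 256 dimensions, and the helper name `vector_bytes` are all illustrative assumptions, not figures from the article or from any of the companies mentioned.

```python
# Napkin math: raw vector storage only (no index/graph overhead, no replicas).
# All constants below are illustrative assumptions.

def vector_bytes(num_vectors: int, dims: int, bits_per_dim: int) -> int:
    """Bytes needed to store num_vectors vectors of dims components each."""
    return num_vectors * dims * bits_per_dim // 8

NUM_VECTORS = 1_000_000_000   # assumed corpus size: one vector per document
FULL_DIMS = 1024              # assumed full embedding size
TRUNCATED_DIMS = 256          # assumed matryoshka truncation (keep leading dims)

full = vector_bytes(NUM_VECTORS, FULL_DIMS, 32)             # float32 baseline
truncated = vector_bytes(NUM_VECTORS, TRUNCATED_DIMS, 32)   # matryoshka only
binary = vector_bytes(NUM_VECTORS, TRUNCATED_DIMS, 1)       # matryoshka + 1 bit/dim

GB = 10**9
for label, size in [
    ("float32, full dims", full),
    ("float32, truncated", truncated),
    ("binary, truncated", binary),
]:
    print(f"{label:20s} {size / GB:8,.0f} GB  ({full // size}x smaller than baseline)")
```

Under these assumptions the baseline is about 4 TB, while the truncated binary vectors take about 32 GB, which is small enough to hold in RAM on a single machine. Recall loss and any rescoring with higher-precision vectors are ignored here.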
Please take my estimates with a grain of salt since my goal is to just get in the right ballpark. Also feel free to comment/DM if you see any wrong or suboptimal assumptions :)
kshivendu•9m ago