It's pretty much the same process I would use in an unfamiliar code base. Just ctrl+f the file system till I find the right starting point.
(Well, I didn't overcome my laziness directly. I just switched from being lazy and not setting up Vim and Emacs with the integrations, to trying out VS Code, where this was trivial or already built in.)
These corpora have a high degree of semantic ambiguity, among other tricky issues that are difficult to alleviate.
Other types of text are far more amenable to RAG and some are large enough that RAG will probably be the best approach for a good while.
For example: maintenance manuals and regulation compendiums.
LLMs have a similar issue with their context windows. Go back to GPT-2, with its 1,024-token context, and you couldn't have loaded even a modest text file into its memory. Slowly the memory is increasing, just as it did for early computers.
So if one were building, say, a memory system for an AI chatbot, how would you save all the data related to a user? Mother's name, favorite meals, allergies? If not a vector database like Pinecone, then what? Just a big .txt file per user?
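For what it's worth, the "big .txt file per user" option is less silly than it sounds, and it fits the grep-centric workflow this thread is about. A minimal sketch, assuming a flat append-only text file per user (the paths and fact format are invented):

```python
from pathlib import Path

MEMORY_DIR = Path("user_memory")  # hypothetical location, one file per user

def remember(user_id: str, fact: str) -> None:
    """Append one plain-text fact to the user's memory file."""
    MEMORY_DIR.mkdir(exist_ok=True)
    with open(MEMORY_DIR / f"{user_id}.txt", "a", encoding="utf-8") as f:
        f.write(fact + "\n")

def recall(user_id: str) -> str:
    """Load the whole file; for most users it stays small enough to paste into the prompt."""
    path = MEMORY_DIR / f"{user_id}.txt"
    return path.read_text(encoding="utf-8") if path.exists() else ""

remember("alice", "mother's name: Maria")
remember("alice", "allergy: peanuts")
print(recall("alice"))  # goes straight into the system prompt
```

Once the file outgrows the prompt, the agent can grep it instead of loading it whole, which is exactly the tradeoff the rest of the thread argues about.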
Grep works great when you have thousands of files on a local filesystem that you can scan in milliseconds. But most enterprise RAG use cases involve millions of documents across distributed systems. Even with 2M token context windows, you can't fit an entire enterprise knowledge base into context. The author acknowledges this briefly ("might still use hybrid search") but then continues arguing RAG is obsolete.
The bigger issue is semantic understanding. Grep does exact keyword matching. If a user searches for "revenue growth drivers" and the document discusses "factors contributing to increased sales," grep returns nothing. This is the vocabulary mismatch problem that embeddings actually solve. The author spent half the article complaining about RAG's limitations with this exact scenario (his $5.1B litigation example), then proposes grep as the solution, which would perform even worse.
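To make the mismatch concrete, here is a toy comparison. The model name is just a common sentence-transformers default, used purely for illustration:

```python
from sentence_transformers import SentenceTransformer, util

query = "revenue growth drivers"
doc = "factors contributing to increased sales"

# grep-style exact matching: the phrases share no keywords, so no hit at all
print(any(word in doc for word in query.split()))  # False

# embedding similarity: the two phrases score as near-paraphrases
model = SentenceTransformer("all-MiniLM-L6-v2")
q_emb, d_emb = model.encode([query, doc])
print(util.cos_sim(q_emb, d_emb))  # noticeably higher than for unrelated text
```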
Also, the claim that "agentic search" replaces RAG is misleading. Recent research shows agentic RAG systems embed agents INTO the RAG pipeline to improve retrieval; they don't replace chunking and embeddings. LlamaIndex's "agentic retrieval" still uses vector databases and hybrid search, just with smarter routing.
Context windows are impressive, but they're not magic. The article reads like someone who solved a specific problem (code search) and declared victory over a much broader domain.
From there the model can handle 100–200 full docs and jot notes into a markdown file to stay within context. That’s a very different workflow than classic RAG.
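A rough sketch of that jot-notes loop, where `llm_summarize` is a stand-in for the actual model call:

```python
from pathlib import Path

def llm_summarize(text: str) -> str:
    # Stand-in: a real agent would call the model here and get back
    # a few bullet points instead of this naive truncation.
    return text[:200]

notes = Path("notes.md")
notes.write_text("# Working notes\n", encoding="utf-8")

for doc in sorted(Path("docs").glob("*.txt")):  # the 100-200 full documents
    summary = llm_summarize(doc.read_text(encoding="utf-8"))
    with notes.open("a", encoding="utf-8") as f:
        f.write(f"\n## {doc.name}\n{summary}\n")

# Later passes work from notes.md instead of the originals,
# which is how the run stays inside the context window.
```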
You could expand grep queries with synonyms, but now you're reimplementing query expansion, which is already part of modern RAG (see the sketch below). And doing that intelligently means you're back to using embeddings anyway.
The workflow works great for codebases with consistent terminology. For enterprise knowledge bases with varied language and conceptual queries, grep alone can't get you to the right candidates.
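Something like the following, where the pain point is that the synonym table has to be curated by hand. The grep flags are standard; the table itself is invented for illustration:

```python
import subprocess

# The hand-maintained synonym table: exactly the thing embeddings learn for free.
SYNONYMS = {
    "revenue": ["sales", "income", "turnover"],
    "growth": ["increase", "expansion"],
}

def expanded_grep(term: str, path: str) -> list[str]:
    """Run grep once per synonym and merge the matching files."""
    hits: list[str] = []
    for word in [term] + SYNONYMS.get(term, []):
        result = subprocess.run(
            ["grep", "-ril", word, path],  # recursive, case-insensitive, filenames only
            capture_output=True, text=True,
        )
        hits.extend(result.stdout.splitlines())
    return sorted(set(hits))

print(expanded_grep("revenue", "./docs"))
```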
> You could expand grep queries with synonyms, but now you're reimplementing query expansion, which is already part of modern RAG.
In this scenario "you" are not implementing anything; the agent will do this on its own.
This is based on my experience using Claude Code in a codebase that definitely does not have consistent terminology.
It doesn't always work, but it seemed like you were thinking in terms of trying to get things right in a single grep, when it's actually a series of greps, each informed by the results of the previous ones.
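In other words, something closer to this loop than to a single lookup. The refinement rule here is oversimplified; a real agent decides the next pattern by actually reading the hits:

```python
import subprocess

def grep_files(pattern: str, path: str = ".") -> list[str]:
    """One grep pass: files whose contents match the pattern."""
    out = subprocess.run(["grep", "-ril", pattern, path],
                         capture_output=True, text=True)
    return out.stdout.splitlines()

# Pass 1: the user's literal wording.
hits = grep_files("revenue growth")

# Pass 2: the agent has read a few results (or noticed their absence),
# seen that the corpus says "sales" instead, and tries again.
# Nobody had to write a synonym table.
if not hits:
    hits = grep_files("increased sales")

print(hits)
```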
Great point, but this grep-in-a-loop approach probably falls apart (i.e. becomes non-performant) at thousands of docs, never mind millions, or with tens of simultaneous users.
Generative AI is here to stay, but I have a feeling we will look back on this period of time in software engineering as a sort of dark age of the discipline. We've seemingly decided to abandon almost every hard-won insight and practice about building robust and secure computational systems overnight. It's pathetic that this industry so easily sold itself to the illogical sway of marketers and capital.
Still, that single tender can be on the order of a billion tokens. Even if an LLM supported that insane context window, that's roughly 4 GB that need to be moved, and at current LLM prices, inference would cost thousands of dollars. I detailed this a bit more at https://www.tenderstrike.com/en/blog/billion-token-tender-ra...
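The back-of-the-envelope version (the per-token price is an assumption; real rates vary a lot by model):

```python
tokens = 1_000_000_000        # one very large tender
bytes_per_token = 4           # ~4 bytes per token is typical for English text
price_per_million = 3.00      # assumed $ per million input tokens

print(f"{tokens * bytes_per_token / 1e9:.1f} GB to move")    # ~4.0 GB
print(f"${tokens / 1e6 * price_per_million:,.0f} per call")  # ~$3,000
```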
And that's just one (though granted, a very large) tender.
For the corpus of a larger company, you'd probably be looking at trillions of tokens.
While I agree that delivering tiny, chopped-up parts of context to the LLM might not be a good strategy anymore, sending thousands of ultimately irrelevant pages isn't either. And embeddings definitely give you a much superior search experience compared to (only) classic BM25 text search.
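As a sketch of what embeddings buy you on top of BM25: rank_bm25 and sentence-transformers are real libraries, but the equal weighting below is an arbitrary illustration, and real hybrid systems normalize the two score scales first:

```python
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = [
    "factors contributing to increased sales",
    "litigation reserves of $5.1B disclosed this quarter",
    "office relocation schedule for Q3",
]
query = "revenue growth drivers"

# Keyword leg: classic BM25 over whitespace tokens.
bm25_scores = BM25Okapi([d.split() for d in docs]).get_scores(query.split())

# Semantic leg: cosine similarity of sentence embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
sem_scores = util.cos_sim(model.encode(query), model.encode(docs))[0]

# BM25 scores are all zero here (the vocabulary mismatch again);
# the semantic leg still ranks the sales document first.
for doc, kw, sem in zip(docs, bm25_scores, sem_scores):
    print(f"bm25={kw:.2f} sem={float(sem):.2f}  {doc}")
```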
thenewwazoo•3h ago
“This wasn’t just inconvenient; it was architecturally devastating.”
Ugh.
tptacek•3h ago
> Please don't complain about tangential annoyances—e.g. article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
https://news.ycombinator.com/newsguidelines.html
sebmellen•3h ago
I think it's fair to point out that many articles today are essentially a thin human wrapper around a core of ChatGPT content.
Whether or not this was AI-generated, the tells of AI-written text are all throughout it. There are some people who have learned to write like the AI talks to them, which is really not much of an improvement over just using the AI as your word processor.
IgorPartola•2h ago
The problem is that HN is one of the few places left where original thoughts are the main reason people are here. Letting LLMs write articles for us here is just not all that useful or fun.
Maybe quarantining AI-related articles to their own thing a la Show HN would be a good move. I know it's the predominant topic here at the moment, but there is other interesting stuff too. And articles about AI, written by AI so that Google's AI can rank them higher and show them to more AI models to train on, are just gross.
serf•2h ago
Having a company name pitched at you within the first two sentences is a pretty good giveaway.
momojo•3h ago
On the whole though, I still learned a lot.