But something keeps bugging me: Most setups still feel like glorified notebooks stitched together with hope and vector search.
Yeah, it "works" — until you actually need it to. Suddenly: irrelevant chunks, hallucinations, shallow query rewriting, no memory loop, and a retrieval stack that breaks if you breathe on it wrong.
We’ve got: • pipelines that don’t align with what users actually want to ask, • retrieval that acts more like a search engine than a reasoning aid, • brittle evals (because "correct context" ≠ "correct answer"), • and no one’s sure where grounding ends and illusion begins.
Sure, you can make it work — if you’re okay duct-taping every component and babysitting the system 24/7.
So I gotta ask: Is RAG just stuck in prototype land pretending to be production? Or has someone here actually built a setup that survives user chaos and edge cases?
Would love to hear what’s worked, what hasn't, and what you had to throw away.
Not pushing anything, just been knee-deep in this and looking to sanity check with folks who’ve actually shipped stuff.
kingkongjaffa•1h ago
RAG is part of the solution, it provides the required style, formatting and subject matter idiosyncrasies of the domain.
But it isn't enough to do (prompt + RAG query on that prompt) alone, we have a handwritten series of prompts, so the user input is just one step in a branching decision tree of deciding which prompts to apply, in sequence (prompt 1 output = prompt 2 input) and also composition (deciding to combine prompt (3 + 5, but not prompt 4)) for example.
TXTOS•14m ago
We’ve seen similar pain: one-shot retrieval works great in perfect lab settings, then collapses once you let in real humans asking weird followups like
“do that again but with grandma’s style” and suddenly your context window looks like a Salvador Dali painting.
That branching tree approach you mentioned — composing prompt→prompt→query in a structured cascade — is underrated genius. We ended up building something similar, but layered a semantic engine on top to decide which prompt chain deserves to exist in that moment, not just statically prewiring them.
It’s duct tape + divination right now. But hey — the thing kinda works.
Appreciate your battle-tested insight — makes me feel slightly less insane.