Ask HN: Are we pretending RAG is ready, when it's barely out of demo phase?

7•TXTOS•2h ago

Been watching the RAG (Retrieval-Augmented Generation) wave crash into production for over a year now.

But something keeps bugging me: Most setups still feel like glorified notebooks stitched together with hope and vector search.

Yeah, it "works" — until you actually need it to. Suddenly: irrelevant chunks, hallucinations, shallow query rewriting, no memory loop, and a retrieval stack that breaks if you breathe on it wrong.

We’ve got:
• pipelines that don’t align with what users actually want to ask,
• retrieval that acts more like a search engine than a reasoning aid,
• brittle evals (because "correct context" ≠ "correct answer"),
• and no one’s sure where grounding ends and illusion begins.

Sure, you can make it work — if you’re okay duct-taping every component and babysitting the system 24/7.

So I gotta ask: Is RAG just stuck in prototype land pretending to be production? Or has someone here actually built a setup that survives user chaos and edge cases?

Would love to hear what’s worked, what hasn't, and what you had to throw away.

Not pushing anything, just been knee-deep in this and looking to sanity check with folks who’ve actually shipped stuff.

Comments

kingkongjaffa•1h ago

We have a RAG-powered product in production right now, used by thousands of users.

RAG is part of the solution: it provides the required style, formatting, and subject-matter idiosyncrasies of the domain.

But (prompt + RAG query on that prompt) alone isn't enough. We have a handwritten series of prompts, and the user input is just one step in a branching decision tree that decides which prompts to apply. Prompts can run in sequence (prompt 1 output = prompt 2 input) and can also be composed (for example, combining prompts 3 and 5 but not prompt 4).
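A minimal sketch of that kind of cascade, for readers who want something concrete. Everything here is an assumption made for illustration: the prompt texts, the `choose_chain` rule, and the `fake_llm` stand-in are hypothetical and are not the product's actual implementation.

```python
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; it echoes so the sketch runs offline.
    return f"<answer to: {prompt}>"


# Hypothetical handwritten prompts; each one consumes the previous step's output.
PROMPTS: Dict[int, str] = {
    1: "Classify the request: {text}",
    2: "Pull domain style notes for: {text}",
    3: "Draft a reply using: {text}",
    4: "Add a legal disclaimer to: {text}",
    5: "Tighten the wording of: {text}",
}


def choose_chain(user_input: str) -> List[int]:
    """Branching step: decide which prompts to apply, and in what order."""
    if "contract" in user_input.lower():
        return [1, 2, 3, 4, 5]
    # Composition: prompts 3 and 5 are combined, prompt 4 is left out.
    return [1, 3, 5]


def run_chain(user_input: str, llm: Callable[[str], str] = fake_llm) -> str:
    """Sequencing step: the output of one prompt becomes the input of the next."""
    text = user_input
    for prompt_id in choose_chain(user_input):
        text = llm(PROMPTS[prompt_id].format(text=text))
    return text
```

In a real system the branching rule would likely be far richer than a keyword check, and `llm` would wrap an actual model client; the point is only that the user input selects a path rather than being the whole prompt.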

TXTOS•14m ago

Totally agree, RAG by itself isn’t enough — especially when users don’t follow the script.

We’ve seen similar pain: one-shot retrieval works great in perfect lab settings, then collapses once you let in real humans asking weird follow-ups like “do that again but with grandma’s style”, and suddenly your context window looks like a Salvador Dalí painting.

That branching tree approach you mentioned — composing prompt→prompt→query in a structured cascade — is underrated genius. We ended up building something similar, but layered a semantic engine on top to decide which prompt chain deserves to exist in that moment, not just statically prewiring them.
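Roughly, the selection step could look like the sketch below. This is an assumption-heavy illustration, not our actual engine: the chain names, their descriptions, the `min_score` threshold, and the toy bag-of-words `embed` (standing in for a real embedding model) are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a learned embedding model.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical prompt chains, each described in plain language for matching.
CHAINS: Dict[str, Dict[str, object]] = {
    "restyle": {
        "about": "rewrite the previous answer again in a different style or tone",
        "steps": ["recall_last_answer", "apply_style"],
    },
    "lookup": {
        "about": "find facts in the documents and answer a question",
        "steps": ["retrieve", "answer"],
    },
}


def pick_chain(query: str, min_score: float = 0.1) -> Optional[str]:
    """Return the best-matching chain name, or None if nothing is close enough."""
    q = embed(query)
    scored = {name: cosine(q, embed(str(spec["about"]))) for name, spec in CHAINS.items()}
    best = max(scored, key=lambda name: scored[name])
    return best if scored[best] >= min_score else None
```

Returning `None` when no chain scores above the threshold is the part that matters: it lets the system fall back or ask a clarifying question instead of forcing a prewired chain onto a query it does not fit.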

It’s duct tape + divination right now. But hey — the thing kinda works.

Appreciate your battle-tested insight — makes me feel slightly less insane.
