I’ve been experimenting with RAG systems specifically for compliance-heavy technical documentation, and it feels like most RAG advice breaks down in this context.
If the response is wrong, users don’t just get confused - it may be costly legally and financially.
A few things that worked fine for general purpose chatbot RAG failed badly for docs:
Pure semantic search – "authentication" queries pulled login flows and unrelated security guidelines, because embeddings blurred intent. Users needed exact endpoints, not conceptually similar text.
Naive chunking – code blocks and parameter descriptions were split across chunks, producing syntactically valid but operationally wrong examples.
"Best effort" generation – when context was incomplete, the model just filled in the gaps with hallucinations and plausible defaults instead of refusing to answer.
I wrote up what finally worked for me as a production-style pipeline
(ingestion → hybrid retrieval → grounding → evaluation).
Curious if others have run into the same problems or solved this differently, especially for docs, APIs, or internal runbooks for highly regulated, compliance-heavy industries? What constraints mattered most in practice?
alex_fash•1h ago
If the response is wrong, users don’t just get confused - it may be costly legally and financially.
A few things that worked fine for general purpose chatbot RAG failed badly for docs:
Pure semantic search – "authentication" queries pulled login flows and unrelated security guidelines, because embeddings blurred intent. Users needed exact endpoints, not conceptually similar text.
Naive chunking – code blocks and parameter descriptions were split across chunks, producing syntactically valid but operationally wrong examples.
"Best effort" generation – when context was incomplete, the model just filled in the gaps with hallucinations and plausible defaults instead of refusing to answer.
I wrote up what finally worked for me as a production-style pipeline (ingestion → hybrid retrieval → grounding → evaluation).
Curious if others have run into the same problems or solved this differently, especially for docs, APIs, or internal runbooks for highly regulated, compliance-heavy industries? What constraints mattered most in practice?