We built an open-source "retrieval firewall" that scans chunks before they reach the LLM: – denies injection & secrets – flags/reranks PII, encoded blobs, untrusted URLs – audit log (JSONL) of all decisions – drop-in wrappers for LangChain and LlamaIndex retrievers
Install: pip install rag-firewall Repo: https://github.com/taladari/rag-firewall
Curious if others here handle retrieval-time risks, or just ingest/output filtering. Would love feedback and red-team payloads.