Existing observability tools require a cloud signup, an enterprise contract, or heavy manual instrumentation of your code. I wanted something that stayed local and just worked.
The Solution: Three lines of code
```python
import vectorlens
vectorlens.serve()  # Open http://127.0.0.1:7756

# Your RAG code runs as-is (OpenAI, Anthropic, Gemini,
# ChromaDB, FAISS, etc. are auto-intercepted)
```

How it works technically:
Zero-Config Interception: It monkey-patches common LLM and Vector DB clients. You don't have to change your functions or wrap your calls; it intercepts the data flow automatically.
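The interception idea can be sketched in a few lines. This is a minimal illustration, not vectorlens's actual code: `ChatClient` is a stand-in for a real SDK class, and `captured` stands in for the debugger's event store.

```python
import functools

# Stand-in for a real LLM client class (e.g. an SDK's chat client);
# all names here are illustrative.
class ChatClient:
    def create(self, prompt):
        return f"response to: {prompt}"

captured = []  # calls observed by the "debugger"

def intercept(method):
    """Wrap a client method so every call is recorded, then runs as-is."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        captured.append({"args": args, "kwargs": kwargs, "result": result})
        return result
    return wrapper

# Patch once at import time -- user code keeps calling ChatClient.create unchanged.
ChatClient.create = intercept(ChatClient.create)

client = ChatClient()
client.create("why is the sky blue?")
print(len(captured))  # -> 1
```

Because the patch replaces the method on the class, every existing call site picks it up with zero changes to user code.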
Local Hallucination Detection: It uses sentence-transformers (a 22MB model) to compare each sentence of the LLM's output against the retrieved context. If a sentence's similarity to every retrieved chunk falls below a threshold, that sentence is flagged as a potential hallucination.
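The detection logic reduces to "max similarity against any chunk, compared to a threshold." Here is a toy sketch of that shape; the bag-of-words `embed` is a stand-in for the real sentence-transformers embedding, and the 0.3 threshold is arbitrary:

```python
import re
from collections import Counter
from math import sqrt

def embed(text):
    """Toy stand-in for a sentence-transformers embedding: bag-of-words counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def flag_hallucinations(answer_sentences, context_chunks, threshold=0.3):
    """Flag answer sentences whose best similarity to any chunk is below threshold."""
    chunk_vecs = [embed(c) for c in context_chunks]
    flagged = []
    for sent in answer_sentences:
        best = max((cosine(embed(sent), cv) for cv in chunk_vecs), default=0.0)
        if best < threshold:
            flagged.append(sent)
    return flagged

context = ["The Eiffel Tower is in Paris.", "It was completed in 1889."]
answer = ["The Eiffel Tower is in Paris.",
          "It is painted bright green every year."]
print(flag_hallucinations(answer, context))
# -> ['It is painted bright green every year.']
```

Swapping `embed` for a real sentence-transformers model changes the quality of the similarity scores, not the structure of the check.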
Perturbation Attribution: To figure out "why," it measures how the output changes when specific chunks are removed or modified. This gives you a clear score of which data points actually drove the response.
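A leave-one-out version of that idea looks like this. Everything here is a simplified stand-in: `generate` plays the role of the user's RAG pipeline, and word-overlap Jaccard plays the role of a real output-similarity measure:

```python
def generate(chunks):
    """Toy stand-in for a RAG pipeline: context chunks -> answer string."""
    return " ".join(chunks)

def similarity(a, b):
    """Crude Jaccard word-overlap similarity between two strings."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 1.0

def attribute(chunks):
    """Score each chunk by how much the output drifts when it is removed."""
    baseline = generate(chunks)
    scores = []
    for i in range(len(chunks)):
        perturbed = generate(chunks[:i] + chunks[i + 1:])
        scores.append(1.0 - similarity(baseline, perturbed))
    return scores

chunks = ["paris is the capital",
          "bananas are yellow",
          "paris hosts the louvre"]
print(attribute(chunks))  # one drift score per chunk, higher = more influential
```

With a real pipeline each removal triggers a fresh generation, so attribution costs one extra LLM call per chunk; that cost is why running it off the hot path matters.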
Fully Local: No data leaves your machine. The dashboard is a local React app updated via WebSockets.
Why use this over other tools?
Privacy: No cloud uploads or API keys for the debugger itself.
No Vendor Lock-in: Works with local models (Ollama/Mistral) just as easily as it does with GPT-4.
Speed: It runs detection in a background thread, so it doesn't block your main application logic.
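The background-thread pattern is the standard producer/consumer queue; this sketch shows the shape (a hypothetical `detector` worker, not vectorlens's internals). The main thread only enqueues and moves on:

```python
import queue
import threading

jobs = queue.Queue()
results = []

def detector():
    """Background worker: runs detection off the main request path."""
    while True:
        item = jobs.get()
        if item is None:  # sentinel to shut down
            break
        answer, context = item
        # ...similarity checks would run here; the main thread never waits.
        results.append((answer, context))
        jobs.task_done()

worker = threading.Thread(target=detector, daemon=True)
worker.start()

# Application code just enqueues and continues.
jobs.put(("some answer", ["chunk a", "chunk b"]))

jobs.join()       # demo only; real code wouldn't block on the queue
jobs.put(None)
worker.join()
print(len(results))  # -> 1
```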
I’m looking for feedback on the attribution accuracy, and on which vector DBs you'd like to see supported next.