What makes it different? Standard RAG often picks the wrong chunks or gets confused by similar articles. Hermit uses a Multi-Joint Architecture:
- Entity Extraction: It works out who or what you're asking about before searching.
- JIT Indexing: For every query, it dynamically indexes only the relevant articles into an ephemeral FAISS index.
- Verification Gate: A final joint verifies the premise against the source text to kill hallucinations.

It runs on GGUF models via llama-cpp-python and supports any ZIM file (Kiwix).
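
To make the three joints concrete, here is a rough toy sketch of the flow in plain Python. It is not Hermit's actual code: the article dictionary, the function names, and the stopword list are all made up for illustration, a bag-of-words cosine score stands in for the FAISS index, and simple keyword checks stand in for the LLM-driven extraction and verification steps.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for articles pulled from a ZIM file: title -> text.
ARTICLES = {
    "Mercury (planet)": "Mercury is the smallest planet in the Solar System. "
                        "It orbits closest to the Sun.",
    "Mercury (element)": "Mercury is a chemical element with the symbol Hg. "
                         "It is liquid at room temperature.",
    "Venus": "Venus is the second planet from the Sun. It has a dense atmosphere.",
}
STOPWORDS = {"a", "an", "the", "is", "it", "of", "to", "in", "what", "which", "who"}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_entities(question):
    # Joint 1 (assumed behavior): keep the content words of the question.
    return [t for t in tokenize(question) if t not in STOPWORDS]


def select_articles(entities, articles):
    # Keep only articles whose title mentions one of the entities.
    wanted = set(entities)
    return {t: body for t, body in articles.items() if wanted & set(tokenize(t))}


def build_index(articles):
    # Joint 2: a throwaway per-query index of sentence-level chunks.
    # A real system would embed chunks and add them to a FAISS index instead.
    index = []
    for title, body in articles.items():
        for sentence in re.split(r"(?<=[.!?])\s+", body):
            if sentence:
                index.append((title, sentence, Counter(tokenize(sentence))))
    return index


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(index, question, k=1):
    query = Counter(tokenize(question))
    ranked = sorted(index, key=lambda entry: cosine(query, entry[2]), reverse=True)
    return ranked[:k]


def verify(entities, passage):
    # Joint 3 (assumed behavior): every entity must appear in the source text.
    return set(entities) <= set(tokenize(passage))


def answer(question, articles=ARTICLES):
    entities = extract_entities(question)
    index = build_index(select_articles(entities, articles))  # rebuilt per query
    hits = search(index, question)
    if not hits:
        return None
    title, passage, _ = hits[0]
    return (title, passage) if verify(entities, passage) else None
```

In this toy version, `answer("Which planet is the smallest?")` only ever indexes the "Mercury (planet)" article, while `answer("Is mercury a gas?")` retrieves a passage but is rejected at the verification step because the source text never mentions "gas", so it returns `None` instead of guessing.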
Check it out: [https://github.com/0nspaceshipearth/Hermit-AI]

I'd love to hear your thoughts on the multi-joint pipeline approach!