I built askdocs-mcp to give agents a more direct route to searching a project's source-of-truth documents. My design constraints: it must run 100% locally (some of my manuals are under NDA), start up fast, and let me experiment with different embedding and language models. It was built with ollama in mind, but if you can't run models locally, it works with any OpenAI-compatible endpoint.
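As an aside on the endpoint flexibility: ollama serves an OpenAI-compatible API under `/v1`, so the same client code can talk to a local model or a hosted one. Here's a minimal sketch of that idea in Python; the model name and prompt are placeholders, and this is not askdocs-mcp's actual client code:

```python
from openai import OpenAI

# Point at a local ollama server; swap base_url/api_key for a hosted
# OpenAI-compatible endpoint and nothing else has to change.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

resp = client.chat.completions.create(
    model="llama3.1",  # placeholder: any chat model the server exposes
    messages=[{"role": "user", "content": "What does page 12 of the manual say?"}],
)
print(resp.choices[0].message.content)
```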
Features:
- Incrementally builds and caches the set of docs. The initial startup can take a while as PDFs are chunked and run through an embedding model, but after that, startup is near-instant.
- Uses the filesystem as the database (there's a sketch of this scheme after the list) - you only need `ollama` running somewhere so the tool can access an embedding model and a language model.
- Provides an `ask_docs` tool that returns natural-language answers about what the documentation says, annotated with the page numbers the information came from. Those page numbers can be passed to the `get_doc_page` tool to retrieve the full page if the agent needs additional context.
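To make the "filesystem as the database" idea concrete, here's a hedged sketch of how an incremental, file-backed embedding cache can work. The cache layout, file names, and the `chunk_pdf`/`embed` helpers are illustrative assumptions, not askdocs-mcp's internals:

```python
import hashlib
import json
from pathlib import Path

import numpy as np

CACHE = Path(".askdocs-cache")  # assumed cache directory name

def cache_key(pdf: Path) -> str:
    # Hash the file bytes so an edited PDF invalidates only its own entry.
    return hashlib.sha256(pdf.read_bytes()).hexdigest()[:16]

def load_or_embed(pdf: Path, chunk_pdf, embed):
    """Return (chunks, vectors), computing and caching them on first run."""
    stem = CACHE / cache_key(pdf)
    chunks_file = stem.with_suffix(".json")
    vecs_file = stem.with_suffix(".npy")
    if chunks_file.exists() and vecs_file.exists():
        # Cache hit: startup cost is just reading two files back.
        return json.loads(chunks_file.read_text()), np.load(vecs_file)
    # Cache miss: chunk and embed once, then persist as plain files.
    chunks = chunk_pdf(pdf)  # e.g. [{"text": ..., "page": ...}, ...]
    vecs = np.array([embed(c["text"]) for c in chunks])
    CACHE.mkdir(exist_ok=True)
    chunks_file.write_text(json.dumps(chunks))
    np.save(vecs_file, vecs)
    return chunks, vecs
```

Answering a question then reduces to a similarity search over the cached vectors plus one language-model call, which is why restarts after the first run are cheap.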
Because I'm providing the exact set of documents that apply to my project, I see fewer hallucinations and less rabbit-hole chasing. The agent isn't relying (as much) on its latent space to answer questions, and it avoids reaching for a web search tool that might turn up subtly different part numbers or protocol versions. It saves precious context as well, because the parent agent gets a concise answer to what it's looking for, instead of doing the "searching" itself by loading large chunks of the document into its context window.

I'm sure there are improvements that can be made, e.g. to the document chunking or the "system prompt" the tool gives to the language model. I'd love to hear your feedback, especially if you find this useful. Thanks!