We wanted to share a new tool we’ve been working on.
Even when documentation is well-structured, it can still be hard to find what you need. Information ends up spread across wikis, cloud drives, chat tools, ticketing systems, etc., and search tends to break down as the volume grows.
mAItion is our attempt to solve that: a way to unify search and retrieval across the systems teams already use. It's an open-source alternative to tools like Glean and GetGuru, designed to be flexible, lightweight, self-hostable, and easy to adapt.
What it does:
- Ingests data from all connected sources and builds vector and BM25 indexes
- Converts the data into markdown (using MarkItDown, configurable)
- Lets you run hybrid search over the stored data via API or MCP
- Lets you chat with the data using Open WebUI as a frontend (bundled), or connect any other chat UI
- Keeps the data fresh using scalable Celery workers (per source schedules)
- Deduplicates documents across sources during ingestion
- Supports local and remote models for embeddings and inference (inference is optional)
- Bundles connectors for local files, S3, Jira, Web, MediaWiki, etc.
- Multi-user support, groups, and generic permissions (through the Open WebUI frontend)
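To make the "hybrid search" bullet concrete: a common way to combine a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below illustrates the general technique only — it is not mAItion's actual fusion code, and the function name and sample document IDs are made up for the example.

```python
# Illustrative sketch of hybrid retrieval via reciprocal rank fusion (RRF).
# This shows the general technique; mAItion's actual fusion logic may differ.

def rrf_fuse(vector_ranking, bm25_ranking, k=60):
    """Merge two ranked lists of doc IDs into one hybrid ranking.

    Each document's score is the sum of 1 / (k + rank) over the rankings
    it appears in; k=60 is the commonly used default for RRF.
    """
    scores = {}
    for ranking in (vector_ranking, bm25_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# "doc-a" tops both rankings, so it wins overall; documents found by only
# one retriever (like "doc-d") still make the merged list.
hybrid = rrf_fuse(
    vector_ranking=["doc-a", "doc-b", "doc-c"],
    bm25_ranking=["doc-a", "doc-d", "doc-b"],
)
```

The appeal of RRF is that it only needs ranks, not raw scores, so BM25 and cosine-similarity results can be merged without score normalization.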
We tried to leverage as many open-source tools as possible to avoid reinventing the wheel and to keep the system maintainable, while retaining the flexibility to customize it as needed. We were surprised that we could not find an existing RAG system that would let us build the connectors we needed, so we created our own.
At a high level, the core is a general-purpose LlamaIndex/pgvector-powered RAG system plus Celery workers. It continuously ingests data from different sources and exposes retrieval and rephrase endpoints. It supports scheduled ingestion, citation-backed answers, and configurable connectors.
Because it uses LlamaIndex, you can adapt any existing LlamaIndex DataLoader for use as a Connector.
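As a rough sketch of what adapting a loader looks like: LlamaIndex readers expose a `load_data()` method that returns documents, so a connector can wrap any such reader and tag its output with a source label. The `ReaderConnector` class, its method names, and the dict-shaped documents below are hypothetical illustrations (real LlamaIndex readers return `Document` objects), not mAItion's actual connector API.

```python
# Hedged sketch: wrapping a LlamaIndex-style reader as a connector.
# Only the load_data() convention comes from LlamaIndex's reader interface;
# the Connector shape and the dict documents here are hypothetical.

class ReaderConnector:
    """Adapts any object exposing load_data() -> list of documents."""

    def __init__(self, reader, source_name):
        self.reader = reader
        self.source_name = source_name  # label used downstream, e.g. for dedup

    def fetch(self):
        docs = self.reader.load_data()
        # Tag each document with its source so later stages (deduplication,
        # citations) can tell where it came from.
        for doc in docs:
            doc.setdefault("metadata", {})["source"] = self.source_name
        return docs

# Stand-in for a real reader such as LlamaIndex's SimpleDirectoryReader.
class FakeReader:
    def load_data(self):
        return [{"text": "hello world", "metadata": {}}]

docs = ReaderConnector(FakeReader(), "local-files").fetch()
```

The point is that the connector layer only depends on the `load_data()` contract, so any of the many community LlamaIndex loaders can slot in with a thin wrapper.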
The RAG system can be integrated with any frontend. We bundle a fairly recent version of Open WebUI as the chatbot interface in the mAItion setup, but it's easy to connect any other frontend or use it with agents via MCP.
You can run it with Docker Compose, configure your connections and embedding/inference models via a simple YAML file, and start querying your data quickly. (The simplest connector to try is the Directory one, which ingests files from a local directory.)
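To give a feel for the setup, here is the rough shape such a YAML file might take. All key names below are illustrative assumptions — check the mAItion docs for the actual schema.

```yaml
# Illustrative only: the real mAItion config keys may differ.
connectors:
  - type: directory        # simplest connector: ingest files from a local path
    path: /data/docs
    schedule: "0 * * * *"  # hypothetical per-source refresh schedule
embedding:
  provider: local          # local or remote embedding models are both supported
  model: your-embedding-model
inference:
  enabled: false           # inference is optional; retrieval works without it
```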
We’re still actively building and improving it, and would really appreciate feedback from people here.
The following features are on our roadmap:
- GitHub, Gmail, Outlook, Bitbucket, Box, Confluence, Notion, and plenty of other connectors
- Document ACL & permissions enforcement
- Migration to a more recent version of pgvector
- Observability via Langfuse
- Reranking/HyDE
- Kubernetes deployment
- Two-way sync for sources, making it possible to send data back to the data source
- Agentic consolidation of answers into your knowledge base
Comparing mAItion to existing open-source tools:
- Onyx (formerly Danswer): a highly developed codebase with lots of connectors out of the box. It started as proprietary code that was later released as open source. It seems like a quality product, but the standard installation recommends 8 vCPUs and 16 GB of RAM; we wanted something lightweight.
- Omni: We found this recently. It seems nice, but our focus is on running lots of connectors at the same time, and in Omni each connector runs in its own Docker container, so every connector adds overhead. This has its advantages, but it's not what we wanted. It's also written partially in Rust, which is great, but we aren't Rust experts.
The code and documentation:
- https://github.com/WikiTeq/mAItion - mAItion itself (MIT License)
- https://github.com/wikiteq/rag-of-all-trades - the RAG (MIT License)
Happy to answer any questions!