You can find the source code on GitHub: https://github.com/Klavis-AI/klavis
We're addressing a couple of key problems with using MCPs. First, many available MCP servers lack native or user-based authentication, creating security vulnerabilities and adding complexity during development.
Second, many MCP servers are personal projects, not designed for the reliability needed in production.
Connecting to these servers usually requires writing custom client code for the MCP protocol itself, which is a barrier, especially if you already have function calling systems in place.
Klavis AI aims to address these issues. To simplify access, our API lets you launch production-ready, hosted MCP servers quickly, with built-in OAuth and multi-tenant auth support for MCP servers.
We also want to remove the need for developers to write MCP client code. You can use our API to interact with any remote MCP server directly from your existing backend infrastructure. For faster prototyping or direct user interaction, we also provide open-source client interfaces for Web, Slack, and Discord.
The MCP server and client code is open source because we want to contribute to the MCP community.
For a quick start with the hosted version, log in to our website and generate an API key, then start calling our APIs directly. You can find more details in our docs: https://docs.klavis.ai
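Roughly, the flow looks like this (a simplified sketch: the endpoint paths and field names below are placeholders, so check the docs for the exact API):

    # Simplified sketch of the hosted flow. The endpoint paths and JSON fields
    # below are placeholders, not the exact API -- see https://docs.klavis.ai.
    import requests

    API_KEY = "YOUR_KLAVIS_API_KEY"  # generated after logging in to the website
    HEADERS = {"Authorization": f"Bearer {API_KEY}"}

    # 1. Launch a hosted MCP server instance for one of your users.
    create = requests.post(
        "https://api.klavis.ai/mcp-server/instance/create",  # placeholder path
        headers=HEADERS,
        json={"serverName": "github", "userId": "user-123"},  # placeholder fields
    )
    server_url = create.json()["serverUrl"]  # placeholder field name

    # 2. Call a tool on that server from your existing backend -- no MCP client code.
    result = requests.post(
        "https://api.klavis.ai/mcp-server/call-tool",  # placeholder path
        headers=HEADERS,
        json={
            "serverUrl": server_url,
            "toolName": "create_issue",
            "toolArgs": {"title": "Login page crashes on submit"},
        },
    )
    print(result.json())

The same call-tool pattern plugs into an existing function calling setup, so you never have to speak the MCP protocol yourself.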
For a quick start with the open-source version, go to our GitHub repository and check out the detailed README for each MCP server and client.
A little note about myself: my background includes working on function calling for Google Gemini. During that time, I saw firsthand the challenges teams face when trying to connect AI agents to external tools. I want to bring my insights and energy to accelerating MCP adoption.
This is an early release, and we’d appreciate feedback from the community. What are your worst pain points related to MCPs, either as a developer or a general user? What other MCP servers or features would be most valuable to you?
We'll be around in the comments. Thanks for reading!
danenania•4h ago
- Is there a way to run it locally/self-host?
- Are there discovery endpoints to list the available servers?
- The 'Test & Eval' page is interesting to me, as I think unpredictability of results with multiple MCP prompts/sources interacting is generally a pretty big issue with MCP, so a good integrated eval system could be valuable. I see it's not launched yet, but it would be great to know more about how this will work.
wirehack•4h ago
Yes, our GitHub repo has READMEs for all MCP servers and clients. You can check out https://github.com/Klavis-AI/klavis.
> Are there discovery endpoints to list the available servers?
Yes. https://docs.klavis.ai/api-reference/mcp-server/get-all-serv.... And we are adding more MCP servers.
> Testing and Eval
Yes, it's in early access now. If you are interested, shoot me an email at xiangkaiz@klavis.ai and we can talk more.
danenania•4h ago
wirehack•3h ago
danenania•3h ago
I would also personally consider just open sourcing everything (or at least core features) as I think you'd still have plenty of customers who would prefer a hosted solution. In my experience, there is surprisingly little overlap between the people who prefer cloud vs. self-hosting, so having a full open source option doesn't cannibalize the product side as much as you might think... people who prefer cloud will mostly still prefer it even if they could self-host instead, and people who prefer self-hosting will generally just look for another project rather than using your cloud service. Just my unsolicited 2 cents.
wirehack•3h ago
mlenhard•3h ago
How do you know what variations of a prompt trigger a given tool to be called, or how many tools is too many before you start seeing degradation issues because of the context window? If you are building a client and not a server, the issue becomes even more pronounced.
I even extracted the Claude electron source to see if I could figure out how they were doing it, but it's abstracted behind a network request. I'm guessing the system prompt handles tool call selection.
PS: I released an open source evals package if you're curious. Still a WIP, but does the basics https://github.com/mclenhard/mcp-evals
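To make it concrete, here's roughly the shape of the check (a hand-rolled sketch, not the mcp-evals API; the model call is stubbed out): run each phrasing of a request against the full tool list and see how often the tool you expected actually gets picked.

    # Rough sketch of a tool-selection eval: for each prompt variation, hand the
    # model the full tool list and check whether it chose the expected tool.

    def call_model(prompt: str, tools: list[dict]) -> str | None:
        # Stand-in so the sketch runs: picks the tool whose description shares
        # the most words with the prompt. Swap in a real model/tool-call client.
        def overlap(tool: dict) -> int:
            return len(set(prompt.lower().split()) & set(tool["description"].lower().split()))
        best = max(tools, key=overlap, default=None)
        return best["name"] if best else None

    def tool_selection_accuracy(variations: list[str], expected: str, tools: list[dict]) -> float:
        hits = sum(1 for p in variations if call_model(p, tools) == expected)
        return hits / len(variations)

    tools = [
        {"name": "create_issue", "description": "Create a GitHub issue in a repo"},
        {"name": "search_code", "description": "Search for code in a repo"},
    ]
    variations = [
        "file a bug about the login crash",
        "open an issue for the login page crashing on submit",
        "log this login bug on GitHub",
    ]
    print(tool_selection_accuracy(variations, "create_issue", tools))

Running the same variations as the tool list grows is one way to spot where selection starts to degrade.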
danenania•3h ago
I'm working on a coding agent, and MCP has been a frequently requested feature, but yeah this issue has been my main hesitation.
Getting even basic prompts that are designed to do one or two things to work reliably requires so much testing and iteration that I'm inherently pretty skeptical that "here are 10 community-contributed MCPs, choose the right one for the task" will have any hope of working reliably. Of course the benefits, if it worked, would be very clear, so I'm keeping a close watch on it. Evals seem like a key piece of the puzzle, though you still might end up in combinatorial explosion territory trying to test all the potential interactions between multiple MCPs. I could also see it getting very expensive to test this way.
mlenhard•3h ago
But agree that even basic prompts can be a struggle. You often need to name the tool in the prompt to get things to work reliably, but that's an awful user experience. Tool call descriptions play a pretty vital role, but most MCP servers are severely lacking in this regard.
I hope this is a result of everything being so new, and that the tooling and models will evolve to solve these issues over time.
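As a made-up illustration of what I mean by descriptions:

    # Made-up example of the gap. The first description gives the model almost
    # nothing to match against; the second says what the tool does, when to use
    # it, and what input it expects.
    vague_tool = {
        "name": "query",
        "description": "Run a query.",
    }
    descriptive_tool = {
        "name": "query_sales_db",
        "description": (
            "Run a read-only SQL SELECT against the sales Postgres database. "
            "Use for questions about orders, revenue, or customers. "
            "Do not use for schema changes or writes."
        ),
    }

The second gives the model something to match against when it has to choose between a dozen similar-sounding tools.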
danenania•2h ago
It has momentum and clearly a lot of folks are working on these shortcomings, so I could certainly see it becoming the de facto standard. But the issues we're talking about are pretty major ones that might need a more fundamental reimagining to address. Although it could also theoretically all be resolved by the models improving sufficiently, so who knows.
Also, cool to hear that you came across Plandex. Lmk what you think if you try it out!
wirehack•1h ago
danenania•1h ago
1. Giving the model too many choices. If you have a lot of options (like a bunch of MCP servers), what you often see in practice is that it's like a dice roll which option is chosen, even if the best choice is pretty obvious to a human. This is tough even when you just have a single branch in the prompt where the model has to choose path A or B. It's hard to get it to choose intelligently vs. randomly.
2. Global scope. The prompts related to each MCP all get mixed together in the system prompt, along with the prompting for the tool that's integrating them. They can easily be modifying each other's behavior in unpredictable ways.
wirehack•19m ago
alasano•1h ago
The tools provided by the MCP server were definitely in context, and there were only two or three servers with a small number of tools enabled.
It feels too model dependent at the moment; this was Gemini 2.5 Pro, which is normally state of the art but seems to have lots of quirks for tool use.
Agreed on hoping models are going to be trained to be better at using MCP.
danenania•50m ago
And then every time I try to add something new to the prompt, all the prompting for previously existing behavior often needs to be updated as well to account for the new stuff, even if it's in a totally separate 'branch' of the prompt flow/logic.
I'd anticipate that each individual MCP I wanted to add would require a similar process to ensure reliability.