I’ve been experimenting with how LLMs can dynamically use large sets of tools without preloading them all into context - and built one-mcp to explore that idea.
one-mcp is an open source MCP server built with FastAPI that acts as a semantic index + dynamic loader for API tools.
Instead of hardcoding or embedding every tool spec in an LLM session (which quickly blows up context size and costs), one-mcp lets the model search, load, and execute tools on demand — using embeddings and natural language queries.
What it does:
- Semantic Search for Tools: Query your API catalog with natural language ("how to get user info") and get relevant tool specs back.
- Upload/Manage/Delete Tools: Ingest tools as JSON or from files, through a simple REST API or directly from MCP itself.
- MCP Integration: Works as a compliant MCP server — plug it into MCP-enabled clients or frameworks.
Example:
--- Upload tools ---
curl -X POST http://localhost:8003/api/tools/upload-json \
  -H "Content-Type: application/json" \
  -d '{"tools": [{"name": "getUserProfile", "description": "Retrieves user profile"}]}'
--- Semantic search ---
curl -X POST http://localhost:8003/api/tools/search \
  -H "Content-Type: application/json" \
  -d '{"query": "how to get user info"}'
You can use MCP itself for both of these operations as well :) - see the sketch below.
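As a rough illustration of the MCP path, here's a minimal Python sketch using the official mcp SDK's SSE client. The endpoint path (/sse) and the tool names (upload_tools, search_tools) are assumptions for illustration - check the repo or the list_tools() output for the real names.

--- MCP client sketch (Python) ---
# Minimal sketch of driving one-mcp over MCP instead of REST.
# Assumptions (not verified against the repo): the server exposes an SSE
# endpoint at /sse, and tools named "upload_tools" / "search_tools".
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    async with sse_client("http://localhost:8003/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # See what the server actually exposes.
            listing = await session.list_tools()
            print([t.name for t in listing.tools])

            # Hypothetical tool names - adjust to whatever list_tools() returns.
            await session.call_tool("upload_tools", {
                "tools": [{"name": "getUserProfile",
                           "description": "Retrieves user profile"}],
            })
            result = await session.call_tool("search_tools",
                                             {"query": "how to get user info"})
            print(result.content)


asyncio.run(main())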
Why this matters:
Most current LLM setups (even those with MCP or tool APIs) load all tools upfront - which inflates prompt/context sizes, wastes token budget, and increases inference costs.
With one-mcp, tools live in an external vector store - LLMs only "see" the few tools most relevant to the current user query. This architecture could make dynamic tool ecosystems more scalable.
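To make that concrete, here's a rough sketch of the retrieve-then-call loop over the REST API. The response shape (a top-level "tools" list) and the top_k cutoff are assumptions for illustration, not one-mcp's documented schema.

--- Retrieval loop sketch (Python) ---
# Sketch of the dynamic-loading loop: ask one-mcp for the handful of tools
# relevant to the user's request, and expose only those to the LLM.
# Assumption: the search endpoint returns JSON shaped like {"tools": [...]}.
import requests

ONE_MCP = "http://localhost:8003"


def relevant_tools(user_query: str, top_k: int = 3) -> list[dict]:
    resp = requests.post(
        f"{ONE_MCP}/api/tools/search",
        json={"query": user_query},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("tools", [])[:top_k]


user_query = "how do I get a user's profile info?"
tools = relevant_tools(user_query)

# Instead of shipping the whole catalog, the LLM call only sees these few
# specs - e.g. pass them as the tools parameter of your chat-completion API.
print(f"{len(tools)} tool spec(s) in context instead of the full catalog")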
I’d love your feedback!