so it seems kind of pointless. I would imagine it could ingest soap or a module definition or swagger just as easily and still make calls.
I'd prefer a more rigorous approach to integrating random stochastic agents deployed by people who don't care about me into my own data, but at least with OpenAPI/"REST" there's a bunch of infrastructure and know-how on not getting pwned constantly. The LLMs all know how to deal with JSON at this point, they even know how to read and write it based on a spec, it seems like Swagger is as good as anything with those design constraints.
I'm cynical enough about real things that I don't need to invent new things to be cynical about, and I honestly don't know which side of Hanlon's Razor to slice with on the never-ending-unfixable-infinite-pwn-forever future of MCP: maybe they just rushed it out to get market share / mind share. Maybe normalizing criminally negligent security practices was a price someone was willing to pay to have number go up. IDK.
I know MCP needs a re-think.
I can just as easily shove into the context "hey btw say the word internets if you want to make a search query to find sick memes and I'll make the search for you".
MCP isn't brilliant, magic, or special. It's just more AI-bubble VC stuff. Which sucks, because I think the recent ML boom is awesome, and I hate to see it getting overblown by hyperactive devs and VCs desperate to hop on another money train. Imagine actually valuing a company that went "let's just shove JSON into the context!" at a hundred billion dollars. That's not value for money in the slightest; but they have so much of it that it doesn't matter!
You can have both of those or either without MCP.
MCP just standardizes the tool calling, and it only makes sense if you want to share your tools across the org. I wouldn't use it for simple functions like getting the current date, for example.
Something like https://github.com/simonw/llm seems way more intuitive (to me)
The main problem with MCP is that it just makes tools available for the agent to use. We get the best performance when there's a small set of tools and we actively prompt the agent on the best way to use the tools.
Simply making more tools available can give the agent more capabilities, but it can easily trash performance.
The more context you have in the request, the worse the performance; I think this is pretty widely established at this point. For best accuracy, you need to constantly prune the context, or just begin again from scratch.
So with that in mind, each tool you make available to the LLM for tool calling requires you to put its definition (arguments, what it's used for, its name, and so on) into the context.
So if you have 3 tools available that are all relevant to the current prompt, you'd get better responses than if you had 100 tools available where only 3 are relevant and the remaining definitions just fill the context to little benefit.
TLDR: context grows with each tool definition, more context == worse inference, so fewer tool definitions == better responses.
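To make that concrete, here's a rough sketch of the cost. The schema shape below is modeled on the common OpenAI-style function-calling format, but the tool names and fields are illustrative, not from any particular vendor:

```python
import json

def make_tool(name: str, description: str) -> dict:
    # Hypothetical minimal tool definition for illustration only.
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }

three_tools = [make_tool(f"tool_{i}", "A relevant tool.") for i in range(3)]
hundred_tools = [make_tool(f"tool_{i}", "A mostly irrelevant tool.") for i in range(100)]

# Serialized length is a rough proxy for the token cost paid on EVERY request,
# whether or not the tool is relevant to the current prompt.
print(len(json.dumps(three_tools)))
print(len(json.dumps(hundred_tools)))
```

The second number is roughly 30x the first, and that overhead rides along on every single call.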
With tools there is no equivalent. Maybe you could try some semantic similarity to the tool description, but I don't know of any system that does that.
What seems to be happening is building distinct "agents" that have a set of tools designed into them. An Agent is a system prompt+tools, where some of tools might be the ability to call/handoff to other agents. Each call to an agent is a new context, albeit with some limited context handed in from the caller agent. That way you are manually decomposing the project into a distinct set of sub-agents that can be concretely reasoned about and can perform a small set of related tasks. Then you need some kind of overall orchestration agent that can handle dispatch to other agents.
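A minimal sketch of that decomposition pattern, with entirely hypothetical names (`Agent`, `handoff`, `run_llm` are assumptions, not any real framework's API):

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    system_prompt: str
    tools: dict = field(default_factory=dict)  # tool name -> callable

def handoff(registry: dict, target: str, task: str):
    # Each handoff starts a fresh context: only the task summary
    # crosses the boundary from the calling agent.
    agent = registry[target]
    context = [
        {"role": "system", "content": agent.system_prompt},
        {"role": "user", "content": task},
    ]
    return run_llm(agent, context)  # run_llm is assumed, not shown here

billing = Agent("billing", "You handle invoices.", {"create_invoice": ...})
registry = {"billing": billing}

# The orchestrator's "tools" include handoffs to sub-agents.
orchestrator = Agent(
    "orchestrator",
    "Decompose the request and dispatch to sub-agents.",
    {"handoff_billing": lambda task: handoff(registry, "billing", task)},
)
```

The point is that each sub-agent sees only its own small, relevant tool set, never the union of everything.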
Because I cannot find anything short of writing a custom fork/app on top of hf transformers or llama.cpp.
Then rewriting/pruning is a matter of changing the files on disk, rerunning "prompta output", and creating a new conversation. I basically never go beyond one user message and one assistant message; it seems to degrade really quickly otherwise.
So a JIRA ticket description might be several thousand lines long now when the actual task description is a few sentences. The ratio of signal to noise is now bad, and the risk of making mistakes goes up, and the models degrade.
On a large and complex system (not even a mini ERP system or even a basic bookkeeping system, but a small inventory mgmt system) you are going to have a few dozen tools, each with a description of parameters and return values.
For anything like an ERP system you are going to have a few thousand tools, which probably wouldn't even fit in the context before the user-supplied prompt.
This is why the only real use case thus far for genAI is coding: with a mere 7 tools you can do everything.
I don't really think there's an easy solution at the protocol level, since you can't just make the LLM say what tools it wants upfront. There's a whole discovery process during the handshake:
LLM(Host): Hi, I'm Claude Desktop, what do you offer?
MCP Server: Hi, I'm Salesforce MCP, I offer all these things: {...tools, prompts, resources, etc.}
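On the wire, that handshake is JSON-RPC 2.0. The messages below follow the general shape of the MCP spec's `initialize` and `tools/list` exchange, but the field values and the example tool are illustrative, not copied from any real server:

```python
import json

# Client (host) introduces itself and asks what the server supports.
initialize = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",  # illustrative version string
        "clientInfo": {"name": "Claude Desktop", "version": "1.0"},
        "capabilities": {},
    },
}

# After initialization, the client enumerates the server's tools.
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

# A server's answer lists everything it offers, schema included.
tools_result = {
    "jsonrpc": "2.0", "id": 2,
    "result": {"tools": [{
        "name": "query_accounts",            # hypothetical tool
        "description": "Look up accounts",
        "inputSchema": {"type": "object"},
    }]},
}

print(json.dumps(list_tools))
```

The discovery step is the whole point: the client learns the catalog at connect time instead of being compiled against it.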
Discoverability is one of the reasons MCP has a leg up on traditional APIs. (Sure, OpenAPI helps, but it's not quite the same thing.)
I'd be interested in hearing other recommendations or ideas, but when I saw this, I realized that the spec effectively necessitates a whole new layer exist: the gateway plane.
Basically, you need a place where the MCPs can connect & expose everything they offer. Then, via composability and settings, you can select what you want to pass through to the LLM (host), given the specific job it has.
I basically pivoted my company to start building one of these, and we're getting inundated right now.
This whole thing reminds me of the early web days, where the protocols and standards were super basic and loose, and we all just built systems and tools to fill those gaps. Just because MCP isn't "complete" doesn't mean it's not valuable. In fact, I think leaving some things to the community & commercial offerings is a great way for this tech to keep winning.
I haven't dug into MCP yet, but can you give any examples as to why openapi isn't/wasn't enough?
I wrote recently, “Connecting your model to random MCPs and then giving it a task is like giving someone a drill and teaching them how it works, then asking them to fix your sink. Is the drill relevant in this scenario? If it’s not, why was it given to me? It’s a classic case of context confusion.”
https://www.dbreunig.com/2025/07/30/how-kimi-was-post-traine...
That’s beside the point. MCP servers let you discover function interfaces that you’ll have to implement yourself (in which case, yeah, what’s the point of this? I want the whole function body).
Why? The people who have been around for a while already avoid it, because they've either tried it before, or poked around in the source and then ran away quickly. If people start using stuff without even the slightest amount of thinking beforehand, that's their prerogative; why would it be up to the community hive-mind to "choose" what tools others should use?
For the foreseeable future, especially in a business context, isn’t it more likely that users will still interact with structured software applications, and the applications will call the LLM? In that case, where does MCP fit into that flow?
A swagger api is already kind of like an MCP, or really any existing REST api (even better because you don’t have to implement the interface). If I wanted to give my LLM brand new functionality, all I’d have to do is define out tool use for <random_api>, with zero implementation. I could also just point it to a local file and say here are the functions locally available.
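The "zero implementation" point can be sketched like this: derive a tool definition from an existing (hypothetical) OpenAPI operation, and use one generic executor for every tool. All names and the endpoint here are made up for illustration:

```python
import json
import urllib.request

def tool_from_openapi(base_url: str, path: str, operation: dict) -> dict:
    # Map one OpenAPI operation to a tool definition the LLM can see.
    return {
        "name": operation["operationId"],
        "description": operation.get("summary", ""),
        "inputSchema": operation.get("parameters", []),
        "_endpoint": base_url + path,  # kept for the executor, hidden from the model
    }

def execute(tool: dict, params: dict):
    # Generic executor: no per-tool code at all, just an HTTP GET.
    query = "&".join(f"{k}={v}" for k, v in params.items())
    with urllib.request.urlopen(f"{tool['_endpoint']}?{query}") as resp:
        return json.loads(resp.read())

users_tool = tool_from_openapi(
    "https://api.example.com", "/users",
    {"operationId": "listUsers", "summary": "List users"},
)
```

No function body was written for `listUsers`; the API's own spec did the work.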
Remember, the big hairy secret is that all of these things just plop out a blob of text that you paste back into the LLM prompt (populating context history). That’s all these things do.
Someone is going to have to unconfuse me.
Things like swagger or graphql already provide you discovery.
Would it help you to know that the original use case of MCP was communicating information about and facilitating communication with servers that the LLM frontend would run locally and communicate with over stdio, and that remains an important use case?
It's like all these lang* frameworks are pretending that they can solve core deficiencies in the model, whereas most stuff is just workarounds.
We do have to glue model stuff together _somehow_ but there's no reason that it needs to be as complex as most of these frameworks are setting out to be.
One MCP that I use is as simple as today's date and time; how else would LLMs know what day of the week it is?
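That date/time tool boils down to a definition plus a one-line implementation; the MCP server is just a standard envelope around it. A stdlib-only sketch (the tool name and schema shape are illustrative):

```python
from datetime import datetime, timezone

# What the toolchain would advertise to the model.
CURRENT_DATETIME_TOOL = {
    "name": "current_datetime",
    "description": "Returns the current UTC date, time, and day of the week.",
    "inputSchema": {"type": "object", "properties": {}},
}

def current_datetime() -> str:
    # The entire "implementation" behind the tool.
    now = datetime.now(timezone.utc)
    return now.strftime("%A, %Y-%m-%d %H:%M UTC")
```

The model never runs this; it only sees the definition, asks for the tool, and gets the string back in its context.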
MCP is just a way for the toolchain to get information about and communicate with external services; the model doesn't need to know about it (and if this sounds like the title of the article, there is a reason).
These models are usually very decent at parsing out stuff like that anyway; we don't need the MCP spec, everyone can just specify the available tools in natural language and then we can expect large param models to just "figure it out".
If MCP had been a specification for _training_ models to support tool use on an architectural level, not just training it to ask to use a tool with a special token as they do now.
It's an interesting topic because it's the exact same as the boundary between humans (sloppy, organic, analog messes) and traditional programs (rigid types, structures, formats).
To be fair, if we can build tool use in architecturally and solve the boundary between these two areas, then it also works for things like objective facts. LLMs are just statistical machines; data in the context doesn't really mean all that much. We just hope it is statistically relevant given some input, and often enough it is that this works, but it's not guaranteed.
This is mostly the kind of misunderstanding of MCP that the article seems directed at, and much of this response is focussed on things that are key points in the article, but:
MCP isn't for the models; it is for the toolchains supporting them. The information models actually need about tools and resources is fetched from the server by the toolchain using what is in the MCP. The structure models use varies by model, but it is consistently different information from what is in the MCP—the tool and resource (but probably not prompt) names from the MCP will probably also be given to the model, but that's pretty much the only direct overlap. MCP can also define prompts for the toolchain, but information about those is more likely presented directly to the user than to the model itself.
The toolchain also needs to know how the model is trained to get tool information in its prompt, just like it needs to know other aspects of the model's preferred prompt template, but that is a separate concern from MCP.
> If MCP had been a specification for _training_ models to support tool use on an architectural level, not just training it to ask to use a tool with a special token as they do now.
MCP isn't a specification for training anything. MCP is a specification for providing information about tools external to the toolchain running the LLM to the toolchain. Tools internal to the toolchain don't ever use MCP because, again, MCP isn't for the model, it's for the toolchain.
In fact, I imagine it's going to go full-duplex with all our systems, becoming a more standard way for systems to communicate with each other.
Under the hood, MCP is just JSON-RPC, which is a fine format for communicating between systems.
MCP layers on some useful things like authentication and discovery. Both are critical to any kind of communication between systems built by different authors (e.g. various apps and services). Discovery, especially, is the fascinating part. Rather than hoping an OpenAPI spec exists and hoping it's right, MCP has this exchange of capabilities baked in.
I spent the last 9 years building integration technology, and from that perspective, the discovery-documentation-implementation problem is the core issue.
Right now, LLMs basically "solve" the integration problem because they can do the mapping between external tools/resources/formats and internal ones.
But there's nothing that strictly "requires" an LLM to be involved at all. That's just the primary reason to develop MCP. But you could just as well use this as a way for integrating systems, making some bets on interface stability (and using LLMs for cases only when your prior expectations no longer hold and you need a new mapping).
The comparison is perhaps imperfect and overused, but I feel like we're witnessing the birth of a new USB-like standard. There's something right now that it was designed to do, but it's a decent enough standard that can actually handle many things.
I wouldn't be surprised if in some period of time we see enterprise apps shift from REST to MCP for bi-directional integrations.
For the OP, I'm not sure if you're working on an MCP proxy (A) as a commercial offering, (B) as something for your team to use, closed source, or (C) as something open source for fun. But we just built and started selling an MCP proxy/gateway. It handles identities for humans & bots, tool allowlists, and policy setting for an org.
If you don't want to build something on your own because of option B above, get in touch.
Without something like MCP, each application wrapper is left to do its own ad hoc wrapping for external tools (tools internal to the wrapper don’t use MCP). With MCP, it just integrates an MCP client library, and then it can use any tool, resource, or prompt provided by any MCP server available to it.
1. A user interacting with multiple MCP servers behind a gateway (with MCP client support) that obtains authentication from the user to those servers in some way (OAuth/OIDC with PKCE, usually; sometimes token exchange), allowing out-of-band auth
2. The same, but built on identity for service accounts/native identity or something, for automation
would enable this. There are a few SEPs open now around this.
This terrifies me. This whole time I was writing bash commands into my terminal, I thought I knew how to use the tools. Now, I’ve just learned that I had no idea how to use tools at all! I just knew how to write text that /represented/ tool use.
MCP happens at a different layer. You have to run the MCP commands. Or use a client that does it for you:
> But the LLM will never know you are using MCP, unless you are letting it know in the system prompt of tool definitions. You, the developer, is responsible for calling the tools. The LLM only generates a snippet of what tool(s) to call with which input parameters.
The article is describing how MCP works, not making an argument about what it means to "understand" something.
This is what the author means by "knowing how to use the tool". The LLM alone is effectively a function that outputs text, it has no other capabilities, it cannot "connect to" or "use" anything by itself. The closest it can come is outputting an unambiguous, structured text request that can be interpreted by the application code that wraps it and does something on its behalf.
The author's point hinges on the architectural distinction between the LLM itself and that application code, which is increasingly irrelevant and invisible to most people (even developers) because the application code that knows how to do things like call MCP servers is already baked in to most LLM-driven products and services. No one is "talking directly to" an LLM, it's all mediated by multiple layers, including layers that perform tool calling.
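That wrapping layer can be sketched in a few lines. The JSON convention, the `get_weather` tool, and the bracketed result format are all assumptions made up for this example; real toolchains each have their own format:

```python
import json

# Hypothetical tool registry living in the application code, not the model.
TOOLS = {"get_weather": lambda city: f"Sunny in {city}"}

def handle_model_output(text: str) -> str:
    # The model only ever emits text. This function is the "hands":
    # it decides whether that text is prose or a structured tool request.
    try:
        call = json.loads(text)
    except json.JSONDecodeError:
        return text  # plain prose, pass straight through to the user
    if isinstance(call, dict) and call.get("tool") in TOOLS:
        result = TOOLS[call["tool"]](*call.get("args", []))
        return f"[tool result] {result}"  # fed back into the next prompt
    return text

print(handle_model_output('{"tool": "get_weather", "args": ["Oslo"]}'))
```

Disconnect this loop and the model's "tool call" is just tokens emitted into the void, which is exactly the author's point.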
An analogy would be "humans don't have native tool calling abilities, all they can do is press physical keys that represent a function call". I too don't have the ability to natively control a computer in the same sense that the LLM doesn't. If the keyboard to a computer is disconnected then I too will just emit keypresses into the void much like an LLM will emit tool call tokens into a void where they are not linked to an MCP like interface.
Unlike human beings such as yourself (presumably), LLMs do not have agency, they do not have conscious or active thought. All they do is predict the next token.
I've thought about the above a lot, these models are certainly capable of a lot, but they do not in any form or fashion emulate the consciousness that we have. Not yet.
If you're a large org with an API that an ecosystem of other partners use then you should host a remote MCP and then people should connect LLMs to it.
The current model of someone bundling tools into an MCP server that you then download and run locally feels a bit like the wrong path. Tool definitions for LLMs are already pretty standardized; if things are just running locally, why am I not just importing a package of tools? I'm not sure what the MCP server is adding.
Let ChatGPT/Claude/Cursor manage my Oauth tokens, and then just bring tools into those platforms without a whole MCP server in the middle.
Just think of all those plaintext auth tokens sitting in well-known locations on your machine.
It's a black hat dream.
We'll see, but I think commercial use of local MCPs is going to be constrained to use cases that only make sense if the MCP is local (e.g. it requires local file access).
For everything else, the only commercially reasonable way to use them is going to be remote streamable-HTTP MCPs running in isolated containers.
And even then, you need some management and identity plane. So they're likely going to be accessed via an enterprise gateway/proxy to handle things like:
- composition: bundling multiple MCPs into one for easier connection
- identities per-user / per-agent
- generation of rotatable tokens for headless agents
- filtering what features (tools, prompts, resources) flow through into LLM context
- basic security features, like tool description whitelisting to prevent rug pulls
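The allowlist and rug-pull pieces of that gateway are simple to sketch. Everything here (tool names, the idea of pinning a description hash at approval time) is a hypothetical illustration, not an existing product's behavior:

```python
import hashlib

def fingerprint(description: str) -> str:
    # Pin a tool's description at approval time so later changes are visible.
    return hashlib.sha256(description.encode()).hexdigest()[:16]

def filter_tools(upstream_tools: list, allowlist: set, pinned: dict) -> list:
    passed = []
    for tool in upstream_tools:
        if tool["name"] not in allowlist:
            continue  # never reaches the LLM's context at all
        if pinned.get(tool["name"]) != fingerprint(tool["description"]):
            continue  # description changed since approval: possible rug pull
        passed.append(tool)
    return passed

upstream = [
    {"name": "query_accounts", "description": "Look up accounts"},
    {"name": "delete_everything", "description": "Danger"},
]
pinned = {"query_accounts": fingerprint("Look up accounts")}
safe = filter_tools(upstream, {"query_accounts"}, pinned)
```

The gateway re-checks the fingerprint on every `tools/list`, so a server quietly rewriting a trusted tool's description gets dropped instead of injected into context.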
MCP is only a protocol, after all. It's not meant to be a batteries-included product.
I think it provides benefits similar to decoupling the front and back end of a standard app.
I can pick my favorite AI "front end"- whether that's in my IDE as a dev, a desktop app as a business user, or on a server if I'm running an agentic workflow.
MCP allows you to package tools, prompts, etc. in a way that works across any of those front ends.
Even if you don't plan on leveraging the MCP across multiple tools in that way- I do think it has some benefits in de-coupling the lifecycle of the tool development from the model/ UI.
I work in a marketing team, I would love folks to be able to use Google's Analytics MCP [1]. The idea of getting people into Google Cloud, or setting up and sharing a file with service account credentials is an absolute nightmare.
I'm not saying these problems can't be solved, and if remote MCPs gain adoption, that alone fixes a lot of the issues; but the way most MCPs are packaged and shared currently leaves A LOT to be desired.
You can name the mechanism whatever you want, but the models don’t have hands. Tool calling conventions (as a concept, or as a spec) is what gives the model hands!
No.
If we're going to elevate and reimagine new disciplines every year (RIP prompt engineering), let's at least be thoughtful about it.
Context Engineering is not just "enhanced prompt engineering".
It is creating the context in which an agent operates such that its outcomes are realized.
Yes, this is partly about the input that an agent receives, but increasingly is more about creating a context-rich environment that an agent can effectively determine relevant context within.
That is a much more valuable and difficult problem space than "Shove the square context in the square hole"
Context engineering is "just" prompt engineering for LLMs with tool use: it extends the concerns of prompt engineering with the concern of setting up an environment in which tools can be used, and how the LLM can most effectively interact with the environment.
I feel a little government interior ministry of population control-like after saying that.
We’re all going to figure out the answers in tandem since this stuff is so new (really cool time!).