https://modelcontextprotocol.io/specification/2025-06-18/cha...
My main disappointment with sampling right now is the very limited scope. It'd be nice to support some universal tool calling syntax or something. Otherwise a reasonably complicated MCP server is still going to need a direct LLM connection.
I don't get how MCP could create a wrapper for all possible LLM inference APIs, or why it'd be desirable (that's an awfully long leash for me to give out on my API key).
If I'm building a local program, I am going to want tighter control over the toolsets my LLM calls have access to.
E.g. an MCP server for Google Calendar. MCP is not saving me significant time - I can access the same APIs the MCP can. I probably need to carefully instruct the LLM on when and how to use the Google Calendar calls, and I don't want to delegate that to a third party.
I also do not want to spin up a bunch of arbitrary processes in whatever runtime environment the MCP is written in. If I'm writing in Python, why do I want my users to have to set up a TypeScript runtime? God help me if there's a security issue in the MCP wrapper for language_foo.
On the server, things get even more difficult to justify. We have a great tool for having one machine call a process hosted on another machine without knowing its implementation details: the RPC. MCP just adds a bunch of opinionated middleware (and security holes).
In the limit - I remember some old saw about how everyone had the same top 3 rows of apps on their iPhone homescreen, but the last row was all different. I bet IT will be managing, and dev teams will be making, their own bespoke MCP servers for years to come.
This is what people mean when they say that MCP should maybe wait for a better LLM before going all-in on this design.
To your point that this isn't trivial or universal, there's a sharp gradient that you wouldn't notice if you're just opining on it as opposed to coding against it -- e.g. I've spent every waking minute since mid-December on MCP-like territory, and it still strikes me how much worse every model is than Claude at it. It sounds like you have similar experience, though perhaps you're not as satisfied with Claude as I am.
> you wouldn't notice if you're just opining on it as opposed to coding against it
Maybe I'm being sensitive, but that is perhaps not the way I would have worded it, as it reads a bit like an insult. Food for thought.
It's providing a standardized protocol to attach tools (and other stuff) to agents (in an LLM-centric world).
That most MCP usage you'll find out in repositories in the wild is focused on local toolchains is mostly due to MCP on launch being essentially only available via the Claude Desktop client. There they also highlighted many local single-user use-cases (rather than organizational ones). That SSE support was spotty in most implementations didn't help either.
If you're using the API and not in a hurry, there's no need for it.
Not familiar with Elixir, but is there anything prohibiting you from just making a monolith MCP combining multiple disparate APIs/backends/microservices as you were doing previously?
Further, you won't get the various client-application integrations merely by using tool-calling - and those are, to me, the "killer app" of MCP (as a sibling comment touches on).
(I do still have mixed feelings about MCP, but in this case MCP sorta wins for me)
This is what I ended up doing.
The reason I thought I must do it the "MCP way" was because of the tons of YouTube videos about MCP which just kept saying how much of an awesome protocol it is, and everyone should be using it, etc. Once I realized it's actually more consumer-facing than backend-facing, it made much more sense why it became so popular.
This is basically what MCP is. Before MCP, everyone was rolling their own function calling interfaces to every API. Now it’s (slowly) standardising.
An "entire server" is also overplaying what an MCP server is - in the case where an MCP server is just wrapping a single API it can be absolutely tiny, and also can just be a binary that speaks to the MCP client over stdio - it doesn't need to be a standalone server you need to start separately. In which case the MCP server is effectively just a small module.
The problem with making it one server with 100 modules is doing that in a language agnostic way, and MCP solves that with the stdio option. You can make "one server with 100 modules" if you want, just those modules would themselves be MCP servers talking over stdio.
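To make "absolutely tiny" concrete, here is a minimal sketch of a single-tool stdio server, assuming the official MCP Python SDK; the calendar tool itself is a hypothetical stand-in:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("calendar")  # server name shown to the client

    @mcp.tool()
    def list_events(day: str) -> str:
        """List calendar events for the given ISO date."""
        return f"(events for {day} would be fetched here)"

    if __name__ == "__main__":
        mcp.run(transport="stdio")  # speaks JSON-RPC over stdin/stdout

The client launches this as a subprocess; there is no port, no standalone daemon, nothing to deploy separately.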
I agree with this. For my particular use-case, I'm completely into Elixir, so, for backend work, it doesn't provide much benefit for me.
> it can be absolutely tiny
Yes, but at the end of the day, it's still a server. Its size is immaterial - you still need to deal with the issues of maintaining a server - patching security vulnerabilities and making sure you don't get hacked and don't expose anything publicly you're not supposed to. It requires routine maintenance just like a real server. Multiply that by 100, if you have 100 MCP "servers". It's just not a scalable model.
In a monolith with 100 modules, you just do all the security patching for ONE server.
I think you have an overcomplicated idea of what a "server" means here. For MCP that does not mean it needs to speak HTTP. It can be just a binary that reads from stdin and writes to stdout.
That's actually true, because I'm always thinking from a cloud deployment perspective (which is my use case). What kind of architecture do you run this on, at scale in the cloud? You have very limited options if your monolith is on a serverless platform and is CPU/memory bound, too. So, that's where I was coming from.
The overhead of spawning a process is not the problem. The overhead of a huge and/or slow-to-start runtime could be, in which case you simply wouldn't run it on a serverless system - which is ludicrously expensive at scale anyway (my dayjob is running a consultancy where the big money-earner is helping people cut their cloud costs, and moving them off serverless systems that are entirely wrong for them is often one of the big savings; serverless is cheap for tiny systems, but even then the hassle often isn't worth it).
That implies a language lock-in, which is undesirable.
With something like https://nango.dev you can get a single server that covers 400+ APIs.
Also handles auth, observability and offers other interfaces for direct tool calling.
(Full disclosure, I’m the founder)
Email me on robin @ <domain> and happy to find a solution for your use case
In the end, MCP is just like REST APIs. There's no need for a paid service for me to connect to 400 REST APIs now, so why would I need a service to connect to 400 MCPs?
All I need for my users is to be able to connect to one or two really useful MCPs, which I can do myself. I don't need to pay for some multi REST API server or multi MCP server.
An "agent" with access to 400 MCPs would perform terribly in real world situations, have you ever tried it? Agents would best with a tuned well written system prompt and access to a few, well tuned tools for that particular agents job.
There's a huge difference between a fun demo of an agent calling 20 different tools and an actual valuable use-case which works RELIABLY. In fact, agents with tons of MCP tools are currently insanely UNRELIABLE; it's much better to just have a model plus one or two tools combined with a strict input and output schema - and even then, it's pretty unreliable for actual business use-cases.
I think most folks right now have never actually tried making a reliable, valuable business feature using MCPs, so they think somehow having "400 MCPs" is a good thing. But they haven't spent a few minutes thinking, "wait, why does our business need an agent which can connect to YouTube and Spotify?"
Forcing LLMs to output JSON is just silly. A lot of time and effort is being spent forcing models to output a format that is picky and that LLMs quite frankly just don't seem to like very much. A text based DSL with more restrictions on it would've been a far better choice.
Years ago I was able to trivially teach GPT-3.5 to reliably output an English-like DSL with just a few in-prompt examples. Meanwhile, even today the latest models still carry documentation notes that they may occasionally ignore some parts of JSON schemas sent down.
Square peg, round hole, please stop hammering.
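For illustration, the kind of restricted, English-like DSL the parent means can be parsed in a few lines; the format here is entirely made up:

    import re

    # One action per line, e.g.:  create_event title="Lunch" date="2025-07-01"
    LINE = re.compile(r'^(\w+)\s*(.*)$')
    ARG = re.compile(r'(\w+)="([^"]*)"')

    def parse_dsl(text: str) -> list[tuple[str, dict[str, str]]]:
        calls = []
        for raw in text.strip().splitlines():
            m = LINE.match(raw.strip())
            if not m:
                continue  # tolerate chatter instead of failing the whole parse
            verb, args = m.groups()
            calls.append((verb, dict(ARG.findall(args))))
        return calls

A malformed line degrades to a skipped line rather than a broken parse, which is exactly what strict JSON can't give you.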
For example, in Elixir, we have this library: https://hexdocs.pm/instructor/
It's massively useful for any structured output related work.
I imagine that validation as you go could slow things down though.
Do I look at whether the data format is easily output by my target LLM?
Or do I just validate and clamp/discard non-conforming output?
Always using the latter seems pretty inefficient.
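A sketch of the validate-and-clamp option, assuming pydantic v2 and a hypothetical target schema; the inefficiency is plain to see, since every discarded record is wasted generation:

    from pydantic import BaseModel, ValidationError

    class Event(BaseModel):  # hypothetical target schema
        title: str
        date: str

    def clamp(records: list[dict]) -> list[Event]:
        kept = []
        for rec in records:
            try:
                kept.append(Event.model_validate(rec))
            except ValidationError:
                pass  # discard non-conforming output
        return kept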
No reason why you couldn't just use tool calls with OpenAPI specs though. Either way, making all your microservices talk to each other over MCP sounds wild.
Unlike web dev, where your client can just write more components / API callers, AI has context limits. If you try to plug in (I get it, exaggerated) 500 MCP servers, each with 2-10 tools, you will waste a ton of context in your system prompt. You can use a tool proxy (one tool which routes to the others), but that will add latency from the extra processing.
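A tool proxy of that kind is roughly this - one dispatcher tool in the prompt, real tools resolved at call time (all names illustrative); the extra hop is where the added latency comes from:

    # The model sees a single "call_tool" entry instead of 500 schemas.
    REGISTRY = {
        "calendar.list_events": lambda day: f"events for {day}",
        "email.search": lambda query: f"mail matching {query}",
    }

    def call_tool(name: str, arguments: dict) -> object:
        fn = REGISTRY.get(name)
        if fn is None:
            # returning the catalogue lets the model discover tools lazily
            return {"error": "unknown tool", "available": sorted(REGISTRY)}
        return fn(**arguments)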
There are security / reliability concerns, true, but finally getting technically close to Star Trek computers and then still doing things 'the way they've always been done' doesn't seem efficient.
You can give an LLM a github repo link, a fresh VPC, and say "deploy my app using nginx" and any other details you need... and it'll get it done.
> n. the process of getting or producing something, especially information or a reaction
Now, Sampling seems like an odd feature name to me, but maybe I'm missing the driver behind that name.
It's right there in my comment. You don't need to call user input "elicitation" just to pretend it's something it's not.
Same goes for sampling.
Yeah, let's pretend it works. So far structured output from an LLM is an exercise in programmers' ability to code defensively against responses that may or may not be valid JSON, may not conform to the schema, or may just be null. There's a new cottage industry of modules that automate dealing with this crap.
https://openai.com/index/introducing-structured-outputs-in-t...
> Structured Outputs can still contain mistakes.
The guarantee promised in link 1 is not supported by the documentation in link 2. Structured Output does a _very good_ job, but still sometimes messes up. When you’re trying to parse hundreds of thousands of documents per day, you need a lot of 9s of reliability before you can earnestly say “100% guarantee” of accuracy.
Anecdotally, I've seen Azure OpenAI services hallucinate tools just last week, when I provided an empty array of tools rather than not providing the tools key at all (silly me!). Up until that point I would have assumed that there are server-side safeguards against that, but now I have to consider spending time on adding client-side checks for all kinds of bugs in that area.
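The client-side guard for that particular bug is small - just never send an empty array. A sketch against the OpenAI Python client (model name illustrative):

    def chat(client, messages, tools=None):
        kwargs = {"model": "gpt-4o", "messages": messages}
        if tools:  # omit the key entirely rather than passing []
            kwargs["tools"] = tools
        return client.chat.completions.create(**kwargs)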
You are confusing API response payloads with structured JSON that we expect to conform to the given schema. It's carnage that requires defensive coding. Neither OpenAI nor Google is interested in fixing this, because some developers decide to retry until they get valid structured output, which means they spend 3x-5x on API calls.
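The defensive pattern (and the retry cost) being described looks something like this; `call_llm` and `conforms` are hypothetical stand-ins:

    import json

    def structured(call_llm, prompt, conforms, max_tries=3):
        for _ in range(max_tries):  # every retry is another billed call
            raw = call_llm(prompt)
            try:
                data = json.loads(raw)
            except (json.JSONDecodeError, TypeError):
                continue  # not JSON at all, or None
            if data is not None and conforms(data):
                return data
        raise ValueError("no schema-conforming output after retries")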
That (prompt injection) isn’t something you can fix until you come up with a way to split prompts.
That means new types of models; there is no variant of MCP that can solve it with existing models.
I guess the phone phreak generation wasn't around to say 'Maybe this is a bad idea...' (because the first thing users are going to do is try to hijack control via in-band overrides).
Earlier I posted about mcp-webcam (you can find it) which gives you a no-install way to try out Sampling if you like.
1) Probably the coolest thing to happen with LLMs. While you could do all this with tool calling and APIs, being able to send less technical friends an MCP URL for Claude and seeing them get going with it in a few clicks is amazing.
2) I'm using the C# SDK, which only has auth in a branch - so very bleeding edge. Had a lot of problems with implementing that - 95% of my time was spent on auth (which is required for Claude MCP integrations if you are not building locally). I'm sure this will get easier with docs but it's pretty involved.
3) Related to that, Claude doesn't AFAIK expose much in terms of developer logs for what they are sending via their web (not desktop) app and what is going wrong. It would be super helpful to have a developer mode where it showed you the request/response of errors. I had real problems with refresh on auth, and it turned out I was logging the wrong endpoint on my side. Operator error for sure, but I would have fixed it in a couple of minutes had they had better MCP logging somewhere in the web UI. It all worked fine with stdio in desktop and MCP Inspector.
4) My main question/issue is handling longer-running tasks. The dataset I'm exposing is effectively a load of PDF documents, as I can't get Claude to handle the PDF files itself (I am all ears if there is a way!). What I'm currently doing is sending them through Gemini to get the text, then sending that to the user via MCP. This works fine for short/simple documents, but for longer documents (which can take some time to process) I return a message saying it is processing and to retry later.
While I'm aware there is a progress API, it still requires keeping the connection open to the server (which times out after a while, with Cloudflare at least - could be wrong here though). It would be much better to be able to tell the LLM to check back in x seconds when you predict it will be done, so the LLM can do other stuff in the background (which it will do), but then sorta 'pause execution' until the timer is hit.
Right now (AFAIK!) you can either keep it waiting (which means it can't do anything else in the meantime) with an open connection w/ progress, or you can return a job ID, but then it will just return a half-finished answer, which is often misleading as it doesn't have all the context yet. Don't know if this makes any sense, but I can imagine this being a real pain for tasks that take 10mins+.
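The job-ID workaround described above boils down to a pair of tools roughly like this (everything here is illustrative); it works, but nothing stops the model from answering before the result exists:

    import uuid

    JOBS: dict[str, dict] = {}  # job_id -> {"done": bool, "text": str | None}

    def start_extraction(pdf_url: str) -> dict:
        job_id = uuid.uuid4().hex
        JOBS[job_id] = {"done": False, "text": None}
        # ...enqueue background extraction that later fills in "text"...
        return {"job_id": job_id, "status": "processing",
                "hint": "call check_extraction with this id in ~60s"}

    def check_extraction(job_id: str) -> dict:
        job = JOBS.get(job_id)
        if job is None:
            return {"error": "unknown job id"}
        if not job["done"]:
            return {"status": "processing", "hint": "try again later"}
        return {"status": "done", "text": job["text"]}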
There are a few proposals floating around, but one issue is that you don't always know whether a task will be long-running, so having separate APIs for long-running tasks vs "regular" tool calls doesn't fully address the problem.
I've written a proposal to solve the problem in a more holistic way: https://github.com/modelcontextprotocol/modelcontextprotocol...
Real-world LLM use is going to be built on non-negligible (and increasingly complicated) deterministic tools, so you might as well integrate with the 'for all possible workflows' use case.
freeone3000•5mo ago
How does it differ from providing a non-MCP REST API?
hobofan•5mo ago
However, as someone that has tried to use OpenAPI for that in the past (both via OpenAI's "Custom GPT"s and auto-converting OpenAPI specifications to a list of tools), in my experience almost every existing OpenAPI spec out there is insufficient as a basis for tool calling in one way or another:
- Largely insufficient documentation on the endpoints themselves
- REST is too open to interpretation, and without operationIds (which almost nobody in the wild defines), there is usually context missing on what "action" is being triggered by POST/PUT/DELETE endpoints (e.g. many APIs do a delete of a resource via a POST or PUT, and some APIs use DELETE to archive resources)
- baseUrls are often wrong/broken and assumed to be replaced by the API client
- underdocumented AuthZ/AuthN mechanisms (usually only present in the general description comment on the API, and missing on the individual endpoints)
In practice you often have to remedy that by patching the officially distributed OpenAPI specs to make them good enough for a basis of tool calling, making it not-very-plug-and-play.
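The patching itself is mundane - fix the server URL, add the missing operationId and an unambiguous description before converting endpoints to tools. A sketch (paths and values invented):

    import copy

    def patch_spec(spec: dict) -> dict:
        spec = copy.deepcopy(spec)
        # baseUrl is often wrong or a placeholder in the shipped spec
        spec["servers"] = [{"url": "https://api.example.com/v2"}]
        op = spec["paths"]["/events"]["post"]
        op["operationId"] = "create_event"  # almost never set upstream
        op["description"] = "Create a calendar event. Does NOT send invites."
        return spec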
I think the biggest upside that MCP brings (all "content"/"functionality" being equal) over just plain REST is that it acts as a badge that says "we had AI usage in mind when building this".
On top of that, MCP also standardizes mechanisms like e.g. elicitation that with traditional REST APIs are completely up to the client to implement.
fennecfoxy•5mo ago
My gripe is that they had the opportunity to spec out tool use in models and they did not. The client->LLM implementation is up to the implementor, and many models differ with different tags like <|python_call|> etc.
lsaferite•5mo ago
I'm with you on an Agent -> LLM industry standard spec need. The APIs are all over the place and it's frustrating. If there were a spec for that, then agent development becomes simply focused on the business logic, and the LLM and the Tools/Resources are just standardized components you plug together like Lego. I've basically done that for our internal agent development. I have a Universal LLM API that everything uses. It's helped a lot.
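Such a universal layer doesn't have to be big - a thin interface that each provider is adapted onto. A sketch (the OpenAI adapter uses the real v1 client API; the interface shape itself is an assumption):

    from typing import Protocol

    class LLM(Protocol):
        """The one interface agent code targets; providers hide behind it."""
        def complete(self, messages: list[dict],
                     tools: list[dict] | None = None) -> dict: ...

    class OpenAIBackend:
        def __init__(self, client, model: str):
            self.client, self.model = client, model

        def complete(self, messages, tools=None):
            kwargs = {"model": self.model, "messages": messages}
            if tools:
                kwargs["tools"] = tools
            r = self.client.chat.completions.create(**kwargs)
            msg = r.choices[0].message
            return {"text": msg.content, "tool_calls": msg.tool_calls}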
ethbr1•5mo ago
It has the physical plug, but what can it actually do?
It would be nice to see a standard aiming for better UX than USB C. (Imho they should have used colored micro dots on device and cable connector to physically declare capabilities)
ethbr1•5mo ago
If I am looking at a device/cable, with my eyes, in the physical world, and ask the question "What does this support?", there's no way to tell.
I have to consult documentation and specifications, which may not exist anymore.
So in the case of standards like MCP, I think it's important to come up with answers to discovery questions, lest we all just accept that nothing can be done and the clusterfuck 10+ years from now was inevitable.
A good analogy might be imagining how the web would have evolved if we'd had TCP but no HTTP.
fennecfoxy•5mo ago
I would say for all the technology we have in 2025, this has certainly been one of the core issues for decades & decades. Nothing talks to each other properly, nothing works with another thing properly. Immense effort has to be expended for each thing to talk to or work with the other thing.
I got a MacBook Air for light dev as a personal laptop. It can't access an Android phone's filesystem when one is plugged in. Windows can do it. I know Apple's corporate reasoning, but it's just an example of purposeful incompatibility.
As you say, all these companies use standards like TCP/HTTP/Wifi/Bluetooth/USB/etc and they would be nowhere without them - but literally every chance they get they try to shaft us on it. Perhaps AI will assist in the future - tell it you want x to work with y and the model will hack on it until the fucking thing works.
fennecfoxy•5mo ago
Not sure why people treat MCP like it's much more than smashing tool descriptions together and concatenating to the prompt, but here we are.
It is nice to have a standard definition of tools that models can be trained/fine tuned for, though.
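Mechanically, that "smashing together" is about this much code - MCP's contribution is just that `name`/`description`/`inputSchema` are the same fields everywhere (a rough sketch, not any SDK's actual prompt format):

    import json

    def render_tools(tools: list[dict]) -> str:
        """Flatten MCP-style tool definitions into a system-prompt section."""
        lines = ["You can call the following tools:"]
        for t in tools:
            schema = json.dumps(t.get("inputSchema", {}))
            lines.append(f'- {t["name"]} {schema}: {t["description"]}')
        return "\n".join(lines)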
lobsterthief•5mo ago
As an example, anyone who's coded email templates will tell you: it's hard. While the major browsers adopted the W3C specs, email clients (i.e. email renderers) never adopted the spec, or such a W3C email HTML spec didn't exist. So something that renders correctly in Gmail looks broken in Yahoo Mail in Safari on iOS, etc.
fennecfoxy•5mo ago
But browsers/the web ecosystem are still a bad example, as we had decades of browsers supporting their own particular features/extensions. This has converged somewhat, pretty much because everything now uses Chromium underneath (bar Safari and Firefox).
But even so...if I write an extension while using Firefox, why can't I install that extension in Chrome? And vice-versa? Even bookmarks are stored in slightly different formats.
It is a massive pain to align technology like this, but the benefits are huge. Like boxing developers in with a good library (to stop them from doing arbitrary custom per-project BS), I think all software needs to be boxed into standards with provisions for extension/innovation, rather than this pick & choose BS because muh lock-in.
seanobannon•5mo ago
- a TypeScript library to set up ChatGPT-MCP compatible auth
- Source code for an Express & NextJS project implementing the library
- a URL for the demo of the deployed NextJS app that logs ChatGPT tool calls
Should be helpful for folks trying to set up and distribute custom connectors.
https://github.com/mcpauth/mcpauth