Though from everything I've read online, Zed's edit prediction model is far, _far_ behind that of Cursor.
I'm a big fan of Zed but tbf I'm just using Claude Code + Nvim nowadays. Zed's problem with their Claude integration is that it will never be as good as just using the latest from Claude Code.
What is the core business plan then?
I understand that there's nothing you could do to protect me if I write a prompt that ends up using >$5 of usage, but after that I would like Zed to reject anything except my personal API keys.
Usage pricing on something like AWS is pretty easy to figure out. You know what you're going to use, so you just do some simple arithmetic and you've got a pretty accurate idea. Even with serverless it's pretty easy. Tokens are so much harder, especially in a development setting. It's so hard to have any reasonable forecast of how a team will use it, and how many tokens will be consumed.
I'm starting to track my usage with a bit of a breakdown in the hope that I'll find a somewhat reliable trend.
I suspect this is going to be one of the next big areas in cloud FinOps.
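To make the contrast concrete, here's a rough sketch of the two kinds of estimate. Every price and usage figure below is a made-up assumption for illustration, not anyone's actual pricing:

```python
# Rough illustration of why token forecasting is harder than instance pricing.
# All prices and usage numbers below are assumed, not real quotes.

# Classic cloud: fixed hourly rate * known hours -> a tight estimate.
instance_rate = 0.10           # $/hour (assumed)
hours_per_month = 730
compute_cost = instance_rate * hours_per_month

# Tokens: cost depends on per-request input/output sizes, which vary wildly.
price_in = 3.00 / 1_000_000    # $ per input token (assumed)
price_out = 15.00 / 1_000_000  # $ per output token (assumed)

def monthly_token_cost(requests, avg_in, avg_out):
    """Expected monthly spend for a given request volume and average sizes."""
    return requests * (avg_in * price_in + avg_out * price_out)

# The same team, at the same request volume, can land anywhere in this range
# depending on how much context their workflow stuffs into each request:
light = monthly_token_cost(2_000, avg_in=2_000, avg_out=500)
heavy = monthly_token_cost(2_000, avg_in=40_000, avg_out=4_000)

print(f"instance estimate: ${compute_cost:.2f}")
print(f"token estimate:    ${light:.2f} .. ${heavy:.2f}")
```

The instance estimate is a single number; the token estimate spans more than an order of magnitude from the usage pattern alone, which is exactly the forecasting problem.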
I don't see how it matters to you that you aren't saturating your $200 plan. You have it because you hit the limits of the $100/mo plan.
It already is. There’s been a lot of talk and development around FinOps for AI and the challenges that come with that. For companies, forecasting token usage and AI costs is non-trivial for internal purposes. For external products, what’s the right unit economic? $/token, $/agentic execution, etc? The former is detached from customer value, the latter is hard to track and will have lots of variance.
With how variable output size can be (and input), it’s a tricky space to really get a grasp on at this point in time. It’ll become a solved problem, but right now, it’s the Wild West.
When you get Claude Code's $20 plan, you get "around 45 messages every 5 hours". I don't really know what that means. Does that mean I get 45 total conversations? Do minor followups count against a message just as much as a long initial prompt? Likewise, I don't know how many messages I'll use in a 5 hour period. However, I do understand when I start bumping up against limits. If I'm using it and start getting limited, I understand that pretty quickly - in the same way that I might understand a processor being slower and having to wait for things.
With tokens, I might blow through a month's worth of tokens in an afternoon. On one hand, it makes more sense to be flexible for users. If I don't use tokens for the first 10 days, they aren't lost. If I don't use Claude for the first 10 days, I don't get 2,160 message credits banked up. Likewise, if I know I'm going on vacation later, I can't use my Claude messages in advance. But it's just a lot easier for humans to understand bumping up against rate limits over a more finite period of time and get an intuition for what they need to budget for.
Edit: To be clear, I'm not talking about Zed. I'm talking about the companies that make the models.
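For anyone checking the 2,160 figure above, it falls straight out of the rolling-window limit (taking "around 45 messages every 5 hours" at face value):

```python
# Where 2,160 comes from: 45 messages per rolling 5-hour window,
# hypothetically banked over 10 unused days.
MSGS_PER_WINDOW = 45
WINDOW_HOURS = 5
days = 10

windows = days * 24 // WINDOW_HOURS   # 48 five-hour windows in 10 days
banked = MSGS_PER_WINDOW * windows
print(banked)  # 2160
```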
The hard work is the high-level stuff: deciding on the scope of the project, how the feature should fit into it, what kind of extensibility it might need to be built with, what other components can be extended to support it (and more), and then reviewing all the work that was done.
Agree with the sentiment, but I do think there are edge cases.
e.g. I could see a place like OpenRouter getting away with a tiny fractional markup, based on the value they provide by having all providers in one place.
At scale, OpenRouter will instead get you the lower high-volume fees they themselves get from their different providers.
Why does that not justify charging a fraction of your spend on the LLM platform? This is pretty much how every service business operates.
I'm curious if they have plans to improve edit prediction though. It's honestly kind of garbage compared to Cursor, and I don't think I'm being hyperbolic by calling it garbage. Most of the time its suggestions aren't helpful, but the 10-20% of the time it is helpful is worth the cost of the subscription for me.
I suspect I’m not alone on this. Zed is not the editor for hardcore agentic editing and that’s fine. I will probably save money on this transition while continuing to support this great editor for what it truly shines at: editing source code.
That said, token-based pricing has misaligned incentives as well: as the editor developer (charging a margin on tokens) or the AI provider, you benefit from more verbose input fed to the LLMs and, of course, more verbose output from them.
Not that I'm really surprised by the announcement though; it was somewhat obviously unsustainable.
Why: LLMs are increasingly becoming multimodal, so an image "token" or video "token" is not as simple as a text token. Also, it's difficult to compare across competitors because tokenization is different.
Eventually prices will just be in $/Mb of data processed. Just like bandwidth. I'm surprised this hasn't already happened.
For autoregressive token-based multimodal models, image tokens are as straightforward as text tokens, and there is no reason video tokens wouldn't also be. (If models also switch architectures and multimodal diffusion models, say, become more common, then, sure, a different pricing model more tied to the actual compute cost drivers for that architecture is likely, but... even that isn't likely to be bytes.)
> Also, it's difficult to compare across competitors because tokenization is different.
That’s a reason for incumbents to prefer not to switch, though, not a reason for them to switch.
> Eventually prices will just be in $/Mb of data processed.
More likely they would be in floating-point operations expended processing them, but using tokens (which are the primary cost drivers for the current LLM architectures) will probably continue as long as the architecture itself is dominant.
There is very little value that a company that has to support multiple different providers, such as Cursor, can offer on top of the tailored agents (and "unlimited" subscription models) from the LLM providers themselves.
I get that the pro plan has $5 of tokens and the pricing page says that a token is roughly 3-4 characters. However, it is not clear:
- Are tokens input characters, output characters, or both?
- What does a token cost? I get that the pricing page says it varies by model and is "API list price +10%", but nowhere does it say what those API list prices are. Am I meant to go to the OpenAI, Anthropic, and other websites to get that pricing information? Shouldn't that be in a table on that page, with each hosted model listed?
—
I’m only a very casual user of AI tools so maybe this is clear to people deep in this world, but it’s not clear to me just from Zed's pricing page exactly how far $5 per month will get me.
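For what it's worth, once you do look up a provider's list prices, the "+10%" arithmetic is simple. The per-million-token prices below are assumed example values, not taken from Zed's page; check the provider's own pricing:

```python
# Back-of-envelope: how far does $5 go at "API list price + 10%"?
# The list prices here are assumed examples ($ per million tokens).
LIST_IN = 3.00    # $/M input tokens (assumed)
LIST_OUT = 15.00  # $/M output tokens (assumed)
MARKUP = 1.10     # Zed's stated +10%

def cost(tokens_in, tokens_out):
    """Marked-up cost of a single request."""
    return MARKUP * (tokens_in / 1e6 * LIST_IN + tokens_out / 1e6 * LIST_OUT)

# One mid-sized request: 5k tokens in, 1k tokens out (assumed sizes).
per_request = cost(5_000, 1_000)
print(f"per request:     ${per_request:.4f}")
print(f"requests per $5: {int(5 / per_request)}")
```

Under these assumptions $5 covers roughly 150 mid-sized requests, but note that input tokens include everything re-sent as context, which is why agentic workflows burn through budgets so much faster than the per-message arithmetic suggests.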
Presumably everyone is just aiming, or hoping, for inference costs to drop so much that they can offer an unlimited-with-ToS plan like most home Internet access, because this intermediate phase of having to count your pennies to ask the matrix multiplier questions isn't going to be very enjoyable or stable, or encourage good companies to succeed.
I think Zed had a lot of good concepts where they could make paid AI benefits optional longer term. I like that you can join your devs to look at different code files and discuss them. I might still pay for Zed's subscription in order to support them long term regardless.
I'm still upset that so many hosted models don't just let you use your subscription in things like Zed or JetBrains AI; what's the point of a monthly subscription if I can only use your LLM in a browser?
But as a mostly Claude Max + Zed user, I'm happy to see my costs go down.