He's trying to paint it as removing access to something everyone gets by default, but in reality it's removing special treatment that they were previously giving Windsurf.
The default Anthropic rate-limit tiers don't support something Windsurf-sized, and right now getting increased rate limits outside of their predefined tiers is an uphill battle because of how badly they're compute-constrained.
As a Windsurf user: maybe the Sonnet 3.x models are slower right now, but they don't require BYOK like Sonnet 4 does. So this is a bit of an exaggeration, isn't it? Anthropic did not cut them off; they seem to have continued honoring existing quota on previous agreements.
What did Windsurf think was going to happen with this particular exit? Also, how embarrassing is this for OpenAI that it's even a big deal?
https://www.businessinsider.com/restaurant-accused-reselling...
Grocery and department stores routinely have brands that compete with those they resell, but they're not cut off for that. E.g., Kroger operates its own bakery and resells bread.
What makes technology unlike those?
IMHO, there are 3 types of products in current LLM space (excluding hardware):
* Model makers - OpenAI, Anthropic (Claude), Google (Gemini)
* Infra makers - Groq, together.ai, etc.
* Product makers - Cursor, Windsurf and others
If Level 1 can block Level 3 this easily, that's a problem for the industry in my book. There will be no trust between the different types of companies, and when there is no trust, some companies become monopolies with bad behavior, which is bad for customers/users.

And for that to happen you need to be (a) an effective monopoly, (b) have a negative direct or indirect impact on consumers, (c) be large enough for regulators to care about, and (d) be in a regulatory environment that prioritizes this enforcement.
Anticompetitive practices are actions that reduce competitiveness in a market by entrenching your dominance over the (usually smaller) competition.
Not allowing your competitor to buy your product arguably increases competition? It pushes them to improve their own product to be as good as yours.
Amazon had a similar tactic, where it would use other sellers on its marketplace to validate market demand for products, and then produce its own cheap copies of the successes.
I think it's the subscription-based models that are tricky to make work in the long term, since they suffer from adverse selection. Only the heaviest users will pay for a subscription, and those are the users that you either lose money on or make unhappy with strict usage limits. It's kind of the inverse of the gym membership model.
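A toy sketch of that adverse-selection dynamic (all numbers here are invented purely for illustration):

    # Flat subscription with skewed usage. All numbers are invented.
    subscription_price = 20.0       # $/month
    cost_per_k_requests = 10.0      # provider's marginal cost per 1k requests

    # (subscriber count, thousands of requests per month)
    users = {"light": (100, 0.5), "heavy": (10, 5.0)}

    revenue = sum(n * subscription_price for n, _ in users.values())
    cost = sum(n * k * cost_per_k_requests for n, k in users.values())
    print(f"blended margin: ${revenue - cost:,.0f}")   # positive while light users stay

    # Adverse selection: light users churn, heavy users remain.
    n, k = users["heavy"]
    print(f"heavy-only margin: ${n * (subscription_price - k * cost_per_k_requests):,.0f}")

The flat fee is profitable only while light users stick around; once they churn, you're left with exactly the users you lose money on.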
Honestly, I think the subscriptions are mainly used as a demand moderation method for advanced features.
Many people believe that model providers are running at negative margin.
(I don't know how true it is.)
1. There is no point in providing paid APIs at negative margins, since there's no platform power in having a larger paid API share (paid access can't be used for training data, no lock-in effects, no network effects, no customer loyalty, no pricing power on the supply side since Nvidia doesn't give preferential treatment to large customers). Even selling access at break-even makes no sense, since that is just compute you're not using for training, or not selling to other companies desperate for compute.
2. There are 3rd-party providers selling only the compute, not models, who have even less reason to sell at a loss. Their prices are comparable to 1st-party providers.
3. DeepSeek published their inference cost structure for R1. According to that data their paid API traffic is very lucrative (their GPU rental costs for inference are under 20% of their standard pricing, i.e. >80% operating margins; and the rental costs would cover power, cooling, and depreciation of the capital investment). A quick sanity check on that arithmetic follows below.
Insofar as frontier labs are unprofitable, I think it's primarily due to them giving out vast amounts of free access.
[0] https://www.snellman.net/blog/archive/2025-06-02-llms-are-ch...
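As that sanity check (only the under-20% cost ratio comes from DeepSeek's published figures; the list price is a placeholder):

    # Back-of-the-envelope margin from the published cost ratio.
    list_price = 1.00                  # placeholder $/1M tokens
    gpu_rental = 0.20 * list_price     # "under 20% of their standard pricing"

    margin = (list_price - gpu_rental) / list_price
    print(f"operating margin: {margin:.0%}")   # -> 80%, a floor for the >80% above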
Classic fixed / variable cost fallacy: if you look at the steel and plastic in a $200k Ferrari, it’s worth about $10k. They have 95% gross margins! Outrageous!
(Nevermind the engine R&D cost, the pre-production molds that fail, the testing and marketing and product placement and…)
Training is a fixed cost, not a variable cost. My initial comment was on the unit economics, so fixed costs don't matter. But including the full training costs doesn't actually change the math that much as far as I can tell for any of the popular models. E.g. the alleged leaked OpenAI financials for 2024 projected $4B spent on inference, $3B on training. And the inference workloads are currently growing insanely fast, meaning the training gets amortized over a larger volume of inference (e.g. Google showed a graph of their inference volume at Google I/O -- 50x growth in a year, now at 480T tokens / month[0])
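To make the amortization point concrete (using the alleged leaked 2024 figures from above; the growth multiples are illustrative):

    # Fixed training cost amortized over growing inference volume.
    training = 3e9      # alleged 2024 training spend, fixed
    inference = 4e9     # alleged 2024 inference spend, scales with volume

    for growth in (1, 10, 50):
        share = training / (training + inference * growth)
        print(f"{growth:>2}x inference volume -> training is {share:.0%} of total cost")

At 50x the inference volume, the fixed training spend becomes a rounding error in the blended cost per token.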
1. High-volume providers get efficiencies that low-volume ones do not. This comes both from more workload giving more optimization opportunities, and from staffing to do better engineering to begin with. The result is that what is break-even for lower-volume firms is profitable for higher-volume ones, and as high volume is magnitudes more scale, this quickly pays for many people. By being the high-volume API, this game can be played. If they choose not to bother, it is likely because of strategic views on opportunity cost, not inability.
That's not even the interesting analysis, which is what the real stock value is, or whatever corp structure scheme they're doing nowadays:
2. Growth for growth's sake. Uber was exactly this kind of growth-at-all-costs play, going deeper into debt with every customer and fundraise. My understanding is they were able to tame costs and find side businesses (delivery, ...), with the threat becoming more about the category shift to self-driving. By having the channel, they could be the one to monetize as that got figured out better.
Whether tokens or something else becomes what is charged for at the profit layers (with breakeven tokens as cost of business), or subsidization ends and competitive pricing dominates, being the user interface to chat and the API interface to devs gives them channel. Historically, it is a lot of hubris to believe channel is worthless, and especially in an era of fast cloning.
But paid-per-token APIs at negative margins do not provide scaling efficiencies! It's just the provider giving away a scarce resource (compute) for nothing tangible in exchange. Whatever you're able to do with that extra scale, you would have been able to do even better if you hadn't served this traffic.
In contrast, the other things you can use the compute for have a real upside for some part of the genai improvement flywheel:
1. Compute spent on free users gives you training data, allowing the models to be improved faster.
2. Compute spent on training allows the models to be trained, distilled and fine-tuned faster. (Could be e.g. via longer training runs or by being able to run more experiments.)
3. Compute spent on paid inference with positive margins gives you more financial resources to invest.
Why would you intentionally spend your scarce compute on unprofitable inference loads rather than the other three options?
> 2. Growth for growth's sake.
That's fair! It could in theory be a "sell $2 for $1" scenario from the frontier labs that are just trying to pump up their revenue numbers to fund-raise from dumb money who don't think to at least check on the unit economics. OpenAI's latest round certainly seemed to be coming from the dumbest money in the world, which would support that.
I have two rebuttals:
First, it doesn't explain Google, who a) aren't trying to raise money, b) aren't breaking out genai revenue in their financials, so pumping up those revenue numbers would not help at all. (We don't even know how much of that revenue is reported under Cloud vs. Services, though I'd note that the margins have been improving for both of those segments.)
Second, I feel that this hypothetical, even if plausible, is trumped by Deepseek publishing their inference cost structure. The margins they claim for the paid traffic are high by any standard, and they're usually one of the cheaper options at their quality level.
1. You just negated a technical statement with... I don't even know what. Engineering opportunities at volume and high skill allow changing the margin in ways low volume and low capitalization provider cannot. Talk to any GPU ML or DC eng and they will rattle off ways here. You can claim these opportunities aren't enough, but you don't seem to be willing to do so.
2. Again, even if tokens are unprofitable at scale (which I doubt), market position means owning a big chunk of the distribution channel for more profitable things. Classic loss leader. Being both the biggest UI + API is super valuable. E.g., now that code as a vertical makes sense, they bought more UI here, and now they can go from token pricing closer to value pricing and fancier schemes - imagine taking on GitHub/Azure/Vercel/... As each UI and API point takes off, they can devour the smaller players who were building on top to take over the verticals.
Separately, I do agree, yes, the API case risks becoming (and staying) a dumb pipe if they fail to act on it. But as much as telcos hate their situation, it's nice to be one.
Maybe if you could name one of those potential opportunities, it'd help ground the discussion in the way that you seem to want?
Like, let's say that additional volume means one can do more efficient batching within a given latency envelope. That's an obvious scale-based efficiency. But a fuller batch isn't actually valuable in itself: it's only valuable because it allows you to serve more queries.
But why? In the world you're positing where these queries are sold at negative margins and don't provide any other tangible benefit (i.e. cannot be used for training), the provider would be even better off not serving those queries. Or, more likely, they'd raise prices such that this traffic has positive margins, and they receive just enough for optimal batching.
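A toy model of the batching effect under discussion (all numbers invented; real serving stacks are far more complex):

    # Per-request cost falls as fixed per-batch overhead is amortized.
    fixed_cost_per_batch = 1.0        # e.g. weight loads, kernel launches (invented)
    marginal_cost_per_request = 0.1

    for batch_size in (1, 4, 16, 64):
        per_request = (fixed_cost_per_batch / batch_size) + marginal_cost_per_request
        print(f"batch={batch_size:>2}: cost/request = {per_request:.3f}")

Which is exactly the point: the fuller batch only pays off if the extra requests filling it are themselves worth serving.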
> You can claim these opportunities aren't enough, but you don't seem to be willing to do so.
Why would I claim that? I'm not saying that scaling is useless. I think it's incredibly valuable. But scale from these specific workloads is only valuable because these workloads are already profitable. If they weren't, the scarce compute would be better spent on one of the other compute sinks I listed.
(As an example, getting more volume to more efficiently utilize the demand troughs is pretty obviously why basically all the major providers have some sort of batch/off-peak pricing plans at very substantial discounts. But it's not something you'd see if their normal pricing had negative margins.)
> Engineering opportunities at volume and high skill allow changing the margin in ways low volume and low capitalization provider cannot.
My point is that not all volume is the same. Additional volume from users whose data cannot be used to improve the system and who are unprofitable doesn't actually provide any economies of scale.
> 2. Again, even if tokens are unprofitable at scale (which I doubt),
If you doubt they're unprofitable at scale, it seems you're saying that they're profitable at scale? In that case I'd think we're actually in violent agreement. Scaling in that situation will provide a lot of leverage.
I'm disputing this two-fold:
- Software tricks like batching and hardware like ASICs mean what is negative/neutral for a small or unoptimized provider is eventually positive for a large, optimized provider. You keep claiming they cannot do this with positive margin for some reason, or only if already profitable, but those are unsubstantiated claims. Conversely, I'm giving classic engineering principles for why they can keep driving down their COGS to flip to profitability as long as they have capital and scale. This isn't selling $1 for $0.90, because there is a long way to go before their COGS are primarily constrained by the price of electricity and sand. Instead of refuting this... you just keep positing that it's inherently negative margin.
In a world where inference consumption just keeps going up, they can keep pushing the technology advantage and creating even a slight positive margin goes far. This is the classic engineering variant of buttoning margins before an IPO: if they haven't yet, it's probably because they are intentionally prioritizing market share growth for engineering focus vs cost cutting.
- You are hyper-fixated on tokens, and not on the fact that owning a large % of distribution lets them sell other things. E.g., instead of responding to my point 2 here, you are again talking about token margin. Apple doesn't have to make money on transistors when they have a 30% tax on most app spend in the US.
I think DeepSeek instead just showed they haven't really bothered yet. They'd rather focus on growing, and capital is cheap enough for these firms that optimizing margins is relatively distracting. Obviously they do optimize, but probably not at the expense of velocity and growth.
And if they do seriously want to tackle margins, they should pull a Groq/Google and go aggressively deep. Ex: fab something. Which... they do indeed fundraise on.
Like, yes, if somebody has 100k H100s and are only able to find a use for 10k of them, they'd better find some scale fast; and if that scale comes from increasing inference workloads by 10x, there's going to be efficiencies to be found. But I don't think anyone has an abundance of compute. If you've instead got 100k H100s but demand for 300k, you need to be making tradeoffs. I think loss-making paid inference is fairly obviously the worst way to allocate the compute, so I don't think anyone is doing it at scale.
> I think deepseek instead just showed they haven't really bothered yet.
I think they've all cared about aggressively optimizing for inference costs, though to varying levels of success. Even if they're still in a phase where they literally do not care about the P&L, cheaper costs are highly likely to also mean higher throughput. Getting more throughput from the same amount of hardware is valuable for all their use cases, so I can't see how it couldn't be a priority, even if the improved margins are just a side effect.
(This does seem like an odd argument for you to make, given you've so far been arguing that of course these companies are selling at a loss to get more scale so that they can get better margins.)
> - You are hyper-fixated on tokens, and not on the fact that owning a large % of distribution lets them sell other things. E.g., instead of responding to my point 2 here, you are again talking about token margin. Apple doesn't have to make money on transistors when they have a 30% tax on most app spend in the US.
I did not engage with that argument because it seemed like a sidetrack from the topic at hand (which was very specifically the unit economics of inference). Expanding the scope will make convergence less likely, not more.
There's a very good reason all the labs are offering unmonetized consumer products despite losing a bundle on those products, but that reason has nothing at all to do with whether inference when it is being paid for is profitable or not. They're totally different products with different market dynamics. Yes, OpenAI owning the ChatGPT distribution channel is vastly valuable for them long-term, which is why they're prioritizing growth over monetization. That growth is going to be sticky in a way that APIs can't be.
Thanks, good discussion.
Re: stickiness => distribution leadership => monetization: I think they were something like 80/20 on UI vs API revenue, but as a leader their API revenue is still huge and still growing, especially as enterprises advance from POCs. They screwed up the API market for coding and some others (voice, video?), so AFAICT they're more like "one of several market share leaders" than "leading". So the question becomes: why are they able to maintain high numbers here? E.g., is momentum enough that they can stay tied for second, and if they keep lowering costs, stay there, and stay relevant for more vertical flows like coding? Does bundling the UI in enterprise deals mean they stay a preferred enterprise partner? Etc.

Oddly, I think they're at higher risk of losing the UI market than the API market, because an organizational DNA change is likely needed as it turns into a broad GSuite/Office scenario vs simple chat (see: Perplexity, Cursor, ...). They have the position, but it seems more straightforward for them to keep it in API than in UI.
Everything depends on this actually being possible and I haven’t seen a lot of information on that so far.
DeepSeek's publication suggests that it is possible - specifically there was recently a discussion on batching. Google might have some secret sauce with their TPUs (is that why Gemini is so fast?)
And there are still Cerebras and Groq (why haven’t they taken over the world yet?), but their improvements don’t appear to be scale dependent.
Speculating that inference will get cheaper in the future might justify selling at a loss now to at least gain mind share, I guess.
Anthropic themselves said that enterprise is where the money is, but you can't just serve enterprise from the get-go, right?
This is where the indirect B2C influence comes in.
Model providers spend a ton of money. It is unclear if they will ever have high margins. Today they are somewhere between zero and negative big numbers.
OpenAI, Anthropic's most direct competitor, is acquiring Windsurf.
This illustrates a risk of building a product with these AI coding tools. If your developers don't know how to build applications without using AI, then you're at the mercy of the AI companies. You might come to work one day and find that, accidentally, deliberately, or as the result of a merger or acquisition, the tools you use are suddenly gone.
The same can be said if your developers don't know how to build applications:
- without using syntax highlighting ...
- without using autocomplete ...
- without using refactoring tools ...
- without using a debugger ...
Why do we not care about those? Because these are commodity features. LLMs are also a commodity now. Any company with a few GPUs and bandwidth can deploy the free DeepSeek or QwQ models and start competing with Anthropic/OpenAI. It may or may not be as good as Claude 4, but it won't be a catastrophe either.
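As a sketch of how low that barrier is (this assumes vLLM and the stock OpenAI Python client; the model name and port are just examples):

    # Assumes a vLLM server is already running locally, e.g. started with:
    #   vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --port 8000
    from openai import OpenAI

    # vLLM exposes an OpenAI-compatible API, so the stock client works as-is.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
    resp = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
    )
    print(resp.choices[0].message.content)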
If I spend a ton of money making the most amazing ceramic dinner plates ever and sell them to distributors for $10 each, and one distributor strikes gold in a market selling them at $100/plate, despite adding no value beyond distribution… hell yeah I’m cutting them off and selling direct.
I don’t really understand how it’s possible to see that in moral terms, let alone with the low-value partner somehow a victim.
Yeah, because no other internet provider had SpaceX's reusable rocket technology.
It's not really quite the same, you know.
This wouldn't be remotely comparable. This is targeting of a competitor's employees, not targeting a competitor's subsidiaries.
If you want to go the Apple-Google route a better comparison would be that this is like Apple refusing to allow you to hook up an Air Tag on an Android phone. Which is something that they do, in fact, do.
It's the same as: we can cut access anytime. "I think it would be odd for us to be selling Claude to <YOUR_COMPANY>."
I certainly don’t trust them to not kill whole products on demand
Yeah, better to go with open technologies. Maybe use Groq for inference, knowing you can switch over later if needed since you're using Llama or DeepSeek.
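A sketch of what that switchability looks like in practice (this assumes providers exposing OpenAI-compatible endpoints, which Groq does; the model names and env vars are just examples):

    import os
    from openai import OpenAI

    # Open-weight models behind OpenAI-compatible APIs: switching providers
    # is a base_url/model change, not a rewrite.
    PROVIDERS = {
        "groq": ("https://api.groq.com/openai/v1", "llama-3.3-70b-versatile"),
        "self_hosted": ("http://localhost:8000/v1", "deepseek-ai/DeepSeek-V3"),
    }
    base_url, model = PROVIDERS[os.environ.get("LLM_PROVIDER", "groq")]

    client = OpenAI(base_url=base_url, api_key=os.environ["LLM_API_KEY"])
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)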
We can all imagine all sorts or terrible futures. Many of us do. But there is no upside in being prematurely outraged.
What this change really says is that Anthropic doesn't want to burn VC money to help a competitor. And that is the reality of "I just want stuff to work". It won't just work because there's no stable business underneath it. Until that business can be found things will keep changing and get shittier.
The Windsurf team repeatedly stated that they're running at a loss so all this seems to have achieved is giving OpenAI an excuse to cut their 3rd party costs and drive more Windsurf users towards their own models.
Haven't found anything else that even comes close.
How did you come to this conclusion? It’s very much like he remarked: OpenAI acquired Windsurf, and OpenAI is Anthropic’s direct competitor.
It doesn’t make strategic sense to sell Claude to OpenAI. OpenAI could train on Claude’s outputs, or OpenAI can cut out Anthropic at any moment to push their own models.
The partnership isn’t long lasting so it doesn’t make sense to continue it.
OpenAI can always buy a Claude API subscription with a credit card if they want to train something. This change only prevents the Windsurf product from offering Claude APIs to their customers.
I don’t see how this is irrelevant. Windsurf is a first-party product of their most direct competitor. Imagine a car company integrating the cloud tech of a different manufacturer.
Maybe GitHub and Microsoft should kick out all competing companies' 3rd-party integrations.
See where this leads...
You can choose independence if you’re willing to use a slightly worse open weight model.