Is this exclusively referring to the ux or full functionality?
Because I can tell you straight away that Cursor (Claude) vs Copilot is not a 1% difference. Most people in my company pay for their own Cursor license even though Copilot is available to us for free.
I think copilot could get there TBH. I love most Microsoft dev tools and IDEs. But it really isn’t there yet in my opinion.
I was referring to UX, as that is the main product. Cursor isn't providing their own models, or at least most people that I'm aware of are bringing their own keys.
I haven't used copilot extensively but my understanding is that they now have feature parity at the IDE level, but the underlying models aren't as good.
For use cases demanding the most intelligent model, yes they aren't.
However, there are cases where you just can't use the best models due to latency. For example, next-edit prediction, and applying diffs [0] generated by the super intelligent model you decided to use. AFAIK, Cursor does use their own model for these, which is why you can't use Cursor without paying them $20/mo even if you bring your own Anthropic API key. Applying what Claude generated in Copilot is so painfully slow that I just don't want to use it.
If you tried Cursor early on, I recommend you update your prior now. Cursor was redesigned about a year ago, and it is a completely different product compared to what they first released two years ago.
[0] We may not need a model to apply diffs soon; as the Aider leaderboard shows, recent models have started generating perfect diffs that actually apply.
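To make the "apply model" problem concrete: applying a model-generated edit amounts to locating a search/replace block in the file, with a fuzzy fallback for when the model's SEARCH text doesn't reproduce the file byte-for-byte. A minimal sketch (my own illustration of the Aider-style format, not Cursor's or Aider's actual code):

```python
def apply_edit(source: str, search: str, replace: str) -> str:
    """Apply a search/replace edit block to source text.

    Falls back to a whitespace-insensitive line match when the SEARCH
    block doesn't match exactly -- the common failure mode that a
    dedicated "apply model" papers over.
    """
    # Fast path: the model reproduced the file exactly.
    if search in source:
        return source.replace(search, replace, 1)

    # Fuzzy fallback: compare lines with surrounding whitespace stripped.
    src_lines = source.splitlines()
    sea_lines = [l.strip() for l in search.splitlines()]
    for i in range(len(src_lines) - len(sea_lines) + 1):
        window = [l.strip() for l in src_lines[i : i + len(sea_lines)]]
        if window == sea_lines:
            new_lines = (
                src_lines[:i]
                + replace.splitlines()
                + src_lines[i + len(sea_lines):]
            )
            return "\n".join(new_lines)
    raise ValueError("SEARCH block not found; a real apply model would retry")
```

A small fast model (or, per the footnote, the main model itself once its diffs are reliable) is essentially doing a smarter version of that fallback.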
I do sometimes use Composer (or Agent in recent versions), but it's becoming increasingly less useful in my case. Not sure why :(
My experience is that Copilot is basically a better autocomplete, but anything beyond a three-liner will deviate from the current context, making the answer useless: not following the codebase's conventions, using packages that aren't present, not seeing the big picture, and so on.
In contrast, cursor is eerily aware of its surroundings, being able to point out that your choice of naming conflicts with somewhere else, that your test is failing because of a weird config in a completely different place leaking to your suite, and so on.
I use cursor without bringing my own keys, so it defaults to claude-3.5-sonnet. I always use it in composer mode. Though I can't tell you with full certainty the reasons for its better performance, I strongly suspect it's related to how it searches the codebase for context to provide the model with.
It gets to the point that I'm frequently starting tasks by dropping a Jira description with some extra info to it directly and watching it work. It won't do the job by itself in one shot, but it will surface entry points, issues and small details in such a way that it's more useful to start there than from a blank slate, which is already a big plus.
It can also be used as a rubber duck colleague asking it whether a design is good, potential for refactorings, bottlenecks, boy scouting and so on.
I use Cursor for personal work though, and it's night and day, even with the recent Copilot agent mode additions. My CTO asked whether we should look into Cursor, and I told him straight up that in comparison Copilot is basically useless.
This is true, but as a user of both and champion of Cursor - VS Code Copilot is quickly catching up.
OpenAI is predominantly a consumer AI company. Anthropic has also won over developer hearts and minds since Claude 3.5. Developers are also, proportionally, the largest users of AI in an enterprise setting. OpenAI does not want to be pigeonholed into being the "ChatGPT company". And money spent now is a lot cheaper than money spent later.
But this is all just speculation anyways.
Given how heavily Apple has leaned into E2E over the years, I don't see this happening at all, beyond local on-device stuff.
It’s a shame we can’t have anything nice without it getting consumed, but such is the world.
This seems like a bold statement.
Like even in a cold capitalist analysis, the benefits to developer velocity, ease of new feature development, incident response, stability, customer trust, etc. seem clear.
It doesn’t always; there are certainly areas of tech debt that bother me personally but I know aren’t worth the ROI to clean up. These become weekend projects if I want a fun win in my life, but nothing terrible happens if there’s a little friction.
Sometimes it’s because there are regular bugs and on-call becomes a drag on velocity.
Sometimes making code changes is difficult and there’s only one person who knows what’s going on, so you either have a bus factor risk or it limits flexibility on assigning projects / code review.
Sometimes the system’s performance is causing incidents, or will start to in the short-to-medium term.
Sometimes incident recovery takes a long time. We had a pipeline that would take six–ten hours to run and couldn’t be restarted midway if it failed. Recovering from downtime was crazy!
Sometimes there’s a host of features whose development timelines would be sped up by more than it would take to burn down the tech debt to unlock them.
Sometimes a refactor would improve system performance enough to meaningfully affect the customer or reduce infra costs.
And then…
Sometimes you have career-driven managers and engineers who don’t want to or can’t make difficult long-term trade-offs, which is sometimes the way it is and you should consider switching teams or companies.
So I guess my question to you is: why should you burn this down?
Most of the criticisms of vibe coding are coming from SWEs who work on large, complicated codebases whose products are used by lots of people and for whom security, edge cases, maintainability, etc. are extremely important considerations. In this context, vibe coding is obviously a horrible idea, and we are pretty far away from AI being able to do more than slightly assist around the edges.
But small, bespoke scripts that will be used by exactly 1 person and whose outputs are easily verified are actually _hugely_ important. Millions of things are probably done every single day where, if the person doing it had the skill to write up a small script, it would be massively sped up. But most people don't have that skill, and it's too expensive/there is too much friction to hire an actual programmer to solve it. AI can do these things.
Each specific instance isn't a big deal, and won't make much productivity difference, but in aggregate, the potential gains are massive, and AI is already far more than good enough to completely create these kinds of scripts. It is just going to take people a while to shift their perspective and start asking what small tasks they do every day could be scripted.
This is the true potential of "vibe coding". Someone who can't program, but knows what they need (and how to verify that it works), making something for their personal use.
> Low-code and no-code development platforms allow professional as well as citizen developers to quickly and efficiently create applications in a visual software development environment. The fact that little to no coding experience is required to build applications that may be used to resolve business issues underlines the value of the technology for organizations worldwide. Unsurprisingly, the global low-code platform market is forecast to amount to approximately 65 billion U.S. dollars by 2027. [0]
We could argue about the exact no-code TAM, but if you have a decent chance to create the market leader for the no-code replacement, $3B seems fair, doesn't it?
[0] https://www.statista.com/topics/8461/low-code-and-no-code-pl...
LLM-enabled Zapier or Make or n8n is the future, not everyone churning out Claude-written NextJS app after NextJS app.
There are many use cases for low-code. The two major ones I've dealt with are MVPs where tools like Bubble are used, and the other is creating corporate internal tools, where MS Power Platform is common.
Corporate IT departments are allergic to custom web apps, and have a much easier time getting a Power Platform project approved due to its easily understood security implications. That low-code use case is certainly going to be the last thing a tool like Windsurf conquers.
However, even without that use case, in an AI-heavy investment environment, $3B doesn't seem all that bad to me. However, I have zero experience with M&A.
Vibe coding is coding like a customer hiring a programmer is coding.
If all the code is written by AI it isn’t coding at all, it’s ordering.
Who is the contributor then? The AI or the prompt writer?
I mean I'd be more at ease if they would just contribute their prompt instead. And then, what value does that actually have? So many mixed feelings here.
At work I had a React dev merging Java code into a rather complex project. It was clearly heavily prompt assisted, and looked like the code the junior Java developer would have written. The difference is that the junior Java developer probably would have sweated a couple of days over that code, so she would know it inside out and could maintain it. The React dev would just write more prompts or ask the AI to do it.
If we're confident that prompting creates good code and solid projects, well then we don't need expensive developers anymore do we?
Are they easily verified though?
I have a bunch of people who are "vibe coding" in non-dev departments. It's amazing that it allows them to do things they otherwise couldn't, but I don't think it's accurate to say it's easily verified, unless we're talking about the most trivial tasks ("count the words in this text").
As soon as it gets a bit more complex (but far from "complex"), it's no longer verifiable for them except "the output looks kinda like what I expected". Might still be useful for things, but how much weight do you want to put on your sales-analysis if you've verified its accuracy by "looks intuitively correct"?
I would argue that the real money, and the gap right now, is in vibe tasking, not vibe coding.
There are millions of knowledge workers for whom the ability to synthesize and manipulate office artifacts (excel sheets, salesforce objects, emails, tableau reports, etc) is critical. There are also lots of employees who recognise that a lot of these tasks are "bullshit jobs", and a lot of employers that would like nothing more than to automate them away. Companies like Appian try to convince CEOs that digital process automation can solve this problem, but the difficult reality is that these tasks also require a bit of flexible thinking ("what do I put in my report if the TPS data from Gary doesn't show up in time?"). This is a far bigger and more lucrative market than the one made of people who need quick and dirty apps or scripts.
It's also one that has had several attempts over the years to solve it. Somewhere between "keyboard automation" (macro recording, AutoHotKey type stuff) and "citizen programming" (VB type tools, power automate) and "application oriented LLM" (copilot for excel, etc) there is a killer product and a vast market waiting to escape.
Amusingly, in my own experience, the major corps in the IT domain (msft, salesforce, etc etc) all seem determined to silo the experience, so that the conversational LLM interface only works inside their universe. Which perhaps is the reason why vibe tasking hasn't succeeded yet. Perhaps MCP or an MCP marketplace will force a degree of openness, but it's too early to say.
Almost everything I've seen achieved with vibe coding so far has been long since achievable with low / no code platforms. There is a great deal of value in the freedom that vibe coding gives (and for that reason, I am in favor of it) but the missing piece of this criticism of the criticism is that vibe coding is not the only way to write these simple scripts and it is the least reliable way.
Vibe coding as the future is an uninspired vision of the future. The future is less code, not more.
I think LLMs have a much better chance at this kind of software than Emacs or BASIC, but I also doubt it has any future: once AI is capable enough, you can just hide the programmatic layer entirely and tell the computer what to do.
Could you please elaborate? Is this how management (at least in your company) looks at code—as a ratio of how fast it's done over how many tests it passes?
"Vibe" coding is here to stay and it's only devs who don't know how to adapt that are wishfully hoping for otherwise.
I doubt that error free code is outnumbering code with errors in the training data.
Color me unimpressed - it converted some test files. It didn't design any architecture, create any databases, handle any security concerns or any of the other things programmers have to do/worry about on a daily basis. It basically did source to source translation, which has been around for 30+ years.
And we don't know the quality of the end code; it's possible that the tech debt created by the migration costs well more than 1.5 eng-years.
We’ll see if Gemini 2.5 Flash is good enough, but it definitely doesn’t feel like Google is selling for a huge loss post-training.
Yes the training is a huge investment but are they really not going to do it? Doesn’t seem optional
For some projects (e.g. your internal-facing CRUD app), cheap code is acceptable. For a high scale consumer product, the cost of premium engineering resources is a rounding error on your profits, and even small marginal improvements can generate high value in absolute dollar terms.
I’m sure vibe coding will eat the lowest end of software development. It will also allow the creation of software that wouldn’t have been economically viable before. But I don’t see it notably denting the high end without something close to AGI.
Also what tech debt? If you have good engineers doing the vibe coding they are just way faster. And also faster at squashing bugs.
I was one-shotting whole features into our Rust code base with 2.5 last week. Absolutely perfect code, better than I could have written it in places.
Then later that week o3 solved a hard bug 2 different MLEs failed to solve as well as myself.
I have no idea why people think this stuff is bad, it’s utterly baffling to me
Oh, please. Even if every cent of VC funding dries up tomorrow we'd still have years of discovering how to use LLMs and "generative models" in general to do cool, useful stuff. And by "we" I mean everyone, at every level. The proverbial bearded dude in his mom's basement, the young college grad, phd researcher, big tech researcher, and everyone in the middle. The cat is out of the bag, and this tech is here to stay.
The various AI winters came because of many reasons, none that are present today. Todays tech is cool! It's also immediately useful (oAI, anthropic, goog are already selling billions of $ worth of tokens!). And it's highly transformative. The amount of innovation in the past 2 years is bonkers. And, for the first time, it's also accessible to "home users". Alpaca was to llama what the home computer was to computers. It showed that anyone can take any of the open models and train them on their downstream tasks for cheap. And guess what, everyone is doing it. From horny teens to business analysts, they're all using this, today.
Also, as opposed to the last time (which also coincided with the .com bubble), this time the tech is supported and mainly financed by the top tech firms. VCs are not alone in this one. Between MS, goog, AMZ, Meta and even AAPL, they're all pouring billions into this. They'll want to earn their money back, so like it or not, this thing is here to stay. (hell, even IBM! is doing gen ai =)) )
So no, AI winter is not coming.
I don't see how peak vibe coding in a few months follows that. Check revenue and growth figures for products like Lovable ($10m+ ARR) or Bolt.new ($30m+ ARR). This doesn't show costs (they might in fact be deep in the red), but with a story like that I don't see it crashing in 3-4 months.
On the user experience/expectation side, I can see how the overhyped claims of "build complete apps" hit a peak, but that will still leave the tools positioned strong for "quick prototyping and experimentation". IMHO, that alone is enough to prevent a cliff drop.
Even allowing for the peak in tool usage for coding specifically, I don't see how that causes "AI winter", since LLMs are now used in a wide variety of cases and that use is strongly growing (and uncorrelated to the whole "AI coding" market).
Finally, "costs will go up for all sorts of reasons" claim is dubious, since the costs per token are dropping even while the models are getting better (for a quick example, cost of GPT-4.1 is roughly 50% of GPT-4o while being an improvement).
For these reasons, if I could bet against your prediction, I'd immediately take that bet.
There's a rich irony to be saying this right after explaining how Google is dominating the market and how they're involved in an antitrust lawsuit for alleged illegal monopolistic practices.
And of course this willfully ignores the phase of capitalism we are in with the AI market right now. We all know how the story will end. Over time, AI companies will inevitably merge and the products will eventually enshittify. As companies like OpenAI look to exit they will go public or be acquired and need to greatly trim the fat in order to become profitable long-term.
We'll start seeing AI products incorporate things like advertising, raise their prices, and every other negative end state we've seen with every other new technology landscape. E.g., when I get a ride from Uber they literally display ads to me while I'm waiting for my vehicle. They didn't do that when they were okay with losing money.
And of course, "free market" capitalism isn't really free market at all in an enviornment where there are random tariffs being applied and removed on a whim to random countries.
I really don't understand why people feel like they need to defend capitalism like this. Capitalism doesn’t need a defender, if anything it constantly needs people restraining it.
The author frames Apple's choice as an own goal, but I'd rather see it as putting the failings of capitalism on display.
And yes Codeium/Windsurf focuses on enterprise customers more. As GP said they have an on-prem [0], a hybrid SaaS offering and enterprise features that just make sense (e.g. pooled credits). Their support team is more responsive (compared to Anysphere). Windsurf also "feels" more finished than Cursor.
[0] but ultimately, if you want to "vibe code" you have to call the Claude API
Codeium can be fine tuned. Though it’s trained on similar open source it does provide assurances that they do not inadvertently train on wrongly licensed software code.
https://windsurf.com/blog/copilot-trains-on-gpl-codeium-does...
The questions raised by the article (as I saw it) were price and timing. $3B is a lot. Is that overpaying for something with a known value but limited reach? Not to mention competitors with deep pockets. And the other question is: why now? What was to be gained by OpenAI by buying Windsurf now?
1) I agree that the moat for these companies is thin. AFAICT, auto-complete, as opposed to agentic flows, is Cursor's primary feature that attracts users. This is probably harder than the author gives it credit for; figuring out what context to provide the model is a non-obvious problem - how do you tradeoff latency and model quality? Nonetheless, it's been implemented enough times that it's mostly just down to how good is the underlying model.
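To illustrate that latency/quality tradeoff (purely my own sketch, not how Cursor actually works): context selection is essentially ranking candidate snippets and greedily packing them into a token budget, where the scoring function is the knob. Cheap heuristics (recently edited files, same-directory neighbors) are fast; an embedding model scores better but costs milliseconds you can't spare on every keystroke:

```python
def select_context(snippets: list[tuple[str, float]], budget: int) -> list[str]:
    """Greedily pack the highest-scoring snippets into a token budget.

    `snippets` is (text, relevance_score). Where the score comes from --
    recency heuristics vs a slow embedding model -- is the latency/quality
    tradeoff the comment above alludes to.
    """
    chosen, used = [], 0
    for text, _score in sorted(snippets, key=lambda s: s[1], reverse=True):
        cost = len(text.split())  # crude stand-in for a real tokenizer
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen
```

Even this naive version shows why it's "mostly down to the underlying model": the packing is easy, the scoring is where quality lives.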
2) Speaking of models, I'm not sure it's been independently benchmarked yet, but GPT 4.1 on the surface looks like a reasonable contestant to back auto-complete functionality. Varun from Windsurf was even on the GPT 4.1 announcement livestream a few days ago, so it's clear Windsurf does intend to use them.
3) This is probably a stock deal, not a cash deal. Not sure why the author is so convinced this has to be $3B in cash paid for Windsurf. AFAIK that hasn't been reported anywhere.
4) If agentic flows do take off, data becomes a more meaningful moat. Having a platform like Cursor or Windsurf enables these companies to collect telemetry about _how_ users are coding that isn't possible just from looking at the repo, the finished product. It opens up interesting opportunities for RLHF and other methods to refine agentic flows. That could be part of the appeal here.
I didn't think about telemetry for RL, that's very interesting
Most recently, gemini 2.5 pro makes the agentic workflow usable, and how!
I haven’t heard about this before this post, but if they’re starting a “Social Media but with AI” site in 2025, can’t help but feel like they’re cooked.
AI will lead to far bigger work being accomplished than one prompt or chat at a time. Bigger workflows built on humans guiding and interacting with AI will be a big, critical category for that.
Otherwise, why spend 3 billion if you could have it cooked up by an AI coding agent for (almost) free?
Or call it plausible deniability. They will always deny these reports.
At the end of the day, Windsurf has a private price tag which they know they will sell at.
If they were smart, they should consider selling the hype.
Beyond that, these IDEs have a potential path to “vibe coding for everyone” and could possibly represent the next generation of general office tooling. Might as well start with a dedicated product team vs spinning up a new one.
And so if you’re purchasing with equity in whole or in part, the critical question is, do you believe this product could be worth more than $3b in the future? That’s not at all a stretch.
Cursor is awfully cozy with Anthropic, as well, and so if I’m OpenAI, I don’t mind having a competitive product inserted into this space. This space, by the way, that is at the forefront of demonstrating real value creation atop your platform.
OAI spends gobs of money on Mercor, and Windsurf telemetry gets them similar data. My guess is they saw their Mercor spend hitting close to $1B a year in the next 5 years if they did nothing to curb it.
Before approaching Windsurf, OpenAI wanted to buy Cursor first (which is what I predicted too [0]), then the talks failed twice! [1]
The fact they approached Cursor more than once tells you they REALLY wanted to buy out Cursor. But Cursor wanted more and was raising at over $10B.
Instead OpenAI went to Windsurf. The team at Windsurf should think carefully and they should sell because of the extreme competition, overvaluation and the current AI hype cycle.
Both Windsurf and Cursor’s revenue can evaporate very quickly. Don’t get greedy like Cursor.
[0] https://news.ycombinator.com/item?id=43708867
[1] https://techcrunch.com/2025/04/17/openai-pursued-cursor-make...
The biggest scumbag of them all, but hey "I use it for free."
I haven’t found as good a turnkey chat/search/gen interface as CGPT yet, unfortunately.
Even self-hosted DeepSeek on an Ada machine doesn’t get there because the open source interfaces are still bad.