I can see where the author is coming from, but “marketing image is dumb” is by far the least interesting critique possible for these tools.
Give me an argument for why SWE-bench is flawed, or an analysis of what areas didn’t improve in Claude 4.
Meanwhile I’m vibe coding with Claude and having a great time. Sure, I wouldn’t use it for anything high stakes, but vs. Claude 3.7 I’m seeing a much higher success rate on tens to hundreds of lines of code.
It’s really, really easy to get a better code sample than the one shown here.
The problem is that in a team setting, you can't control the type of "code sample" that your teammates send for review.
It's frequently the case that some teammates will request review for this kind of slop, while outsourcing all the "thinking" to the reviewer.
As a reviewer, I now have to review much more code (LLMs absolutely increase the amount of code per hour a developer can put out), much of it unfiltered LLM output, so it needs more careful and thorough review.
Essentially, a careless developer with access to an LLM can now perform a cheap DoS on reviewers.
Exactly. And not just a careless developer, but one intentionally doing so.
Recently [this project][1] made it to the front page and amassed quite a few upvotes, positive comments, and GitHub stars. Yet anyone giving it a cursory look could tell that it's AI-generated slop: completely unnecessary at best, and possibly malware at worst.
Not only that, but the author[2] fired off several AI-slop PRs to popular projects in a single day (May 12th), which is exactly the cheap DoS you mention. The author of cURL wrote last year about AI-generated bogus security reports[3] and the extra work they pile onto open source maintainers who are already stretched thin. So this is a real problem that most AI proponents are ignoring.
[1]: https://news.ycombinator.com/item?id=44009321
[2]: https://github.com/dipampaul17
[3]: https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-f...
I'm curious: do you review the AI-generated code? In your experience, what percentage of the time have you found it to be correct and shippable? I.e. not just that it compiles and does what you asked, but that it doesn't contain security or performance issues, that it doesn't include useless or dead code, that it's idiomatic, etc. Do you add tests for that code, written by yourself or the AI, and if so, how exhaustive are they?
I review a lot, revert a lot, and reevaluate my life choices at times; it’s like supervising a toddler with 20 years of experience. It definitely contains security issues at times, though less obvious ones: subtle and harder to detect (the issues move up the stack).
It’s almost never correct and shippable on the first go, but it still saves me and my team a lot of time on boilerplate and heavy lifting, and lets us focus on creating value for customers. It’s still software engineering; only the actual coding part is much more productive. (Coding has never been the main activity software engineers spend most of their time on. They design, gather requirements, define scope, architect, plan, test, review, gather feedback, iterate, etc. Now they simply have more time for those activities.)
Do this in C++ and I'm pretty sure you won't have any issue
Then just ask it to do that
I think this perfectly encompasses "vibe coding" [0], "Hey, it looks cool. Ship it. Don't worry about what's under the hood!".
[0] I'm using this term to specifically mean people using LLMs to write code while doing very little or no checking of the code it writes, judging only by what the website/app looks like.
nyrulez•16h ago
I think part of that comes from the difficulty of working with probabilistic tools that need plenty of prompting to get things right, especially for more complex tasks. To me, it's a training issue for programmers, not a fundamental flaw in the approach. These tools have different strengths, and it can take a few weeks of working closely with them to reach a level where it starts feeling natural. I personally can't imagine going back to the pre-LLM era of coding for me and my team.
add-sub-mul-div•15h ago
nyrulez•15h ago
I can give you a concrete example, since things sometimes get so philosophical. The other day I needed an LIS (longest increasing subsequence) implementation with some very specific constraints. It would honestly have taken me a few hours to get right, as it's been a while since I coded that kind of thing. I was able to generate the solution with o3 in around 10 minutes, with some back and forth. It wasn't one shot, but took me 2-3 iteration cycles. I got highly performant code that worked for my very specific constraints. It used Fenwick trees (https://en.wikipedia.org/wiki/Fenwick_tree), which I honestly hadn't programmed myself before. It felt like a science fiction moment, as the code certainly wasn't trivial. In fact, I'm pretty sure most senior programmers would fail at this task, let alone be fast at it.
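For context, here is a minimal sketch of the standard technique being described (not the commenter's actual code, and ignoring their extra constraints): LIS length in O(n log n), using a Fenwick tree as a prefix-max structure over coordinate-compressed values.

```python
# Sketch only (not the commenter's code): LIS length in O(n log n)
# via a Fenwick tree maintaining prefix maximums over compressed values.

def lis_length(nums: list[int]) -> int:
    if not nums:
        return 0
    # Coordinate-compress values to ranks 1..m so they index the tree.
    ranks = {v: i + 1 for i, v in enumerate(sorted(set(nums)))}
    m = len(ranks)
    tree = [0] * (m + 1)  # supports prefix-max query and point update

    def prefix_max(i: int) -> int:
        # Max LIS length over all elements with rank <= i.
        best = 0
        while i > 0:
            best = max(best, tree[i])
            i -= i & -i
        return best

    def update(i: int, length: int) -> None:
        # Record an increasing subsequence of `length` ending at rank i.
        while i <= m:
            tree[i] = max(tree[i], length)
            i += i & -i

    best = 0
    for x in nums:
        r = ranks[x]
        # Extend the best subsequence over strictly smaller values.
        length = prefix_max(r - 1) + 1
        update(r, length)
        best = max(best, length)
    return best

assert lis_length([3, 1, 4, 1, 5, 9, 2, 6]) == 4  # e.g. 1, 4, 5, 6
```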
As a professional programmer, I deal with 20 examples every day where using a quality LLM saved me significant time, sometimes hours per task. I still do manual surgery a bunch of times everyday but I see no need to write most functions anymore or do multi-file refactors myself. In a few weeks, you get very good at applying Cursor and all its various features intelligently, like an amazing pair programmer who has different strengths than you. I'll go so far as to say I wouldn't hire an engineer who isn't very adept at utilizing the latest LLMs. The difference is just so stark - it really is like science fiction.
Cursor is popular for a reason. Lot of incredible programmers still get incredible value out of it, it isn't just for vibe coding. Implying that Cursor can be a net negative to programmers based on an example is a lot of fear mongering.
codr7•14h ago
norir•14h ago
add-sub-mul-div•13h ago
It means you shouldn't run with weights on your shoes even if running with weights on shoes is a more efficient way for others to run.
LLM tech is popular because (1) people like taking shortcuts and (2) their bosses like the prospect of hiring fewer people.
tonyedgecombe•2h ago
>which I honestly hadn't programmed myself before.
How can you be sure it is correct if you haven't mastered the data structure yourself?
alehlopeh•15h ago
nyrulez•15h ago
alehlopeh•14h ago
That said, this article is very obviously not rhetoric. It seems almost dumb to have to argue this point; maybe we should ask an AI whether it is or not. I mean, I don’t know the author, nor do I have anything to gain from debating this, but you can’t just go calling everything “rhetoric” when it’s clearly not. Yes, there’s plenty of negative rhetoric about LLMs out there, but that doesn’t make everything critical of LLMs negative rhetoric. I’m very much pro-AI, btw.
nyrulez•14h ago
Anyways, it doesn't matter that much :) we could both be right.
sjdrc•13h ago
naikrovek•14h ago
Those of us who consider software development to be “typing until you more or less get the outcome that you want” love LLMs. Non-deterministic vibes all around.
This is also why executives love LLMs; executives speak words and little people do what was asked of them, generally, sometimes wrong, but are later corrected. An LLM takes instructions and does what was asked, generally, sometimes wrong, and is later corrected, but much faster than unreliable human plebs who get sick all the time and demand vacation and time to mourn the deaths of other plebs.
nyrulez•14h ago
If you choose to accept bad code, that's on you. But I am not seeing that in practice, especially if you learn how to give quality prompts with proper rules. You have to get good at prompts - there is no escaping that. Now programmers do suck at communicating sometimes and that might be an issue. But in my experience, it can write far higher quality code than most programmers if used correctly.
o11c•13h ago
selcuka•13h ago
Curious. Do you write deterministic code? Because I don't think I can write the same code for any non-trivial task twice. Granted, I would probably remember which algorithm or design pattern I used before, and I can try and use the same methods, but you can also prompt that information to an LLM.
Another question: Can you hire software developers who write code in a deterministic way? If you give the same task to multiple developers with the same seniority level, do you always get the same output?
> "typing until you more or less get the outcome that you want”
For the record, I don't use LLMs for anything beyond auto-completion, but I think you are being unfair to them. They are actually pretty good at getting atomic tasks right when prompted properly.
thegrim33•14h ago
saithound•13h ago
The other hypotheses in this thread (e.g. that it's largely a matter of programming language) seem much more plausible.
ost-ing•12h ago
But there is a difference between using LLMs and relying on LLMs. The hype is geared toward the idea that we can rely on these tools to do all the work for us and fire everyone, but it's bollocks.
It becomes an increasingly ridiculous proposition as the work becomes more specialized, in-depth, cross-functional, regulated, and critical.
You can use it to help at any level of complexity, but nobody is going to vibe code a flight control system.
saithound•10h ago
tensorturtle•9h ago
codr7•14h ago
Did you read the post? Have you read any of them?
Everything people claim about them as far as writing code goes is delusional; this is clearly the wrong tool.
moozilla•13h ago
I struggled to find benchmark data to support this hunch; the best I could find was [1], which shows 81% performance with Python/TypeScript vs. 62% with Rust, but that fits my intuition. I primarily code in Python for work, and despite trying, I didn't get much use out of LLMs until the Claude 3.6 release, when it suddenly crossed that invisible threshold and became dramatically more useful. I suspect that for devs who are not using Python or JS, LLMs just haven't crossed this threshold yet.
[1] https://terhech.de/posts/2025-01-31-llms-vs-programming-lang...
imiric•7h ago
LLMs will routinely generate code that uses non-existent APIs and has subtle and not-so-subtle bugs. They will make useless suggestions, often leading me down the wrong path or in circles. The worst part is that they do so confidently and reassuringly: if I give any hint of what I think the issue might be, after spending time reviewing their non-working code, the answer is almost certainly "You're right! Here's the fix...", and either it turns out I was wrong and that wasn't the issue, or their fix creates new issues. It's a huge waste of my time, which would be better spent reading documentation and writing the code myself.
I suspect that vibe coding is popular with developers who don't bother reviewing the generated code, whether out of inexperience or laziness. They prompt their way into building something that on the surface does what they want, but will fail spectacularly in any scenario they didn't consider. To say nothing of the security and other issues that an actual code review by an experienced human programmer would flag.