Does it make mistakes? Yes, sometimes, but you can verify the output with tests or with Lean.
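Concretely, the "verify with tests" step can be as lightweight as a randomized check. This is a hypothetical sketch: `llm_sort` stands in for whatever function the model produced, and the test cross-checks it against a known-good reference.

```python
import random

def llm_sort(xs):
    """Stand-in for a model-generated helper we want to verify."""
    return sorted(xs)

def test_llm_sort():
    random.seed(42)
    for _ in range(100):
        xs = [random.randint(-1000, 1000) for _ in range(random.randint(0, 50))]
        out = llm_sort(xs)
        assert out == sorted(xs)      # correct ordering vs. trusted reference
        assert len(out) == len(xs)    # no elements dropped or invented

test_llm_sort()
```

A hundred random cases won't prove correctness the way Lean would, but it catches the common failure mode of code that only works on the happy path.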
Their efficacy opens up a lot more possibilities, but given that they're not AGI (without getting into a definitions debate), a lot of the magic is gone. Nothing fundamentally changed. I still use them a lot and they're great, but it's not a new paradigm (which is what I would call magic).
I think the key point here, too, is that LLMs demo like magic. You see the happy path and you think we have AGI. Show the me of 10 years ago the happy path and I'd be floored, until I talked to the me of now and got the whole story.
The public discourse about LLM-assisted coding is often driven by front-end developers, or rather non-professionals trying to build web apps, but the value it brings to prototyping system concepts across hardware/software domains can hardly be overstated.
Instead of trying to find suitable simulation environments and couple them, I can simply whip up a GUI-based tool to play around with whatever signal chain/optimization problem/control loop I want to investigate. Usually I would have to find/hire people to do this, but with LLMs I can iterate on ideas at a crazy cadence.
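A minimal sketch of that kind of throwaway exploration, assuming a hypothetical signal chain with a first-order low-pass filter on a noisy step input (no GUI here, just printed comparisons of a tuning parameter):

```python
import random

def lowpass(signal, alpha):
    """One-pole IIR low-pass: y[n] = alpha*x[n] + (1 - alpha)*y[n-1]."""
    y, out = 0.0, []
    for x in signal:
        y = alpha * x + (1 - alpha) * y
        out.append(y)
    return out

random.seed(0)
# A unit step with additive Gaussian noise, standing in for a measured signal.
step = [0.0] * 20 + [1.0] * 80
noisy = [s + random.gauss(0, 0.1) for s in step]

# Sweep the smoothing factor to see the responsiveness/noise trade-off.
for alpha in (0.05, 0.2, 0.8):
    y = lowpass(noisy, alpha)
    print(f"alpha={alpha}: final value ~ {y[-1]:.3f}")
```

The point isn't the filter itself, it's that a disposable script like this takes minutes to generate and answers "how does this parameter feel?" before any proper engineering starts.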
Later, implementation does of course require proper engineering.
That said, it is often confusing how differently models are hyped. As mentioned, there is an overt focus on front-end design etc. For the work I am doing, I found Claude 4.5 (both models) to be absolutely unchallenged. Gemini 3 Pro is also getting there, but its long-term agentic capability still needs to catch up. GPT 5.1/Codex is excellent for brainstorming in the UI, but I found it too unresponsive and opaque as a code assistant. It does not even matter if it can solve bugs other LLMs cannot find, because you should not put yourself in a situation where you don't understand the system you are building.
This is magical to me.
I love Cursor; I use it to deploy Docker packages and fix npm issues etc. too :p
I use some guardrails, like SonarQube as a static code analyzer and of course some default linters. Checks and balances.
codyswann•2mo ago
I try to file the former under the banner of "Prompt-to-app Tools" and the latter under "Autonomous AI Engineering".
VerifiedReports•2mo ago
The ascendancy of non-descriptive jargon for everything is irritating as hell. If a term is supposed to mean "AI-generated code", then it needs to contain at least the important word from that description. Sad that this has to be explained now.