In my case, I ended up accruing $100/day with Claude Code (on GitHub workflows), so Max x20 was an easy decision.
Pro seems targeted at a very different use case. Personally, I’ve never used the chat enough to break even. But someone who uses it several times per day might.
ETA: I get that the benefits transfer between the two, just with different limits. I still think it’s pretty clear which kind of usage each plan is intended for.
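Rough break-even math, just to show why it was an easy call (the plan price and working-days figures below are my assumptions, not exact):

```python
# Back-of-the-envelope: API spend vs. a flat subscription.
# $100/day is as reported above; plan price ($200/mo) and 22 working days/mo are assumptions.
api_cost_per_day = 100
working_days_per_month = 22
plan_price_per_month = 200

monthly_api_cost = api_cost_per_day * working_days_per_month  # $2,200
print(monthly_api_cost / plan_price_per_month)                # ~11x the plan price
```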
I wish for a vetting tool: have an LLM examine the code, then write a spec of what it reads and writes, and you can examine that before running it. If something in the list is suspect, you'll know before you're hosed, not after :)
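A minimal sketch of that idea, assuming the `anthropic` Python client; the prompt wording and model name are placeholders for illustration, not a real tool:

```python
# Hypothetical vetting pass: ask an LLM to enumerate what a script reads/writes
# so you can eyeball the list before running it.
# Prompt wording and model name are assumptions, not a real product.
import sys
import anthropic

code = open(sys.argv[1]).read()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "List every file path, network endpoint, and environment variable "
            "this code reads or writes, then flag anything that looks suspicious:\n\n"
            + code
        ),
    }],
)
print(resp.content[0].text)
```

You'd run it as, say, `python vet.py suspicious_script.py` and read the spec before executing anything.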
LLMs are NOT THOROUGH. Not even remotely. I don't understand how anyone can use LLMs and not see this instantly. I have yet to see an LLM achieve a failure rate better than around 50% in the real world, with real-world expectations.
Especially with code review, LLMs catch some things, miss a lot of things, and get a lot of things completely and utterly wrong. It takes someone wholly incompetent at code review to look at an LLM review and go "perfect!".
Edit: Feel free to write a comment if you disagree
If you build an intuition for what kinds of asks an LLM (agent, really) can do well, you can choose to only give it those tasks, and that's where the huge speedups come from.
Don't know what to do about prompt injection, really. But "untrusted code" in the broader sense has always been a risk. If I download and use a library, the author already has free rein over my computer - they don't even need to think about messing with my LLM assistant.
Because people in general are not thorough. I've been playing around with Claude Code and, before that, Cursor. Both are great tools when targeted correctly. But I've also tried "vibe" coding with them, and it's obvious where people get fooled: it will build a really nice-looking shell of a product that appears to be working, but once you step past the surface layer, issues start to show. Most people don't look past the surface layer, and instead keep digging in, having the agent build on the crappy foundation, until some time later it all falls apart. (And since a lot of these people aren't developers, they have also never heard of source control.)
If the first LLM wasn’t enough, the second won’t be either. You’re in the wrong layer.
Not a professional developer (though Guillermo certainly is) so take this with a huge grain of salt, but I like the idea of an AI "trained" on security vulnerabilities as a second, third and fourth set of eyes!
While I suspect that's going to work well enough on synthetic examples for naive and uninformed people to get tricked into trusting it... at the very least, current LLMs are unable to provide enough stability for this to be useful.
It might become viable with future models, but there is little value in discussing this approach currently. At least not until someone has actually built a PoC that at least somewhat works as designed, without a 50-100% false-positive rate.
You can have some false positives, but the rate has to be low enough for people to still listen to it, which currently isn't the case.
If you don't know what you're doing, you are going to make more security mistakes. Throwing LLMs into it doesn't increase your "know what you're doing" meter.
Most of these tools are not that exciting. These are similar-looking TUIs around third-party models/LLM calls.
What is the difference between this, and https://opencode.ai? Or any of the half a dozen tools that appeared on HN in the past few weeks?
Could work, I think (be wary of sending .env to the web, though).
Is it just better prompting? Better tooling?
For pretty much every other tool I've used, you walk away with the overwhelming feeling that whoever made it has never actually worked on a software engineering team at a company.
I realize this isn't an answer with satisfactory, evidence-based language, but I do believe there's a core `product-focus` difference between Claude and other tools.
LLMs may finally give us a much more efficient way to deal with devs doing lock-in via ultra-complex-syntax languages.
I already have some ideas for target C++ code to port to C99+.
It improves over existing tools.
Either the claim is that serving AI as a business is impossible to run at a profit, which I've already shown is not the case: if it's just serving the model, then yes, it works, and there are tons of businesses doing exactly that at a profit.
Or the claim is that the expense of even running a GPU to serve a model isn't worth the value the model running on that GPU can produce, which is demonstrably not true, given that people are paying anywhere from dozens to hundreds of dollars a month, so there is an eventual payback period for both the hardware and the electricity.
https://www.reuters.com/technology/chinas-deepseek-claims-th...
These companies are unprofitable because of balance sheet shenanigans. See “Hollywood Accounting”.
There is absolutely no way they are not turning a massive profit. They are serving models relatively similar to open-source ones at 5-50x the price.
GLM 2.5 is $0.60 in, $2.20 out (per million tokens), and it’s basically equivalent to Claude Opus.
Opus is $15 in and $75 out.
No way they’re operating at a massive loss.
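Quick sanity check on those quoted per-million-token prices (figures as stated above):

```python
# Ratio of the per-million-token prices quoted above.
glm_in, glm_out = 0.60, 2.20      # GLM, USD per 1M tokens (as quoted)
opus_in, opus_out = 15.0, 75.0    # Claude Opus, USD per 1M tokens
print(opus_in / glm_in)           # 25.0x on input
print(opus_out / glm_out)         # ~34.1x on output
```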
> after transistors were invented
But we don't have "transistors" yet, what's your point exactly?
1: https://aider.chat/
Of course, some might prefer the pure CLI experience, but I'm mentioning it because it also supports a lot of providers.
Compare with https://github.com/sst/opencode/pulse/monthly