Edit: It says on the Jetbrains website:
“The AI Assistant plugin is not bundled and is not enabled in IntelliJ IDEA by default. AI Assistant will not be active and will not have access to your code unless you install the plugin, acquire a JetBrains AI Service license and give your explicit consent to JetBrains AI Terms of Service and JetBrains AI Acceptable Use Policy while installing the plugin.”
Which, of course, is to donate money to Sama so he can create AGI and be less lonely with his robotic girlfriend, I mean...change the world for the better somehow. /s
Then you can think about automated labs. If things pan out, we can have the same thing in chemistry/bio/physics. Automated labs definitely seem closer now than 2.5 years ago. Is cost relevant when you can have a lab testing formulas 24/7/365? Is cost a blocker when you can have a cure for cancer_type_a? And then _b_c...etc?
Also, remember that costs go down within a few generations. There's no reason to think this will stop.
On one of the systems I'm developing, I'm using LLMs to compile user intents to a DSL, without ever looking at the real data to be examined. There are ways; increased context length is bad for speed, cost, and scalability.
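A minimal sketch of what I mean, with a hypothetical schema, a toy JSON filter DSL, and a stubbed-out LLM call (the names here are mine, not from any real system): the model only ever sees the schema and the user's request, and the compiled query runs locally against the private rows.

```python
import json

SCHEMA = {"age": "int", "country": "str"}  # hypothetical schema

def compile_intent(user_intent: str, llm_complete) -> dict:
    """Ask the LLM to emit a DSL query from schema + intent only.
    The real data never enters the prompt."""
    prompt = (
        "Schema: " + json.dumps(SCHEMA) + "\n"
        'Translate this request into JSON {"field", "op", "value"}:\n'
        + user_intent
    )
    return json.loads(llm_complete(prompt))

def run_query(query: dict, rows: list[dict]) -> list[dict]:
    """Execute the compiled DSL locally, against the private data."""
    ops = {
        "eq": lambda a, b: a == b,
        "gt": lambda a, b: a > b,
        "lt": lambda a, b: a < b,
    }
    return [r for r in rows if ops[query["op"]](r[query["field"]], query["value"])]

# Stubbed LLM response, standing in for a real completion call:
fake_llm = lambda _prompt: '{"field": "age", "op": "gt", "value": 30}'
rows = [{"age": 25, "country": "DE"}, {"age": 40, "country": "US"}]
print(run_query(compile_intent("users older than 30", fake_llm), rows))
# → [{'age': 40, 'country': 'US'}]
```

The point is the separation: the prompt stays small and data-free, and scaling the dataset doesn't scale the context.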
I managed a deep learning team at Capital One and the lock-in thing is real. Replit is an interesting case study for me: after a one-week free agent trial I signed up for a one-year subscription, had fun with their LLM-based coding agent for a few weeks, and almost never used it after that, but I still have fun with Replit as an easy way to spin up Nix-based coding environments. Replit seems to offer something for everyone.
And everything, I mean everything, after the title is all downhill:
> saying "this car is so much cheaper now!" while pointing at a 1995 honda civic misses the point. sure, that specific car is cheaper. but the 2025 toyota camry MSRPs at $30K.
Cars got cheaper. The only reason you don't feel it is the trade barriers that stop BYD from flooding your local dealers.
> charge 10x the price point > $200/month when cursor charges $20. start with more buffer before the bleeding begins.
What does this even mean? The cheapest Cursor plan is $20, just like Claude Code. And the most expensive Cursor plan is $200, just like Claude Code. So clearly they're at the exact same price point.
> switch from opus ($75/m tokens) to sonnet ($15/m) when things get heavy. optimize with haiku for reading. like aws autoscaling, but for brains.
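Taking the article's own per-million-token figures at face value ($75/M for Opus, $15/M for Sonnet; the Haiku rate below is my placeholder, not from the article), the "autoscaling for brains" idea is just a trivial router plus arithmetic:

```python
# Toy "autoscaling, but for brains" router.
# Opus/Sonnet rates are the ones quoted above; haiku is a placeholder assumption.
PRICES = {"opus": 75.0, "sonnet": 15.0, "haiku": 1.0}  # $/M tokens

def pick_model(task: str) -> str:
    # Crude heuristic: cheap model for reading, mid-tier when things
    # "get heavy", big model only for hard reasoning.
    if task == "read":
        return "haiku"
    if task == "edit":
        return "sonnet"
    return "opus"

def cost(model: str, tokens: int) -> float:
    return PRICES[model] * tokens / 1_000_000

# Routing 10M "read" tokens to haiku instead of opus:
savings = cost("opus", 10_000_000) - cost("haiku", 10_000_000)
print(savings)  # → 740.0
```

Whether the cheaper model is actually good enough for the routed task is the part the arithmetic doesn't answer.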
> they almost certainly built this behavior directly into the model weights, which is a paradigm shift we’ll probably see a lot more of
"I don't know how Claude built their models and I have no insider knowledge, but I have very strong opinions."
> 3. offload processing to user machines
What?
> ten. billion. tokens. that's 12,500 copies of war and peace. in a month.
Unironically quoting data from the viberank leaderboard, which is just user-submitted numbers...
> it's that there is no flat subscription price that works in this new world.
The author doesn't know what throttling is...?
I've stopped reading here. I should've just closed the tab when I saw the first letter in each sentence isn't capitalized. This is so far the most glaring signal of slop. More than the overuse of em-dash and lists.
This has been working great for occasional use; I'd probably top up my account by $10 every few months. I figured the amount of tokens I use is vastly smaller than the packaged plans, so it made sense to go with the cheaper, pay-as-you-go approach.
But since I've started dabbling in tooling like Claude Code, hoo-boy, those tokens burn _fast_, like really fast. Yesterday I somehow burned through $5 of tokens in the space of about 15 minutes. I mean, sure, the Code tool is vastly different from asking an LLM about a certain topic, but I wasn't expecting such a huge leap. I guess a lot of the token usage is masked from you, wrapped up in the ever-increasing context plus back-and-forth tool orchestration, but still.
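Back-of-envelope, $5 in 15 minutes is less surprising than it feels. An agent loop re-sends a growing context on every tool call, so input tokens dominate. The rates and loop sizes below are illustrative assumptions, not anyone's actual price sheet:

```python
# Rough model of agent-loop spend. All numbers are assumptions for illustration.
INPUT_RATE = 3.0    # $/M input tokens
OUTPUT_RATE = 15.0  # $/M output tokens

def spend(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# 20 tool-call turns, each re-sending a ~50k-token context,
# each producing ~1k tokens of output:
turns, ctx = 20, 50_000
print(round(spend(turns * ctx, turns * 1_000), 2))  # → 3.3
```

A million input tokens slip by in twenty turns without a single long answer being generated, which is exactly the masked usage.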
michaelbuckbee•37m ago
Not every problem needs a SOTA generalist model, and as we get systems/services that are more "bundles" of different models with specific purposes I think we will see better usage graphs.
mustyoshi•15m ago
But we're still in the hype phase, people will come to their senses once the large model performance starts to plateau
alecco•10m ago
This shouldn't be that expensive even for large prompts since input is cheaper due to parallel processing.
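To make the parallelism point concrete: prefill chews through the whole prompt in parallel, while decode emits tokens one at a time. A crude latency model (throughput numbers are assumptions for illustration):

```python
# Prefill is parallel over the prompt; decode is sequential per token.
# Both throughput figures below are illustrative assumptions.
PREFILL_TOKS_PER_S = 10_000
DECODE_TOKS_PER_S = 50

def latency_s(prompt_tokens: int, completion_tokens: int) -> float:
    return prompt_tokens / PREFILL_TOKS_PER_S + completion_tokens / DECODE_TOKS_PER_S

# A 100k-token prompt costs ~10s of prefill, while a mere 1k-token
# answer costs ~20s of decode — the big prompt isn't the dominant term:
print(latency_s(100_000, 1_000))  # → 30.0
```

The same asymmetry is why providers price input tokens several times cheaper than output tokens.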
nateburke•6m ago
In the food industry is it more profitable to sell whole cakes or just the sweetener?
The article makes a great point about Replit and legacy ERP systems. The "generative" in generative AI will not replace storage; storage is where the margins live.
Unless the C in CRUD can eventually replace the R and U, with the D a no-op.