A Cursor-style "tab" model, but trained on JetBrains IDEs with full access to their internals, refactoring tools, and so on, would be interesting to see.
We need to adapt to new ways of thinking and working with new tooling. It is a learning curve of sorts. What we want is to solve problems, and the new tooling enables us to solve problems better by freeing up our thinking, reducing blockers and toil tasks, and giving us more time to think about higher-level problems.
I remember this same sentiment when I was growing up, but directed at cell phones rather than AI...
Speak for yourself, I want to understand everything and be elbow-deep in the code.
It's just that now I don't have to do that to build something meaningful; my ability to build has increased by some factor, and it is only increasing.
And coding LLMs have become a great teacher for me, and I learn much faster: when I do want to dig deeper into the code, I can ask very nuanced questions about what certain code is doing or how it works, and they do a fairly good job of explaining it. Similar to how a real person would if I were in meatspace at an office, which is an opportunity I don't get anymore in this remote life.
Now, to directly push on your perspective, I'm not sure why you conclude that you don't have the opportunity for feedback just because you've moved to a remote office culture. I am giving you a form of feedback in this instance. Yes, it is at my whim and not guaranteed if our interests don't align, but that is a cost of collaboration. It is a bit grim to see the coding LLM ushered in as a proper replacement here, when you are doing no more than bootstrapping introspection. This isn't to detract from the value you've found in the tool; I only question why you've written off the collaboration element of unique human experiences interlocking on common ground.
Sure. But the same for NFTs.
We'll see which one this winds up being.
The value of AI coding is that it can eliminate some of the labor of programming, which is the overwhelming majority of the cost.
These value propositions are nothing alike.
This describes OpenAI’s valuation pretty well.
Said the senior assembly specialist when first confronted with this newfangled Fortran compiler shit.
Congrats to the team. Can’t wait to try it.
I would probably never run a second agent unless I expected the task to take at least two hours; any more than that and the cost of multitasking for my brain is greater than any benefit, even when there are things I could theoretically run in parallel, like several hypotheses for fixing a bug.
IIRC Thorsten Ball (author of Writing an Interpreter in Go, lead engineer on Amp) also said something similar on a podcast – he's a single-tasker, despite some of his coworkers preferring fleets of agents.
I've recently described how I vibe-coded a tool to run this single background agent in a Docker container in a jj workspace[0] while I work with my foreground agent, but... my reviewing throughput is usually saturated by a single agent already, and I barely ever run the second one.
New tools keep coming up for running fleets of agents, and I see no reason to switch from my single-threaded Claude Code.
What I would like to see instead are efforts to make the reviewing step faster. The Amp folks had an interesting preview article on this recently[1]. This is the direction I want tools to explore if they want to win me over - help me solve the review bottleneck.
So instead of interactively making one agent do a large task, you make small agents do the coding while you focus on the design.
The obvious answer to this is that it is not feasible to retry each past validation for each new change, which is why we have testing in the first place. Then you’re back at square one because your test writing ability limits your output.
Unless you plan on also vibe-coding the tests and treating the whole job as a black box, in which case we might as well just head for the bunkers.
Yes, that is exactly what I mean. You ask the Wizard of Oz for something, and you hear some sounds behind the curtain, and you get something back. Validate that, and if necessary, ask Oz to try again.
"The obvious answer to this is that it is not feasible to retry each past validation for each new change"
It is reasonably feasible, because the jobs of Product Development and QA have existed all along; developers just sat in the middle. Now we remove the developer and move them into a combined Product + QA role, and all Product + QA was ever able to validate was developer output (which, as far as they were ever concerned, was an actual black box, since they don't know how to program).
The developer disappears when they are made to disappear or decide to disappear. If the developer begins articulating ideas in language like a product developer, and then validates like a QA engineer, then the developer has "decided" to disappear. Other developers will be told to disappear.
The existential threat to the developer is not when the company mandate comes down that you are to be a "Prompt Engineer" now; it is when the mandate comes down that you need to be a Product Designer now (as in, you are mandated not to write a single. line. of. code.). In which case vast swaths of developers will not cut it on a pure talent level.
If yes, the QA is manual-ish (where manual == not automated by AI) and we're still bottlenecked, so speeding up the engineer bought us nothing.
If no, because QA is also AI, then you have a product with no human eyes on it being tested by another system with no human eyes on it. So effectively nobody knows what it does.
If you think LLMs are anywhere near that level of trust, I don't know what you're smoking. They're still doing things like "fixing" tests by removing relevant non-passing cases every day.
My CTO is currently working on the ability to run several dockerised versions of the codebase in parallel for this kind of flow.
I'm here wondering how anyone could work on several tasks at once at a speed where they can read, review, and iterate on the output of one LLM in the time it takes for another LLM to spit out an answer for a different task.
Like, are we just asking things as fast as possible and hoping for a good solution, unchecked? Are others able to context switch on every prompt without a reduction in quality? Why are people tackling the problem of prompting at scale as if the bottleneck were token output rather than human reading and reasoning?
If this was a random vibecoding influencer I’d get it, but I see professionals trying this workflow and it makes me wonder what I’m missing.
Maybe code husbandry?
Imagine one agent that just does docstrings - on commit, build an AST, branch, write/update comments accordingly, then push and create a merge request with a standard report template (rough sketch below).
Each of these mini-agents has a defined scope and operates in its own environment, and can be customized/trained as such. They just run continuously on the codebase based on their rules and triggers.
The idea is that all these changes bubble up to the developer for approval, just maybe after a few rounds of LLM iteration. The hope is that small models can be leveraged for higher-quality output and can operate asynchronously.
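To make the docstring example concrete, here is a minimal sketch of the detection-and-report half of such an agent, assuming a Python codebase; the branching, the actual LLM call, and the merge-request plumbing are only hinted at in comments since they depend entirely on your tooling.

```python
import ast
import sys
from pathlib import Path

REPORT_TEMPLATE = "## Docstring agent report\n\nFiles needing attention: {count}\n\n{details}\n"

def functions_missing_docstrings(source: str) -> list[str]:
    """Return names of functions/classes in the source that lack a docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing

def run(repo_root: str) -> str:
    """Scan the repo and build the standard report the agent would attach to its MR."""
    details = []
    for path in Path(repo_root).rglob("*.py"):
        missing = functions_missing_docstrings(path.read_text())
        if missing:
            # A real agent would branch here, ask its model to draft docstrings
            # for each name, apply them, and push; this sketch only reports.
            details.append(f"- {path}: {', '.join(missing)}")
    return REPORT_TEMPLATE.format(count=len(details), details="\n".join(details) or "Nothing to do.")

if __name__ == "__main__":
    print(run(sys.argv[1] if len(sys.argv) > 1 else "."))
```

The point is the narrow scope: the only decision the small model ever has to make is "what should this docstring say", which is exactly the kind of task where cheap, fast models seem plausible.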
I can’t help but feel it’s like texting and driving, where people are overvaluing their ability to function with reduced focus. But obviously I have zero data to back that up.
There's pretty much no way anyone context switching that fast is paying a lick of attention. They may be having fun, like scrolling tiktok or playing a videogame just piling on stimuli, but I don't believe they're getting anything done. It's plausible they're smarter than me, it is not plausible they have a totally different kind of brain chemistry.
Then I take it a step further and create core libraries that are structured like standalone packages and architected like third-party libraries, with their own documentation and public API, which gives clear boundaries of responsibility (sketch at the end of this comment).
Then the only somewhat manual step is to copy/paste the agent's notes on the changes it made so that dependent systems can integrate them.
I find this way more sustainable than spawning multiple agents on a single codebase and then having to resolve merge conflicts between them as each task completes. It's not unlike traditional software development, where a branch awaiting review contains some general functionality that would benefit another branch, and you're left either cherry-picking a commit, sharing it between PRs, or lumping your PRs together.
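As a rough illustration of the kind of boundary I mean (all names here are hypothetical), the consuming code only ever imports what the package's `__init__.py` re-exports:

```python
# Hypothetical layout for one of these "internal third-party" core libraries:
#
#   core/billing/
#       README.md       <- its own documentation, written for consumers
#       __init__.py     <- the only module other code is allowed to import from
#       _invoices.py    <- private implementation, free to change
#       _tax.py
#
# core/billing/__init__.py -- the public API is exactly what gets re-exported here.
from ._invoices import Invoice, create_invoice
from ._tax import tax_for_region

__all__ = ["Invoice", "create_invoice", "tax_for_region"]
```

An agent churning inside core/billing can rewrite the private modules as much as it likes; dependent systems only need to hear about it when the re-exported names change.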
Depending on the project I might have 6-10 IDE sessions. Each agent has its own history then and anything to do with running test harnesses or CLI interactions gets managed on that instance as well.
My take on this is that as these things get better, we will eventually be able to infer and quantify signals that give us high-confidence scores, allowing a better review with a shorter decision path. This is akin to how compilers, parsers, and linters can give you some level of safety without strong guarantees but are often "good enough" to pass a smell test.
I hope we'll get a model that's not necessarily intelligent, but is at least competent at following instructions and very fast.
I overwhelmingly prefer the workflow where I have an idea for a change and the AI implements it (or pushes back, or does it in an unexpected way) - that way I still have a general idea of what's going on with the code.
I prefer to use a single agent without pauses and catch errors in real time.
Multiple-agent people must be using pauses, switching between agents and checking every result.
I already use this workflow myself, just multiple terminals with Claude on different directories (bare-bones version sketched at the end of this comment). There are like 100 of these "Claude with worktrees in parallel" UIs now; I would have expected some of the usual JetBrains value-adds, like deep debugger integration or a fancy test-runner view. The only one I see called out is Local History, and I don't see any deep diff or find-in-files integration to diff or search between the agent worktrees, nor the JetBrains commit, shelf, etc. git integration that we like.
I do like the Cursor-like highlight-and-add-to-context feature and the kanban-board-style view of agent statuses, but this is nothing new. At the very least I would have expected JetBrains to provide a fancier UI that lets you select which directories or scopes should be auto-approved for edits, or other fine-grained auto-approve permissions for the agent.
In summary, it looks like just another parallel Claude UI rather than a JetBrains take on it. It also seems to be a separate IDE rather than built on the IntelliJ platform, so they probably won't turn it into a plugin in the future either.
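For reference, the do-it-yourself version of this is just git worktrees plus whichever agent CLI you already use. A minimal sketch, where the task names are placeholders and "claude" stands in for your agent command:

```python
import subprocess
from pathlib import Path

# Hypothetical tasks to hand out, one agent per worktree.
TASKS = {
    "fix-flaky-login-test": "Investigate and fix the flaky login integration test.",
    "extract-billing-client": "Extract the billing HTTP client into its own module.",
}

def spawn(repo: str = ".", workdir: str = "../worktrees") -> None:
    for branch, prompt in TASKS.items():
        path = Path(workdir) / branch
        # One worktree per task, so agents never step on each other's files.
        subprocess.run(
            ["git", "-C", repo, "worktree", "add", "-b", branch, str(path)],
            check=True,
        )
        # Print the command to run in a separate terminal tab; "claude" stands in
        # for whichever agent CLI you actually use.
        print(f"cd {path} && claude '{prompt}'")

if __name__ == "__main__":
    spawn()
```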
https://blog.jetbrains.com/codecanvas/2025/10/jetbrains-is-s...
I'm team JetBrains4Life when it comes to IDEs, but their AI offerings have been a pretty mixed bag of mixed messages. And this one requires a separate subscription, at that, when I'm already paying for their own AI product.
- syntax errors displaying persistently even after being fixed (frequently; until restarted; not seen very recently)
- files/file tree not detecting changes to files on disk (frequent; until restarted; not seen very recently)
- cursor teleporting to specific place on the screen when ctrl is pressed (occasionally; until restarted)
- and most recently: it not accepting any mouse/keyboard input (occasionally; until killed)