We really need the best of both worlds: an IDE (powerful, like IntelliJ) plus an ADE (multitasking on code).
And how does it compare to other tools like Conductor?
The ADE is best for steering multiple agents and reviewing their changes, especially once you care about isolated worktrees, diffs, artifacts, and landing changes cleanly.
When you need deep code navigation, the best answer is usually to open the worktree in your IDE. IDEs are already world-class at navigation and refactoring, so there’s no reason to rebuild that badly inside an agent UI.
Compared with Conductor, a few differences:
- Conductor relies mostly on the safety model of the underlying harnesses; ctx can run work in VM/container-isolated environments with explicit network policy.
- ctx has a local merge queue for landing changes from multiple agent worktrees onto each other.
- Conductor is a local Mac app; ctx also works with Linux and is designed for the “local app + remote Linux runtime” model for devapp/VPS.
- Conductor is focused mainly on Claude Code and Codex; ctx is meant to be a broader environment around multiple harnesses.
There are also substantial UX differences, but those are easier to judge by trying them.
Similarly, I built a self-hostable Replit-like server with RAG, but it's more end-user focused than developer focused...
With something like Cursor, you can use models from OpenAI, Anthropic, etc, but they still run through Cursor’s own agent harness.
With ctx, you bring the existing harness itself — Copilot, Claude Code, Codex, and so on — and it keeps its own auth/billing/session model. ctx is the layer around that: worktrees, review, runtime boundaries, merge queue, etc.
This reduces tool calls (and thus saves time and tokens) because instead of "trying" or "guessing" names repeatedly, tools like Claude Code typically get useful search results on the first try.
Claude, for example, may search for "dbal" via regex when the function name is actually "sql". Semantic search will find that, whereas with regex Claude would make 3 additional guesses before it finds what it's looking for. Hope this helps!
So Claude Code, Codex, OpenCode, etc keep their normal tools/capabilities rather than being reimplemented inside a new proprietary agent. If a harness has its own indexing/code-search story, you still get that; if it doesn’t, ctx doesn’t provide additional tools like codebase indexing.
The only additional tools we do provide are orchestration-related:
- local merge queue for agents (submit your diff and make sure it lands cleanly on others)
- agnostic subagents (for example, a Claude Code primary agent can invoke a Codex subagent)
That’s also not how I think about ctx. The UI is a workbench around agents, not a replacement for IntelliJ/VS Code. If you need deep code navigation, refactors, debugger-heavy work, etc., the right answer is usually to open the same worktree in your IDE.
ctx includes surfaces for diff review and an integrated terminal, but not code editing or a full-fledged IDE. It's not a fork of VSCode.
The added isolation does come with some friction though, which is kind of by design.
The workflow is like this:
1. an agent works in its own worktree
2. its changes are green in isolation
3. it submits that work to the local merge queue
4. the queue replays the change on top of the latest target branch and runs verification
5. if it conflicts or fails after replay, the merge is rejected
6. the agent can then pull in the new upstream state, resolve the conflict or test failure, and resubmit
We've found that agent-driven conflict resolution via a merge queue works really well in practice. It's almost necessary because of the increase in velocity of changes.
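To make the six steps above concrete, here is a minimal in-memory sketch in Python. It is not how ctx implements its queue. `Tree`, `Change`, `MergeQueue`, and the `verify` callback are hypothetical names invented for illustration. A plain dict stands in for a git tree, and a "conflict" is simplified to "a file the agent edited has changed on the target since the agent branched".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tree = Dict[str, str]  # path -> file contents (a stand-in for a git tree)


@dataclass
class Change:
    agent: str
    base: Tree   # snapshot of the target branch the worktree started from
    edits: Tree  # files the agent wrote in its own worktree


@dataclass
class MergeQueue:
    target: Tree
    verify: Callable[[Tree], bool]  # hypothetical check, e.g. "run the tests"
    log: List[str] = field(default_factory=list)

    def submit(self, change: Change) -> bool:
        # Replay on top of the latest target: any edited file that moved
        # on the target since the agent's base counts as a conflict.
        conflicts = [path for path in change.edits
                     if self.target.get(path) != change.base.get(path)]
        if conflicts:
            self.log.append(f"{change.agent}: rejected, conflict in {conflicts}")
            return False
        candidate = {**self.target, **change.edits}
        # Verification runs on the replayed result, not on the isolated worktree.
        if not self.verify(candidate):
            self.log.append(f"{change.agent}: rejected, verification failed")
            return False
        self.target = candidate
        self.log.append(f"{change.agent}: landed")
        return True


main = {"a.py": "x = 1", "b.py": "y = 1"}
queue = MergeQueue(target=dict(main),
                   verify=lambda tree: "BROKEN" not in "".join(tree.values()))

first = queue.submit(Change("agent-1", base=dict(main), edits={"a.py": "x = 2"}))
second = queue.submit(Change("agent-2", base=dict(main), edits={"a.py": "x = 3"}))
# agent-2 pulls the new upstream state, resolves, and resubmits.
retry = queue.submit(Change("agent-2", base=dict(queue.target), edits={"a.py": "x = 3"}))
broken = queue.submit(Change("agent-3", base=dict(queue.target), edits={"b.py": "BROKEN"}))
```

Here `agent-1` lands, `agent-2` is rejected because `a.py` moved underneath it, the resubmission on the fresh base lands, and `agent-3` is rejected by verification. A real queue would replay commits with git and run the project's actual test suite.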
Regarding the sandboxing approach, containers are primary right now. We do this natively on Linux and with Apple Virtualization Framework (AVF) on Mac. So yes, there is a VM involved on Mac, but it's not exposed as a separate top-level mode.
TLDR: use an ADE if you need multiple agents working concurrently on your code base. Otherwise IDE with an agent plugin is probably fine.
My setup is that I run `/merge`[1], which first has the agent rebase its changes onto base. On conflicts, it's instructed to understand both sides before resolving, which helps it merge them cleanly. I haven't resolved conflicts manually in months and also haven't had any issues with agents resolving them incorrectly. A solved problem as far as I'm concerned.
One thing we found works really well is having the agent read the other agent’s plan document when it hits a merge conflict, not just the diff. A lot of conflicts are hard to resolve correctly without the intent behind the change.
When I am working with Claude, I am often doing it from the root directory of a workspace of dozens of repos. I work with Claude to come up with a plan for implementing a feature, and it investigates and plans. That plan often encompasses multiple repositories. Claude then turns large-scale plans into smaller issues, or tickets, as artifacts.
There are basically two ways to approach it:
- If one repo is primary and the others are mostly reference material, use workspace attachments. That lets the agent work in one repo while still being able to read the others. I do this a lot with dependency/source repos.
- If the work genuinely spans multiple repos, just initialize the workspace at the parent directory that contains all of them. The harness still sees the same filesystem layout it normally would, so Claude/Codex/etc. can plan and work across repos the same way.
The main caveat is that some features are naturally more repo-specific. Merge queue is the obvious example, since landing and replay are much cleaner when there is one target repo/branch model.
- "I tried 47 agentic AI cli tools posted on HN in the last month. Here are the shocking results"
luca-ctx•2h ago
The multi-thread, worktree-based interface will probably look familiar. The parts HN may care more about are the containerized workspaces, remote-host model, and local merge queue for multi-agent work.
xrd•1h ago
My solution has been to create a new VM which inherits a Claude CLI and Gemini CLI preinstalled.
That way I can configure at a host level all the permissions I want, and it is less likely the agent will access full sets of files or, even worse, delete things. I know this limits what I can do, but I am exhausted by understanding and auditing the different options for each agent.
I can install a new agent on that VM and then try it, but it is hard to justify the effort to test each one.
What am I getting from your tool, for example? Worktree support is somewhat common, right? Does this give me multi-agent support that Gemini and Claude do not? Does that mean collaboration across team members? Is your approach better, or safer, than what I'm doing? How do I verify those claims?
Can I use your tool with local models like Gemma 4 and Ollama/llama.cpp? I have three 24 GB Nvidia cards and would like to try a three-agent approach: one to write the code, one to write tests, one to architect. I obviously can't use local models with Gemini and Claude CLI.
I'm just riffing on my concerns, and thanks for listening.