You need to rewrite your CLI for AI agents

https://justin.poehnelt.com/posts/rewrite-your-cli-for-ai-agents/

73•justinwp•12h ago

Comments

justinwp•12h ago

Title is a little clickbait, but not really! :)

---

Human DX optimizes for discoverability and forgiveness. Agent DX optimizes for predictability and defense-in-depth. These are different enough that retrofitting a human-first CLI for agents is a losing bet.

dang•2h ago

Related ongoing thread:

Google Workspace CLI - https://news.ycombinator.com/item?id=47255881 - March 2026 (136 comments)

smy20011•2h ago

Are we reinventing RPC again? Calling a CLI program with JSON sounds like an RPC call. The schema feels like something LSP could provide for such functions.

Maybe asking the agent to write and execute code that wraps the CLI is a better solution.
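A minimal sketch of what such a wrapper could look like, assuming a CLI that reads a JSON request on stdin and writes a JSON reply on stdout. No real tool is being described here: `FAKE_CLI`, `call_cli`, and the request shape are all hypothetical stand-ins.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a real CLI: it reads a JSON request on stdin and
# writes a JSON response on stdout. A real tool's command would replace this.
FAKE_CLI = [
    sys.executable,
    "-c",
    "import json,sys; req=json.load(sys.stdin); "
    "print(json.dumps({'ok': True, 'echo': req}))",
]


def call_cli(command, request, timeout=10.0):
    """Send `request` as JSON on stdin and return the parsed JSON reply."""
    proc = subprocess.run(
        command,
        input=json.dumps(request),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # Surface stderr so the caller (human or agent) sees why it failed.
        raise RuntimeError(f"CLI failed ({proc.returncode}): {proc.stderr.strip()}")
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # The method name and params are made up for illustration.
    print(call_cli(FAKE_CLI, {"method": "files.list", "params": {"limit": 3}}))
```

Seen this way, the resemblance to RPC is fairly direct: a request object goes in, a response object comes out, and the process boundary plays the role of the transport.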

tayo42•1h ago

Doesn't powershell have structured input and output?

CamperBob2•1h ago

Confused? You won't be, after this week's episode of https://en.wikipedia.org/wiki/SOAP.

Everything old is new again...

lejalv•1h ago

> The real question: what does it actually look like to build for this?

What was the not-so-real question? Or the surreal question?

I know it's becoming tiresome to complain about slop on HN. But folks! Put a bit of care into your writing! It is starting to look as if people had one more agent skill, "write blogpost", with predictable results. We are not a Python interpreter putting up with meh-to-disgusting code, but actual humans with real lives and a sense of taste in communication.

redanddead•1h ago

blogslop

vasco•59m ago

The content is good, who cares. What gets more tiresome is complaints like this. The post was upvoted so people like it.

jeppeb•1h ago

The article states that agents work better with JSON than with documented flags. That seems counterintuitive. How was this assumption validated?

justinwp•25m ago

Try building a CLI with a complex JSON as flags approach. :)

peddling-brink•14m ago

I just did the opposite and am seeing better results.

Claude was struggling to use the ‘gh’ command to reliably read and respond to line-level code review comments because it had to use the API. I had it write a few simple command-line tools and a skill for invoking them, and got instantly better results.

YMMV

computerfriend•1h ago

> Humans rarely typo a traversal.

I don't think this is true?

MattGaiser•1h ago

The typos are a bit different, but that’s one reason I hate the command line as a human.

You want me to hand type a file name? I’ll flip a letter or skip one!

bewuethr•1h ago

I like to let tab completion write my filenames for me whenever possible.

esseph•46m ago

I tab complete, and fish shell also gives a nice little arrow-selectable menu of options.

devmor•1h ago

This is a very silly premise to begin with.

If AI agents are so underdeveloped and useless that they can’t parse out CLI flags, then the answer is not to rewrite the CLI.

You either give the agents an API layer or you don’t use them because they’re not mature enough for the problem space.

lynx97•1h ago

I love how AI gave the command-line and TUI interfaces a kind of Second Renaissance. It is not just AI that loves CLIs. It is especially blind people like me, who still use a lot of text-mode tools for their implicit accessibility. I gave codex a whirl recently, and hey! No accessibility problems at all. Just works. A few years back, that would have been released as a GUI-only program and would have locked me out completely[1]. A blessing that text oriented interaction is becoming important again!!!

1: Strictly speaking, there are ways to access some GUI programs on Linux with a screen reader. However, frankly, most are not really a joy to use. The speed of interaction I get from a TUI is simply unmatched. Whenever I work with a true GUI, no matter if Windows, Mac or Linux, it feels like I am trying to run away from a monster in a dream. I try to run, but all I manage to do is wobble about...

utopiah•33m ago

A second renaissance? The internet has been running on CLIs the entire time. All modern services relying on containers rely on images based on CLIs. There is no renaissance needed because it never stopped.

lynx97•28m ago

Sure, server-side. What I am talking about is user-facing programs. That is why I mentioned codex as an example.

rsanheim•1h ago

No. Nope. Agents do just fine with all sorts of CLIs. Old standards, new custom stuff, whatever.

The CLIs I’ve seen agents struggle with are those that wrap an enormous, unwieldy, poorly designed API under one namespace. All of Google Workspace apis, for example.

jsunderland323•1h ago

I'm working on a CLI now.

The pattern I used was this:

1) made a docs command that printed out the path of the available docs

$ my-cli docs

- README.md

- DOC1.md

- dir2/DOC2.md

2) added a --path flag to print out a specific doc (tried to keep each doc less than 400 lines).

$ my-cli docs --path dir2/DOC2.md

# Contents of DOC2.md

3) added embeddings so I could do semantic search

$ my-cli search "how do I install x?"

[1] DOC1.md

"You can install x by ..."

[2] dir2/DOC2.md

"after you install..."

You then just need a simple skill to tell the agent about the docs and search command.

I actually love this as a pattern; it works really well. I got it to work with i18n too.
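A rough sketch of the `docs` and `search` commands described above, assuming the docs are Markdown files under a single directory. This is not the commenter's implementation: the `--root` flag and the function names are hypothetical, and plain keyword counting stands in for the embedding-based search.

```python
import argparse
import sys
from pathlib import Path


def list_docs(root):
    """Return doc paths relative to `root`, sorted for stable output."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.md"))


def read_doc(root, rel_path):
    """Return one doc's text, refusing paths that resolve outside `root`."""
    base = root.resolve()
    target = (base / rel_path).resolve()
    target.relative_to(base)  # raises ValueError if target escapes base
    return target.read_text(encoding="utf-8")


def search_docs(root, query, limit=3):
    """Rank docs by keyword counts (a stand-in for semantic search)."""
    terms = set(query.lower().split())
    scored = []
    for rel in list_docs(root):
        text = (root / rel).read_text(encoding="utf-8").lower()
        score = sum(text.count(term) for term in terms)
        if score:
            scored.append((score, rel))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rel for _, rel in scored[:limit]]


def main(argv):
    parser = argparse.ArgumentParser(prog="my-cli")
    # --root is an assumption of this sketch, not part of the original tool.
    parser.add_argument("--root", type=Path, default=Path("docs"))
    sub = parser.add_subparsers(dest="command")
    docs = sub.add_parser("docs", help="list docs, or print one with --path")
    docs.add_argument("--path")
    search = sub.add_parser("search", help="find docs matching a query")
    search.add_argument("query")
    args = parser.parse_args(argv)

    if args.command == "docs" and args.path:
        print(read_doc(args.root, args.path))
    elif args.command == "docs":
        for rel in list_docs(args.root):
            print(f"- {rel}")
    elif args.command == "search":
        for rank, rel in enumerate(search_docs(args.root, args.query), start=1):
            print(f"[{rank}] {rel}")
    else:
        parser.print_help()
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

The skill file then only has to mention the two commands; the agent pulls in individual docs when it needs them rather than loading everything up front.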

danw1979•1m ago

I really like this, especially the embedded search. What do the embeddings and model cost you in terms of binary size?

sheept•1h ago

This feels completely speculative: there's no measure of whether this approach is actually effective.

Personally, I'm skeptical:

- Having the agent look up the JSON schemas and skills to use the CLI still dumps a lot of tokens into its context.

- Designing for AI agents over humans doesn't seem very future proof. Much of the world is still designed for humans, so the developers of agents are incentivized to make agents increasingly tolerate human design.

- This design is novel and may be fairly unfamiliar in the LLM's training data, so I'd imagine the agent would spend more tokens figuring this CLI out compared to a more traditional, human-centered CLI.

gck1•56m ago

Yeah, people seem to forget one of the L's in LLM stands for Language, and human language is likely the largest chunk in training data.

A cli that is well designed for humans is well designed for agents too. The only difference is that you shouldn't dump pages of content that can pollute context needlessly. But then again, you probably shouldn't be dumping pages of content for humans either.

Smaug123•13m ago

It's not obvious that human language is or should be the largest share of training data. It's much easier to generate training data from computers than from humans, and having more training data is very valuable. In particular, one could imagine creating a vast number of debugging problems, with logs and associated command outputs, and training on them.

magospietato•46m ago

Surely the skill for a cli tool is a couple of lines describing common usage, and a description of the help system?

sheept•38m ago

Sure, but the post itself brags,

> gws ships 100+ SKILL.md files

Which must altogether be hundreds of lines of YAML frontmatter polluting your context.

justinwp•26m ago

You don't need to install all of them.

danw1979•7m ago

Claude Code, at least, will only load a SKILL.md file into context when it’s invoked by the user or the LLM itself, i.e. on demand.

jauntywundrkind•47m ago

And for more persistent services, it's worth considering varlink, for your agent's sake and also if you just need two CLI things to chat. https://varlink.org/

The systemd universe is moving this way from dbus, and there doesn't seem to be a ton of protest against giving up dbus for json over unix sockets. There's really not that many protocols that are super pleasant for conversing with across sockets.

inkdust2021•44m ago

There are so many different CLIs. Why not use them, or base a rewrite on them?

Lio•40m ago

“Need”?

Why would I need to do that?

Someone else might ”want” me to do that but it’s not a ”need” I have. I don’t see the article making a good case for that.

antisol•38m ago

Why are you calling it "AI" if it can't even parse a goddamn man page?

theshrike79•13m ago

It can, but it pollutes the context with useless information costing you money.

danirod•37m ago

With all due respect, if humans can figure out how new, unseen programs work by using -h and seeing what options exist and what they do, I am sure robots can figure it out too, or else they weren’t that intelligent to begin with.

theshrike79•14m ago

The difference is that humans will remember the options listed by -h after a few times of using the tool.

AIs don't. If they don't reach for the --help switch every time they'll attempt the statistical average, which may or may not work.

For super-common or popular tools like `gh` the usage is already in the training data though.

utopiah•36m ago

That's how artificial this "intelligence" is, when LLMs can't even use text-based tools full of coherently formatted text-based documentation without those very tools being adapted.

bonoboTP•12m ago

It looks like an AI generated fluff article without any evidence. People also did this for image generators as if you needed these arcane templates to prompt them, but actually the latest models are great at figuring out what you want from messy human input. Similarly LLMs can use regular CLI just fine. But how do you write a hype FOMO article about the fact that actually you don't need to do anything...

bayindirh•28m ago

Haha, no.

I write my tools for humans, without help or use of AI. If the AI agent wants to use my tool so bad, they need to rise to that level. I'll not crouch on my knees to meet it.

If I ever write a tool for AI interaction, I'd give it a well-defined API, to make it even easier for the agent.

donpark•28m ago

I suspect this CLI madness is just a phase, but if the trend continues, I trust AI agents to handle the necessary rewrites.

ZeroGravitas•27m ago

Some of this seems a bit overhyped.

I like CLI tools with json output that can be piped through jq. I've seen llms do that with existing tools.

The human needs and llm needs seem to overlap, especially if the human is using scripts and piping between tools.

The number of times it implies you didn't need to validate "human" input until llms arrived is scary too.

I'm also surprised to hear them say llms shouldn't Google, as that seems an area that Google themselves could optimize their search service for their llms and get an advantage.

Finally, I wonder if just using older, smaller llms is a valid fuzzing approach for clis (or anything else llms might be controlling). Or do you need a high powered llm trained to generate adversarial input?
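As an illustration of that overlap, here is a small sketch of one command serving both audiences: aligned text by default and JSON behind a flag. The tool name `my-tool`, the `--json` flag, and the sample records are hypothetical.

```python
import argparse
import json
import sys

# Hypothetical records; a real tool would fetch these from its backend.
ITEMS = [
    {"name": "report.txt", "size": 120},
    {"name": "notes.md", "size": 48},
]


def render(items, as_json):
    """Return JSON for scripts and agents, or an aligned table for people."""
    if as_json:
        return json.dumps(items, sort_keys=True)
    width = max((len(item["name"]) for item in items), default=0)
    return "\n".join(
        f"{item['name']:<{width}}  {item['size']:>6}" for item in items
    )


def main(argv):
    parser = argparse.ArgumentParser(prog="my-tool")
    parser.add_argument("--json", action="store_true", help="emit JSON output")
    args = parser.parse_args(argv)
    print(render(ITEMS, args.json))
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

With output like this, something along the lines of `my-tool --json | jq '.[].name'` works the same whether a person or an agent typed it.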

danw1979•4m ago

> The number of times it implies you didn't need to validate "human" input until llms arrived is scary too.

I took away a completely different message: humans and LLMs make different mistakes that require different validation.
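A sketch of what validating for that second class of mistake could look like, assuming the CLI accepts a relative path from an untrusted caller. The function name and the specific rules are assumptions made for illustration; they are not taken from the article.

```python
from pathlib import PurePosixPath


def validate_relative_path(value):
    """Return a normalized relative path string, or raise ValueError.

    The checks are purely lexical: nothing on disk is touched.
    """
    if not value:
        raise ValueError("empty path")
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        raise ValueError("control character in path")
    if "\\" in value:
        raise ValueError("backslash in path")
    path = PurePosixPath(value)
    if path.is_absolute():
        raise ValueError("absolute path not allowed")
    if ".." in path.parts:
        raise ValueError("parent-directory segment not allowed")
    return str(path)


if __name__ == "__main__":
    for candidate in ["dir2/DOC2.md", "../../etc/passwd", "/etc/passwd"]:
        try:
            print("ok:", validate_relative_path(candidate))
        except ValueError as exc:
            print("rejected:", repr(candidate), "-", exc)
```

A misplaced letter from a human would pass these checks and fail later with "file not found", while a generated `../..` sequence is rejected up front. That is the sense in which the two kinds of mistake call for different validation.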

948382828528•25m ago

I do not.

climike•22m ago

cliwatch.com, creating some benchmarks, reach out if you are interested :)

_heimdall•12m ago

I strongly disagree here. Yes, build CLIs. No, don't target them at agents.

Build for humans, including good man pages or `--help` docs as needed.

If LLMs are worth the name AI, they will understand how to discover and use Unix-style commands. In my experience, this is exactly the case, and I need only say "I use tool X for use case Y."

bingemaker•10m ago

If AI agents need CLIs, then what's stopping them from using APIs directly? I see CLIs as good wrappers over APIs, and nothing more. What will CLIs provide that `curl -X POST` can't or won't?
