I had tried coding with ChatGPT a year or so ago, and the effort needed to get anything useful out of it greatly exceeded any benefit, so I went into CC with low expectations, but I have been blown away.
Let me illustrate with a specific, simple example: fixing linter or compiler errors. The problems I solve with this method are all verifiable via the command line (this can usually be documented in CLAUDE.md). Claude Code will continuously adjust the code based on the linter's output until all errors are resolved. This process often takes quite some time. I typically do this after completing a feature. If Claude Code mistakenly thinks it has finished the task during one of these checks, it will halt the entire process. I then have to restart it using the same prompt to continue the task.
Therefore, I'm looking for an external tool to manage Claude Code. I haven't found one yet. I've seen some articles suggesting the use of a subagents approach, where tools like Gemini CLI or Codex could launch Claude Code. I haven't thoroughly explored this method yet.
Doesn’t matter if you tell it multiple times in CLAUDE.md to not skip checks, it will eventually just skip them so it can commit. It’s infuriating.
I hope that as CC evolves there is a better way to tell/force the model to do things like that (linters, formatters, unit/e2e tests, etc).
Students don't get to choose whether to take the test, so why do we give AI the choice?
I have a `task build` command that runs linters, tests and builds the project. All the commands have verbosity tuned down to minimum to not waste context on useless crap.
Claude remembers to do it pretty well. I have it in my global CLAUDE.md so I guess it has more weight? Dunno.
Example: read this log file and extract XYZ from it and show me a table of the results. Instead of having the agent read in the whole log file into the context and try to process it with raw LLM attention, you can get it to read in a sample and then write a script to process the whole thing. This works particularly well when you want to do something with math, like compute a mean or a median. LLMs are bad at doing math on their own, and good at writing scripts to do math for them.
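For instance, here is a minimal sketch of the kind of throwaway script an agent might write for that log task; the log format and the "ms" duration field are purely hypothetical:

```python
# Hypothetical throwaway script: pull request durations out of a log with
# lines like "2024-01-01T12:00:00 GET /api/foo 200 123ms" and summarize them.
import re
import statistics
import sys

durations = []
with open(sys.argv[1]) as log:
    for line in log:
        match = re.search(r"(\d+)ms\b", line)
        if match:
            durations.append(int(match.group(1)))

if durations:
    print(f"count:  {len(durations)}")
    print(f"mean:   {statistics.mean(durations):.1f}ms")
    print(f"median: {statistics.median(durations):.1f}ms")
else:
    print("no durations found")
```

The agent reads a few sample lines to learn the format, writes something like this, runs it over the whole file, and only the small table of results ends up back in its context.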
A lot of interesting techniques become possible when you have an agent that can write quick scripts or CLI tools for you, on the fly, and run them as well.
When you tell an LLM to check the code for errors, the LLM could simply "realize" that the problem is complex enough to warrant building [or finding+configuring] an appropriate tool to solve the problem, and so start doing that... but instead, even for the hardest problems, the LLM will try to brute-force a solution just by "staring at the code really hard."
(To quote a certain cartoon squirrel, "that trick never works!" And to paraphrase the LLM's predictable response, "this time for sure!")
That is for tasks where a programmatic script solution is a good idea though. I don't think your example of "check the code for errors" really falls in that category - how would you write a script to do that? "Staring at the code really hard" to catch errors that could never have been caught with any static analysis tool is actually where an LLM really shines! Unless by "check for errors" you just meant "run a static analysis tool", in which case sure, it should run the linter or typechecker or whatever.
After all, when an immediate problem seems like it could come up again, "taking the opportunity" to solve it once and for all by introducing workflow automation is what an experienced human engineer would likely do in such a situation (if they aren't pressed for time).
I used Claude to translate my application, and I asked it to translate each piece of text in the application to the best of its ability.
That worked great for one view, but when I asked it to translate the rest of the application in the same fashion, it got lazy and started to write a script to substitute some words instead of actually translating sentences.
Hmm. My experience of "the average programmer" doesn't look like yours and looks more like the LLM :/
I'm constantly flabbergasted as to how way too many devs fumble through digging into logs or extracting information or what have you because it simply doesn't occur to them that tools can be composed together.
From my experience, only a few rare devs do this. Most will stick with the (broken/wrong) GUI tools they already know, made by others, out of convenience.
For your existing browser session, you'd have to start it with an open socket connection already, as that's not enabled by default; but once you do, the server should be able to find the open local socket, connect to it, and execute controls.
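For a Chromium-based browser that usually means launching it with remote debugging turned on and then discovering targets over the DevTools HTTP endpoint. A rough sketch; port 9222 is just the conventional example, not anything this particular server requires:

```python
# Assumes the browser was started with remote debugging enabled, e.g.:
#   chromium --remote-debugging-port=9222
# The DevTools HTTP endpoint lists open tabs, each with a
# webSocketDebuggerUrl you can connect to and send commands over.
import json
from urllib.request import urlopen

with urlopen("http://localhost:9222/json/list") as resp:
    targets = json.load(resp)

for target in targets:
    print(target.get("title"), "->", target.get("webSocketDebuggerUrl"))
```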
Worth noting that this "control the browser" hype is quite deceiving and it doesn't really work well, IMO, because LLMs still suck at understanding the DOM, so you need various tricks to optimize for that. I would take OP's claims with a giant bag of salt.
Also, these automations are really easy to identify and block, as they are not organic inputs, so the actual use is very limited.
https://github.com/day50-dev/Mansnip
Wrapping this in a stdio MCP is probably a smart move.
I should just api-ify the code and include the server in the pip. How hard could this possibly be...
That's an interesting viewpoint from an AI marketing company.
I think the essential job of marketing is to help people make the connection between their problems and your solutions. Putting all on them in a kind of blamey way doesn't seem like a great approach to me.
That response suggests you aren't interested in discussion or conversation at all.
It suggests that your purpose here is to advertise.
That's fair but it's what I believe.
...see?
Being consistent with stating your beliefs isn't the same as engaging with and about those beliefs.
Advertising isn't conversation. Evangelism isn't discussion.
I agree that's the job of marketing, but I'm not someone who markets AI, I'm someone who helps large marketing organizations use it effectively. I agree that if my goal was to market it that wouldn't be an effective message, but my goal is for folks who work in these companies to take some accountability for their own personal development, so that's my message. Again, all I can do is be honest about how I feel and to be consistent in my beliefs and experiences working with these kinds of organizations.
Online discussion with randos about this topic is almost useless because everybody is quick to dismiss the other side as hopelessly brainwashed by hype, or burying their heads in the sand for fear of the future of their jobs. I've had much better luck talking about it with people I've known and had mutual respect with before all this stuff came out.
(And then the CISO sends some security tips email/slack announcement which is still dumb and useless even after an LLM added a bunch of emojis and fun language to it.)
I've always been an old-fashioned and slow developer. But it still seems to me, if most "regular" "average" developers churn out code that is more of a liability than an asset, if they can do that 10x faster, it doesn't really change the world. Most stuff still ends up waiting, in the end, for some slow work done right, or else gets thrown away soon enough.
I'm personally in the habit of answering even slightly complex questions by first establishing shared context - that is, I very carefully ensure that my conversational partner has exactly the same understanding of the situation that I do. I do this because it's frequently the case that we don't have a lot of overlap in our understanding, or we have very specific gaps or contradictions in our understanding.
If you're like many in this industry, you're working in a language other than what you were raised in, making all of this more difficult.
I think it's the penguin approach to risk management -- they know they need to jump in the water to get where they need to go, but they don't know where the orcas are. So they jostle closer and closer to the edge, some fall in, and the rest see what happens.
BTW, I probably shouldn't have commented only on the small part at the end that annoyed me. I'm fascinated by the idea that LLMs make highly custom software feasible, like your "claudsidian" system... that people will be able to get the software they want by describing it rather than being limited to finding something preexisting and having to adapt to it. As you point out, the unix philosophy is one way -- simple, unopinionated building blocks an LLM can compose out of user-level prompts.
Great way to describe the culture of fear prevalent at large companies.
Also, about a tool being overly complex: something like find, imagemagick, ffmpeg,… are not complex in themselves. They're solving a domain that is itself complex. But the tools are quite good; the evidence is their stability, where they've barely changed across decades.
ffmpeg does all things media conversion. If you don't want to learn how to use it, you find someone who does (or do the LLM gamble), or try to find a wrapper that has a simpler interface and hope the limited feature set encompasses your use cases.
A CLI tool can be extremely versatile. GUIs are full of accidental complexity, so unless your selling point is intuitiveness, it's just extra work.
Basically I have it sitting over the top of my notes and assisting with writing, editing, researching, etc.
I love obsidian for the same basic reason you do: it’s just a bunch of text files, so I can use terminal tools and write programs to do stuff with them.
So far I mostly use LLMs to write the tools themselves, but not actually edit the notes. Maybe I can steal some of your ideas!
FOMO is for fashions and fads, not getting things done.
I probably wouldn’t do it myself either, but that’s not really relevant to whether it works or not.
Filling food with opioids would be great for business, but hopefully you understand how that is not "good business"
I do not care that it is common. I want it to be not common.
I do not care that bad marketing tactics like this can be used to sell "good" products, whatever that means.
You're supposed to start with a use case that is unmet, and research/build technology to enable and solve the use case.
AI companies are instead starting with a specific technology, and then desperately searching for use cases that might somehow motivate people to use that technology. Now these guys are further arguing that it should be the user's problem to find use cases for the technology they seem utterly convinced needs to be developed.
(Well, I recently found there is a reason for it: I'm left-handed, and unlocking my phone with my left hand sometimes touches the icon stupidly put on the lock screen by default. Not that it would work: my phone is usually running with data disabled.)
It started as unshare and ended up being a bit of a yak-shaving endeavor to make things work, but I was able to get some surprisingly good results using gemma3 locally and giving it access to run arbitrary Debian-based utilities.
I'm curious to see what you've come up with. My local LLM experience has been... sub-par in most cases.
I've had much better luck with constrained, structured tools that give me control over exactly how the tools behave and what context is visible to the LLM.
It seems to be all about making doing the correct thing easy, the hard things possible, and the wrong things very difficult.
Not around privacy, mind you. If your notes contain nothing that you wouldn’t mind being subpoenaed or read warrantlessly by the DHS/FBI, then you are wasting your one and only life.
exact opposite of the unix philosophy
Well, no, they aren't, but the orchestration frameworks in which they are embedded sometimes are (though a lot of times a whole lot of that everything is actually done by separate binaries the framework is made aware of via some configuration or discovery mechanism.)
The article is framing LLMs as a kind of fuzzy pipe that can automatically connect lots of tools really well. This ability works particularly well with unix-philosophy, do-one-thing tools, and so being able to access such tools opens a superpower that is unique and secretly shiny about Claude Code that browser-based ChatGPT doesn't have.
This feels a bit like rediscovering stateless programming. Obviously the filesystem contents can actually change, but the idea of running the same AI with the same command(s) and getting the same, idempotent result would be lovely. Even better if the answer is right.
What even is this? Is it all AI slop? All of these articles are borderline nonsensical, in that weird dreamy tone that all AI slop has.
To see this waxing poetic about the Unix philosophy, which couldn't be farther from the modern "AI" workflow, is... something I can't quite articulate, but let's go with "all shades of wrong". Seeing it on the front page of HN is depressing.
Now, due to tools like claude code, CLI is actually clearly the superior interface.
(At least for now)
It's not supposed to be an us vs them flamewar, of course. But it's fun to see a reversal like this from time to time!
The CLI has been dead for end-users since computers became powerful enough for GUIs, but the CLI has always been there behind the scenes. The closest we have been to the "CLI is dead" mentality was maybe in the late 90s, with pre-OSX MacOS and Windows, but then OSX gave us a proper Unix shell, Windows gave us PowerShell, and Linux and its shell came to dominate the server market.
You were obviously not around during the '90s, when the GUI was blowing up thanks to Windows displacing costly commercial Unix machines (Sun, SGI, HP, etc.). By 2000 people were saying Unix was dead and the GUI was the superior interface to a computer. Visual Basic was magic to a lot of people, and so many programs were GUI things even if they didn't need to be. Then the web happened and the tables turned.
Microsoft drank the early OOP Kool-Aid, and thus PowerShell suffered from problems that were well covered by that time, etc…
Ray Noorda being pushed out after WordPerfect bought Novell with their own money and leveraged local religious politics, in addition to typical corporate politics, killed it.
Intel convinced major UNIX companies to drop their CPUs for IA-64 which was never delivered, mainly because the core decision was incompatible with the fundamental limitations of computation etc…
The rise of Linux, VMs and ultimately the cloud all depended on the CLI.
Add in Microsoft's anticompetitive behavior plus everything else, and you ended up with a dominant GUI OS provider with a CLI that most developers found challenging to use.
I worked at some of the larger companies with large Windows server installations, and every one of them installed Cygwin to gain access to tools that allowed for maintainable configuration management.
There are situations like WordPerfect, which had GUI offerings delayed due to the same problems that still plague big projects today, but by the time the web appeared, Microsoft had used both brilliant and extremely unethical practices to gain market dominance.
The rise of technology that helped with graphics in the PC space, like VESA Local Bus and GPUs, which finally killed the remaining workstation vendors, was actually concurrent with the rise of the web.
Even with that, major companies like SGI mainly failed because they dedicated so many resources to low-end offerings that they lost their competitiveness on the high end, especially as they fell into Intel's trap with Itanium too.
But even that is complicated way beyond what I mentioned above.
Meanwhile John Carmack was using an IDE the whole time - Maybe he was just in a different realm.
I tend to agree with the trend of the parent's comment. The CLI came along with the horde, like the English language or JavaScript.
BSD/Mach gave us that, OSX just included it in their operating system.
Maybe in some circles.
You don't remember the period where Linux was considered a joke compared to NT or "real" unices? Maybe I was just around a lot of elitists.
With CLI and TUI tools it's keyboard first and the mouse might work if it wasn't too much of a hassle for the dev.
And another issue with GUI tooling is the lack of composability. With a CLI, I can feed files to one program, grab the output, and give it to another, and another, with ease.
With GUI tools I need to have three of them open at the same time and manually open each one. Or find a single tool that does all three things properly.
- Has anyone found Claude Code able to generate documentation for parts of the code which does not:
(a) Explode in maintenance time exponentially, while still helping Claude understand and iterate without falling over/hallucinating/designing poorly?
(b) Make code reviewers' lives easier? If so, how?
I think the key issue for me is the time the human takes to *verify*/*maintain* plans is not much less than what it might take them to come up with a plan that is detailed enough that many AI models could easily implement.
Especially on bootstrap/setup, AIs are fantastic for cutting out massive amounts of time, which is a huge boon for our profession. But core logic? I think that's where the not-really-saving-time studies are coming from.
I'm surprised there aren't faux academic B-school productivity studies coming out to counter that (sponsored by AI funding of course) already, but then again I don't read B-school journals.
I actually wonder if the halflife decay of the critical mass of vibecode will almost perfectly coincide with the crash/vroosh of labor leaving the profession to clean it up. It might be a mini-y2k event, without such a dramatic single day.
But anything you can do on the CLI, so can an agent. It’s the same thing as chefs preferring to work with sharp knives.
Yet highly preferred over CLI applications by the common end user.
CLI-only would have stunted the growth of computing.
Really, GUIs can be formed of a public API with graphics slapped on top. They usually aren't, but they can be.
- significantly less obsequious (very few "you're absolutely right" that Claude vomits out on every interaction)
- less likely to forget and ignore context and AGENTS.md instructions
- fewer random changes claiming "now everything is fixed" in the first 30-50% of context
- better understanding of usage rules (see link below), one-shotting quite a few things Claude misses
Language + framework: Elixir, Phoenix LiveView, Ash, Oban, Reactor
SLoC: 22k lines
AGENTS.md: some simple instructions, pointing to two MCPs (Tidewave and Stripe), requirement to compile before moving onto next file, usage rules https://hexdocs.pm/usage_rules/readme.html
Before the GPT-5 release it was a poor imitation IMO - in the macOS terminal it somehow even disabled copy and paste!
Codex today is almost unrecognizable in comparison to that version. It's really good. I use both it and Claude Code almost interchangeably at the moment and I'm not really feeling that one is notably better than the other.
Caveat: requires a Linux environment, OSX, or WSL.
In general, I find that it will write smarter code, perform smarter refactors, and introduce less chaos into my codebase.
I'm not talking about toy codebases. I use agents on large codebases with dozens of interconnected tools and projects. Claude can be a bit of a nightmare there because it's quite myopic. People rave about it, but I think that's because they're effectively vibe-coding vastly smaller, tight-scoped things like tools and small websites.
On a larger project, you need a model to take the care to see what existing patterns you're using in your code, whether something's already been done, etc. Claude tends to be very fast but generate redundant code or comical code (let's try this function 7 different ways so that one of those ways will pass). This is junior coder bullshit. GPT-5-Codex isn't perfect but there's far far less of that. It takes maybe 5x longer but generates something that I have more confidence in.
I also see Codex using tools more in smart ways. If it's refactoring, it'll often use tools to copy code rather than re-writing the code. Re-writing code is how so many bugs have been introduced by LLMs.
I've not played with Sonnet 4.5 yet so it may have improved things!
Then you check the result and see what happened. It's pretty good at one-shotting things if it gets the gist, but if it goes off the rails you can't go back three steps and redirect.
On the other hand Claude Code is more like pair programming, it's chatting about while doing things, telling you what it's doing and why "out loud". It's easier to interrupt it when you see it going off track, it'll just stop and ask for more instructions (unlike Copilot where if you don't want it to rm the database.file you need to be really fast and skip the operation AND hit the stop button below the chatbox).
I use both regularly, GPT is when I know what to do and have it typed out. Claude is for experimenting and dialogue like "what would be a good idea here?" type of stuff.
2. Those programs integrate with one another to achieve more complex tasks.
3. Text streams are the universal interface and state is represented as text files on disk.
Sounds like the UNIX philosophy is a great match for LLMs that use text streams as their interface. It's just so normalized that we don't even "see" it anymore. The fact that all your tools work on files, are trivially callable by other programs with a single text-based interface of exec(), and output text makes them usable and consumable by an LLM with nothing else needed. This didn't have to be how we built software.
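A trivial illustration of that single text-based interface: to a program (or an agent), every tool looks the same, exec it, feed it text, read text back. The specific commands below are just an example:

```python
# Any Unix tool is callable the same way: exec it, pass text in, get text out.
import subprocess

# Chain plain grep and wc with no special bindings: count lines mentioning TODO.
grep = subprocess.run(["grep", "-r", "TODO", "."], capture_output=True, text=True)
wc = subprocess.run(["wc", "-l"], input=grep.stdout, capture_output=True, text=True)
print(wc.stdout.strip(), "lines containing TODO")
```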
The fact that the AI interpreter will use small commands makes it very useful.
(As far as I’m aware our brains are opposite of UNIX, starting right from the fact they had evolved and were not designed at all. And the article is about Claude and not me.)
An LLM can do effectively anything that a human can do by typing commands into a shell now.
With LLM Agents you can :D
Information theoretic efficiency seems to be a theme of UNIX architecture: https://benoitessiambre.com/integration.html.
A trick I use often with this pattern is (for example): 'you can run shell commands. Use tmux to find my session named "bingo", and view the pane in there. you can also use tmux to send that pane keystrokes. when you run shell commands, please run them in that tmux pane so i can watch. Right now that pane is logged into my cisco router..."
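Under the hood this only needs a couple of real tmux subcommands (send-keys and capture-pane). Here is a sketch of what those shell calls might look like if wrapped in a script, with the "bingo" session name taken from the example prompt:

```python
# Sketch: drive the watched tmux pane the way the prompt above describes.
import subprocess
import time

TARGET = "bingo"  # session name from the example prompt

def run_in_pane(command: str) -> str:
    # Type the command into the pane and press Enter, so the human can watch.
    subprocess.run(["tmux", "send-keys", "-t", TARGET, command, "Enter"], check=True)
    time.sleep(2)  # crude wait for the command to finish
    # Read the visible pane contents back as plain text.
    captured = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", TARGET],
        check=True, capture_output=True, text=True,
    )
    return captured.stdout

print(run_in_pane("show ip interface brief"))
```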
It works wonders though, like spelunking your raw thoughts.
Obsidian has a plugin system that can be easily customized. You can run your own JS scripts from a local folder. Claude Code is excellent at creating and modifying them on the fly.
For example, I built a custom program that syncs Obsidian files with a publish flag to my GitHub repo, which triggers a Netlify build. My website updates when I update my vault and run a sync.
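A stripped-down sketch of what that kind of sync could look like; the vault path, repo path, and the `publish: true` frontmatter flag are assumptions for illustration, not the commenter's actual setup:

```python
# Hypothetical sync: copy vault notes marked "publish: true" into a local
# checkout of the GitHub repo and push, letting Netlify rebuild the site.
import shutil
import subprocess
from pathlib import Path

VAULT = Path.home() / "vault"               # assumed Obsidian vault location
SITE_REPO = Path.home() / "site" / "notes"  # assumed checkout of the repo

def is_published(note: Path) -> bool:
    # Crude frontmatter check: look for the publish flag near the top of the file.
    return "publish: true" in note.read_text(encoding="utf-8")[:500]

for note in VAULT.rglob("*.md"):
    if is_published(note):
        shutil.copy(note, SITE_REPO / note.name)

subprocess.run(["git", "-C", str(SITE_REPO), "add", "."], check=True)
# commit exits non-zero when nothing changed, so don't fail on it
subprocess.run(["git", "-C", str(SITE_REPO), "commit", "-m", "sync published notes"])
subprocess.run(["git", "-C", str(SITE_REPO), "push"], check=True)
```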
And I can't help but think: what would a cutting edge "CLI ninja" LLM like Claude be able to do if given access to a diagnostic interface that exposes all the logs and sensor readings, a list of known common issues and faults, and a full technical reference manual?
Cars have the somewhat standardized OBD ports that you could pry the necessary data out from, but industrial robots or vending machines or smartphones? They sure don't.
But what inspires this line of inquiry is exactly the kind of success I had just feeding random error logs to AI and letting it sift through them for clues. It doesn't always work, but it works just often enough to make me wonder about the broader use cases.
eadmund•4mo ago
If only I were retired and had infinite time!
gchamonlive•4mo ago
The average user like me wouldn't be able to run pipelines without serious infrastructure, but it's very important to understand how the data is used and how the models are trained, so that we own the model and can assess its biases openly.
scottyah•4mo ago
I'm not sure how everyone can have access to the data without necessitating another taking on the burden of providing it.
gchamonlive•4mo ago
I'm also not saying anyone should be forced to disclose training data. I'm only saying that an LLM that's only open-weight and not open data/pipeline barely fits the opensource model of the stack mentioned by OP.
tsimionescu•4mo ago
Now, running local models instead of using them as a SaaS has a clear purpose: the price of your local model won't suddenly increase ten fold once you start depending on it, like the SaaS models might. Any level of control beyond that is illusory with LLMs.
gchamonlive•4mo ago
It's fine for models to have open-weights and closed data. It's only barely fitting the opensource model IMHO though.
tsimionescu•4mo ago
An open weight model addresses the second part of THIS, but not the first. However, even an open weight model with all of the training data available doesn't fix the first problem. Even if you somehow got access to enough hardware to train your own GPT-5 based on the published data, you still couldn't meaningfully fix an issue you have with it, not even if you hired Ilya Sutskever and Yann LeCun to do it for you: these are black boxes that no one can actually understand at the level of a program or device.
guy_5676•4mo ago
I have also seen people train "jailbreaks" of popular open source LLMs (e.g. Google Gemma) that remove the condescending ethical guidelines and just let you talk to the thing normally.
So all in all I am skeptical of the claim that there would be no value in having access to the training data. Clearly there is some ability to steer the direction of the output these models produce.
fragmede•4mo ago
https://www.anthropic.com/news/golden-gate-claude
https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-a...
visarga•4mo ago
They probably can't give you the training set as it would amount to publication of infringing content. Where would you store it, and what would you do with it anyway?
buzzy_hacker•4mo ago
I have many programs I use that I wish were a little different, but even if they were open source, it would take a while to acquaint myself with the source code organization to make these changes. LLMs, on the other hand, are pretty good at small self-contained changes like tweaks or new minor features.
This makes it easier to modify open source programs, but also means that if a program isn't open source, I can't make these changes at all. Before, I wasn't going to make the change anyway, but now that I actually can, the ability to make changes (i.e. the program is open source) becomes much more important.