"Claw" captures what the existing terminology missed, these aren't agents with more tools (maybe even the opposite), they're persistent processes with scheduling and inter-agent communication that happen to use LLMs for reasoning.
Perfect is the enemy of good. Claw is good enough. And perhaps there is utility to neologisms being silly. It conveys that the namespace is vacant.
PHD in neural networks under Fei-Fei Li, founder of OpenAI, director of AI at Tesla, etc. He knows what he's talking about.
It's as irrelevant as George Foreman naming the grill.
What even happened to https://eurekalabs.ai/?
I'll live up to my username and be terribly brave with a silly rhetorical question: why are we hearing about him through Simon? Don't answer, remember. Rhetorical. All the way up and down.
- doesnt do its own sandboxing (I'll set that up myself)
- just has a web UI instead of wanting to use some weird proprietary messaging app as its interface?
You can sandbox anything yourself. Use a VM.
It has a web ui.
TBH maybe I should just vibe code my own...
An ai that you let loose on your email etc?
And we run it in a container and use a local llm for "safety" but it has access to all our data and the web?
Basically cron-for-agents.
Before we had to go prompt an agent to do something right now but this allows them to be async, with more of a YOLO-outlook on permissions to use your creds, and a more permissive SI.
Not rocket science, but interesting.
I still don't see a way this wouldn't end up with my bank balance being sent to somewhere I didn't want.
In any case, the data that will be provided to the agent must be considered compromised and/or having been leaked.
My 2 cents.
There might be similar safeguards for posting to external services, which might require direct confirmation or be performed by fresh subagents with sanitized, human-checked prompts and contexts.
One is that it relentlessly strives thoroughly to complete tasks without asking you to micromanage it.
The second is that it has personality.
The third is that it's artfully constructed so that it feels like it has infinite context.
The above may sound purely circumstantial and frivolous. But together it's the first agent that many people who usually avoid AI simply LOVE.
What OpenClaw did is to show the messages that this is in fact possible to do. IMHO nobody is using it yet for meaningful things, but the direction is right.
macOS is the only game in town if you want easy access to iMessage, Photos, Reminders, Notes, etc and while Macs are not cheap, the baseline Mac Mini is a great deal. A raspberry Pi is going to run you $100+ when all is said and done and a Mac Mini is $600. So let’s call it. $500 difference. A Mac Mini is infinitely more powerful than a Pi, can run more software, is more useful if you decide to repurpose it, has a higher resale value and is easier to resell, is just more familiar to more people, and it just looks way nicer.
So while iMessage access is very important, I don’t think it comes close to being the only reason, or “it”.
I’d also imagine that it might be easier to have an agent fake being a real person controlling a browser on a Mac verses any Linux-based platform.
Note: I don’t own a Mac Mini nor do I run any Claw-type software currently.
Next flood of (likely heavily YC-backed) Clawbase (Coinbase but for Claws) hosting startups incoming?
That does sound like the worst of both worlds: You get the dependency and data protection issues of a cloud solution, but you also have to maintain a home server to keep the agent running on?
Most of the time, users (or the author himself) submit this blog as the source, when in fact it is just content that ultimately just links to the original source for the goal of engagement. Unfortunately, this actually breaks two guidelines: "promotional spam" and "original sourcing".
From [0]
"Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity."
and
"Please submit the original source. If a post reports on something found on another site, submit the latter."
The moderators won't do anything because they are allowing it [1] only for this blog.
Regardless thanks for the tip
Just because something is popular doesn't make it bad.
and why would anyone down vote you for calling this out, like who wants to see more low effort traffic-grab posts like this?
Now check how many times he links to his blog in comments.
Actually, here, I'll do it for you: He has made 13209 comments in total, and 1422 of those contain a link to his blog[0]. An objectively ridiculous number, and anyone else would've likely been banned or at least told off for self-promotion long before reaching that number.
[0] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
But this isn't my site and I don't get to pick the rules.
HN really needs a way to block or hide posts from some users.
This is so relatable. I remember trying to set up an LLM gateway back in 2023. There were at least 3 different teams that blocked our rollout for months until they worked through their backlog. "We're blocking you, but you’ll have to chase and nag us for us to even consider unblocking you"
At the end of all that waiting, nothing changed. Each of those teams wrote a document saying they had a look and were presumably just happy to be involved somehow?
I do know the feeling you're talking about though, and probably a better balance is somewhere in the middle. Just wanted to add that the solution probably isn't "Let devs deploy their own services without review", just as the solution probably also isn't "Stop devs for 6 months to deploy services they need".
One of the lessons in that book is that the main reasons things in IT are slow isn't because tickets take a long time to complete, but that they spend a long time waiting in a queue. The busier a resource is, the longer the queue gets, eventually leading to ~2% of the ticket's time spent with somebody doing actual work on it. The rest is just the ticket waiting for somebody to get through the backlog, do their part and then push the rest into somebody else's backlog, which is just as long.
I'm surprised FAANGs don't have that part figured out yet.
The security concerns are valid, I can get anyone running one of these agents on their email inbox to dump a bunch of privileged information with a single email..
I am one of those people and I work at a FANG.
And while I know it seems annoying, these teams are overwhelmed with not only innovators but lawyers asking so many variations of the same question it's pretty hard to get back to the innovators with a thumbs up or guidance.
Also there is a real threat here. The "wiped my hard drive" story is annoying but it's a toy problem. An agent with database access exfiltrating customer PII to a model endpoint is a horrific outcome for impacted customers and everyone in the blast radius.
That's the kind of thing keeping us up at night, not blocking people for fun.
I'm actively trying to find a way we can unlock innovators to move quickly at scale, but it's a bit of a slow down to go fast moment. The goal isn't roadblocks, it's guardrails that let you move without the policy team being a bottleneck on every request.
Isn't the whole selling point of OpenClaw that you give it valuable (personal) data to work on, which would typically also be processed by 3rd party LLMs?
The security and privacy implications are massive. The only way to use it "safely" is by not giving it much of value. Maybe for some open source project where everything is public anyway, but then why does it need to take over the whole device?
Though with the recent layoffs and stuff, the security in Amazon was getting better. Even the best-practices for IAM policies that was the norm in 2018, is just getting enforced by 2025.
Since I had a background of infosec, it always confused me how normal it was to give/grant overly permissive policies to basically anything. Even opening ports to worldwide (0.0.0.0/0) had just been a significant issue in 2024, still, you can easily get away with by the time the scanner finds your host/policy/configuration...
Although nearly all AWS accounts managed by Conduit (internal AWS Account Creation and Management Service), the "magic-team" had many "account-containers" to make all these child/service accounts joining into a parent "organization-account". By the time I left, the "organization-account" had no restrictive policies set, it is up to the developers to secure their resources. (like S3 buckets & their policies)
So, I don't think the policy folks are overall wrong. In the best case scenario, they do not need to exist in the first place! As the enforcement should be done to ensure security. But that always has an exception somewhere in someone's workflow.
https://github.com/sipeed/picoclaw
another chinese coompany m5stack provides local LLMs like Qwen2.5-1.5B running on a local IoT device.
https://shop.m5stack.com/products/m5stack-llm-large-language...
Imagine the possibilities. Soon we will see claw-in-a-box for less than $50.
Completely safe and normal software engineering practice.
If an agent is curling untrusted data while holding access to sensitive data or already has sensitive data loaded into its context window, arbitrary code execution isn't a theoretical risk; it's an inevitability.
As recent research on context pollution has shown, stuffing the context window with monolithic system prompts and tool schemas actively degrades the model's baseline reasoning capabilities, making it exponentially more vulnerable to these exact exploits.
Among many more of them with similar results. This one gives a 39% drop in performance.
https://arxiv.org/abs/2506.18403
This one gives 60-80% after multiple turns.
Excluding the fact that you can run LLMs via ollama or similar directly on the device, but that will not have a very good token/s speed as far as I can guess...
bjackman•1h ago
fxj•44m ago