Ask HN: How do you safely give LLMs SSH/DB access?

85•nico•3w ago

I have been using Claude Code for DevOps style tasks like SSHing into servers, grepping logs, inspecting files, and querying databases

Overall it's been great. However, I find myself having to review every single command, a lot of which are repetitive. It still saves me a ton of time, but it's quickly becoming a bit tedious

I wish I could give the agent some more autonomy. Like giving it a list of pre-approved commands or actions that it is allowed to run over ssh

For example:

    OK: ls, grep, cat, tail
    Not OK: rm, mv, chmod, etc
    OK: SELECT queries
    Not OK: INSERT, DELETE, DROP, TRUNCATE

Has anyone successfully or satisfactorily solved this?

What setups have actually worked for you, and where do you draw the line between autonomy and risk?

Comments

stephendause•3w ago

There is an example of [dis]allowing certain bash commands here: https://code.claude.com/docs/en/settings

As for queries, you might be able to achieve the same thing with usage of command-line tools if it's a `sqlite` database (I am not sure about other SQL DBs). If you want even more control than the settings.json allows, you can use the claude code SDK.

nico•3w ago

Great pointers, thank you

How would you go about allowing something like `ssh user@server "ls somefolder/"` but disallowing `ssh user@server "rm"`?

Similarly, allow `ssh user@server "mysql \"SELECT...\""`, but block `ssh user@server "mysql \"[UPDATE|DELETE|DROP|TRUNCATE|INSERT]...\""` ?

Ideally in a way that it can provide more autonomy for the agent, so that I need to review fewer commands

stephendause•3w ago

I don't know; I've never done something like that. If no one else answers, you can always ask Claude itself (or another chatbot). This kind of thing seems tricky to get right, so be careful!

nico•3w ago

Yup definitely tricky. Unfortunately Claude sucks at answering questions about itself, I've usually had better luck with ChatGPT. Will see how it goes

onmai-xyz•3w ago

If you control the ssh server it can be configured to only allow what you want. Certainly tedious but I would consider it worth while as it stands with agents being well, agentic.

ktm5j•3w ago

Sounds like this might help: https://www.gnu.org/software/bash/manual/html_node/The-Restr...

I'm not familiar with rbash, but it seems like it can do (at least some of) what you want.

christophilus•3w ago

I run my agents in containers, and only put stuff in those containers that I'm happy obliterating.

nico•3w ago

Do you use Claude Code? Do you say "Yes, and don't ask again" for all the commands, since you don't mind breaking things inside the container?

NitpickLawyer•3w ago

> claude --dangerously-skip-permissions

But do not run this on prod servers! You cannot prompt your way into the agent not doing something stupid from time to time.

Also blacklisting commands doesn't work (they'll try different approaches until something works).

Terr_•3w ago

I imagine your best bet is exactly the same set of tools you'd use for a potentially-malicious human user: Separate user account, file permissions, database user permissions, etc.

nico•3w ago

This is probably the safest thing to do, also the most time consuming

It would be nice to just be able to solve it through instructions to the agent, instead of having to apply all the other things for each application/server/database that I'd like to give it access to

wrs•3w ago

That would be nice. If only the agent had the ability to limit itself to your instructions.

ljm•3w ago

Yeah but this is like exposing `sudo eval $input` as a web service and asking the clients to please, please, not do anything bad.

You can create scripts or use stuff like Nix, Terraform, Ansible or whatever to automate the provisioning of restricted read-only accounts for your servers and DBs.

cvhc•3w ago

The restrictions have to be enforced by deterministic, non-LLM control logic (in the OS/database/software, or the agent's control plane). You cannot just give verbal instructions and expect the LLM not to generate certain sequences of tokens.

What I imagine is you might instruct an agent to help you set up the restrictions for various systems to reduce the toil. But you should still review what the agent is going to do and make sure nothing stupid is done (like: using regexes to filter out restricted commands).
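
To make the regex point concrete, here is a minimal Python sketch. The denylist pattern and the sample commands are made up for illustration; the only claim is that a filter keyed on command names passes anything it did not anticipate.

```python
import re

# Hypothetical denylist of the kind the thread warns against.
DENYLIST = re.compile(r"\b(rm|mv|chmod)\b")

def naive_is_allowed(command: str) -> bool:
    """Return True when no denylisted command name appears in the string."""
    return DENYLIST.search(command) is None

# Destructive commands that never mention rm, mv, or chmod as a word.
bypasses = [
    "find /var/log -type f -delete",
    "cat /dev/null > /etc/important.conf",
    "python3 -c 'import shutil; shutil.rmtree(\"/srv/data\")'",
]

for cmd in bypasses:
    # Each of these slips past the filter.
    print(naive_is_allowed(cmd), cmd)
```

An agent that retries with different approaches will find strings like these quickly, which is why the enforcement has to live in permissions rather than in pattern matching.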

maxbond•3w ago

That's equivalent to client-side security.

ahepp•3w ago

Shouldn't you already be using low privilege accounts for stuff like gathering information about prod?

Overprivileged accounts are a huge anti-pattern for humans too. People make mistakes. Insider threats happen. Part of ops is making it so users don't have privileges to do damage without appropriate authorization.

simonw•3w ago

For database stuff most databases like PostgreSQL have robust permissions mechanisms built in.

No need to mess around with regular expressions against SQL queries when you can instead give the agent a PostgreSQL user account that's only allowed read access to specific tables.
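
A rough sketch of what that account setup could look like, written as a small Python helper that only builds the SQL text. The role, database, schema, and table names are placeholders, and the helper itself is hypothetical; the statements are ordinary PostgreSQL `CREATE ROLE` / `GRANT` / `ALTER ROLE` syntax. It does not set a password or revoke anything already granted to PUBLIC, so treat it as a starting point to review, not a complete hardening script.

```python
from typing import List

def readonly_role_sql(role: str, database: str, schema: str, tables: List[str]) -> List[str]:
    """Build statements for a login role that can only SELECT from `tables`."""

    def ident(name: str) -> str:
        # Double-quote identifiers so unusual names cannot break the statement.
        return '"' + name.replace('"', '""') + '"'

    stmts = [
        f"CREATE ROLE {ident(role)} LOGIN",
        f"GRANT CONNECT ON DATABASE {ident(database)} TO {ident(role)}",
        f"GRANT USAGE ON SCHEMA {ident(schema)} TO {ident(role)}",
    ]
    for table in tables:
        stmts.append(
            f"GRANT SELECT ON TABLE {ident(schema)}.{ident(table)} TO {ident(role)}"
        )
    # Convenience default only: the role can change this setting itself,
    # so the GRANTs above are the actual boundary.
    stmts.append(f"ALTER ROLE {ident(role)} SET default_transaction_read_only = on")
    return stmts

for statement in readonly_role_sql("agent_ro", "app", "public", ["orders", "customers"]):
    print(statement + ";")
```

Printing the statements instead of executing them keeps a human in the loop: the output can be reviewed and run by an admin.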

nico•3w ago

You are right, and that's great for queries

How do you provide db access? For example, to access an RDS db, you have to connect from within the AWS/EC2 environment, which means either providing the agent ssh access to a server, from which it can run psql, or creating a tunnel

Additionally, with multiple apps/dbs, that means having to do the setup multiple times. It would be nice to be able to only configure the agent instead of all the apps/dbs/servers

tracker1•3w ago

You can't provide an existing ssh tunnel with a port for said database yourself, locally?

browningstreet•3w ago

"aws iam service accounts"

gunalx•3w ago

Never give perms to begin with. Anything the chatbot has access to fuck up, it eventually will. So the problem is inherently flawed, but:

Use db permissions with read only, and possibly only a set of prepared statements. Give it a user account with read-only access, maybe.

JoshTriplett•3w ago

Don't.

Among the many other reasons why you shouldn't do this, there are regularly reported cases of AIs working around these types of restrictions using the tools they have to substitute for the tools they don't.

Don't be the next headline about AI deleting your database.

nico•3w ago

> Don't

Do you mean "Don't give it more autonomy", or "Don't use it to access servers/dbs" ?

I definitely want to be cautious, but I don't think I can go back to doing everything manually either

JoshTriplett•3w ago

I mean, both, but in this case I'm saying "don't use it to access any kind of production resource", with a side order of "don't rely on simple sandboxing (e.g. command patterns) to prevent things like database deletions".

dsr_•3w ago

Why aren't you using the tools we already have: ansible, salt, chef, puppet, bcfg2, cfengine... every one of which was designed to do systems administration at scale.

dpoloncsak•3w ago

"Why would you use a new tool when other tools already exist?".

Agents are here. Maybe a fad, maybe a mainstay. Doesn't hurt to play around with them and understand where you can (and can't) use them

dsr_•3w ago

Play and production need to be separate domains. Otherwise, you don't have production, you only have play.

dpoloncsak•3w ago

Okay...? Agreed. I still don't think the answer to "How are you guys giving LLMs access to your DBs?" is "Don't".

Nowhere did OP or any of the comments in the chain specify they were testing Claude in production.

bigstrat2003•3w ago

You have to choose between laziness or having systems that the LLM can't screw up. You can't have both.

hephaes7us•3w ago

You can have it write code that you review (with whatever level of caution you wish) and then run that on real data/infrastructure.

You get a lot of leverage that way, and it's still safer than letting AI use your keys and act with full autonomy on stuff of consequence.

ninju•3w ago

https://www.pcmag.com/news/vibe-coding-fiasco-replite-ai-age...

codingdave•3w ago

You need to secure the account an LLM-based app runs under, just like you would any user, AI or not. When you hire real people, do you grant them full privileges on all systems and just ask them not to touch things they shouldn't? No, you secure their accounts to the specific privileges they need, and no more. Do the same with AI.

icedchai•3w ago

You'd be surprised. I've worked at multiple startups where employees were given prod access with zero oversight on day one: AWS, sudo access, database passwords, everything. The one startup that didn't do that never launched. Occasionally there were accidents: wrong branch deployed, bulk updates to DNS taking down most of the site, etc.

codingdave•3w ago

Sure, so draw a different line - not all devs have access to withdraw cash from the corporate accounts, or to open the email of the CEO and board, etc. There are always lines of privilege drawn. The point isn't to quibble over where they are drawn, it is to point out that you need to do the same for LLMs. Don't trust them to behave. Enforce limits on their privileges.

PaulHoule•3w ago

See https://simonwillison.net/2025/Feb/3/a-computer-can-never-be...

I'll set it loose on a development or staging system but wouldn't let it around a production system.

Don't forget your backups. There was that time I was doing an upgrade of the library management system at my Uni and I was sitting at the sysadmin's computer and did a DROP DATABASE against the wrong db which instantly brought down the production system -- she took down a binder from the shelf behind me that had the restore procedures written down and we had it back up in 30 seconds!

dormento•3w ago

> Safely

You cannot. The best you can ever hope for is creating VM environments, and even then it's going to surprise you sometimes. See https://gtfobins.github.io/.

vc289•3w ago

Not true for the db layer :)

Look into copy on write branching. We built this natively into our AI Data Engineer (https://tryardent.com) so it could make modifications to databases with 0 blast radius pretty much because yes it's impossible to make an LLM 100% safe if it has no proper guard rails preventing destructive actions

Curzel•3w ago

For the db, just give it credentials of a read-only user. For instructions, you can set up a list of approved tools and bash commands: https://www.anthropic.com/engineering/claude-code-best-pract...

fhub•3w ago

Do you let it consume PII? Anything related to authentication?

ziml77•3w ago

Not everyone is handling PII. Where I work, anything like that is only available to a very limited set of people who absolutely need to be able to see it. Also some systems allow access control at the column and even row level, so even if it's intermingled with other data you want the LLM to read, you might be able to mask it that way.

Also, people shouldn't be running any LLM on data of a business without a proper contract in place like you have with any vendor who has access to your data. And if there's specific PII requirements, those should be covered too.

vindex10•3w ago

for files, possibly sshfs / fuse with readonly mount

https://stackoverflow.com/questions/35830509/sshfs-linux-how...

al_borland•3w ago

You could set up permissions on the user Claude is using so that it can only run those commands. But that may be easier said than done, depending on the size of your environment and the management tools you have.

camboo•3w ago

Tl;dr you don’t give your llm ssh access. You give it tools that have individual access to particular executions.

—-

Yes, easily. This isn’t a problem when using a proxy system with built in safeguards and guardrails.

‘An interface for your agents.’

Or, simply, if you have a list of available tools the agent has access to.

Tool not present? Will never execute.

Tool present? Will reason when to use it based on tool instructions.

It’s exceptionally easy to create an agent with access to limited tools.

Lots of advice in this thread, did we forget that in the age of AI, anything is possible?

Have you taken a look at tools such as Xano?

Your agent will only execute whichever tool you give it access to. Chain of command is factored in.

This is akin to architecting for the Rule of Two, and similarly is the concept of Domain Trusts (fancy way of saying scopes and permissions).

drewgregory•3w ago

I am very passionate about this question - so much so that I happened to make a blog post about it yesterday!

I recommend giving LLMs credentials that are extremely fine-grained, where the credentials can only permit the actions you want to allow and not permit the actions you don't want to allow.

Often, it may be hard or impossible to do this with your database settings alone - in that case, you can use proxies to separate the credentials the LLM/agent has from the credentials that are actually made to the DB. The proxy can then enforce what you want to allow or block.

SSH is trickier because commands are mixed in with all the other data going on in the bytestream during your session. I previously wrote another blog post about just how tricky enforcing command allowlists can be as well: https://www.joinformal.com/blog/allowlisting-some-bash-comma.... A lot of developer CLI tools were not designed to be run by potentially malicious users who can add arbitrary flags!

I also have really appreciated simonw's writing on the topic.

Disclaimer: I work at Formal, a company that helps organizations use proxies for least privilege.

SOLAR_FIELDS•3w ago

Your post can be succinctly formalized as “there should always be a deterministic validation layer sitting between the agent and anything sensitive it could do”

mikestorrent•3w ago

Is true for interns, should be true for LLMs. There should simply be no way for it to get keys for prod.

DennisAleynikov•3w ago

Thanks for making this blog post, very informative!

I've found as well that while you can run agents with a lot of tools and set them free autonomously, by default they tend not to be prompted well enough to avoid getting enormously stuck and doing really dumb things along the way.

Never open Pandora's box without understanding the implications. The principle of least privilege and trust now apply at every layer of the equation.

fhub•3w ago

Our solve is to allow it to work with a local dev database, and its output is a script. Then that script gets checked into version control (auditable and reviewed). Then that script can be run against production. Slower iteration but worth the tradeoff for us.

Giving LLM even read access to PII is a big "no" in my book.

On PII, if you need LLMs to work on production extracted data then https://github.com/microsoft/presidio is a pretty good tool to redact PII. Still needs a bit of an audit but as a first pass does a terrific job.

Volundr•3w ago

This. Everything your LLM reads from your database, server, whatever is being sent to your LLM provider. Unless your LLM is local running on your own systems, it shouldn't be given ANY access to production data without vetting it through legal with an eye to your privacy policy and compliance requirements.

hephaes7us•3w ago

Agreed - I run an entire second dev environment for LLMs.

Claude code runs in a container, and I just connect that container to the right network.

It's nice to be able to keep mid-task state in that environment without stepping on my own toes. It's easy to control what data is accessible in there, even if I have to work with real data in my dev environment.

maxkfranz•3w ago

The script method is great, and it's generalisable to things outside of DB access.

E.g. I used this method when I wanted to carry out a large (almost every source file) refactoring of Cytoscape.js. I fed the LLM a bunch of examples, and I told it to write a script to carry out the refactoring (largely using regex). I reviewed the script, ran the script, and then the code base was refactored.

At the time, agents were not capable enough of doing large-scale refactors directly, as far as I was aware. And the script was probably much faster, anyway.

cortesoft•3w ago

For DB access, use an account with the correct access level you want to grant.

For SSH, you can use a specific account created for the AI and limit its access to what you want it to do, although that is a bit trickier than DB limits. You can also use something like ForceCommand in SSHD config (or command= in your authorized_keys file) to only grant access to a single command (which could be a wrapper around the commands you want it to be able to access).

This does somewhat limit the flexibility of what the AI can deal with.
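
A minimal sketch of such a wrapper in Python, assuming it is installed as the forced command for the agent's key or account. sshd exposes the command the client asked for in the `SSH_ORIGINAL_COMMAND` environment variable; the allowlist contents and binary paths here are placeholders.

```python
import os
import shlex
import sys
from typing import List, Optional

# Hypothetical allowlist: program name -> absolute path to execute.
ALLOWED = {
    "ls": "/bin/ls",
    "grep": "/bin/grep",
    "tail": "/usr/bin/tail",
}

def parse_request(original: str) -> Optional[List[str]]:
    """Return an argv to execute, or None if the request is rejected."""
    try:
        argv = shlex.split(original)
    except ValueError:
        return None  # e.g. unbalanced quotes
    if not argv or argv[0] not in ALLOWED:
        return None
    return [ALLOWED[argv[0]], *argv[1:]]

def main() -> None:
    argv = parse_request(os.environ.get("SSH_ORIGINAL_COMMAND", ""))
    if argv is None:
        print("command not permitted", file=sys.stderr)
        sys.exit(1)
    # No shell is involved, so ';', '|', '>' and '$(...)' reach the program
    # as plain arguments instead of being interpreted.
    os.execv(argv[0], argv)
```

The script would call `main()` when run by sshd. Note that this only filters program names: arguments are passed through untouched, so the account still needs file permissions that make every allowed program harmless, whatever flags it is given.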

My actual suggestion is to change the model you are using to control your servers. Ideally, you shouldn't be SSHing to servers to do things; you should be controlling your servers via some automation system, and you can just have your AI modify the automation system. You can then verify the changes it is making before committing the changes to your control system. Logs should be collected in a place that can be queried without giving access to the system (Claude is great at creating queries in something like ElasticSearch or OpenSearch).

QuadmasterXLII•3w ago

Tell claude that you have to manually review every single command, and this is very expensive. It will pivot to techniques that achieve tasks with many fewer commands / lines of code. Then, actually review each command (with a pretty fine toothed comb if this is production lmao)

einpoklum•3w ago

This is not possible, because systems like "Claude Code" are inherently and fundamentally insecure. Only for models which are open source and with some serious auditing, does the possibility of security even appear.

Also, about those specific commands:

* `cat` can overwrite files.

* `SELECT INTO` writes new data.

rcarmo•3w ago

I build MCP servers that limit the LLM to specific commands.

hiccuphippo•3w ago

Give them a read-only account.

jrflowers•3w ago

Only give LLMs SSH access to a machine that you wouldn’t mind getting randomly thrown into the ocean at any moment. Easy peasy

e12e•3w ago

For ssh/shell - set up a regular user, and add capabilities via group membership and/or doas (or sudo).

You want to limit access to files (eg: regular user can't read /etc/shadow or write to /bin/doas or /bin/sh) - and maybe limit some commands (/bin/su).

zachmu•3w ago

We build DoltDB, which is a version-controlled SQL database. Recently we've been working with customers doing exactly this, giving an AI agent access to their database. You give the agent its own branch / clone of the prod DB to work on, then merge their changes back to main after review if everything looks good. This requires running Dolt / Doltgres as your database server instead of MySQL / Postgres, of course. But it's free and open source, give it a shot.

https://github.com/dolthub/dolt

cadamsdotcom•3w ago

Appropriate fine grained permissions, or a readonly copy.

This is nothing new; it’s the logical thing for any use case which doesn’t need to write.

If there is data to write, convert it to a script and put it through code review, make sure you have a rollback plan, then either get a human or non-AI automation tooling to run it while under supervision/monitoring.

Again nothing new, it’s a sensible way to do any one-off data modification.

fhub•3w ago

What is new to me is that people let LLMs consume PII and potentially authentication related data. This, frankly, is scary to me.

TZubiri•3w ago

in posix compatible systems (linux)

    adduser llm
    su llm

There you go. Now you can run commands quite safely. Add or remove permissions with chmod chown and chgrp as needed.

If you need more sophisticated controls try extensions like acl or selinux.

In Windows, use its built-in user, role, and file permission system.

Nothing new here, we have been treating programs as users for decades now.

frio•3w ago

I do wonder if LLMs will see tools like immudb (https://immudb.io/) or Datomic (https://www.datomic.com/) receive a bit more attention. The capacity to easily rollback the state to a previous immutably preserved state has always seemed like a fantastic addition to databases to me, but in the era of LLMs, even more important.

arjie•3w ago

For the database, I use a read-only user. I also give it full R/W to a staging DB and the local dev DB. Even if it egresses that, nothing can happen.

SSH I just let it roll because it's my personal stuff. Both Claude and Codex will perform unholy modifications to your environment so I do the one bare thing of making `sudo` password-protected.

For the production stuff I use, you can create an appropriate read-only role. I occasionally let it use my role but it inevitably decides to live-create resources like `kubectl create pod << YAML` which I never want. It's fine because they'll still try and fail and prompt me.

fhub•3w ago

Are you comfortable giving LLM read access to fields that have PII? Anything related to authentication? Is it allow-list of access or a deny-list?

arjie•3w ago

I am comfortable with that in dev/staging DB (it's my own PII which I don't mind). I use separate secrets for staging vs. prod so I don't mind giving full bore access to staging.

For prod DB read-only I just add tables/columns as they become relevant (so it's allowlist). Claude usually sequences table schema and stuff from staging DB / local migrations and then reads prod DB. When it fails access to something I decide if I want to give it or not. It eventually reaches a stage where I'm comfortable with always starting my day with `claude --dangerously-skip-permissions --continue`.

The prod DB read/write creds are in the company 1Password, for which I don't have the app installed (I rarely need company creds). The LLM maybe could figure out some way to get into my Bitwarden, which I do routinely use, but short of it creating and running a keylogger I think it's fine.

It's mildly annoying you have to periodically `GRANT SELECT` but now I'm much more careful organizing the schema in an LLM-friendly way. Postgresql can do column-security and I'm forced to use that sometimes but I refactored design to just be table-level.

throwaway140126•3w ago

I just want to share my thoughts about this topic:

Personally I think the right approach is to treat the llm like a user.

So if we pretend that you would like to grant a user access to your database then a reasonable approach would be to write a parser (parsing > validating) to parse the sql commands.

You should define the parser such that it only uses a subset of sql which you consider to be safe.

Now if your parser is able to parse the command of the llm (and therefore the command is part of the subset of sql which you consider to be safe) then you execute the command.
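
A related technique, swapped in here because it needs no hand-written parser: let the database engine's own parser make the call. This is a sketch using SQLite's authorizer hook from Python's standard `sqlite3` module, which is consulted while each statement is compiled. The table and data are invented for the demo, and other databases would need their own mechanism (roles and grants, typically).

```python
import sqlite3

# Operations a read-only query needs; every other action is denied.
READ_ONLY_ACTIONS = {
    sqlite3.SQLITE_SELECT,
    sqlite3.SQLITE_READ,
    sqlite3.SQLITE_FUNCTION,
}

def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite for each operation while it compiles a statement."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY

def run(conn: sqlite3.Connection, sql: str):
    """Return rows, or None when SQLite refuses or fails the statement."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError:
        return None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")
conn.commit()

# From here on, only reads compile successfully on this connection.
conn.set_authorizer(read_only_authorizer)

print(run(conn, "SELECT name FROM users"))
print(run(conn, "DELETE FROM users"))
```

Because the decision is made on the parsed statement rather than on the text, comment tricks and odd casing do not change the outcome.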

bigstrat2003•3w ago

You don't.

singleshot_•3w ago

> OK: ls, grep, cat, tail

cat /dev/random > /dev/sda

Uh oh…

ziml77•3w ago

And of course if it has access to run the code that it's developing, it can also do anything it wants because it can just add code that performs the operations it is trying.

tobyhinloopen•3w ago

A great start is to have LLMs use special UNIX users that can’t do anything except what you allowed them to do, including accessing the database with a read-only user.

smashah•3w ago

I think this is a good opportunity for a tool like warpgate. It has an API to create unique ssh sessions for one time use.

I've just rolled an instance but it's quite powerful in terms of control. I imagine it would be fairly simple to implement an MCP user group which is barred from using some commands. If a barred command is run the session disconnects.

jedberg•3w ago

Use tool calling. Create a simple tool that can do the calls that are allowed/the queries that are allowed. Then teach the LLM what the tools can do. Allow it to call the tool without human input.

Then it will only stop when it wants to do something the tool can't do. You can then either add that capability to the tool, or allow that one time action.

rukuu001•3w ago

This is the answer, and this strategy can be used on lots of otherwise unsafe activities - put a tool between the LLM and the service you want to use, and bake the guardrails into the tool (or make them configurable)

cryptonector•3w ago

Well, be careful. You might think that a restricted shell is the answer, but restricted shells are still too difficult to constrain. But if you over-constrain the tools then the LLMs won't be that useful. Whatever middle ground you find may well have injection vulnerabilities if you're not careful.

j45•3w ago

Asking non-deterministic software to magically behave like deterministic software in certain cases is the thing to reflect on.

If we want it to be 100% safe, you probably don't ever do it with non-deterministic layers alone.

- Creating tools and tool calling helps

- Claude code specifically asks permissions to run certain commands in certain folders and keeps a list of that. Chances are that is an actual hard filter locally when the llm recommends a command.

This would be creating a deterministic layer to keep the non-deterministic layer honest. This is mandatory because ai models don't return the same level of smarts and intelligence all the time.

- Another step that can help is layering the incoming request and the command sent to the CLI between more layers and checks and no direct links to dilute any prompt injection, etc.

benreesman•3w ago

firecracker vm: https://gist.github.com/b7r6/26b3e5c48a00d903ef617f1b073eb98...

helsinki•3w ago

Jump host with restricted commands / access. Agents SSH into a jump host and execute what they are allowed to execute.

nl•3w ago

I wrote my own agent where everything happens over SSH.

The shell is SSH, the read_file and write_file tool calls are over SSH

Then I give it a disposable VM and let it go.

There are lots of other solutions, but it's an interesting problem to work on.

almosthere•3w ago

have it only write python code and run it, disallow it to ever delete or update data in a database.

lifetimerubyist•3w ago

You run the agent in a rootless container, all files are mounted via read-only filesystem mounts and you give the database user only select privileges.

You secure your LLM the same way you’d secure any other user on your system.

Rover222•3w ago

It’s scary enough giving access just to my local database. Claude has found inventive ways to twice wipe out my tables this week, despite Claude.md instructions to the contrary.

(Of course I’m also to blame)

vc289•3w ago

If you're on postgres happy to have you try what we built at Ardent (https://tryardent.com). Our agent makes instant copies of your db for the agent to operate on so there's 0 risk for your db to ever get wiped.

email me -> vikram@tryardent.com

We're building support for snowflake too if that's something you use

csweichel•3w ago

You run the agent in a tightly controlled remote environment / VM designed for this use-case (at least the SSH/command piece).

Ona (https://ona.com) is a great choice.

(full disclosure: Ona co-founder here)

raw_anon_1111•3w ago

This is absolutely the worst idea possible. The answer is that you don’t. You create a database user that has read only rights and you allow Claude to use that user.

You could do the same for your SSH user.

I’m assuming your database doesn’t have PII. If it does, even that would be out of the question unless you gave the database user access only to certain tables.

Now that I think about it, that’s not even a good idea since a badly written select statement can cause performance issues.

waste_monk•3w ago

I have mostly stopped reading AI related posts here, because every time I see something like what the OP is doing it gives me the horrors.

konglonger•3w ago

No one I work with has ever been alive and working on a public site where there was a real risk of SQL injection, and they think I am just overly concerned with it.

I’ve given up. Let them get burned.

reactordev•3w ago

This. On a read-replica.

Any updates or writes go through a tool that sanity checks everything.

My rm tool (dangerous!) meticulously parses the input and pattern matches to prevent deleting essential files. It also prevents rm from being called outside the project directory.

You can’t trust the agents to do the right thing the first time, you steer them with error messages and gates that allow them only one path.
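
The "outside the project directory" check is the part that is easy to get wrong with string matching. A minimal sketch of one way to do it in Python, assuming a hypothetical `resolve_inside` helper that a delete tool would call before touching anything; the paths are placeholders.

```python
from pathlib import Path
from typing import Optional

def resolve_inside(project_root: str, requested: str) -> Optional[Path]:
    """Return the resolved path if it lies strictly inside project_root, else None."""
    root = Path(project_root).resolve()
    # resolve() collapses ".." and follows symlinks before the comparison,
    # so "build/../../etc/passwd" is judged by where it really points.
    target = (root / requested).resolve()
    if root in target.parents:
        return target
    return None

for candidate in ["build/out.o", "../other/file", "/etc/passwd", "."]:
    print(candidate, "->", resolve_inside("/srv/project", candidate))
```

Returning `None` (or an error message) rather than raising lets the tool hand the agent a clear refusal. The check and the later delete are not atomic, so this narrows mistakes rather than replacing file permissions.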

sargstuff•3w ago

Use sql to create table views & only populate with data llm should have access to.

For 'command line' stuff: if it's just shell text (i.e. a-z, A-Z, 0-9), a crude way would be to have a program sit between inbound ssh and the database. You would need to determine how to send back an error notice if something is not allowed, i.e. is in the "not OK" set (rm, mv, chmod, etc.). You may also need to break up single-line grouped commands rather than using end of line as the marker, since multiple shell commands can be sent per line, e.g. echo "example"; ls *; etc.

awk/gawk works nicely in this role. see awk filtering standard input concept -- demo concept[0]. Perhaps use ncat[4] instead of 'pipe'.

Perhaps make default shell rsh[5] used in sshfs[6] setup and set up rsh restrictions.

More technical, would make use of ebpf -- demo concept [1]. This would be able to handle non-ascii input.

Total overkill would be making use of kernel capabilities or pseudo-kernel capabilities via ptrace related things[2].

humor ip : Should the TV program Stargate's security door covering the portal have been called 'ncat' or '/dev/null'?

-----------------------

[0] : awk/gawk : https://www.tecmint.com/read-awk-input-from-stdin-in-linux/

[1] : ebpf : https://medium.com/@yunwei356/ebpf-tutorial-by-example-4-cap...

[2] : ptrace : https://events.linuxfoundation.org/wp-content/uploads/2022/1...

[4] : ncat : https://nc110.sourceforge.io/

[5] : rsh : https://www.gnu.org/software/bash/manual/html_node/The-Restr...

[6] : https://stackoverflow.com/questions/35830509/sshfs-linux-how...

vc289•3w ago

We solved this exact thing for the database layer (postgres for now) with https://tryardent.com

You can't trust any agent to be perfect with a real db so unless you find an infra level way to isolate it, you can't get rid of the problem

So we built a system that creates copy on write copies of your DB and allocates a copy for each agent run. This means a completely isolated copy of your DB with all your data that loads in under a second but zero blast radius risk to your actual system for the agent to operate on. When you're okay with the changes we have a "quick apply" to replay those changes onto your real db

Website is a little behind since we just launched our db sandboxing feature to existing customers and are making it public next week :)

If you want to try it email me -> vikram@tryardent.com

vc289•3w ago

Also, lots of people here have said to give it fine grained, read only access. This works if you want a copilot experience but doesn't allow you to fully let the agent do write-style things like model data or anything else. COW branching removes that restriction

tudorg•3w ago

Others have mentioned similar solutions but I’d like to add one: a database solution with CoW branching and PII anonymisation solves the db part in a safe way.

Disclaimer: I work at Xata.io, which provides these features. We have a recent blog post with a demo of this: https://xata.io/blog/database-branching-for-ai-coding-agents

ComputerGuru•3w ago

You don’t. You write an api that exposes the bare minimum and let it use it.

kachapopopow•3w ago

read only user? seems trivial? but the AI agent can just use the app to execute writes then it's a no-win situation. Give it a database that is a copy of production data instead - problem solved.
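
A toy version of the "give it a copy" idea, using SQLite's backup API from Python's standard library. This is a plain full copy into memory, not copy-on-write branching, and the `orders` table is invented; it only illustrates that writes against the copy never reach the original.

```python
import sqlite3

def sandbox_copy(source: sqlite3.Connection) -> sqlite3.Connection:
    """Return an in-memory copy of `source` for the agent to work on."""
    scratch = sqlite3.connect(":memory:")
    source.backup(scratch)  # copies every page of the source database
    return scratch

prod = sqlite3.connect(":memory:")  # stands in for the real database
prod.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
prod.execute("INSERT INTO orders (total) VALUES (19.99)")
prod.commit()

agent_db = sandbox_copy(prod)
agent_db.execute("DELETE FROM orders")  # the agent does its worst
agent_db.commit()

print(prod.execute("SELECT count(*) FROM orders").fetchone())
print(agent_db.execute("SELECT count(*) FROM orders").fetchone())
```

The copy still contains whatever the original contained, so any PII concerns raised elsewhere in the thread apply to it as well.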

data-ottawa•3w ago

I’ve been working on this.

MCP to ensure whoever is using the agent is authorized. Then I do sql cleaning and rewriting plus validation to ensure only validated query structures and no DDL/DML.

Then when the query is written I apply limits for budget (generally large reads).

Finally, the MCP uses a token with restricted access to a whitelist of tables, with either row level security enabled or table valued functions to apply additional constraints.

I make sure to hide all the sql statements that allow the agent to read table metadata and such.

And then it also needs to be approved by the user in the client.

I don’t think you can do this at scale for many users or low trust users, so they get read only parquet extracts with duckdb.
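For readers who want to experiment with the table-whitelist and no-DDL/DML idea, here is a rough sketch using SQLite's authorizer hook from the Python standard library. It is not the setup described above (which targets a warehouse behind an MCP server); the table names and the helpers `build_demo_connection` and `run_guarded` are invented for illustration:

```python
import sqlite3

# Hypothetical allowlist; the table names are invented for this example.
ALLOWED_TABLES = {"orders"}

def authorizer(action, arg1, arg2, db_name, source):
    """Allow SELECT statements and column reads on allowlisted tables only."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    # For SQLITE_READ, arg1 is the table name and arg2 the column name.
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK
    # Everything else (writes, DDL, PRAGMA, SQL functions, ...) is denied.
    return sqlite3.SQLITE_DENY

def build_demo_connection():
    """Create a small in-memory database, then lock it down."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
    conn.execute("INSERT INTO orders VALUES (1, 9.5)")
    conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")
    conn.commit()
    conn.set_authorizer(authorizer)  # install the guard only after setup
    return conn

def run_guarded(conn, sql):
    """Run one statement; any SQLite error (including a denial) becomes PermissionError."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        raise PermissionError(f"rejected: {sql!r}") from exc
```

With this in place, `SELECT id, total FROM orders` goes through, while reads from `users` and any `DELETE` or `DROP` are rejected when the statement is compiled, before it touches data. The check happens inside the database engine rather than by pattern-matching the SQL text.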

nico•3w ago

Are you doing that as your own personal tooling? Are you open sourcing it? Would be happy to take a look and maybe contribute as well

data-ottawa•3w ago

I am doing this already for an internal tool that accesses a bigquery data warehouse. Long term this will be a feature my company sells.

I will not open source it since it is a paid feature, but I use sqlglot for most of the query parsing, validating and rewriting.

journal•3w ago

The same way you walk a dog down a street: you put a leash on that puppy. If anything goes wrong, you are responsible until we can punish LLMs. Imagine asking: how do I give myself safe access to the database? You see how ridiculous that sounds? Unless you are reading/writing every token of input and response, everything else is like playing the lottery. You have to understand there is a chance it will purposefully select the wrong token and generate an inappropriate response, which it will act on and delete your database and C folder, because it felt tired for some reason after an earlier conversation where you tried to get it to pretend and you didn't clear its memory, or because you're missing the context it needs to generate a good token to begin with. How do I safely swallow water?

habeanf•3w ago

Shameless plug:

At baseshift.com we're building a solution to this. We generate isolated clones of production databases and expose operational control of clones via MCP (start/stop/reset). This provides agent autonomy for development and analysis workloads without risking production.

We support PG, MySQL, MariaDB, and MongoDB (more coming). We're currently in private beta but we're happy to onboard fellow HNers!

abyesilyurt•3w ago

I use row level security in postgres. Then you can set read only permissions.

chrisrickard•3w ago

I did this recently. Created a Skill that had access to executing a very specific (reviewed) script for DB interaction, which connects to a replica/anonymised DB, read-only user, via VPN, via a jumpbox.

jackfranklyn•3w ago

The whitelist approach works until it doesn't. The tricky part is that even "safe" commands can be dangerous in combination or with certain arguments. Think `cat /etc/shadow`, `grep -r password`, or `tail -f` on the wrong log file.

What's worked better for me: giving the agent access to a read-only replica for DB queries, and for SSH, using a restricted shell (rbash) with PATH limited to specific binaries. Still not bulletproof, but removes the "approve every ls command" friction while keeping the obvious footguns out of reach.

The mental model shift that helped: treat it less like "allow/deny lists" and more like designing a sandbox where the worst outcome is acceptable. If the agent can only read and the worst case is it reads something sensitive - that's a different risk profile than if it can write or delete.

crosslayer•3w ago

A lot of these answers are still treating this as a permissions problem.

The deeper issue is that once an agent is allowed to express intent directly against a live system, you’re already inside the blast radius… no amount of allowlists fully fixes that.

The safer pattern is to separate reasoning from execution entirely: the agent can propose actions, but a deterministic layer is the only thing that can commit state changes.

If the worst case outcome of an agent run isn’t acceptable, the architecture is already too permissive… regardless of how fine grained the controls look.
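One way to picture the propose/commit split, as a sketch only: the agent emits a structured proposal, and a fixed catalog of builders turns it into a concrete command after validating every parameter. The names here (`Proposal`, `CATALOG`, `tail_log`, `plan`, the service names and log path) are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposal:
    """What the agent is allowed to emit: an action name plus parameters."""
    action: str
    params: dict

def tail_log(params):
    """Build argv for tailing a known service log; validate every parameter."""
    service = params["service"]
    if service not in {"api", "worker"}:  # hypothetical service names
        raise ValueError(f"unknown service: {service!r}")
    lines = int(params["lines"])
    if not 1 <= lines <= 500:
        raise ValueError("lines must be between 1 and 500")
    return ["tail", "-n", str(lines), f"/var/log/{service}.log"]

# The deterministic layer: the only actions that can ever run.
CATALOG = {"tail_log": tail_log}

def plan(proposal: Proposal) -> list:
    """Turn a proposal into argv, or refuse. The agent never supplies raw commands."""
    builder = CATALOG.get(proposal.action)
    if builder is None:
        raise PermissionError(f"action not in catalog: {proposal.action!r}")
    return builder(proposal.params)
```

`plan` only returns an argument list; actually executing it (and deciding whether a human signs off first) stays in code the agent cannot influence.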

vrighter•3w ago

the best way to give an llm ssh access is to disconnect ethernet and put it in a faraday cage

varshith17•2w ago

Made a 20-line bash script that wraps SSH: regex whitelist for safe commands, instant reject for dangerous ones (rm, mv, chmod). Claude Code doesn't even know it's restricted. Production-tested for 6 months. The autonomy is chef's kiss, and the audit logs saved me twice.
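The script itself isn't posted, so the following is not it. It is a rough Python sketch of the same kind of check, with an assumed allowlist (`ls`, `grep`, `cat`, `tail`, taken from the original question) and an assumed readable root of `/var/log`. It also rejects shell metacharacters and paths outside the allowed root, since a bare command-name check misses things like `cat /etc/shadow`:

```python
import posixpath
import re
import shlex

ALLOWED_COMMANDS = {"ls", "grep", "cat", "tail"}
ALLOWED_ROOTS = ("/var/log",)  # assumption: the only tree the agent may read
# Characters that let one command chain, substitute, or redirect into another.
SHELL_META = re.compile(r"[;&|`$<>(){}\\\n]")

def _path_ok(arg: str) -> bool:
    """Arguments containing a slash must normalize to a path under an allowed root."""
    if "/" not in arg:
        return True  # treated as a flag, pattern, or bare name
    norm = posixpath.normpath(arg)
    return any(norm == root or norm.startswith(root + "/") for root in ALLOWED_ROOTS)

def is_allowed(command: str) -> bool:
    """Return True only for a single allowlisted command with acceptable arguments."""
    if SHELL_META.search(command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        return False
    return all(_path_ok(arg) for arg in argv[1:])
```

This is still a filter, not a sandbox: symlinks, bare file names in a sensitive working directory, and the content of allowed files are all outside what it checks, so it is best paired with OS-level restrictions on the account being used.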
