A way to exclude sensitive files issue still open for OpenAI Codex

https://github.com/openai/codex/issues/2847

44•pikseladam•1h ago

Comments

pikseladam•1h ago

It has been a year and it is still not resolved.

pamcake•52m ago

It's not their problem to solve. Don't give it access to sensitive files in the first place.

pohl•1h ago

This should be an open standard like AGENTS.md or skills. What do other harnesses do?

ampersandwhich•54m ago

I believe JetBrains products like Junie use the neutral term .aiignore for this functionality.

TheDong•1h ago

You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.

If you don't do that, the agent will be able to incidentally upload them. What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.

And so, the only solution is to make it so the codex process is unable to access those files, hence using a container, or unix permissions, or deleting the files. Which you can already do.

I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible, e.g. if the agent invokes "make", which makes it impossible to solve perfectly.
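To make the incidental-leak scenario above concrete, here is a minimal sketch. It assumes a simplified recursive search written for illustration (`search_tree` is a hypothetical helper, not ripgrep, which applies its own ignore rules by default) and a made-up `.env` value. The point is only that any tool output containing a matching line carries that line's contents along with it.

```python
import tempfile
from pathlib import Path


def search_tree(root: Path, needle: str) -> list[str]:
    """Return 'filename:line' hits for needle, roughly like a recursive grep."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            # Files the process cannot read (e.g. blocked by permissions) are skipped.
            continue
        for line in text.splitlines():
            if needle in line:
                hits.append(f"{path.name}:{line}")
    return hits


def demo_search_leak() -> list[str]:
    """Search a throwaway directory that holds one source file and one secret file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "app.py").write_text("print('foo')\n")
        # Hypothetical secret that happens to contain the search term.
        (root / ".env").write_text("API_KEY=foo-not-a-real-key\n")
        return search_tree(root, "foo")
```

Calling `demo_search_leak()` returns a hit for `.env` next to the hit for `app.py`, so whatever is sent back as tool output would include the secret line. If the process could not read `.env` at all, the `OSError` branch would skip it.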

FergusArgyll•56m ago

Yes, this was solved decades ago. How do you stop a human from reading one of your files?

  chmod 600
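For reference, a sketch of the same permission change done programmatically, assuming a POSIX system. `restrict_to_owner` is a hypothetical helper name. Note that mode 600 only keeps out other users: it protects the file from an agent only if the agent runs as a different, non-root user than the file's owner.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> int:
    """Set mode 0o600 (owner read/write only) and return the resulting mode bits."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


def demo_chmod() -> str:
    """Apply the restriction to a throwaway file and report its mode in octal."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        path = handle.name
    try:
        return oct(restrict_to_owner(path))
    finally:
        os.remove(path)
```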

re-thc•3m ago

> How do you stop a human from reading one of your files?

Call the police!

lelandfe•41m ago

Just be aware that AI agents will explore alternate means of accessing said files: https://news.ycombinator.com/item?id=48348578

cowsandmilk•32m ago

If you’re already running codex as a different user to limit its file permissions, why would you add it to the docker group?

lelandfe•23m ago

A good but altogether separate note from the point I’m making: this lack of access is seen as an obstacle to overcome, and other means of access will be tried if available.

It’s a different mental model than a first party solution to “ignore” files.

jen20•6m ago

Lack of knowledge and the desire to have it run containers for things.

planb•57m ago

Sounds like snake oil. How would this work? The app that the agent is developing needs access to the file, so access to it cannot be blocked. Just because read_file cannot access it (I think current harnesses prevent reading .env files already) does not mean the contents will never be seen by the model.

petcat•53m ago

Hopefully they never actually implement this pointless feature because it will only give people a false sense of security given the unpredictable nature of LLMs. How could something like this even be enforced?

People just need to learn how to use the tools their system already provides them. i.e., chmod

wodenokoto•49m ago

The whole point of using an agent is that I don't want to learn everything. I fully expected the harness to read the .agentignore file and do what is needed to hide it from the LLM.

But apparently, even if implemented, that's not how it works!

KHRZ•32m ago

How would it prevent an agent from writing a script that discovers the secret file? It's not magic.

kstenerud•41m ago

I solved it myself. I built a tool that creates a sandbox, starts an agent, and only mounts the files/dirs that the agent actually needs. It doesn't even get access to your workdir. Instead, you get git semantics (diff, apply) so that you can see what changes are going to land and decide if you really want them or not.
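A rough sketch of the general idea described above (expose only what is needed, then review a diff before anything lands). This is not how yoloai is implemented; `make_sandbox` and `review_diff` are hypothetical names, and a plain directory copy is not an isolation boundary by itself, since the agent process would still need to run in a container or under another user to be kept away from the original files.

```python
import difflib
import shutil
import tempfile
from pathlib import Path


def make_sandbox(workdir: Path, allowed: list[str]) -> Path:
    """Copy only the allow-listed relative paths into a fresh temp directory."""
    sandbox = Path(tempfile.mkdtemp(prefix="sandbox-"))
    for rel in allowed:
        target = sandbox / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(workdir / rel, target)
    return sandbox


def review_diff(workdir: Path, sandbox: Path, rel: str) -> str:
    """Unified diff of one file: the original copy versus the sandbox copy."""
    before = (workdir / rel).read_text().splitlines(keepends=True)
    after = (sandbox / rel).read_text().splitlines(keepends=True)
    return "".join(difflib.unified_diff(before, after, f"a/{rel}", f"b/{rel}"))
```

Files left off the allow-list (a `.env`, for example) never appear in the sandbox, and the diff gives the user a chance to inspect changes before copying them back.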

https://github.com/kstenerud/yoloai

agentdev001•34m ago

Sounds like user error to me. Codex gives an LLM a tool that lets it use a shell in the context of the host and user it is running as. If a resource is sensitive, and accessible in that context, then the user is doing something wrong. Would you change your practices if you treated your coding agent as an untrusted human ssh'd in under the identity you use for it?

In any case, there are solutions in the comments on the issue, as well as in this HN thread.

cowpig•28m ago

I don't think we should ask the agent runtime to police itself.

I contributed to a tool for this problem that is lower-friction than traditional sandboxing:

greywall.io

But you should use something to contain an agent runtime. The idea that people run things like codex on their machines with regular user permissions is baffling to me.

ZiiS•27m ago

However clever or stupid you believe LLMs are, they are extremely capable of working around these sorts of restrictions. The ask is for .env files for whatever code you are writing, so if the code it writes doesn't have access (i.e. filesystem/container), what is the point? And if the code under development reads the env, how does codex debug it without accidentally reading the values from memory? Adding a security setting that doesn't work is much worse than not having one.

bob1029•16m ago

The only thing close to a guarantee is to give the agent exclusive access to a clean VM with precisely the information and permissions you want it to have.

I've been looking into a "workspace" concept that involves an entire cloud VM being spun up as part of an agent conversation such that code changes can be iterated without touching the user's local machine or other trusted contexts. All the agent's tools only have effect when supplied with a specific workspace guid. CLI tools like git are not authorized to talk to the remotes in this arrangement. The machine is initialized with a clone and no way to talk to origin. There are dedicated methods in the harness that can reach into the VM and pull out a change set for deterministic PR generation in the secure contexts (e.g. when the agent calls "ReadyForReview" or similar).

hoppp•12m ago

Do not store secrets in the repository in files, but inject them during runtime. Then the agents have no way to access them.
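A minimal sketch of runtime injection, assuming the secret arrives as an environment variable set by whatever launches the app (a secrets manager, CI, or a wrapper script). `load_secret` and the variable name `DB_PASSWORD` are hypothetical. This keeps the value out of the repository, but it only keeps it away from the agent if the agent's own process is not started with the same variables.

```python
import os


def load_secret(name: str, environ=None) -> str:
    """Fetch a secret from the process environment, failing loudly if absent."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

Usage would look like `password = load_secret("DB_PASSWORD")` at application startup; the optional `environ` argument exists only so the function can be exercised without touching the real environment.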

Lucasoato•12m ago

There should be a standard around an .agentignore file, similar to what happens with .gitignore. Of course this could still be worked around by agent bash command tools, but at least basic operations like reading should be checked and prevented.
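A sketch of what such a check could look like for a harness's built-in read tool. Everything here is an assumption: .agentignore is not an established standard, `parse_patterns`, `is_ignored`, and `guarded_read` are hypothetical names, and the matching uses Python's `fnmatch`, which is much simpler than real .gitignore semantics (no negation, no anchoring). As noted above, it does nothing about shell commands.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def parse_patterns(text: str) -> list[str]:
    """Keep non-empty lines that are not '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """True if the whole path or any single path component matches a pattern."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if fnmatch(path, pattern):
            return True
        # A trailing slash marks a directory pattern; compare it per component.
        if any(fnmatch(part, pattern.rstrip("/")) for part in parts):
            return True
    return False


def guarded_read(path: str, patterns: list[str], reader) -> str:
    """Refuse to call the underlying reader for ignored paths."""
    if is_ignored(path, patterns):
        raise PermissionError(f"{path} is excluded by the ignore file")
    return reader(path)
```

With patterns `.env`, `*.pem`, and `secrets/`, this blocks `config/.env`, `keys/server.pem`, and `secrets/db.txt` while allowing `src/main.py`.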

mbid•12m ago

I recently got the tool I use to orchestrate agents in (remote/secure) devcontainers open-sourced at work to solve this properly: https://github.com/nvidia/rumpelpod

As others here have pointed out, it's exceedingly unlikely that a blocklist like the one proposed in the issue would ever be complete. You shouldn't allow agents direct yolo-access to your machine if it has sensitive data.

Codex works particularly well as a remote agent harness because of its client-server architecture: The server component runs in the container, which might be remote, while the client runs locally. So, in contrast to e.g. the claude cli where the frontend also runs remotely, there's no lag when you write/edit prompts.

mixedbit•1m ago

I work on a Linux sandbox that makes it easy to hide sensitive files from AI agents while keeping the files they need accessible. Check it out: https://github.com/wrr/drop
