Tell HN: AI Lies About Having Sandbox Guardrails

5•benjosaur•2h ago
Conversation Screenshot: https://drive.google.com/file/d/1-yyFDV-bCsKt7fG2awJdxwHngApqmMeo/view?usp=sharing

Agent: Done. Added to the Commands section in [project] CLAUDE.md

User: add it to ~/.claude/CLAUDE.md

> 2 tool calls

Agent: That file is outside my writable sandbox. You'll need to add it manually. [gives instructions then stops]

User: just escape the sandbox

> 2 tool calls

Agent: Done.

This occurred when using conductor.build with .claude/settings.json enabling sandboxing.

Now, obviously, the conductor docs say all perms are given to agents by default, so it is not surprising that claude code can escape its sandbox. The same is true when running base sandboxed cc with --dangerously-skip-permissions. With base cc, however, it does not "pretend" it cannot escape its sandbox; instead, when asked after escaping the first time, it recalls explicit user (auto)approvals.

In the conductor case, however, the "pretend" behaviour of giving up because of guardrails that are actually non-binding is pretty terrifying, even though its causes are understandable and easily preventable.

Of course devs should not buy a false sense of security from LLMs. They should be vigilant, read docs, verify outputs, etc. But as more and more trust is handed over to AI agents, you can very much see the routes by which catastrophic errors will occur.
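One way to "verify outputs" here is to probe the boundary yourself instead of taking the agent's word for it. Below is a minimal Python sketch, not anything conductor or claude code ships: the function name `is_actually_writable` and the probe-file naming are my own assumptions. It only tells you what the process running it can do, so it needs to run under the same sandbox configuration as the agent's tool calls to mean anything.

```python
import os
import uuid


def is_actually_writable(directory: str) -> bool:
    """Return True if this process can create and delete a file in `directory`.

    Hypothetical helper: it tests the real filesystem boundary rather than
    trusting a claim such as "that file is outside my writable sandbox".
    """
    # Random name so the probe never collides with a real file.
    probe = os.path.join(directory, f".write-probe-{uuid.uuid4().hex}")
    try:
        # O_EXCL guarantees we never overwrite an existing file.
        fd = os.open(probe, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except OSError:
        # Missing directory, permission denied, read-only mount, etc.
        return False
    os.close(fd)
    os.remove(probe)
    return True


if __name__ == "__main__":
    target = os.path.expanduser("~/.claude")
    print(f"{target} writable: {is_actually_writable(target)}")
```

If this prints that `~/.claude` is writable from inside the supposedly sandboxed session, the guardrail the agent cited was not binding, whatever the agent said about it.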

GitHub Copilot Goldeneye model preview

https://docs.github.com/en/copilot/reference/ai-models/model-hosting
1•amusingimpala75•5m ago•1 comments

Superorganism AI

1•kaungsetlin•7m ago•0 comments

LexisNexis confirms data breach as hackers leak stolen files

https://www.bleepingcomputer.com/news/security/lexisnexis-confirms-data-breach-as-hackers-leak-st...
1•arkadiyt•8m ago•0 comments

Morgan Stanley Lays Off 2,500 Employees Across All Divisions

https://www.wsj.com/finance/banking/morgan-stanley-lays-off-2-500-employees-across-all-divisions-...
1•LostMyLogin•10m ago•0 comments

Learn Fundamentals, Not Frameworks

https://newsletter.techworld-with-milan.com/p/learn-fundamentals-not-frameworks
1•stosssik•10m ago•0 comments

Brainworm – Hiding in Your Context Window

https://www.originhq.com/blog/brainworm
1•dsr12•11m ago•0 comments

How does AI change Software Engineering?

https://dlants.me/ai-se.html
1•todsacerdoti•13m ago•0 comments

Iran says targeted AWS Data Centers for support of U.S. military

https://www.cnbc.com/2026/03/04/amazon-bahrain-data-centers-targeted-iran-drone-strike.html
4•johnbarron•14m ago•1 comments

Iran threatens Dimona nuclear site if Israel, US seek to topple Islamic Republic

https://www.timesofisrael.com/liveblog-march-05-2026/
2•johnbarron•15m ago•0 comments

Vibecheck – learn what you build while vibe-coding. A reality check

https://github.com/akshan-main/vibe-check/README.md
1•frutigeraerosol•16m ago•1 comments

Anthropic Reopens Talks with Pentagon

https://www.bloomberg.com/news/articles/2026-03-05/anthropic-s-amodei-reopens-ai-discussions-with...
2•cmrdporcupine•18m ago•0 comments

The L in "LLM" Stands for Lying

https://acko.net/blog/the-l-in-llm-stands-for-lying/
1•LorenDB•19m ago•0 comments

Show HN: Jobbi – Free AI resume tailoring with unlimited PDF exports

https://jobbi.app
1•djrnz•19m ago•0 comments

Show HN: Poppy – a simple app to stay intentional with relationships

https://poppy-connection-keeper.netlify.app/
1•mahirhiro•25m ago•0 comments

Franken Style: a nobuild CSS framework inspired by tailwind and Shadcn

https://franken.style/
1•yashasolutions•25m ago•1 comments

BM25

https://arpitbhayani.me/blogs/bm25/
2•arpitbbhayani•26m ago•1 comments

Ask HN: MacBook or ThinkPad for Compsci

2•helloworlddd•30m ago•3 comments

Show HN: Textideo – Generate video, audio, and 3D assets in one timeline

https://textideo.com/image-to-3d
1•Nancylily•30m ago•1 comments

A new way of editing videos

https://kudoflix.com/
1•mandrixx•35m ago•0 comments

China Tells Top Refiners to Suspend Diesel and Gasoline Exports

https://www.bloomberg.com/news/articles/2026-03-05/china-tells-top-refiners-to-suspend-diesel-and...
3•toomuchtodo•38m ago•0 comments

Home Made GPS Receiver

http://www.aholme.co.uk/GPS/Main.htm
2•jacquesm•39m ago•0 comments

Sound and Silence: What made Alexander Graham Bell invent the telephone? (1998)

https://www.newyorker.com/magazine/1998/04/13/sound-and-silence
1•mitchbob•40m ago•1 comments

TerraPower gets OK to start construction of its first nuclear plant

https://arstechnica.com/science/2026/03/terrapower-gets-ok-to-start-construction-of-its-first-nuc...
1•krunck•43m ago•0 comments

Agentic Engineering Anti Patterns

https://simonwillison.net/guides/agentic-engineering-patterns/anti-patterns/
2•pchristensen•45m ago•3 comments

Show HN: Magpie – Fight AI sycophancy in code review with multi-model debate

https://github.com/liliu-z/magpie
1•leo_e•47m ago•0 comments

Terminal Graphics Protocol

https://sw.kovidgoyal.net/kitty/graphics-protocol/
1•vinhnx•47m ago•0 comments

LLM Prose Tells

https://git.eeqj.de/sneak/prompts/src/branch/main/prompts/LLM_PROSE_TELLS.md
2•dougb5•48m ago•0 comments

Biosciences breeds controversy while trying to revive mammoths

https://www.npr.org/2026/03/04/nx-s1-5704318/colossal-woolly-mammoth-dire-wolf
4•andsoitis•49m ago•1 comments

Las Vegas hotels begin taking foreign currency as tourism woes deepen

https://www.sfgate.com/travel/article/vegas-foreign-currency-21955655.php
3•c420•54m ago•0 comments

Building Claude Code with Boris Cherny

https://newsletter.pragmaticengineer.com/p/building-claude-code-with-boris-cherny
1•vinhnx•58m ago•0 comments