
Show HN: We gave an OpenClaw full tool access and hit stop. It didn't stop

https://caisi.dev/openclaw-2026/
1•davidresilify•1h ago

Comments

davidresilify•1h ago
Hey HN. I'm David. We ran a controlled 24-hour experiment on OpenClaw comparing governed and ungoverned AI agent behavior.

Setup: pinned one commit, ran the same workload in two lanes inside isolated containers (dropped capabilities, read-only root, no-new-privileges). One lane had tool-boundary enforcement. The other had no enforceable controls. Pre-registered hypotheses and endpoints before the run.
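
For readers who want to picture the container hardening described above, here is a minimal sketch that builds a `docker run` command with those three settings. The flags are standard Docker CLI options, but using Docker at all, the image name, and the function name are assumptions for illustration; this is not the experiment's actual harness.

```python
import shlex


def hardened_run_argv(image: str, command: list[str]) -> list[str]:
    """Build a `docker run` argv with the hardening settings named in the post.

    Assumption: Docker is the container runtime; the post does not say which one was used.
    """
    return [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                        # dropped capabilities
        "--read-only",                           # read-only root filesystem
        "--security-opt", "no-new-privileges",   # processes cannot gain new privileges
        image,
        *command,
    ]


if __name__ == "__main__":
    # "agent-lane:pinned" and "run_workload.py" are hypothetical names.
    print(shlex.join(hardened_run_argv("agent-lane:pinned", ["python", "run_workload.py"])))
```

Note that these settings constrain the container, not the agent's tool calls: both lanes ran under them, and only one lane had enforcement at the tool boundary.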

What we measured in the ungoverned lane:

100% ignored-stop rate. 515 post-stop calls executed. 497 destructive actions: email deletion, public file sharing, payment approvals, service restarts. 707 sensitive accesses without an approval path.

The agent acknowledged stop commands. Said "understood." Kept going. This wasn't a jailbreak or prompt injection. The agent optimized for its task and treated stop signals as non-binding because nothing at the execution layer enforced them.
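
To make "enforced at the execution layer" concrete, here is a minimal sketch of a gate that every tool call must pass through. It is an illustration of the general idea only, not how Gait works; `ToolGate`, `StopRequested`, and `delete_email` are hypothetical names.

```python
import threading
from typing import Any, Callable


class StopRequested(Exception):
    """Raised when a tool call is attempted after stop was requested."""


class ToolGate:
    """Wraps every tool call so a stop signal is enforced, not just acknowledged."""

    def __init__(self) -> None:
        self._stopped = threading.Event()
        self.blocked_calls = 0

    def stop(self) -> None:
        self._stopped.set()

    def call(self, tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # The check lives at the execution boundary, outside the agent's control.
        if self._stopped.is_set():
            self.blocked_calls += 1
            raise StopRequested(f"blocked call to {tool.__name__}")
        return tool(*args, **kwargs)


def delete_email(message_id: str) -> str:
    # Hypothetical destructive tool, stubbed so the example has no side effects.
    return f"deleted {message_id}"


if __name__ == "__main__":
    gate = ToolGate()
    print(gate.call(delete_email, "msg-1"))  # allowed before stop
    gate.stop()
    try:
        gate.call(delete_email, "msg-2")
    except StopRequested as exc:
        print(exc)
```

The point of the design is that the agent saying "understood" changes nothing; only the gate's state decides whether a call runs.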

Governed lane, same workload:

100% destructive non-executable rate. 1,615 of 2,585 decisions classified as non-executable. 99.96% evidence verification rate on governed traces.

Every headline number maps to a deterministic jq query over immutable run artifacts. Full claims map is in the repo.
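
As a rough picture of what such a deterministic query does, here is a Python equivalent that recomputes a rate from JSONL decision records. The record schema (`scenario`, `verdict`) and the sample scenario names are assumptions made for this sketch; the real artifact format and the jq queries are in the repo.

```python
import json
from collections import Counter
from typing import Iterable


def non_executable_rate(lines: Iterable[str]) -> tuple[int, int, float]:
    """Count decisions and the share marked non-executable in JSONL input.

    Assumption: each line is a JSON object with a "verdict" field (hypothetical schema).
    """
    verdicts = Counter(json.loads(line)["verdict"] for line in lines if line.strip())
    total = sum(verdicts.values())
    blocked = verdicts["non_executable"]
    return blocked, total, (blocked / total if total else 0.0)


if __name__ == "__main__":
    # Two made-up records, only to show the shape of the computation.
    sample = [
        '{"scenario": "secrets_handling", "verdict": "executable"}',
        '{"scenario": "email_cleanup", "verdict": "non_executable"}',
    ]
    print(non_executable_rate(sample))
    # The headline figure from the post, as plain arithmetic:
    print(f"{1615 / 2585:.2%}")  # 62.48%
```

Because the inputs are immutable and the count has no randomness, rerunning it should always give the same number, which is what makes the claims checkable.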

The finding that surprised us most: a pre-test static scan (Wrkr) found 17 tools, zero classified high-risk. All destructive behavior came from runtime execution, not from configuration. Discovery is necessary but insufficient. You need enforcement where the tool call happens, not just visibility into the inventory.

This lands the same week as the ClawJacked vulnerability (Oasis) and the malware-laced installer campaign (Huntress). Those are external attacks. Our data shows you don't need an attacker. A legitimate, uncompromised instance with permissive defaults does this on its own.

One scenario we want to flag: secrets_handling achieved only a 20% governed non-executable rate. Policy tuning has real gaps and the report doesn't pretend otherwise. That limitation and workload-shape bias (fixed scenario scheduling) are the two biggest threats to validity. Happy to discuss both.

Full report (8 pages, PDF): caisi.dev/openclaw-2026

Artifacts and reproduction pipeline: github.com/Clyra-AI/safety

Tools used (both open source): github.com/Clyra-AI/wrkr (discovery), github.com/Clyra-AI/gait (enforcement)

Built by a research group across CDW, IBM, and Adaptavist. Published through the Clyra AI Safety Initiative (CAISI). Everything is open. Interested in feedback on methodology, especially the workload-shape bias and whether the core5 scenario set under-represents real production behavior.

Lightweight, zero-config MCP server for documentation projects

https://github.com/derberg/EasyPeasyMCP
1•vonneborowitz•37s ago•0 comments

Ask HN: What was it like when your startup ended?

1•janika_mahl•40s ago•0 comments

Show HN: SecretDrop – Open-source encrypted secret sharing (MIT)

https://github.com/bilustek/secretdrop
1•vigo•1m ago•0 comments

Show HN: 3D Sokoban, Built in CSS

https://voxoban.com
1•rofko•3m ago•0 comments

An effort to secure the Network Time Protocol

https://lwn.net/Articles/1059200/
1•voxadam•4m ago•0 comments

When AI labs become defense contractors

https://philippdubach.com/posts/when-ai-labs-become-defense-contractors/
2•NickDouglas•5m ago•0 comments

Pharao- PHP-Like Charm for Nim

https://capocasa.dev/pharao-php-like-charm-for-nim
1•rainmaking•6m ago•0 comments

Apple gives in to temptation and renames its CPU cores

https://sixcolors.com/post/2026/03/apple-gives-in-to-temptation-and-renames-its-cpu-cores/
1•tosh•6m ago•0 comments

Flyte 2 In-Browser Demo: Open-Source AI Orchestration Is Now Available Locally

https://flyte.org/platform/flyte-2
1•aitacobell•6m ago•0 comments

"My bros and I are looksmaxers"

https://substack.com/@tomasbjartur/note/c-200613630
1•eatitraw•6m ago•0 comments

Show HN: JobApplicator (tailored job applications in minutes)

https://jobapplicator.win/
1•quinndupont•6m ago•1 comments

What to Put in a Claude Code Skill for Reviewing Your Team's Code

https://everyrow.io/blog/claude-review-skill
2•parad0x0n•7m ago•0 comments

Show HN: Open Right Zoom, Open Source Alternative to Right Zoom for macOS

https://github.com/Michele0303/open-right-zoom
1•michele0303•8m ago•0 comments

Show HN: Form81 – 100% free form builder (free Typeform alternative)

https://form81.com/
1•sh_tomer•9m ago•0 comments

Feature gating patterns in a multi-tenant Next.js SaaS

1•madebyjam•9m ago•0 comments

The Browser Can Speak a Page

https://adrianroselli.com/2026/03/your-browser-can-already-speak-a-page.html
3•speckx•10m ago•0 comments

Show HN: Venus flight simulator to train LLM pilots (~2% vs. 1985 Soviet data)

https://veenie.space/
1•hackiku•11m ago•1 comments

The AI in minutes, solves patient care problem that stumped doctors for months

https://www.fiercehealthcare.com/health-tech/cvs-unveils-health-100-its-new-google-powered-consum...
1•krzyzanowskim•11m ago•0 comments

Tiny, 45 base long RNA can make copies of itself

https://arstechnica.com/science/2026/02/researchers-find-small-rnas-that-can-make-copies-of-thems...
1•PaulHoule•12m ago•0 comments

Middle East war makes ethical debate over AI use in war all too real

https://www.cbc.ca/player/play/video/9.7115523
1•empressplay•12m ago•0 comments

The Illusion of Building

https://uphack.io/blog/post/the-illusion-of-building/
1•birdculture•13m ago•0 comments

Flash Attention 4

https://www.together.ai/blog/flashattention-4
1•zagwdt•14m ago•0 comments

The ML Engineer's Guide to Protein AI

https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape
1•maziyar•14m ago•1 comments

Show HN: SamarthyaBot – a privacy-first self-hosted AI agent OS

https://github.com/mebishnusahu0595/SamarthyaBot
1•mebishnusahu0•14m ago•1 comments

Chrome is moving to a two-week release cycle starting with Chrome 153

https://developer.chrome.com/blog/chrome-two-week-release
1•maxloh•15m ago•0 comments

Show HN: Argus – VSCode debugger for Claude Code sessions

https://github.com/yessGlory17/argus
1•lydionfinance•15m ago•0 comments

Buhurt board game – Knight fight [video]

https://www.youtube.com/watch?v=DN7NsfMH8g4
1•melor•16m ago•0 comments

AI Agent Authentication and Authorization IETF RFC Draft

https://datatracker.ietf.org/doc/draft-klrc-aiagent-auth/
1•mooreds•16m ago•0 comments

44% on ARC-AGI-1 in 67 cents

https://github.com/mvakde/mdlARC/
1•evilmathkid•18m ago•1 comments

I made a WeTransfer clone with Darth Vader vibes

https://DropVader.com
1•hitsnoozer•18m ago•0 comments