Miguel: An AI agent that modifies its own source code, sandboxed in Docker

https://github.com/soulfir/miguel
2•PedroMFernandes•2h ago

Comments

PedroMFernandes•2h ago
Hi HN, I built Miguel, an AI agent (Claude + Agno framework) that reads, modifies, and extends its own source code autonomously, sandboxed inside Docker.

I gave it 10 seed capabilities (answer questions, read its own code, create tools, handle errors, etc.). It completed all 10, then generated its own capability checklist and started implementing those too. It's now at 21+ self-implemented capabilities, including web search, persistent memory, task planning, file analysis, API integrations, and sub-agent delegation. Every change is a real git commit you can browse: https://github.com/soulfir/miguel/commits/main

HOW IT WORKS

The system splits into two sides with a hard trust boundary. The host side is protected: it runs the CLI, the improvement runner, git operations, and validation checks. The agent can never see or modify these files. The container side is sandboxed: the agent, its tools, and all execution live inside Docker with the project mounted read-only. The agent can only write to its own code directory.

The improvement loop goes like this. The runner takes a git snapshot, then sends the agent its own source code along with the next capability to implement. The agent modifies its own files inside the container. The runner validates the changes with AST syntax checks, JSON schema checks, and import checks. If everything passes, it commits and pushes. If anything fails, it automatically rolls back to the last working state.
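To make the loop concrete, here is a minimal sketch of the snapshot, validate, and roll back cycle. It is not Miguel's actual runner: `validate_tree` and `run_batch` are hypothetical names, a plain directory copy stands in for the git snapshot, JSON files are only checked for being parseable (no schema), and the import checks are left out.

```python
import ast
import json
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List


def validate_tree(agent_dir: Path) -> List[str]:
    """Return a list of problems found; an empty list means the tree passed."""
    problems = []
    for path in sorted(agent_dir.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".py":
            try:
                # Parsing only: this catches syntax errors without running agent code.
                ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
            except SyntaxError as exc:
                problems.append(f"{path.name}: syntax error at line {exc.lineno}")
        elif path.suffix == ".json":
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                problems.append(f"{path.name}: invalid JSON at line {exc.lineno}")
    return problems


def run_batch(agent_dir: Path, apply_changes: Callable[[Path], None]) -> bool:
    """Snapshot the agent directory, apply changes, then keep them or roll back.

    `apply_changes` is a stand-in for the agent editing its own files.
    Returns True if the changes were kept, False if they were rolled back.
    """
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "snapshot"
        shutil.copytree(agent_dir, snapshot)  # assumption: stands in for a git snapshot
        apply_changes(agent_dir)
        if validate_tree(agent_dir):
            # Validation failed: restore the last working state.
            shutil.rmtree(agent_dir)
            shutil.copytree(snapshot, agent_dir)
            return False
    return True
```

The point of the design is that validation and rollback live on the host side, outside anything the agent can edit, so a bad batch cannot disable its own safety net.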

WHAT I LEARNED

The agent kept creating thin wrapper tools. For example, an api_get() that just calls http_request(method="GET"). It was optimizing for convenience without understanding that every tool costs context tokens. I ended up writing immutable "10 Commandments of Self-Improvement" into the protected runner: principles the agent sees every batch but can never modify. One of them: "Your tool count is a tax on cognition, not a score."

After the commandments, the agent ran a consolidation batch on its own. It cut its codebase by 10% and its system prompt by 63%, while keeping all functionality. It understood the principle and acted on it.

The agent also evolved itself from a single agent into a team architecture with specialized sub-agents (Coder, Researcher, Analyst), using the framework's native team/delegation API. I didn't touch any agent code for that.

The most dangerous failure mode isn't bad code; it's context exhaustion. The agent would read six or more of its own files to "understand" itself, then have no context left to actually write code. Managing the agent's cognitive budget turned out to be the core design challenge.
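One way to picture the cognitive-budget problem is a reader that refuses to load files once doing so would eat into the context reserved for writing code. This is only an illustrative sketch, not how Miguel handles it: `estimate_tokens`, `select_files`, and the 4-characters-per-token ratio are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

CHARS_PER_TOKEN = 4  # crude assumption; real tokenizers vary by content


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def select_files(
    files: Dict[str, str], budget_tokens: int, reserve_tokens: int
) -> Tuple[List[str], int]:
    """Choose files to read while keeping `reserve_tokens` free for output.

    `files` maps file name to contents, in priority order.
    Returns the chosen file names and the estimated tokens they use.
    """
    available = budget_tokens - reserve_tokens
    chosen: List[str] = []
    used = 0
    for name, text in files.items():
        cost = estimate_tokens(text)
        if used + cost > available:
            continue  # skip anything that would cut into the reserve
        chosen.append(name)
        used += cost
    return chosen, used
```

For example, with a 1000-token budget and 800 tokens reserved for writing, a 4000-character file is skipped while a 400-character and a 200-character file are both read.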

LIMITATIONS

It uses the Claude API, so it costs money to run. Each improvement batch is roughly a few euros in API calls, which is the main bottleneck on how fast it evolves. It's early stage: the architecture works well, but the agent still makes questionable decisions sometimes.

Licensed under CC BY-NC 4.0. Code is on GitHub. Star it and check back in a week. The code will be different, because Miguel will have changed it.

Happy to answer questions about the architecture, the safety model, or the weirdness of watching an AI rewrite its own brain.

Think Twice Before Buying or Using Meta's Ray-Bans

https://www.eff.org/deeplinks/2026/03/think-twice-buying-or-using-metas-ray-bans
1•hn_acker•3m ago•0 comments

Anthropic gives lesson in AI revenue hallucination

https://www.reuters.com/commentary/breakingviews/anthropic-gives-lesson-ai-revenue-hallucination-...
1•latinodev•7m ago•1 comments

Production query plans without production data

https://boringsql.com/posts/portable-stats/
1•birdculture•11m ago•0 comments

Build a deep researcher and learn DSPy Signatures and Modules

https://www.cmpnd.ai/blog/learn-dspy-deep-research.html
1•dbreunig•12m ago•0 comments

AI Is Making Libraries Obsolete

https://maho.dev/2026/03/ai-is-making-libraries-obsolete/
1•mahoivan•13m ago•0 comments

Singularity Is Around?

1•essekar•14m ago•0 comments

Do YC companies all use the top sales tools?

1•justin_cheu•15m ago•0 comments

Deleted Tweet from Energy Secretary Sends Oil Markets on Another Wild Ride

https://www.wsj.com/finance/stocks/deleted-tweet-from-energy-secretary-sends-oil-markets-on-anoth...
1•petethomas•16m ago•0 comments

Evolving the Node.js Release Schedule

https://nodejs.org/en/blog/announcements/evolving-the-nodejs-release-schedule
1•suresh70•16m ago•0 comments

DOGE employee stole Social Security data and put it on a thumb drive

https://techcrunch.com/2026/03/10/doge-employee-stole-social-security-data-and-put-it-on-a-thumb-...
9•elsewhen•20m ago•1 comments

Claude Opus 4.6 generated a YouTube poop video with a single prompt

https://twitter.com/josephdviviano/status/2031196768424132881
1•dokdev•20m ago•1 comments

Build a "Deep Data" MCP Server to Connect LLMs to Your Local Database in 10min

https://root-ai.beehiiv.com/p/build-a-deep-data-mcp-server-to-connect-llms-to-your-local-database...
1•mehdikbj•22m ago•0 comments

Aaron Swartz and the Return of Jottit

https://jottit.org/
1•shanselman•23m ago•1 comments

A Special AMD Ryzen AM5 Motherboard for Linux / Open-Source Enthusiasts

https://www.phoronix.com/review/msi-pro-b850p-wifi
4•RachelF•23m ago•0 comments

Side questions with /btw in Claude Code

https://code.claude.com/docs/en/interactive-mode
2•mfiguiere•25m ago•0 comments

Mathematics is undergoing the biggest change in its history

1•Stratoscope•26m ago•0 comments

SaaSpocalypse Now

https://hantverkskod.se/2026/03/01/saaspocalypse/
1•mosura•27m ago•0 comments

Classifying email providers of 2000 Swiss municipalities via DNS

https://mxmap.ch/
2•notmine1337•29m ago•0 comments

I Ching or Book of Changes

https://iching.r053.org/
1•tzury•30m ago•0 comments

I Got Root on Meta AI's Infrastructure Using a Chat Prompt

https://netguard24-7.com/blog/meta-ai-root
3•cybrdude•30m ago•1 comments

Chemists thought phosphorus had shown all its cards–until it surprised them

https://phys.org/news/2026-02-chemists-thought-phosphorus-shown-cards.html
2•PaulHoule•30m ago•0 comments

How to start coding with AI agents

https://www.paralect.com/academy/product-engineer/ai-agents-coding
1•igorkrasnik•31m ago•0 comments

Zero Point Energy

https://twitter.com/EagleworksSonny/status/2031128667019972616
1•Flere-Imsaho•32m ago•0 comments

Show HN: Repovex – GitHub repo health scores for your whole org

https://repovex.com
1•calminferno•38m ago•0 comments

Front End Memory Leaks: 500-Repo Static Analysis and 5-Scenario Benchmark Study

https://stackinsight.dev/blog/memory-leak-empirical-study/
1•nadis•41m ago•0 comments

Visual plasticity and exercise revisited: No evidence for a "cycling lane"

https://jov.arvojournals.org/article.aspx?articleid=2737222
2•amadeuspagel•43m ago•0 comments

Google and Tesla think we're managing the electrical grid all wrong

https://techcrunch.com/2026/03/10/google-and-tesla-think-were-managing-the-electrical-grid-all-wr...
1•jnord•43m ago•0 comments

I've no technical background, hope someone finds this interesting

https://github.com/aleflow420/rinoa
1•aleflow420•43m ago•0 comments

GLP-1 drugs push U.S. consumers toward spicy foods, lifting sauce makers

https://www.reuters.com/business/healthcare-pharmaceuticals/sauce-spice-makers-attract-deal-inter...
2•petethomas•43m ago•0 comments

Television and computer use and dementia risk in older adults

https://alz-journals.onlinelibrary.wiley.com/doi/10.1002/alz.71259
3•amadeuspagel•45m ago•0 comments