- Cursor attempted to make a browser from scratch: https://cursor.com/blog/scaling-agents
- Anthropic attempted to make a C compiler: https://www.anthropic.com/engineering/building-c-compiler
I have been wondering whether some software packages can be easily reproduced by taking their existing test suites and tasking agents to work until those suites pass.
After playing with this concept by having Claude Code reproduce Redis and SQLite, I began looking for software packages where an agent-made reproduction might actually be useful.
I found libxml2, a widely used open-source C library for parsing, creating, and manipulating XML and HTML documents. Three months ago it became unmaintained, with this update: "This project is unmaintained and has [known security issues](https://gitlab.gnome.org/GNOME/libxml2/-/issues/346). It is foolish to use this software to process untrusted data."
With a few days of work, I was able to create xmloxide, a memory-safe Rust replacement for libxml2 that passes the compatibility suite as well as the W3C XML Conformance Test Suite. Performance is similar on most parsing operations and better on serialization. It comes with a C API so that it can replace libxml2 in existing uses.
- crates.io: https://crates.io/crates/xmloxide
- GitHub release: https://github.com/jonwiggins/xmloxide/releases/tag/v0.1.0
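For readers wondering what a C API on top of a safe Rust library tends to look like, here is a minimal sketch. It is not xmloxide's actual code: `count_start_tags` and `demo_count_start_tags` are hypothetical names, and the tag counter is a naive stand-in for a real parser. The point is only that the core logic can stay in safe Rust while the `unsafe` is confined to the thin wrapper that accepts raw pointers from C.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical safe Rust entry point (not part of xmloxide).
/// Naively counts start tags by scanning for '<' followed by a name-start
/// character; it does not understand comments, CDATA, or attribute values.
pub fn count_start_tags(input: &str) -> usize {
    let bytes = input.as_bytes();
    let mut count = 0;
    for i in 0..bytes.len() {
        if bytes[i] == b'<' {
            // "</", "<!" and "<?" are skipped because the next byte
            // is not a letter or underscore.
            if let Some(&next) = bytes.get(i + 1) {
                if next.is_ascii_alphabetic() || next == b'_' {
                    count += 1;
                }
            }
        }
    }
    count
}

/// Hypothetical C-callable wrapper around the safe function above.
/// Returns -1 for a null pointer or for input that is not valid UTF-8.
///
/// # Safety
/// `input` must be null or point to a valid NUL-terminated string.
#[unsafe(no_mangle)] // on toolchains older than Rust 1.82, write #[no_mangle]
pub unsafe extern "C" fn demo_count_start_tags(input: *const c_char) -> i64 {
    if input.is_null() {
        return -1;
    }
    // The only unsafe step: trusting the pointer handed over by the C caller.
    let c_str = unsafe { CStr::from_ptr(input) };
    match c_str.to_str() {
        Ok(s) => count_start_tags(s) as i64,
        Err(_) => -1,
    }
}
```

A C caller would declare this as `int64_t demo_count_start_tags(const char *input);`. Rust callers use `count_start_tags` directly and never touch `unsafe`.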
While I don't expect people to cut over to this new and unproven package, I do think there is something interesting here in how quickly coding agents like Claude Code can iterate given a test suite. It's possible that the legacy code problem posed by COBOL and other systems will go away as rewrites become easier. Ongoing maintenance, meaning fixing CVEs and updating to later package versions, would then become a larger share of software package management work.
blegge•1h ago
Why "in the public API"? Does this imply it's using unsafe under the hood? If so, what for?
DetroitThrow•36m ago
mirashii•25m ago
It is absolutely a useful distinction on whether your users need to deal with unsafe themselves or not.