> What I’ve found is that given a tool that can detect mistakes, the agent can often correct them
This is the most important line of the entire article
When iterating on a Manifesto for AI Software Development (https://metamagic.substack.com/p/manifesto-for-ai-software-d...) over the last two years, the key attribute I found, more than any other, was empirical validation. While AI (and humans) are not able to accurately judge their own work, when we give AI (and humans) the ability to do empirical validation, its success skyrockets. This might be intuitive, but there are still papers testing that this applies to AI too. Rather than having the AI write unit tests, I've been embracing fuzzing, because then the AI can't cheat with bonus tests. The idea of reaching back to school and using interactive theorem proving didn't even cross my mind, and now that it has been presented it is a whole paradigm shift in how to push my AI use forward so it can work even more autonomously.
AI can iterate at speeds humans can't. When it has even basic empirical validation (building the code, running tests), it removes the human from the loop. Move up to fuzzing (such as Go's built-in fuzzing) and you get far better coverage and far better progress before a human has to intervene. So it isn't a surprise that interactive theorem proving is a perfect match for AI.
It is interesting how this same lesson plays out elsewhere, earlier in the article:
> Why is ITP so hard? Some reasons are quite obvious: interfaces are confusing, libraries are sparse, documentation is poor, error messages are mysterious. But these aren’t interesting problems
Remember when Clang/LLVM got really good C++ error messages and it was life-changing? High-quality error messages mean we can find and fix the error fast and iterate fast. These are actually the MOST interesting problems because they enable the user to learn faster. When users have high success, they will use a product again and again. High-quality error messages in all tools will let Claude Code work longer on problems without human intervention, make fewer mistakes, and overall work faster.
While error messages should always be good, a new question that really hammers this home is: "When AI encounters this error message, can it fix the problem?"
r0ze-at-hn•14m ago