Idea: AI suggests code changes but execution decides. What's wrong with this?
adhamghazali•1mo ago
I’m trying to break an idea and would appreciate blunt feedback. The concept is a system where AI can propose code or test changes, but every change is actually run (tests/benchmarks/sims) and judged only by real before-and-after metrics against an explicit goal. No “task completed,” no autonomy without bounds, no shipping without validation — the loop continues only if reality improves. I’m worried this might just be overcomplicated, too slow or annoying in practice, or only useful for a tiny niche (infra/robotics). If you’ve built real systems, where does this fall apart, and what would make you turn it off after a week?
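Roughly, the loop I have in mind looks like the sketch below. It's only an illustration: propose_patch, apply_patch, revert_patch, and bench.sh are hypothetical placeholders, and it assumes a single scalar metric where lower is better (say, p99 latency).

    # Minimal sketch of the propose/run/compare loop (hypothetical helper names;
    # assumes one scalar metric where lower is better, e.g. p99 latency).
    import subprocess

    def run_benchmark() -> float:
        # Run the real test/benchmark suite; the returned number is the only judge.
        out = subprocess.run(["./bench.sh"], capture_output=True, text=True, check=True)
        return float(out.stdout.strip())

    def validation_loop(propose_patch, apply_patch, revert_patch,
                        goal: float, max_iters: int = 20) -> float:
        baseline = run_benchmark()
        for _ in range(max_iters):
            patch = propose_patch(baseline)   # AI suggests a change
            apply_patch(patch)
            candidate = run_benchmark()       # execution decides, not the model
            if candidate < baseline:
                baseline = candidate          # keep the change: reality improved
            else:
                revert_patch(patch)           # no measured gain, so roll it back
            if baseline <= goal:              # explicit goal reached, stop
                break
        return baseline

The point is that the accept/reject decision only reads the benchmark output, never the model's own claim that the task is done.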
Comments
gus_massa•1mo ago
I just found a bug in my code. I had no test for that case. In retrospect, it was an obvious test, but I didn't have it. So validation is only as good as your test suite.
gus_massa•1mo ago