Even with only 2 of the 19 identified patterns implemented, it reduced the size of a test binary by 0.02%. Covering the remaining 17 patterns should push that further.
Highlights:
- Works directly on binaries, no source changes needed.
- Compatible with existing optimizations like -O2/-O3, -Oz, and strip.
- Potentially portable across architectures and file formats, not limited to ARM64 or ELF.
- Can complement packing tools (like UPX) without slowing execution.
This is early-stage, but the prototype shows that instruction-level pattern replacement is feasible. Next step: implement the remaining hot patterns to get a meaningful size reduction.
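For anyone wondering what the matching step could look like, here is a minimal Python sketch. It is not the actual tool: it assumes fixed 4-byte little-endian instruction words (as on ARM64), describes a pattern as a list of (mask, value) pairs, and uses a made-up pattern of two consecutive NOPs purely for illustration. The names `find_pattern` and `bytes_saved` are hypothetical.

```python
import struct

WORD = 4  # assumption: fixed-width 4-byte instructions, as on ARM64

def find_pattern(code, pattern):
    """Return byte offsets where each (mask, value) pair matches consecutive words."""
    n_words = len(code) // WORD
    words = struct.unpack(f"<{n_words}I", code[:n_words * WORD])
    hits = []
    i = 0
    while i + len(pattern) <= n_words:
        if all((words[i + k] & mask) == value
               for k, (mask, value) in enumerate(pattern)):
            hits.append(i * WORD)
            i += len(pattern)  # skip past the match so hits never overlap
        else:
            i += 1
    return hits

def bytes_saved(hits, pattern_words, replacement_words):
    """Estimate the size reduction if every hit shrinks to the replacement length."""
    return len(hits) * (pattern_words - replacement_words) * WORD

NOP = 0xD503201F  # A64 NOP encoding
RET = 0xD65F03C0  # A64 RET (x30) encoding

# Hypothetical pattern for illustration only: two NOPs in a row.
pattern = [(0xFFFFFFFF, NOP), (0xFFFFFFFF, NOP)]
code = struct.pack("<6I", NOP, NOP, RET, NOP, NOP, NOP)

hits = find_pattern(code, pattern)
print(hits, bytes_saved(hits, pattern_words=2, replacement_words=1))
```

A real pass would also need to avoid matching data embedded in code sections and to fix up branch targets after shrinking, which this sketch ignores.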
Would love thoughts from anyone who’s worked with binary transformations or runtime instruction emulation.