The expense calculation might be:
expense of improvement = (time taken per optimization step * cost per unit time) / (speedup - 1)
The expensive heuristic function is saving wall time while also being cheaper in cost per unit time. And as the paper shows, the speedup provided per unit time, multiplied by the unit cost of time, is large.
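A rough sketch of that expense formula in Python, in case it helps to plug numbers in. The function name, the units, and the example figures (one hour per step, $0.001 per second) are my own assumptions for illustration; only the 17x speedup comes from this thread.

```python
def expense_of_improvement(step_time_s: float, cost_per_s: float, speedup: float) -> float:
    """Cost of one optimization step divided by the fractional gain (speedup - 1).

    All parameter names and units are hypothetical; they are not taken from the paper.
    """
    if speedup <= 1.0:
        # No improvement means the formula divides by zero or goes negative.
        raise ValueError("speedup must be greater than 1 for the expense to be defined")
    return (step_time_s * cost_per_s) / (speedup - 1.0)


# Assumed numbers: a 1-hour optimization step at $0.001 per second, yielding 17x.
example = expense_of_improvement(step_time_s=3600.0, cost_per_s=0.001, speedup=17.0)
```

Note the formula blows up as the speedup approaches 1, which matches the intuition that a long search for a tiny gain is the expensive case.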
The hardest part of this work isn't coming up with the math; it's the mental overhead of managing the scratchpad memory and async DMA calls without stepping on your own toes. You spend 3 days debugging a race condition just to find out you got a 2% speedup.
If this tool can actually handle the 'grunt work' of generating the tiling logic and memory moves based on a high-level plan, that’s a game changer. I don't even care about the 17x number as much as I care about the '0 to 1' speed. Getting any performant kernel running on new hardware usually takes weeks. If this cuts it down to a few hours of LLM churning, that's huge for the industry.
Like that research that evolved an FPGA where some unconnected parts were crucial for the expected behaviour.
https://www.eetimes.com/whatever-happened-to-evolvable-hardw...
AI has told me it's not raining in my city, and that in fact there was a 0% chance of it that day, as I was looking out my open front door watching a heavy downpour.
We were just a little early, I think.
the 'dropout' on the optimization menu is a pretty neat hack. kinda reminds me how i work when im stuck... 'ok what if i dont unroll this loop, what else can i do?'. forces the search out of local minima. nice to see an AI tool designed around verification (the simulator loop) rather than just hoping the llm guesses right on the first shot.
what are the risks of using these kinds of tools though? Did you get any tricky/silent bugs you had to manually fix?
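fwiw, here is how I picture the menu 'dropout' idea, as a minimal Python sketch. This is my guess at the mechanism, not the paper's implementation: the function name, the drop probability, and the menu entries are all made up for illustration.

```python
import random


def dropout_menu(menu: list[str], drop_prob: float, rng: random.Random) -> list[str]:
    """Randomly hide each optimization from the menu with probability drop_prob.

    Hypothetical helper: the planner only sees the surviving options, so it
    cannot keep choosing the same favourite transformation every round.
    """
    kept = [opt for opt in menu if rng.random() >= drop_prob]
    # Assumption: never hand the planner an empty menu; keep one random option.
    return kept if kept else [rng.choice(menu)]


# Made-up menu entries, purely for illustration.
menu = ["unroll_loop", "tile_matmul", "double_buffer_dma", "fuse_ops"]
rng = random.Random(0)  # seeded so a run is reproducible
visible = dropout_menu(menu, drop_prob=0.5, rng=rng)
```

Each search step would then pick only from `visible`, which is the 'ok what if i dont unroll this loop' move done automatically.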
Maybe if we invest $100 trillion in data centers, we can rewrite the Linux Kernel in Malbolge.
If a beam search with an iterative plan-and-execute phase is more effective than having better tooling in a deterministic programming language, then this will clearly take the lead.
qat321•2h ago