Yes, it was kind of a refactoring for a bigger client and their use case with bigger data. I changed the order of the input data slightly, so it now works more serially than in parallel; yes, it takes less memory and is more cache-friendly. But that's not all.
I also moved it from row-oriented CSV to a column-oriented format. You say good refactor! Yeah, I think so, too. Yes, it uses ClickHouse as the DB: the native ClickHouse column-oriented wire format versus any human-oriented format. Yes, I gave it the ClickHouse C++ source, because the format, however stable, is not documented in the "documentation". Yes, it created a custom-tailored serializer and deserializer, including dictionaries (low-cardinality columns). Yes, I explained what I expected from it.
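To show what a dictionary (low-cardinality) column means in general, here is a minimal Java sketch. It is not the real code and it is not ClickHouse's actual wire layout; the class name `DictionaryColumn` and its methods are made up for illustration. The idea is only that each distinct string is stored once and every row holds a small integer code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of dictionary encoding; NOT the ClickHouse format.
public final class DictionaryColumn {
    private final List<String> dictionary = new ArrayList<>();   // distinct values, stored once
    private final Map<String, Integer> index = new HashMap<>();  // value -> code lookup
    private int[] codes = new int[16];                           // one code per row
    private int size = 0;

    public void append(String value) {
        Integer code = index.get(value);
        if (code == null) {                 // first time we see this value
            code = dictionary.size();
            dictionary.add(value);
            index.put(value, code);
        }
        if (size == codes.length) {         // grow the code array when full
            codes = Arrays.copyOf(codes, size * 2);
        }
        codes[size++] = code;
    }

    public String get(int row) {
        return dictionary.get(codes[row]);
    }

    public int dictionarySize() {
        return dictionary.size();
    }

    public int size() {
        return size;
    }
}
```

With a column like country codes, millions of rows collapse into a handful of dictionary entries plus an array of ints.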
Yes, I asked the LLM not to implement data transfer objects. Instead, it reads directly from the ClickHouse native wire format, without allocations, and writes the ClickHouse native wire format, without allocations. It allocates a little when processing the data itself, but I optimized that too.
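Roughly what "no data transfer objects" looks like, as a minimal Java sketch. The block layout here (a 4-byte row count followed by little-endian 64-bit values) is an assumption I made up for the example, not the real ClickHouse native format, and `InPlaceColumnReader` is a hypothetical name. The point is that values are read straight out of the buffer with absolute gets, and nothing is allocated per row.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration: process a column in place, no per-row objects.
public final class InPlaceColumnReader {

    // Assumed layout at `offset`: int32 rowCount, then rowCount int64 values.
    public static long sumColumn(ByteBuffer block, int offset) {
        if (block.order() != ByteOrder.LITTLE_ENDIAN) {
            throw new IllegalArgumentException("expected a little-endian buffer");
        }
        int rows = block.getInt(offset);          // absolute read, position untouched
        int pos = offset + Integer.BYTES;
        long sum = 0;
        for (int i = 0; i < rows; i++, pos += Long.BYTES) {
            sum += block.getLong(pos);            // primitive read, no boxing, no DTO
        }
        return sum;
    }
}
```

A DTO-based version would first materialize a list of row objects and only then do the arithmetic; here the buffer itself is the data structure.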
The code made a few passes over the data. I asked the LLM to perform loop fusion and do it all in one pass. For a human, that would complicate the code, which is why it was not done before, but the client is important, you know.
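For anyone who has not met loop fusion, a toy Java sketch (all names are hypothetical, nothing here is from the real code). The unfused version makes three passes and needs a temporary array; the fused version computes the same result in one pass with no temporary.

```java
// Hypothetical illustration of loop fusion on a toy computation.
public final class LoopFusion {

    // Three passes: scale into a temporary array, then sum it, then find its max.
    public static long[] sumAndMaxUnfused(int[] values, int factor) {
        long[] scaled = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = (long) values[i] * factor;
        }
        long sum = 0;
        for (long s : scaled) {
            sum += s;
        }
        long max = Long.MIN_VALUE;
        for (long s : scaled) {
            max = Math.max(max, s);
        }
        return new long[] {sum, max};
    }

    // One pass: every element is touched once and no temporary array is needed.
    public static long[] sumAndMaxFused(int[] values, int factor) {
        long sum = 0;
        long max = Long.MIN_VALUE;
        for (int v : values) {
            long s = (long) v * factor;
            sum += s;
            max = Math.max(max, s);
        }
        return new long[] {sum, max};
    }
}
```

On three lines of arithmetic the fused form is still readable. On real processing code with many intertwined steps it gets ugly fast, which is exactly why a human would avoid it.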
It contained a somewhat suboptimal data layout IN ITS CORE, too. I tried and measured several layouts just by changing it (adding or removing a few indirections here and there); all the code using this CORE DATA was adapted automatically, so I got through quite a few iterations that morning, looking for the best one.
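An example of the kind of layout change I mean, as a toy Java sketch. These are not the real layouts, and `LayoutSketch`, `Row` and the method names are hypothetical. Layout A has one heap object per row, so each access follows a pointer; layout B keeps the values in one primitive array, which removes that indirection and is friendlier to the cache.

```java
// Hypothetical illustration: same data, two layouts, one indirection apart.
public final class LayoutSketch {

    // Layout A: array of references to row objects (one indirection per row).
    public static final class Row {
        public final int key;
        public final long value;

        public Row(int key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    public static long sumRows(Row[] rows) {
        long sum = 0;
        for (Row r : rows) {
            sum += r.value;       // follows a reference for every row
        }
        return sum;
    }

    // Layout B: the value column as a flat primitive array (no indirection).
    public static long sumColumn(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;             // sequential reads over contiguous memory
        }
        return sum;
    }
}
```

Which layout wins depends on the access pattern, so the only honest way to choose is to measure, which is what I did.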
Already efficient code became 20 times faster. Not because it was inefficient, but because it was legacy: human-oriented, well designed, and it worked. It was Java, even with fast parser/processor optimizations, reduced allocations, etc. It had been maintained for a long time; it was an asset.
I just applied some transformations to it, in a mostly automatic way. Yes, I can do that again.
This is called supercompilation, guys. It can be automated these days. The legacy code is the original generic program. Just as they synthesized a rocket engine using AI (see the news and pictures from about a year ago), we can synthesize supercompiled programs from a single legacy source, given various boundary conditions.
My congratulations!
NB: Supercompilation is not just a cool word; it's quite an old concept in IT. For example, see https://sites.google.com/site/keldyshscp/Home/supercompilerconcept
PS on the "this code is a liability now" point: nope. The cost of maintaining this code TODAY is radically lower than ever before.