Performance is achieved through column-wise reordering and parallel algorithms. The sort order is computed using fastutil’s parallel radix sort or quicksort, and the columns are then reordered in parallel as well. When sorting a 59M-row Parquet file generated by TPC-H, Parquet-Sort outperforms DuckDB: it is about 25% faster when compiled with GraalVM native-image and about 12% faster when run on the latest Corretto 24 JVM.
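To make the idea concrete, here is a minimal sketch of "compute the sort order once, then reorder each column in parallel". It is not the project's code: it uses only the JDK (a parallel merge sort over boxed indices) instead of fastutil's primitive sorts, it works on in-memory `long[]` columns rather than Parquet data, and the class and method names (`ColumnSortSketch`, `sortOrder`, `reorder`, `reorderAll`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class ColumnSortSketch {

    /** Returns the permutation that sorts keys ascending; keys is left untouched. */
    static int[] sortOrder(long[] keys) {
        Integer[] idx = new Integer[keys.length];
        Arrays.setAll(idx, i -> i);
        // Compare row indices by the key value they point at.
        Comparator<Integer> byKey = Comparator.comparingLong(i -> keys[i]);
        // JDK parallel sort on boxed indices; a primitive sort would avoid the boxing.
        Arrays.parallelSort(idx, byKey);
        return Arrays.stream(idx).mapToInt(Integer::intValue).toArray();
    }

    /** Builds a new column whose i-th value is column[order[i]], filled in parallel. */
    static long[] reorder(long[] column, int[] order) {
        long[] out = new long[column.length];
        Arrays.parallelSetAll(out, i -> column[order[i]]);
        return out;
    }

    /** Applies the same permutation to every column, one column per parallel task. */
    static List<long[]> reorderAll(List<long[]> columns, int[] order) {
        return columns.parallelStream()
                .map(c -> reorder(c, order))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long[] key = {30L, 10L, 20L};
        long[] other = {300L, 100L, 200L};
        int[] order = sortOrder(key);
        for (long[] col : reorderAll(List.of(key, other), order)) {
            System.out.println(Arrays.toString(col));
        }
    }
}
```

The point of the split is that the comparison work happens once, on the key column only; every other column is then a plain gather by index, which is trivially parallel.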
More details, benchmarks, and source code are available at https://github.com/Earnix/parquet-sort (all code is under the Apache 2.0 license).
Would love feedback, ideas, or contributions.