I work on ET-Miner [https://zenodo.org/records/18674353], a GPU-accelerated frequent itemset mining pipeline based on the well-known Apriori algorithm. We came up with the idea of reformulating the algorithm as a fully vectorized implementation, using a boolean transaction matrix representation plus CUDA kernels and a Rust group builder for index construction to speed up computation. The original use case was mining protein structure patterns from AlphaFold, where we processed 109.2M proteins and extracted 16.8 billion frequent itemsets for structural motif discovery. At some point I realized the same pipeline could be applied to any domain with structured categorical data, so I pointed it at poker, one of my long-standing hobbies.
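For anyone wondering what "vectorized Apriori on a boolean transaction matrix" means in practice, here is a rough CPU sketch in NumPy. This is not the ET-Miner code (that uses CUDA kernels and a Rust group builder); the function name `frequent_itemsets` and the `min_support` parameter are made up for illustration. The idea is just that the support of every candidate at a given level can be counted in one array operation instead of looping over transactions.

```python
from itertools import combinations

import numpy as np


def frequent_itemsets(T, min_support):
    """Toy Apriori over a boolean matrix T of shape (n_transactions, n_items).

    Returns {itemset (tuple of column indices): support count}.
    Hypothetical helper for illustration, not part of ET-Miner.
    """
    T = np.asarray(T, dtype=bool)
    result = {}

    # Level 1: the support of each single item is a column sum.
    counts = T.sum(axis=0)
    frequent = [(i,) for i in range(T.shape[1]) if counts[i] >= min_support]
    for (i,) in frequent:
        result[(i,)] = int(counts[i])

    k = 1
    while frequent:
        prev = set(frequent)
        candidates = []
        # Join step: merge two frequent k-itemsets that share their first k-1 items.
        for a, b in combinations(frequent, 2):
            if a[:-1] == b[:-1]:
                c = a + (b[-1],)
                # Prune step: every k-subset of a candidate must itself be frequent.
                if all(s in prev for s in combinations(c, k)):
                    candidates.append(c)
        if not candidates:
            break

        idx = np.array(candidates)  # shape (n_candidates, k + 1)
        # T[:, idx] has shape (n_transactions, n_candidates, k + 1).
        # A transaction contains a candidate iff all of its item columns are True.
        supports = T[:, idx].all(axis=2).sum(axis=0)

        frequent = []
        for c, s in zip(candidates, supports):
            if s >= min_support:
                result[c] = int(s)
                frequent.append(c)
        k += 1

    return result
```

Calling `frequent_itemsets(T, min_support=3)` on a small 0/1 matrix returns every itemset that appears in at least 3 rows, with its exact count. The intermediate array `T[:, idx]` grows with transactions × candidates × itemset size, so a real implementation would need batching or a more compact encoding at scale.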
What we learned: Most of the "surprising" patterns the mining surfaces are things good players already know intuitively: positional advantages, aggression frequency correlations, stack-to-pot ratios. But seeing them as statistically validated itemsets with exact support counts is different from folk wisdom. A few patterns around multi-way pot dynamics and specific board texture interactions were genuinely non-obvious to the poker players we showed them to, and modern GTO solvers have no solutions for these multi-way pot scenarios. Oh, and this one is controversial: donk-betting shows up as a ~40% winner-exclusive rule.
The data is completely free to explore. In total, 1.4M rules have been mined from the PHH dataset published here: https://zenodo.org/records/13997158.