We're the team at Meta open-sourcing MCGrad. We built this because we found that models often look calibrated on global metrics but fail silently on specific data slices (subgroups).
This library provides production-ready implementations of "multicalibration" to detect and fix these local biases.
Unlike standard calibration (which only checks the average), MCGrad optimizes calibration across thousands of potentially overlapping subgroups simultaneously. It's written in Python and designed to scale, and it currently runs in production across hundreds of ML models at Meta!
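To give a flavor of what "calibration across subgroups" means in practice, here is a deliberately simplified toy version of the boosting-style multicalibration idea (this is an illustration of the concept, not MCGrad's API): repeatedly find the (subgroup, score-bucket) cell whose predictions deviate most from the observed label rate, and shift predictions in that cell toward it.

```python
import numpy as np

def multicalibrate(preds, labels, groups, alpha=0.01, max_iters=100):
    """Toy multicalibration post-processing (illustrative, NOT MCGrad's API).

    `groups` is a list of boolean membership masks (possibly overlapping).
    Each round finds the (group, prediction-bucket) cell with the largest
    gap between the observed label rate and the mean prediction, and shifts
    predictions in that cell to close the gap, until all cells are within
    `alpha` or `max_iters` rounds have run.
    """
    preds = preds.astype(float).copy()
    for _ in range(max_iters):
        worst_gap, worst_mask = 0.0, None
        for g in groups:
            for lo in np.arange(0.0, 1.0, 0.1):  # coarse score buckets
                mask = g & (preds >= lo) & (preds < lo + 0.1)
                if mask.sum() < 10:              # skip tiny cells
                    continue
                gap = labels[mask].mean() - preds[mask].mean()
                if abs(gap) > abs(worst_gap):
                    worst_gap, worst_mask = gap, mask
        if worst_mask is None or abs(worst_gap) <= alpha:
            break  # every sufficiently large cell is calibrated within alpha
        preds[worst_mask] = np.clip(preds[worst_mask] + worst_gap, 0.0, 1.0)
    return preds

# Synthetic demo: a constant 0.5 score is globally calibrated
# (overall positive rate ~0.5) but biased on each of two slices.
rng = np.random.default_rng(0)
n = 2_000
group_a = rng.random(n) < 0.5  # hypothetical subgroup mask
labels = (rng.random(n) < np.where(group_a, 0.7, 0.3)).astype(float)
preds = np.full(n, 0.5)

out = multicalibrate(preds, labels, [group_a, ~group_a])
gap_before = abs(labels[group_a].mean() - preds[group_a].mean())
gap_after = abs(labels[group_a].mean() - out[group_a].mean())
```

The real algorithms are considerably more careful than this sketch (statistical validity across thousands of overlapping groups, efficiency at scale), but the core loop of "find the worst-calibrated cell, patch it, repeat" is the intuition.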
It includes:

- Estimators for detecting miscalibration.
- Algorithms to recalibrate predictions (post-processing).
- Tools to visualize where your model is underperforming.
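As a quick sketch of why subgroup-level detection matters (again, a generic illustration rather than the library's own estimators): a standard binned expected calibration error (ECE) can look fine globally while a single slice is badly off.

```python
import numpy as np

def expected_calibration_error(preds, labels, n_bins=10):
    """Binned ECE: bucket by predicted score, then take the sample-weighted
    mean of |observed label rate - mean prediction| over buckets."""
    bins = np.minimum((preds * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            ece += mask.mean() * abs(labels[mask].mean() - preds[mask].mean())
    return ece

# Synthetic example: a constant 0.5 prediction is calibrated on average
# (overall positive rate is ~0.5) but miscalibrated on each slice.
rng = np.random.default_rng(1)
n = 10_000
slice_a = rng.random(n) < 0.5  # hypothetical subgroup mask
labels = (rng.random(n) < np.where(slice_a, 0.8, 0.2)).astype(float)
preds = np.full(n, 0.5)

ece_global = expected_calibration_error(preds, labels)
ece_slice = expected_calibration_error(preds[slice_a], labels[slice_a])
print(f"global ECE = {ece_global:.3f}, slice ECE = {ece_slice:.3f}")
```

Run on one slice at a time, the same metric exposes the failure that the global number hides; the hard part (which the library addresses) is doing this soundly over thousands of overlapping slices at once.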
Docs: https://mcgrad.dev
Repo: https://github.com/facebookincubator/MCGrad/
Happy to answer any questions about the implementation or how we use it!