On a 2025 hold-out (~61,000 hours), it beats the operators' own day-ahead submissions to EIA (the production forecasts they use to schedule generation) on 6 of 7 major RTOs. Macro-averaged MAE is ~40% lower than the operators'. The one loss is ISO-NE, whose forecasting is just very good (24h-ahead MASE 0.34). On the same window, CAISO and SPP operator submissions did worse than a "same as yesterday" baseline.
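For anyone unfamiliar with the metrics, here is a minimal sketch of how MASE and macro MAE can be computed. It assumes MASE is scaled by a 24-hour "same as yesterday" forecast scored on the same window; the post does not spell out the exact definition, so treat that as an assumption. The function names and toy numbers are made up for illustration and are not taken from the surge repo.

```python
from statistics import fmean

def mae(actual, predicted):
    """Mean absolute error over paired hourly values."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))

def mase(actual, predicted, lag=24):
    """Forecast MAE divided by the MAE of a lag-`lag` naive forecast.

    Assumption: the naive baseline is "same hour, `lag` hours earlier",
    and both forecasts are scored on hours lag..end of the window.
    """
    targets = actual[lag:]
    naive = actual[:-lag]  # value observed `lag` hours before each target
    return mae(targets, predicted[lag:]) / mae(targets, naive)

def macro_mae(per_region):
    """Unweighted mean of per-region MAEs, so large grids do not dominate."""
    return fmean(mae(a, p) for a, p in per_region.values())

# Toy data (not real load): three days of hourly values that rise by
# 5 units per day, so "same as yesterday" is always off by exactly 5.
actual = [100 + 10 * (h % 24) + 5 * (h // 24) for h in range(72)]
good = [a + 2 for a in actual]  # hypothetical forecast, off by 2 every hour
bad = [a - 8 for a in actual]   # hypothetical forecast, off by 8 every hour

print(mase(actual, good))  # 0.4 -> better than "same as yesterday"
print(mase(actual, bad))   # 1.6 -> worse than "same as yesterday"
print(macro_mae({"A": (actual, good), "B": (actual, bad)}))  # 5.0
```

Under this definition a MASE below 1 means the forecast beats the naive baseline, and a value above 1 means it does worse than repeating yesterday.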
The site plots the median forecast and its 80% prediction-interval band against the operator submission, with 48h of actuals leading into the forecast.
The code is on GitHub, the model is on Hugging Face, and the operator-comparison benchmark reproduces from one script:
- https://github.com/tylergibbs1/surge
- https://huggingface.co/Tylerbry1/surge-fm-v3