What is Scaling Mode? It's a new pipeline built around TabPFN-2.5 for large-N workloads that removes TabPFN's fixed row limit. The system works with large training sets, constrained only by your compute and memory.
We benchmarked Scaling Mode on datasets ranging from 1M to 10M rows, comparing against CatBoost, XGBoost, and LightGBM. Key findings:
- Scaling Mode enables TabPFN-2.5 to continue improving performance with more data
- Scales dramatically better than the previous workaround of running TabPFN-2.5 on a random 50K subsample
- No evidence that the performance gap to gradient boosting shrinks as dataset size grows
- Performance continues to improve strongly with more data on the largest tested datasets
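For context, the 50K-subsampling baseline above is the obvious workaround for the old row limit: train on a random 50K subset and discard the rest of the data. A minimal stdlib sketch of that baseline (the function name and constant are ours, for illustration only; the model call itself is out of scope here):

```python
import random

SUBSAMPLE_LIMIT = 50_000  # TabPFN-2.5's native row limit

def subsample_rows(rows, limit=SUBSAMPLE_LIMIT, seed=0):
    """Baseline workaround: keep a random subset capped at `limit` rows.

    Whatever model you then fit (e.g. TabPFN-2.5) only ever sees this
    subset; Scaling Mode instead trains on the full dataset.
    """
    if len(rows) <= limit:
        return list(rows)
    rng = random.Random(seed)  # fixed seed for a reproducible subset
    return rng.sample(rows, limit)

# With 1M rows, the subsampling baseline sees only 5% of the data.
rows = list(range(1_000_000))
subset = subsample_rows(rows)
print(len(subset))  # 50000
```

This is why the baseline plateaus: past 50K rows, additional data never reaches the model.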
To quickly summarise our progress, here's our scaling trajectory so far:
- TabPFN v2 (Jan 2025): 10K rows
- TabPFN-2.5 (Nov 2025): 50K rows
- Scaling Mode (today): 10M+ rows tested
Current limitations: Scaling Mode is aimed at company-scale workloads for now. If you're working with data at this scale and want to test it, access is currently by request so we can support early users properly.
Full blog post: https://priorlabs.ai/technical-reports/large-data-model
Request access: https://priorlabs.ai/tabpfn/large-data
Would love to hear feedback from anyone working with large tabular datasets!