frontpage.

P2P crypto exchange development company

1•sonniya•2m ago•0 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
1•jesperordrup•7m ago•0 comments

Write for Your Readers Even If They Are Agents

https://commonsware.com/blog/2026/02/06/write-for-your-readers-even-if-they-are-agents.html
1•ingve•7m ago•0 comments

Knowledge-Creating LLMs

https://tecunningham.github.io/posts/2026-01-29-knowledge-creating-llms.html
1•salkahfi•8m ago•0 comments

Maple Mono: Smooth your coding flow

https://font.subf.dev/en/
1•signa11•15m ago•0 comments

Sid Meier's System for Real-Time Music Composition and Synthesis

https://patents.google.com/patent/US5496962A/en
1•GaryBluto•22m ago•1 comments

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
3•keepamovin•23m ago•1 comments

Show HN: Empusa – Visual debugger to catch and resume AI agent retry loops

https://github.com/justin55afdfdsf5ds45f4ds5f45ds4/EmpusaAI
1•justinlord•26m ago•0 comments

Show HN: Bitcoin wallet on NXP SE050 secure element, Tor-only open source

https://github.com/0xdeadbeefnetwork/sigil-web
2•sickthecat•28m ago•1 comments

White House Explores Opening Antitrust Probe on Homebuilders

https://www.bloomberg.com/news/articles/2026-02-06/white-house-explores-opening-antitrust-probe-i...
1•petethomas•28m ago•0 comments

Show HN: MindDraft – AI task app with smart actions and auto expense tracking

https://minddraft.ai
2•imthepk•33m ago•0 comments

How do you estimate AI app development costs accurately?

1•insights123•34m ago•0 comments

Going Through Snowden Documents, Part 5

https://libroot.org/posts/going-through-snowden-documents-part-5/
1•goto1•35m ago•0 comments

Show HN: MCP Server for TradeStation

https://github.com/theelderwand/tradestation-mcp
1•theelderwand•38m ago•0 comments

Canada unveils auto industry plan in latest pivot away from US

https://www.bbc.com/news/articles/cvgd2j80klmo
3•breve•39m ago•1 comments

The essential Reinhold Niebuhr: selected essays and addresses

https://archive.org/details/essentialreinhol0000nieb
1•baxtr•41m ago•0 comments

Rentahuman.ai Turns Humans into On-Demand Labor for AI Agents

https://www.forbes.com/sites/ronschmelzer/2026/02/05/when-ai-agents-start-hiring-humans-rentahuma...
1•tempodox•43m ago•0 comments

StovexGlobal – Compliance Gaps to Note

1•ReviewShield•46m ago•1 comments

Show HN: Afelyon – Turns Jira tickets into production-ready PRs (multi-repo)

https://afelyon.com/
1•AbduNebu•47m ago•0 comments

Trump says America should move on from Epstein – it may not be that easy

https://www.bbc.com/news/articles/cy4gj71z0m0o
6•tempodox•48m ago•3 comments

Tiny Clippy – A native Office Assistant built in Rust and egui

https://github.com/salva-imm/tiny-clippy
1•salvadorda656•52m ago•0 comments

LegalArgumentException: From Courtrooms to Clojure – Sen [video]

https://www.youtube.com/watch?v=cmMQbsOTX-o
1•adityaathalye•55m ago•0 comments

US moves to deport 5-year-old detained in Minnesota

https://www.reuters.com/legal/government/us-moves-deport-5-year-old-detained-minnesota-2026-02-06/
8•petethomas•58m ago•3 comments

If you lose your passport in Austria, head for McDonald's Golden Arches

https://www.cbsnews.com/news/us-embassy-mcdonalds-restaurants-austria-hotline-americans-consular-...
1•thunderbong•1h ago•0 comments

Show HN: Mermaid Formatter – CLI and library to auto-format Mermaid diagrams

https://github.com/chenyanchen/mermaid-formatter
1•astm•1h ago•0 comments

RFCs vs. READMEs: The Evolution of Protocols

https://h3manth.com/scribe/rfcs-vs-readmes/
3•init0•1h ago•1 comments

Kanchipuram Saris and Thinking Machines

https://altermag.com/articles/kanchipuram-saris-and-thinking-machines
1•trojanalert•1h ago•0 comments

Chinese chemical supplier causes global baby formula recall

https://www.reuters.com/business/healthcare-pharmaceuticals/nestle-widens-french-infant-formula-r...
2•fkdk•1h ago•0 comments

I've used AI to write 100% of my code for a year as an engineer

https://old.reddit.com/r/ClaudeCode/comments/1qxvobt/ive_used_ai_to_write_100_of_my_code_for_1_ye...
2•ukuina•1h ago•1 comments

Looking for 4 Autistic Co-Founders for AI Startup (Equity-Based)

1•au-ai-aisl•1h ago•1 comments

Show HN: TabPFN-2.5 – SOTA foundation model for tabular data

https://priorlabs.ai/technical-reports/tabpfn-2-5-model-report
73•onasta•3mo ago
I am excited to announce the release of TabPFN-2.5, our tabular foundation model that now scales to datasets of up to 50,000 samples and 2,000 features - a 5x increase over TabPFN v2, which was published in Nature earlier this year. TabPFN-2.5 delivers state-of-the-art predictions across classification and regression tasks in one forward pass, without hyperparameter tuning.

What’s new in 2.5: TabPFN-2.5 keeps the core approach of v2 - a transformer pretrained on more than a hundred million synthetic datasets to perform in-context learning and output a predictive distribution for the test data. It natively supports missing values, categorical features, text, and numerical features, and is robust to outliers and uninformative features.
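
For a sense of the workflow, here is a minimal sketch of the scikit-learn-style interface (class name taken from the linked PriorLabs/TabPFN package; the toy dataset and defaults are illustrative, not a benchmark):

    # Minimal sketch: one-forward-pass prediction with the local TabPFN package.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from tabpfn import TabPFNClassifier  # pip install tabpfn

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = TabPFNClassifier()           # defaults, no hyperparameter tuning
    clf.fit(X_train, y_train)          # "fit" stores the training context
    proba = clf.predict_proba(X_test)  # one forward pass -> predictive distribution
    print(proba[:3])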

The major improvements:

- 5x scale increase: Now handles 50,000 samples × 2,000 features (up from 10,000 × 500 in v2)

- SOTA performance: TabPFN-2.5 outperforms tuned tree-based methods and matches the performance of a complex ensemble (AutoGluon 1.4, which itself includes TabPFN v2) tuned for 4 hours. Tuning TabPFN-2.5 improves performance further and outperforms AutoGluon 1.4 on regression tasks.

- Rebuilt API: A new REST interface and Python SDK with dedicated fit and predict endpoints, making deployment and integration more developer-friendly

- Distillation: An engine that converts TabPFN-2.5 into a compact MLP or tree ensemble while preserving accuracy and offering low-latency inference (the general idea is sketched just below)
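
The licensed distillation engine itself isn't public, but the general idea can be illustrated with plain soft-label distillation into a small student model. The helper below is hypothetical and illustrative only, not the actual engine:

    # Illustrative only: generic soft-label distillation into a small MLP student.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def distill_to_mlp(teacher, X_unlabeled, hidden=(64, 64)):
        # Regress a small MLP onto the teacher's class probabilities (soft labels).
        soft_targets = teacher.predict_proba(X_unlabeled)
        student = MLPRegressor(hidden_layer_sizes=hidden, max_iter=2000)
        student.fit(X_unlabeled, soft_targets)
        return student

    # Usage, with `clf` being a fitted TabPFNClassifier as in the earlier snippet:
    #   student = distill_to_mlp(clf, X_train)
    #   approx_proba = np.clip(student.predict(X_test), 0.0, 1.0)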

There are still some limitations. The model is designed for datasets up to 50K samples. It can handle larger datasets, but that hasn’t been our focus with TabPFN-2.5. The distillation engine is not yet available through the API, only through licenses (though we do show its performance in the model report).

We’re actively working on removing these limitations and intend to release newer models focused on context reasoning, causal inference, graph networks, larger data and time-series. TabPFN-2.5 is available via API and a package on Hugging Face. Would love for you to try it and give us your feedback!
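
For the hosted route, the client repo linked below advertises a drop-in, scikit-learn-compatible interface; the sketch below assumes that interface and an `init()` login helper, so treat the exact import path and auth call as assumptions and check the quickstart docs:

    # Hedged sketch of the hosted API via the Python client package.
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split
    import tabpfn_client
    from tabpfn_client import TabPFNRegressor  # pip install tabpfn-client

    X, y = make_regression(n_samples=500, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    tabpfn_client.init()        # one-time login / token setup (assumed helper name)
    reg = TabPFNRegressor()
    reg.fit(X_train, y_train)   # sent to the hosted fit endpoint
    print(reg.predict(X_test)[:5])  # inference runs server-side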

Model report: https://priorlabs.ai/technical-reports/tabpfn-2-5-model-repo...

Package: https://github.com/PriorLabs/TabPFN

Client: https://github.com/PriorLabs/tabpfn-client

Docs: https://docs.priorlabs.ai/quickstart

Comments

klemens_floege•3mo ago
Good stuff!
zurfer•3mo ago
The current go-to solution for the kinds of problems that TabPFN is solving would be something like XGBoost. In general it's a good baseline, but the challenge is always that you need to spend a lot of time on feature engineering and tweaking the data representation before something like XGBoost can deliver good performance on your regression or classification problems.

For me the promise of foundation models for tabular data is that there are enough generalizable patterns, so that you need less manual feature engineering and data cleaning.

And kudos to the team, I think it's a really creative application of neural networks. I was always frustrated with neural networks, since they were hard to tune on "structured" data and always under-performed (for me), but we also never had real foundational models for structured data.
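
A minimal, untuned version of the comparison described above might look like the sketch below (illustrative only, assuming both packages are installed; no results are implied):

    # Untuned XGBoost baseline vs. the single-forward-pass TabPFN default.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score
    from xgboost import XGBClassifier    # pip install xgboost
    from tabpfn import TabPFNClassifier  # pip install tabpfn

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for name, model in [("xgboost (untuned)", XGBClassifier(eval_metric="logloss")),
                        ("tabpfn (default)", TabPFNClassifier())]:
        model.fit(X_tr, y_tr)
        auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
        print(f"{name}: test AUC = {auc:.3f}")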

noahho•3mo ago
Less feature engineering is definitely something we are aiming for. The current version is actually based only on statistics; the real-world connections between features are something we're working on right now and hope to show results for soon. That's the next step.
dill_1•3mo ago
Tabular data is still underrated!
noahho•3mo ago
When we released TabPFNv1 over three years ago, I didn't expect the hundreds of comments and reposts we would see. Tabular data had been a field getting little love from AI research, but we immediately felt that this was a topic that data scientists, scientists, financial analysts, and enterprise users deeply cared about. Glad it's useful to people!
abracos•3mo ago
How does it compare to AutoML tools?
noahho•3mo ago
The TabPFN-2.5 default (one forward pass) matches AutoGluon 1.4 tuned for four hours. AutoGluon is the strongest AutoML system, including stacking of XGBoost and CatBoost, and it even includes the previous TabPFNv2.
TheTaytay•3mo ago
Looks really cool. Reading through the FAQ, I saw this: Q: "How are text features handled?" A: "In the local package version text features are encoded as categoricals without considering their semantic meaning. Our API automatically detects text features and includes their semantic meaning into our prediction. The local package version encodes text as numerical categories and does not include semantic meaning."

So that means that automatic embedding/semantic meaning is reserved for API use of TabPFN, right? Otherwise, if I use it locally, it's going to assign each of my distinct text values an arbitrary int, right?

noahho•3mo ago
Yes, exactly, the API is the best way to handle text features. The actual semantics often matter a lot. Is the API an option for you, or would you need this locally?
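For concreteness, the sketch below illustrates what the local categorical encoding described in the FAQ amounts to; it is not the package's actual internals, just the behaviour as described:

    # Each distinct string becomes an arbitrary integer, so "nurse" is no
    # closer to "data scientist" than it is to "plumber" - no semantics.
    import pandas as pd

    df = pd.DataFrame({"job_title": ["nurse", "data scientist", "nurse", "plumber"]})
    df["job_title_code"] = df["job_title"].astype("category").cat.codes
    print(df)

    # The hosted API, by contrast, is said to embed the strings so that their
    # semantic similarity can inform the prediction.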
vessenes•3mo ago
I think you need a custom benchmark -- have you considered making one out of the Excel World Championships?
scorpion7•3mo ago
It's fascinating how this works with such a small model, especially given that the training is a kind of meta-learning of "how to do in-context learning". I wonder, is there a good intuition for the role of the MLP in this architecture? For LLMs the consensus seems to be that they store knowledge... what would that be for tabular data?
aitchnyu•3mo ago
Are applications using table models supposed to load the entire thing into context? As a CRUD app guy, I either ask AI to read the entire table if it's small or use/make scripts to analyze it if it's big.
enigmaa99•2mo ago
I have been using it for a year now for benchmarking, and the improvements with 2.5 look massive. A lot of the use cases already discussed in the report will help interdisciplinary domains improve their predictions.