Show HN: RFX-Fuse: Breiman and Cutler's Random Forest + Explainable Similarity

https://github.com/chriskuchar/RFX-Fuse
1•ck33•2h ago

Comments

ck33•2h ago
Hi HN, I'm the author.

Breiman and Cutler's original Random Forest implementation (early 2000s) included far more than modern libraries provide: classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization. When scikit-learn implemented Random Forests, it included only classification, regression, and overall permutation importance.

I wanted to fix that, so I implemented the full original vision and extended it with something new: Native Explainable Similarity. The model can now answer "what makes this sample similar to its neighbors?" directly via Proximity Importance. As far as I am aware, this is the first native explainable similarity method in ML.
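
For readers new to the proximity idea this builds on: in the classic Breiman-Cutler formulation, the proximity between two samples is the fraction of trees in which both land in the same terminal node. Below is a minimal sketch of that classic calculation only. It is not RFX-Fuse's API and does not implement Proximity Importance; the `leaves` matrix is a hypothetical toy example, and the function name `proximity_matrix` is made up for illustration.

```python
import numpy as np

def proximity_matrix(leaves):
    """Classic forest proximity: share of trees in which two samples co-occur in a leaf.

    leaves: integer array of shape (n_samples, n_trees), where leaves[i, t]
    is the id of the terminal node that sample i reaches in tree t.
    """
    leaves = np.asarray(leaves)
    # Compare every pair of samples tree by tree, then average over trees.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)

# Hypothetical leaf assignments: 4 samples, 3 trees.
leaves = np.array([
    [1, 4, 2],
    [1, 4, 3],
    [2, 4, 3],
    [2, 5, 7],
])

prox = proximity_matrix(leaves)

# Rank the other samples by proximity to sample 0, most similar first.
order = np.argsort(-prox[0], kind="stable")
neighbors = order[order != 0]
```

With scikit-learn, a matrix like `leaves` can be obtained from a fitted forest via `forest.apply(X)`, which returns the leaf index of each sample in each tree. Note that this sketch compares all samples in all trees; variants that restrict the count to out-of-bag samples also exist.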

The practical result is that a single trained model (one set of trees grown once) can now stand in for setups that would otherwise require 3-5 separate tools. For example, a recommender system that would normally need FAISS + XGBoost + SHAP + Isolation Forests + custom code can be built with 1-2 RFX-Fuse models.

Special thanks to Dr. Adele Cutler for sharing original Breiman-Cutler source materials, which made this possible.

Written in C++/CUDA with Python bindings. GPU and CPU versions available.

Happy to answer questions about the implementation or methodology.