Breiman and Cutler's original Random Forest implementation (early 2000s) included far more than modern libraries provide: classification, regression, unsupervised learning, proximity-based similarity, outlier detection, missing value imputation, and visualization. When scikit-learn implemented Random Forests, it included only classification, regression, and overall permutation importance.
I wanted to fix that, so I implemented the full original vision and extended it with something new: Native Explainable Similarity. The model can now answer "what makes this sample similar to its neighbors?" directly via Proximity Importance. As far as I am aware, this is the first native explainable similarity in ML.
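For readers who have not met forest proximities before, here is a minimal sketch of the classic idea: two samples count as similar in proportion to the number of trees in which they land in the same leaf. This is not the RFX-Fuse API. It uses scikit-learn's `apply` method to read leaf indices, and the helper name `forest_proximity` and the Iris data are only assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def forest_proximity(forest, X):
    """Fraction of trees in which each pair of samples shares a leaf."""
    leaves = forest.apply(X)  # shape (n_samples, n_trees): leaf index per tree
    n_samples, n_trees = leaves.shape
    prox = np.zeros((n_samples, n_samples))
    for t in range(n_trees):
        # Pairwise comparison of leaf ids within tree t.
        prox += leaves[:, t][:, None] == leaves[:, t][None, :]
    return prox / n_trees


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
prox = forest_proximity(forest, X)

# Five most similar samples to sample 0 by proximity, excluding itself.
neighbors = [int(i) for i in np.argsort(-prox[0]) if i != 0][:5]
print(neighbors)
```

This only gives the similarity itself. Explaining which features drive a given similarity is the part the post describes as new, and the sketch does not attempt it.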
The practical result is that a single trained model (one set of trees grown once) is now a comparable alternative to setups that require 3-5 separate tools. For example, a recommender system that would normally need FAISS + XGBoost + SHAP + Isolation Forest + custom code can be built with 1-2 RFX-Fuse models.
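To make the "one set of trees, several tasks" point concrete, here is a rough sketch in which a forest trained once for classification is reused to rank outliers from its proximities. It is written against scikit-learn, not RFX-Fuse. The function `outlier_scores` is a hypothetical helper that follows the class-wise outlier measure as it is commonly described for the original Random Forests, and the choice of the median absolute deviation for scaling is my assumption.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Task 1: ordinary classification with the trained forest.
predictions = forest.predict(X)

# Proximity: share of trees in which two samples end up in the same leaf.
leaves = forest.apply(X)  # shape (n_samples, n_trees)
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)


def outlier_scores(prox, y):
    """Class-wise outlier measure: large when a sample is far from its own class."""
    n = len(y)
    scores = np.zeros(n)
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        within = prox[np.ix_(idx, idx)] ** 2
        np.fill_diagonal(within, 0.0)  # ignore each sample's proximity to itself
        raw = n / np.maximum(within.sum(axis=1), 1e-12)
        med = np.median(raw)
        mad = np.median(np.abs(raw - med))  # assumed scaling: median absolute deviation
        scores[idx] = (raw - med) / max(mad, 1e-12)
    return scores


# Task 2: outlier ranking from the same trees, with no second model.
scores = outlier_scores(prox, y)
top_outliers = np.argsort(-scores)[:5]
print(top_outliers)
```

The same `prox` matrix could also back nearest-neighbor lookup, which is the role FAISS plays in the multi-tool setup mentioned above, though at a very different scale.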
Special thanks to Dr. Adele Cutler for sharing original Breiman-Cutler source materials, which made this possible.
Written in C++/CUDA with Python bindings. GPU and CPU versions available.
Happy to answer questions about the implementation or methodology.
ck33•2h ago