Main features:
- Unified API: one function (diversify) supporting several well-known strategies: MMR, MSD, DPP, and COVER (with more to come)
- Lightweight: the only dependency is NumPy, keeping the package small and easy to install
- Fast: every supported strategy has an efficient implementation, so diversifying a result set takes milliseconds
Re-ranking with cross-encoders is very popular right now, but it is also very expensive. In my experience, you can usually improve retrieval results with simpler and faster methods, such as the ones implemented in this package. Diversification helps retrieval, recommendation, and RAG systems present richer, more informative results by ensuring that each new item adds new information.
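To make "each new item adds new information" concrete, here is a minimal NumPy sketch of greedy MMR, one of the strategies listed above. It is a standalone illustration and not the package's implementation: the function name `mmr_select`, the `lambda_` trade-off parameter, and the toy data are all assumptions made up for this example.

```python
import numpy as np


def mmr_select(embeddings, scores, k, lambda_=0.5):
    """Greedy MMR sketch (hypothetical helper, not part of any library).

    lambda_ = 1.0 ranks purely by relevance; lower values penalize items
    that are similar to something already selected.
    """
    emb = np.asarray(embeddings, dtype=float)
    rel = np.asarray(scores, dtype=float)

    # Normalize rows so that the dot product is cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.clip(norms, 1e-12, None)
    sim = emb @ emb.T

    n = rel.shape[0]
    k = min(k, n)
    selected = []
    available = np.ones(n, dtype=bool)
    max_sim = np.full(n, -np.inf)  # highest similarity to any selected item

    for _ in range(k):
        if selected:
            mmr = lambda_ * rel - (1.0 - lambda_) * max_sim
        else:
            # Nothing selected yet, so the first pick is the most relevant item.
            mmr = rel.copy()
        mmr[~available] = -np.inf  # never pick the same item twice
        best = int(np.argmax(mmr))
        selected.append(best)
        available[best] = False
        max_sim = np.maximum(max_sim, sim[best])

    return selected


# Toy data: items 0 and 1 are near-duplicates, item 2 is different but less relevant.
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
toy_scores = [0.9, 0.85, 0.5]

print(mmr_select(toy_embeddings, toy_scores, k=2, lambda_=1.0))  # [0, 1]
print(mmr_select(toy_embeddings, toy_scores, k=2, lambda_=0.5))  # [0, 2]
```

With `lambda_=1.0` the near-duplicate takes the second slot; with `lambda_=0.5` it is skipped in favor of the item that covers something new. The whole loop is a handful of array operations, which is why this kind of re-ranking is so much cheaper than running a cross-encoder over every candidate.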
Code and docs: github.com/pringled/pyversity
Let me know if you have any feedback, or suggestions for other diversification strategies to support!