Show HN: Pyversity – Fast Result Diversification for Retrieval and RAG

https://github.com/Pringled/pyversity
18•Tananon•2h ago
Hey HN! I’ve recently open-sourced Pyversity, a lightweight library for diversifying retrieval results. Most retrieval systems optimize only for relevance, which can lead to top-k results that look almost identical. Pyversity efficiently re-ranks results to balance relevance and diversity, surfacing items that remain relevant but are less redundant. This helps with improving retrieval, recommendation, and RAG pipelines without adding latency or complexity.

Main features:

- Unified API: one function (diversify) supporting several well-known strategies: MMR, MSD, DPP, and COVER (with more to come)

- Lightweight: the only dependency is NumPy, keeping the package small and easy to install

- Fast: efficient implementations for all supported strategies; diversify results in milliseconds

Re-ranking with cross-encoders is very popular right now, but also very expensive. In my experience, you can usually improve retrieval results with simpler and faster methods, such as the ones implemented in this package. These methods help retrieval, recommendation, and RAG systems present richer, more informative results by ensuring each new item adds new information.
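
For readers unfamiliar with MMR (one of the strategies listed above), here is a minimal NumPy sketch of the general greedy idea: at each step, pick the item with the best trade-off between its relevance score and its similarity to items already chosen. This is an illustration only, not Pyversity's implementation or API; the function name `mmr` and the parameter `lambda_` are hypothetical names chosen for this example.

```python
import numpy as np


def mmr(embeddings, scores, k, lambda_=0.5):
    """Greedy Maximal Marginal Relevance (illustrative sketch, not Pyversity's code).

    lambda_ (hypothetical name) weights relevance against redundancy:
    1.0 ranks purely by relevance, 0.0 purely by dissimilarity to earlier picks.
    """
    emb = np.asarray(embeddings, dtype=float)
    rel = np.asarray(scores, dtype=float)

    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.clip(norms, 1e-12, None)

    n = emb.shape[0]
    k = min(k, n)
    selected = []
    available = np.ones(n, dtype=bool)
    # max_sim[i] is the highest similarity between item i and any selected item.
    max_sim = np.full(n, -np.inf)

    for _ in range(k):
        # No redundancy penalty applies before the first item is chosen.
        penalty = max_sim if selected else np.zeros(n)
        candidate = lambda_ * rel - (1.0 - lambda_) * penalty
        candidate[~available] = -np.inf  # never re-select an item
        best = int(np.argmax(candidate))
        selected.append(best)
        available[best] = False
        max_sim = np.maximum(max_sim, emb @ emb[best])

    return selected


# Toy data: items 0 and 1 are near-duplicates, item 2 is different but less relevant.
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
toy_scores = [0.9, 0.85, 0.5]
diverse_top2 = mmr(toy_embeddings, toy_scores, k=2, lambda_=0.5)
relevance_top2 = mmr(toy_embeddings, toy_scores, k=2, lambda_=1.0)
```

On this toy data, the balanced setting skips the near-duplicate in favor of the dissimilar item, while `lambda_=1.0` falls back to plain relevance ranking.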

Code and docs: github.com/pringled/pyversity

Let me know if you have any feedback, or suggestions for other diversification strategies to support!

Comments

leobg•58m ago
Might also be useful for dataset curation, or even just prompt engineering. For example, when training a classifier, you could use it to pick a diverse set of examples for training or evaluation.
Tananon•42m ago
True, I think that's also a great use case! These algorithms likely won't scale to very large datasets (e.g. millions of samples), but for smaller datasets, like fine-tuning sets, I think this would work very well. I've worked on something similar in the past that works for larger datasets (semantic deduplication: https://github.com/MinishLab/semhash).

Do the new obesity drugs pay for themselves?

https://medicalxpress.com/news/2025-10-obesity-drugs-pay.html
1•PaulHoule•1m ago•0 comments

I found the missing 6GB on my Mac (APFS, recovery partitions, and GB vs. GiB)

https://mikenotthepope.com/i-found-the-missing-6gb-on-my-mac-apfs-recovery-partitions-and-gb-vs-gib/
1•MikeNotThePope•2m ago•0 comments

Show HN: The modern flip phone – but it's an iPhone

https://dumbsmartphones.com
1•YPCrumble•3m ago•0 comments

Roast Domains at This Domain Sucks

https://thisdomain.sucks
1•nachoag7•3m ago•0 comments

Creative Disruption in the Order of the World

https://www.noemamag.com/creative-disruption-in-the-order-of-the-world/
1•Brajeshwar•3m ago•0 comments

When Pollution Spikes in Southeast Asia, Rainfall Shifts from Land to Sea

https://e360.yale.edu/digest/southeast-asia-aerosols-rainfall?asds
2•Brajeshwar•3m ago•0 comments

ML-builder: Tool to recreate charts with prompt-friendly inputs

https://ml-builder.vercel.app/
1•samuelleecong•5m ago•1 comment

Ask HN: What are revenue generating side projects you can do utilizing AI?

1•sandboxdev•5m ago•0 comments

AT&T Long Lines – A Forgotten System (2018)

https://personal.garrettfuller.org/blog/2018/01/19/att-long-lines-a-forgotten-system/
1•Bogdanp•6m ago•0 comments

Space Frontiers

https://spacefrontiers.org/
1•alterdaddy•7m ago•0 comments

Robotics Scissors

https://huggingface.co/robotics-course
1•cjbarber•9m ago•0 comments

Yet Another Year with Decker

http://beyondloom.com/blog/unionstate3.html
1•RodgerTheGreat•10m ago•0 comments

America's Rare Earth Delusion

https://www.ft.com/content/583abbd2-ffa8-4232-931f-66f55949b5d5
1•bookofjoe•10m ago•1 comment

Discovery of Paranthropus Hand Changes Understanding of Human Evolution

https://www.haaretz.com/archaeology/2025-10-15/ty-article/paranthropus-boisei-hand-found-for-firs...
1•wslh•11m ago•0 comments

Breakthrough Vitamin K Compounds May Reverse Alzheimer's Damage

https://scitechdaily.com/breakthrough-vitamin-k-compounds-may-reverse-alzheimers-damage/
1•01-_-•12m ago•0 comments

New fossils reveal the hand of Paranthropus boisei

https://www.nature.com/articles/s41586-025-09594-8
1•wslh•13m ago•0 comments

Philco Predicta

https://en.wikipedia.org/wiki/Predicta
1•bariumbitmap•13m ago•0 comments

Judge says body cameras for Chicago officers "was not a suggestion"

https://www.cbsnews.com/chicago/news/judge-homeland-security-federal-agents-chicago-body-cameras/
18•01-_-•14m ago•4 comments

Who Owns RubyGems? [video]

https://www.youtube.com/shorts/a2MYmmHKBWA
1•basileafe•14m ago•1 comment

A firewall for AI agents – early access waitlist (92.2% attack detection acc.)

https://savira.dev/
1•colinlevine•16m ago•0 comments

Windows 11 25H2 October Update Bug Renders Recovery Environment Unusable

https://www.techpowerup.com/342032/windows-11-25h2-october-update-bug-renders-recovery-environmen...
5•MaximilianEmel•20m ago•0 comments

BitLocker encryption permanently locks users' backup drives causing loss of data

https://www.tomshardware.com/software/windows/bitlocker-reportedly-auto-locks-users-backup-drives...
4•josephcsible•22m ago•0 comments

Show HN: Moonfish – AI podcast generator with research, writing, and voicing

https://apps.apple.com/us/app/moonfish-ai/id6748574770
1•huygiab•22m ago•0 comments

Repo to AI: Bash one-liner to concatenate directory contents for LLM

https://firasd.substack.com/p/repo-to-ai-bash-one-liner-to-concatenate
1•firasd•24m ago•0 comments

What Unix Pipelines Got Right (and How We Can Do Better)

https://programmingsimplicity.substack.com/p/what-unix-pipelines-got-right-and
1•FromTheArchives•25m ago•0 comments

Should Pocketbase add support for a native Python SDK?

https://github.com/pocketbase/pocketbase/discussions/7263
1•Olshansky•26m ago•0 comments

I provide technical clarity to non-technical leaders

https://www.seangoedecke.com/clarity/
1•kowalhn•29m ago•0 comments

'Is AI a trillion-dollar bubble or a world-changing juggernaut?'

https://thenewstack.io/is-ai-a-trillion-dollar-bubble-or-a-world-changing-juggernaut/
2•MilnerRoute•30m ago•0 comments

Ask HN: Do any AI Chat apps implement LLM URI scheme?

1•smashah•30m ago•0 comments

How to Build a Low-Tech Website

https://solar.lowtechmagazine.com/2018/09/how-to-build-a-low-tech-website/
2•FromTheArchives•31m ago•0 comments