Show HN: Model2vec-Rs – Fast Static Text Embeddings in Rust

https://github.com/MinishLab/model2vec-rs
47•Tananon•9h ago
Hey HN! We’ve just open-sourced model2vec-rs, a Rust crate for loading and running Model2Vec static embedding models with no Python dependency. It lets you embed text at (very) high throughput, for example in a Rust-based microservice or CLI tool. You can use it for semantic search, retrieval, RAG, or any other text embedding use case.

Main Features:

- Rust-native inference: Load any Model2Vec model from Hugging Face or your local path with StaticModel::from_pretrained(...).

- Tiny footprint: The crate itself is only ~1.7 MB, with embedding models between 7 and 30 MB.
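
For readers new to static embeddings, here is a minimal sketch of the general idea: each token maps to one fixed vector, and a text embedding is the mean of its token vectors, with no attention layers involved. This is not the crate's code. `ToyStaticModel`, its three-word vocabulary, and the whitespace tokenizer are hypothetical stand-ins chosen for illustration; real models use a subword tokenizer and a learned table.

```rust
use std::collections::HashMap;

/// Toy static embedding model: one fixed vector per token.
/// The vocabulary and vector values are made up for illustration.
struct ToyStaticModel {
    table: HashMap<String, Vec<f32>>,
    dim: usize,
}

impl ToyStaticModel {
    fn new() -> Self {
        let mut table = HashMap::new();
        table.insert("rust".to_string(), vec![1.0, 0.0, 0.0]);
        table.insert("is".to_string(), vec![0.0, 1.0, 0.0]);
        table.insert("fast".to_string(), vec![0.0, 0.0, 1.0]);
        ToyStaticModel { table, dim: 3 }
    }

    /// Look up each known token and average the vectors (mean pooling).
    /// Unknown tokens are skipped; if no token is known, the result is all zeros.
    fn encode(&self, text: &str) -> Vec<f32> {
        let lowered = text.to_lowercase();
        let mut sum = vec![0.0f32; self.dim];
        let mut count = 0usize;
        for token in lowered.split_whitespace() {
            if let Some(vector) = self.table.get(token) {
                for (s, v) in sum.iter_mut().zip(vector.iter()) {
                    *s += *v;
                }
                count += 1;
            }
        }
        if count > 0 {
            for s in sum.iter_mut() {
                *s /= count as f32;
            }
        }
        sum
    }
}

fn main() {
    let model = ToyStaticModel::new();
    println!("{:?}", model.encode("Rust fast"));
    println!("{:?}", model.encode("completely unknown words"));
}
```

Because encoding is only a table lookup and an average, the cost per text grows with the number of tokens rather than with a stack of transformer layers, which is where the throughput comes from.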

Performance:

We benchmarked single-threaded on a CPU:

- Python: ~4650 embeddings/sec

- Rust: ~8000 embeddings/sec (~1.7× speedup)
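
As a rough sketch of how numbers like these can be measured and compared, the snippet below times a single-threaded loop and computes the ratio of the two reported throughputs. It is not the benchmark we ran: the `throughput` and `speedup` helpers are hypothetical names, and the loop body is a placeholder workload standing in for a real embedding call.

```rust
use std::time::{Duration, Instant};

/// Items processed per second for a given count and elapsed wall-clock time.
fn throughput(count: usize, elapsed: Duration) -> f64 {
    count as f64 / elapsed.as_secs_f64()
}

/// Ratio between a new throughput and a baseline throughput.
fn speedup(new: f64, baseline: f64) -> f64 {
    new / baseline
}

fn main() {
    // Placeholder inputs; a real benchmark would use a fixed text corpus.
    let sentences: Vec<String> = (0..10_000)
        .map(|i| format!("sentence number {i}"))
        .collect();

    let start = Instant::now();
    let mut total_tokens = 0usize;
    for sentence in &sentences {
        // Placeholder workload standing in for an embedding call.
        total_tokens += sentence.split_whitespace().count();
    }
    let elapsed = start.elapsed();

    // Printing the accumulated value keeps the loop from being optimized away.
    println!("processed {total_tokens} tokens");
    println!("{:.0} items/sec", throughput(sentences.len(), elapsed));

    // Ratio of the two figures reported above.
    println!("speedup: {:.2}x", speedup(8000.0, 4650.0));
}
```

With the reported figures, 8000 / 4650 comes out at about 1.72, which matches the ~1.7× above.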

First open-source project in Rust for us, so would be great to get some feedback!

Comments

noahbp•6h ago
What is your preferred static text embedding model?

For someone looking to build a large embedding search, fast static embeddings seem like a good deal, but almost too good to be true. What quality tradeoff are you seeing with these models versus embedding models with attention mechanisms?

Tananon•6h ago
It depends a bit on the task and language, but my go-to is usually minishlab/potion-base-8M for every task except retrieval (classification, clustering, etc.). For retrieval, minishlab/potion-retrieval-32M works best. If performance is critical, minishlab/potion-base-32M is best, although it's a bit bigger (~100 MB).

There's definitely a quality trade-off. We have extensive benchmarks here: https://github.com/MinishLab/model2vec/blob/main/results/REA.... potion-base-32M reaches ~92% of the performance of MiniLM while being much faster (about 70x faster on CPU). It depends a bit on your constraints: if you have limited hardware and need very high throughput, these models still let you produce decent-quality embeddings. Of course, an attention-based model will be better, but also more expensive.

refulgentis•2h ago
Thanks man this is incredible work, really appreciate the details you went into.

I've been wondering whether some miracle could make embeddings 10x faster for my search app, which uses minilmv3. Sounds like there is :) I never would have dreamed. I'll definitely be trying potion-base in my library for Flutter x ONNX.

EDIT: I was thanking you for thorough benchmarking, then it dawned on me you were on the team that built the model - fantastic work, I can't wait to try this. And you already have ONNX!

EDIT2: Craziest demo I've seen in a while. I'm seeing 23x faster, after 10 minutes of work.

Havoc•6h ago
Surprised it is so much faster. I would have thought the Python one is C under the hood.

Tananon•6h ago
Indeed, I also didn't expect it to be so much faster! I think it's because most of the time is actually spent on tokenization (which also happens in Rust in the Python package), but there is some transfer overhead there between Rust and Python. The other operations should be the same speed I think.
