Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•2mo ago

Comments

jbellis•2mo ago
I'm kind of disappointed in this article. SPLADE is a cool way to improve the results of a TF-IDF index with minimally invasive changes, and this obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.

It's probably easier to demonstrate if you drop down a level to Lucene; I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE, which tries to marry SPLADE and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•2mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll make an update post to try again.
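
For readers following along, the point about discarded weights can be illustrated with a rough sketch. This is not the article's code and not an Elasticsearch or Lucene API; the function names and the toy token weights below are made up for illustration. It assumes a SPLADE-style vector can be represented as a plain token-to-weight mapping and that matches are scored with a sparse dot product.

```python
# Hypothetical toy data: SPLADE-style sparse vectors as {token: weight} dicts.
# The tokens and weights are invented for illustration, not model output.

def overlap_score(query: dict[str, float], doc: dict[str, float]) -> int:
    """Score that ignores weights: count the tokens shared by query and doc."""
    return len(query.keys() & doc.keys())


def weighted_score(query: dict[str, float], doc: dict[str, float]) -> float:
    """Sparse dot product: sum of query weight * doc weight over shared tokens."""
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


query = {"search": 2.0, "semantic": 1.0}
doc_weak = {"search": 0.5, "semantic": 0.5}    # both tokens present, low weights
doc_strong = {"search": 2.0, "semantic": 1.5}  # both tokens present, high weights

# Without weights the two documents are indistinguishable.
print(overlap_score(query, doc_weak), overlap_score(query, doc_strong))

# With weights the second document scores clearly higher.
print(weighted_score(query, doc_weak), weighted_score(query, doc_strong))
```

Dropping the weights collapses both documents to the same overlap count, which is the missed opportunity the quoted passage describes.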
