Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•7mo ago
I'm kind of disappointed in this article. Splade is a cool way to improve the results of a TF/IDF index with minimally invasive changes, and the article obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.
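To make the difference concrete, here is a minimal sketch of scoring with and without the weights. The token weights are made up for illustration, and the two function names are hypothetical; real SPLADE vectors come from the model and span the whole vocabulary.

```python
def overlap_score(query_vec, doc_vec):
    """Unweighted match: count the tokens both vectors share, ignoring weights."""
    return len(query_vec.keys() & doc_vec.keys())


def weighted_score(query_vec, doc_vec):
    """Weighted match: sparse dot product over the tokens both vectors share."""
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)


# Made-up token weights, for illustration only.
query = {"search": 2.0, "semantic": 1.5, "retrieval": 0.5}
doc_a = {"search": 0.25, "semantic": 0.25, "retrieval": 0.25}  # many weak matches
doc_b = {"search": 2.0, "semantic": 2.0}                       # fewer, strong matches

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, overlap_score(query, doc), weighted_score(query, doc))
```

With these numbers, token overlap alone ranks doc_a first (three shared tokens against two), while the dot product ranks doc_b first. That ordering information is what gets lost when the weights are discarded.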

Probably easier to demonstrate if you drop down a level to Lucene, I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE which tries to marry Splade and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•7mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll make an update post to try again.
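A rough sketch of what that could look like, as plain request bodies built in Python. This assumes a recent Elasticsearch version that provides the `sparse_vector` field type and the `sparse_vector` query with a `query_vector` parameter; the field name `splade` and the helper names are hypothetical, so check the Elasticsearch docs for your version before relying on it.

```python
import json


def splade_mapping(field="splade"):
    """Index mapping with one sparse vector field (field name is an assumption)."""
    return {"mappings": {"properties": {field: {"type": "sparse_vector"}}}}


def splade_doc(text, token_weights, field="splade"):
    """Document body that keeps each SPLADE token together with its weight."""
    weights = {t: float(w) for t, w in token_weights.items() if w > 0}
    return {"text": text, field: weights}


def splade_query(token_weights, field="splade"):
    """Query body that scores documents against the query's weighted tokens."""
    return {
        "query": {
            "sparse_vector": {"field": field, "query_vector": dict(token_weights)}
        }
    }


# Made-up weights; in practice they would come from the SPLADE model output.
doc = splade_doc("keyword and semantic search", {"search": 1.8, "semantic": 1.2, "the": 0.0})
print(json.dumps(splade_mapping(), indent=2))
print(json.dumps(doc, indent=2))
print(json.dumps(splade_query({"search": 2.0, "retrieval": 0.7}), indent=2))
```

The point is only that both the indexed document and the query carry per-token weights, instead of the bare token list the original post indexed.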
