Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•7mo ago
I'm kind of disappointed in this article. Splade is a cool way to improve the results of a TF/IDF index with minimally invasive changes, and this obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.

It's probably easier to demonstrate if you drop down a level to Lucene; I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE, which tries to marry Splade and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•7mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll make an update post to try again.
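
For readers following the point about weights, here is a minimal sketch of the difference between matching on SPLADE tokens alone and scoring with their weights. It is plain Python, not the article's code and not Elasticsearch or Lucene API; the function names, tokens, and weight values are all made up for illustration, and the weighted score is assumed to be a simple dot product over shared tokens.

```python
# Hypothetical sparse vectors: token -> weight. Values are invented for illustration.
query = {"search": 1.5, "semantic": 1.0, "retrieval": 0.5}
doc_a = {"search": 2.0, "engine": 1.0, "retrieval": 0.2}
doc_b = {"search": 0.1, "retrieval": 0.1, "cooking": 2.0}


def overlap_score(q, d):
    """Count shared tokens, ignoring weights (what you get if weights are discarded)."""
    return len(q.keys() & d.keys())


def weighted_score(q, d):
    """Dot product over shared tokens, so strongly weighted tokens dominate."""
    return sum(w * d[tok] for tok, w in q.items() if tok in d)


for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, overlap_score(query, doc), weighted_score(query, doc))
```

Both documents share the same two tokens with the query, so the unweighted count cannot tell them apart. The weighted score ranks doc_a well above doc_b, because doc_b only mentions those tokens weakly.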
