Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•8mo ago
I'm kind of disappointed in this article. Splade is a cool way to improve the results of a TF/IDF index with minimally invasive changes, and this obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.
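To make the point about weights concrete, here is a minimal sketch in plain Python. It is not the article's get_splade_embedding code or any library API. The function names and the token weights are made up for illustration, and it assumes a SPLADE-style expansion is available as a token-to-weight mapping.

```python
from typing import Dict

# Hypothetical SPLADE-style expansions: token -> weight.
# The numbers are invented for illustration, not real model output.
QUERY: Dict[str, float] = {"search": 2.0, "retrieval": 1.0, "engine": 0.5}
DOC_A: Dict[str, float] = {"search": 0.1, "retrieval": 0.1, "cooking": 2.0}
DOC_B: Dict[str, float] = {"search": 1.5, "engine": 1.0, "index": 0.8}


def binary_overlap(query: Dict[str, float], doc: Dict[str, float]) -> int:
    """Count shared tokens, ignoring weights (what discarding the weights amounts to)."""
    return len(query.keys() & doc.keys())


def weighted_score(query: Dict[str, float], doc: Dict[str, float]) -> float:
    """Sparse dot product: sum of query weight times document weight over shared tokens."""
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


if __name__ == "__main__":
    for name, doc in (("doc_a", DOC_A), ("doc_b", DOC_B)):
        print(name, binary_overlap(QUERY, doc), weighted_score(QUERY, doc))
```

With the weights dropped, both documents share two tokens with the query and tie. With the weights kept, doc_b scores far higher, because its shared tokens carry large weights on both sides.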

It's probably easier to demonstrate if you drop down a level to Lucene. I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE, which tries to marry Splade and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•8mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll make an update post to try again.
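A rough sketch of what the sparse vector approach might look like, written as plain Python dicts so it runs without a cluster. The field name splade_tokens, the helper names, and the weights are assumptions for illustration. The `sparse_vector` field type and the `sparse_vector` query with `query_vector` are available in recent Elasticsearch 8.x releases, but check the docs for the version you run before relying on the exact shape.

```python
import json
from typing import Any, Dict

FIELD = "splade_tokens"  # hypothetical field name


def sparse_vector_mapping(field: str) -> Dict[str, Any]:
    """Index mapping body that declares one sparse_vector field."""
    return {"mappings": {"properties": {field: {"type": "sparse_vector"}}}}


def sparse_vector_document(field: str, weights: Dict[str, float]) -> Dict[str, Any]:
    """Document body: the field holds token -> weight pairs, so no weights are discarded."""
    return {field: dict(weights)}


def sparse_vector_query(field: str, weights: Dict[str, float]) -> Dict[str, Any]:
    """Search body that scores documents against precomputed query token weights."""
    return {"query": {"sparse_vector": {"field": field, "query_vector": dict(weights)}}}


if __name__ == "__main__":
    # Invented weights standing in for real SPLADE output.
    doc_weights = {"search": 1.5, "engine": 1.0, "index": 0.8}
    query_weights = {"search": 2.0, "engine": 0.5}
    print(json.dumps(sparse_vector_mapping(FIELD), indent=2))
    print(json.dumps(sparse_vector_document(FIELD, doc_weights), indent=2))
    print(json.dumps(sparse_vector_query(FIELD, query_weights), indent=2))
```

The bodies would be sent with whatever client you already use. The sketch only shows that both the document side and the query side keep a weight per token.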
