
Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•8mo ago
I'm kind of disappointed in this article. Splade is a cool way to improve the results of a TF-IDF index with minimally invasive changes, and the article obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.
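To make the point about weights concrete, here is a minimal sketch of why they matter for scoring. It compares a plain token-overlap count (what you get if the weights are discarded) with a dot product over token weights. The function names, tokens, and weights are all made up for illustration and do not come from the article or from a SPLADE model.

```python
def sparse_dot(query, doc):
    """Score a document as the dot product of two token -> weight maps."""
    # Iterate over the smaller map; only tokens present in both contribute.
    small, large = (query, doc) if len(query) <= len(doc) else (doc, query)
    return sum(w * large[t] for t, w in small.items() if t in large)


def overlap_count(query, doc):
    """Score a document by the number of shared tokens, ignoring weights."""
    return len(set(query) & set(doc))


# Hypothetical token weights, chosen only to illustrate the effect.
query = {"search": 2.0, "retrieval": 1.0}
doc_a = {"search": 0.5, "retrieval": 0.5, "engine": 1.0}
doc_b = {"search": 2.0, "index": 1.0}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, overlap_count(query, doc), sparse_dot(query, doc))
```

With these numbers, doc_a wins on overlap alone because it shares two tokens with the query, while doc_b wins once the weights are used because its single shared token carries much more weight.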

It's probably easier to demonstrate if you drop down a level to Lucene; I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE which tries to marry Splade and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•8mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll make an update post to try again.
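As a rough sketch of what keeping the weights could look like, the snippet below turns a vocabulary-sized vector into a token-to-weight mapping instead of a bare list of tokens. The helper name `to_token_weights`, the tiny vocabulary, and the vector values are hypothetical stand-ins for the article's `vec_np` and the model's real vocabulary. How the mapping would then be indexed in Elasticsearch is not shown here.

```python
def to_token_weights(vec, vocab):
    """Map each non-zero vocabulary position to its weight."""
    # Keep the weight alongside the token rather than dropping it.
    return {vocab[i]: float(w) for i, w in enumerate(vec) if w > 0}


# Hypothetical four-word vocabulary and activation vector.
vocab = ["cat", "dog", "pet", "car"]
vec = [1.2, 0.0, 0.7, 0.0]

token_weights = to_token_weights(vec, vocab)
print(token_weights)
```

A mapping of this shape is what a per-token-weight field would need as input, as opposed to the token-only list the original post indexed.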
