Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•8mo ago
I'm kind of disappointed in this article. Splade is a cool way to improve the results of a TF/IDF index with minimally invasive changes, and the article obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of Splade.

It's probably easier to demonstrate if you drop down a level to Lucene. I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE, which tries to marry Splade and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1
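
To make the point about weights concrete, here is a minimal sketch in plain Python, not the article's code. It assumes a SPLADE vector is represented as a token-to-weight dict; the tokens, weights, and function names are invented for illustration. It compares scoring by token overlap alone (weights discarded) with scoring by a sparse dot product (weights kept).

```python
from typing import Dict

# Assumption: a SPLADE vector is a mapping from vocabulary token to a
# non-negative weight. All values below are made up for illustration.
SparseVec = Dict[str, float]


def overlap_score(query: SparseVec, doc: SparseVec) -> int:
    """Count shared tokens, ignoring weights (what happens when weights are dropped)."""
    return len(query.keys() & doc.keys())


def weighted_score(query: SparseVec, doc: SparseVec) -> float:
    """Sparse dot product: sum of query weight times document weight over shared tokens."""
    return sum(w * doc[t] for t, w in query.items() if t in doc)


query: SparseVec = {"search": 2.0, "semantic": 1.0}
doc_a: SparseVec = {"search": 0.5, "semantic": 0.5, "keyword": 1.0}
doc_b: SparseVec = {"search": 2.0, "retrieval": 1.0}

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name, overlap_score(query, doc), weighted_score(query, doc))
```

With these toy numbers, doc_a wins on overlap because it shares two tokens with the query, while doc_b wins on the dot product because it matches the query's heaviest token strongly. That ranking difference is the information lost when the weights are thrown away.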

JnBrymn•8mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which allow you to specify the value of every token. Maybe I'll write an update post and try again.
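
A rough sketch of what the sparse vector approach could look like, written as plain Python dicts so that nothing is sent over the network. It assumes a recent Elasticsearch 8.x release with the `sparse_vector` field type and the `sparse_vector` query; the field name `splade_tokens` and the helper names are hypothetical, and the request shapes should be checked against the documentation for the version in use.

```python
from typing import Dict

# Hypothetical field name, chosen for this illustration only.
FIELD = "splade_tokens"


def build_mapping() -> dict:
    """Index mapping with a text field plus a sparse_vector field for SPLADE weights."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                FIELD: {"type": "sparse_vector"},
            }
        }
    }


def build_document(text: str, weights: Dict[str, float]) -> dict:
    """Document body that keeps every non-zero token weight instead of only the tokens."""
    return {"text": text, FIELD: {t: w for t, w in weights.items() if w > 0}}


def build_query(weights: Dict[str, float]) -> dict:
    """Search body that passes the query-side token weights to the sparse_vector query."""
    return {
        "query": {
            "sparse_vector": {"field": FIELD, "query_vector": dict(weights)}
        }
    }


doc = build_document("keyword and semantic search", {"search": 1.8, "unused": 0.0})
query = build_query({"search": 2.0, "semantic": 1.0})
```

The dicts could then be passed to whatever client is in use as the index-creation, indexing, and search bodies. The point is only that the weights travel with the tokens on both the document and the query side.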
