Bridging the gap between keyword and semantic search with SPLADE (2024)

http://arcturus-labs.com/blog/2024/10/09/bridging-the-gap-between-keyword-and-semantic-search-with-splade/
23•softwaredoug•8mo ago

Comments

jbellis•8mo ago
I'm kind of disappointed in this article. SPLADE is a cool way to improve the results of a TF-IDF index with minimally invasive changes, and the article obscures that more than it clarifies.

> Next, my SPLADE implementation in Elasticsearch is oversimplified. If you scroll back up to get_splade_embedding, we extract non-zero elements from vec_np (the SPLADE tokens) but discard their associated weights. This is a missed opportunity. The SPLADE papers use these weights for scoring matches.

Yes, exactly, that is the whole point of SPLADE.
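
To make the point about weights concrete, here is a minimal sketch of keeping each non-zero SPLADE dimension together with its weight and scoring a match as a weighted dot product. It is not the article's `get_splade_embedding`; the helper names `sparse_weights` and `score`, the toy vocabulary, and the toy vectors are all made up for illustration, and plain lists stand in for the NumPy array `vec_np`.

```python
def sparse_weights(vec, id_to_token):
    """Keep every non-zero dimension together with its weight.

    `vec` stands in for the article's vec_np; `id_to_token` is a
    hypothetical lookup from vocabulary index to token string.
    """
    return {id_to_token[i]: float(w) for i, w in enumerate(vec) if w > 0}


def score(query_weights, doc_weights):
    """Dot product over the tokens the query and the document share."""
    return sum(
        w * doc_weights[token]
        for token, w in query_weights.items()
        if token in doc_weights
    )


# Toy data, invented for this example only.
vocab = ["search", "semantic", "keyword", "banana"]
doc_vec = [1.5, 0.0, 2.0, 0.0]
query_vec = [1.0, 0.5, 0.0, 0.0]

doc = sparse_weights(doc_vec, vocab)
query = sparse_weights(query_vec, vocab)

print(doc)                # tokens with their weights, zeros dropped
print(score(query, doc))  # only "search" is shared by both
```

Dropping the weights would reduce both dictionaries to plain token sets, so every shared token would count the same regardless of how strongly the model activated it.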

It's probably easier to demonstrate if you drop down a level to Lucene; I don't think you will be able to do it easily with Elastic.

Tangentially, I haven't looked closely at SPLATE, which tries to marry SPLADE and ColBERT, but it's an interesting idea. https://arxiv.org/html/2404.13950v1

JnBrymn•8mo ago
You're absolutely right. This was a post I tossed together quickly just to see what could be done without thinking too much. In retrospect, I think this would be better implemented using Elasticsearch sparse vector fields, which let you specify a weight for every token. Maybe I'll make an update post to try again.
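
A rough sketch of what the sparse vector approach might look like, written as plain Python dictionaries so it runs without a cluster. The index field name `splade_tokens`, the helper names, and the example weights are assumptions for illustration. The `sparse_vector` field type and the `sparse_vector` query with `query_vector` exist only in recent Elasticsearch releases, so check the request shapes against the documentation for your version before relying on them.

```python
import json

FIELD = "splade_tokens"  # hypothetical field name


def sparse_vector_mapping(field=FIELD):
    """Index mapping with one text field and one sparse vector field."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                field: {"type": "sparse_vector"},
            }
        }
    }


def sparse_vector_doc(text, weights, field=FIELD):
    """Document body: the token -> weight map is stored as-is."""
    return {"text": text, field: weights}


def sparse_vector_query(weights, field=FIELD):
    """Query body that scores documents against weighted query tokens."""
    return {"query": {"sparse_vector": {"field": field, "query_vector": weights}}}


# Example weights, invented for illustration.
doc_body = sparse_vector_doc(
    "keyword and semantic search", {"search": 1.5, "keyword": 2.0}
)
query_body = sparse_vector_query({"search": 1.0, "semantic": 0.5})

print(json.dumps(sparse_vector_mapping(), indent=2))
print(json.dumps(doc_body, indent=2))
print(json.dumps(query_body, indent=2))
```

These bodies would be passed to whatever client you use for index creation, indexing, and search; the client calls are left out because they differ between client versions.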
