frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

450× Faster Joins with Index Condition Pushdown

https://readyset.io/blog/optimizing-straddled-joins-in-readyset-from-hash-joins-to-index-condition-pushdown
58•marceloaltmann•4d ago

Comments

marceloaltmann•4d ago
Straddled joins were still a bottleneck in Readyset even after switching to hash joins. By integrating Index Condition Pushdown into the execution path, we eliminated the inefficiency and achieved up to 450× speedups.
LtdJorge•2h ago
Why downvote?
airstrike•2h ago
Reads like an ad written by an LLM, is my guess.

It could just be that they translated from their original language to English and got that as a byproduct. Many such cases.

Sesse__•1h ago
It also does not add anything interesting to the discussion. Like, why add a bland summary of the article?
ianks•3h ago
I love this type of practical optimization for DB queries. I’ve always liked how [rom-rb](https://rom-rb.org/learn/core/5.2/combines/) made the combine pattern easy to use when joins are slow. Nice to see this implemented at DB layer
jamesblonde•3h ago
We call these pushdown joins in rondb. They only support an equality condition for the index condition. Joins with index condition pushdown is a bit of a mouthful.

We also went from like 6 seconds to 50ms. Huge speedup.

Reference

https://docs.rondb.com/rondb_parallel_query/#pushdown-joins

bdcravens•3h ago
What database engine is this in? You reference your product, but I assume this is in MySQL/MariaDB?

https://dev.mysql.com/doc/refman/9.4/en/index-condition-push...

Sesse__•3h ago
This isn't really the same as MySQL's ICP; it seems more like what MySQL would call a “ref” or “eq_ref” lookup, i.e. a simple lookup on an indexed value on the right side of a nested-loop join. It's bread and butter for basically any database optimizer.

ICP in MySQL (which can be built on top of ref/eq_ref, but isn't part of the normal index lookup per se) is a fairly weird concept where the storage engine is told to evaluate certain predicates on its own, without returning the row to the optimizer. This is to a) reduce the number of round-trips (function calls) from the executor down into the storage engine, and b) because InnoDB's secondary indexes need an extra storage round-trip to return the row (secondary indexes don't point at the row you want, they contain the primary key and then you have to lookup the actual row from the PK), so if you can remove the row early, you can skip the main row lookup.

LtdJorge•2h ago
Seems like they are caching MySQL with their own layer built on RocksDB.
vjerancrnjak•2h ago
Another example of row based dbs somehow being insanely slow compared to column based.

Just an endless sequence of misbehavior and we’re waving it off as rows work good for specific lookups but columns for aggregations, yet here it is all the other stuff that is unreasonably slow.

tharkun__•2h ago
It's an example. But not of that.

It's an example of old things being new again maybe. Or reinventing the wheel because the wheel wasn't known to them.

Yes I know, nobody wants to pay that tax or make that guy richer, but databases like Oracle have had JPPD for a long time. It's just something the database does and the optimizer chooses whether to do it or not depending on whether it's the best thing to do or not.

rotis•1h ago
Exactly. This is a basic optimization technique and all the dinosaur era databases should have that. But if you build a new database product you have to implement these techniques from scratch. There is no way you shortcut that. Reminds me about CockroachDB and them building a query optimizer[1]. They started with rule based one and then switched to cost based. Feature that older databases already had.

[1] https://www.cockroachlabs.com/blog/building-cost-based-sql-o...

sschnei8•2h ago
I feel like this is more an example of:

“We filtered first instead of reading an entire table from disk and performing a lookup”

Where both OLAP and OLTP dbms would benefit.

To your point, it’s clear certain workloads lend themselves to OLAP and columnar storage much better, but “an endless sequence of misbehavior” seems a bit harsh .

flufluflufluffy•1h ago
I read their website landing page but it’s still kinda confusing — what exactly is readyset? It all sounds like it’s a cache you can set up in front of MySQL/postgres. But then this article is talking about implementing joins which is what the database itself would do, not a cache. But then the blurbs talk about it like it’s a “CDN for your database” that brings your data to the edge. What the heck is it?!
Sesse__•1h ago
It seems to be some sort of read-only reimplementation of MySQL/Postgres that can ingest their replication streams and materialize views (for caching). Complete with a really primitive optimizer, if the article is to be believed.
aseipp•32m ago
ReadySet is basically "incremental view maintenance" but applied to arbitrary SQL queries. It acts like a caching proxy for your database, but it simultaneously ingests the replication log from the system in order to see things happen. Then it uses that information to perform "incremental" updates of data it has cached, so that if you requery something, it is much faster.

Naive example: let's say you had a query that was a table scan and it computed the average age of all users in the users table. If you insert a new row into the users table and then rerun the query, you'd expect another table scan, so it will grow over time. In a traditional setup, you might cache this query and only update it "every once in a while." Instead, ReadySet can decompose this query into an "incremental program", run it, cache the result -- and then when it sees the insert into the table it incrementally updates the underlying data cache. That means the second run would actually be fast, and the cost to update the cache is only proportional to the underlying change, not the size of the table.

ReadySet is derived from research on Noira, whose paper is here: https://www.usenix.org/conference/osdi18/presentation/gjengs...

dangoodmanUT•1h ago
Maybe it's not obvious initially, but in retrospect, this handling of joins feels like the obvious way to handle it.

Push down filters to read the least data possible.

Or, know your data and be able to tell the query engine which kind of join strategy you would like (hash vs push down)

SoftTalker•1h ago
Decades ago we used to provide hints in queries based on "knowing the data" but modern optimizers have a lot better statistics on indexes, and the need to tell the query optimizer what to do should be rare.

Libre – An anonymous social experiment without likes, followers, or ads

https://libreantisocial.com
42•rododecba•2h ago•32 comments

RFC 9839 and Bad Unicode

https://www.tbray.org/ongoing/When/202x/2025/08/14/RFC9839
164•Bogdanp•5h ago•81 comments

Manim: Animation engine for explanatory math videos

https://github.com/3b1b/manim
297•pykello•11h ago•57 comments

Librebox: An open source, Roblox-compatible game engine

https://github.com/librebox-devs/librebox-demo
159•libreboxdevs•7h ago•37 comments

Writing Speed-of-Light Flash Attention for 5090 in CUDA C++

https://gau-nernst.github.io/fa-5090/
96•dsr12•6h ago•10 comments

I Made a Floppy Disk from Scratch

https://kottke.org/25/08/i-made-a-floppy-disk-from-scratch
131•bookofjoe•8h ago•59 comments

Rethinking the Linux cloud stack for confidential VMs

https://lwn.net/Articles/1030818/
89•Bogdanp•7h ago•28 comments

450× Faster Joins with Index Condition Pushdown

https://readyset.io/blog/optimizing-straddled-joins-in-readyset-from-hash-joins-to-index-conditio...
58•marceloaltmann•4d ago•18 comments

Determinants and causal effects of admission to selective private colleges [pdf]

https://www.nber.org/system/files/working_papers/w31492/w31492.pdf
10•EvgeniyZh•8h ago•4 comments

Bild AI (YC W25) Is Hiring Applied AI Founding Engineer

https://www.workatastartup.com/jobs/75647
1•rooppal•1h ago

Optimizing our way through Metroid

https://antithesis.com/blog/2025/metroid/
38•eatonphil•1d ago•0 comments

Developer's block

https://underlap.org/developers-block/
140•todsacerdoti•9h ago•74 comments

WebR – R in the Browser

https://docs.r-wasm.org/webr/latest/
110•sieste•4d ago•22 comments

Why was Apache Kafka created?

https://bigdata.2minutestreaming.com/p/why-was-apache-kafka-created
6•enether•23h ago•1 comments

Texas Instruments' $60B U.S. project, the next iPhone chips fabric

https://www.cnbc.com/2025/08/22/apple-will-make-chips-at-texas-instruments-60-billion-us-project....
9•giuliomagnifico•1d ago•2 comments

Lightning declines over shipping lanes following regulation of sulfur emissions

https://theconversation.com/the-world-regulated-sulfur-in-ship-fuels-and-the-lightning-stopped-24...
183•lentoutcry•4d ago•43 comments

Arrays in Practice (2024)

https://programming-journal.org/2024/8/14/
4•mpweiher•2d ago•0 comments

Waitgroups: What they are, how to use them and what changed with Go 1.25

https://mfbmina.dev/en/posts/waitgroups/
49•mfbmina•3h ago•33 comments

Line scan camera image processing for train photography

https://daniel.lawrence.lu/blog/y2025m09d21/
7•dllu•2h ago•0 comments

David Klein's TWA Posters (2018)

https://flashbak.com/david-kleins-magnificent-twa-posters-404428/
78•NaOH•4d ago•6 comments

The JWST Rocky Worlds DDT Program reveals GJ 3929B to likely be a bare rock

https://arxiv.org/abs/2508.12516
6•bikenaga•3h ago•0 comments

RFK Jr demanded a vaccine study be retracted – the journal said no

https://www.nature.com/articles/d41586-025-02682-9
49•rntn•1h ago•29 comments

The F-35 Is Losing the (Trade) War

https://www.jalopnik.com/1945910/f-35-fighter-jet-losing-trade-war/
3•rntn•15m ago•0 comments

Shader Academy: Learn computer graphics by solving challenges

https://shaderacademy.com/
221•pykello•3d ago•59 comments

Converting an online game to work without any JavaScript

https://bejofo.com/blog/no-js-game-case-study
16•YannickR•4d ago•6 comments

My experience creating software with LLM coding agents – Part 2 (Tips)

https://efitz-thoughts.blogspot.com/2025/08/my-experience-creating-software-with_22.html
168•efitz•17h ago•80 comments

The first Media over QUIC CDN: Cloudflare

https://moq.dev/blog/first-cdn/
278•kixelated•1d ago•110 comments

Show HN: JavaScript-free (X)HTML Includes

https://github.com/Evidlo/xsl-website
196•Evidlo•1d ago•104 comments

You can't grow cool-climate plants in hot climates

https://www.crimepaysbutbotanydoesnt.com/blog/why-you-cant-grow-cool-climate-plants-in-hot-climates
151•surprisetalk•3d ago•108 comments

Nitro: A tiny but flexible init system and process supervisor

https://git.vuxu.org/nitro/about/
223•todsacerdoti•23h ago•83 comments