Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters - https://news.ycombinator.com/item?id=22742905 - March 2020 (25 comments)
Xor Filters: Faster and Smaller Than Bloom Filters - https://news.ycombinator.com/item?id=21840821 - Dec 2019 (81 comments)
IIRC, binary fuse filters are faster to construct than ribbon filters, but typically not quite as space-efficient. There are also frayed ribbon filters (by me) which are slower and more complex to construct but more space-efficient. There's no paper for those, just a Rust implementation.
Ribbon filters are deployed in Mozilla's Clubcard for distributing compressed certificate revocation lists: https://github.com/mozilla/clubcard and https://jmschanck.info/papers/20250327-clubcard.pdf. CRLs are an almost ideal application of this sort of compressed set tech, since the aggregator runs batch jobs and needs to distribute the set to very many clients. It's not perfectly ideal because CRLs require frequent updates and none of these methods support delta updates. There is a straightforward but inelegant workaround, which is to send a compressed set that represents the delta, and query both on the client.
> we build probabilistic filters -- called binary fuse filters -- that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
> that are within 13% of the storage lower bound -- without sacrificing query speed. As an additional benefit, the construction of the new binary fuse filters can be more than twice as fast as the construction of xor filters. By slightly sacrificing query speed, we further reduce storage to within 8% of the lower bound.
They are faster to construct (2x) and smaller (within 13% of theoretical limit) while maintaining the same query performance.
I'm especially fond of tiny filters; a well-placed 32- or 64-bit Bloom filter can be surprisingly effective in the right context!
Tiny filters: Some time ago I worked on tiny statistics [2]. This includes some 64-bit HyperLogLog implementations; some use linear counting, which is basically a 64-bit Bloom filter, until some limit, and only then switch to HyperLogLog. This is great for distinct counts of columns in databases (cardinality estimation). This project also includes 64-bit approximate counts and histograms.
[1] https://github.com/FastFilter/fastfilter_java/blob/master/fa... [2] https://github.com/thomasmueller/tinyStats
I guess that is because the count storage update is really slow, right, so it's better to have one than two (or whatever number of set bits) operations? At least the linked code seems to process it one by one bit when updating, and without some sort of “rank of n-th set bit” operation (which would accelerate the “select” version fairly well), I'm not sure it could be made much faster than that either.
Edit: I see https://github.com/FastFilter/fastfilter_java/blob/master/fa... tries to make that operation in O(1), but to be honest, with this many operations, the cure almost looks worse than the disease. :-) Haven't benchmarked, though.
How would you realistically do this without creating more overhead in the thread communication than what you're saving? I've never heard of offloading a 30–40 cycle operation to a thread before. Typically sending an atomic across CPUs is what, 200 cycles? (That's assuming you have the thread just sitting there in some kind of pool; firing up a new one is much, much more expensive. It also assumes you never need to hear back from it, so wouldn't work well for a decrease where you need to know the count.)
The "succinct counting (blocked) Bloom filters" have two components: the regular Bloom filter for querying, and the count storage. The count storage is not "strictly" needed for querying: you will never get a false _negative_ is increments / decrements in the count storage are deferred. So updating the count storage can be done asynchronously. Both increments and decrements, if you want. So these can be buffered, eg 100 increments / decrements at a time. Sure, the false-positive rate is slightly higher than needed if the decrements are deferred, but with a large filter this effect is small.
So that means, you can buffer increments / decrements in the count storage. You still want to do the "or" operations in the the Bloom filter part synchronously, so that you don't get false negatives. And then, it is no longer just 30-40 cycles, but 3000-4000 cycles, or 30'000-40'000 cycles, or so. I understand this would not be trivial to implement, but also not very complex. I never really had a real-world use case so far, so I didn't work on a full implementation yet.
As it fills up the false probability rate goes up. Once the false probability rate reaches the threshold of unacceptability, the bloom filter is full, and you can no longer insert into it.
That most interfaces still let you do something that looks like an insert is an interface failure, not a bloom filter feature.
If you find this controversial and want to reply "I don't have a threshold of unacceptability", I'll counter that a false probability rate of 100% will be reached eventually. And if you still find that acceptable, you can trivially modify any probabilistic filter to "never become full" by replacing the "is full" error condition with setting a flag that all future queries should return a false positive.
https://link.springer.com/content/pdf/10.1186/s12859-021-039...
Still very neat, nice work!
Static filters are good for eg. leaked password lists, LSM trees (LevelDB, RocksDB), and so on.
Bloom filters are the coolest thing ever and I wish I could replace them with something better.
djmips•2w ago
orlp•2w ago
pwagland•2w ago
> Building on theoretical work by Dietzfelbinger and Walzer [8], we propose a novel practical approach, the binary fuse filters. They are conceptually similar to xor filters, and they rely on nearly the same simple code.