Biscuit is a specialized PostgreSQL index for fast pattern matching LIKE queries

https://github.com/CrystallineCore/Biscuit
34•eatonphil•4d ago

Comments

eatonphil•4d ago
Noticed Daniel Lemire talking about it and how they use Roaring Bitmaps.

https://x.com/lemire/status/2000944944832504025

fabian2k•2h ago
Looks very interesting. I really like trigram indexes for certain use cases, but those use cases essentially come down to running an ILIKE '%something%' on various text content in the DB. So they would fit the described limitations of this index type very well.

Usually you're quickly steered towards fulltext search (tsvector) in Postgres if you want to do something like that. But depending on what kind of search you actually need, trigram indexes can be a better option. If you're searching less for natural language and more for specific keywords, the stemming in fulltext search can get in the way.
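To make the trigram approach concrete, here is a minimal Python sketch of how a trigram index can answer an ILIKE '%something%' style query: collect the rows that contain every trigram of the search term, then recheck each candidate against the real string. The function names are made up for illustration, and the extraction is deliberately simplified; pg_trgm's actual extraction differs (it works on words and pads them), so treat this as the general idea rather than how Postgres does it.

```python
def trigrams(text):
    """Return the set of lowercase 3-character substrings (no padding)."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}


def build_trigram_index(rows):
    """Map each trigram to the set of row ids whose text contains it."""
    index = {}
    for row_id, text in enumerate(rows):
        for gram in trigrams(text):
            index.setdefault(gram, set()).add(row_id)
    return index


def ilike_contains(rows, index, needle):
    """Emulate ILIKE '%needle%': filter by trigrams, then recheck candidates."""
    grams = trigrams(needle)
    if grams:
        # A matching row must contain every trigram of the needle.
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    else:
        # Needle shorter than 3 characters: the index cannot help, scan all rows.
        candidates = set(range(len(rows)))
    lowered = needle.lower()
    # Recheck, because sharing all trigrams does not guarantee a real match.
    return sorted(r for r in candidates if lowered in rows[r].lower())


rows = ["PostgreSQL index", "Roaring Bitmaps", "postgres extension", "full text search"]
index = build_trigram_index(rows)
print(ilike_contains(rows, index, "POSTGRES"))
```

No stemming is involved: the keyword is matched as literal characters, which is the property that matters for keyword-style search.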

One piece of information that would be nice here is a comparison of the index size on disk for both index types.

out_of_protocol•1h ago
Any data on index size for big tables? Comparison (with ms/megabytes) vs trigram regarding size/speed?

UPD

> Biscuit is 15.0× faster than B-tree (median) and 5.6× faster than Trigram (median)

> Trade-off: 3.2× larger index than Trigram, but 5.6× faster queries (median)

maxmcd•47m ago
I found some more info here: https://biscuit.readthedocs.io/en/latest/benchmark_roaring.h...

kwillets•1h ago
This is a fairly simple idea of indexing characters for each column/offset and compressing the bitmaps. Simple is good, as the overhead of more sophisticated ideas (eg suffix sorting) is often prohibitive.

One suggestion is to index the end-of-string as a character as well; then you don't need negative offsets. But that turns the suffix search into a wildcard type of thing where you have to try all offsets, which is what the '%pat%' searches do already, so maybe it's OK.
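A rough Python sketch of the per-offset character bitmap idea, as I read it from the description above. This is not Biscuit's actual code: the names are hypothetical, plain Python ints stand in for the compressed bitmaps (a real index would use something like Roaring), and the backward offsets play the role of the negative offsets mentioned here.

```python
from collections import defaultdict


def build_index(rows):
    """Build (offset, char) -> bitmap maps for forward and backward offsets."""
    fwd = defaultdict(int)  # (i, ch): rows whose i-th char from the start is ch
    bwd = defaultdict(int)  # (i, ch): rows whose i-th char from the end is ch
    for row_id, text in enumerate(rows):
        bit = 1 << row_id
        for i, ch in enumerate(text):
            fwd[(i, ch)] |= bit
            bwd[(len(text) - 1 - i, ch)] |= bit
    all_rows = (1 << len(rows)) - 1
    max_len = max((len(t) for t in rows), default=0)
    return fwd, bwd, all_rows, max_len


def match_at(bitmaps, pattern, offset, all_rows):
    """AND together one bitmap per pattern character at a fixed offset."""
    result = all_rows
    for i, ch in enumerate(pattern):
        result &= bitmaps.get((offset + i, ch), 0)
        if not result:
            break
    return result


def prefix_search(fwd, pattern, all_rows):
    """LIKE 'pat%': the pattern is anchored at forward offset 0."""
    return match_at(fwd, pattern, 0, all_rows)


def suffix_search(bwd, pattern, all_rows):
    """LIKE '%pat': the reversed pattern is anchored at backward offset 0."""
    return match_at(bwd, pattern[::-1], 0, all_rows)


def contains_search(fwd, pattern, all_rows, max_len):
    """LIKE '%pat%': try every start offset and OR the results together."""
    result = 0
    for offset in range(max_len - len(pattern) + 1):
        result |= match_at(fwd, pattern, offset, all_rows)
    return result


def row_ids(bitmap):
    """Expand a bitmap into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


rows = ["biscuit", "bitmap", "rabbit", "roaring"]
fwd, bwd, all_rows, max_len = build_index(rows)
print(row_ids(prefix_search(fwd, "bi", all_rows)))
print(row_ids(suffix_search(bwd, "it", all_rows)))
print(row_ids(contains_search(fwd, "bi", all_rows, max_len)))
```

Prefix and suffix patterns cost one AND per pattern character, while the '%pat%' case loops over every possible start offset, which is the "try all offsets" cost described above.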

pedrozieg•26m ago
Postgres’s extensible index AM story doesn’t get enough love, so it’s nice to see someone really lean into it for LIKE. Biscuit is basically saying: “what if we precompute an aggressive amount of bitmap structure (forward/backward char positions, case-insensitive variants, length buckets) so most wildcard patterns become a handful of bitmap ops instead of a heap scan or bitmap heap recheck?” That’s a very different design point from pg_trgm, which optimizes more for fuzzy-ish matching and general text search than for “I run a ton of LIKE '%foo%bar%' on the same columns”.

The interesting question in prod is always the other side of that trade: write amplification and index bloat. The docs are pretty up-front that write performance and concurrency haven’t been deeply characterized yet, and they even have a section on when you should stick with pg_trgm or plain B-trees instead. If they can show that Biscuit stays sane under a steady stream of updates on moderately long text fields, it’ll be a really compelling option for the common “poor man’s search” use case where you don’t want to drag in an external search engine but ILIKE '%foo%' is killing your box.

eats_indigo•12m ago
How is the Postgres ecosystem at stating when these kinds of things are ready for adoption? I can think of a use case at work where this might be useful, but I'm hesitant to just start throwing random open-source extensions at our monolith DB.
