Improving your simple website's search function will take days or weeks, not hours. If you make your own search engine, it's almost guaranteed to be worse than Elasticsearch.
I agree that implementing something like Lucene from scratch would be an uphill battle. Probably not worth the time.
Implementing your own search is indeed a bit of a rite of passage. Usually, if you go look at such implementations, you'll find they implemented 1% of the features, cut lots of corners, and then came up with some benchmark that proves they are faster on some toy dataset. WAND (dynamic pruning that skips documents which cannot make the top-k results) would be a good example of something most of these implementations don't do.
Doug is of course a search relevance expert who has published several books on the subject. So, this is not some naive person implementing BM25 but just somebody building tools they need to do bigger things. Sometimes Elasticsearch/Lucene are just overkill and it is worth having your own implementation.
You can find my own vibe-coded version here: https://github.com/jillesvangurp/querylight. Nice embeddable search engine for Kotlin Multiplatform (works in Kotlin/JS, Android, iOS, Wasm, and of course the JVM). I use it in some browser-based apps.
If I need a proper search engine, I use Elasticsearch or Opensearch.
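For a sense of what "implementing BM25" means at toy scale, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `TinyBM25` class, the whitespace tokenizer, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant. It is not how querylight, Lucene, or any library mentioned above does it. It also scores every document for every query, which is exactly the brute-force work that pruning techniques like WAND exist to avoid.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class TinyBM25:
    """Hypothetical toy BM25 scorer over an in-memory list of strings."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1 = k1
        self.b = b
        self.n_docs = len(docs)
        self.term_freqs = [Counter(tokenize(d)) for d in docs]
        self.doc_len = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = sum(self.doc_len) / self.n_docs
        # Document frequency: in how many documents each term appears.
        self.doc_freq = Counter()
        for tf in self.term_freqs:
            self.doc_freq.update(tf.keys())

    def idf(self, term):
        n = self.doc_freq.get(term, 0)
        # Non-negative IDF variant (assumption): log(1 + (N - n + 0.5) / (n + 0.5)).
        return math.log(1 + (self.n_docs - n + 0.5) / (n + 0.5))

    def score(self, query, doc_id):
        tf = self.term_freqs[doc_id]
        # Length normalization: longer-than-average documents are penalized.
        norm = 1 - self.b + self.b * self.doc_len[doc_id] / self.avg_len
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f:
                total += self.idf(term) * f * (self.k1 + 1) / (f + self.k1 * norm)
        return total

    def search(self, query, top_k=3):
        # Brute force: score every document, keep the matches, sort by score.
        scored = [(self.score(query, i), i) for i in range(self.n_docs)]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "the quick brown fox",
        "lazy dog sleeps",
        "quick quick fox jumps over the lazy dog",
    ]
    engine = TinyBM25(docs)
    for score, doc_id in engine.search("fox"):
        print(round(score, 3), docs[doc_id])
```

The scoring formula is the easy part. Analyzers, phrase queries, index updates, and pruning are where the other 99% of the features live.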
In that world, using haystack and choosing a backend based on C++ is so much less hassle for deployment.
Although for many things just FTS in Postgres is fine too.
I'm sure for planet scale stuff ES is fine, but otherwise I've only found it brings pain in the kind of dev I get to do.
I might actually open source it, it's a single file anyway.
Full-text search, sure, but you can easily provide a better overall search experience by creating a custom wrapping algorithm that provides shortcuts for common access patterns of your users in your application, in addition to full-text search.
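A rough sketch of what such a wrapper could look like, in Python. The order-ID shortcut, the `ORDER_ID` pattern, and the `full_text_search` stand-in are all hypothetical; in a real app the fallback would call whatever engine you already use (Postgres FTS, Elasticsearch, and so on), and the shortcuts would come from looking at what your users actually type.

```python
import re

# Hypothetical access pattern: users often paste an order number like "#10423".
ORDER_ID = re.compile(r"^#?(\d{4,})$")


def full_text_search(query, docs):
    # Stand-in for a real full-text backend (assumption): a document matches
    # when it contains every query term as a substring.
    terms = query.lower().split()
    if not terms:
        return []
    return [
        doc_id
        for doc_id, text in docs.items()
        if all(term in text.lower() for term in terms)
    ]


def search(query, docs, orders):
    """Try cheap, exact shortcuts first, then fall back to full-text search."""
    q = query.strip()
    match = ORDER_ID.match(q)
    if match:
        order_id = int(match.group(1))
        if order_id in orders:
            # Shortcut hit: go straight to the record the user almost certainly wants.
            return {"kind": "order", "results": [order_id]}
    return {"kind": "fulltext", "results": full_text_search(q, docs)}


if __name__ == "__main__":
    docs = {1: "Red running shoes", 2: "Blue running jacket"}
    orders = {10423: "order for red shoes"}
    print(search("#10423", docs, orders))
    print(search("running", docs, orders))
```

The point is only the ordering: exact, high-confidence lookups get first shot, and relevance-ranked full-text search handles everything else.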
(To be fair, I've only worked on projects that use ES where it is entirely unnecessary.)
Much easier to deal with and faster than Elastic.
Sure it’s computationally expensive, inefficient even, but for many use-cases it just works.
I’d add that for production deployments, AWS has developed a new instance family that enables OpenSearch data to be stored on S3 [1], bringing significant cost savings.
[1] https://docs.aws.amazon.com/opensearch-service/latest/develo...