The quality of your results will depend mostly on the quality of your embeddings
https://docs.opensearch.org/docs/1.2/search-plugins/knn/appr...
For whatever an endorsement from a random stranger is worth: we've been using OpenSearch as a vector DB for hybrid search across text and multimodal embeddings as well as traditional metadata, and it's been great. We're not "full production" yet, so I can't really speak to scale, but it's OpenSearch, so I expect the scale to be fine.
So if you want to build a very, very large index using HNSWs, you have to understand whether you normally have many writes that accumulate evenly, or whether your index is a mostly read-only thing that is rebuilt from time to time. Mass insertion the first time is going to be very slow. You can parallelize it if you build N parallel HNSWs, since a search can be composed as the union of the per-index results (sorted by cosine similarity). But often the bottleneck is the embedding model itself.
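A minimal sketch of the "N parallel HNSWs, union of the results" idea, in Python. Each shard below is a brute-force stand-in for a real HNSW index, and the names `search_shard` and `search_all` are hypothetical, not any library's API. The point is only the merge step: ask every shard for its top k, then keep the global top k by cosine similarity.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[float, str]  # (cosine similarity, document id)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_shard(shard: Dict[str, Vector], query: Vector, k: int) -> List[Hit]:
    """Assumption: stands in for one HNSW index; here it is an exact scan."""
    hits = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in shard.items()]
    return heapq.nlargest(k, hits)


def search_all(shards: List[Dict[str, Vector]], query: Vector, k: int) -> List[Hit]:
    """Union the per-shard top-k lists, then keep the global top k."""
    merged: List[Hit] = []
    for shard in shards:
        merged.extend(search_shard(shard, query, k))
    return heapq.nlargest(k, merged)


if __name__ == "__main__":
    shards = [
        {"a": [1.0, 0.0], "b": [0.0, 1.0]},
        {"c": [1.0, 1.0], "d": [-1.0, 0.0]},
    ]
    for score, doc_id in search_all(shards, [1.0, 0.0], k=2):
        print(doc_id, round(score, 4))
```

Each shard has to return at least k hits for the merged top k to be right, which is why the same `k` is passed down. With real approximate indexes, every shard also contributes its own recall loss.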
What is really not super scalable is the size of HNSWs. Their memory use is big (Redis by default uses 8-bit quantization for this reason), and on disk they require seeks. If you have large vectors, like 1024 components, quantization is a must.
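To put rough numbers on the quantization point, here is a small Python sketch. It assumes float32 components before quantization and uses a simple symmetric per-vector scale; this is one common scheme, not necessarily what Redis or any other engine does internally. The function names are hypothetical, and the byte count covers raw vector storage only, not the HNSW graph links.

```python
from typing import List, Sequence, Tuple


def quantize_int8(vec: Sequence[float]) -> Tuple[List[int], float]:
    """Map each component to an integer in [-127, 127] using one scale per vector."""
    max_abs = max((abs(x) for x in vec), default=0.0)
    if max_abs == 0.0:
        return [0] * len(vec), 1.0
    scale = max_abs / 127.0
    # Clamp to guard against floating-point rounding at the extremes.
    codes = [max(-127, min(127, round(x / scale))) for x in vec]
    return codes, scale


def dequantize_int8(codes: Sequence[int], scale: float) -> List[float]:
    """Approximate reconstruction of the original components."""
    return [c * scale for c in codes]


def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_component: int) -> int:
    """Bytes needed for the vector components alone (no graph, no metadata)."""
    return num_vectors * dims * bytes_per_component


if __name__ == "__main__":
    codes, scale = quantize_int8([0.5, -1.0, 0.25])
    print(codes, dequantize_int8(codes, scale))
    # Assumed example size: 1M vectors of 1024 components.
    print(raw_vector_bytes(1_000_000, 1024, 4), "bytes as float32")
    print(raw_vector_bytes(1_000_000, 1024, 1), "bytes as int8")
```

At 1024 components, a float32 vector is 4096 bytes and an 8-bit one is 1024 bytes, so the raw vector storage shrinks by 4x; the per-vector scale adds a few bytes on top.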
If you’re using Lucene HNSW, it will scale but will eat lots and lots of heap RAM. If you’re using the FAISS or nmslib plugins, keep an eye on JNI RAM consumption as well, since it’s outside the heap.
Overall, I’d say that it is a challenge to easily scale ANN past 100M vectors unless it’s given significant attention from the team.
Setting up a simple log ingestion on Opensearch or ELK felt like a true journey, in a bad way.
These days, getting data in and out of Elasticsearch is quite easy with dynamic field mapping. It's keeping it performant that is tricky.
The vibe of the project's community is pretty much reminiscent of a dead multiplayer game. The community is not thriving, and a thriving community is essential for an OSS project; meanwhile Elasticsearch is virtually irreplaceable in this space. I do not know any enterprise customers using it, because it is unproven and they have failed to show they are going to stick around for the long run.
Then every other SIEM platform is spinning up its own search platform. Heck, I even saw Cribl in their own partner list, and it has its own search platform now. And Elastic has a SIEM platform now with Elastic Security. Not sure what the purpose of this project is now. Elastic just won the battle and then later virtue-signaled to everyone by saying "we are open source again, y'all", because even if they came around and slapped your engineers who said they were not going to touch proprietary code, your management is not going to pay for a migration to an untested fork with no long-term commitment, one which was essentially made out of spite.
simple10•13h ago
Anyone know if it's still a drop-in replacement for Elasticsearch? And how does it compare on performance and features?
__s•13h ago
1.x is compatible with ES 7.10
lockhead•13h ago
darkamaul•13h ago
My company did a fairly comprehensive benchmark of the two products [0] if you are interested in comparing performances.
[0] https://blog.trailofbits.com/2025/03/06/benchmarking-opensea...
Y-bar•13h ago
Salgat•12h ago
jsiepkes•2h ago
jillesvangurp•12h ago
There are some exceptions to this and vector search would be one of those. The feature was added post fork. There are a few other things of course. E.g. search_after works slightly differently on both. My client works around that. And there are a lot of newer features on both sides that are annoyingly different. Both have some SQL querying capabilities now but they both have their own take on that.
Elastic still has the edge on features IMHO. Especially Kibana has a lot more features than Amazon's fork. And on the aggregation front, Elastic has done quite a bit of feature and optimization work in the last few years (that's what powers the dashboards). For performance it depends what you do. But they both heavily lean on Lucene which remains the open source search library both products use. Elastic cloud is a bit better than opensearch in AWS from what I've seen. If you self host and tune, both should be very similar.
Elastic also just tagged version 9.0, which uses the same new version of Lucene as Opensearch 3.0. I have support for both new versions in my client already (added that a few weeks ago). It now works with Elasticsearch v7, 8, and 9 and Opensearch 1, 2, and 3.
A lot of my consulting clients seem to prefer Opensearch lately. That's mainly because of the less complicated licensing and the AWS support. If you have a legacy Elasticsearch setup switching it to Opensearch should be doable (depending on what you use). But expect to reindex all your data. I don't think a direct migration is possible. If you use Elastic's client libraries, you may need to switch to Opensearch specific ones. This is generally a bit painful (package names, feature differences, etc.). That's why I created kt-search a few years ago.
Salgat•12h ago
simple10•11h ago
blueelephanttea•10h ago
As you point out it was forked a number of years ago so it started from the same place (7.10). Elasticsearch is now on 9.0+ and has 27,000 more commits than OpenSearch. So I doubt it is a drop-in replacement anymore.
I have no idea how many of those 27K commits are key features, but it is a clear divergence.
ignoramous•9h ago
OpenSearch was once a personal search results aggregator conceived at A9 (Amazon's Silicon Valley subsidiary): https://github.com/dewitt/opensearch
Blackthorn•7h ago
Macha•7h ago
If you're just using the standard document ingestion and search stuff, yeah, they're mostly compatible. But the fancier features that were part of the paid version in the past or have been recently developed are either not compatible or missing.