OpenSearch 3.0 Released

https://opensearch.org/blog/opensearch-3-0-enhances-vector-database-performance/

110•kmaliszewski•2mo ago

Comments

simple10•2mo ago

Just learning about OpenSearch. Looks like it's a fork of Elasticsearch from 2021 when Elasticsearch changed licensing model. https://github.com/opensearch-project/OpenSearch

Anyone know if it's still a drop in replacement for Elasticsearch? And how does it compare on performance and features?

__s•2mo ago

It is not a drop in replacement (but almost is)

1.x is compatible with ES 7.10

lockhead•2mo ago

It's slower on the same hardware, but fine. Stay away if you need the UI: the Kibana fork is hellishly slow and riddled with bugs.

darkamaul•2mo ago

It’s slightly more complex than this. Both OpenSearch and Elasticsearch have workflows where they excel.

My company did a fairly comprehensive benchmark of the two products [0] if you are interested in comparing performances.

[0] https://blog.trailofbits.com/2025/03/06/benchmarking-opensea...

Y-bar•2mo ago

It's worth noting that in September 2024 Elasticsearch once again returned to a fully open source license (AGPLv3).

Salgat•2mo ago

Fool me once...

jsiepkes•2mo ago

But Elasticsearch is still open core. So certain "enterprise" functionality will never make it into the OSS version (unlike in OpenSearch).

Y-bar•2mo ago

I work in enterprise, a lot of which is B2B, and we use Elasticsearch extensively. What's "open core"?

jillesvangurp•2mo ago

I maintain a Kotlin client for both Elasticsearch and OpenSearch (jillesvangurp/kt-search). There are some differences, but they are still API compatible for most of the commonly used features.

There are some exceptions to this, and vector search would be one of those. The feature was added post fork. There are a few other things of course. E.g. search_after works slightly differently on both. My client works around that. And there are a lot of newer features on both sides that are annoyingly different. Both have some SQL querying capabilities now, but they both have their own take on that.
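To make the search_after remark concrete, here is a minimal sketch of the cursor loop that both products share. It is not how kt-search does it, and it does not try to capture the differences mentioned above. The `search` callable is a hypothetical stand-in for whatever client call sends a request body to `_search` and returns the parsed JSON response.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical stand-in for a client call: takes a _search request body,
# returns the parsed JSON response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def scan_with_search_after(
    search: SearchFn,
    query: Dict[str, Any],
    sort: List[Dict[str, str]],
    page_size: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit, using the last hit's sort values as the next cursor."""
    cursor: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {"size": page_size, "query": query, "sort": sort}
        if cursor is not None:
            body["search_after"] = cursor
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Each hit carries the values it was sorted by in its "sort" field.
        cursor = hits[-1]["sort"]
```

The sort needs a unique tiebreaker field, otherwise documents that share a sort value can be skipped or repeated across pages.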

Elastic still has the edge on features IMHO. Especially Kibana has a lot more features than Amazon's fork. And on the aggregation front, Elastic has done quite a bit of feature and optimization work in the last few years (that's what powers the dashboards). For performance it depends what you do. But they both heavily lean on Lucene which remains the open source search library both products use. Elastic cloud is a bit better than opensearch in AWS from what I've seen. If you self host and tune, both should be very similar.

Elastic also just tagged version 9.0, which uses the same new version of Lucene as OpenSearch 3.0. I have support for both new versions in my client already (added that a few weeks ago). It now works with Elasticsearch v7, 8, and 9 and OpenSearch 1, 2, and 3.
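A client that talks to both products first has to work out which one it is connected to. The sketch below is one way to do that from the parsed response of `GET /`, and is not taken from kt-search. It assumes OpenSearch reports `"distribution": "opensearch"` inside the `version` object and that Elasticsearch has no such field; `detect_flavor` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple


def detect_flavor(root_info: Dict[str, Any]) -> Tuple[str, int]:
    """Return (flavor, major_version) from the parsed `GET /` response.

    Assumption: OpenSearch sets version.distribution to "opensearch";
    anything else is treated as Elasticsearch.
    """
    version = root_info.get("version", {})
    major = int(str(version.get("number", "0")).split(".")[0])
    if version.get("distribution") == "opensearch":
        return "opensearch", major
    return "elasticsearch", major
```

With the flavor and major version in hand, a client can branch on the features that diverged after the fork.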

A lot of my consulting clients seem to prefer Opensearch lately. That's mainly because of the less complicated licensing and the AWS support. If you have a legacy Elasticsearch setup switching it to Opensearch should be doable (depending on what you use). But expect to reindex all your data. I don't think a direct migration is possible. If you use Elastic's client libraries, you may need to switch to Opensearch specific ones. This is generally a bit painful (package names, feature differences, etc.). That's why I created kt-search a few years ago.

Salgat•2mo ago

That's what we ended up doing for our migrations. We actually had a bunch of old Elasticsearch 2.3 databases (ancient), so we stood up an OpenSearch database in parallel for each and on service startup did a one-time automatic index and bulk copy over of all the data. So far very happy with OpenSearch.
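A one-time copy like this usually comes down to reading hits from the old cluster and posting them to the new cluster's `_bulk` endpoint in batches. The sketch below only builds the newline-delimited request bodies and is not the commenter's actual code. The function name, the target index name, and the batch size are all assumptions, and sending the payloads is left to whichever HTTP client you use.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def bulk_payloads(
    hits: Iterable[Dict[str, Any]],
    target_index: str,
    batch_size: int = 500,
) -> Iterator[str]:
    """Turn search hits into NDJSON bodies for the _bulk endpoint."""
    lines: List[str] = []
    count = 0
    for hit in hits:
        # Each document is an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": target_index, "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # the body must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"
```

Keeping the original `_id` makes the copy idempotent, so a restart halfway through overwrites documents instead of duplicating them.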

simple10•2mo ago

Ah thanks for the detail! Super useful comment.

blueelephanttea•2mo ago

> Anyone know if it's still a drop in replacement for Elasticsearch?

As you point out it was forked a number of years ago so it started from the same place (7.10). Elasticsearch is now on 9.0+ and has 27,000 more commits than OpenSearch. So I doubt it is a drop-in replacement anymore.

I have no idea how many of those 27K commits are key features, but it is clear divergence.

ignoramous•2mo ago

> Just learning about OpenSearch. Looks like ...

OpenSearch was once a personal search results aggregator conceived at A9 (Amazon's Silicon Valley subsidiary): https://github.com/dewitt/opensearch

Blackthorn•2mo ago

Sometimes, the same name refers to multiple things.

ignoramous•1mo ago

The difference here is, Amazon's repurposing a Trade Mark it owned already for something else entirely.

Macha•2mo ago

One thing that Opensearch misses that would have been very nice to have on a recent project is enrich processors (https://www.elastic.co/docs/manage-data/ingest/transform-enr...)

If you're just using the standard document ingestion and search stuff, yeah, they're mostly compatible. But the fancier features that were part of the paid version in the past or have been recently developed are either not compatible or missing.

aabhay•2mo ago

Does anyone use OpenSearch for its knn and vector capabilities? Is it any good? It’s always hard to know with systems like this whether it works at scale until your team is fighting fires.

alex_duf•2mo ago

It works with some caveats. I've seen it handle searches with millions of documents no problem, but the KNN search requires loading the entire embedding graph into memory. So watch your RAM consumption.

The quality of your results will depend mostly on the quality of your embeddings
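For a rough idea of how much RAM to watch for, the OpenSearch k-NN documentation gives an estimate for HNSW of about `1.1 * (4 * dimension + 8 * m)` bytes per vector. The sketch below simply evaluates that formula. It assumes 4-byte float vectors with no quantization, and `m` (the number of graph links per node) defaults to 16 here as an assumption about your index settings.

```python
def hnsw_memory_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    """Estimate HNSW graph memory, assuming float32 vectors and no quantization."""
    bytes_per_vector = 1.1 * (4 * dimension + 8 * m)
    return bytes_per_vector * num_vectors


def to_gib(num_bytes: float) -> float:
    """Convert bytes to GiB for easier reading."""
    return num_bytes / (1024 ** 3)
```

For example, `to_gib(hnsw_memory_bytes(1_000_000, 256))` comes out a little under 1.2 GiB. Treat the result as a planning number, not a measurement.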

seanhunter•2mo ago

Irrespective of OpenSearch, if the dimension of your vector embedding is reasonably large, you'll probably want an approximate nearest neighbours approach like HNSW rather than exact kNN.

https://docs.opensearch.org/docs/1.2/search-plugins/knn/appr...
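To see why exact kNN gets expensive, here is a minimal brute-force sketch in plain Python. It is only an illustration and not what OpenSearch runs internally. Every query has to score every stored vector, so the cost grows with the number of vectors times the dimension, which is the work HNSW avoids by walking a graph instead.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def exact_knn(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[float, int]]:
    """Return the k best (similarity, index) pairs by scanning every vector."""
    scored = ((cosine(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)
```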

For whatever an endorsement from a random stranger is worth, we've been using OpenSearch as a vector DB for hybrid search across text and multimodal embeddings as well as traditional metadata, and it's been great. We're not "full production" yet, so I can't really speak to scale, but it's OpenSearch, so I expect the scale to be fine.

antirez•2mo ago

I don't know about the OpenSearch implementation, but recently I implemented Vector Sets for Redis from scratch using HNSW as the data structure, and there are many other stores that use the same data structure. When HNSWs are well implemented, you can rest assured they scale very well compared to the task at hand, but you can expect insertion speed only on the order of a few thousand per second if you are hitting a single HNSW. Reads are much faster; in Redis I get 80k/s easily (but it uses multiple cores).

So if you want to build a very, very large index using HNSWs, you have to understand whether you normally have many writes that accumulate evenly, or whether your index is a mostly read-only thing that is rebuilt from time to time. Mass insertion the first time is going to be very slow. You can parallelize it if you build N parallel HNSWs, since the searches can be composed as the union of the results (sorted by cosine similarity). But often the bottleneck is the embedding model itself.

What is really not super scalable is the size of HNSWs. Their memory use is big (Redis by default uses 8-bit quantization for this reason), and on disk they require seeks. If you have large vectors, like 1024 components, quantization is a must.
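The "union of the results" step can be sketched in a few lines. This is an illustration, not Redis or OpenSearch code. It assumes each of the N indexes has already returned its own top hits as `(cosine similarity, document id)` pairs, and that every document lives in exactly one index, so no deduplication is needed.

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (cosine similarity, document id)


def merge_top_k(per_index_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Combine the top hits of several independent indexes into one global top k."""
    all_hits = (hit for hits in per_index_hits for hit in hits)
    return heapq.nlargest(k, all_hits)
```

Each index must be asked for at least k hits, otherwise a true top-k document could be cut off before the merge.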
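As an illustration of what 8-bit quantization buys you, here is a generic symmetric scalar quantizer. It is a sketch of the general idea, not the scheme Redis or OpenSearch actually uses. Each float is mapped to an integer in [-127, 127] plus one shared scale factor, so a 1024-component float32 vector shrinks from 4096 bytes to roughly 1024.

```python
from typing import List, Tuple


def quantize_int8(vector: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one scale per vector."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(x / scale) for x in vector], scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the error is at most half a quantization step."""
    return [c * scale for c in codes]
```

The price is a small loss of precision in the similarity scores, which for most embedding models changes the ranking very little.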

binarymax•2mo ago

I use it all the time. If it’s “good” depends more on your model for embeddings, but you do need to know a bit to tune the index. Whatever algo you choose, read the paper.

If you’re using Lucene HNSW, it will scale but will eat lots and lots of heap RAM. If you’re using the FAISS or nmslib plugins, keep an eye on JNI RAM consumption as well, as it’s outside the heap.

Overall, I’d say that it is a challenge to easily scale ANN past 100M vectors unless it’s given significant attention from the team.

unethical_ban•2mo ago

I just want a quick log ingestion tool that can parse syslog easily and graph/search fields for me.

Setting up a simple log ingestion on Opensearch or ELK felt like a true journey, in a bad way.

binarymax•2mo ago

It’s surprising how challenging this is for both Elastic and Opensearch. The problem is that it’s all configuration and no convention, so you need to roll everything yourself. There should be prescribed recipes to make this simpler. If you’re using something like opentelemetry you can find help easier but it’s still annoying.

nullify88•2mo ago

It's possible but you need to buy in to the Elastic ecosystem. Stuff like *beats, logstash, etc, they can configure all sorts of index templates, and ingest pipelines depending on what you've configured it to receive.

These days, getting data in and out of Elasticsearch is quite easy with dynamic field mapping. It's keeping it performant that is tricky.

dbacar•2mo ago

I think both these tools are on the easy side to set up if you follow their guidelines. You can be up and running very quickly. The problems arise when you need some custom logic in processing log files. If you have simple shipping requirements, you can bypass Logstash altogether. Elastic and OpenSearch are not the right tool for application metrics though, in my opinion; for that use case just use Prometheus and Grafana.

wingmanjd•2mo ago

Have you tried out Graylog? Their core product does pretty decently at my $DAYJOB.

wqtz•2mo ago

I feel sad for this project. This was a reactionary project to elasticsearch's license change to say, heck with it, I will open my own elastic spinoff with AWS.

The vibe of the project's community is pretty much reminiscent of a dead multiplayer game. The community is not thriving, which is essential for an OSS project, and Elasticsearch is virtually irreplaceable in this space. I do not know any enterprise customers using it, because it is unproven and they have failed to show they are going to stick around for the long run.

Then every other SIEM platform is spinning up its own search platform. Heck, I even saw Cribl in their own partner list, and it has its own search platform now. And Elastic has a SIEM platform now with Elastic Security. I'm not sure what the purpose of this project is now. Elastic just won the battle and then later virtue signaled to everyone by saying "we are open source again, y'all", because even if we come around and slap your engineers who said they are not going to touch proprietary code, your management is not going to pay for a migration to an untested fork with no long-term commitment that was essentially made out of spite.

whoevercares•2mo ago

I thought Elastic as the company has been economically damaged by OpenSearch and AWS for many years.

mattmcknight•2mo ago

Yeah, I'm never working with Elastic again. I used Lucene first, then Solr, then a custom scaled version, so I never really needed elasticsearch until using AWS. We did have one project on AWS using elastic, but happily moved to opensearch. Seems fine.

dislore•2mo ago

For those of us unfortunate to use Atlassian Bitbucket, from version 9.0 onwards OpenSearch is the only supported search server [1] - it'll be interesting to see whether this view is ever flipped back to Elasticsearch in the future.

[1] https://confluence.atlassian.com/bitbucketserver/end-of-supp...
