Will Amazon S3 Vectors Kill Vector Databases–Or Save Them?

https://zilliz.com/blog/will-amazon-s3-vectors-kill-vector-databases-or-save-them

82•Fendy•4h ago

Comments

Fendy•4h ago

what do you think?

sharemywin•3h ago

it's annoying to me that there's not a doc store with vectors. seems like the vector dbs just store the vectors I think.

jeffchuber•3h ago

chroma stores both

nkozyra•3h ago

As does Azure's AI search.

intalentive•3h ago

I just use sqlite

storus•3h ago

Pinecone allows 40k of metadata with each vector which is often enough.

whakim•2h ago

Elasticsearch and Vespa both fit the bill for this, if your scale grows beyond the purpose-built vector stores.

simonw•2h ago

Elasticsearch and MongoDB Atlas and PostgreSQL and SQLite all have vector indexes these days.

KaoruAoiShiho•2h ago

> MongoDB Atlas

It took a while but eventually opensource dies.

CuriouslyC•1h ago

My search service Lens returns exact spans from search, while having the best performance both in terms of latency and precision/recall within a budget. I'm just working on release cleanup and final benchmark validation so hopefully I can get it in your hands soon.

resters•3h ago

By hosting the vectors themselves, AWS can meta-optimize its cloud based on content characteristics. It may seem like not a major optimization, but at AWS scale it is billions of dollars per year. It also makes it easier for AWS to comply with censorship requirements.

barbazoo•3h ago

> It also makes it easier for AWS to comply with censorship requirements.

Does it, how? Why would it be the vector store that would make it easier for them to censor the content? Why not censor the documents in S3 directly, or the entries in the relational database. What is different about censoring those vs a vector store?

resters•2h ago

Once a vector has been generated (and someone has paid for it) it can be searched for and relevant content can be identified without AWS incurring any additional cost to create its own separate censorship-oriented index, etc. AWS can also add additional bits to the vector that benefit its internal goals (scalability, censorship, etc.)

Not to mention there is lock-in once you've gone to the trouble of using a specific embedding model on a bunch of content. Ideally we'd converge on backwards-compatible, open source approaches, but cloud vendors want to offer "value" by offering "better" embedding models that are not open source.

barbazoo•2h ago

And that doesn't apply to any other database/search technology AWS offers?

resters•2h ago

It does to some but not to most of it, which is why Azure and GCP offer nearly the exact same core services.

simonw•2h ago

Why would they do that? Doesn't sound like something that would attract further paying customers.

Are there laws on the books that would force them to apply the technology in this way?

resters•2h ago

Not official laws that we can read, but things like that are already in place per the Snowden revelations.

whakim•2h ago

Regardless of the merits of this argument, dedicated vector databases are all running on top of AWS/GCP/Azure infrastructure anyways.

coredog64•2h ago

This comment appears to misunderstand the control plane/data plane distinction of AWS. AWS does have limited access to your control plane, primarily for things like enabling your TAMs to analyze your costs or getting assistance from enterprise support teams. They absolutely do not have access to your dataplane unless you specifically grant it. The primary use case for the latter is allowing writes into your storage for things like ALB access logs to S3. If you were deep in a debug session with enterprise support they might request one-off access to something large in S3, but I would be surprised if that were to happen.

resters•2h ago

If that is the case why create a separate govcloud and HIPAA service?

thedougd•1h ago

HIPAA services are not separate. You only need to establish a Business Associations Addendum (BAA) with AWS and stick to HIPAA eligible services: https://aws.amazon.com/compliance/hipaa-eligible-services-re...

GovCloud exists so that AWS can sell to the US government and their contractors without impacting other customers who have different or less stringent requirements.

simonw•3h ago

This is a good article and seems well balanced despite being written by someone with a product that directly competes with Amazon S3. I particularly appreciated their attempt to reverse-engineer how S3 Vectors work, including this detail:

> Filtering looks to be applied after coarse retrieval. That keeps the index unified and simple, but it struggles with complex conditions. In our tests, when we deleted 50% of data, TopK queries requesting 20 results returned only 15—classic signs of a post-filter pipeline.

Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.

speedysurfer•2h ago

And what if they change their internal implementation and your code depends on the old architecture? It's good practice to clearly think about what to expose to users of your service.

altcognito•2h ago

Knowing how the service will handle certain workloads is an important aspect of choosing an architecture.

libraryofbabel•27m ago

If you can truly abstract away an internal detail, then great. But often there are design decisions that you cannot abstract away because they affect e.g. performance in a major way. For example, I don't care whether some AWS service is written in Java or Go or C++. I do care a bit about how its indexing and retrieval works, because I need to know that to plan my query workloads.

I actually think AWS did a reasonably good job of this with DynamoDB. Most of the performance tradeoffs, indexing etc., is pretty clear if you ready enough docs without exposing a ton of unnecessary internals.

alanwli•1h ago

The alternative is to find solutions that can reasonably support different requirements because business needs change all the time especially in the current state of our industry. From what I’ve seen, OSS Postgres/pgvector can adequately support a wide variety of requirements for millions to low tens of millions of vectors - low latencies, hybrid search, filtered search, ability to serve out of memory and disk, strong-consistency/transactional semantics with operational data. For further scaling/performance (1B+ vectors and even lower latencies), consider SOTA Postgres system like AlloyDB with AlloyDB ScaNN.

Full disclosure: I founded ScaNN in GCP databases and am the lead for AlloyDB Semantic Search. And all these opinions are my own.

libraryofbabel•38m ago

> Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.

Absolutely this. So much engineering time has been wasted on reverse-engineering internal details of things in AWS that could be easily documented. I once spent a couple days empirically determining how exactly cross-AZ least-outstanding-requests load balancing worked with AWS's ALB because the docs didn't tell me. Reverse-engineering can be fun (or at least I kinda enjoy it) but it's not a good use of our time and is one of those shadow costs of using the Cloud.

It's not like there's some secret sauce here in most of these implementation details (there aren't that many ways to design a load balancer). If there was, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users because "The Cloud" when in fact, these details do really matter for performance and other design decisions we have to make.

citizenpaul•34m ago

I have to assume that at this point its either intentional(increases profits?) or because AWS doesn't truly understand their own systems due to the culture of the company.

messe•29m ago

> because AWS doesn't truly understand their own systems due to the culture of the company.

This. There's a lot of freedom in how teams operate. Some teams have great internal documentation, others don't, and a lot of it is scattered across the internal Amazon wiki. I recall having to reach out on slack on multiple occasions to figure out how certain systems worked after diving through docs and the relevant issue trackers didn't make it clear.

cyberax•15m ago

AWS also has a pretty diverse set of hardware, and often several generations of software running in parallel. Usually because the new generation does not quite support 100% of features from the previous generation.

TheSoftwareGuy•31m ago

>It's not like there's some secret sauce here in most of these implementation details. If there was, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users because "The Cloud" when in fact, these details do really matter for performance and other design decisions we have to make.

Having worked inside AWS I can tell you one big reason is the attitude/fear that anything we put in out public docs may end up getting relied on by customers. If customers rely on the implementation to work in a specific way, then changing that detail requires a LOT more work to prevent breaking customer's workloads. If it is even possible at that point.

libraryofbabel•18m ago

And yet "Hyrum's Law" famously says people will come to rely on features of your system anyway, even if they are undocumented. So I'm not convinced this is really customer-centric, it's more AWS being able to say: hey sorry this change broke things for you, but you were relying on an internal detail. I do think there is a better option here where there are important details that are published but with a "this is subject to change at any time" warning slapped on them. Otherwise, like OP says, customers just have to figure it all out on their own.

lazide•55s ago

Sure, but the court isn’t going to consider hyrum’s law in a tort claim, but might consider AWS documentation - even with a disclaimer - with more weight.

Rely on undocumented behavior at your own risk.

whakim•8m ago

> It's not like there's some secret sauce here in most of these implementation details.

IME the implementation of ANN + metadata filtering is often the "secret sauce" behind many vector database implementations.

storus•3h ago

Does this support hybrid search (dense + sparse embeddings)? Pure dense embeddings aren't that great for specific search, they only hit meaning reliably. Amazon's own embeddings also aren't SOTA.

infecto•3h ago

That’s where my mind was rolling and also if not, can this be used in OpenSearch hybrid search?

danielcampos93•1h ago

I think you would be very surprised by the number of customers who don't care if the embeddings are SOTA. For every Joe who wants to talk GraphRAG + MTEB + CMTEB and adaptive rag there are 50 who just want whatever IT/prodsec has approved

qaq•2h ago

"I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself. That flips the usual assumption on its head." Hmm well start sending full documents as part of context see it flip back :).

heywoods•2h ago

Egress costs? I’m really surprised by this. Thanks for sharing.

qaq•2h ago

Sry maybe should've being more clear it was a sarcastic remark. The whole point of doing vector db search is to feed LLM with very targeted context so you can save $ on API calls to LLM.

infecto•2h ago

That’s not the whole point it’s in the intersection of reducing tokens sent but also getting search both specific and generic enough to capture the correct context data.

scosman•2h ago

Anyone interested in this space should look at https://turbopuffer.com - I think they were first to market with S3 backed vector storage, and a good memory cache in front of it.

nosequel•19m ago

Turbopuffer was mentioned in the article.

redskyluan•2h ago

Author of this article.

Yes, I’m the founder and maintainer of the Milvus project, and also a big fan of many AWS projects, including S3, Lambda, and Aurora. Personally, I don’t consider S3Vector to be among the best products in the S3 ecosystem, though I was impressed by its excellent latency control. It’s not particularly fast, nor is it feature-rich, but it seems to embody S3’s design philosophy: being “good enough” for certain scenarios.

In contrast, the products I’ve built usually push for extreme scalability and high performance. Beyond Milvus, I’ve also been deeply involved in the development of HBase and Oracle products. I hope more people will dive into the underlying implementation of S3Vector—this kind of discussion could greatly benefit both the search and storage communities and accelerate their growth.

redskyluan•2h ago

By the way, if you’re not fully satisfied with S3Vector’s write, query, or recall performance, I’d encourage you to take a look at what we’ve built with Zilliz Cloud. It may not always be the lowest-cost option, but it will definitely meet your expectations when it comes to latency and recall.

cpursley•1h ago

Postgres has pgvector. Postgres is where all of my data already lives. It’s all open source and runs anywhere. What am I missing with the specialty vector stores?

CuriouslyC•1h ago

latency, actual retrieval performance, integrated pipelines that do more than just vector search to produce better results, the list goes on.

Postgres for vector search is fine for toy products or stuff that's outside the hot loop of your business but for high performance applications it's just inadequate.

cpursley•1h ago

For the vast majority of applications, the trade off is worth keeping everything in Postgres vs operational overhead of some VC hype data store that won’t be around in 5 years. Most people learned this lesson with Mongo (postgrest jsonb is now good enough for 90% of scenarios).

cpursley•1h ago

Also, no way retrieval performance is going to match pgvector because you still have to join the external vector with your domain data in the main database at the application level, which is always going to be less performant.

CuriouslyC•55m ago

For a large class of applications, the database join is the last step of a very involved pipeline that demands a lot more performance than PGVector can deliver. There are also a large class of applications that don't even interface with the database directly, except to emit logging/traceability artifacts.

CuriouslyC•53m ago

I'm a legit postgres fanboy, my comment history will back this up, but the ops overhead and performance implications of trying to run PGvector as your core vector store for everything is just silly, you're going to be doing all sorts of postgres replication gymnastics to make up for the fact that you're using the wrong tool for the job. It's good for prototyping and small/non-core workloads, use it outside that scope at your own peril.

cpursley•28m ago

Guess I'm just not webscale™

alastairr•9m ago

Interested to hear any more on this. I've been using pinecone for ages, but they recently increased the cost floor for serverless. I've been thinking of moving everything to pgvector (1M ish, so not loads), as all the bigger meta data lives there anyway. But I'd be interested to hear any views on that.

whakim•1m ago

It depends on scale. If you're storing a small number of embeddings (hundreds of thousands, millions) and don't have complicated filters, then absolutely the convenience factor of pgvector will win out. Beyond that, you'll need something more powerful. I do think the dedicated vector stores serve a useful place in the market in that they're extremely "managed" - it is really really easy to just call an API and never worry about pre- or post- filtering or sharding your index across a large cluster.

rubenvanwyk•1h ago

I don’t think it’s either-or, this will probably become the default / go-to - if you aren’t storing your vectors in your db like Neon or Turso.

As far as I understand, Milvus is appropriate for very large scale, so will probably continue targeting enterprise.

janalsncm•33m ago

S3 vectors has a topK limit of 30, and if you add filters it may be less than that. So if you need something with higher topK you’ll need to 1) look elsewhere or 2) shard your dataset into N shards to get NxK results, which you query in parallel and merge afterwards.

I also didn’t see any latency info on their docs page https://docs.aws.amazon.com/AmazonS3/latest/API/API_S3Vector...

Spanish PM Pedro Sánchez says Israel is 'exterminating a defenceless people'

Google is reportedly looking to give other cloud providers access to its TPUs

Using Emacs Org-Mode With Databases: A getting-started guide

Thumbnails helped rats, squirrels, and more take over the world

Super-Resolution with Structured Motion

Early breakfast could help you live longer

Ask HN: Technology Teacher Needs Validation from Smarter People

Show HN: New Site for My GitHub TUI

Phison Pre-Release Firmware Linked to SSD Failures, Not Microsoft Patch

How to make team members hungry

Causal Artificial Intelligence [Free Textbook]

Salt Typhoon used domains, going back five years. Did you visit one?

ScreenShaver

LaraUtilX – A Utility Package for Laravel

Microsoft bets big on nuclear future for data centers

Ask HN: Could AI agents benefit from persistent, shared memory?

The Cause of Alzheimer's Could Be Coming from Within Your Mouth

An Animal's History of Humanity: A Brief History on the Exploitation of Animals

Blue-throated macaws learn by imitating others

Salesloft GitHub Account Compromised Months Before Salesforce Attack

Nova Launcher's founder and sole developer has left

Bug in SAP's S/4 HANA exploited in the wild, rated critical CVSS 9.9

Qantas trims CEO's bonus following July cybersecurity incident

Teen coder made first millennial Catholic saint at youthful Vatican event

Faith in God-like large language models is waning

Michigan Marvel: John King Books has a 'secret,' owner says

Laude Institute – Ship Your Research

iPhone app alerts users to nearby ICE sightings

Kradle: Eval AI with Simulations

Custom Git ignores with a global gitignore file or Git exclude

Spanish PM Pedro Sánchez says Israel is 'exterminating a defenceless people'

Google is reportedly looking to give other cloud providers access to its TPUs

Using Emacs Org-Mode With Databases: A getting-started guide

Thumbnails helped rats, squirrels, and more take over the world

Super-Resolution with Structured Motion

Early breakfast could help you live longer

Ask HN: Technology Teacher Needs Validation from Smarter People

Show HN: New Site for My GitHub TUI

Phison Pre-Release Firmware Linked to SSD Failures, Not Microsoft Patch

How to make team members hungry

Causal Artificial Intelligence [Free Textbook]

Salt Typhoon used domains, going back five years. Did you visit one?

ScreenShaver

LaraUtilX – A Utility Package for Laravel

Microsoft bets big on nuclear future for data centers

Ask HN: Could AI agents benefit from persistent, shared memory?

The Cause of Alzheimer's Could Be Coming from Within Your Mouth

An Animal's History of Humanity: A Brief History on the Exploitation of Animals

Blue-throated macaws learn by imitating others

Salesloft GitHub Account Compromised Months Before Salesforce Attack

Nova Launcher's founder and sole developer has left

Bug in SAP's S/4 HANA exploited in the wild, rated critical CVSS 9.9

Qantas trims CEO's bonus following July cybersecurity incident

Teen coder made first millennial Catholic saint at youthful Vatican event

Faith in God-like large language models is waning

Michigan Marvel: John King Books has a 'secret,' owner says

Laude Institute – Ship Your Research

iPhone app alerts users to nearby ICE sightings

Kradle: Eval AI with Simulations

Custom Git ignores with a global gitignore file or Git exclude

Will Amazon S3 Vectors Kill Vector Databases–Or Save Them?

Comments