
Moltbook isn't real but it can still hurt you

https://12gramsofcarbon.com/p/tech-things-moltbook-isnt-real-but
1•theahura•3m ago•0 comments

Take Back the Em Dash–and Your Voice

https://spin.atomicobject.com/take-back-em-dash/
1•ingve•3m ago•0 comments

Show HN: 289x speedup over MLP using Spectral Graphs

https://zenodo.org/login/?next=%2Fme%2Fuploads%3Fq%3D%26f%3Dshared_with_me%25253Afalse%26l%3Dlist...
1•andrespi•4m ago•0 comments

Teaching Mathematics

https://www.karlin.mff.cuni.cz/~spurny/doc/articles/arnold.htm
1•samuel246•7m ago•0 comments

3D Printed Microfluidic Multiplexing [video]

https://www.youtube.com/watch?v=VZ2ZcOzLnGg
2•downboots•7m ago•0 comments

Abstractions Are in the Eye of the Beholder

https://software.rajivprab.com/2019/08/29/abstractions-are-in-the-eye-of-the-beholder/
2•whack•7m ago•0 comments

Show HN: Routed Attention – 75-99% savings by routing between O(N) and O(N²)

https://zenodo.org/records/18518956
1•MikeBee•7m ago•0 comments

We didn't ask for this internet – Ezra Klein show [video]

https://www.youtube.com/shorts/ve02F0gyfjY
1•softwaredoug•8m ago•0 comments

The Real AI Talent War Is for Plumbers and Electricians

https://www.wired.com/story/why-there-arent-enough-electricians-and-plumbers-to-build-ai-data-cen...
2•geox•11m ago•0 comments

Show HN: MimiClaw, OpenClaw (Clawdbot) on $5 Chips

https://github.com/memovai/mimiclaw
1•ssslvky1•11m ago•0 comments

I Maintain My Blog in the Age of Agents

https://www.jerpint.io/blog/2026-02-07-how-i-maintain-my-blog-in-the-age-of-agents/
2•jerpint•12m ago•0 comments

The Fall of the Nerds

https://www.noahpinion.blog/p/the-fall-of-the-nerds
1•otoolep•13m ago•0 comments

I'm 15 and built a free tool for reading Greek/Latin texts. Would love feedback

https://the-lexicon-project.netlify.app/
2•breadwithjam•16m ago•0 comments

How close is AI to taking my job?

https://epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job
1•cjbarber•16m ago•0 comments

You are the reason I am not reviewing this PR

https://github.com/NixOS/nixpkgs/pull/479442
2•midzer•18m ago•1 comments

Show HN: FamilyMemories.video – Turn static old photos into 5s AI videos

https://familymemories.video
1•tareq_•20m ago•0 comments

How Meta Made Linux a Planet-Scale Load Balancer

https://softwarefrontier.substack.com/p/how-meta-turned-the-linux-kernel
1•CortexFlow•20m ago•0 comments

A Turing Test for AI Coding

https://t-cadet.github.io/programming-wisdom/#2026-02-06-a-turing-test-for-ai-coding
2•phi-system•20m ago•0 comments

How to Identify and Eliminate Unused AWS Resources

https://medium.com/@vkelk/how-to-identify-and-eliminate-unused-aws-resources-b0e2040b4de8
3•vkelk•21m ago•0 comments

A2CDVI – HDMI output from the Apple IIc's digital video output connector

https://github.com/MrTechGadget/A2C_DVI_SMD
2•mmoogle•21m ago•0 comments

CLI for Common Playwright Actions

https://github.com/microsoft/playwright-cli
3•saikatsg•22m ago•0 comments

Would you use an e-commerce platform that shares transaction fees with users?

https://moondala.one/
1•HamoodBahzar•24m ago•1 comments

Show HN: SafeClaw – a way to manage multiple Claude Code instances in containers

https://github.com/ykdojo/safeclaw
3•ykdojo•27m ago•0 comments

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3
3•gmays•28m ago•0 comments

The Evolution of the Interface

https://www.asktog.com/columns/038MacUITrends.html
2•dhruv3006•29m ago•1 comments

Azure: Virtual network routing appliance overview

https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-routing-appliance-overview
3•mariuz•29m ago•0 comments

Seedance2 – multi-shot AI video generation

https://www.genstory.app/story-template/seedance2-ai-story-generator
2•RyanMu•33m ago•1 comments

Πfs – The Data-Free Filesystem

https://github.com/philipl/pifs
2•ravenical•36m ago•0 comments

Go-busybox: A sandboxable port of busybox for AI agents

https://github.com/rcarmo/go-busybox
3•rcarmo•37m ago•0 comments

Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery [pdf]

https://research.nvidia.com/labs/nemotron/files/NVFP4-QAD-Report.pdf
2•gmays•38m ago•0 comments

Will Amazon S3 Vectors kill vector databases or save them?

https://zilliz.com/blog/will-amazon-s3-vectors-kill-vector-databases-or-save-them
280•Fendy•5mo ago

Comments

Fendy•5mo ago
what do you think?
sharemywin•5mo ago
it's annoying to me that there's not a doc store with vectors. seems like the vector dbs just store the vectors I think.
jeffchuber•5mo ago
chroma stores both
nkozyra•5mo ago
As does Azure's AI search.
intalentive•5mo ago
I just use sqlite
storus•5mo ago
Pinecone allows 40KB of metadata with each vector, which is often enough.
whakim•5mo ago
Elasticsearch and Vespa both fit the bill for this, if your scale grows beyond the purpose-built vector stores.
simonw•5mo ago
Elasticsearch and MongoDB Atlas and PostgreSQL and SQLite all have vector indexes these days.
KaoruAoiShiho•5mo ago
> MongoDB Atlas

It took a while, but eventually open source dies.

CuriouslyC•5mo ago
My search service Lens returns exact spans from search, while having the best performance both in terms of latency and precision/recall within a budget. I'm just working on release cleanup and final benchmark validation so hopefully I can get it in your hands soon.
resters•5mo ago
By hosting the vectors themselves, AWS can meta-optimize its cloud based on content characteristics. It may not seem like a major optimization, but at AWS scale it's worth billions of dollars per year. It also makes it easier for AWS to comply with censorship requirements.
barbazoo•5mo ago
> It also makes it easier for AWS to comply with censorship requirements.

Does it, how? Why would it be the vector store that would make it easier for them to censor the content? Why not censor the documents in S3 directly, or the entries in the relational database? What is different about censoring those vs a vector store?

resters•5mo ago
Once a vector has been generated (and someone has paid for it) it can be searched for and relevant content can be identified without AWS incurring any additional cost to create its own separate censorship-oriented index, etc. AWS can also add additional bits to the vector that benefit its internal goals (scalability, censorship, etc.)

Not to mention there is lock-in once you've gone to the trouble of using a specific embedding model on a bunch of content. Ideally we'd converge on backwards-compatible, open source approaches, but cloud vendors want to offer "value" by offering "better" embedding models that are not open source.

barbazoo•5mo ago
And that doesn't apply to any other database/search technology AWS offers?
resters•5mo ago
It does to some but not to most of it, which is why Azure and GCP offer nearly the exact same core services.
simonw•5mo ago
Why would they do that? Doesn't sound like something that would attract further paying customers.

Are there laws on the books that would force them to apply the technology in this way?

resters•5mo ago
Not official laws that we can read, but things like that are already in place per the Snowden revelations.
whakim•5mo ago
Regardless of the merits of this argument, dedicated vector databases are all running on top of AWS/GCP/Azure infrastructure anyways.
coredog64•5mo ago
This comment appears to misunderstand the control plane/data plane distinction of AWS. AWS does have limited access to your control plane, primarily for things like enabling your TAMs to analyze your costs or getting assistance from enterprise support teams. They absolutely do not have access to your dataplane unless you specifically grant it. The primary use case for the latter is allowing writes into your storage for things like ALB access logs to S3. If you were deep in a debug session with enterprise support they might request one-off access to something large in S3, but I would be surprised if that were to happen.
resters•5mo ago
If that is the case why create a separate govcloud and HIPAA service?
thedougd•5mo ago
HIPAA services are not separate. You only need to establish a Business Associate Addendum (BAA) with AWS and stick to HIPAA eligible services: https://aws.amazon.com/compliance/hipaa-eligible-services-re...

GovCloud exists so that AWS can sell to the US government and their contractors without impacting other customers who have different or less stringent requirements.

everfrustrated•5mo ago
Product segmentation. Certain customers self-select to pay more for the same thing.
j45•5mo ago
Also, if it's not encrypted, I'm not sure if AWS or others "synthesize" customer data by a cursory scrubbing of so called client identifying information, and then try to optimize and model for those scenarios at scale.

I do feel more and more some information in the corpus of AI models was done this way. A client's name and private identifiable information might not be in the model, but some patterns of how to do things sure seem to come up from such sources.

simonw•5mo ago
This is a good article and seems well balanced despite being written by someone with a product that directly competes with Amazon S3. I particularly appreciated their attempt to reverse-engineer how S3 Vectors work, including this detail:

> Filtering looks to be applied after coarse retrieval. That keeps the index unified and simple, but it struggles with complex conditions. In our tests, when we deleted 50% of data, TopK queries requesting 20 results returned only 15—classic signs of a post-filter pipeline.

Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.
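
For readers unfamiliar with the distinction, here is a toy Python sketch of why a post-filter pipeline comes up short after deletions (an illustration of the general pattern, not S3 Vectors' actual implementation, which isn't public):

  import numpy as np

  def post_filter_search(index_vectors, live_ids, query, k=20):
      # Coarse ANN-style retrieval over the whole index, deleted rows included.
      scores = index_vectors @ query
      top = np.argsort(-scores)[:k]
      # Filtering happens afterwards, so dropped rows simply shrink the result set.
      return [i for i in top if i in live_ids]

  def pre_filter_search(index_vectors, live_ids, query, k=20):
      # Restrict candidates first, then take the top-k: always returns k results.
      candidates = sorted(live_ids)
      scores = index_vectors[candidates] @ query
      top = np.argsort(-scores)[:k]
      return [candidates[i] for i in top]

  rng = np.random.default_rng(0)
  vectors = rng.normal(size=(1000, 8))
  query = rng.normal(size=8)
  live = set(rng.choice(1000, size=500, replace=False))  # "delete" 50% of the data

  print(len(post_filter_search(vectors, live, query)))  # typically ~10 of the 20 requested
  print(len(pre_filter_search(vectors, live, query)))   # always 20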

speedysurfer•5mo ago
And what if they change their internal implementation and your code depends on the old architecture? It's good practice to clearly think about what to expose to users of your service.
altcognito•5mo ago
Knowing how the service will handle certain workloads is an important aspect of choosing an architecture.
libraryofbabel•5mo ago
If you can truly abstract away an internal detail, then great. But often there are design decisions that you cannot abstract away because they affect e.g. performance in a major way. For example, I don't care whether some AWS service is written in Java or Go or C++. I do care a bit about how its indexing and retrieval works, because I need to know that to plan my query workloads.

I actually think AWS did a reasonably good job of this with DynamoDB. Most of the performance tradeoffs, indexing, etc. are pretty clear if you read enough docs, without exposing a ton of unnecessary internals.

alanwli•5mo ago
The alternative is to find solutions that can reasonably support different requirements because business needs change all the time especially in the current state of our industry. From what I’ve seen, OSS Postgres/pgvector can adequately support a wide variety of requirements for millions to low tens of millions of vectors - low latencies, hybrid search, filtered search, ability to serve out of memory and disk, strong-consistency/transactional semantics with operational data. For further scaling/performance (1B+ vectors and even lower latencies), consider SOTA Postgres system like AlloyDB with AlloyDB ScaNN.

Full disclosure: I founded ScaNN in GCP databases and am the lead for AlloyDB Semantic Search. And all these opinions are my own.
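
For a sense of how little ceremony that path involves, here is a minimal pgvector sketch in Python; the connection string, table name, and dimensions are hypothetical, and it assumes pgvector 0.5+ for HNSW support:

  import psycopg2

  conn = psycopg2.connect("dbname=app user=app")  # hypothetical DSN
  cur = conn.cursor()

  cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
  cur.execute("""
      CREATE TABLE IF NOT EXISTS docs (
          id bigserial PRIMARY KEY,
          content text,
          embedding vector(768)  -- dimension depends on your embedding model
      )
  """)
  # Approximate nearest-neighbor index (HNSW, cosine distance).
  cur.execute(
      "CREATE INDEX IF NOT EXISTS docs_embedding_idx "
      "ON docs USING hnsw (embedding vector_cosine_ops)"
  )
  conn.commit()

  # Query: vector similarity, filters, and joins all live in one database.
  query_embedding = "[" + ",".join(["0.01"] * 768) + "]"  # placeholder vector literal
  cur.execute(
      "SELECT id, content FROM docs ORDER BY embedding <=> %s::vector LIMIT 10",
      (query_embedding,),
  )
  print(cur.fetchall())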

libraryofbabel•5mo ago
> Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.

Absolutely this. So much engineering time has been wasted on reverse-engineering internal details of things in AWS that could be easily documented. I once spent a couple days empirically determining how exactly cross-AZ least-outstanding-requests load balancing worked with AWS's ALB because the docs didn't tell me. Reverse-engineering can be fun (or at least I kinda enjoy it) but it's not a good use of our time and is one of those shadow costs of using the Cloud.

It's not like there's some secret sauce here in most of these implementation details (there aren't that many ways to design a load balancer). If there was, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users because "The Cloud" when in fact, these details do really matter for performance and other design decisions we have to make.

citizenpaul•5mo ago
I have to assume that at this point it's either intentional (increases profits?) or because AWS doesn't truly understand their own systems due to the culture of the company.
messe•5mo ago
> because AWS doesn't truly understand their own systems due to the culture of the company.

This. There's a lot of freedom in how teams operate. Some teams have great internal documentation, others don't, and a lot of it is scattered across the internal Amazon wiki. I recall having to reach out on Slack on multiple occasions to figure out how certain systems worked, after diving through the docs and the relevant issue trackers didn't make it clear.

cyberax•5mo ago
AWS also has a pretty diverse set of hardware, and often several generations of software running in parallel. Usually because the new generation does not quite support 100% of features from the previous generation.
TheSoftwareGuy•5mo ago
>It's not like there's some secret sauce here in most of these implementation details. If there was, I'd understand not telling us. This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users because "The Cloud" when in fact, these details do really matter for performance and other design decisions we have to make.

Having worked inside AWS I can tell you one big reason is the attitude/fear that anything we put in our public docs may end up getting relied on by customers. If customers rely on the implementation to work in a specific way, then changing that detail requires a LOT more work to prevent breaking customers' workloads. If it is even possible at that point.

libraryofbabel•5mo ago
And yet "Hyrum's Law" famously says people will come to rely on features of your system anyway, even if they are undocumented. So I'm not convinced this is really customer-centric, it's more AWS being able to say: hey sorry this change broke things for you, but you were relying on an internal detail. I do think there is a better option here where there are important details that are published but with a "this is subject to change at any time" warning slapped on them. Otherwise, like OP says, customers just have to figure it all out on their own.
lazide•5mo ago
Sure, but the court isn’t going to consider hyrum’s law in a tort claim, but might consider AWS documentation - even with a disclaimer - with more weight.

Rely on undocumented behavior at your own risk.

vlovich123•5mo ago
Has Amazon ever been taken to court for things like this? I really don't think this is a legal concern.
lazide•5mo ago
Amazon is involved in so many lawsuits right now, I honestly can’t tell. I did some google searches and gave up after 5+ pages.
teaearlgraycold•5mo ago
I don't buy the legal angle. But if I was an overworked Amazon SWE I'd also like to avoid the work of documentation and a proper migration the next time implementation is changed.
TheSoftwareGuy•4mo ago
You're right, people absolutely do rely on internal behavior intentionally and sometimes even unintentionally. And we tried our hardest not to break any of those customers either. But the point is that putting something in the docs is seen as a promise that you can rely on it. And going back on a promise is the exact opposite of the "Earns Trust" leadership principle that everyone is evaluated against.
wubrr•5mo ago
Right now, it is basically impossible to reliably build full applications with things like DynamoDB (among other AWS products), without relying on internal behaviour which isn't explicitly documented.
JustExAWS•5mo ago
I am also a former AWS employee. What non public information did you need for DDB?
tracker1•5mo ago
Try ingesting a complete WHOIS dump into DDB sometime. This was before autoscaling worked at all when I tried... but it absolutely wasn't anything one can consider fun.

In the end, after multiple implementations, we finally had to use a Java Spring app on a server with a LOT of RAM just to buffer the CSV reads without blowing up on the pushback from DDB. I think the company spent over $20k over a couple of months on different efforts in a couple of different languages (C#/.Net, Node.js, Java) across a couple of different routes (multiple queues, lambda, etc.) just to get the initial data ingestion working a first time.

The Node.js implementation was fastest, but would always blow up a few days in without the ability to catch with a debugger attached. The queues and lambda experiments had throttling issues similar to the DynamoDB ingestion itself, even with the knobs turned all the way up. I don't recall what the issue with the .Net implementation was at the time, but it blew up differently.

I don't recall all the details, and tbh I shouldn't care, but it would have been nice if there was some extra guidance on getting a few GB of CSV into DynamoDB at the time. To this day, I still hate ETL work.
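
Not what was available to the parent at the time, but for anyone hitting the same wall today, a rough sketch of throttle-tolerant bulk ingestion with boto3's batch_writer; the table name and CSV columns are hypothetical:

  import csv
  import boto3

  # batch_writer buffers items into 25-item BatchWriteItem calls and retries
  # unprocessed items, which absorbs much of the throttling pushback from DDB.
  table = boto3.resource("dynamodb").Table("whois")  # hypothetical table

  def ingest(csv_path):
      with open(csv_path, newline="") as f, \
           table.batch_writer(overwrite_by_pkeys=["domain"]) as batch:
          for row in csv.DictReader(f):
              batch.put_item(Item={
                  "domain": row["domain"],               # hypothetical partition key
                  "registrar": row.get("registrar", "n/a"),
              })

  ingest("whois_com.csv")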

JustExAWS•5mo ago
https://docs.aws.amazon.com/amazondynamodb/latest/developerg...
tracker1•5mo ago
Cool... though that would make it difficult to get the hundred or so CSVs into a single table, since that isn't supported. I guess stitching them together before processing would be easy enough... also, no idea when that feature became available.
JustExAWS•5mo ago
It’s never been a good idea to batch ingest a lot of little single files using any ETL process on AWS, whether it be DDB, Aurora MySQL/Postgres using “load data from S3…”, Redshift batch import from S3, or just using Athena (yeah I’ve done all of them).
tracker1•5mo ago
These weren't "little" single files... just separated by tld iirc.
everfrustrated•5mo ago
Why would you expect an OLTP db like DDB to work for ETL? You'd have the same problems if you used Postgres.

It's not like AWS is short on ETL technologies to use...

scarface_74•5mo ago
Even in an OLTP db, there is often a need to bulk import and export data. AWS has methods in most supported data stores - ElasticSearch, DDB, MySQL, Aurora, Redshift, etc. - to bulk insert from S3.
cyberax•5mo ago
A tool to look at hot partitions, for one thing.
JustExAWS•5mo ago
It should handle that automatically

https://aws.amazon.com/blogs/database/part-2-scaling-dynamod...

cyberax•5mo ago
The keyword here is "should" :) Back then DynamoDB also had a problem with scaling: the data can easily be split into partitions, but it's never merged back into fewer partitions.

So if you scaled up and then down, you might have ended with a lot of partitions that got only a few IOPS quota each. It's better now with burst IOPS, but it still is a problem sometimes.
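
A back-of-the-envelope illustration of that effect, with made-up numbers (the ~1000 WCU per-partition ceiling is the historically documented figure):

  provisioned_wcu = 40_000                 # temporarily scaled up for a bulk load
  partitions = provisioned_wcu // 1_000    # DDB splits so no partition exceeds ~1000 WCU -> 40

  steady_state_wcu = 2_000                 # scaled back down after the load
  per_partition = steady_state_wcu / partitions
  print(per_partition)                     # 50 WCU per partition: hot keys throttle even
                                           # though the table nominally has 2000 WCU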

cbsmith•5mo ago
I've built several DynamoDB apps, and while you might have some expectations of internal behaviour, you can build apps that are pretty resilient to change of the internal behaviour but rely heavily on the documented behaviour. I actually find the extent of the opacity a helpful guide on the limitations of the service.
catlifeonmars•5mo ago
Agree. TTL 48h SLA comes to mind.
mannyv•5mo ago
Totally incorrect for Dynamo.

It was probably correct for Cognito 1.0.

simonw•5mo ago
Thanks for this, that's a really insightful comment.
scarface_74•5mo ago
You have been quoted by Simon Willison on his blog - his blog is popular on HN.

https://simonwillison.net/2025/Sep/8/thesoftwareguy/#atom-ev...

UltraSane•5mo ago
Just add an option to re-enable spacebar heating.
thiagowfx•5mo ago
https://www.hyrumslaw.com/
whakim•5mo ago
> It's not like there's some secret sauce here in most of these implementation details.

IME the implementation of ANN + metadata filtering is often the "secret sauce" behind many vector database implementations.

javier2•5mo ago
It's likely not specified because they want to keep the right to improve or change it later. Documenting in too much detail makes later changes way harder.
kenhwang•5mo ago
Did you have an account manager or support contract with AWS? IME, they're more than willing to set up a call with one of their engineers to disclose implementation details like this after your company signs an NDA.
ithkuil•5mo ago
OTOH once you document something you need to do more work when you change the behaviour
BobbyJo•5mo ago
> This is probably less an Apple-style culture of secrecy and more laziness and a belief that important details have been abstracted away from us users

As someone who had worked in providing infra to third parties, I can say that providing more detail than necessary will hurt your chances with some bigger customers. Giving them more information than they need or ask for makes your product look more complicated.

However sophisticated you think a customer of this product will be, go lower.

yupyupyups•5mo ago
>So much engineering time has been wasted on reverse-engineering internal details of things

It feels like this is true for proprietary software in general.

tw04•5mo ago
Detailed documentation would allow for a fair comparison of competing products. Opaque documentation allows AWS to sell "business value" to upper management while proclaiming anyone asking for more detail isn't focused on what's important.
apwell23•5mo ago
That would increase surface area of the abstraction they are trying to expose. This is not a case of failure to document.

One should only "poke around" an abstraction like this for fun and curiosity and not with intention of putting the finding to real use.

storus•5mo ago
Does this support hybrid search (dense + sparse embeddings)? Pure dense embeddings aren't that great for specific searches; they only capture meaning reliably. Amazon's own embeddings also aren't SOTA.
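
For context, one common way to get hybrid behavior when the store itself only does dense search is to run sparse and dense retrieval separately and merge with reciprocal rank fusion; a minimal sketch with illustrative IDs:

  from collections import defaultdict

  def reciprocal_rank_fusion(result_lists, k=60):
      # Merge several ranked ID lists (e.g. one from BM25, one from dense ANN).
      scores = defaultdict(float)
      for results in result_lists:
          for rank, doc_id in enumerate(results):
              scores[doc_id] += 1.0 / (k + rank + 1)
      return sorted(scores, key=scores.get, reverse=True)

  dense_hits = ["d7", "d2", "d9", "d4"]   # from the vector index
  sparse_hits = ["d2", "d5", "d7", "d1"]  # from BM25 / keyword search
  print(reciprocal_rank_fusion([dense_hits, sparse_hits])[:3])  # ['d2', 'd7', ...]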
infecto•5mo ago
That’s where my mind was going too, and if not, can this be used in OpenSearch hybrid search?
danielcampos93•5mo ago
I think you would be very surprised by the number of customers who don't care if the embeddings are SOTA. For every Joe who wants to talk GraphRAG + MTEB + CMTEB and adaptive rag there are 50 who just want whatever IT/prodsec has approved
qaq•5mo ago
"I recently spoke with the CTO of a popular AI note-taking app who told me something surprising: they spend twice as much on vector search as they do on OpenAI API calls. Think about that for a second. Running the retrieval layer costs them more than paying for the LLM itself. That flips the usual assumption on its head." Hmm well start sending full documents as part of context see it flip back :).
heywoods•5mo ago
Egress costs? I’m really surprised by this. Thanks for sharing.
qaq•5mo ago
Sorry, maybe I should've been more clear: it was a sarcastic remark. The whole point of doing vector DB search is to feed the LLM very targeted context so you can save $ on API calls to the LLM.
infecto•5mo ago
That’s not the whole point; it’s the intersection of reducing the tokens sent and making search both specific and generic enough to capture the correct context data.
j45•5mo ago
It's possible to create linking documents between the documents to help smooth out things in some cases.
heywoods•5mo ago
No worries. I should probably make sure I have at least a token understanding of cloud-based architecture before commenting next time haha.
andreasgl•5mo ago
They’re likely using an HNSW index, which typically requires a lot of memory for large data sets.
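A rough back-of-the-envelope for why that is, with illustrative numbers:

  vectors = 50_000_000      # illustrative corpus size
  dims = 768
  bytes_per_float = 4       # float32

  raw_gb = vectors * dims * bytes_per_float / 1e9
  print(f"~{raw_gb:.0f} GB just for raw float32 vectors")  # ~154 GB, before HNSW's
                                                           # per-node neighbor links and overhead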
dahcryn•5mo ago
if they use AzureSearch, I fully understand it. Those things are hella expensive
scosman•5mo ago
Anyone interested in this space should look at https://turbopuffer.com - I think they were first to market with S3 backed vector storage, and a good memory cache in front of it.
nosequel•5mo ago
Turbopuffer was mentioned in the article.
k9294•5mo ago
Turbopuffer is awesome, really recommend it. Also they have extra features like automatic recall tuning based on your data, the option to choose read-after-write guarantees (trading latency for consistency or vice versa), BM25 search, filtering on the field, and many more.

Really recommend checking them out if you need a vector DB. I tried qdrant and Zilliz cloud solutions, and in terms of operational simplicity turbopuffer is just killing it.

https://turbopuffer.com/docs/query

redskyluan•5mo ago
Author of this article.

Yes, I’m the founder and maintainer of the Milvus project, and also a big fan of many AWS projects, including S3, Lambda, and Aurora. Personally, I don’t consider S3Vector to be among the best products in the S3 ecosystem, though I was impressed by its excellent latency control. It’s not particularly fast, nor is it feature-rich, but it seems to embody S3’s design philosophy: being “good enough” for certain scenarios.

In contrast, the products I’ve built usually push for extreme scalability and high performance. Beyond Milvus, I’ve also been deeply involved in the development of HBase and Oracle products. I hope more people will dive into the underlying implementation of S3Vector—this kind of discussion could greatly benefit both the search and storage communities and accelerate their growth.

redskyluan•5mo ago
By the way, if you’re not fully satisfied with S3Vector’s write, query, or recall performance, I’d encourage you to take a look at what we’ve built with Zilliz Cloud. It may not always be the lowest-cost option, but it will definitely meet your expectations when it comes to latency and recall.
pradn•5mo ago
Thanks for writing a balanced article - much easier to take your arguments seriously! And a sign of expertise.
Shakahs•5mo ago
While your technical analysis is excellent, making judgements about workload suitability based on a Preview release is premature. Preview services have historically had significantly lower performance quotas than GA releases. Lambda for example was limited to 50 concurrent executions during Preview, raised to 100 at GA, and now the default limit is 1,000.
cpursley•5mo ago
Postgres has pgvector. Postgres is where all of my data already lives. It’s all open source and runs anywhere. What am I missing with the specialty vector stores?
CuriouslyC•5mo ago
latency, actual retrieval performance, integrated pipelines that do more than just vector search to produce better results, the list goes on.

Postgres for vector search is fine for toy products or stuff that's outside the hot loop of your business but for high performance applications it's just inadequate.

cpursley•5mo ago
For the vast majority of applications, the trade-off is worth keeping everything in Postgres vs the operational overhead of some VC-hype data store that won’t be around in 5 years. Most people learned this lesson with Mongo (Postgres jsonb is now good enough for 90% of scenarios).
cpursley•5mo ago
Also, no way retrieval performance is going to match pgvector, because you still have to join the external vector results with your domain data in the main database at the application level, which is always going to be less performant.
CuriouslyC•5mo ago
For a large class of applications, the database join is the last step of a very involved pipeline that demands a lot more performance than PGVector can deliver. There are also a large class of applications that don't even interface with the database directly, except to emit logging/traceability artifacts.
jitl•5mo ago
I'll take a 100ms turbopuffer vector search plus a 50ms postgres-select-where-id-in over a 500ms all-in-one pgvector + join query.

When you only need to hydrate like 30 search result item IDs from Postgres or memcached, I don't see the join being "too expensive" to do in memory.
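
A sketch of that two-step pattern; vector_search below is a stand-in stub for whatever external index is used, and the table and column names are hypothetical:

  import psycopg2

  def vector_search(query_embedding, top_k):
      # Stand-in for the external vector index (turbopuffer, S3 Vectors, ...):
      # it returns ranked IDs only, no payloads.
      return [42, 7, 19][:top_k]

  def search(query_embedding, k=30):
      ids = vector_search(query_embedding, top_k=k)

      # Hydrate the handful of matching rows from Postgres in one round trip.
      conn = psycopg2.connect("dbname=app")  # hypothetical DSN
      cur = conn.cursor()
      cur.execute("SELECT id, title, body FROM documents WHERE id = ANY(%s)", (ids,))
      rows = {row[0]: row for row in cur.fetchall()}

      # Re-apply the index's ranking, since the SQL fetch comes back unordered.
      return [rows[i] for i in ids if i in rows]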

CuriouslyC•5mo ago
I'm a legit postgres fanboy, my comment history will back this up, but the ops overhead and performance implications of trying to run PGvector as your core vector store for everything is just silly, you're going to be doing all sorts of postgres replication gymnastics to make up for the fact that you're using the wrong tool for the job. It's good for prototyping and small/non-core workloads, use it outside that scope at your own peril.
cpursley•5mo ago
Guess I'm just not webscale™
alastairr•5mo ago
Interested to hear any more on this. I've been using pinecone for ages, but they recently increased the cost floor for serverless. I've been thinking of moving everything to pgvector (1M ish, so not loads), as all the bigger meta data lives there anyway. But I'd be interested to hear any views on that.
whakim•5mo ago
At 1M embeddings I'd think pgvector would do just fine assuming a sufficiently powerful database.
CuriouslyC•5mo ago
It depends on your flow honestly. If you're just using your vectors for where filters on domain objects and you don't have hundreds of millions of vectors PGVec is fine. If you have any sort of workflow where you need low latency access to vectors and reliable random read performance, or where vector work is the bottleneck on performance, PGVec goes tits up.
j45•5mo ago
Appreciate the clarification. I have been using it for small / medium things and it's been OK.

The "everything Postgres for as long as reasonably possible" approach is fun, but not something I expect to last forever.

whakim•5mo ago
It depends on scale. If you're storing a small number of embeddings (hundreds of thousands, millions) and don't have complicated filters, then absolutely the convenience factor of pgvector will win out. Beyond that, you'll need something more powerful. I do think the dedicated vector stores serve a useful place in the market in that they're extremely "managed" - it is really really easy to just call an API and never worry about pre- or post- filtering or sharding your index across a large cluster. But they also have weaknesses in that they're usually optimized around small(er) scale where the bulk of their customers lie, and they don't really replace an actual search system like ElasticSearch.
rubenvanwyk•5mo ago
I don’t think it’s either-or; this will probably become the default / go-to if you aren’t storing your vectors in your DB like Neon or Turso.

As far as I understand, Milvus is appropriate for very large scale, so will probably continue targeting enterprise.

janalsncm•5mo ago
S3 vectors has a topK limit of 30, and if you add filters it may be less than that. So if you need something with higher topK you’ll need to 1) look elsewhere or 2) shard your dataset into N shards to get NxK results, which you query in parallel and merge afterwards.

I also didn’t see any latency info on their docs page https://docs.aws.amazon.com/AmazonS3/latest/API/API_S3Vector...

mediaman•5mo ago
And a topk of 30 also means reranking of any sort is out, except for maybe limited reranking of 30->10, but that seems kind of pointless with today’s LLMs that can handle a bit more context.
janalsncm•5mo ago
Yeah exactly, so you could do something like shard by the first 4 bits of md5 of the text (gives you 16 buckets) but now you’re adding extra complexity to work around their limitations.
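
A sketch of that sharded workaround; query_shard is a stand-in for a single per-index query (not the real S3 Vectors API call), and the bucket count follows the 4-bit md5 scheme above:

  import hashlib
  from concurrent.futures import ThreadPoolExecutor

  N_SHARDS = 16  # first 4 bits of md5(text) -> 16 buckets

  def shard_of(text):
      # Route each document to one of 16 indexes at write time.
      return hashlib.md5(text.encode()).digest()[0] >> 4

  def query_shard(shard_id, query_vector, k):
      # Stand-in for one per-shard query (each capped at topK=30 in S3 Vectors).
      return []  # replace with the real per-index call; items need a "score" field

  def query_all(query_vector, k=30):
      with ThreadPoolExecutor(max_workers=N_SHARDS) as pool:
          futures = [pool.submit(query_shard, s, query_vector, k) for s in range(N_SHARDS)]
          hits = [h for f in futures for h in f.result()]
      # Merge into a global ranking of up to N_SHARDS * k results.
      return sorted(hits, key=lambda h: h["score"], reverse=True)[: N_SHARDS * k]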
catlifeonmars•5mo ago
3) ask TAM for a service quota increase
conradev•5mo ago

> At a glance, it looks like a lightweight vector database running on top of low-cost object storage—at a price point that is clearly attractive compared to many dedicated vector database solutions.
They also didn’t mention LanceDB, which fits this description but with an open source component: https://lancedb.github.io/lancedb/
kjfarm•5mo ago
This may be because LanceDB is the most attractive, with a price point of standard S3 storage ($0.023/GB vs $0.06/GB). I also like that LanceDB works with S3-compatible stores, such as Backblaze B2, which is even cheaper (~70% cheaper).
nickpadge•5mo ago
I love lancedb. It’s the only way I’ve found to performantly and cheaply serve 50m+ records of 768 dimensions. It runs a bit too slow on S3, but on EFS it can still be a few hundred millis.
factsaresacred•5mo ago
For low cost, there's also Cloudflare Vectorize ($0.05 per 100 million stored vectors), which nobody seems to know exists: https://www.cloudflare.com/developer-platform/products/vecto...
hbcondo714•5mo ago
It would be great to have the vector database run on the edge / on-device for offline-first and be privacy-focused. https://objectbox.io/ does this but i would like to see AWS and others offer this as well.
greenavocado•5mo ago
I am already using Qdrant very heavily for code dev (RAG) and I don't see that changing any time soon, because it's the primary choice for the tools I use and it works well.
j45•5mo ago
The cloud is someone else's computer.

If it's this sensitive, there's a lot of companies staying on the sidelines until they can compute in person, or limiting what and how they use it.

giveita•5mo ago
Betteridge can answer No to two questions at once!
teaearlgraycold•5mo ago
> Not too long ago, AWS dropped something new: S3 Vectors. It’s their first attempt at a vector storage solution

Nitpick: AWS previously funded pgvector (the slowdown in development indicates to me they have stopped). Their hosted database solutions supported the extension. That means RDS and Aurora were their first vector storage solutions.

softwaredoug•5mo ago
I’m not sure S3 vectors is a true vector database/search engine in the way something like Elasticsearch, Turbopuffer or Milvus is. It’s more a convenient building block for simple high scale retrieval.

I think of a search system doing quite a lot from sparse/lexical/hybrid search, metadata filtering, numerical ranking (recency/popularity/etc), geo, fuzzy, and whatever other indices at its core. These are building blocks for getting initial candidates.

Then you need to be able to combine all these into one result set for your users - usually with a query DSL where you can express a ranking function. Then there’s usually ancillary features that come up (highlighting, aggregations, etc).

So while S3 vectors is a fascinating primitive, I’m not sure I’d reach for it outside specific circumstances.

anonu•5mo ago
If you like to die in a slow and expensive way - sure.
jhhh•5mo ago
"That gap isn’t just theoretical—it shows up in real bills."

"That’s not linear growth—it’s a quantum leap"

"The performance and recall were fantastic—but the costs were brutal"

"it’s not a one-size-fits-all solution—it’s the right tool for the right job."

"S3 Vectors is excellent for cold, cheap, low-QPS scenarios—but it’s not the engine you want to power a recommendation system"

"S3 Vectors doesn’t spell the end of vector databases—it confirms something many of us have been seeing for a while"

"that’s proof positive that vector storage is a real necessity—not just “indexes wrapped in a database."

"the vector database market isn’t being disrupted—it’s maturing into a tiered ecosystem where different solutions serve different performance and cost needs"

"The golden age of vector databases isn’t over—it’s just beginning."

"The bigger point is that Milvus is evolving into a system that’s not only efficient and scalable, but AI-native at its core—purpose-built for how modern applications actually work."

turing_complete•5mo ago
Since when was everything no longer "announced" or "released", but "dropped"? Is this an LLMism?
Urahandystar•5mo ago
No you're just old. Come sit with us in a nice comfy chair.
fragmede•5mo ago
Started in 1988, with music, then expanded from there.

https://english.stackexchange.com/questions/632983/has-drop-...

iknownothow•5mo ago
S3 has much bigger fish in its sights than the measly vector DB space. If you look at the subtle feature improvements in S3 in recent years, it is clear as day, at least to me, that they're going after the whale that is Databricks. And they're doing it the best way possible - slowly and silently eating away at their moat.

AWS Athena hasn't received as much love for some reason. In the next two years I expect major updates and/or improvements. They should kill off Redshift.

antonvs•5mo ago
> … going after the whale that is Databricks.

Databricks is tiny compared to AWS, maybe 1/50th the revenue. But they’re both chasing a big and fast-growing market. I don’t think it’s so much that AWS is going after Databricks as that Databricks happens to be in a market that AWS is interested in.

iknownothow•5mo ago
I agree, Databricks is one of many in the space. If S3 makes Databricks redundant, then it makes the others like Databricks redundant too.
physicsguy•5mo ago
The biggest killer of vector dbs is that normal DBs can easily store embeddings, and the vector DBs just don’t then offer enough of a differentiator to be a separate product.

We found our application was very sensitive to context-aware chunking too. You don’t really get control of that in many tools.
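
As a concrete example of the kind of control being described, a minimal context-aware chunker that splits on paragraph boundaries instead of fixed-size windows (the size threshold and overlap are arbitrary):

  def chunk_by_structure(text, max_chars=1500, overlap_paras=1):
      # Split on blank-line paragraph boundaries so a chunk never cuts a
      # sentence or table in half; carry a little overlap for context.
      paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
      chunks, current, size = [], [], 0
      for para in paragraphs:
          if current and size + len(para) > max_chars:
              chunks.append("\n\n".join(current))
              current = current[-overlap_paras:]
              size = sum(len(p) for p in current)
          current.append(para)
          size += len(para)
      if current:
          chunks.append("\n\n".join(current))
      return chunks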

vincirufus•4mo ago
This could be game changing