Replacing EBS and Rethinking Postgres Storage from First Principles

https://www.tigerdata.com/blog/fluid-storage-forkable-ephemeral-durable-infrastructure-age-of-agents
60•mfreed•1d ago

Comments

cpt100•18h ago
pretty cool
thr0w•3h ago
Postgres for agents, of course! It makes too much sense.
jacobsenscott•2h ago
The agent stuff is BS for the pointy hairs. This seems to address real problems I've had with PG though.
akulkarni•5m ago
Yeah, I know what you mean. I used to roll my eyes every time someone said “agentic,” too. But after using Claude Code myself, and seeing how our best engineers build with it, I changed my mind. Agents aren’t hype, they’re genuinely useful, make us more productive, and honestly, fun to work with. I’ve learned to approach this with curiosity rather than skepticism.
akulkarni•9m ago
Thanks! We agree :-)

We just launched a bunch around “Postgres for Agents” [0]:

forkable databases, an MCP server for Postgres (with semantic + full-text search over the PG docs), a new BM25 text search extension (pg_textsearch), pgvectorscale updates, and a free tier.

[0] https://www.tigerdata.com/blog/postgres-for-agents

the8472•2h ago
Though AWS instance-attached NVMe(oF?) still has fewer IOPS per TB than bare-metal NVMe does.

    E.g. i8g.2xlarge, 1875 GB, 300k IOPS read
    vs. WD_BLACK SN8100, 2TB, 2300k IOPS read
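To put the "IOPS per TB" comparison in numbers, here is a quick back-of-the-envelope sketch. It uses only the two figures quoted above, with capacities converted to decimal TB; the helper name `iops_per_tb` is made up for illustration.

```python
# Figures as quoted in the comment above; capacities in decimal TB.
devices = {
    "i8g.2xlarge": {"capacity_tb": 1.875, "read_iops": 300_000},
    "WD_BLACK SN8100": {"capacity_tb": 2.0, "read_iops": 2_300_000},
}


def iops_per_tb(capacity_tb: float, read_iops: int) -> float:
    """Read IOPS normalized by capacity (hypothetical helper)."""
    return read_iops / capacity_tb


density = {name: iops_per_tb(**spec) for name, spec in devices.items()}
ratio = density["WD_BLACK SN8100"] / density["i8g.2xlarge"]

for name, value in density.items():
    print(f"{name}: {value:,.0f} read IOPS per TB")
print(f"ratio: {ratio:.1f}x")
```

That works out to 160,000 vs. 1,150,000 read IOPS per TB, roughly a 7.2x gap on these spec-sheet numbers. It says nothing about sustained rates.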
everfrustrated•35m ago
You can't do those rates 24x7 on a WD_BLACK tho.
0xbadcafebee•1h ago
There's a ton of jargon here. Summarized...

Why EBS didn't work:

  - EBS costs for allocation
  - EBS is slow at restores from snapshot (faster to spin up a database from a Postgres backup stored in S3 than from an EBS snapshot in S3)
  - EBS only lets you attach 24 volumes per instance
  - EBS only lets you resize once every 6–24 hours, you can't shrink or adjust continuously
  - Detaching and reattaching EBS volumes can take 10s for healthy volumes to 20m for failed ones, so failover takes longer

Why all this matters:

  - their AI agents are all ephemeral snapshots; they constantly destroy and rebuild EBS volumes

What didn't work:

  - local NVMe/bare metal: need 2-3x nodes for durability, too expensive; snapshot restores are too slow
  - custom page-server Postgres storage architecture: too complex/expensive to maintain

Their solution:

  - block COWs
  - volume changes (new/snapshot/delete) are a metadata change
  - storage space is logical (effectively infinite) not bound to disk primitives
  - multi-tenant by default
  - versioned, replicated k/v transactions, horizontally scalable
  - independent service layer abstracts blocks into volumes, is the security/tenant boundary, enforces limits
  - user-space block device, pins i/o queues to cpus, supports zero-copy, resizing; depends on Linux primitives for performance limits

Performance stats (single volume):

  - (latency/IOPS benchmarks: 4 KB blocks; throughput benchmarks: 512 KB blocks)
  - read: 110,000 IOPS and 1.375 GB/s (bottlenecked by network bandwidth)
  - write: 40,000–67,000 IOPS and 500–700 MB/s, synchronously replicated
  - single-block read latency ~1 ms, write latency ~5 ms
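A rough unit check on those numbers, as a sketch only. It assumes decimal units (4 KB = 4,000 bytes, 512 KB = 512,000 bytes); with 4 KiB blocks the bandwidth figures come out about 2% higher. The function names are made up for illustration.

```python
KB = 1_000  # assumption: decimal kilobytes


def bandwidth_mb_s(iops: int, block_bytes: int) -> float:
    """Bandwidth moved by a given IOPS rate at a fixed block size."""
    return iops * block_bytes / 1_000_000


def gbit_per_s(gb_per_s: float) -> float:
    """Convert GB/s to Gbit/s, ignoring protocol overhead."""
    return gb_per_s * 8


def ops_per_s(gb_per_s: float, block_bytes: int) -> float:
    """Operations per second implied by a throughput at a fixed block size."""
    return gb_per_s * 1_000_000_000 / block_bytes


read_small = bandwidth_mb_s(110_000, 4 * KB)   # 4 KB read IOPS test
write_low = bandwidth_mb_s(40_000, 4 * KB)     # low end of 4 KB writes
write_high = bandwidth_mb_s(67_000, 4 * KB)    # high end of 4 KB writes
read_wire = gbit_per_s(1.375)                  # 512 KB throughput test
read_large_ops = ops_per_s(1.375, 512 * KB)

print(f"4 KB reads:  {read_small:.0f} MB/s")
print(f"4 KB writes: {write_low:.0f}-{write_high:.0f} MB/s")
print(f"1.375 GB/s = {read_wire:.0f} Gbit/s, about {read_large_ops:.0f} ops/s at 512 KB")
```

So the 4 KB read test moves about 440 MB/s, well under the 1.375 GB/s (11 Gbit/s) figure from the large-block test. That suggests the small-block numbers are bounded by per-operation cost rather than raw bandwidth.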
hedora•1h ago
Thanks for the summary.

Note that those numbers are terrible vs. a physical disk, especially latency, which should be < 1ms read, << 1ms write.

(That assumes async replication of the write ahead log to a secondary. Otherwise, write latency should be ~ 1 rtt, which is still << 5ms.)

Stacking storage like this isn’t great, but PG wasn’t really designed for performance or HA. (I don’t have a better concrete solution for ansi SQL that works today.)

graveland•1h ago
(I'm on the team that made this)

The raw numbers are one thing, but the overall performance of pg is another. If you check out https://planetscale.com/blog/benchmarking-postgres-17-vs-18 for example, in the average QPS chart, you can see that there isn't a very large difference in QPS between GP3 at 10k iops and NVMe at 300k iops.

So currently I wouldn't recommend this new storage for the highest end workloads, but it's also a beta project that's still got a lot of room for growth! I'm very enthusiastic about how far we can take this!

mfreed•7m ago
A few datapoints that might help frame this:

- EBS typically operates in the millisecond range. AWS' own documentation suggests "several milliseconds"; our own experience with EBS is 1-2 ms. Reads/writes to local disk alone are certainly faster, but it's more meaningful to compare this against other forms of network-attached storage.

- If durability matters, async replication isn't really the right baseline for local disk setups. Most production deployments of Postgres/databases rely on synchronous replication -- or "semi-sync," which still waits for at least one or a subset of acknowledgments before committing -- which in the cloud lands you in the single-digit millisecond range for writes again.

znpy•1h ago
Reminds me of about ten years ago when a large media customer was running NetApp on cloud to get most of what you just wrote on AWS (because EBS features sucked/sucks very bad and are also crazy expensive).

I did not set that up myself, but the colleague that worked on that told me that enabling tcp multipath for iscsi yielded significant performance gains.

bradyd•1h ago
> EBS only lets you resize once every 6–24 hours

Is that even true? I've resized an EBS instance a few minutes after another resize before.

electroly•1h ago
AWS documents it as "After modifying a volume, you must wait at least six hours and ensure that the volume is in the in-use or available state before you can modify the same volume" but community posts suggest you can get up to 8 resizes in the six hour window.
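For anyone scripting around this, here is a minimal sketch of the documented rule only: six hours since the last modification, and the volume in the in-use or available state. It does not call any AWS API and it ignores the community-reported behaviour; `can_modify` and `next_modification_time` are hypothetical helpers, and the caller is assumed to supply the last modification time and volume state.

```python
from datetime import datetime, timedelta, timezone

COOLDOWN = timedelta(hours=6)               # from the AWS wording quoted above
MODIFIABLE_STATES = {"in-use", "available"}  # states named in the same quote


def next_modification_time(last_modified: datetime) -> datetime:
    """Earliest time the documented cooldown allows another modification."""
    return last_modified + COOLDOWN


def can_modify(last_modified: datetime, state: str, now: datetime) -> bool:
    """True if both documented conditions hold (hypothetical helper)."""
    return state in MODIFIABLE_STATES and now >= next_modification_time(last_modified)


last = datetime(2025, 10, 30, 8, 0, tzinfo=timezone.utc)
too_soon = can_modify(last, "in-use", last + timedelta(hours=5, minutes=59))
allowed = can_modify(last, "available", last + timedelta(hours=6))
wrong_state = can_modify(last, "creating", last + timedelta(hours=7))

print(too_soon, allowed, wrong_state)
```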
jasonthorsness•34m ago
The 6-hour counter is most certainly, painfully true. If you work with an AWS rep please complain about this in every session; maybe if we all do they will reduce the counter :P.
thesz•1h ago
What does EBS mean?

It is used in first line of the text but no explanation was given.

karanbhangui•1h ago
https://aws.amazon.com/ebs/
maherbeg•1h ago
This has a similar flavor to xata.io's SimplyBlock-based storage system:

* https://xata.io/blog/xata-postgres-with-data-branching-and-p...
* https://www.simplyblock.io/

It's a great way to mix copy on write and effectively logical splitting of physical nodes. It's something I've wanted to build at a previous role.
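For readers who haven't seen block-level copy-on-write before, here is a toy sketch of the general idea. It is not how Fluid Storage, Xata, or SimplyBlock implement it; `BlockStore`, `Volume`, and `fork` are made-up names. Writes never overwrite a physical block in place, so a fork only has to copy the logical-to-physical map.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count

BLOCK_SIZE = 4096  # assumption: fixed 4 KiB blocks


@dataclass
class BlockStore:
    """Shared pool of immutable physical blocks, keyed by an opaque id."""
    blocks: dict[int, bytes] = field(default_factory=dict)
    _ids: count = field(default_factory=count)

    def put(self, data: bytes) -> int:
        block_id = next(self._ids)
        self.blocks[block_id] = data
        return block_id


@dataclass
class Volume:
    """A logical volume: just a map from logical block address to block id."""
    store: BlockStore
    block_map: dict[int, int] = field(default_factory=dict)

    def write(self, lba: int, data: bytes) -> None:
        # Copy-on-write: allocate a new physical block and repoint the map.
        self.block_map[lba] = self.store.put(data)

    def read(self, lba: int) -> bytes:
        block_id = self.block_map.get(lba)
        # Unwritten blocks read back as zeros.
        return self.store.blocks[block_id] if block_id is not None else bytes(BLOCK_SIZE)

    def fork(self) -> Volume:
        # Metadata-only: the child shares every physical block with the parent.
        return Volume(self.store, dict(self.block_map))


store = BlockStore()
parent = Volume(store)
parent.write(0, b"a" * BLOCK_SIZE)

child = parent.fork()
child.write(0, b"b" * BLOCK_SIZE)  # diverges without touching the parent's block

print(parent.read(0)[:1], child.read(0)[:1], len(store.blocks))
```

The sketch leaves out everything hard: garbage collection of unreferenced blocks, replication, and a map structure that can itself be shared so that forking doesn't copy the whole map.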

stefanha•1h ago
@graveland Which Linux interface was used for the userspace block driver (ublk, nbd, tcmu-runner, NVMe-over-TCP, etc)? Why did you choose it?

Also, were existing network or distributed file systems not suitable? This use case sounds like Ceph might fit, for example.

graveland•53m ago
There's some secret sauce there I don't know if I'm allowed to talk about yet, so I'll just address the existing tech that we didn't use: most options either didn't have a good enough license, cost too much, or would take a TON of ramp-up and expertise we don't currently have to manage and maintain. Generally speaking, building our own stuff lets us fully control it.

Entirely programmable storage has so far let us try a few different approaches to make things efficient and get the features we want. We've been able to try different dedup methods, copy-on-write styles, different compression methods and types, different sharding strategies... All just as a start. We can easily and quickly create new experimental storage backends and see exactly how pg performs with them, side by side with other backends.

We're a kubernetes shop, and we have our own CSI plugin, so we can also transparently run a pg HA pair with one pg server using EBS and the other running in our new storage layer, and easily bounce between storage types with nothing but a switchover event.

unsolved73•1h ago
TimescaleDB was such a great project!

I'm really sad to see them waste the opportunity and instead build an nth managed cloud on top of AWS, chasing buzzword after buzzword.

Had they made deals with cloud providers to offer managed TimescaleDB so they can focus on their core value proposition they could have won the timeseries business, but ClickHouse made them irrelevant and Neon already has won the "Postgres for agents" business thanks to a better architecture than this.

akulkarni•15m ago
Thanks for the kind words about TimescaleDB :-)

We think we're still building great things, and our customers seem to agree.

Usage is at an all-time high, revenue is at an all-time high, and we’re having more fun than ever.

Hopefully we’ll win you back soon.

tayo42•34m ago
Are they not using aws anymore? I found that confusing. It says they're not using ebs, not using attached nvme, but I didn't think there were other options in aws?
runako•25m ago
Thanks for the writeup.

I'm curious whether you evaluated solutions like zfs/Gluster? Also curious whether you looked at Oracle Cloud given their faster block storage?
