Supertoast tables

https://hatchet.run/blog/supertoast-tables
45•abelanger•6h ago

Comments

debarshri•4h ago
I think a more interesting and cleaner route would be to create a Postgres plugin that introduces a type which uploads the large file to S3.

It would reduce the complexity.

Postgres plugins are very underrated and underutilized.

levkk•4h ago
All queries run inside transactions, and a slow lane like S3 will cause delays, which will in turn block vacuum and cause more problems than it will solve. Most deployments of Postgres (e.g., RDS) won't let you install custom extensions either, although they do have their own S3 extension (which I wouldn't recommend you use).

The right place to manage this is either in the app or in a proxy, before the data touches Postgres.

carderne•4h ago
How does this work with self-hosting? Is the assumption that self-hosters won’t run into this problem?

For most use-cases I’d probably prefer to just delete the payloads some time after the job completes (persisting that data is a business logic problem), and keep the benefits of “just use Postgres”, which you guys seem to have outgrown.

abelanger•4h ago
Candidly, we're still trying to figure that out: all of the plumbing is there in the open source, but the actual implementation of writes to S3 is only on the cloud version. This is partially because we're loath to introduce additional dependencies, and partially because this job requires a decent amount of CPU and memory and would have to run separately from the Hatchet engine, which adds complexity to self-hosted setups. That said, we're aware of multi-TB self-hosted instances, and this would be really useful for them, so it's important that we can get this into the open source.

The payloads are time-partitioned (in either case) so we do drop them after the user-defined retention period.
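As a rough illustration of retention on time-partitioned data, the sketch below picks which partitions fall outside a retention window. It assumes one partition per calendar day, which the comment above does not state; `expired_partitions` and its parameters are hypothetical names, not Hatchet's implementation.

```python
from datetime import date, timedelta


def expired_partitions(partition_days: list[date], today: date, retention_days: int) -> list[date]:
    """Return the daily partitions that are older than the retention window.

    Assumption: each partition covers exactly one calendar day. A partition is
    expired when its day is strictly before (today - retention_days).
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(day for day in partition_days if day < cutoff)
```

With partitioning, expiry can be handled by dropping each returned partition as a whole rather than deleting rows one by one.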

philsnow•2h ago
Unexpectedly, I love the animated ascii diagrams, very cogmind-esque.

Anybody know how they designed those?

atombender•1h ago
I really wish there was a seamless system for this. Once you try to do this kind of thing, you run into all sorts of rabbit holes and cans of worms.

For example, coalescing blobs into "superblobs" to avoid a proliferation of small objects means you invent a whole system for tracking "subfiles" within a bigger file.
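A minimal sketch of what that subfile tracking might look like, assuming each small blob is addressed by an offset and a length inside the packed object. `SubfileRef`, `pack_superblob`, and `read_subfile` are hypothetical names for illustration only.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SubfileRef:
    """Location of one small blob inside a larger packed object."""
    superblob_key: str
    offset: int
    length: int


def pack_superblob(superblob_key: str, blobs: dict[str, bytes]) -> tuple[bytes, dict[str, SubfileRef]]:
    """Concatenate small blobs into one payload and record where each one lives."""
    buf = io.BytesIO()
    index: dict[str, SubfileRef] = {}
    for blob_id, data in blobs.items():
        # buf.tell() is the byte offset at which this blob starts.
        index[blob_id] = SubfileRef(superblob_key, buf.tell(), len(data))
        buf.write(data)
    return buf.getvalue(), index


def read_subfile(superblob: bytes, ref: SubfileRef) -> bytes:
    """Slice one blob back out of the packed payload.

    Against real object storage this would be a ranged read, not a full download.
    """
    return superblob[ref.offset : ref.offset + ref.length]
```

The index is the part that has to be persisted somewhere durable; without it the packed object is just opaque bytes.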

And you'll need a compacting job to ensure old, deleted data is expunged, which may be more important than you think if the data has to be erased for privacy or legal reasons.

Object storage has no in-place mutation, so this compaction has to be transactionally safe and must be careful not to leave behind cruft on failure, and so on.

Furthermore, storing blobs in object storage without keeping a local inventory of them is, in my experience, a disaster. For example, if your database has tenants or some other structural grouping, something simple like finding out how much blob storage a specific tenant has is a very time-consuming operation on S3/GCS/etc. because you need to filter the whole bucket by prefix. So for every blob you store, you want to have a database table of what they are so that the only object operations you do are reads and writes, not metadata operations.

Sure, you have things like inventory reports on GCS that can help, but I would still say that you need to track this stuff transactionally. The database must be the source of truth, and the object storage must never be used as a database.
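A small sketch of such an inventory, using SQLite from the Python standard library as a stand-in for whatever database the application already uses. The table name, column names, and helper functions are all assumptions made for illustration.

```python
import sqlite3


def open_inventory(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) inventory table: one row per stored object."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS blob_inventory (
            object_key TEXT PRIMARY KEY,
            tenant_id  TEXT NOT NULL,
            size_bytes INTEGER NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS blob_inventory_tenant ON blob_inventory (tenant_id)"
    )
    return conn


def record_blob(conn: sqlite3.Connection, object_key: str, tenant_id: str, size_bytes: int) -> None:
    """Record a blob in the same database that holds the rest of the application state."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO blob_inventory (object_key, tenant_id, size_bytes) VALUES (?, ?, ?)",
            (object_key, tenant_id, size_bytes),
        )


def tenant_usage(conn: sqlite3.Connection, tenant_id: str) -> int:
    """Total bytes stored for a tenant, answered without listing the bucket."""
    row = conn.execute(
        "SELECT COALESCE(SUM(size_bytes), 0) FROM blob_inventory WHERE tenant_id = ?",
        (tenant_id,),
    ).fetchone()
    return row[0]
```

The per-tenant question then becomes an indexed query against the database instead of a prefix scan over the bucket.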

And so on.

This need to be able to store many small objects in object storage is coming up more and more for me, as is the desire to mutate them in-place or at least append. For example, imagine you want to build a kind of database which stores a replicated copy of itself in the cloud. There is no way to do this in S3-like object storage without representing this as a series of immutable "snapshots" and "deltas". It's fast to append this way, but you run into the problem of eventually needing to compact, and you absolutely have to batch up the uploads in order to avoid writing too many small objects.
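To make the batching point concrete, here is a minimal sketch that buffers appended records and uploads them as a single immutable delta object once a size threshold is reached. `DeltaBatcher`, the key format, and the injected `upload` callable are hypothetical; a real system would also need record framing, retries, and the compaction step mentioned above.

```python
from typing import Callable


class DeltaBatcher:
    """Buffer small appends and emit them as one delta object per flush."""

    def __init__(self, upload: Callable[[str, bytes], None], max_bytes: int = 1 << 20) -> None:
        self._upload = upload          # e.g. a wrapper around an object-store PUT
        self._max_bytes = max_bytes    # flush threshold in bytes
        self._pending: list[bytes] = []
        self._pending_bytes = 0
        self._seq = 0                  # monotonically increasing delta number

    def append(self, record: bytes) -> None:
        self._pending.append(record)
        self._pending_bytes += len(record)
        if self._pending_bytes >= self._max_bytes:
            self.flush()

    def flush(self) -> None:
        """Upload everything buffered so far as one object; no-op if empty."""
        if not self._pending:
            return
        key = f"delta-{self._seq:08d}"
        # Records are simply concatenated here; a real format would add
        # length prefixes or similar framing so they can be split again.
        self._upload(key, b"".join(self._pending))
        self._seq += 1
        self._pending = []
        self._pending_bytes = 0
```

For example, passing `uploaded.__setitem__` for a plain dict `uploaded` as the `upload` callable is enough to exercise the class locally.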

So lately I've pondered using something else for this type of work, like a key/value database, like FoundationDB or TiKV, or even something like Ceph. I wonder if anyone else has tried that?

huntaub•1h ago
Well, I think this is what our company, Archil, is working on. We basically built an SSD clustering layer that proxies, caches, and assembles requests into object storage so that you can run a POSIX file system directly on top.

There are also some really great projects like SlateDB in this space, which could be closer to what you're looking for (a RocksDB-like API that runs on S3).

ovaistariq•40m ago
Well, we have made small objects work well on Tigris (https://www.tigrisdata.com/), and we have several use cases of folks using it as a KV store. Funny that you mention FoundationDB; we use that for our metadata storage.
