What makes it useful:
- Combine free tiers — set per-backend quotas to match each provider's free limit and the proxy fills them in order (pack mode) or evenly (spread mode). 10GB + 10GB + 10GB = 30GB of free offsite storage
- Multi-cloud replication — set replication.factor: 2 and every object automatically lands on two different providers. Instant redundancy, zero client-side changes
- Full S3 API — works with aws cli, rclone, boto3, any S3 SDK. SigV4 auth, multipart uploads, range reads, batch deletes, the works
- Virtual buckets — multiple apps can share the orchestrator with isolated namespaces and independent credentials
- Monthly usage limits — cap API requests, egress, and ingress per backend so you never blow past a free tier
- Write safety — all metadata and quota updates happen inside PostgreSQL transactions. Object location inserts and quota counter changes are atomic — if anything fails, the whole operation rolls back. Orphaned objects from partial failures get caught by a persistent cleanup queue with exponential backoff retry instead of silently leaking storage
- TLS and mTLS — native TLS termination with configurable min version (1.2/1.3), plus mutual TLS support for environments where you want to restrict access to clients with a valid certificate. Certificate reload on SIGHUP for zero-downtime rotation
- Multi-instance / split-mode deployment — run with -mode all (default), -mode api (request serving only), or -mode worker (background tasks only). Scale API instances independently from workers behind a load balancer.
- Trusted proxy awareness — configure trusted CIDR ranges so rate limiting targets real client IPs from X-Forwarded-For, not your load balancer
- Single-runner background tasks — the background workers (rebalancer, replicator, cleanup queue, lifecycle) use PostgreSQL advisory locks so only one worker runs each task at a time — no duplicate work, no external coordination needed
- Circuit breaker — if the metadata DB goes down, reads keep working via broadcast to all backends. Writes fail cleanly
- Automatic rebalancing — if you add a new backend, the rebalancer redistributes objects across all of them
- Backend draining — need to remove a provider? s3-orchestrator admin drain <backend> live-migrates all objects off that backend to the remaining pool with progress tracking. Once drained, admin remove-backend cleans up the database records (optionally purging the S3 objects too). No downtime, no manual file shuffling — swap providers without your clients noticing
- Web dashboard — storage summary, backend status, file browser, upload/delete, log viewer
- Production observability — Prometheus metrics (60+ gauges/counters), OpenTelemetry tracing, structured audit logging with request ID correlation
- Lifecycle rules — auto-expire objects by prefix and age
- Config hot-reload — update credentials, quotas, rate limits, replication, and rebalance settings without restarting via SIGHUP
- Comes with production-ready Kubernetes and Nomad manifests/jobs, plus a custom Grafana dashboard utilizing the exported metrics
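To make the quota and replication features above concrete, here is a hedged sketch of what a config could look like. Every field name except `replication.factor` (mentioned above) is an illustrative guess, not the project's actual schema:

```yaml
# Illustrative config sketch — field names other than replication.factor
# are assumptions, not the orchestrator's real schema.
backends:
  - name: provider-a
    endpoint: https://s3.provider-a.example
    quota_bytes: 10737418240        # 10 GiB free tier
    limits:                         # monthly caps so a free tier is never exceeded
      api_requests: 100000
      egress_bytes: 5368709120
  - name: provider-b
    endpoint: https://s3.provider-b.example
    quota_bytes: 10737418240
placement:
  mode: pack                        # fill backends in order; "spread" balances evenly
replication:
  factor: 2                         # every object lands on two different backends
```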
A bit nervous to share this but I think it is ready to be seen and maybe somebody else would find it useful.
munch-o-man•1h ago
The coolest way to test this out is to just clone it and then run either:
make nomad-demo
make kubernetes-demo
that will spin up the docker-compose stack used for integration testing (two MinIO instances and a Postgres), then start Kubernetes via k3d or Nomad via -dev mode, build the Docker image, import it, run it, and print a handy list of URLs for the dashboards/metrics/UI/etc. The Grafana dashboard in the repo is automatically ingested by Grafana in the two "-demo" modes, so you can literally run one command and immediately play with the UI, see visualizations of the metrics, and experiment in a safe, sandboxed environment.
For people who aren't just trying to get as much free storage as possible, the storage and API/ingress/egress quotas can still be super useful for cost management, since you can cap yourself.
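The pack/spread placement idea from the feature list can be sketched in a few lines. This is an illustrative model of the two strategies, not the orchestrator's actual code; the `choose_backend` helper and its dict fields are made up for the example:

```python
# Sketch of the two placement strategies: "pack" fills backends in
# configured order, "spread" keeps usage balanced. Illustrative only —
# not the orchestrator's real implementation.

def choose_backend(backends, size, mode="pack"):
    """backends: list of dicts with 'name', 'used', 'quota' (bytes)."""
    # Only consider backends with room for the new object.
    fits = [b for b in backends if b["used"] + size <= b["quota"]]
    if not fits:
        raise RuntimeError("all backends are at quota")
    if mode == "pack":
        chosen = fits[0]  # first backend (in order) that still has room
    else:
        # "spread": least-used backend, so usage stays even
        chosen = min(fits, key=lambda b: b["used"])
    chosen["used"] += size
    return chosen["name"]

backends = [
    {"name": "a", "used": 0, "quota": 10},
    {"name": "b", "used": 0, "quota": 10},
]
print([choose_backend(backends, 1, "pack") for _ in range(4)])
# → ['a', 'a', 'a', 'a']  (pack fills 'a' first)
```

With `mode="spread"` the same four writes alternate `a, b, a, b`, and pack spills over to the next backend only once the first hits its quota.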
The other cool use is if you need data replicated across two different clouds for [reasons] — this will do all that work for you if you set a replication factor, and your application doesn't have to know anything about it... just point it at this instead of the actual S3 backend.
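The core of cross-cloud replication is just picking N *distinct* backends per object. Here is a minimal sketch of that idea under assumed data structures — `pick_replicas` and its least-used-first ordering are hypothetical, not the project's actual selection logic:

```python
# Sketch of replication.factor placement: choose `factor` distinct
# backends with free space so one provider outage never loses data.
# Illustrative only — not the orchestrator's actual code.

def pick_replicas(backends, size, factor=2):
    """Return `factor` distinct backend names with room, least-used first."""
    fits = sorted(
        (b for b in backends if b["used"] + size <= b["quota"]),
        key=lambda b: b["used"],
    )
    if len(fits) < factor:
        raise RuntimeError("not enough backends with free space for replication")
    chosen = fits[:factor]
    for b in chosen:
        b["used"] += size  # the object is written to every replica
    return [b["name"] for b in chosen]

backends = [
    {"name": "aws", "used": 5, "quota": 10},
    {"name": "gcp", "used": 1, "quota": 10},
    {"name": "b2",  "used": 3, "quota": 10},
]
print(pick_replicas(backends, 2, factor=2))
# → ['gcp', 'b2']  (the two least-used backends)
```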
Also the ability to drain a backend could be super useful if you are trying to get off a certain cloud without taking downtime.
This is engineered to be highly durable... instead of failing, it degrades and returns to healthy when conditions improve and Postgres is back up. It stops all writes while Postgres is down, since no usage would be tracked otherwise.
Also, if you have an existing bucket that you want to bring under management by the s3-orchestrator, it has sync functionality... the only thing it can't import is the monthly API-call/ingress/egress usage from before the sync.
I'm open to all advice and comments. Pretty nervous sharing this.