Ask HN: How do you do store-and-forward telemetry at the edge?

4•Aydarbek•1d ago
I’m researching patterns for edge / gateway telemetry where the network is unreliable (remote sites, industrial, fleets, etc.) and you need offline buffering + bounded disk + replay once connectivity returns.

Questions for folks running this in production:

What do you use today? (MQTT broker + ??, Kafka/Redpanda/NATS, Redis Streams, custom log files, embedded DB, etc.)

Where do you buffer during outages: append-only log, SQLite/RocksDB, queue-on-disk, something else?

How do you handle backpressure when disk is near full? (drop policy, compression, sampling, prioritization)

What’s your failure nightmare: corruption, replay storms, duplicates, “stuck” consumer offsets, disk-full, clock skew?

What guarantees do you actually need: zero-loss vs “best effort” (and where do you draw that line)?

What metrics/alerts matter most on gateways? (queue depth, replay rate, oldest event age, fsync latency, disk usage, etc.)

I’d love to learn what works, what breaks, and what you wish existing tools did better.

Comments

Aydarbek•1d ago
Disclosure: I built an OSS single-binary, HTTP-native durable event log aimed at this edge “store-and-forward + replay” problem. Repo: github.com/A1darbek/ayder

If anyone is open to a tiny design-partner pilot (30–60 min): run docker compose → ingest some telemetry → simulate outage (kill -9 / disconnect) → restart → verify replay + zero loss. I’ll do white-glove onboarding and turn the learnings into a short case study (can be anonymous).

deangiberson•1d ago
Edge(FluentBit -> Logs -> cron(compress -> encrypt)) -> Cloud(S3 -> Trigger -> Lambda decrypt -> S3 -> Trigger -> Lambda decompress -> S3 -> Trigger -> Lambda to CloudWatch)

I have a system that runs on edge services and captures everything to logs through FluentBit. A cron job then compresses, encrypts, and tries to send the logs to device-specific S3 buckets. If the on-device logs get too big, the oldest logs are dropped first, with a heuristic that treats certain logs as more or less important. When devices reconnect to the cloud, they push logs as quickly as they can, and the cloud infra backfills metrics as they arrive.

Once in S3, triggers start a series of Lambdas that decrypt, decompress, and analyze the logs. It works well and is easy to reason about.

The backend can easily be swapped out for something else. The harder part is the log compress/encrypt/rotate. It's important that you don't treat all logs exactly the same. Some are much more important and should be preserved over others.
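To make the "drop old logs first, but not all logs equally" idea concrete, here is a minimal sketch of one possible eviction policy. It is not the commenter's actual heuristic: the `BoundedLogBuffer` class, the integer priority scheme (higher means more important), and the in-memory queues standing in for on-disk log files are all assumptions for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedLogBuffer:
    """In-memory stand-in for an on-device log buffer with a byte budget."""

    max_bytes: int
    # One FIFO queue per priority; higher number = more important (hypothetical scheme).
    queues: dict = field(default_factory=dict)
    used_bytes: int = 0
    dropped: int = 0

    def append(self, priority: int, record: bytes) -> None:
        self.queues.setdefault(priority, deque()).append(record)
        self.used_bytes += len(record)
        self._evict()

    def _evict(self) -> None:
        # Drop the oldest record from the least important non-empty queue
        # until the buffer fits within its budget again.
        while self.used_bytes > self.max_bytes:
            lowest = min(p for p, q in self.queues.items() if q)
            victim = self.queues[lowest].popleft()
            self.used_bytes -= len(victim)
            self.dropped += 1

    def drain(self):
        # On reconnect, yield the most important records first,
        # oldest first within each priority.
        for priority in sorted(self.queues, reverse=True):
            q = self.queues[priority]
            while q:
                record = q.popleft()
                self.used_bytes -= len(record)
                yield priority, record
```

With a 10-byte budget, appending two low-priority 4-byte records and one high-priority 4-byte record evicts only the oldest low-priority record. A real gateway would likely apply the same rule to rotated log files rather than to individual records, and would export the `dropped` counter as a metric.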

Aydarbek•1d ago
This is gold, thank you. The “easy to reason about” part is exactly what I’m going for.

A couple quick questions if you don’t mind:

Roughly what volume are you pushing per device (MB/day or events/sec), and what’s your typical offline window?

What’s your biggest failure mode today: disk-full/rotate policy, encryption key handling, replay storms on reconnect, or Lambda fanout/cost?

I’m thinking Ayder could replace the “rotate → ship” backend with a durable local log + priority queues + replay, but you’re right that the hardest part is the policy (what to drop first, how to bound disk, and how to preserve critical streams). If you’re open, I’d love to learn what heuristics you ended up with.
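For readers unfamiliar with the "durable local log + replay" pattern mentioned above, the following is a generic sketch and not Ayder's implementation. The `LocalLog` class, the file names `events.log` and `acked.offset`, and the JSON-lines encoding are all hypothetical choices. The idea is to fsync every append, persist a cursor marking what the upstream has acknowledged, and replay everything after that cursor on restart.

```python
import json
import os


class LocalLog:
    """Append-only JSON-lines log with a persisted 'acked' cursor (hypothetical layout)."""

    def __init__(self, directory: str):
        self.log_path = os.path.join(directory, "events.log")
        self.cursor_path = os.path.join(directory, "acked.offset")

    def append(self, event: dict) -> None:
        with open(self.log_path, "ab") as f:
            f.write(json.dumps(event).encode("utf-8") + b"\n")
            f.flush()
            os.fsync(f.fileno())  # force the write to disk before returning

    def _acked(self) -> int:
        # Byte offset of the first event that has not been acknowledged yet.
        try:
            with open(self.cursor_path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def pending(self):
        """Yield (next_offset, event) for every event not yet acknowledged upstream."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "rb") as f:
            f.seek(self._acked())
            while True:
                line = f.readline()
                if not line.endswith(b"\n"):
                    break  # end of file, or a torn final write from a crash
                yield f.tell(), json.loads(line)

    def ack(self, offset: int) -> None:
        # Write to a temp file, then rename, so the cursor is never half-written.
        tmp = self.cursor_path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(offset))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.cursor_path)
```

A sender would loop over `pending()`, ship each event, and call `ack(next_offset)` once the upstream confirms receipt. This gives at-least-once delivery: a crash between the send and the ack replays that event, so the receiving side needs to deduplicate (for example on an event ID). Bounding disk usage would additionally require truncating or rotating the log behind the cursor, which this sketch leaves out.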
