Ask HN: Scheduling stateful nodes when MMAP makes memory accounting a lie

7•leo_e•2h ago
We’re hitting a classic distributed systems wall and I’m looking for war stories or "least worst" practices.

The Context: We maintain a distributed stateful engine (think search/analytics). The architecture is standard: a Control Plane (Coordinator) assigns data segments to Worker Nodes. The workload involves heavy use of mmap and lazy loading for large datasets.

The Incident: We had a cascading failure where the Coordinator got stuck in a loop, DDOS-ing a specific node.

The Signal: Coordinator sees Node A has significantly fewer rows (logical count) than the cluster average. It flags Node A as "underutilized."

The Action: Coordinator attempts to rebalance/load new segments onto Node A.

The Reality: Node A is actually sitting at 197GB RAM usage (near OOM). The data on it happens to be extremely wide (fat rows, huge blobs), so its logical row count is low, but physical footprint is massive.

The Loop: Node A rejects the load (or times out). The Coordinator ignores the backpressure, sees the low row count again, and retries immediately.
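
One way to break the loop described above is to make the Coordinator treat a rejection or timeout as a signal to back off rather than retry. The following is a minimal sketch, not the engine's actual logic: `NodeBackoff`, its parameters, and the default values are all hypothetical names chosen for illustration. It applies exponential backoff with jitter per node.

```python
import random


class NodeBackoff:
    """Tracks per-node cooldowns so a rejected node is not retried immediately.

    Hypothetical helper: names and defaults are assumptions, not from the post.
    """

    def __init__(self, base=1.0, cap=300.0, rng=None):
        self.base = base              # first cooldown ceiling, in seconds
        self.cap = cap                # upper bound on any cooldown ceiling
        self.rng = rng or random.Random()
        self.failures = {}            # node_id -> consecutive rejections
        self.retry_at = {}            # node_id -> earliest time a retry is allowed

    def record_rejection(self, node_id, now):
        """Call when a node rejects a load or times out. Returns the delay used."""
        n = self.failures.get(node_id, 0) + 1
        self.failures[node_id] = n
        ceiling = min(self.cap, self.base * 2 ** (n - 1))
        # "Equal jitter": wait at least half the ceiling, randomize the rest.
        delay = ceiling / 2 + self.rng.uniform(0, ceiling / 2)
        self.retry_at[node_id] = now + delay
        return delay

    def record_success(self, node_id):
        """Call when a node accepts a load; clears its cooldown state."""
        self.failures.pop(node_id, None)
        self.retry_at.pop(node_id, None)

    def eligible(self, node_id, now):
        """True if the node may be considered for placement at time `now`."""
        return now >= self.retry_at.get(node_id, 0.0)
```

The placement loop would then skip any node for which `eligible(node_id, now)` is false, regardless of how "underutilized" its row count looks.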

The Core Problem: We are trying to write a "God Equation" for our load balancer. We started with row_count, which failed. We looked at disk usage, but that doesn't correlate with RAM because of lazy loading.

Now we are staring at mmap. Because the OS manages the page cache, the application-level RSS is noisy and doesn't strictly reflect "required" memory vs "reclaimable" cache.

The Question: Attempting to enumerate every resource variable (CPU, IOPS, RSS, Disk, logical count) into a single scoring function feels like an NP-hard trap.

How do you handle placement in systems where memory usage is opaque/dynamic?

Dumb Coordinator, Smart Nodes: Should we just let the Coordinator blind-fire based on disk space, and rely 100% on the Node to return hard 429 Too Many Requests based on local pressure?

Cost Estimation: Do we try to build a synthetic "cost model" per segment (e.g., predicted memory footprint) and schedule based on credits, ignoring actual OS metrics?

Control Plane Decoupling: Separate storage balancing (disk) from query balancing (mem)?
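
To make the cost-estimation option concrete, here is a rough sketch of credit-based placement. Everything in it is an assumption for illustration: the `Segment` and `Node` types, the `hot_fraction` estimate of how much of a segment ends up resident, and the "most free budget wins" rule. The point is only that the Coordinator debits a declared budget instead of reading OS metrics.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    rows: int
    avg_row_bytes: int
    hot_fraction: float  # assumed share of the segment expected to stay resident

    def cost_bytes(self):
        """Predicted memory footprint, i.e. the 'credits' this segment costs."""
        return int(self.rows * self.avg_row_bytes * self.hot_fraction)


@dataclass
class Node:
    name: str
    budget_bytes: int             # memory the node is allowed to commit
    committed_bytes: int = 0      # sum of predicted costs already placed here
    segments: list = field(default_factory=list)

    def free(self):
        return self.budget_bytes - self.committed_bytes


def place(segment, nodes):
    """Place on the node with the most free budget, or return None if none fits."""
    cost = segment.cost_bytes()
    candidates = [n for n in nodes if n.free() >= cost]
    if not candidates:
        return None  # no capacity anywhere: surface this, do not retry blindly
    target = max(candidates, key=lambda n: n.free())
    target.committed_bytes += cost
    target.segments.append(segment.name)
    return target
```

The quality of the result depends entirely on how pessimistic `hot_fraction` is; a node's own rejection signal would still be needed as a safety net when the prediction is wrong.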

Feels like we are reinventing the wheel. References to papers or similar architecture post-mortems appreciated.

Comments

otterley•22m ago
It's not clear whether you're using Kubernetes, but the Kubernetes way of dealing with this problem is to declare a memory reservation (i.e., a request) along with the container specification. The amount of the reservation will be deducted from the host's available memory for scheduling purposes, regardless of whether the container actually consumes the reserved amount. It's also a best practice to configure the memory limit to be identical to the reservation, so if the container exceeds the reserved amount, the kernel will terminate it via the OOM killer.

Of course, for this to work, you have to figure out what that reserved amount should be. That is an exercise for the implementer (i.e., you).

See https://kubernetes.io/docs/concepts/configuration/manage-res...

> Attempting to enumerate every resource variable (CPU, IOPS, RSS, Disk, logical count) into a single scoring function feels like an NP-hard trap.

Yeah, don't do that. Figure out what resources your applications need and then declare them, and let the scheduler find the best node based on the requirements you've specified.

> We are trying to write a "God Equation" for our load balancer. We started with row_count, which failed. We looked at disk usage, but that doesn't correlate with RAM because of lazy loading.

A few things come to mind...

First, you're talking about a load balancer, but it's not clear that you're trying to balance load! A good metric to use for load balancing is one whose value is proportional to response latency.

It smells like you're trying to provision resources based on an optimistic prediction of your working set size. Perhaps you need a more pessimistic prediction. It might also be that you're relying too heavily on the kernel to handle paging, when what you really need is a cache tuned for your application that is scan-resistant, coupled with O_DIRECT for I/O.
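
For readers unfamiliar with "scan-resistant", the idea can be sketched with a segmented LRU: new entries land in a small probationary area and only move to the protected area on a second access, so a one-off scan cannot evict the hot set. This is one of several such policies and is shown only as an illustration; the class name and sizes are hypothetical, and the O_DIRECT I/O side is not shown.

```python
from collections import OrderedDict


class SegmentedLRU:
    """Two-segment LRU: entries must be hit once in probation to become protected."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()   # seen once; oldest entry first
        self.protected = OrderedDict()   # seen at least twice; oldest entry first

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)
            self._promote(key, value)
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        elif key in self.probation:
            self.probation[key] = value
            self.probation.move_to_end(key)
        else:
            self._insert_probation(key, value)

    def _insert_probation(self, key, value):
        self.probation[key] = value
        # Evict the oldest probationary entries when over capacity.
        while len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)

    def _promote(self, key, value):
        self.protected[key] = value
        # Demote the oldest protected entries back to probation when over capacity.
        while len(self.protected) > self.protected_size:
            old_key, old_value = self.protected.popitem(last=False)
            self._insert_probation(old_key, old_value)
```

A long scan only churns the probationary segment, so pages that were read twice survive it.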

majke•17m ago
> Coordinator sees Node A has significantly fewer rows (logical count) than the cluster average. It flags Node A as "underutilized."

Ok, so you are dealing with a classic - you measure A, but what matters is B. For "load" balancing a decent metric is, well, response time (and jitter).

For data partitioning - I guess number of rows is not the right metric? Change it to number*avg_size or something?
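
A quick illustration of why the metric matters, using made-up numbers (the row counts and average sizes below are hypothetical, chosen only to resemble the wide-row situation in the post):

```python
# Hypothetical per-node stats; none of these figures come from the post.
nodes = {
    "A": {"rows": 2_000_000, "avg_row_bytes": 90_000},   # few, very wide rows
    "B": {"rows": 50_000_000, "avg_row_bytes": 1_200},   # many narrow rows
}


def logical_bytes(stats):
    """Rows weighted by average row size, as a rough stand-in for footprint."""
    return stats["rows"] * stats["avg_row_bytes"]


# The node each metric would pick as "least loaded".
by_rows = min(nodes, key=lambda n: nodes[n]["rows"])
by_bytes = min(nodes, key=lambda n: logical_bytes(nodes[n]))
```

Ranked by rows, node A looks emptiest; ranked by rows times average size, node B does, because A holds about three times as many bytes.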

If you can't measure the thing directly, then take a look at stuff like a "PID controller". This can be approached as a typical control loop problem, although in 99% of cases doing PID for software systems is overkill.
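
For reference, a discrete PID loop is only a few lines. The sketch below assumes the measured signal is a node's observed latency and the output is an adjustment to that node's placement weight; those choices, the class name, and the gains are all assumptions, and real gains would need tuning.

```python
class PID:
    """Minimal discrete PID controller with a clamped output."""

    def __init__(self, kp, ki, kd, setpoint, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint      # e.g. target latency in ms (assumed signal)
        self.out_min = out_min
        self.out_max = out_max
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        """Return the control output for one tick; dt must be positive."""
        error = self.setpoint - measurement
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error

        candidate_integral = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        output = max(self.out_min, min(self.out_max, raw))
        # Simple anti-windup: only accumulate the integral while not saturated.
        if output == raw:
            self.integral = candidate_integral
        return output
```

With latency above the setpoint the error is negative, so the output pushes the node's weight down; below the setpoint it pushes the weight back up.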

bcoates•15m ago
Memory pressure (and a lot of other overload conditions) usually makes latency worse--does that show up in your system? Latency backpressure is a pretty conventional thing to do. You're going to want some way to close the loop back to your load balancer. If you're doing open-loop control (sending a "fair share" of traffic to each node and assuming it can handle it), issues like the ones you describe will keep coming up.
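
A minimal sketch of closing the loop on latency, assuming the balancer can observe per-node response times: keep an exponentially weighted moving average per node and hand out traffic in inverse proportion to it. The class name, `alpha`, and the inverse-latency rule are illustrative assumptions, not a recommendation of specific values.

```python
class LatencyWeights:
    """Derives per-node traffic weights from smoothed response latency."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha   # smoothing factor: higher reacts faster, but is noisier
        self.ewma = {}       # node_id -> smoothed latency in ms

    def observe(self, node_id, latency_ms):
        prev = self.ewma.get(node_id)
        if prev is None:
            self.ewma[node_id] = latency_ms
        else:
            self.ewma[node_id] = self.alpha * latency_ms + (1 - self.alpha) * prev

    def weights(self):
        """Weights sum to 1; slower nodes get proportionally less traffic."""
        inverse = {n: 1.0 / max(v, 1e-9) for n, v in self.ewma.items()}
        total = sum(inverse.values())
        return {n: x / total for n, x in inverse.items()}
```

A node under memory pressure slows down, its smoothed latency rises, and its share of new work shrinks without the balancer needing to know anything about RSS or page cache.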

This is a Hard Problem and you might be trying to get away with an unrealistically small amount of overprovisioning.
