A few fun videos covering this. I first saw Steve Mould's. He links to Up and Atom. Both are fun.
I just call nanosleep(2) based upon the amount of data processed. The sleep time and the amount of data to process before sleeping are set by a parameter file.
In programs I know will execute for a very long time, if the parameter file changes, parameters are adjusted during the run. Plus I will catch cancel signals to create a restart file should the program be cancelled.
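Roughly this shape, in C (the numbers and names here are made up for illustration; in practice the real values come from the parameter file):

    #define _POSIX_C_SOURCE 199309L
    #include <time.h>

    /* Hypothetical parameters -- in the real program these come from a
     * parameter file and can be re-read while a long job is running. */
    static long sleep_ns        = 5 * 1000 * 1000; /* sleep 5 ms...        */
    static long bytes_per_sleep = 1L << 20;        /* ...per MiB processed */

    /* Call after each chunk of work; throttles with nanosleep(2). */
    static void throttle(long bytes_done)
    {
        static long since_sleep = 0;
        since_sleep += bytes_done;
        if (since_sleep >= bytes_per_sleep) {
            struct timespec ts = { sleep_ns / 1000000000L,
                                   sleep_ns % 1000000000L };
            nanosleep(&ts, NULL);
            since_sleep = 0;
        }
    }

    int main(void)
    {
        for (int i = 0; i < 100; i++)
            throttle(64 * 1024);   /* pretend we processed 64 KiB */
        return 0;
    }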
I've mostly heard it in the context of building and construction videos where they are approaching a new skill or technique and have to remind themselves to slow down.
Going slowly and being careful leads to fewer mistakes, which makes for a "smoother" process and ends up taking less time, whereas going too fast and making mistakes means work has to be redone and ultimately takes longer.
On rereading it, I see some parallels: When one is trying to go too fast, and is possibly becoming impatient with their progress, their mental queue fills up and processing suffers. If one accepts a slower pace, one's natural single-tasking capability will work better, and they will make better progress as a result.
And maybe it's just my selection bias working hard to confirm that he actually is talking about what I want him to say!
Common to hear this in auto racing and probably a lot of other fields
There is a saying: “You don’t rise to your level when performing. You fall to your level of practice.”
In a simple, ideal world, your developers can issue the same number of jobs as you have CPUs available. Until you run into jobs that take more memory than is available. Or that access more disk/network IO than is available.
So you set up temporary storage, or in-memory storage, or stagger the jobs so only a couple of them hit the disks at a time, and then you measure performance in groups of 4 or 8 to see when performance falls off, or stand up an external caching server, or whatever else you can come up with to work within your budget and available resources.
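The "only a couple of them hit the disks at a time" part is basically a counting semaphore around the IO-heavy phase. A toy sketch in C, with made-up job counts and a fake job body:

    #define _POSIX_C_SOURCE 200112L
    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    #define JOB_COUNT   16   /* one job per CPU, say            */
    #define MAX_DISK_IO  2   /* but only 2 may hit disk at once */

    static sem_t disk_slots;

    static void *job(void *arg)
    {
        long id = (long)arg;

        /* ...CPU-bound phase: all jobs run freely... */

        sem_wait(&disk_slots);                 /* IO-bound phase: staggered */
        printf("job %ld doing disk IO\n", id);
        sem_post(&disk_slots);

        return NULL;
    }

    int main(void)
    {
        pthread_t tid[JOB_COUNT];

        sem_init(&disk_slots, 0, MAX_DISK_IO);
        for (long i = 0; i < JOB_COUNT; i++)
            pthread_create(&tid[i], NULL, job, (void *)i);
        for (int i = 0; i < JOB_COUNT; i++)
            pthread_join(tid[i], NULL);
        sem_destroy(&disk_slots);
        return 0;
    }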
To the author of the article: I stopped reading after the first two sentences. I have no idea what you are talking about.
Imagine everyone in a particular timezone browsing Amazon as they sit down for their 9 to 5; or an outage occurring, and a number of automated systems (re)trying requests just as the service comes back up. These clients are all "acting almost together".
"In a service with capacity mu requests per second and background load lambda_0, the usable headroom is H = mu - lambda_0 > 0"
Subtract the typical, baseline load (lambda_0) from the max capacity (mu), and that gives you how much headroom (H) you have.
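With made-up numbers, just to put units on it:

    #include <stdio.h>

    int main(void)
    {
        /* Made-up numbers, just to put units on H = mu - lambda_0. */
        double mu       = 1000.0;  /* capacity, requests/second      */
        double lambda_0 =  700.0;  /* typical background load, req/s */
        double burst    =  450.0;  /* a synchronized spike, req/s    */

        double headroom = mu - lambda_0;      /* 300 req/s of slack  */
        printf("headroom: %.0f req/s\n", headroom);
        printf("a burst of %.0f req/s %s\n", burst,
               burst <= headroom ? "fits" : "overloads the service");
        return 0;
    }

The "acting almost together" part is about how likely a burst is to blow past that slack all at once.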
The signal processing definition of headroom: the "space" between the normal operating level of a signal and the point at which the system can no longer handle it without distortion or clipping.
So headroom here can be thought of as "wiggle room", if that is a more intuitive term to you.
Or, if possible, make latency a feature (embrace the queue!). For service-to-service internal stuff, e.g. a request to hard delete something, this can always go through a queue.
And obviously you can scale up as the queue backs up.
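In code this is just the classic producer/consumer shape. A toy in-process sketch in C (a real system would use a durable queue, the names are made up, and there's no overflow handling): accept the delete immediately, let a worker drain it at its own pace, and watch queue depth as the scale-up signal.

    #define _POSIX_C_SOURCE 200112L
    #include <pthread.h>
    #include <stdio.h>

    #define QSIZE 1024

    /* Toy queue of "hard delete" requests. The caller gets an immediate
     * ack; the expensive work happens whenever the worker gets to it. */
    static int  q[QSIZE];
    static int  head, tail, depth;          /* depth = scaling/alert signal */
    static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

    static void enqueue_delete(int id)      /* fast path: accept and return */
    {
        pthread_mutex_lock(&lock);
        q[tail] = id;
        tail = (tail + 1) % QSIZE;
        depth++;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    static void *delete_worker(void *arg)   /* slow path: drain at its pace */
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            while (depth == 0)
                pthread_cond_wait(&nonempty, &lock);
            int id = q[head];
            head = (head + 1) % QSIZE;
            depth--;
            pthread_mutex_unlock(&lock);
            if (id < 0)                     /* sentinel: shut down */
                return NULL;
            printf("hard-deleting record %d\n", id);
        }
    }

    int main(void)
    {
        pthread_t w;
        pthread_create(&w, NULL, delete_worker, NULL);
        for (int id = 1; id <= 100; id++)
            enqueue_delete(id);
        enqueue_delete(-1);
        pthread_join(w, NULL);
        return 0;
    }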
I do love the maths tho!
If you do that, you're likely to have a latency on the order of a millisecond: putting the previous tokens in one end would get you the logits for the next at a rate of, let's say, 1000 tokens per second... impressive at current rates.
You could also take that same array and program in several latches along the way to synchronize data at selected points, enabling pipelining. This might add a slight (10%) increase in latency, so a 10% or so loss in throughput for a single stream. However, it would allow you to have multiple independent streams flowing through the FPGAs. Instead of serving 1 customer at 1000 tokens/second, you might have 10 or more customers each getting 900 tokens/second.
Parallelism and pipelining are the future of compute.
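Back-of-the-envelope, using the numbers above (the rest is made up):

    #include <stdio.h>

    int main(void)
    {
        /* Numbers from the comment above; purely back-of-the-envelope. */
        double single_stream = 1000.0;  /* tokens/s, unpipelined        */
        double latency_hit   = 0.10;    /* ~10% slower per stream       */
        int    streams       = 10;      /* independent pipelined streams */

        double per_stream = single_stream * (1.0 - latency_hit); /* ~900  */
        double aggregate  = per_stream * streams;                /* ~9000 */

        printf("per stream: %.0f tok/s, aggregate: %.0f tok/s\n",
               per_stream, aggregate);
        return 0;
    }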
https://en.wikipedia.org/wiki/Jevons_paradox
I guess it's the same underlying principle for both paradoxii.
In fact, increasing capacity can make the problems worse, because many people all think of the new capacity as available at the same time.
Also, the plural should be quantified when possible: one paradox, two tridox, three quatrodox...
- Wyatt Earp
SpaceManNabs•1d ago
For example, the paragraph with "compute the exact Poisson tail (or use a Chernoff bound)" and the paragraphs around it could be better illustrated with lines of math instead of mostly language.
I think you do need some math if you want to approach this probabilistically, but I agree that might not be the most accessible approach, and a hard threshold calculation is more accessible and maybe just as good.
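For what it's worth, the "lines of math" here are short. This isn't the article's exact derivation, just the standard Poisson tail and the usual Chernoff bound for it:

    #include <math.h>
    #include <stdio.h>

    /* Exact Poisson tail P(X >= k) for X ~ Poisson(lambda):
     * 1 - sum_{i=0}^{k-1} e^{-lambda} lambda^i / i!
     * Fine for moderate lambda; exp(-lambda) underflows for huge lambda. */
    static double poisson_tail(double lambda, int k)
    {
        double term = exp(-lambda), cdf = 0.0;   /* term = P(X = 0) */
        for (int i = 0; i < k; i++) {
            cdf += term;
            term *= lambda / (i + 1);            /* P(X = i+1) from P(X = i) */
        }
        return 1.0 - cdf;
    }

    /* Chernoff bound for the same tail, valid for k > lambda:
     * P(X >= k) <= exp(-lambda) * (e * lambda / k)^k */
    static double chernoff_bound(double lambda, int k)
    {
        return exp(k - lambda + k * log(lambda / k));
    }

    int main(void)
    {
        /* Made-up numbers: expect 300 arrivals in a window, ask how
         * likely it is to see 350 or more. */
        double lambda = 300.0;
        int    k      = 350;

        printf("exact tail:     %.3e\n", poisson_tail(lambda, k));
        printf("chernoff bound: %.3e\n", chernoff_bound(lambda, k));
        return 0;
    }

The bound is looser than the exact tail, but it's a one-liner and it never underestimates the risk.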
cogman10•1d ago
Particularly because distributed computer systems aren't pure math problems to be solved. Load often comes from usage, which is often closer to random inputs than to predictable variables. Further, how load is processed depends on a bunch of things, from the OS scheduler to the current load on the network.
It can be hard to really intuitively understand that a bottlenecked system processes the same load slower than an unbound system.
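The textbook M/M/1 formula, T = 1/(mu - lambda), is one way to make it concrete (not from the article, just the standard result): same kind of requests, but latency explodes as the server gets close to its bottleneck.

    #include <stdio.h>

    int main(void)
    {
        /* Textbook M/M/1 queue: mean time in system T = 1 / (mu - lambda).
         * Same offered load, servers with very different headroom.       */
        double lambda = 900.0;                      /* load, req/s     */
        double mus[]  = { 950.0, 2000.0, 10000.0 }; /* capacity, req/s */

        for (int i = 0; i < 3; i++) {
            double t_ms = 1000.0 / (mus[i] - lambda);  /* seconds -> ms */
            printf("capacity %6.0f req/s -> mean latency %7.2f ms\n",
                   mus[i], t_ms);
        }
        return 0;
    }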
motorest•1d ago
I feel that I'm missing something obvious. Isn't this doc reinventing the wheel in terms of what very basic task queue systems do? It describes task queues and task prioritization, and how it supports tasks that cache user data. What am I missing?