At this point, I wonder if instead of relying on daemonsets, you just gave every namespace a vector instance that was responsible for that namespace and pods within. ElasticSearch or whatever you pipe logging data to might not be happy with all those TCP connections.
Just my SRE brain thoughts.
Vector is a daemonset, because it needs to tail the log files on each node. A single vector per namespace might not reside on the nodes that each pod is on.
There were recent changes to the NodeJS Prometheus client that eliminates tag names from the keys used for storing the tag cardinality for metrics. The memory savings wasn’t reported but the cpu savings for recording data points was over 1/3. And about twice that when applied to the aggregation logic.
Lookups are rarely O(1), even in hash tables.
I wonder if there’s a general solution for keeping names concise without triggering transposition or reading comprehension errors. And what the space complexity is of such an algorithm.
shanemhansen•1d ago
hinkley•7h ago
There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers than could be written, but in 20 years of having them I’ve only seen 2 that tried. Sadly I think flame graphs made profiling more accessible to the unmotivated but didn’t actually improve overall results.
zahlman•6h ago
The sympathy is also needed. Problems aren't found when people don't care, or consider the current performance acceptable.
> There’s about a factor of 3 improvement that can be made to most code after the profiler has given up. That probably means there are better profilers than could be written, but in 20 years of having them I’ve only seen 2 that tried.
It's hard for profilers to identify slowdowns that are due to the architecture. Making the function do less work to get its result feels different from determining that the function's result is unnecessary.
hinkley•5h ago
All of which have gotten perhaps an order of magnitude worse in the time since I started on this theory.
Negitivefrags•6h ago
If you see a database query that takes 1 hour to run, and only touches a few gb of data, you should be thinking "Well nvme bandwidth is multiple gigabytes per second, why can't it run in 1 second or less?"
The idea that anyone would accept a request to a website taking longer than 30ms, (the time it takes for a game to render it's entire world including both the CPU and GPU parts at 60fps) is insane, and nobody should really accept it, but we commonly do.
javier2•5h ago
avidiax•5h ago
Uber could run the complete global rider/driver flow from a single server.
It doesn't, in part because all of those individual trips earn $1 or more each, so it's perfectly acceptable to the business to be more more inefficient and use hundreds of servers for this task.
Similarly, a small website taking 150ms to render the page only matters if the lost productivity costs less than the engineering time to fix it, and even then, only makes sense if that engineering time isn't more productively used to add features or reliability.
oivey•5h ago
paulryanrogers•4h ago
oivey•4h ago
Aeolun•1h ago
vlovich123•32m ago
azornathogron•5h ago
Negitivefrags•5h ago
wizzwizz4•4h ago
Negitivefrags•3h ago
hinkley•5h ago
antonymoose•5h ago
hinkley•4h ago
I always liked Shaw’s “The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.”
jesse__•5h ago
I'm curious, what're the profilers you know of that tried to be better? I have a little homebrew game engine with an integrated profiler that I'm always looking for ideas to make more effective.
hinkley•5h ago
The common element between attempts is new visualizations. And like drawing a projection of an object in a mechanical engineering drawing, there is no one projection that contains the entire description of the problem. You need to present several and let brain synthesize the data missing in each individual projection into an accurate model.