How we tracked down a Go 1.24 memory regression

https://www.datadoghq.com/blog/engineering/go-memory-regression/

191•gandem•6mo ago

Comments

nitinreddy88•6mo ago

I am more interested in learning about Swiss tables than the bug fix :)

What are the best places to learn modern implementations of traditional data structures? Many of these utilise SIMD to get the last mile out of modern hardware.

skavi•6mo ago

could read one of the implementations. there’s the original abseil implementation and rust’s in the hashbrown crate. probably many more.

gandem•6mo ago

OP here, I wrote another blog post that explains how Swiss Tables work, see https://news.ycombinator.com/item?id=44597562

woadwarrior01•6mo ago

I'd recommend reading the Swiss table design notes[1] in the Abseil documentation. You might also like F14 maps[2] from Folly.

[1]: https://abseil.io/about/design/swisstables

[2]: https://engineering.fb.com/2019/04/25/developer-tools/f14/

SkiFire13•6mo ago

In addition to the resources in this comment's siblings, I also suggest this really good CppCon presentation on Swiss tables: https://www.youtube.com/watch?v=ncHmEUmJZf4

neuroelectron•6mo ago

Great write up. It almost made me miss my old DevOps job.

pjmlp•6mo ago

I have done multiple roles throughout my career.

What I love when doing DevOps: being outside most of the FE / BE discussions regarding sprints, tickets, and endless discussions with product teams; the plurality of the technology stack.

What I don't like: many teams only remember that we exist when things go wrong, and usually we're the only ones staying late or working weekends when it happens, debugging black boxes.

Debugging this kind of issue without access to Go's source code, and talking over some kind of ticket system with a "Go support team", isn't the same kind of fun.

dh2022•6mo ago

I am somewhat surprised to see the bucket memory layout, which is: [k1,v1],[k2,v2],[k3,v3] etc., where k1,k2,k3 are keys and v1,v2,v3 are values. A CPU cache line will not contain more than one [k,v] pair, because a cache line is about 64 bytes and the size of a [k,v] pair was about 56 bytes.

So iterating through the bucket looking for a key will require each iteration to fetch the next [k,v] pair from RAM.

Compare this with the following layout: k1,k2,k3,… followed by v1,v2,v3. Looking up the first key in the bucket will end up loading at least one more key in the CPU cache-line. And this should make iterations faster.

The downside of this approach shows up if the lookup almost always hits the first key in the bucket. Then the [k1,v1],[k2,v2],[k3,v3] packing is better, because the value is also in the same CPU cache line.

I am wondering if people on this forum know more about this trade-off. Thanks!!
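
The two layouts compared above can be sketched roughly as follows. This is only an illustration, not the Go runtime's actual group layout: the type names, the 8-slot group, the 8-byte key, the 48-byte value (giving the 56-byte pair mentioned above), the leading 8-byte control word, and the assumption that a group starts on a 64-byte cache-line boundary are all hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

const (
	slots     = 8  // assumed number of slots per group
	cacheLine = 64 // assumed cache-line size in bytes
)

type key uint64     // 8 bytes (assumption)
type value [48]byte // 48 bytes, so one pair is 56 bytes (assumption)

type slot struct {
	k key
	v value
}

// interleaved: [k1,v1],[k2,v2],... after the control word.
type interleaved struct {
	ctrl  uint64
	slots [slots]slot
}

// split: k1,k2,... followed by v1,v2,... after the control word.
type split struct {
	ctrl uint64
	keys [slots]key
	vals [slots]value
}

// countLines counts the distinct cache lines covered by the byte ranges
// [start, end), assuming the struct starts on a cache-line boundary.
func countLines(spans [][2]uintptr) int {
	seen := map[uintptr]bool{}
	for _, s := range spans {
		for l := s[0] / cacheLine; l <= (s[1]-1)/cacheLine; l++ {
			seen[l] = true
		}
	}
	return len(seen)
}

// interleavedKeyLines: cache lines touched when reading every key.
func interleavedKeyLines() int {
	var g interleaved
	var spans [][2]uintptr
	for i := uintptr(0); i < slots; i++ {
		start := unsafe.Offsetof(g.slots) + i*unsafe.Sizeof(slot{})
		spans = append(spans, [2]uintptr{start, start + unsafe.Sizeof(key(0))})
	}
	return countLines(spans)
}

// splitKeyLines: the same measurement for the split layout.
func splitKeyLines() int {
	var g split
	var spans [][2]uintptr
	for i := uintptr(0); i < slots; i++ {
		start := unsafe.Offsetof(g.keys) + i*unsafe.Sizeof(key(0))
		spans = append(spans, [2]uintptr{start, start + unsafe.Sizeof(key(0))})
	}
	return countLines(spans)
}

// interleavedFirstPairLines: cache lines touched by key 0 plus value 0.
func interleavedFirstPairLines() int {
	var g interleaved
	start := unsafe.Offsetof(g.slots)
	return countLines([][2]uintptr{{start, start + unsafe.Sizeof(slot{})}})
}

// splitFirstPairLines: the same measurement for the split layout.
func splitFirstPairLines() int {
	var g split
	k := unsafe.Offsetof(g.keys)
	v := unsafe.Offsetof(g.vals)
	return countLines([][2]uintptr{
		{k, k + unsafe.Sizeof(key(0))},
		{v, v + unsafe.Sizeof(value{})},
	})
}

func main() {
	fmt.Println("all keys, interleaved:", interleavedKeyLines())
	fmt.Println("all keys, split:      ", splitKeyLines())
	fmt.Println("first pair, interleaved:", interleavedFirstPairLines())
	fmt.Println("first pair, split:      ", splitFirstPairLines())
}
```

Under these assumptions, scanning every key touches far fewer cache lines in the split layout, while a hit on the first slot touches fewer lines in the interleaved layout, which is the trade-off described above.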

aaronbee•6mo ago

The trade-off is discussed here: https://github.com/golang/go/issues/70835

tialaramex•6mo ago

We're not "iterating through the bucket" in the sense you mean. There's a control word which tells us which slots might have our key, and so we never need to look at keys which do not match the byte from our hash used in the control word.

In most cases there are zero or one matches in the control word, so the interleaving could not help us, but it would still hurt us if N=1 and it's a match, which is the common happy path when keys looked up always or almost always exist by design.
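
A minimal sketch of the control-word idea, assuming an 8-slot group whose control word holds one byte per slot: a 7-bit hash fragment (called h2 here) for full slots and 0x80 for empty ones. The `group` type, `set`, `matchH2`, and `lookup` are hypothetical names for illustration, not the Go runtime's API. The byte-parallel comparison can in principle report a slot whose byte does not match, which is harmless because the key is compared afterwards.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	lsb = 0x0101010101010101 // low bit of every byte
	msb = 0x8080808080808080 // high bit of every byte; also "all slots empty"
)

// group is a hypothetical 8-slot group: one control byte per slot.
type group struct {
	ctrl uint64
	keys [8]string
	vals [8]int
}

func newGroup() *group { return &group{ctrl: msb} }

// set stores key/val in slot i and records h2 (must be < 0x80) in the control word.
func (g *group) set(i int, h2 uint8, key string, val int) {
	shift := uint(8 * i)
	g.ctrl = g.ctrl&^(uint64(0xFF)<<shift) | uint64(h2)<<shift
	g.keys[i], g.vals[i] = key, val
}

// matchH2 returns a mask with the high bit set in each byte position whose
// control byte may equal h2 (all 8 bytes are compared in one step).
func matchH2(ctrl uint64, h2 uint8) uint64 {
	v := ctrl ^ (lsb * uint64(h2)) // matching bytes become zero
	return ((v - lsb) &^ v) & msb  // flag the zero bytes
}

// lookup compares keys only in candidate slots and reports how many
// key comparisons were needed.
func (g *group) lookup(h2 uint8, key string) (val int, found bool, compared int) {
	for m := matchH2(g.ctrl, h2); m != 0; m &= m - 1 {
		i := bits.TrailingZeros64(m) / 8 // slot index of the lowest candidate
		compared++
		if g.keys[i] == key {
			return g.vals[i], true, compared
		}
	}
	return 0, false, compared
}

func main() {
	g := newGroup()
	g.set(0, 0x11, "a", 1)
	g.set(1, 0x22, "b", 2)
	g.set(7, 0x22, "c", 3)

	fmt.Println(g.lookup(0x11, "a")) // one candidate slot
	fmt.Println(g.lookup(0x22, "c")) // two slots share this h2
	fmt.Println(g.lookup(0x33, "z")) // no candidates, no keys read
}
```

With a unique h2 the lookup reads exactly one key, and a miss reads none, which is why the number of keys per cache line matters less than it would for a linear scan.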
