I spent almost a month chasing a silent data corruption issue that turned out to be floating-point non-determinism between x86 and ARM chips. It completely changed how I look at "reliable" memory.
What was your "white whale" bug of the year?
I have a write-only table in MariaDB where the ordering of records matters. I realised that the database has no such thing as an append-only table that stores records in the order they are submitted. Every record has one or more indices, and it is these indices that dictate the ordering, and only for the data they index.

What I had overlooked is that when transaction A starts and then transaction B starts, transaction A may hold records with smaller keys, since it started first, yet transaction B can commit first with higher keys, leaving me with out-of-order entries. Whether that is a problem depends on the context, and in my case the context was readers constantly waiting for new records. If a reader reads after transaction B commits but before transaction A commits, it will never see the new records from transaction A. I solved it by blocking readers based on the number of active transactions, with ordering taken into account.
I wrote about it in this blog post, in the "Event Log and proper ordering of events" section: https://gethly.com/blog/how-of-gethly/event-sourcing-right-w...
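To make the blocking idea concrete, here is a minimal Python sketch of a commit watermark (all names are hypothetical, and assigning one sequence per transaction is a simplification of what my code actually does): readers only consume records below the start of the oldest still-active transaction, so a late-committing transaction can never be skipped.

    import threading

    class CommitWatermark:
        """Tracks in-flight writer transactions so readers only consume
        records below the oldest still-active transaction.
        (Hypothetical sketch, not MariaDB internals.)"""

        def __init__(self):
            self._lock = threading.Lock()
            self._next_seq = 0    # next sequence number to hand out
            self._active = set()  # sequences of in-flight transactions

        def begin(self) -> int:
            with self._lock:
                seq = self._next_seq
                self._next_seq += 1
                self._active.add(seq)
                return seq

        def commit(self, seq: int) -> None:
            with self._lock:
                self._active.discard(seq)

        def safe_upto(self) -> int:
            """Exclusive upper bound for readers: everything below the
            oldest active transaction is committed and in order."""
            with self._lock:
                return min(self._active) if self._active else self._next_seq

A reader then polls safe_upto() and only reads records with a sequence below that bound; once the slow transaction A commits, the watermark advances past it and A's records become visible in order.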
I spent 6 months chasing 'ghosts' in my backtests that turned out to be floating-point drift between my Mac and the production Linux server. I realized exactly what you said: if state isn't replayable bit-for-bit, it's not engineering.
I actually ended up rewriting HNSW using Q16.16 fixed-point math just to force 'reality to line up' again. It's painful to lose the raw speed of AVX floats, but getting 'Engineering' back was worth it. Check it out: https://github.com/varshith-Git/Valori-Kernel
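For anyone curious what Q16.16 buys you, here is a generic Python illustration (not code from that repo): values become scaled integers, so both the multiplies and the accumulation order are exact and bit-identical on any host.

    # Q16.16: a value x is stored as the integer round(x * 2**16).
    # Integer ops are bit-identical across x86 and ARM, unlike float
    # pipelines that differ in FMA contraction or SIMD reduction order.
    FRAC_BITS = 16
    ONE = 1 << FRAC_BITS

    def to_fix(x: float) -> int:
        return round(x * ONE)

    def from_fix(q: int) -> float:
        return q / ONE

    def fix_mul(a: int, b: int) -> int:
        # Full-width product, then shift back down to Q16.16.
        return (a * b) >> FRAC_BITS

    def fix_dot(u: list[int], v: list[int]) -> int:
        # Fixed accumulation order -> identical distances everywhere.
        acc = 0
        for a, b in zip(u, v):
            acc += fix_mul(a, b)
        return acc

    u = [to_fix(x) for x in (0.5, -1.25, 3.0)]
    v = [to_fix(x) for x in (2.0, 0.75, -0.5)]
    print(from_fix(fix_dot(u, v)))  # -1.4375 on every platform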
After profiling, I found two bottlenecks. First, converting frames to RGB was happening on the CPU and was quite costly, so I rendered the decoded YUV frames directly on the GPU without any conversion. Second, I moved all the playback logic off the main thread, since our heavy UI was competing for the same resources.
The main-thread issue was that I was iterating through the frame buffer multiple times per second to select the appropriate frame for rendering. When heavy UI animations occurred, the main thread would block and the iteration would finish late; by then the target frame's timestamp had already passed, so that frame got skipped and only the next one was drawn, creating visible stuttering.
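A rough Python sketch of the pattern (all names hypothetical; the real player is presumably not Python): the frame-selection loop runs on its own thread, so main-thread jank can no longer delay the timestamp check.

    import threading, time
    from collections import deque

    class FrameScheduler:
        """Selects frames by presentation timestamp on a dedicated
        thread instead of the UI/main thread. 'render' is a stand-in
        callback that hands a frame to the GPU."""

        def __init__(self, render, fps=30):
            self.buffer = deque()   # decoded frames: (pts_seconds, frame)
            self.lock = threading.Lock()
            self.render = render
            self.interval = 1.0 / fps
            threading.Thread(target=self._loop, daemon=True).start()

        def push(self, pts, frame):
            with self.lock:
                self.buffer.append((pts, frame))

        def _loop(self):
            start = time.monotonic()
            while True:
                now = time.monotonic() - start
                chosen = None
                with self.lock:
                    # Keep the newest frame whose timestamp is due,
                    # dropping any older ones it superseded.
                    while self.buffer and self.buffer[0][0] <= now:
                        chosen = self.buffer.popleft()
                if chosen:
                    self.render(chosen[1])  # e.g. upload YUV planes to the GPU
                time.sleep(self.interval / 2)  # poll at twice frame rate

Dropping everything older than the due frame is what avoids the catch-up stutter: the scheduler always presents the freshest frame whose timestamp has passed, no matter how long the UI was blocked.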