Bytes before FLOPS: your algorithm is (mostly) fine, your data isn't

https://www.bitsdraumar.is/bytes-before-flops/
66•bofersen•2mo ago

Comments

jmole•2mo ago
> worst case scenario being the flat profile where program time is roughly evenly distributed

It sounds like the “worst case” here is that the program is already optimized.

bofersen•2mo ago
Author here, kinda sorta. I should've been a bit more specific than that. You can have a profile showing a function taking up 99% of the time, but when you dive into it, there's no clear bottleneck. But just because there's no bottleneck, that doesn't mean it's optimized; conversely, a well-optimized program can have a bottleneck that's already been cycle-squeezed to hell and back.

What I wanted to say was that a spiky profile provides a clear path to optimizing a piece of code, whereas a flat profile usually means there are more fundamental issues (inefficient memory management, pointer chasing all over the place, convoluted object system, etc.).

saghm•2mo ago
It sounds like a flat profile essentially is a local optimum, compared to cases where there's a path "upwards" along a hill to some place more optimal that doesn't require completely changing your strategy.
bofersen•2mo ago
That's actually a good observation, yeah. It's often the case that you dig deeper and deeper and find some incomprehensible spaghetti and just say "fuck it, I'll just do what I can here, should be enough".
Narishma•2mo ago
Not necessarily. It could just be uniformly slow with no particular bottleneck.
lmm•2mo ago
This is a narrative commonly heard from profiler skeptics, but I've never seen a real example.
hansvm•2mo ago
I've seen a few of these in my career, if I understand the author correctly. You have a big ball of mud that can theoretically be 10x or 100x faster, but the costs are diffuse and can't be solved by just finding a hotspot and optimizing it.

It often happens for good reasons. Features get added over time, there are some scars from a mocking framework, simpler (faster) solutions don't quite work because they're supporting X which supports Y which supports Z (dead code, but nobody noticed), people use full datetime handling when they mean to access performance counters, the complexity of the thing means that you blow your branch prediction cache size budget, etc.
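To make the datetime point concrete, here is a minimal C++ sketch of timing a piece of work with a monotonic clock instead of going through calendar time. The helper name `elapsed_ns` is hypothetical, and treating `std::chrono::steady_clock` as the cheap "performance counter" path is an assumption about the platform rather than a guarantee.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: measures how long `work` takes using a monotonic clock.
// No time zones, no calendar conversion, no formatting: just two clock reads.
template <class F>
std::chrono::nanoseconds elapsed_ns(F&& work) {
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    std::forward<F>(work)();
    const std::chrono::nanoseconds elapsed =
        std::chrono::duration_cast<std::chrono::nanoseconds>(clock::now() - start);
    assert(elapsed.count() >= 0);  // steady_clock never goes backwards
    return elapsed;
}
```

Whether this is measurably cheaper than the datetime path depends on the runtime and OS, so it is worth confirming with a profiler before assuming it matters.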

The solution is to deeply understand the problem (lots of techniques, but this comment isn't a blog post) and come up with a solution, like a ground-up rewrite of some or all of the offending section.

colonCapitalDee•2mo ago
Great article. Can confirm, writing performance-focused C# is fun. It's great having the convenience of async, LINQ, and GC for writing non-hot-path "control plane" code, then pulling out Vector<T>, Span<T>, and so on for the hot path.

One question: how portable are performance benefits from tweaks to memory alignment? Is this something where going beyond rough heuristics (sequential access = good, order of magnitude cache sizes, etc.) requires knowing exactly what platform you're targeting?

bofersen•2mo ago
Author here. First of all, thanks for the compliment! It’s tough to get myself to write these days, so any motivation is appreciated.

And yes, once all the usual tricks have been exhausted, the next step is looking at the cache/cache line sizes of the exact CPU you’re targeting and dividing the workload into units that fit inside the (lowest level possible) cache, so it’s always hot. And if you’re into this stuff, then you’re probably aware of cache-oblivious algorithms[0] as well :)
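A rough C++ sketch of what "dividing the workload into units that fit inside the cache" can look like, using a tiled matrix transpose as the stand-in workload. The function name `transpose_tiled` and the tile size of 32 are assumptions for illustration; the right tile size depends on the actual CPU and should be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed tile edge: a 32x32 tile of floats is 4 KiB, so a source tile plus a
// destination tile fit comfortably in a typical 32 KiB L1 data cache.
// This is an illustrative number, not a measured optimum.
constexpr std::size_t kTile = 32;

// Transposes a rows x cols row-major matrix tile by tile, so the cache lines
// touched while processing one tile are still resident when they are reused.
std::vector<float> transpose_tiled(const std::vector<float>& src,
                                   std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<float> dst(src.size());
    for (std::size_t r0 = 0; r0 < rows; r0 += kTile) {
        for (std::size_t c0 = 0; c0 < cols; c0 += kTile) {
            const std::size_t r1 = std::min(r0 + kTile, rows);
            const std::size_t c1 = std::min(c0 + kTile, cols);
            for (std::size_t r = r0; r < r1; ++r) {
                for (std::size_t c = c0; c < c1; ++c) {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
    return dst;
}
```

A cache-oblivious version would instead split the matrix recursively until the pieces are small, which avoids hard-coding `kTile` at the cost of more involved code.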

Personally, I almost never had the need to go too far into platform-specific code (except SIMD, of course); doing all the stuff in the post is 99% of the way there.

And yeah, C# is criminally underrated, I might write a post comparing high-perf code in C++ and C# in the future.

[0]: https://en.wikipedia.org/wiki/Cache-oblivious_algorithm

hansvm•2mo ago
One other trick I use reasonably often is using something more complicated than AoS or SoA layouts. Reasons vary (the false sharing padding in your article is one example), but cache lines are another good one. You might, e.g., want an AoSoA structure to keep the SoA portion of things on a single cache line if you know you'll always need both data elements (the entire struct), want to pack as much data in a cache line as possible, and also want that data to be aligned.
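For anyone who hasn't seen AoSoA before, a minimal C++ sketch of the idea follows. The names `PointBlock` and `Points` are hypothetical, and the 64-byte line size is an assumption (common on x86-64, not universal). It also relies on C++17 so that `std::vector` honours the over-alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;
// 8 floats of x plus 8 floats of y = 64 bytes, i.e. exactly one line.
constexpr std::size_t kLanes = 8;

// One AoSoA block: a small SoA chunk, aligned so it never straddles two lines.
struct alignas(kCacheLine) PointBlock {
    float x[kLanes];
    float y[kLanes];
};
static_assert(sizeof(PointBlock) == kCacheLine, "block should fill one cache line");

// Hypothetical container: element i lives in block i / kLanes, lane i % kLanes.
struct Points {
    std::vector<PointBlock> blocks;
    std::size_t count = 0;

    void push_back(float x, float y) {
        if (count % kLanes == 0) blocks.emplace_back();  // new block, zero-filled
        PointBlock& b = blocks[count / kLanes];
        b.x[count % kLanes] = x;
        b.y[count % kLanes] = y;
        ++count;
    }

    float x(std::size_t i) const {
        assert(i < count);
        return blocks[i / kLanes].x[i % kLanes];
    }

    // Sum of x*x + y*y over all points. Unused lanes in the last block are
    // zero, so the inner loop can always run over the full block.
    float sum_squared_lengths() const {
        float total = 0.0f;
        for (const PointBlock& b : blocks) {
            for (std::size_t lane = 0; lane < kLanes; ++lane) {
                total += b.x[lane] * b.x[lane] + b.y[lane] * b.y[lane];
            }
        }
        return total;
    }
};
```

Each block keeps the x and y values for the same eight points on one line, while the fixed-width inner loop stays friendly to vectorization.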

Great article by the way.

pjc50•2mo ago
Enjoying the C# appreciation.

>> C# has an awesome situation in here with its support for value types (ref structs), slices (spans), stack allocation, SIMD intrinsics (including AVX512!). You can even go bare-metal and GC-free with bflat.

There's been a really solid effort by the maintainers to improve performance in C#, especially with regard to keeping stuff off the heap. I think it's a fantastic language for doing backends in. It's unfortunate that one of the big language users, Unity, has not yet updated to the modern runtime.

VorpalWay•2mo ago
To the list of profiling tools I would like to add KDAB Hotspot and KDE Heaptrack.

The former, Hotspot, is a visualiser for perf data, and it deals OK with truly massive files that make Perfetto and similar tools just bog down. It also supports visualising off-CPU profiles ("why is my program slow but not CPU bound?").

The latter, heaptrack, is a tool with very similar UI to hotspot (I think the two tools share some code even) to profile malloc/free (or new/delete). Sometimes the performance issue is as simple as not reusing a buffer but reallocating it over and over inside a loop. And sometimes you wonder where all the memory is going.
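The "reallocating a buffer inside a loop" pattern is easy to sketch in C++. The workload below (stripping spaces from each line and counting what is left) and both function names are made up for illustration. The only point is where the scratch buffer lives.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Version a heap profiler would flag: a fresh scratch buffer per line, so the
// allocator is hit on every iteration that keeps at least one character.
std::size_t stripped_length_realloc(const std::vector<std::string>& lines) {
    std::size_t total = 0;
    for (const std::string& line : lines) {
        std::vector<char> scratch;  // constructed and destroyed every iteration
        for (char c : line) {
            if (c != ' ') scratch.push_back(c);
        }
        total += scratch.size();
    }
    return total;
}

// Same result with the buffer hoisted out of the loop. clear() drops the
// contents but, in the common standard library implementations, keeps the
// capacity, so allocations stop once the buffer has grown to the longest line.
std::size_t stripped_length_reuse(const std::vector<std::string>& lines) {
    std::size_t total = 0;
    std::vector<char> scratch;
    for (const std::string& line : lines) {
        scratch.clear();
        for (char c : line) {
            if (c != ' ') scratch.push_back(c);
        }
        assert(scratch.size() <= line.size());
        total += scratch.size();
    }
    return total;
}
```

Running both under a malloc profiler should show the allocation count dropping from roughly one or more per line to a handful in total, though the exact numbers depend on the allocator and the growth policy of the vector.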

namibj•2mo ago
Second that, they are very powerful and fast.
TinkersW•2mo ago
Good article, but for profiling you are missing the big daddy of them all, Tracy.