
Quantifying pass-by-value overhead

https://owen.cafe/posts/struct-sizes/
116•todsacerdoti•3mo ago

Comments

anonymous908213•3mo ago
> Don’t pass around data of size 4046-4080 bytes or 8161-8176 bytes, by value (at least not on an AMD Ryzen 3900X).

What a fascinating CPU bug. I am quite curious as to how that came to pass.

sgarland•3mo ago
Me too, and I hope this article gets more traction.
jasonthorsness•3mo ago
Apparently some sizes are cursed!

It would be great to repeat the author’s tests on other CPU models

TuxSH•3mo ago
I wonder what the page size is on his system (and what effective alignment his pointers have). If it's 4K, the sizes look really close to 0x1000 and 0x2000 - maybe crossing page boundaries?
astrange•3mo ago
It's because of cache addressing conflicts. If two addresses have the same cache key, your cache suddenly doesn't work anymore. And many CPUs just use the low address bits as the key instead of hashing them.
Veserv•3mo ago
L1 caches are usually N-way associative, so that should only become a consistent problem if you access N distinct addresses with the same key (in this case the same offset (with likely 64-byte granularity) relative to a 4K boundary).
srcmax•3mo ago
That's called 4K aliasing. 4K aliasing occurs when you store to one memory location, then load from another memory location that is offset by 4 KB from the original.
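
The store/load pattern described above can be sketched minimally (this assumes a 4 KB page size and a CPU that disambiguates loads against pending stores using only the low 12 address bits, as discussed in this thread; the buffer and function names are made up for illustration):

```cpp
// Two accesses exactly 4096 bytes apart share their low 12 address bits.
// On CPUs that compare only those bits for store-to-load forwarding, the
// load is falsely flagged as depending on the store and stalls.
alignas(4096) char buf[2 * 4096];

void aliasing_pattern(int iters) {
    for (int i = 0; i < iters; ++i) {
        buf[0] = static_cast<char>(i);   // store at offset 0
        volatile char x = buf[4096];     // load 4 KB away: same low bits
        (void)x;
    }
}
```

Timing this loop against a variant where the load is, say, 64 bytes away from the store would be one way to test whether the mechanism applies on a given CPU.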
jmalicki•3mo ago
(and if it is not apparent to some readers, most modern x86-based systems use 64 byte cache line sizes, which is sort of analogous to disk block size - quite a few memory operations tend to happen in 64 byte chunks under the covers - the ones that don't are "special")
nabla9•3mo ago
To my knowledge all x86 based systems and ARM and Qualcomm designed chips all use 64 byte cache lines.

Apple's M2 uses 128-byte cache line.

themafia•3mo ago
Would you expect different performance with 2M page sizes? Is this a TLB issue or just a fundamental hardware issue?
gpderetta•3mo ago
Apparently these ~4k spikes are showing up only on AMD, and not on Intel, which is the one known to suffer from the 4k aliasing problem.

I wonder if it has to do with a non-ideal implementation of virtual address resolution for the next page.

codedokode•3mo ago
I usually use ChatGPT for such microbenchmarks (of course I design them myself and use the LLM only as a dumb code generator, so I don't need to remember how to measure time with nanosecond precision; I still have to add workarounds to prevent the compiler from over-optimizing the code). It's amazing that when you get curious (for example, what is the fastest way to find an int in a small sorted array: linear search, binary search, or a branchless full scan?) you can get the answer in a couple of minutes instead of spending 20-30 minutes writing the code manually.

By the way, the fastest way was branchless linear scan up to 32-64 elements, as far as I remember.
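
For readers unfamiliar with the technique, a branchless linear scan over a small sorted array might look like this (a sketch, not the commenter's actual benchmark code; the function name is made up):

```cpp
#include <cstddef>

// Count how many elements are less than the key, adding the comparison
// result (0 or 1) unconditionally instead of branching on the data.
// Returns the index of the first element >= key (a lower bound).
std::size_t branchless_lower_bound(const int* a, std::size_t n, int key) {
    std::size_t idx = 0;
    for (std::size_t i = 0; i < n; ++i)
        idx += static_cast<std::size_t>(a[i] < key);
    return idx;
}
```

Because there is no data-dependent branch to mispredict, the loop pipelines well, which is plausibly why it beats binary search at the 32-64 element sizes mentioned above.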

lurquer•3mo ago
In C++, I’ve noticed that ChatGPT is fixated on unordered_maps. No matter the situation, when I ask what container would be wise to use, it’s always unordered_maps. Even when you tell it the container will have at most a few hundred elements (a size that would allow you to iterate through a vector to find what you are looking for before the unordered_map even has its morning coffee) it pushes the map… with enough prodding, it will eventually concede that a vector pretty much beats everything for small .size()’s.
bmandale•3mo ago
I agree with chatgpt here
remexre•3mo ago
isn't std::unordered_map famously slow, and you really want the hashmap from abseil, or boost, or folly, or [...]
jcelerier•3mo ago
> (a size that would allow you to iterate their a vector to find what your are looking for before the unordered_map even has its morning coffee)

I don't know about this; whenever I've benchmarked it on my use cases, unordered_map started to become faster than vector at well below 100 elements
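
The two strategies being debated can be sketched as follows (helper names are hypothetical; which one wins at a few hundred elements depends on key type, hashing cost, and cache behavior, which is why benchmarking your own workload, as above, is the right call):

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Flat storage: a vector of pairs scanned linearly. Contiguous memory,
// no hashing, but O(n) comparisons per lookup.
const int* find_flat(const std::vector<std::pair<std::string, int>>& v,
                     const std::string& key) {
    for (const auto& kv : v)
        if (kv.first == key) return &kv.second;
    return nullptr;
}

// Hash storage: one hash computation plus a bucket probe, but with
// pointer-chasing and poorer cache locality in typical implementations.
const int* find_map(const std::unordered_map<std::string, int>& m,
                    const std::string& key) {
    auto it = m.find(key);
    return it != m.end() ? &it->second : nullptr;
}
```

For string keys the hash itself costs a full pass over the key, which shifts the crossover point compared to cheap integer keys.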

themafia•3mo ago
> I still have to add workarounds to prevent compiler over-optimizing the code

Yet remembering how to measure time with nanosecond precision is the burden?

> By the way, the fastest way was branchless linear scan up to 32-64 elements, as far as I remember.

The analysis presented in the article is far more interesting, qualified, and useful than what you've produced here.

jklowden•3mo ago
There is no pass-by-value overhead. There are only implementation decisions.

Pass by value describes the semantics of a function call, not the implementation. Passing a const reference in C++ is pass-by-value. If the user opts to pass "a copy" instead, nothing requires the compiler to actually copy the data. The compiler is required only to supply the actual parameter as if it were copied.

duped•3mo ago
Unfortunately "the compiler is required to supply the actual parameter as if it was copied" is leaky with respect to the ABI and linker. In C and C++ you cannot fully abstract it.
mattnewport•3mo ago
This might be true in the abstract but it's not true of actual compilers dealing with real world calling conventions. Absent inlining or whole program optimization, calling conventions across translation units don't leave much room for flexibility.

The semantics of pass by const reference are also not exactly the same as pass by value in C++. The compiler can't in general assume a const reference doesn't alias other arguments or global variables and so has to be more conservative with certain optimizations than with pass by value.
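
The aliasing difference can be made concrete with a small sketch (names hypothetical): with pass-by-value the callee owns a private copy, so a store through an unrelated pointer cannot change it; with a const reference, the reference might alias that pointer, forcing a reload.

```cpp
struct Config { int scale; };

int by_value(Config cfg, int* out) {
    *out = 1;          // cannot alter the local copy `cfg`
    return cfg.scale;  // compiler may keep cfg.scale in a register
}

int by_cref(const Config& cfg, int* out) {
    *out = 1;          // may alias cfg if the caller passed overlapping storage
    return cfg.scale;  // must be reloaded from memory after the store
}
```

Calling `by_cref(c, &c.scale)` returns 1 while `by_value(c, &c.scale)` returns the original value, which is exactly the semantic gap that blocks the optimization.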

themafia•3mo ago
> Passing a const reference in C++ is pass-by-value.

I can cast the const away. The implementation does not hide this detail. The semantics therefore must be understood by the programmer.

Ericson2314•3mo ago
You are thinking "call by value". The author probably used "pass" not "call" specifically to avoid this.
kazinator•3mo ago
There is no difference. Call-by-value is the older term, and I believe it is still preferred in CS academia.
gpderetta•3mo ago
I think that call-by-value/call-by-name/call-by-need[1] are more about strict vs lazy evaluation, as opposed to by-value/by-reference semantics.

[1] There is also call-by-push-value, but I was never able to wrap my mind around it.

layer8•3mo ago
> Passing structs up to size 256 is very cheap, and uses SIMD registers.

Presumably this means for all arguments combined? If for example you pass four pointers each pointing to a 256-byte struct, you probably don’t want to pass all four structs (or even just one or two of the four?) by value instead.

adastra22•3mo ago
If you’re actually passing pointers to heap allocated objects, the pointer is the value.
hyghjiyhu•3mo ago
If I understand correctly he is passing a single struct whose size he varies. This struct is being memcpy'd to stack (except for very small struct sizes that can be passed directly in registers) and so basically we are looking at a curve of memcpy performance by size.
pizlonator•3mo ago
I would ignore this benchmark because it’s not going to predict anything for real world code.

In real world code, your caches and the CPU’s pipeline are influenced by some complex combination of what happens at the call site and what else the program is doing. So, a particular kind of call will perform better or worse than another kind of call depending on what else is happening.

The version of this benchmark that would have had predictive power is if you compared different kinds of call across a sufficiently diverse sampling of large programs that used those calls and also did other interesting things.