frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Start all of your commands with a comma

https://rhodesmill.org/brandon/2009/commands-with-comma/
100•theblazehen•2d ago•22 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
654•klaussilveira•13h ago•189 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
944•xnx•19h ago•549 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
119•matheusalmeida•2d ago•29 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
38•helloplanets•4d ago•38 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
48•videotopia•4d ago•1 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
227•isitcontent•14h ago•25 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
14•kaonwarb•3d ago•17 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
219•dmpetrov•14h ago•113 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
327•vecti•16h ago•143 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
378•ostacke•19h ago•94 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
487•todsacerdoti•21h ago•241 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
359•aktau•20h ago•181 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
286•eljojo•16h ago•167 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
409•lstoll•20h ago•276 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
21•jesperordrup•4h ago•12 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
87•quibono•4d ago•21 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
59•kmm•5d ago•4 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
3•speckx•3d ago•2 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
31•romes•4d ago•3 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
250•i5heu•16h ago•194 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
15•bikenaga•3d ago•3 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
56•gfortaine•11h ago•23 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1062•cdrnsf•23h ago•444 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
144•SerCe•9h ago•133 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
180•limoce•3d ago•97 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
287•surprisetalk•3d ago•41 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
147•vmatsiiako•18h ago•67 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
72•phreda4•13h ago•14 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
29•gmays•9h ago•12 comments
Open in hackernews

The Cost of a Closure in C: The Rest

https://thephd.dev/the-cost-of-a-closure-in-c-c2y-followup
66•ingve•1mo ago

Comments

andsoitis•1mo ago
> Cost of a Closure in C

C does not have closures. You could simulate closures, but it is neither robust not automatic compared to languages tha truly support them.

aragilar•1mo ago
C does not currently have closures, the post it looking at their performance properties with an eye for what form closures should be added to the standard.
skavi•1mo ago
I think maybe this post would make more sense if you were familiar to its antecedent [0] which compares existing closure extensions to C. IIRC the comparison was in the context of considering designs for standardization.

[0]: https://thephd.dev/the-cost-of-a-closure-in-c-c2y

pjmlp•1mo ago
Completely missed the point of the article, this is about ongoing C2y efforts at WG14.
flohofwoe•1mo ago
The post is about a potential new feature of C. The author is working on the C committee and has contributed new C features in the past.
userbinator•1mo ago
Those graph axes units are... perplexing, to say the least.

So, we need to swap to the logarithmic graphs to get a better picture

I wish more people would know about decibels.

throw_await•1mo ago
Logarithmic axes are inifinitely better than decibels, because dB are just to overloaded semantically
Someone•1mo ago
>> So, we need to swap to the logarithmic graphs to get a better picture

> I wish more people would know about decibels.

Huh? Is there any difference? https://en.wikipedia.org/wiki/Decibel:

“The decibel […] expresses the ratio of two values of a power or root-power quantity on a logarithmic scale”

Panzerschrek•1mo ago
Why struggling using qsort? std::sort from C++ is much better in terms of usability and performance.
pjmlp•1mo ago
Because some people will never move beyond C, no matter what.
tliltocatl•1mo ago
People will never move beyond C (among other reasons) because C allows precise control over memory allocation. Closures and precise control over memory allocation doesn't play very well together.
spacechild1•1mo ago
> Closures and precise control over memory allocation doesn't play very well together.

How so? In C++ a lambda is just a regular type that does not allocate any memory by itself. You have in fact precise control over how/where a lambda is allocated.

flohofwoe•1mo ago
True, but once a capture needs to survive the parent function scope you'll need to store it somewhere, either via a std::function object which has opaque rules on whether a heap allocation happens or via your own std::function implementation where you define the heap allocation rules but then will have to face discussions about why you're rewriting the stdlib ;)

Any C implementation of capturing lambdas has the same problem of course, that's why the whole idea doesn't really fit into the C language IMHO.

fuhsnn•1mo ago
I did a version with explicit capture objects for fun: https://github.com/fuhsnn/c-extensions#extended-capture

Basically, with each capturing lambda, a context type _Ctxof(fn) is created that users can then declare on stack or on heap themselves.

uecker•1mo ago
Interesting! Can you tell me about your experience? How does it compare to the versions in N3654? (where this idea comes from) https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3654.pdf
fuhsnn•1mo ago
- Non-capturing isn't that much simpler to implement for the front-end, because they still need to distinguish between a file-scope name or a shadowing local in parent function. And once you have the code to do that distinction, it's not that far away from implementing captures.

- From implementor point of view I just don't like the idea of a new category of function qualifiers, so I used "_Wideof(int(void))" over "int(void) _Wide", and "int fn({...}, void)" over "int fn(void) _Closure/_Capture(...)".

- During prototype scope of the inner function, parameters may have VM type that reference from parent's local scope, so the capturing (or not) effectively start at prototype scope, earlier than C++'s model. My syntax has capture clause right before parameters so it is somewhat manage-able in one pass; for function qualifier syntax the parser probably need to either delay semantic checking the VMs or skip ahead to find _Closure/_Capture before processing the parameters.

- Both N3654-example6 and N3694-4.4.6 implicitly create capture context on stack then provide reference to it (N3654 with &_Closure(), N3694 with inner function's name). My version is lower level in that only the type of capture context (_Ctxof(fn)) is implicitly created, users then call a helper (modeled after va_start) to initialize the context object. I believe it also reduces surprises from by-value captures, since users clearly see where they initialize a context object relative to local variable changes.

- Sans syntax, N3654's "specified argument" is pretty close to my implementation, just need to change the source of context pointer from r10 to the argument in function prologue.

- I implemented wide function pointers as generalized "function pointers with context pointer pushed in r10", so it's also possible to make wide pointers from regular functions, they'll just be called with an extra nullptr in r10. I think it's more flexible this way.

What I did was mostly "the compiler will need these anyways" so should be adaptable to whatever WG14 settled on, or if I ever want to implement C++11 lambda.

uecker•1mo ago
Thanks! What I do not quite understand is the need for a capture clause that lists the captured variables. As a user, I do not want to list my captures. As a code reviewer I certainly do not want value capture that create a modifiable copy with the same name, or any ambiguity on whether something is a copy or not. I just want to access my variables. So what is really the point here?

Regarding "also possible to make wide pointers from regular functions", this exactly the reason why I like the qualifer version: The usual conversion rules for qualifiers would allow this conversion implicitly, while back-conversion is not allowed (or only with cast).

pjmlp•1mo ago
Urban myths by people frozen in time, systems programming languages predate C by a decade, and it isn't as if the world stopped outside Bell Labs in the 1970's.
tliltocatl•1mo ago
Which systems (as in zero-runtime) language would you choose if not C (even if we disregard the fact that no SoC vendor support anything but C)?

- Ada? Many cool features, but that's also the problem - it's damn complicated.

- C++ combines simplicity of Ada with safety of C.

- Rust and Zig are only ~10 years old and haven't really stabilized yet. They also start to suffer from npm-like syndrome, which is much more problematic for a systems language.

- ATS? F#? Not all low-level stuff needs (or can afford) this level of assurance.

- Idris? Much nicer than ATS, but it's even younger than Rust and still a running target (and I'm not sure if zero-runtime support is there yet).

I mean, yes, C is missing tons of potentially useful features (syntactic macros, for one thing), but closures are not one of them.

tialaramex•1mo ago
It is insane to assert that Rust hasn't "really stabilized yet" in comparison to C, a language which has been replaced twice (as C17 and C23) after Rust 1.0
1718627440•1mo ago
Except that those are more like bug fixes, it's unlikely to have something newer than C11 and most are still on C99.
pjmlp•1mo ago
Show us that you haven't read ISO C documents without telling us.
1718627440•1mo ago
True, at most I read excerpts when I have a question. Can you tell me what gave it away? I thought the saying is that C17 is the sane version of C11 and C23 has quite some changes, but is way to new to be counted on.
uecker•1mo ago
C17 is indeed a bug fix release. C23 finally removed some features that were deprecated a long time ago already in the first standardized version (auto as storage classifier, K&R function definitions, empty parenthesis as "arbitrary arguments") and also support for ones' complement. So yes, C is extremely backwards compatible.
1718627440•1mo ago
Auto as storage classifier was deprecated? TIL.
uecker•1mo ago
Ah no, sorry, my mistake. And it is still a storage classifier, but now (in C23) it is mentioned that it will become a type specifier.
dfawcus•1mo ago
Actually lots is still on C89.

I'm trying to drag one program at $employer up to C99 (plus C11 _Generic), so I can then subsequently drag it to the bits of C23 which GCC 13 supports.

This all takes times, and having to convince colleagues during code reviews.

What C23 has done is authorise some of the extensions which GCC has had for some time as legitimate things (typeof, etc).

However the ability to adopt is also limited by what third party linters in use at $employer may also support.

tliltocatl•1mo ago
C17 didn't introduce any changes to the language. C23 mostly standardized existing compiler-specific goodies. None of this broke existing code. When I run `sudo dnf upgrade gcc` I can be 99.9% sure my old code still compiles.

Compare e. g. https://github.com/rust-lang/rust/pull/102750/ (I'm not following Rust development closely, picked up the first one). Yes, developers do rely on non-guaranteed behavior, that's their nature. C would likely standardize old behavior. Basically all of the C standard is a promoted vendor-specific extension - for better or worse.

Here is another good one: https://internals.rust-lang.org/t/type-inference-breakage-in...

tialaramex•1mo ago
C17 ostensibly "didn't change" the language but it did still need to fix a bunch of stuff and it chose to do this by just issuing an entirely new language standards document.

C23 on the other hand landed a bunch of hard changes, and "it didn't break my code" only matches the reality most people observe with Rust too. The changes you mention didn't break my Rust either.

But some trivial C did break because C23 is a new language

https://c.godbolt.org/z/n9vhMGYW5

In fact those Rust changes aren't language changes unlike C23, the first you linked is a choice for the compiler to improve on layout it didn't guarantee, anybody who was relying on that layout (and there were a few) was never promised this wouldn't change, next version, next week or indeed the next time they compiled the same code. You can even ask Rust's compiler to help light a fire under you on this by arbitrarily changing some layout between builds, so that stuff which depends on unpromised layout breaks earlier, reminding you to actually tell Rust e.g. repr(C) "I need the same layout guarantees for my type T as I'd get in C" or repr(transparent) "I need to ensure my type wrapper T has the same layout as the type I'm wrapping"

The second isn't a language change at all, it's a change to a library feature, so now we're down to "C isn't stable until the C library is forever unchanging" which hopefully you recognise as entirely silly. Needless to say nobody does that.

pjmlp•1mo ago
You can start by learning that even C has a tiny runtime, followed by history of systems programming languages, starting with JOVIAL in 1958.

Even if not perfect, "Typescript for C" came out in 1983.

tliltocatl•1mo ago
> even C has a tiny runtime

Which functions are required for a C program to run aside from those that it run explicitly? memcpy? It's basically a missing CPU instruction. malloc will never get called unless you call it yourself.

> followed by history of systems programming languages, starting with JOVIAL in 1958.

All system languages (e. g. BLISS, PL/S, NEWT) designed as such before C was vendor-specific. Some of these had nice things missing from C, but none propagated beyond their original platform. And today we have no option but C.

> "Typescript for C" came out in 1983.

C++ is not just "not perfect", it is far worse in every way. Let's let people overload everything. And let's overload shift operator to do IO. And make every error message no less than 1000 lines, with actual error somewhere in the middle. Let's break union and struct type punning because screw you that's why. You say C macros are unreadable? Behold, template metaprogramming! C is not perfect, but it has the justification of being minimal. C++ is a huge garbage dumpster on fire.

pjmlp•1mo ago
Everything that runs before main(), and on exit, floating point emulation when needed and IEEE support, signal handlers, threading support since C11, at least, then depends on what else Compiler C has as extensions.

C was also vendor specific to Bell Labs.

C++ is definitely Typescript for C.

1718627440•1mo ago
Because C is a much nicer language than C++ by some measures.
dfawcus•1mo ago
Nah - more that a lot of commercial code is written in it; and it doesn't make sense to replace (or rewrite) it at this time.

For example, I'm maintaining some 20 year old C code, which the employer adopted around 10 years ago. It will likely stay in use at least until the current product is replaced, whenever that may be.

1718627440•1mo ago
Let me rephrase that: I feel myself addressed by "some people will never move beyond C, no matter what" and I prefer C over C++, because it is a much simpler language. Each time I am leaving the cozy world of C and write C++ I am annoyed and miss things.
pjmlp•1mo ago
Nah, C++ gives you the safety tools, even if with some flaws, that WG14 does not have any interest to ever add to C.
flohofwoe•1mo ago
Anectodal and probably my bubble, but quite a few people had moved from C to C++ long time ago, but then got fed up with where modern C++ is heading (mostly around C++14 and C++17) and went back to C.
pjmlp•1mo ago
Except it doesn't offer the safety alternatives that C++ has, thus it is a big unsafe bunch of code.

Using C++ doesn't mean having to use the whole standard.

Now if C type safety actually was like Modula-2, Object Pascal or Zig, that would not be as bad.

flohofwoe•1mo ago
> Except it doesn't offer the safety alternatives that C++ has, thus it is a big unsafe bunch of code.

C++ adds more new unsafe concepts on top of C than C ever had though ;) (see std::view, std::range or good old iterator invalidation - at least in C the unsafety is in your face and not hidden under layers of stdlib code)

flohofwoe•1mo ago
FWIW, when the C compiler has access to the qsort implementation body and the comparer function like the C++ compiler has, it will also stamp out a specialized version that will in most cases be identical with the C++ compiler output for std::sort (assuming it implements the same sorting algorithm as C's qsort).

That whole 'zero cost abstraction' idea is not unique to C++ since all the important work to make the 'zero cost thing' happen is performed down in the optimizer passes, and those are identical between C and C++.

kccqzy•1mo ago
Yeah but C++ encourages you to put things into templates which are inside headers, so of course the implementation body is exposed to the compiler all the time. Whereas a C compiler usually won’t have access to the qsort implementation body.
Panzerschrek•1mo ago
I see no inlining of qsort in at least two major compilers: https://godbolt.org/z/59xehaqnv.
flohofwoe•1mo ago
Yes, that's why C++ std::sort is usually recommended over C qsort. But either compiling with -flto or providing your own fully inlineable qsort should do the trick.
tialaramex•1mo ago
If you care about performance and are willing to use a different programming language surely it would make more sense to use Rust's unstable sort ?
djaouen•1mo ago
Lisp solves the performance hit of closures with macros. However, given what macros look like in C, I hope it never amounts to that!
skavi•1mo ago
There’s no necessary performance hit for closures. The performance cost here is caused by these closures needing to conform to a function pointer looking interface in order to be generally useful in C.
djaouen•1mo ago
Iirc, in Lisp, closures are heap-allocated, unless they are created at compile-time with macros. Therefore, there is the added overhead of malloc and free calls with each closure created. Am I wrong here?
BoingBoomTschak•1mo ago
Yes, in that normal closures being stack or heap allocated is an implementation detail; at least in CL. SBCL can stack allocate some closures if DYNAMIC-EXTENT is used: https://www.sbcl.org/manual/#Stack-allocation-1
pfdietz•1mo ago
Or I believe if it can determine the closure has dynamic extent, for example if it's passed to a standard function for which it knows the closure will not escape.
BoingBoomTschak•1mo ago
Indeed, it's newish (from this summer https://github.com/sbcl/sbcl/releases/tag/sbcl-2.5.6) and mainly thanks to C. Zhang's work.

https://github.com/sbcl/sbcl/commit/021414445224e692ebb9c48c... https://github.com/sbcl/sbcl/commit/2c56c5901df1a5ea98ae19e9...

pfdietz•1mo ago
Lisp implementations typically don't use malloc and (especially) free. There is GC overhead.
kazinator•1mo ago
Definitely. Lisp is a family of languages that has now existed for almost 70 years. There simply isn't one way that closures work in "Lisp". Some implementations of languages in the lisp family have compilers that have sophisticated ways of handling closures. They carefully classify what is or is not captured by closure (and what is shared with other closures and what is mutably shared), and try to guess whether a closure can escape from the environment where it's created. Different code generation strategies, and strategies for representing the environment, apply to different situations.
uecker•1mo ago
Since it mentions my paper again: "It also brings into question whether a “lean” approach that grabs the “environment pointer” or the “stack frame” pointer directly, as in n3654 is a good idea to start with." which is annoying since the approach promoted in n3654 is not even tested here. At least he continues with "But, it’s premature to condemn n3654 because it’s unknown if the problem is the fact that the use of accessing variables through what is effectively __builtin_stack_address and a trampoline is why performance sucks so bad, or if it’s the way the trampoline screws with the stack". So why insinuating it when you have no idea? It is well known that the trampoline approach is costly, and it seems particular bad on arm. But n3654 does not promote trampolines and instead wide function pointers which is similar to std::function_ref, but without wasting the static chain register and avoiding the need to create a thunk when converting a regular function pointer.

I am also not entirely sure whether "manorboy" is a good benchmark, but for estimating the function call overhead it should be ok. For the access to variables part, other considerations are far more important IMHO. So I am not sure I would put too much weight on the accessing "k" part of the post.

n3654 contains a criticism of the lambda approach from a language-design perspective. The issue is that it seems not to be a good fit for C. Copying values is always cheap, but this obviously works in toy examples but not necessarily for interesting data structures. In C++ you could then capture a pointer as a value, but then you haven't avoided an indirection either. In general, this is fine in C++ as smart pointer than deal with the memory management, but in C having a captured value in a lambda you can not have explicit access to anymore does not make too much sense.

The "unfortunately, is that unlike C++ there are no templates in C" is also interesting. I fled from C++ back to C exactly because of templates. In a performance context, the fallacy is that you can always create super-optimized code using compile-time techniques that absolutely shines in microbenchmarks (such as this one) but cause a lot of bloat on larger scale (and long compilations times). If you want this, I think you should stick to C++.

dfawcus•1mo ago
Well his "Normal Functions" (benchmarks/closures/source/normal_functions.cpp in his repo) looks quite similar to what I had with my GNU nested functions using a stand in "wide pointer", and hence no generated trampoline.

(https://news.ycombinator.com/item?id=46243298)

Which rather suggests to me that such a scheme, but generated by the compiler, should have a similar performance to said "Normal Functions" and hence similar to his preferred lambda form.

Since his benchmark environment is so unwieldy, I may have a go at extracting those two code sets to a standalone environment, and measure them so see...

uecker•1mo ago
So here are my preliminary benchmarks with my own implementation on an AMD EPYC 9334 32-Core processo. I need to double checks things - so take this with a grain of salt for now. Time is in seconds for 100000 iterations of manorboy(10). So far, the only implementation which clearly sucks is std::function<>. Even trampolines are suprisingly good (but I can imagine that they are much worse on other CPUs / architectures)

  xgcc (GCC) 16.0.0 20260103 (experimental)
  1.50 gcc -ftrampoline-impl=stack -Wl,-no-warn-execstack
  1.11 gcc -ftrampoline-impl=stack -Wl,-no-warn-execstack -DREFARG
  7.21 gcc -ftrampoline-impl=heap
  7.34 gcc -ftrampoline-impl=heap -DREFARG
  0.93 gcc -DWIDEPTR
  1.38 gcc -DWIDEPTR -DREFARG
  1.40 gcc -DDIRECT
  1.05 gcc -xc++ -std=c++26 -DFUNCREF -DDEDUCING
  19.68 gcc -xc++ -std=c++26 -DDEDUCING
  20.73 gcc -xc++ -std=c++26
  6.31 gcc -xc++ -std=c++26 -DDEDUCING -DREFARG
  6.31 gcc -xc++ -std=c++26 -DREFARG
  Debian clang version 16.0.6 (15~deb12u1)
  21.11 clang -xc++
  6.16 clang -xc++ -DREFARG
  1.66 clang -fblocks
  1.70 clang -fblocks -DREFARG
uecker•1mo ago
Here some more bechmarking: https://codeberg.org/uecker/nested-functions-benchmarking
dfawcus•1mo ago
Ah - thanks. I'll have a play with some of my systems, and see what it shows.
tialaramex•1mo ago
> I fled from C++ back to C exactly because of templates

Obviously I sympathize as someone who spent the front half of their career writing C and has no interest in writing C++ beyond toy examples. Nevertheless, IMO you're cutting off a choice here, forcing yourself to make potentially sub-optimal choices.

Sometimes the templates would have been better and writing X-macros is just worse, ergonomically at least. Perhaps C++ people use templates more often than they should, but I'm sure "never" is not the correct amount.

Still, if C gets fat pointers that's at least a step in the right direction. So I must wish you the best with that part.

uecker•1mo ago
I used to write expression template libraries, so I know what C++ can do. I just figured out I am a lot more productive in C than in C++ and that in the time I waste on C++ I can do a lot more in C. I also do rarely use macros in C, so what C++ people get wrong is assuming that C programmer would do the same thing just using macros. This is not the case. One has to realize that compile-time programming is 95% of times a waste of time and effort and the same thing can be achieved in an easier and better way.

I also recently removed some code written with templates in C++ from one of my projects, because this single file caused compilation times for a ca. 450 file C project to increasse by 50% from 20s to ca. 30s.

1718627440•1mo ago
> Still, if C gets fat pointers that's at least a step in the right direction.

AFAIK the C standard doesn't prevent an implementation from using fat pointers, this is one of the reasons why the conversion from pointer to integer only works in one direction. This is actually necessary for segmented memory. The compiler is allowed to optimize based on the assumption, which allocation a pointer comes from (including the allocations boundaries, i.e. size), even if bytewise the pointers would be equal, so you could argue, that C abstract machine already has fat pointers.

uecker•1mo ago
Indeed. C implementation could just use fat pointers, but it is unrealistic that they now break their ABI. Also the cost of fat pointers is higher when not needed. So the plan is to give them a new wide pointer type.
oflebbe•1mo ago
Actually the author is building onto my suggestion https://github.com/soasis/idk/pull/17/files which I created several years ago, but adapted a bit too fast while submitting. And yes it looks very similar to GNU nested functions, since I started tinkering with these first.

I am really not sure if all these observations mentioned in the article are 100% correct, though.

First : Code seems to be compiled with clang. On Linux with gcc the native function one is way faster than the clang one.

Second: The author does run the code on ARM64/MacOS .

At least on my ryzen CPU on Linux with gcc the "normal C code" is way faster than anything else. Not that we do not need to thing about "closure" type functionality, but one should be careful to extrapolate implementations from one compiler on one platform to the rest of the pack.

Regarding N3654 I am not sure how to benchmark it here, since C could potentially use __builtin_call_with_static_chain , but I am not sure how to write the function to use the chain for accessing the variables.

I tried to estimate N3654 it by using "tinygo" which is AFAIK using the usual Calling ABI, but it was a factor of two slower than clang. Even "go" with its very specific ABI is still much slower. I discovered this isn't representative since runtime calling costs had been totally shadowed by costs of allocations.

Even the rust example I am usually using http://www.reddit.com/r/rust/comments/2t80mw/the_man_or_boy_... is much slower than anything else, presumably because of the "Cell" needed

TLDR: This micro benchmark might be misleading

uecker•1mo ago
See my other comment in this thread with my preliminary benchmarking results. I have a patched GCC with two new builtins __builtin_static_chain and __builtin_nested_code (needs a better name), that give you the static chain pointer and the code pointer without creating a trampoline. Then I put both into a structure to simulate a wide pointer. Later I call it with __builtin_call_with_static_chain.

A trick one can do is to let it create the trampoline and then read off the two pointer from the position in the code where it is stored. Not portable and you still have the overhead for creating the trampoline, but you do not need the executable stack anymore.

greatgib•1mo ago
This article is so badly written, it makes it hard to understand something that should be easy to understand...