Some bits on malloc(0) in C being allowed to return NULL

https://utcc.utoronto.ca/~cks/space/blog/programming/CZeroSizeMallocSomeNotes
41•ingve•1d ago

Comments

bobmcnamara•4h ago
Ages ago I worked with a system where malloc(0) incremented a counter and returned -1.

free(-1) decremented the counter.

This way you could check for leaks :p
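
A minimal sketch of the scheme described above. The names (`counting_malloc`, `counting_free`, `ZERO_SENTINEL`) are hypothetical and the original system's implementation is not known, so this only illustrates the idea:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; the original system's API is not known. */
#define ZERO_SENTINEL ((void *)(uintptr_t)-1)   /* the "-1" pointer */

static long zero_alloc_balance = 0;   /* outstanding zero-size allocations */

void *counting_malloc(size_t size)
{
    if (size == 0) {
        zero_alloc_balance++;
        return ZERO_SENTINEL;   /* never dereferenced, never handed to free() */
    }
    return malloc(size);
}

void counting_free(void *p)
{
    if (p == ZERO_SENTINEL) {
        zero_alloc_balance--;
        return;
    }
    free(p);   /* free(NULL) is a no-op */
}

long counting_balance(void)
{
    return zero_alloc_balance;
}
```

Note that every zero-size request gets the same sentinel value, and the counter only tracks zero-size allocations.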

o11c•4h ago
Noncompliant, since `malloc(0)` is specified to return a unique pointer if it's not `NULL`.

On most platforms an implementation could just return adjacent addresses from the top half of the address space. On 32-bit platforms it doesn't take long to run out of such address space however, and you don't want to waste the space for a bitmap allocator. I suppose you could just use a counter for each 64K region or something, so you can reuse it if the right number of elements has been freed ...

LPisGood•3h ago
Noncompliant, but what could this reasonably impact?
bobmcnamara•3h ago
> Noncompliant, since `malloc(0)` is specified to return a unique pointer if it's not `NULL`.

I know I've seen that somewhere, but may I ask what standard you're referring to?

masfuerte•3h ago
It's POSIX.

> Each [...] allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer shall be returned. If the size of the space requested is 0, the behavior is implementation-defined: either a null pointer shall be returned, or the behavior shall be as if the size were some non-zero value, except that the behavior is undefined if the returned pointer is used to access an object.

https://pubs.opengroup.org/onlinepubs/9799919799/functions/m...

MaxBarraclough•2h ago
Not just POSIX, also the ISO C standard itself. https://en.cppreference.com/w/c/memory/malloc
masfuerte•2h ago
That doesn't say the pointer has to be unique.
jcranmer•2h ago
cppreference isn't the standard, and while the text they write looks like it's the same verbiage that would be authoritative, it's not. (And there's some criticism of it from standards committee members in that regard).

The current C standard text says:

> The order and contiguity of storage allocated by successive calls to the aligned_alloc, calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it can be assigned to a pointer to any type of object with a fundamental alignment requirement and size less than or equal to the size requested. It can then be used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer is returned. If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

So yeah, the allocations are required to be unique (at least until it's free'd).
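
A rough way to see which of the two permitted behaviors a given implementation picks. The enum and function names are hypothetical, and an optimizing compiler is allowed to elide the `malloc`/`free` calls, so treat the result as a hint rather than proof:

```c
#include <stdlib.h>

enum zero_malloc_behavior {
    ZERO_MALLOC_NULL,      /* malloc(0) returned a null pointer */
    ZERO_MALLOC_DISTINCT,  /* two live malloc(0) results differ */
    ZERO_MALLOC_SHARED     /* two live malloc(0) results compare equal */
};

enum zero_malloc_behavior probe_zero_malloc(void)
{
    void *a = malloc(0);
    void *b = malloc(0);
    enum zero_malloc_behavior result;

    /* A null result may also mean the allocator is out of memory. */
    if (a == NULL || b == NULL)
        result = ZERO_MALLOC_NULL;
    else if (a != b)
        result = ZERO_MALLOC_DISTINCT;
    else
        result = ZERO_MALLOC_SHARED;

    /* The pointers are only compared, never dereferenced. */
    free(a);
    free(b);
    return result;
}
```

Under the text quoted above, a conforming implementation should never report `ZERO_MALLOC_SHARED`.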

bobmcnamara•2h ago
It is in ANSI 89, under memory management functions.
o11c•3h ago
Pointers are frequently used as keys for map-like data structures. This introduces collisions that the programmer can't check for, whereas NULL is very often special-cased.
bobmcnamara•3h ago
> Noncompliant, since `malloc(0)` is specified to return a unique pointer if it's not `NULL`.

I know I've seen that somewhere, but may I ask what standard you're referring to?

If I recall correctly, this was an archaic stackless microcontroller. The heap support was mostly a marketing claim.

jmgao•3h ago
C89: https://port70.net/%7Ensz/c/c89/c89-draft.html

If the size of the space requested is zero, the behavior is implementation-defined; the value returned shall be either a null pointer or a unique pointer.

f1shy•3h ago
Isn’t -1 basically 0xffff which is a constant pointer? What am I misinterpreting?
comex•3h ago
If you call malloc(0) multiple times (without freeing in between) and get -1 each time, then the pointer is not unique.
fredoralive•3h ago
Presumably the ANSI C standard or one of the later editions? They also cover the standard library as well as the language. (Presumably the bit about "Each such allocation shall yield a pointer to an object disjoint from any other object." if the random C99 draft I found via google is accurate to the final standard - I suppose you might question if this special use is technically an allocation of course).

Of course, microcontrollers and the like can have somewhat eccentric language implementations and perhaps aren't strictly compliant, and frankly even standard compliant stuff like "int can be 16 bits" might surprise some code that doesn't expect it.

o11c•2h ago
(you duped your comment under the other subthread)

From C89, §7.10.3 "Memory management functions":

> If the size of the space requested is zero, the behavior is implementation-defined; the value returned shall be either a null pointer or a unique pointer.

The wording is different for C99 and POSIX, but I went back as far as possible (despite the poor source material; unlike later standards C89 is only accessible in scans and bad OCR, and also has catastrophic numbering differences). K&R C specifies nothing (it's often quite useless; people didn't actually write against K&R C but against the common subset of extensions of platforms they cared about), but its example implementation adds a block header without checking for 0 so it ends up doing the "unique non-NULL pointer" thing.

sgerenser•4h ago
I might be missing something, but how does this help in checking for leaks? I mean, I guess you could use it to check for leaks specifically of 0-sized allocations, but wouldn’t it be better just to return NULL and guarantee that 0-sized allocations never use any memory at all?
bobmcnamara•3h ago
At the end of main, if the count wasn't balanced, then you knew you had a mismatch between malloc()/free().

If malloc() had returned a real pointer, you'd have to free that too.

> wouldn’t it be better just to return NULL and guarantee that 0-sized allocations never use any memory at all?

Better: takes less memory. Worse: blinds you to this portability issue.

Someone•3h ago
> At the end of main, if the count wasn't balanced, then you knew you had a mismatch between malloc()/free().

A mismatch between malloc(0) and free(-1).

You’d know nothing about calls to malloc with non-zero sizes.

sgerenser•1h ago
Yeah, exactly, that’s my point. How many programs have memory leaks limited to (or even just materially affected by) 0-sized allocations? I’d have to imagine it’s a very small minority.
spacechild1•3h ago
> but wouldn’t it be better just to return NULL and guarantee that 0-sized allocations never use any memory at all?

This works if you are only interested in the overall memory balance. However, if you want to make sure that all malloc() calls are matched by a free() call, you need to distinguish between NULL and a successful zero-sized allocation, otherwise you run into trouble when you call free on an "actual" NULL pointer (which the standard defines as a no-op).

sweetjuly•3h ago
Does this work in practice? Now you have a bunch of invalid but non-NULL pointers flying around. NULL checks which would normally prevent you from accessing invalid pointers now will pass and send you along to deref your bogus pointer.

Even hacking the compiler to treat -1 as equal to NULL as well wouldn't work since lots of software won't free NULL-like pointers.

AaronDinesh•4h ago
Why should it be allowed to return a valid pointer anyway? Surely it should always return NULL?
Joker_vD•4h ago
For instance, because you are prohibited from passing NULL to e.g. memcpy and lots of other library functions from memory.h/string.h, even when you explicitly specify a size of 0.

Another use was to use it to mint unique cookies/addresses, but malloc(1) works for this just as well.
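
A minimal sketch of how code can cope when a zero-length buffer may be represented by a null pointer. The helper name `copy_bytes` is hypothetical; it simply avoids calling `memcpy` at all when there is nothing to copy:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, tolerating null pointers when n == 0. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;   /* skip memcpy so a null dst/src is never passed to it */
    return memcpy(dst, src, n);
}
```

With a guard like this, callers do not need a non-NULL result from `malloc(0)` just to keep the copy well-defined.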

TZubiri•3h ago
Mmmmh, cookies
snickerbockers•4h ago
It's not a valid pointer because you can't use the indirection operator on it. Returning a value other than NULL makes sense because an allocation of size zero is still an allocation.

Additionally the actual amount of memory malloc allocates is implementation-defined so long as it is not less than the amount requested, but accessing this extra memory is undefined behavior since processes don't know if it exists or not. A non-NULL return could be interpreted as malloc(0) allocating more than zero bytes.

Some implementations don't actually perform the allocation until there's a page fault from the process writing to or reading from that memory, so in that sense a non-NULL return is valid too.

I'd argue that malloc(0)==NULL makes less sense because there's no distinction between failure and success.

The only real problem is specifying two alternate behaviors and declaring them both to be equally valid.

cjensen•3h ago
There are three reasonable choices: (a) return the null pointer (b) return a valid unique pointer and (c) abort().

The point of the original C Standard was to make rules about these things AND not break existing implementations. They recognized that (a) and (b) were in existing implementations and were reasonable, and they chose not to break the existing implementations when writing the standard.

This is similar to the extremely unfortunate definition of the NULL macro. There were two existing styles of implementation (bare literal 0 and (void *) 0) and the Standard allows either style. Which means the NULL macro is not entirely safe to use in portable code.
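
One commonly cited case where the two styles of `NULL` differ is a variadic argument list; the comment does not say which case it has in mind, so this is only an illustration. The function `count_strings` is hypothetical and expects `char *` arguments ending with a null pointer:

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts string arguments up to a
   null-pointer terminator. The variadic arguments are read as char *. */
int count_strings(const char *first, ...)
{
    va_list ap;
    int n = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A portable call spells out the terminator's type, as in `count_strings("a", "b", (char *)NULL)`. With a bare `NULL` that expands to `0`, an `int` would be passed where the function reads a pointer.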

commandlinefan•3h ago
> return a valid unique pointer

A pointer to what, though? If the requester asked for 0 bytes of memory, you'd either be pointing to memory allocated for another purpose (!) or allocating a few bytes that weren't asked for.

> This makes people unhappy for various reasons

I read through all the links trying to figure out what those reasons might be and came up empty, I'm still curious why anybody would expect or rely on anything except a null pointer in this instance.

tedunangst•3h ago
You can copy from a zero sized pointer with memcpy, but not NULL.
DSMan195276•3h ago
> allocating a few bytes that weren't asked for.

FWIW the alignment guarantees of `malloc()` mean it often will have to allocate more than you ask for (before C23 anyway). You can't 'legally' use this space, but `malloc()` also can't repurpose it for other allocations because it's not suitably aligned.

That said I still agree it's a hack compared to just using `malloc(1)` for this purpose, it's well-defined and functionally equivalent if you're looking for a unique address. The fact that you don't know what `malloc(0)` is going to do makes it pretty useless anyway.
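
A minimal sketch of the `malloc(1)` approach for minting unique addresses. The names `token_new` and `token_free` are hypothetical:

```c
#include <stdlib.h>

/* Hypothetical "unique token" helpers: the address is the identity;
   the single byte behind it is never read or written. */
void *token_new(void)
{
    return malloc(1);   /* here NULL can only mean allocation failure */
}

void token_free(void *token)
{
    free(token);
}
```

Two live tokens always compare unequal, and a NULL return is unambiguous, which is not something `malloc(0)` can promise across implementations.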

Joker_vD•2h ago
> before C23 anyway

Did they change "suitably aligned for any object type" to "suitably aligned for any object type with size less than or equal to what was requested" or something like that in C23?

JdeBP•1h ago
See https://news.ycombinator.com/item?id=44390258 .
AaronAPU•3h ago
The only requirement which seems reasonable to me, is that the address be unique. Since the allocation size is zero, it should never be accessed for read or write, but the address itself may need to be used for comparisons.

If you’re pointing to a zero sized data it shouldn’t matter what it’s pointing to. Even outside valid address space. Because you shouldn’t be reading or writing more than 0 bytes anyway.

spacechild1•3h ago
> or allocating a few bytes that weren't asked for.

You are always allocating bytes you weren't asked for: the allocation metadata and some extra bytes to satisfy the alignment requirement. If you absolutely don't want to allocate memory, you probably shouldn't have called malloc() in the first place :)

mcherm•2h ago
The behavior of malloc(x) for any positive value x is to either return NULL (meaning that the system was unable to provide a new chunk of memory to use) OR to return a unique pointer to x bytes of data which the program can use.

By extension, if x == 0, doesn't it make sense for the system to either return NULL OR to return a pointer to 0 bytes of memory which the program can use? So the standard promises exactly that: to return either NULL or else a unique pointer where the program has permission to use zero bytes starting at that pointer.

carra•4h ago
Not the best choice to begin the title with "some bits" in this context. My mind was trying to understand this sentence in a completely different way...
eesmith•3h ago
I maintained a program which failed on, as I recall, AIX (mentioned in the essay) because malloc(0) returned NULL.

It's been 30 years so I've forgotten the details. My solution was to always allocate size+1 since memory use was far from critical.

randomNumber7•2h ago
I never had the use case to allocate 0 bytes of memory.

If I would allocate 0 bytes of memory and get a pointer to it, I wouldn't care what the value of the pointer is since I am not allowed to dereference it anyways.

But then again, why would I allocate 0 bytes of memory?

Lvl999Noob•2h ago
Can someone tell me a usecase where you want multiple allocations of size 0, each one with a unique address, and each one unique from any other allocation (hence necessarily removing that pointer from being allocated to anything else) but can't use malloc(1) instead?

I think it would be much better if malloc(0) just returned 1 or -1 or something constant. If the programmer needs the allocation to have a unique address, they can call malloc(1) instead.

hansvm•2h ago
It's occasionally useful to want multiple allocations of size 0, each one with a valid address -- generic containers parsing something as some sort of sequence object and you want all code interacting with it to do something valid. I'd be hard-pressed to see where you'd need those to be unique though. Basically any integer should be fine.
tptacek•2h ago
I get the complexity of the standards issue here, but if you cared about this, wouldn't you just wrap malloc with something trivial that provided the semantic you wanted to depend on (NULL or some sentinel pointer)?
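
A minimal sketch of what such wrappers might look like, one per policy. Both function names are hypothetical:

```c
#include <stdlib.h>

/* Policy A: a zero-size request always yields NULL. */
void *malloc_zero_is_null(size_t size)
{
    return size == 0 ? NULL : malloc(size);
}

/* Policy B: a zero-size request yields a unique pointer;
   NULL then only ever means allocation failure. */
void *malloc_zero_is_unique(size_t size)
{
    return malloc(size == 0 ? 1 : size);
}
```

Results from either wrapper can be passed to the ordinary `free()`, since `free(NULL)` is a no-op.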
a-dub•2h ago
would be interesting to see if there's a difference in how the 0-page is handled in systems under this condition...