Instead, what is missing is an automatic deallocator, one that's automatically called when the variable goes out of scope. For example:
{
vec(int) v={}; /* no extra room allocated */
vec_push(v,1); /* space allocated */
... use v ...
} /* here the space is deallocated, then v is released */
This example doesn't use the same definition of vec as TFA, but something more similar to 'span' by the same author.

It is also not clear how you get tab completion with code generation. But you could get tab completion here too; somebody just has to add this to the tab-completion logic.
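The scoped deallocation sketched in the example above can actually be had in GNU C today via the cleanup attribute. A minimal sketch, assuming a hypothetical vec_int type and push helper; `__attribute__((cleanup(...)))` is a GCC/Clang extension, not ISO C:

```c
#include <stdlib.h>

typedef struct { size_t len, cap; int *data; } vec_int;

/* Handler invoked automatically when an annotated variable
   leaves scope (GCC/Clang extension). */
static void vec_int_drop(vec_int *v)
{
	free(v->data);
	v->data = NULL;
}

#define scoped_vec_int __attribute__((cleanup(vec_int_drop))) vec_int

static void vec_int_push(vec_int *v, int x)
{
	if (v->len == v->cap) {
		v->cap = v->cap ? 2 * v->cap : 4;
		v->data = realloc(v->data, v->cap * sizeof *v->data);
		if (!v->data)
			abort();
	}
	v->data[v->len++] = x;
}

int sum_first_n(int n)
{
	scoped_vec_int v = { 0 };	/* no extra room allocated */
	for (int i = 1; i <= n; i++)
		vec_int_push(&v, i);	/* space allocated */
	int s = 0;
	for (size_t i = 0; i < v.len; i++)
		s += v.data[i];
	return s;			/* v.data freed here, automatically */
}
```

The release happens at every exit from the scope, including early returns, which is exactly the property the manual call to a deallocator lacks.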
I think this is the wrong decision (for a generic array library).
Is Martin claiming that realloc is "often" maintaining an O(1) growable array for us?
That's what the analogous types in C++ or Rust, or indeed Java, Go, C# etc. provide.
I then mention that for other use cases, you can maintain a capacity field only in the part of the code where you need this.
Whether this is the right design for everybody, I do not know, but so far it is what I prefer for myself.
In all my current code where I have tried it, it makes no noticeable difference, and I am not a fan of premature optimization. But then, I could always switch to the alternative API.
One valid reason might be that I can't rely on realloc not being poor, but then I would rather use my own special allocation function. Other valid reasons would be to have very precise control or certain guarantees, but then I would prefer a different interface. In any case, I do not think that this logic belongs in my vector. But it is also possible that I will change my mind on this...
However, one possible issue is if someone pushes and pops repeatedly just at the boundary where f increases in value. To address that you would have to use more advanced techniques, and, I think, "cheat" by inspecting internal structures of the allocator.
Edit: malloc_usable_size could be used for this purpose I think.
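A sketch of how that could look, assuming glibc's malloc_usable_size (not portable) and a hypothetical vec_int: after each realloc, the capacity is taken from the allocator's actual block size, so repeated push/pop at a size-class boundary stays within one block instead of bouncing between two allocations:

```c
#include <malloc.h>	/* malloc_usable_size: glibc-specific */
#include <stdlib.h>

typedef struct { size_t len, cap; int *data; } vec_int;

static void vec_int_push(vec_int *v, int x)
{
	if (v->len == v->cap) {
		/* Ask for one more element... */
		v->data = realloc(v->data, (v->len + 1) * sizeof *v->data);
		if (!v->data)
			abort();
		/* ...but claim the whole block the allocator handed back. */
		v->cap = malloc_usable_size(v->data) / sizeof *v->data;
	}
	v->data[v->len++] = x;
}
```

Whether the amortized cost is acceptable then depends entirely on the allocator's size classes providing geometric slack.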
I try to do this here (this code is not tested and may not be up-to-date): https://github.com/uecker/noplate/blob/main/src/vec.h#L30
The issue with the boundary is what I meant with hysteresis in the article.
E.g. in production code this

if (!vec_ptr) // memory out
    abort();
for (int i = 0; i < 10; i++)
    vec_push(int, &vec_ptr, i);

should really be

if (!vec_ptr) // memory out
    abort();
for (int i = 0; i < 10; i++)
    if (!vec_push(int, &vec_ptr, i))
        abort();

but it doesn't really roll off the tongue.

For those afraid of C++: you don't have to use all of it at once, and compilers have been great for the last few decades. You can easily port C code to C++ (often you don't have to do anything at all). Just try it out and reassess the objections you have.
Now, Resource Acquisition Is Initialization is correct, but the corollary is not generally true, which is to say, my variable going out of scope does not generally mean I want to de-acquire that resource.
So, sooner or later, everything gets wrapped in a reference-counting smart pointer. And reference counting always seemed to me to be a primitive or last-resort memory management strategy.
But it does! When an object goes out of scope, nobody can/shall use it anymore, so of course it should release its (remaining) resources. If you want to hold on to the object, you need to revisit its lifetime and ownership, but that's independent from RAII.
In fact, we are at the moment ripping out some template code in a C code base which has some C++ for CUDA in it, and this one file with C++ templates almost doubles the compilation time of the complete project (with ~700 source files). IMHO it is grotesque how bad it is.
That said, generic programming in C isn't that bad, just very annoying.
To me the best approach is to write the code for a concrete type (like Vec_int), make sure everything is working, and then do the following:
A macro Vec(T) sets up the struct. It can then be wrapped in a typedef like typedef Vec(int) Vec_i;
For each function, like vec_append(...), copy the body into a macro VEC_APPEND(...).
Then for each relevant type T: copy paste all the function declarations, then do a manual find/replace to give them some suffix and fill in the body with a call to the macro (to avoid any issues with expressions being executed multiple times in a macro body).
Is it annoying? Definitely. Is it unmanageable? Not really. Some people don't even bother with this last bit and just use the macros to inline the code everywhere.
Some macros can delegate to void*-based helpers to minimize the bloat.
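The scheme described above might look like this; a minimal sketch with hypothetical names (Vec, VEC_APPEND, vec_i_append):

```c
#include <stdlib.h>

/* Generic struct layout, instantiated per element type. */
#define Vec(T) struct { size_t len, cap; T *data; }

/* The function body lives in a macro; only the typed wrappers
   below ever call it, so arguments are evaluated exactly once. */
#define VEC_APPEND(v, x) do { \
	if ((v)->len == (v)->cap) { \
		(v)->cap = (v)->cap ? 2 * (v)->cap : 4; \
		(v)->data = realloc((v)->data, (v)->cap * sizeof *(v)->data); \
		if (!(v)->data) \
			abort(); \
	} \
	(v)->data[(v)->len++] = (x); \
} while (0)

typedef Vec(int) Vec_i;

/* One short hand-written wrapper per instantiated type. */
void vec_i_append(Vec_i *v, int x)
{
	VEC_APPEND(v, x);
}
```

Because v and x are plain function parameters inside the wrapper, the usual multiple-evaluation hazard of statement macros never reaches the caller.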
EDIT: I almost dread to suggest this but CMake's configure_file command works great to implement generic files...
The first is to put this in an include file:
#define type_argument int
#include <vector.h>
Then inside vector.h the code looks like regular C code, except where you insert the argument: foo_ ## type_argument (...)
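A minimal self-contained sketch of this pattern (the file split is omitted and the names are hypothetical). Note that pasting needs one level of macro indirection so that type_argument is expanded before ## is applied:

```c
#include <stddef.h>

/* Indirection so type_argument is expanded before pasting. */
#define PASTE_(a, b) a ## b
#define PASTE(a, b) PASTE_(a, b)
#define NAME(stem) PASTE(PASTE(stem, _), type_argument)

/* ---- what would live in vector.h, inlined here ---- */
#define type_argument int

typedef struct { size_t len; type_argument *data; } NAME(vec);

static type_argument NAME(vec_sum)(const NAME(vec) *v)
{
	type_argument s = 0;
	for (size_t i = 0; i < v->len; i++)
		s += v->data[i];
	return s;
}

#undef type_argument
/* ---- end of "vector.h"; including it again with a different
   type_argument would generate vec_long, vec_sum_long, ... ---- */
```

After the include, the concrete names vec_int and vec_sum_int are ordinary identifiers, which also helps debuggers and tab completion.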
The other is to write generic code using void pointers or container_of as regular functions, and only have one-line macros as type-safe wrappers around it. The optimizer will be able to specialize it, and it avoids compile-time explosion of code during monomorphization.

I do not think that templates are less annoying in practice. My experience with templates is rather poor.
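A sketch of that second approach, with hypothetical names: one untyped implementation working on raw bytes, plus one-line macros that restore type safety at the call site:

```c
#include <stdlib.h>
#include <string.h>

/* The single non-generic implementation, working on raw bytes. */
typedef struct { size_t len, cap, elem; void *data; } vec;

static void vec_push_raw(vec *v, const void *x)
{
	if (v->len == v->cap) {
		v->cap = v->cap ? 2 * v->cap : 4;
		v->data = realloc(v->data, v->cap * v->elem);
		if (!v->data)
			abort();
	}
	memcpy((char *)v->data + v->len++ * v->elem, x, v->elem);
}

/* Thin type-safe wrappers: the compound literal (T){ (x) } checks
   that x converts to T, and after inlining the optimizer can
   specialize the memcpy for the fixed element size. */
#define VEC_INIT(T)		{ 0, 0, sizeof(T), NULL }
#define vec_push(T, v, x)	vec_push_raw((v), &(T){ (x) })
#define vec_at(T, v, i)		(((T *)(v)->data)[i])
```

Only one copy of vec_push_raw ever exists in the binary, regardless of how many element types are used.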
Sometimes the best option is an external script to instantiate a template file.
BTW, here is some generic code in C using a variadic type. I think this is quite nice. https://godbolt.org/z/jxz6Y6f9x
Running a program for metaprogramming is always a possibility, and I would agree that it is sometimes the best solution.
T ## _foo (T foo, ...)
is that much different from <T>::foo (T foo, ...)
Same for: foo (Object * a)
vs: foo (void * a)
#include <vector.h(int32_t)>
#include <vector.h(int64_t)>
The written `vector.h(type_argument)` file could just be a regular C header or an m4 file which has `type_argument` in its template. When requesting `vector.h(int32_t)` the FUSE filesystem would effectively give the output of calling `gcc -E` or `m4` on the template file as the content of the file being requested.

E.g., if `vector.h(type_argument)` were an m4 file containing:
`#ifndef VECTOR_'type_argument`_INCLUDED'
`#define VECTOR_'type_argument`_INCLUDED'
typedef struct `vector_'type_argument {
size_t length;
type_argument values[];
} `vector_'type_argument;
...
#endif
Then `m4 -D type_argument=int32_t vector.h(type_argument)` gives the output:

#ifndef VECTOR_int32_t_INCLUDED
#define VECTOR_int32_t_INCLUDED
typedef struct vector_int32_t {
size_t length;
int32_t values[];
} vector_int32_t;
...
#endif
But the idea is to make it transparent so that existing tools just see the pre-processed file and don't need to call `m4` manually. We would need to mount each include directory that uses this approach using said filesystem. This shouldn't require changing a project's structure as we could use the existing `include/` or `src/` directory as input when mounting, and just pick some new directory name such as `cfuse/include` or `cfuse/src`, and mount a new directory `cfuse` in the project's root directory. The change we'd need to make is in any Makefiles or other parts of the build, where instead of `gcc -Iinclude` we'd have `gcc -Icfuse/include`. Any non-templated headers in `include/` would just appear as live copies in cfuse/include/, so in theory this could work without causing anything to break.

Its only real issue is that people will constantly tell you how bad it is and how their language of choice is so much better. But if you look at how things work out in practice, you can usually do things very nicely in C.
Perfect hashing that you’d ideally use two different approaches for depending on whether the platform has a cheap popcount (hi AArch32), but to avoid complicating the build you give up and emulate popcount instead. Hundreds of thousands of lines of asynchronous I/O code written in a manual continuation-passing style, with random, occasionally problematic blocking synchronization sprinkled all over because the programmer simply could not be bothered anymore to untangle this nested loop, and with a dynamic allocation for each async frame because that’s the path of least resistance. The intense awkwardness of the state-machine / regular-expression code generators, well-developed as they are. Hoping the compiler will merge the `int` and `long` code paths when their machine representations are identical, but not seeing it happen because functions must have unique addresses. Resorting to .init_array—and slowing down startup—because the linker is too rigid to compute this one known-constant value. And yes, polymorphic datastructures.
I don’t really see anybody do noticeably better than C; I think only Zig and Odin (perhaps also Hare and Virgil?) are even competing in the same category. But I can’t help feeling that things could be much better. Then I look at the graveyard of attempted extensions both special-purpose (CPC[1]) and general (Xoc[2]) and despair.
Many examples I see where people argue for metaprogramming features are not at all convincing to me. For example, there was recently a discussion about Zig comptime. https://news.ycombinator.com/item?id=44208060 This is the Zig example: https://godbolt.org/z/1dacacfzc Here is the C code: https://godbolt.org/z/Wxo4vaohb
Or there was a recent example where someone wanted to give an example for C++ coroutines and showed pre-order tree traversal (which I can't find at the moment), but the C code using vec(node) IMHO was better: https://godbolt.org/z/sjbT453dM compared to the C++ coroutine version: https://godbolt.org/z/fnGzszf3j (from https://news.ycombinator.com/item?id=43831628 here). Edited to add source.
That's also a major reason why you'd use C rather than C++. The C++ ABI is terrible for language interoperability. It's common for C++ libraries to wrap their API in C so that it can be used from other languages' FFIs.
Aside from that another reason we prefer C to C++ is because we don't want vtables. I think there's room for a `C+` language, by which I mean C+templates and not C+classes - perhaps with an ABI which is a subset of the C++ ABI but superset of the C ABI.
Indeed, I have spoken to a lot of my colleagues about just that. If overloading is not allowed, perhaps there is still some hope for a backwards-compatible ABI?
We might be able to make this ABI compatible with C if no templates are used, which wouldn't cause breaking changes - but for other compilers to be able to use templates they would need to opt-in to the new scheme. For that we'd probably want to augment libffi to include completely new functions for dealing with templates. Eg, we'd have an ffi_template_type, and an ffi_prep_template for which we supply its type arguments - then an ffi_prep_templated_cif for calls which use templates, and so forth. It would basically be a new API - but probably still more practical than trying to support the C++ ABI.
Another issue is that if we compile some library with templates and expose them in the ABI, we need some way to instantiate the template with new types which were not present when the library was compiled. There's no trivial solution to this. We'd really need to JIT-compile the templates.
Could you please elaborate on _why_ you think this is needed?
His code is here: https://github.com/drh/cii
A neat project that was posted here a while back uses it: https://x.com/kotsoft/status/1792295331582869891
struct Vec {
void *data;
size_t len;
size_t cap;
size_t sizeof_ty;
};
I then use a macro to define a new type:

struct IntVec {
	struct Vec inner;
	int ty[0];
};

Using the zero-sized field I can do typeof(*ty) to get some type safety back.

All of the methods are implemented on the base Vec type and have a small wrapper which casts/asserts the type of the things you are trying to push.