Making a StringBuffer in C, and questioning my sanity

https://briandouglas.ie/string-buffer-c/

27•coneonthefloor•3d ago

Comments

ranger_danger•3h ago

You might be interested in https://github.com/antirez/sds

fsckboy•1h ago

neat, i like it, has some of the same ideas i've used in my string packages

but i did see a place to shave a byte in the sds data struct. The null terminator is a wasted field, that byte (or int) should be used to store the amount of free space left in the buffer (as a proxy for strlen). When there is no space left in the buffer, the free space value will be.... a very convenient 0 heheh

hey, OP said he wants to be a better C programmer!

ranger_danger•1h ago

> The null terminator is a wasted field

I think that would break its "Compatible with normal C string functions" feature.

fsckboy•1h ago

nooooo you don't understand. when the buffer is not full, the string will be zero terminated "in buffer" (which is how it works as is anyway). when the buffer is full, the "free count" at the end will do double duty, both as a zero count and a zero terminator
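For readers following along, here is a minimal sketch of the idea described above, not code from sds or from the article. The names `tiny_str`, `tiny_init`, `tiny_len`, `tiny_append` and the capacity `CAP` are all hypothetical. The last byte of the buffer stores the number of free bytes; when the buffer is full that count is 0, so the same byte also terminates the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity string: CAP text bytes plus one trailing byte.
 * The trailing byte holds the free count (CAP - length). */
enum { CAP = 15 };

typedef struct {
    char data[CAP + 1]; /* data[CAP] is the free-count byte */
} tiny_str;

static void tiny_init(tiny_str *s) {
    assert(s != NULL);
    s->data[0] = '\0';        /* empty string, terminated in buffer */
    s->data[CAP] = (char)CAP; /* everything is free */
}

/* Length is derived from the free count, so no scan is needed. */
static size_t tiny_len(const tiny_str *s) {
    return (size_t)CAP - (size_t)(unsigned char)s->data[CAP];
}

/* Appends src; returns 0 on success, -1 if src does not fit. */
static int tiny_append(tiny_str *s, const char *src) {
    size_t len = tiny_len(s);
    size_t n = strlen(src);
    if (n > (size_t)CAP - len) return -1;
    memcpy(s->data + len, src, n);
    len += n;
    s->data[CAP] = (char)(CAP - len);   /* 0 when full: doubles as terminator */
    if (len < CAP) s->data[len] = '\0'; /* otherwise terminate in buffer */
    return 0;
}
```

In this sketch `strlen(s.data)` and `tiny_len(&s)` agree as long as every modification goes through `tiny_append`. Anything that writes into `data` directly leaves the count byte stale.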

ranger_danger•1h ago

But "normal C string functions" don't know about the "free count" byte, right? So it wouldn't be updated... unless I'm misunderstanding something.

fsckboy•40m ago

normal c string functions don't know about any of this package's improvements, I'm not sure you understand what the package does.

    +--------+-------------------------------+-----------+
    | Header | Binary safe C alike string... | Null term |
    +--------+-------------------------------+-----------+
             |
             `-> Pointer returned to the user.

his trick is to create a struct with fields in the header for extra information about the string, and then a string buffer also in the struct. but on instantiation, instead of returning the address of the struct/header, he returns the address of the string, so it could be passed to strlen and return the right answer, or open and open the right file, all compatible-like.

but if you call "methods" on the package, they know that there is a header with struct fields below the string buffer and it can obtain those, and update them if need be.

He doesn't document that in more detail in the initial part of the spec/readme, but an obvious thing to add in the header would be a strlen, so you'd know where to append without counting through the string. But without doing something like that, there is no reason to have a header. Normal string functions can "handle" these strings, but they can't update the header information. I'm just extending that concept to the byte at the end also.
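A rough sketch of the header-before-the-pointer layout described above, in case it helps. This is a simplified illustration and not the actual sds code: the names `hstr_header`, `hstr_new`, `hstr_len`, `hstr_free` and the two header fields are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; not the real sds layout. */
typedef struct {
    size_t len; /* bytes in use, excluding the terminator */
    size_t cap; /* bytes available for text, excluding the terminator */
    char buf[]; /* C99 flexible array member: the text lives here */
} hstr_header;

/* Step back from the user's pointer to the header in front of it. */
static hstr_header *hstr_hdr(char *s) {
    return (hstr_header *)(void *)(s - offsetof(hstr_header, buf));
}

/* Allocates header + text in one block and returns a pointer to the text. */
static char *hstr_new(const char *init) {
    assert(init != NULL);
    size_t len = strlen(init);
    hstr_header *h = malloc(sizeof *h + len + 1);
    if (h == NULL) return NULL;
    h->len = len;
    h->cap = len;
    memcpy(h->buf, init, len + 1); /* copies the terminator too */
    return h->buf;                 /* usable as a plain char * */
}

/* O(1) length read from the header instead of scanning for '\0'. */
static size_t hstr_len(char *s) { return hstr_hdr(s)->len; }

/* Frees the whole block, which starts at the header, not at s. */
static void hstr_free(char *s) {
    if (s != NULL) free(hstr_hdr(s));
}
```

The returned pointer can go straight to `strlen`, `printf("%s")` or `fopen`, while the `hstr_*` functions reach behind it for the bookkeeping. Passing `s` itself to `free` would be a bug, since the allocation begins at the header.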

this type of thing falls into what the soulless ginger freaks call UB and want to eliminate.

(soulless ginger freaks? a combination of "rust colored" and https://www.youtube.com/watch?v=EY39fkmqKBM )

ranger_danger•23m ago

> instead of returning the address of the struct

Yes I'm pretty sure I understand this part.

> an obvious thing to add in the header would be a strlen

The length is already in the header from what I can tell: https://github.com/antirez/sds/blob/master/sds.h#L64

But my point was that if something like your "free count" byte existed at the end, I would think it couldn't be relied upon, because functions such as s*printf that might truncate don't know about that field, and you don't want later "methods" to rely on a field that hasn't been updated and then run off the end.

And from what I can tell from the link above, there isn't actually a "free count" defined anywhere in the struct, the buffer appears to be at the end of the struct, with no extra fields after it.

Maybe I'm misunderstanding something?

improgrammer007•3h ago

I would rather focus on solving the main problem than reinvent the wheel. Just use C++ if perf is critical which gives you all these things for free. In this day and age the reasons for using C as your main language should be almost zero.

o11c•3h ago

Hm, this implementation seems allergic to passing types by value, which eliminates half of the allocations. It also makes the mistake of being mutable-first, and provides some fundamentally-inefficient operations.

The main mistake this has in common with most string implementations is that it only provides a single type, rather than a series of mostly-compatible types that can be used generically in common contexts, but which differ in ways that sometimes matter. Ownership, lifetime, representation, etc.

remexre•1h ago

How would you recommend doing that sort of "subtyping"? _Generic and macros?

o11c•54m ago

Yup. It's a lot saner in C++, but people who refuse to use C++ for political reasons can do it the ugly way using C11 or GNU C.
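To make the C11 route concrete, here is a small sketch of what `_Generic` dispatch over several string types could look like. The types `str_buf` and `str_view` and the `str_len` macro are hypothetical names invented for this example, not anything from the article or from a particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical family of string types that differ in ownership. */
typedef struct { char *data; size_t len; size_t cap; } str_buf; /* owning, mutable */
typedef struct { const char *data; size_t len; } str_view;      /* borrowed, read-only */

static size_t str_buf_len(const str_buf *s)   { return s->len; }
static size_t str_view_len(const str_view *s) { return s->len; }
static size_t cstr_len(const char *s)         { return strlen(s); }

/* One generic name; the compiler picks the function from the argument type. */
#define str_len(s) _Generic((s),    \
    str_buf *:        str_buf_len,  \
    const str_buf *:  str_buf_len,  \
    str_view *:       str_view_len, \
    const str_view *: str_view_len, \
    char *:           cstr_len,     \
    const char *:     cstr_len)(s)

/* Usage: the same call works on all three representations. */
static void str_len_demo(void) {
    char storage[8] = "abc";
    str_buf b = { storage, 3, sizeof storage - 1 };
    str_view v = { "hello", 5 };
    const char *c = "hi";
    assert(str_len(&b) == 3);
    assert(str_len(&v) == 5);
    assert(str_len(c) == 2);
}
```

Every supported type has to be listed in every generic macro, which is the "ugly" part: adding a new string type means touching each association list.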

improgrammer007•42m ago

They even downvote people who suggest C++ :-). Doing this in C is such a colossal waste of time and energy, not to mention the bugs it'll introduce. Sigh!

zahlman•23m ago

Trolling about the choice of implementation language from a throwaway account is worth downvotes, yes. Doing a given task in a given language, simply for the sake of having it done in that language, is a legitimate endeavour, and having someone document (from personal experience) why it's difficult in that language is real content worth discussion. Choosing a better language is very much not a goal here.

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

amelius•1h ago

I wonder how an LLM would rate this code.

zahlman•28m ago

> It also makes the mistake of being mutable-first

Is mutability not part of the point of having a string buffer? Wouldn't the corresponding immutable type just be a string?

WalterBright•7m ago

    new_capacity *= 2;

A better choice is to grow the size by a factor of 1.5:

https://stackoverflow.com/questions/1100311/what-is-the-idea...
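As an illustration of the suggestion, a 1.5x growth step can be written with integer arithmetic only. This is a sketch under assumptions: the function name `grow_capacity`, the minimum capacity of 8, and the overflow fallback are choices made for the example, not taken from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the next capacity: roughly 1.5x the current one, but never less
 * than `needed`. Falls back to `needed` if 1.5x would overflow size_t. */
static size_t grow_capacity(size_t current, size_t needed) {
    size_t half = current / 2;
    if (current > SIZE_MAX - half) return needed; /* current + half overflows */
    size_t grown = current + half;                /* 1.5x without floats */
    if (grown < 8) grown = 8;                     /* hypothetical minimum */
    return grown >= needed ? grown : needed;
}

/* Usage: a few growth steps. */
static void grow_capacity_demo(void) {
    assert(grow_capacity(0, 1) == 8);      /* starts at the minimum */
    assert(grow_capacity(16, 17) == 24);   /* 16 + 8 */
    assert(grow_capacity(16, 100) == 100); /* a large request wins */
}
```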

gblargg•7m ago

It's odd how it has error reporting in some areas (alloc, split can return NULL if allocation fails), but not others (append, prepend have a void return type but might require allocation internally).
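One way to make append report failure, sketched with a hypothetical `strbuf` type (the article's actual struct and function names may differ), is to return a `bool` and leave the buffer untouched when allocation fails:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type, assumed for this sketch. */
typedef struct {
    char *data;
    size_t len; /* bytes in use, excluding the terminator */
    size_t cap; /* allocated bytes, including room for the terminator */
} strbuf;

/* Appends src. Returns true on success. On allocation failure or size
 * overflow it returns false and leaves the buffer unchanged.
 * src must not point into sb->data. */
static bool strbuf_append(strbuf *sb, const char *src) {
    assert(sb != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > SIZE_MAX - sb->len - 1) return false; /* len + n + 1 would overflow */
    size_t required = sb->len + n + 1;
    if (required > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap : 16;
        while (new_cap < required) {
            if (new_cap > SIZE_MAX / 2) return false; /* doubling would overflow */
            new_cap *= 2;
        }
        char *p = realloc(sb->data, new_cap);
        if (p == NULL) return false; /* old block is still valid */
        sb->data = p;
        sb->cap = new_cap;
    }
    memcpy(sb->data + sb->len, src, n + 1); /* includes the terminator */
    sb->len += n;
    return true;
}
```

Callers can then write `if (!strbuf_append(&sb, text)) { /* handle out of memory */ }` instead of finding out later through a crash.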
