That boolean should probably be something else

https://ntietz.com/blog/that-boolean-should-probably-be-something-else/

89•vidyesh•19h ago

Comments

ck45•19h ago

One argument that I’m missing in the article is that with an enumerated type, states are mutually exclusive, while with several booleans, there could be some limbo state of several bool columns with value true, e.g. is_guest and is_admin, which is an invalid state.
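
A minimal sketch of that point in Python, assuming a hypothetical `Role` enum and hypothetical `FlagUser` / `EnumUser` classes (none of these names come from the article): the boolean version can be put into the limbo state, while the enum version can only hold one role at a time.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Hypothetical roles; exactly one applies at a time."""
    GUEST = "guest"
    MEMBER = "member"
    ADMIN = "admin"


@dataclass
class FlagUser:
    """Boolean version: nothing stops both flags from being true."""
    is_guest: bool = False
    is_admin: bool = False


@dataclass
class EnumUser:
    """Enum version: a user holds one role, so the limbo state cannot be built."""
    role: Role = Role.GUEST


limbo = FlagUser(is_guest=True, is_admin=True)  # constructs fine, means nothing
user = EnumUser(role=Role.ADMIN)
```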

cjs_ac•18h ago

In that case, you set the enumeration up to use separate bit flags for each boolean, e.g., is_guest is the least significant bit, is_admin is the second least significant bit, etc. Of course, then you've still got a bunch of booleans that you need to test individually, but at least they're in the same column.

cratermoon•18h ago

look up the typestate pattern.

Fraterkes•18h ago

I’m not a very experienced programmer, but the first example immediately strikes me as weird. The consideration for choosing types is often to communicate intent to others (and your future self). I think that’s also why code is often broken up into functions, even if the logic does not need to be modular / repeatable: the function signature kind of “summarizes” that bit of code.

Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion. The fact that you only save a binary true/false value tells the person looking at the code a ton about what the program currently is meant to do.

bluGill•18h ago

In the case of a database you often can't fix mistakes so overdesign just in case makes sense. Many have been burned.

hahn-kev•16h ago

See always having a synthetic primary key

crazygringo•7h ago

Probably even more have been burned by overdesign.

If you decided to make your boolean a timestamp, and now realize you need a field with 3 states, now what?

If you'd kept your boolean, you could convert the field from BOOL to TINYINT without changing any data. [0, 1] becomes [0, 1, 2] easily.

jandrewrogers•6h ago

While I agree on the over-design point, it doesn't follow that the BOOL is trivially convertible to a TINYINT. In some databases a BOOL is stored as a single bit.

turboponyy•17h ago

I actually completely agree with both the article and your point that your code should directly communicate your intent.

The angle I'd approach it from is this: recording whether an email is verified as a boolean is actually misguided - that is, the intent is wrong.

The actual things of interest are the email entity and the verification event. If you record both, 'is_verified' is trivial to derive.

However, consider if you now must implement the rule that "emails are verified only if a verification took place within the last 6 months." Recording verifications as events handles this trivially, whilst this doesn't work with booleans.

Some other examples - what is the rate of verifications per unit of time? How many verification emails do we have to send out?

Flipping a boolean when the first of these events occurs without storing the event itself works in special cases, but not in general. Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields (imagine storing say 7 or 8 different kinds of events linked to some model).
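
A rough sketch of deriving the flag from stored events, in Python. The function name `is_verified`, the 182-day stand-in for "6 months", and the sample dates are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed rule from the comment above: a verification counts for about 6 months.
VERIFICATION_LIFETIME = timedelta(days=182)


def is_verified(verification_events: list[datetime], now: datetime) -> bool:
    """Derive the boolean from stored events instead of storing it."""
    return any(
        timedelta(0) <= now - event <= VERIFICATION_LIFETIME
        for event in verification_events
    )


now = datetime(2025, 8, 25)
events = [datetime(2024, 1, 10), datetime(2025, 6, 1)]
print(is_verified(events, now))      # True: the June 2025 event is recent enough
print(is_verified(events[:1], now))  # False: the only event is over a year old
```

Changing the policy later means changing one constant, since the raw events are still there.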

crazygringo•7h ago

> that is, the intent is wrong. The actual things of interest are the email entity and the verification event.

Or, your assumption about the intent is wrong. Many (most?) times, the intent is precisely whether an email is verified. That's all. And that's OK if that's all the project needs.

> Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields

Also, storing a boolean can most accurately reflect intent, avoid hoarding unnecessary and unneeded information, and maximize the model's conceptual clarity.

joshstrange•17h ago

Normally you'd name the field `created_at`, `updated_at`, or similar which I think makes it very clear.

> Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion.

I don't follow at all, if your field is named as when a thing happened (`_at` suffix) then that seems very clear. Also, even if you never expose this via UI it can be a godsend for debugging "Oh, it was updated on XXXX-XX-XX, that's when we had Y bug or that's why Z service was having an issue".

burnt-resistor•2h ago

The author doesn't consider the nuanced trade-offs of where, what kind, and how much data to persist, compute, and where to store it, whether in the database or modeled in code. It's a quite superficial article bordering on meaninglessness that doesn't expound on the considerations of thoughtful engineering for the stakeholders: swe maintenance, operations, and business/user needs. It should lead into asking questions rather than present "the" answer.

taylodl•18h ago

What I'm getting out of this is boolean shouldn't be a state that's durably stored, it's ephemeral, an artifact of runtime processing. You wouldn't likely durably store a boolean in an OLTP store, but your ETL into the OLAP store may capture a boolean to simplify logic for all the systems using the OLAP store to drive decision support. That is, it's an optimization. That feels right, but I've never really thought through this before. Interesting!

jbreckmckye•18h ago

This makes intuitive sense because booleans are obviously reductive, as reductive as it gets (ideally stored in 1 bit), but for processing and analysis there's typically no reason to store data so sparingly

taylodl•18h ago

For processing and analysis, you're centralizing the compute of complex analysis and storing the result so downstream decision support systems can use the result as a criterion in their analysis - and not have to distribute, and maintain, that logic throughout the set of applications. A contrived example: is_valued_customer. This is a simple boolean, but its computation can be involved and you wouldn't want to have to replicate and maintain this logic throughout all the applications. But at the same time, it likely has no business being in the OLTP store.

jbreckmckye•18h ago

You might persist that value as an optimisation, but if you make it your source of truth, and discard your inputs, you better make sure you never ever ever ever have a bug in deriveValuedCustomer() or else you have lost data permanently

taylodl•18h ago

Good point - you wouldn't want to discard your inputs. You're going to need them should you ever redefine deriveValuedCustomer() - which is likely for a system that will be in production for 10-20 years or more.

RaftPeople•8h ago

> You wouldn't likely durably store a boolean in an OLTP store

Parcel carrier shipment transaction:

ReturnServiceRequested: True/False

I can think of many more of these that are options of some transaction that should be stored and naturally are represented as boolean.

jbreckmckye•18h ago

To summarise: booleans should be derived, not stored

chikinpotpi•18h ago

I generally prefer to let one value mean one thing.

Allowing the presence of a dateTime (UserVerificationDate for example) to have a meaning in addition to its raw value seems safe and clean. But over time in any system these double meanings pile up and lose their context.

Having two fields (i.e. UserHasVerified, UserVerificationDate) doesn't waste THAT much more space, and leaves no room for interpretation.

cratermoon•18h ago

> Having two fields (i.e. UserHasVerified, UserVerificationDate)

What happens when they get out of sync?

jerf•18h ago

But it does leave room for "UserHasVerified = false, UserVerificationDate = 2025/08/25" and "UserHasVerified = true, UserVerificationDate = NULL".

The better databases can be given a key to force the two fields to match. Most programming languages can be written in such a way that there's no way to separate the two fields and represent the broken states I show above.

However the end result of doing that ends up isomorphic to simply having the UserVerificationDate also indicate verification. You just spent more effort to get there. You were probably better off with a comment indicating that "NULL" means not verified.

In a perfect world I would say it's obvious that NULL means not verified. In the real world I live in I encounter random NULLs that do not have a clear intentionality behind them in my databases all the time. Still, some comments about this (or other documentation) would do the trick, and the system should still tend to evolve towards this field being used correctly once it gets wired in to the first couple of uses.
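
One way to picture the "no way to separate the two fields" version in application code, as a hedged Python sketch. The `User` class and its field names are hypothetical; the only stored value is the nullable timestamp, and the boolean is computed from it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class User:
    # None means "not verified"; this comment is the documentation asked for above.
    verified_at: Optional[datetime] = None

    @property
    def is_verified(self) -> bool:
        """Derived, so it can never disagree with verified_at."""
        return self.verified_at is not None


pending = User()
verified = User(verified_at=datetime(2025, 8, 25))
```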

mrheosuper•18h ago

I don't like this pattern.

The author's example, checking whether "Datetime is null" to decide if the user is authorized, is not clear.

What if there are other fields associated with the login session, like login location? Now you don't know exactly what field to check.

Or if you receive NULL in the Datetime field, is it because the user has not logged in, or because there was a problem retrieving the Datetime?

This is just micro-optimization for no good reason

monkeyelite•17h ago

> Now you don't know exactly what field to check.

Yes you do - you have a helper method that encapsulates the details.

In the DB you could also make a view or generated column.

> This is just micro-optimization for no good reason

It’s conceptually simpler to have a representation with fewer states, and bugs are hopefully impossible. For example what would it mean for the bool authorized to be false but the authorized date time to be non-null?
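
The view idea can be tried with Python's built-in `sqlite3` module. This is only a sketch: the `users` table, the `authorized_at` column and the `user_status` view are made-up names, and other databases may spell the same thing differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        authorized_at TEXT  -- NULL means "not authorized"
    );
    -- The view derives the boolean, so there is no second column to drift.
    CREATE VIEW user_status AS
        SELECT id, authorized_at IS NOT NULL AS is_authorized
        FROM users;
    INSERT INTO users (id, authorized_at) VALUES (1, '2025-08-25T12:00:00');
    INSERT INTO users (id, authorized_at) VALUES (2, NULL);
    """
)
rows = conn.execute(
    "SELECT id, is_authorized FROM user_status ORDER BY id"
).fetchall()
print(rows)  # [(1, 1), (2, 0)]
```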

RaftPeople•8h ago

> In the DB you could also make a view or generated column.

Or you could just use a boolean with a natural self describing name.

monkeyelite•6h ago

My proposal is to use a null date.

Did you miss the part about contradictory states? Are you going to add some database constraints to your bool instead?

RaftPeople•4h ago

If you need to store a value that has two states, use a boolean, don't overcomplicate it unless there is real value in creating the complication (which there is, sometimes).

Regarding contradictory states:

Given that just about no DB is in 5th normal form, the possibility of contradictory states exists in almost every RDBMS, regardless of booleans. It seems like an argument that doesn't really have any strength to it.

coin•18h ago

> But, you're throwing away data

Often it’s intentional for privacy. Record no more data than what’s needed.

usernamed7•18h ago

replace "should" with "could".

I do think it's wise to consider when a boolean could be inferred from some other mechanism, but I also use booleans a lot because they are the best solution for many problems. Sure, sometimes what is now a boolean may need to become something later like an enum, and that's fine too. But I would not suggest jumping to those out of the gate.

Booleans are good toggles and representatives of 2 states like on/off, public/private. But sometimes an association, or datetime, or field presence can give you more data and said data is more useful to know than a separate attribute.

fifticon•18h ago

The scope of TFA is data modelling, where it advises to use more descriptive data values, such as enums or happenedAtTimestamp.

However, personally I agree with the advice, in another context: Function return types, and if-statements.

Often, some critical major situation or direction is communicated with returned booleans. They will indicate something like 'did-optimizer-pass-succeed-or-run-to-completion-or-finish', stuff like that. And this will determine how the program proceeds next (retry, abort, continue, etc.)

A problem arises when multiple developers (maybe yourself, in 3 months) need to communicate about and understand this correctly.

Sometimes, that returned value will mean 'function-was-successful'. Sometimes it means 'true if there were problems/issues' (the way to this perspective, is when the function is 'checkForProblems'/verify/sanitycheck() ).

Another way to make confusion with this, is when multiple functions are available to plug in or proceed to call - and people assume they all agree on "true is OK, false is problems" or vice versa.

A third and maybe most important variant, is when 'the return value doesn't quite mean what you thought'. - 'I thought it meant "a map has been allocated".' - but it means 'a map exists' (but has not necessarily been allocated, if it was pre-existing).

All this can be attacked with two-value enums, NO_CONVERSION_FAILED=0, YES_CONVERSION_WAS_SUCCESSFUL=1 . (and yes, I see the peril in putting 0 and 1 there, but any value will be dangerous..)
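
A small Python sketch of the two-value enum as a return type. `ConversionResult` and the toy `convert` function are hypothetical; string values are used here only to sidestep the 0/1 peril mentioned above.

```python
from enum import Enum


class ConversionResult(Enum):
    """Hypothetical two-value result; the names carry the meaning."""
    FAILED = "failed"
    SUCCEEDED = "succeeded"


def convert(text: str) -> ConversionResult:
    """Toy conversion: succeeds only if text parses as an integer."""
    try:
        int(text)
    except ValueError:
        return ConversionResult.FAILED
    return ConversionResult.SUCCEEDED


# The call site names the outcome, so nobody has to remember what True meant.
if convert("42") is ConversionResult.SUCCEEDED:
    print("converted")
```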

1718627440•16h ago

That's why you have coding style guides and documentation. Both choices are "correct", you just need to be consistent.

the__alchemist•18h ago

I read an article with the same premise here a few years ago.

A Boolean is a special, universal case of an enum (or whatever you prefer to call these choice types...) that is semantically valid for many uses.

I'm also an enum fanboy, and agree with the article's examples. Its conclusion of not using booleans because enums are more appropriate in some cases is wrong.

Some cases are good uses of booleans. If you find a Boolean isn't semantically clear, or you need a third variant, then move to an enum.

fenesiistvan•17h ago

I was hoping to read about bitfields or bit flags.

OskarS•17h ago

A piece of advice I read somewhere early in my career was "a boolean should almost never be an argument to a function". I didn't understand what the problem was at the time, but then years later I started at a company with a large Lua code-base (mostly written by one-two developers) and there were many lines of code that looked like this:

   serialize(someObject, true, false, nil, true)

What do those extra arguments do? Who knows, it's impossible to tell without looking at the function definition.

Basically, what had happened was that the developer had written a function ("serialize()", in this example) and then later discovered that they wanted slightly different behaviour in some cases (maybe pretty printed or something). Since Lua allows you to change arity of a function without changing call-sites (missing arguments are just nil), they had just added a flag as an argument. And then another flag. And then another.

I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many.

arethuza•17h ago

If you use keyword arguments then something like that doesn't look too bad:

serialize(someObject, prettyPrint:true)

NB I have no idea whether Lua has keyword arguments but if your language does then that would seem to address your particular issue?
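
For a language that does have keyword arguments, here is a hedged Python sketch. The `serialize` function and its two flags are invented for illustration; the bare `*` in the signature makes every flag keyword-only, so a call like `serialize(obj, True, False)` is rejected outright.

```python
import json


def serialize(obj, *, pretty_print: bool = False, sort_keys: bool = False) -> str:
    """Hypothetical serializer; the bare * makes both flags keyword-only."""
    indent = 2 if pretty_print else None
    return json.dumps(obj, indent=indent, sort_keys=sort_keys)


print(serialize({"b": 1, "a": 2}, sort_keys=True))  # {"a": 2, "b": 1}

try:
    serialize({"a": 1}, True)  # positional flag is rejected
except TypeError as exc:
    print("rejected:", exc)
```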

OskarS•16h ago

Lua doesn't directly support keyword arguments, but you can simulate it using tables:

    serialize(someObject, { prettyPrint = true })

And indeed that is a big improvement (and commonly done), but it doesn't solve all problems. Say you have X flags, then there's 2^X different configurations you have to check and test and so forth. In reality, all 2^X configurations will not be used, only a tiny fraction will be. In addition, some configurations will simply not be legal (i.e. if flag A is true, then flag B must be as well), and then you have a "make illegal states unrepresentable" situation.

If the tiny fraction is small enough, just write different functions for it ("serialize()" and "prettyPrint()"). If it's not feasible to do it, have a good long think about the API design and if you can refactor it nicely. If the number of combinations is enormous, something like the "builder pattern" is probably a good idea.

It's a hard problem to solve, because there's all sorts of programming principles in tension here ("don't repeat yourself", "make illegal states unrepresentable", "feature flags are bad") and in your way of solving a practical problem. It's interesting to study how popular libraries do this. libcurl is a good example, which has a GAZILLION options for how to do a request, and you do it "statefully" by setting options [1]. libcairo for drawing vector graphics is another interesting example, where you really do have a combinatorial explosion of different shapes, strokes, caps, paths and fills [2]. They also do it statefully.

[1]: https://curl.se/libcurl/c/curl_easy_setopt.html

[2]: https://cairographics.org/manual/cairo-cairo-t.html

lelanthran•17h ago

It's a failing of many type systems of older languages (except Pascal).

The best way in many languages for flags is using unsigned integers that are bitwise-ORed together.

In pseudocode:

    Object someObject;
    foo (someObject, Object.Flag1 | Object.Flag2 | Object.Flag3);

Whatever language you are using, it probably has some namespaced way to define flags as `(1 << 0)` and `(1 << 1)` etc.
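
As one example of such a namespaced definition, Python's standard library has `enum.IntFlag`. The sketch below uses hypothetical flag names and a hypothetical `describe` helper; it is not tied to any real serializer.

```python
from enum import IntFlag


class SerializeFlag(IntFlag):
    """Hypothetical flags, one bit each."""
    PRETTY_PRINT = 1 << 0
    SORT_KEYS = 1 << 1
    ASCII_ONLY = 1 << 2


def describe(flags: SerializeFlag) -> list[str]:
    """Return the names of the flags that are set, lowest bit first."""
    return [flag.name for flag in SerializeFlag if flag in flags]


chosen = SerializeFlag.PRETTY_PRINT | SerializeFlag.ASCII_ONLY
print(describe(chosen))  # ['PRETTY_PRINT', 'ASCII_ONLY']
print(int(chosen))       # 5
```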

arethuza•17h ago

If you really need all of that I think I'd go with a separate object holding all of the options:

    options = new SerializeOptions();
    options.PrettyPrint = true;
    options.Flag2 = "red";
    options.Flag3 = 27;
    serialize(someObject, options);

vanviegen•10h ago

So 1 line of C/C++ becomes 5 lines of Java/C#? That sounds about right! :-) Though I'm sure we can get to 30 if we squeeze in an abstract factory or two!

wallstop•2h ago

You can do the above in C#, I haven't written Java in a decade so can't comment on that. I don't really understand your argument though - the options approach is extremely readable. You can also do the options approach in C or C++. The amount of stuff that you can slap into one line is an interesting benchmark to use for languages.

dandersch•15h ago

It's always crazy to see languages like C being able to beat high-level languages at some ergonomics (which is usually their #1 point of pride) just because C has bitfields and they often don't.

RaftPeople•8h ago

> The best way in many languages for flags is using unsigned integers that are bitwise-ORed together.

Why is that the "best" way?

waste_monk•5h ago

It's simple, efficient, and saves space in memory. While not as big a deal these days where most systems have plentiful RAM, it's still useful on things like embedded devices.

Why waste a whole byte on a bool that has one bit of data, when you can pack the equivalent of eight bools into the same space as an uint8_t for free?

RaftPeople•4h ago

Sure, that works when trying to conserve memory to the degree that a few bytes matter, but the downside is that it's more complex, less obvious.

I've done exactly what you propose on different projects but I would never call it the "best" method, merely one that conserves memory but with typical trade-offs like all solutions.

lelanthran•12m ago

> Why is that the "best" way?

"Best way" is often contextual and subjective. In this context (boolean flags to a function), this way is short, readable and scoped, even in C which doesn't even have scoped namespaces.

Maybe there are better ways, and maybe you have a different "best way", but then someone can legitimately ask you about your "best way": `Why is that the "best" way?`

The only objective truth that one can say about a particular way to do something is "This is not the worst way".

account42•17h ago

But this isn't really a boolean problem - even in your example there is another mystery argument: nil

And you can get the same problem with any argument type. What do the arguments in

  copy(objectA, objectB, "")

mean?

In general, you're going to need some kind of way to communicate the purpose - named parameters, IDE autocomplete, whatever - and once you have that then booleans are not worse than any other type.

8-prime•17h ago

True, but I think it's worth noting that inferring what a parameter could be is much easier if it's something other than a boolean.

You could of course store the boolean in a variable and have the variable name speak for its meaning but at that point might as well just use an enum and do it proper.

For things like strings you either have a variable name - ideally a well describing one - or a string literal which still contains much more information than simply a true or false.

nomel•6h ago

If your language doesn't support named arguments, you can always name the value, with the usual mechanism:

    debug_mode=True
    some_func(..., debug_mode)

OskarS•16h ago

You're correct in principle, but I'm saying that "in practice", boolean arguments are usually feature flags that change the behavior of the function in some way instead of being some pure value. And that can be really problematic, not least for testing where you now aren't testing a single function, you're testing a combinatorial explosion's worth of functions with different feature flags.

Basically, if you have a function that takes a boolean in your API, just have two functions instead with descriptive names.

hamburglar•16h ago

> Basically, if you have a function that takes a boolean in your API, just have two functions instead with descriptive names.

Yeah right like I’m going to expand this function that takes 10 booleans into 1024 functions. I’m sticking with it. /s

OrderlyTiamat•16h ago

If your function has a McCabe complexity higher than 1024, then boolean arguments are the least of your problems...

crazygringo•6h ago

Not really.

Tons of well-written functions have many more potential code paths than that. And they're easy to reason about because the parameters don't interact much.

Just think of plotting libraries with a ton of optional parameters for showing/hiding axes, ticks, labels, gridlines, legend, etc.

atoav•1h ago

Yes but this is about the difference between:

  engage_turbo_encabulator(True, False, True, False, True, False, True, False)

and:

  engage_turbo_encabulator(
    enable_hydrocoptic=True,
    allow_girdlespring=False,
    activate_marzelvanes=True,
    sync_trunnions=False,
    stabilize_panametric=True,
    lock_detractors=False,
    invert_logarithms=True,
    suppress_lunar_wane=False
  )

The latter is how you should use such a function if you can't change it (and if your language allows it).

If this was my function I would probably make the parameters attributes of a TurboEncabulator class and add some setter methods that can be chained, e.g. Rust-style:

  encabulator = (
    TurboEncabulator.new()
    .enable_hydrocoptic(True)
    .allow_girdlespring(False)
    .enable_marzelvane_activation(True)
    .enable_trunnion_syncing(False)
    .enable_param_stabilization(True)
    .enable_detractor_locking(False)
    .enable_logarithm_inversion(True)
    .enable_lunar_wane_suppression(False)
    .build()
  )

Viliam1234•10h ago

Hopefully you could refactor it automatically into 1024 functions and then find out that 1009 of them are never called in the project, so you can remove them.

atoav•1h ago

Well, that just means the function might be named wrong?

  copy_from_to_by_key(objectA, objectB, "name")

Or, much better, you use named parameters, if your language supports it:

  copy_value(
    source=objectA,
    target=objectB,
    key="name"
  )

Or you could make it part of object by declaring a method that could be used like this:

  objectB.set_value_from(objectA, key="name")

nutjob2•17h ago

> I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many.

Really? That sounds unjustified outside of some specific context. As a general rule I just can't see it.

I don't see what's fundamentally wrong with it. What's the alternative? Multiple static functions with different names corresponding to the flags and code duplication, plus switch statements to select the right function?

Or maybe you're making some other point?

0x3444ac53•13h ago

I think the answer to this (specific to lua) is passing a table as an argument that gets unpacked.

StopDisinfo910•10h ago

Named arguments are a solution to precisely this issue. With optional arguments with default value, you get to do precisely what was being done in your Lua code but with self documenting code.

I personally believe very strongly that people shouldn’t use programming languages lacking basic functionalities.

_dain_•9h ago

Named arguments don't stop the deeper problem, which is that N booleans have 2^N possible states. As N increases it's rare for all those combinations to be valid. Just figuring out the truth table might be challenging enough, then there's the question of whether the caller or callee is responsible for enforcing it. And either way you have to document and test it.

Enums are better because you can carve out precisely the state space you want and no more.

fluoridation•9h ago

That's not a problem per se. It may very well be that you're configuring the behavior of something with a bunch of totally independent on/off switches. Replacing n booleans with an enum with 2^n values is just as wrong as replacing a 5-valued enum with 3 booleans that cannot be validly set independently.

lukan•6h ago

Or not use them without tooling?

I believe IDEs had the feature of showing me the function header with a mouse hover 20+ years ago.

kevin_thibedeau•6h ago

You can also document the argument name inline for languages with block comments but no named args.

Lyngbakr•9h ago

I don't know if this is where you read it, but this advice is also given in Clean Code.

OskarS•37m ago

I don’t remember exactly where I read this, but I think it was some internet forum of some kind. It makes sense that whoever wrote it got it from there. Never read it myself.

mostlysimilar•9h ago

Something I really love in Elixir is that functions can be named identically and are considered different with different arity.

https://elixirschool.com/en/lessons/basics/functions#functio...

hatthew•8h ago

I'm not sure I understand how this is different from function overloading

Vinnl•8h ago

I die a little inside every time I write:

    JSON.stringify(val, null, 2);

(So yes, but it goes beyond booleans. All optional parameters should be named parameters.)

drdec•6h ago

I'm surprised nobody has suggested this yet. Just use a different name for the function. In your example, the new function should be prettyPrint(). No booleans required. No extra structures required.

bayindirh•17h ago

I'll expand on the first example, the datetime one.

Many user databases use soft-deletes where fields can change or be deleted, so user's actions can be logged, investigated or rolled back.

When user changes their e-mail (or adds another one), we add a row, and "verifiedAt" is now null. User verifies new email, so its time is recorded to the "verifiedAt" field.

Now, we have many e-mails for the same user with valid "verifiedAt" fields. Which one is the current one? We need another boolean for that (isCurrent). Selecting the last one doesn't make sense all the time, because we might have primary and backup mails, and the oldest one might be the primary one.

If we want to support multiple valid e-mails for a single account, we might need another boolean field "isPrimary". So it makes two additional booleans. isCurrent, isPrimary.

I can merge it into a nice bit field or a comma separated value list, but it defeats the purpose and wanders into code-golf territory.

Booleans are nice. Love them, and don't kick them around because they're small, and sometimes round.

kelnos•9h ago

I would say for your specific example, you shouldn't have boolean flags for that in the user_emails table, but instead have a primary_email column in the users table, that has a foreign key reference to the user_emails table. That way you can also ensure that the user always has exactly one primary email.

And for is_current, I still think a nullable timestamp could be useful there instead of a boolean. You might have a policy to delete old email addresses after they've been inactive for a certain amount of time, for example. But I'll admit that a boolean is fine there too, if you really don't care when the user removed an email from the current list. (Depending on usage patterns, you might even want to move inactive email addresses to a different table, if you expect them to accumulate over time.)

I think booleans are special in a weird way: if you think more about what you're using it for, you can almost always find a different way to store it that gives you richer information, if you need it.

alphazard•17h ago

The timestamps instead of boolean thing is something good engineers stumble upon pretty reliably. One gotcha is the database might be weird about indexing nulls. I'm not going to give an example because you should really read the docs for your specific database if this matters.

The ever growing set of boolean flags seems to be an attractor state for database schemas. Unless you take steps to avoid/prohibit it, people will reach for a single boolean flag for their project/task. Fortunately it's pretty easy to explain why it's bad with a counting argument. e.g. There are this many states with booleans, and this fraction are valid vs. this many with the enum and this fraction are valid. There is no verification, so a misunderstanding is more likely to produce an invalid state than a valid state.
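
The counting argument can be made concrete with a few lines of Python. The four flags, the "exactly one may be set" rule and the `Status` enum are assumptions chosen for the example; a real schema will have its own rules.

```python
from enum import Enum
from itertools import product


class Status(Enum):
    """Hypothetical lifecycle; every member is a valid state."""
    DRAFT = "draft"
    PUBLISHED = "published"
    ARCHIVED = "archived"
    DELETED = "deleted"


def is_valid(is_draft: bool, is_published: bool,
             is_archived: bool, is_deleted: bool) -> bool:
    """Assumed rule: exactly one of the four flags may be set."""
    return sum([is_draft, is_published, is_archived, is_deleted]) == 1


combos = list(product([False, True], repeat=4))
valid = [c for c in combos if is_valid(*c)]
print(len(combos), len(valid))  # 16 4
print(len(Status))              # 4
```

Under these assumptions, three quarters of what the boolean columns can express is invalid, while the enum has no invalid values at all.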

pixelfarmer•17h ago

There can be verification for such things.

bsoles•17h ago

This is such weird advice and it seems to come from a particular experience of software development.

How about using Booleans for binary things? Is the LED on or off, is the button pressed or not, is the microcontroller pin low or high? Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice.

leni536•17h ago

> Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice.

In C++ you can use enums in bit-fields, not sure what the case is in C.

jilles•17h ago

* led status: on, off, non-responsive
* button status: idle, pressing, pressed

I'm with you by the way, but you can often think of a way to use enums instead (not saying you should).

nh23423fefe•10h ago

well yes. every boolean is iso to 2, and every 2 can be embedded in 3. and every N can be embedded in N+1

pessimizer•10h ago

  enum Bool 
  { 
      True, 
      False, 
      FileNotFound 
  };

https://thedailywtf.com/articles/What_Is_Truth_0x3f_

edit: The 24th of October will be the 20th anniversary of that post.

padjo•17h ago

I think it’s implicitly in the context of datastore design. In that context it feels like decent advice that would prevent a lot of mess.

kps•17h ago

They're boolean (single bit of information) but not boolean (single bit interpreted as meaning true or false). The LED isn't true or false, the microcontroller pin isn't true or false.

bsoles•17h ago

This is semantic pedantry. The association true/1/high and false/0/low is well-known and understood.

kps•13h ago

Plenty of signals are asserted (true) by being brought low, or have 1=low (e.g. CAN).

marcellus23•17h ago

huh? The LED isn't true or false, but whether the LED is on is true or false.

simondw•10h ago

And whether the LED is off is false or true.

aDyslecticCrow•17h ago

The boolean type is the massive waste, not the enum. A boolean in C is just a full int. So it's definitely not a waste to use an enum, which is also an int.

And usually you use operations to isolate the bit from a status byte or word, which is how it's also stored and accessed in registers anyway.

So it's still no boolean type despite expressing boolean things.

Enums also help keep the state machine clear. {Init, on, off, error} capture a larger part of the program behavior in a clear format than 2-3 binary flags, despite describing the same function. Every new boolean flag is a two state composite state machine hiding edgecases.

glxxyz•6h ago

Not necessarily a waste in all languages. A c++ `std::vector<bool>` efficiently packs bits for example, although it does have its own 'issues'.

aDyslecticCrow•1h ago

I kinda hate that. It gives the vector very special behaviour for one type in particular, going against the intuition behind how both boolean and vector work everywhere else in the language.

I'd prefer if they just added std::bitvector.

devnullbrain•9h ago

Having spent time in the embedded mines, I think the onus is on embedded to vocally differentiate itself from normal software development, not for it to be assumed that general software advice applies to embedded.

If embedded projects start using C standards from the past quarter century, they can join in on type discourse.

bigger_cheese•8h ago

I work at an industrial plant, and we use boolean datatypes for stateful things like this. For example, is the conveyor belt running (1) or stopped (0)?

Sure, we could store the data by logging a start timestamp and a stop timestamp, but our data is stored on a time series basis (i.e. in a timeseries DB, where the timestamp is already the primary key for each record). When you are viewing the trend (such as on a control room screen) you get a nice square-wave type effect, so you can easily see when the state changes.

This also makes things like total run time easy to compute, just sum the flag value over 1 second increments to get number of seconds in a shift the conveyor was running for.

Sure, in my example you could just store something like motor current in amps (and we do) and use this to infer the conveyor state, but hopefully I've illustrated why an on/off flag is cleaner.
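A minimal sketch of that run-time sum in Python, assuming exactly one sample per second. The names `samples`, `run_seconds` and `state_changes` are hypothetical, made up for illustration, and not any particular timeseries DB's API:

```python
# Hypothetical 1 Hz samples of the conveyor flag: 1 = running, 0 = stopped.
samples = [0, 1, 1, 1, 0, 0, 1, 1]


def run_seconds(flags):
    """Total running time in seconds, assuming one sample per second."""
    return sum(flags)


def state_changes(flags):
    """Indices where the flag flips, i.e. the edges of the square wave."""
    return [i for i in range(1, len(flags)) if flags[i] != flags[i - 1]]


print(run_seconds(samples))    # 5
print(state_changes(samples))  # [1, 4, 6]
```

If the sample interval isn't one second, the sum would need scaling by the interval.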

eflim•17h ago

I would add counters to this list. Start from zero (false), and then you know not just whether an event has occurred, but how many times.

arethuza•17h ago

I once, briefly, worked with a developer who believed that you should never use primitive types for fields or parameters...

Duanemclemore•15h ago

APL and its descendants don't have booleans, just 0 and 1 [0]. Which is awesome. It allows for bitmasks, sums / reductions, and even conditionals via Iverson Brackets. [1]

[0] https://aplwiki.com/wiki/Boolean

[1] https://en.m.wikipedia.org/wiki/Iverson_bracket
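The same idea has a rough analogue in Python, where `bool` is a subclass of `int`, so a comparison can be summed or used as a multiplier. This is only a sketch of the Iverson-bracket style, not APL semantics, and the list `xs` is made-up data:

```python
xs = [3, 8, 1, 9, 4]

# Count the elements above 3: each comparison contributes 0 or 1.
count_above = sum(x > 3 for x in xs)

# Conditional sum: multiplying by the 0/1 result keeps or drops each term.
sum_above = sum(x * (x > 3) for x in xs)

print(count_above)  # 3
print(sum_above)    # 21
```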

dang•11h ago

That boolean should probably be something else - https://news.ycombinator.com/item?id=44423995 - June 2025 (1 comment, but it's solid)

zwieback•11h ago

Maybe for the DB domain the author is talking about, but the nice thing about a bool is that it's true or false. I don't have to dig around documentation or look through the code to find the convention for converting an enum, datetime, etc. to true/false. Is it 1970/1/1 (I was four years old then, just sayin), -6000 or something else?

Nullable helps a lot here but not all languages support that the same way.

amelius•10h ago

> A lot of boolean data is representing a temporal event having happened. For example, websites often have you confirm your email. This may be stored as a boolean column, is_confirmed, in the database. It makes a lot of sense.

> But, you're throwing away data: when the confirmation happened. You can instead store when the user confirmed their email in a nullable column. You can still get the same information by checking whether the column is null. But you also get richer data for other purposes.

So the Boolean should be something else + NULL?

Now we have another problem ...

buckle8017•10h ago

It should be a timestamp of the last time the email was verified.

It's a surprisingly useful piece of data to have.

amelius•10h ago

Even more useful is a log of all the changes in the database. This gives you what you want, and it would be automatic for any data you store.

So, keep the Boolean, and use a log.

aydyn•10h ago

No? So you have to look at database history to extract information you think is useful?

That's a terrible database design.

amelius•10h ago

It's the basis behind Datomic, if I'm not mistaken.

You can easily search through history. The point is, it is better to do this in the design of the database than in the design of the schema.

So: "No?" -> "Yes!"

aydyn•10h ago

Okay, but for something like SQL this seems like a bad idea.

afc•10h ago

It should be: std::optional<Timestamp> (or Optional[datetime] or equivalent in other languages)

If you're using a type system that is so poor that it won't easily detect statically places where you're not correctly handling the absent values, you do have a much bigger problem than using bool.
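A minimal sketch of the Optional[datetime] version in Python. The `User` class and its fields are hypothetical names for illustration; the point is that the old boolean becomes a derived property rather than stored state:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class User:
    email: str
    # None means "never confirmed"; a value records when it happened.
    confirmed_at: Optional[datetime] = None

    @property
    def is_confirmed(self) -> bool:
        # The former boolean column, derived from the timestamp.
        return self.confirmed_at is not None


user = User("someone@example.com")
print(user.is_confirmed)  # False

user.confirmed_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(user.is_confirmed)  # True
```

A static checker such as mypy should complain about something like `user.confirmed_at.year` unless the None case is handled first, which is the detection described above.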

_dain_•9h ago

Booleans don't "remember" what they mean. They're just a `true` or a `false`, the association with the `is_authenticated` variable or whatever has to be maintained by programmer discipline. But when you have an enum variant like `Authenticated`, that's encoded in the value itself, helped by the type system. It can't be confused with some other state or condition.

Booleans beget more booleans. Once you have one or two argument flags, they tend to proliferate, as programmers try to cram more and more modalities into the same function signature. The set of possible inputs grows with 2^N, but usually not all of them are valid combinations. This is a source of bugs. Again, enums / sum-types solve this because you can make the cardinality of the input space precisely equal to the number of valid inputs.
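To make the cardinality point concrete, here is a small Python sketch. The `Role` enum and the two flags it replaces (`is_guest`, `is_admin`) are assumptions chosen for illustration:

```python
from enum import Enum
from itertools import product


class Role(Enum):
    GUEST = "guest"
    MEMBER = "member"
    ADMIN = "admin"


# Two flags (is_guest, is_admin) admit 2**2 combinations,
# including the nonsensical guest-and-admin one.
flag_states = list(product([False, True], repeat=2))

print(len(flag_states))             # 4
print((True, True) in flag_states)  # True: the invalid state is representable
print(len(Role))                    # 3: exactly the valid states
```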

baw-bag•9h ago

Please if you are in this situation do not take this advice. You just generate massive garbage abstractions upstream. If boolean arguments are out of hand, the problem isn't the boolean.

nurettin•9h ago

Isn't that the point? If booleans are out of hand, either you are trying to emulate a state machine or you are lacking enums. Or in case of 20 bool parameters, just make it a struct. Nobody will complain.

astrange•9h ago

Everyone's always trying to emulate a state machine - OOP objects are kind of just an unsafe informal state machine implementation.

Oddly, almost no one has tried providing actual state machines where you have to prove you've figured out what the state transitions are.

jmyeet•9h ago

My favorite Java code I've ever seen is:

    @Nullable Optional<Boolean> foo;

For when 3 values for a boolean just aren't enough.

Here are two rules I learned from data modelling and APIs many years ago:

1. If you don't do arithmetic on it, it's not a number. ID int columns and foreign keys are excluded from this. But a phone number or an SSN or an employee ID (that is visible to people) should never be a number; and

2. It's almost never a boolean. It's almost always an enum.

Enums are just better. You can't accidentally pass a strong enum into the wrong parameter. Enums can be extended. There's nothing more depressing than seeing:

    do_stuff(id, true, true, false, true, false, true);

This goes for returning success from a function too.

kelnos•9h ago

> @Nullable Optional<Boolean> foo;

To be (somewhat facetiously) fair, that's just JSON. The key can be not-present, present but null, or it can have a value. I usually use nested Options for that, not nulls, but it's still annoying to represent.

In Rust I could also do

    enum JsonValue<T> {
        Missing,
        Null,
        Present(T),
    }

But then I'd end up reinventing Option semantics, and would need to do a bunch of conversions when interacting with other stuff.

zavec•9h ago

Oh this is fantastic! I'm giving a talk in about a month at work on how to use the python type system in useful ways to catch more bugs before runtime, and this seems like a great point to throw in there as an aside at the very least!

skybrian•8h ago

Changing a boolean database field like 'is_confirmed' to a nullable datetime is a simple, cheap hack that records a little bit of information about an event. It's appropriate when you're not sure you care about the event.

If you know you actually care about the event, there are probably more fields to stuff into an event record, and then maybe you could save the event record's id instead?

But going too far in this direction based on speculation about what information you might need later is going to complicate the schema.

See Paul's comment in the other thread for more: https://news.ycombinator.com/item?id=44423995

crazygringo•8h ago

No. All of this is breaking the primary rule of programming: KISS (keep it simple, stupid). Don't add unnecessary complexity. Avoid premature optimization. Tons of things are correctly booleans and should stay that way.

Turning boolean database values into timestamps is a weird hack that wastes space. Why do you want to record when an email was verified, but not when any other fields that happen to be strings or numbers or blobs were changed? Either implement proper event logging or not, but don't do some weird hack where only booleans get fake-logged but nothing else does.

Should booleans turn into enums when a third mutually-exclusive state gets added? Yes, of course, so go refactor, easy. But don't start with an enum before you need it. The same way we don't start with floats rather than ints "just in case" we need fractional values later on.

Booleans are a cornerstone of programming and logic. They're great. I don't know where this "booleans are bad" idea came from, but it's the opposite of communicating intention clearly in code. That boolean should probably stay a boolean unless there's an actual reason to change it.

breadwinner•7h ago

Disagree. KISS is for bigger things like architecture. Exposing an enum instead of a simple bool is a good idea that will save you time later. The only time to not do this is if you're exposing internal info, i.e., breaking encapsulation.

msgodel•7h ago

Yeah it might be better to think of booleans as "the smallest possible integer type" and use enums (or whatever your language has) to represent more meaningful data.

Although it always depends on what exactly you're really doing.

nostrademons•6h ago

It saves you time until you realize that those status flags are orthogonal. It's very common for a job to be both is_started and is_queued, for example. And a simple is_failed status enum is problematic once you add retries, and can have a failed job enter the queue to be started again.

KISS, YAGNI, and then actually analyze your requirements to understand what the mature schema looks like. A boolean is the simplest thing that can possibly work, usually. Do that first and see how your requirements evolve, then build the database schema that reflects your actual requirements.

glxxyz•6h ago

Yes the advice in TFA was brought to you by the sort of people who never finish anything because they're always wasting time thinking about potential future use cases that will never happen. Make it simple and extensible and make it satisfy today's requirements.

lock1•38m ago

Disagree. Given the current popularity of dynamic languages and the fact that many people don't understand the value of ADTs, the newtype pattern, or C-like enums even in static languages, I'd argue booleans & primitives are way overused.

I think a lot of people misunderstand KISS, believing everything should be primitives or surface-level simplicity. Instead, I interpret "simple" not as something like golang's surface-level readability, but as infosec's "principle of least privilege". Pick the option that minimizes possible state and captures the requirement logic, rather than primitives just because they're "simple" or "familiar".

Even then, sometimes it's fine to violate it. In this case, (nullable) date time might be more preferable than boolean for future-proofing purposes. It's trivial to optimize space by mapping date time to boolean, while it's a total pain to migrate from boolean to date time.

Also, doesn't "... a weird hack that wastes space" contradict "Avoid premature optimization"?

throwaway81523•7h ago

Aka "Boolean blindness", look it up.

burnt-resistor•3h ago

The only universally valid advice is: it depends. There are no hard and fast universal rules except carefully deciding when and when not to break conventions and guidelines.

Depending on the complexity of the system and its user requirements, hard-coding roles as an enum could fall anywhere on the spectrum from a good to a bad idea. It would be a terrible thing if user-defined roles were a requirement, because an enum can't model a dynamic set of ad-hoc, user-defined groups. It could be a good thing if it were a very simple site that just needed to ship ASAP. Careful, defensive planning for the evolution of requirements, without over-optimizing, over-engineering, or adding too much extra code, is part of the balance that must be struck.
