Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion. The fact that you only save a binary true/false value tells the person looking at the code a ton about what the program currently is meant to do.
If you decided to make your boolean a timestamp, and now realize you need a field with 3 states, now what?
If you'd kept your boolean, you could convert the field from BOOL to TINYINT without changing any data. [0, 1] becomes [0, 1, 2] easily.
The angle I'd approach it from is this: recording whether an email is verified as a boolean is actually misguided - that is, the intent is wrong.
The actual things of interest are the email entity and the verification event. If you record both, 'is_verified' is trivial to derive.
However, consider if you now must implement the rule that "emails are verified only if a verification took place within the last 6 months." Recording verifications as events handles this trivially, whilst this doesn't work with booleans.
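To make the event-based approach concrete, here is a minimal Python sketch. The `Email` class, the `VERIFICATION_WINDOW` constant and the 182-day approximation of "6 months" are illustrative assumptions, not anything from an actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumed policy window: roughly "the last 6 months".
VERIFICATION_WINDOW = timedelta(days=182)


@dataclass
class Email:
    address: str
    # Each verification is kept as an event (here, just its timestamp).
    verifications: list[datetime] = field(default_factory=list)

    def record_verification(self, when: datetime) -> None:
        self.verifications.append(when)

    def is_verified(self, now: datetime) -> bool:
        # Derived on demand instead of stored: true if any verification
        # event falls inside the policy window.
        return any(
            timedelta(0) <= now - when <= VERIFICATION_WINDOW
            for when in self.verifications
        )
```

With only a stored boolean, the same rule would need a schema change, because the time of the verification was never recorded.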
Some other examples - what is the rate of verifications per unit of time? How many verification emails do we have to send out?
Flipping a boolean when the first of these events occurs without storing the event itself works in special cases, but not in general. Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields (imagine storing say 7 or 8 different kinds of events linked to some model).
Or, your assumption about the intent is wrong. Many (most?) times, the intent is precisely whether an email is verified. That's all. And that's OK if that's all the project needs.
> Storing a boolean is overly rigid, throws away the underlying information of interest, and overloads the model with unrelated fields
Also, storing a boolean can most accurately reflect intent, avoid hoarding unnecessary and unneeded information, and maximize the model's conceptual clarity.
> Making a boolean a datetime, just in case you ever want to use the data, is not the kind of pattern that makes your code clearer in my opinion.
I don't follow at all, if your field is named as when a thing happened (`_at` suffix) then that seems very clear. Also, even if you never expose this via UI it can be a godsend for debugging "Oh, it was updated on XXXX-XX-XX, that's when we had Y bug or that's why Z service was having an issue".
Parcel carrier shipment transaction:
ReturnServiceRequested: True/False
I can think of many more of these that are options of some transaction that should be stored and naturally are represented as boolean.
Allowing the presence of a dateTime (UserVerificationDate for example) to have a meaning in addition to its raw value seems safe and clean. But over time in any system these double meanings pile up and lose their context.
Having two fields (i.e. UserHasVerified, UserVerificationDate) doesn't waste THAT much more space, and leaves no room for interpretation.
What happens when they get out of sync?
The better databases can be given a key to force the two fields to match. Most programming languages can be written in such a way that there's no way to separate the two fields and represent the broken states I show above.
However the end result of doing that ends up isomorphic to simply having the UserVerificationDate also indicate verification. You just spent more effort to get there. You were probably better off with a comment indicating that "NULL" means not verified.
In a perfect world I would say it's obvious that NULL means not verified. In the real world I live in I encounter random NULLs that do not have a clear intentionality behind them in my databases all the time. Still, some comments about this (or other documentation) would do the trick, and the system should still tend to evolve towards this field being used correctly once it gets wired in to the first couple of uses.
The author's example, checking whether a datetime is null to decide if the user is authorized, is not clear.
What if there are other fields associated with the login session, like login location? Now you don't know exactly which field to check.
Or if you receive NULL in the datetime field, is it because the user has not logged in, or because there was a problem retrieving the datetime?
This is just micro-optimization for no good reason
Yes you do - you have a helper method that encapsulates the details.
In the DB you could also make a view or generated column.
> This is just micro-optimization for no good reason
It’s conceptually simpler to have a representation with fewer states, and bugs are hopefully impossible. For example what would it mean for the bool authorized to be false but the authorized date time to be non-null?
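A rough Python sketch of the "fewer states" idea, with a helper that hides the null check. The `User` class and its field names are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class User:
    name: str
    # None means "never authorized"; a value means "authorized at that time".
    authorized_at: Optional[datetime] = None

    @property
    def is_authorized(self) -> bool:
        # Derived from the timestamp, so the flag and the date cannot disagree.
        return self.authorized_at is not None

    def authorize(self, when: datetime) -> None:
        self.authorized_at = when
```

Callers only ever read `user.is_authorized`, so they do not need to know which column backs it, and "authorized is false but the date is set" cannot be represented.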
Or you could just use a boolean with a natural self describing name.
Did you miss the part about contradictory states? Are you going to add some database constraints to your bool instead?
Regarding contradictory states:
Given that just about no DB is in 5th normal form, the possibility of contradictory states exists in almost every RDBMS, regardless of booleans. It seems like an argument that doesn't really have any strength to it.
Often it’s intentional for privacy. Record no more data than what’s needed.
I do think it's wise to consider when a boolean could be inferred from some other mechanism, but I also use booleans a lot because they are the best solution for many problems. Sure, sometimes what is now a boolean may need to become something else later, like an enum, and that's fine too. But I would not suggest jumping to those out of the gate.
Booleans are good toggles and representatives of 2 states like on/off, public/private. But sometimes an association, or datetime, or field presence can give you more data and said data is more useful to know than a separate attribute.
However, personally I agree with the advice, in another context: Function return types, and if-statements.
Often, some critical major situation or direction is communicated with returned booleans. They will indicate something like 'did-optimizer-pass-succeed-or-run-to-completion-or-finish', stuff like that. And this will determine how the program proceeds next (retry, abort, continue, etc.)
A problem arises when multiple developers (maybe yourself, in 3 months) need to communicate about and understand this correctly.
Sometimes that returned value means 'function-was-successful'. Sometimes it means 'true if there were problems/issues' (you arrive at this perspective when the function is named 'checkForProblems'/verify/sanitycheck()).
Another source of confusion is when multiple functions are available to plug in or call next, and people assume they all agree on "true is OK, false is problems" or vice versa.
A third and maybe most important variant is when 'the return value doesn't quite mean what you thought'. - 'I thought it meant "a map has been allocated".' - but it means 'a map exists' (but has not necessarily been allocated, if it was pre-existing).
All this can be attacked with two-value enums: NO_CONVERSION_FAILED=0, YES_CONVERSION_WAS_SUCCESSFUL=1. (And yes, I see the peril in putting 0 and 1 there, but any value will be dangerous.)
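Roughly what a two-value enum return looks like in Python. `ConversionResult`, `convert` and `next_step` are hypothetical names, and the members use string values here to sidestep the 0/1 question.

```python
from enum import Enum


class ConversionResult(Enum):
    FAILED = "failed"
    SUCCEEDED = "succeeded"


def convert(text: str) -> ConversionResult:
    """Hypothetical conversion step: succeeds when text parses as an integer."""
    try:
        int(text)
    except ValueError:
        return ConversionResult.FAILED
    return ConversionResult.SUCCEEDED


def next_step(result: ConversionResult) -> str:
    # The call site names the outcome instead of guessing what True means.
    if result is ConversionResult.SUCCEEDED:
        return "continue"
    return "retry"
```

The comparison `result is ConversionResult.SUCCEEDED` reads the same way regardless of whether the function was named like `convert()` or like `checkForProblems()`.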
A Boolean is a special, universal case of an enum (or whatever you prefer to call these choice types...) that is semantically valid for many uses.
I'm also an enum fanboy, and agree with the article's examples. Its conclusion, that you shouldn't use booleans because enums are more appropriate in some cases, is wrong.
Some cases are good uses of booleans. If you find a Boolean isn't semantically clear, or you need a third variant, then move to an enum.
serialize(someObject, true, false, nil, true)
What do those extra arguments do? Who knows, it's impossible to tell without looking at the function definition.
Basically, what had happened was that the developer had written a function ("serialize()", in this example) and then later discovered that they wanted slightly different behaviour in some cases (maybe pretty printed or something). Since Lua allows you to change the arity of a function without changing call-sites (missing arguments are just nil), they had just added a flag as an argument. And then another flag. And then another.
I now believe very strongly that you should virtually never have a boolean as an argument to a function. There are exceptions, but not many.
serialize(someObject, prettyPrint:true)
NB I have no idea whether Lua has keyword arguments but if your language does then that would seem to address your particular issue?
serialize(someObject, { prettyPrint = true })
And indeed that is a big improvement (and commonly done), but it doesn't solve all problems. Say you have X flags, then there's 2^X different configurations you have to check and test and so forth. In reality, all 2^X configurations will not be used, only a tiny fraction will be. In addition, some configurations will simply not be legal (i.e. if flag A is true, then flag B must be as well), and then you have a "make illegal states unrepresentable" situation.
If the tiny fraction is small enough, just write different functions for it ("serialize()" and "prettyPrint()"). If it's not feasible to do it, have a good long think about the API design and if you can refactor it nicely. If the number of combinations is enormous, something like the "builder pattern" is probably a good idea.
It's a hard problem to solve, because there's all sorts of programming principles in tension here ("don't repeat yourself", "make illegal states unrepresentable", "feature flags are bad") and in your way of solving a practical problem. It's interesting to study how popular libraries do this. libcurl is a good example, which has a GAZILLION options for how to do a request, and you do it "statefully" by setting options [1]. libcairo for drawing vector graphics is another interesting example, where you really do have a combinatorial explosion of different shapes, strokes, caps, paths and fills [2]. They also do it statefully.
The best way in many languages for flags is using unsigned integers that are bitwise-ORed together.
In pseudocode:
Object someObject;
foo (someObject, Object.Flag1 | Object.Flag2 | Object.Flag3);
Whatever language you are using, it probably has some namespaced way to define flags as `(1 << 0)` and `(1 << 1)` etc.
options = new SerializeOptions();
options.PrettyPrint = true;
options.Flag2 = "red"
options.Flag3 = 27;
serialize(someObject, options)
Why is that the "best" way?
Why waste a whole byte on a bool that has one bit of data, when you can pack the equivalent of eight bools into the same space as an uint8_t for free?
I've done exactly what you propose on different projects but I would never call it the "best" method, merely one that conserves memory but with typical trade-offs like all solutions.
"Best way" is often contextual and subjective. In this context (boolean flags to a function), this way is short, readable and scoped, even in C which doesn't even have scoped namespaces.
Maybe there are better ways, and maybe you have a different "best way", but then someone can legitimately ask you about your "best way": `Why is that the "best" way?`
The only objective truth that one can say about a particular way to do something is "This is not the worst way".
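For reference, the bitwise-OR flag style discussed above might look like this in Python using the standard `enum.Flag`. The `SerializeFlag` members and the `serialize` wrapper around `json.dumps` are assumptions made up for the example.

```python
import json
from enum import Flag


class SerializeFlag(Flag):
    PRETTY_PRINT = 1 << 0
    SORT_KEYS = 1 << 1
    ASCII_ONLY = 1 << 2


def serialize(obj, flags: SerializeFlag = SerializeFlag(0)) -> str:
    # Each flag is tested by membership; SerializeFlag(0) means "no flags set".
    return json.dumps(
        obj,
        indent=2 if SerializeFlag.PRETTY_PRINT in flags else None,
        sort_keys=SerializeFlag.SORT_KEYS in flags,
        ensure_ascii=SerializeFlag.ASCII_ONLY in flags,
    )


# The call site names every option it turns on.
text = serialize({"b": 1, "a": 2}, SerializeFlag.PRETTY_PRINT | SerializeFlag.SORT_KEYS)
```

Whether this beats an options object is the contextual part: flags stay compact and scoped, but they can only carry on/off choices, not values like `"red"` or `27`.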
And you can get the same problem with any argument type. What do the arguments in
copy(objectA, objectB, "")
mean?
In general, you're going to need some kind of way to communicate the purpose - named parameters, IDE autocomplete, whatever - and once you have that then booleans are not worse than any other type.
You could of course store the boolean in a variable and have the variable name speak for its meaning but at that point might as well just use an enum and do it proper.
For things like strings you either have a variable name - ideally a well describing one - or a string literal which still contains much more information than simply a true or false.
debug_mode=True
some_func(..., debug_mode)
Basically, if you have a function that takes a boolean in your API, just have two functions instead with descriptive names.
Yeah right like I’m going to expand this function that takes 10 booleans into 1024 functions. I’m sticking with it. /s
Tons of well-written functions have many more potential code paths than that. And they're easy to reason about because the parameters don't interact much.
Just think of plotting libraries with a ton of optional parameters for showing/hiding axes, ticks, labels, gridlines, legend, etc.
engage_turbo_encabulator(True, False, True, False, True, False, True, False)
and:
engage_turbo_encabulator(
    enable_hydrocoptic=True,
    allow_girdlespring=False,
    activate_marzelvanes=True,
    sync_trunnions=False,
    stabilize_panametric=True,
    lock_detractors=False,
    invert_logarithms=True,
    suppress_lunar_wane=False
)
The latter is how you should use such a function if you can't change it (and if your language allows it).
If this was my function I would probably make the parameters attributes of a TurboEncabulator class and add some setter methods that can be chained, e.g. Rust-style:
encabulator = (
    TurboEncabulator.new()
    .enable_hydrocoptic(True)
    .allow_girdlespring(False)
    .enable_marzelvane_activation(True)
    .enable_trunnion_syncing(False)
    .enable_param_stabilization(True)
    .enable_detractor_locking(False)
    .enable_logarithm_inversion(True)
    .enable_lunar_wane_suppression(False)
    .build()
)
copy_from_to_by_key(objectA, objectB, "name")
Or, much better, you use named parameters, if your language supports them:
copy_value(
    source=objectA,
    target=objectB,
    key="name"
)
Or you could make it part of the object by declaring a method that could be used like this:
objectB.set_value_from(objectA, key="name")
Really? That sounds unjustified outside of some specific context. As a general rule I just can't see it.
I don't see what's fundamentally wrong with it. What's the alternative? Multiple static functions with different names corresponding to the flags and code duplication, plus switch statements to select the right function?
Or maybe you're making some other point?
I personally believe very strongly that people shouldn’t use programming languages lacking basic functionalities.
Enums are better because you can carve out precisely the state space you want and no more.
I believe IDEs had the feature of showing me the function header with a mouse hover 20+ years ago.
https://elixirschool.com/en/lessons/basics/functions#functio...
JSON.stringify(val, null, 2);
(So yes, but it goes beyond booleans. All optional parameters should be named parameters.)
Many user databases use soft-deletes where fields can change or be deleted, so users' actions can be logged, investigated or rolled back.
When a user changes their e-mail (or adds another one), we add a row, and "verifiedAt" is now null. The user verifies the new e-mail, so the time is recorded in the "verifiedAt" field.
Now, we have many e-mails for the same user with valid "verifiedAt" fields. Which one is the current one? We need another boolean for that (isCurrent). Selecting the last one doesn't make sense all the time, because we might have primary and backup mails, and the oldest one might be the primary one.
If we want to support multiple valid e-mails for a single account, we might need another boolean field "isPrimary". So it makes two additional booleans. isCurrent, isPrimary.
I can merge it into a nice bit field or a comma separated value list, but it defeats the purpose and wanders into code-golf territory.
Booleans are nice. Love them, and don't kick them around because they're small, and sometimes round.
And for is_current, I still think a nullable timestamp could be useful there instead of a boolean. You might have a policy to delete old email addresses after they've been inactive for a certain amount of time, for example. But I'll admit that a boolean is fine there too, if you really don't care when the user removed an email from the current list. (Depending on usage patterns, you might even want to move inactive email addresses to a different table, if you expect them to accumulate over time.)
I think booleans are special in a weird way: if you think more about what you're using it for, you can almost always find a different way to store it that gives you richer information, if you need it.
The ever growing set of boolean flags seems to be an attractor state for database schemas. Unless you take steps to avoid/prohibit it, people will reach for a single boolean flag for their project/task. Fortunately it's pretty easy to explain why it's bad with a counting argument. e.g. There are this many states with booleans, and this fraction are valid vs. this many with the enum and this fraction are valid. There is no verification, so a misunderstanding is more likely to produce an invalid state than a valid state.
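The counting argument is easy to demonstrate. A small Python sketch with three made-up order flags (`paid`, `shipped`, `delivered`) and an assumed validity rule:

```python
from enum import Enum
from itertools import product


def is_valid(paid: bool, shipped: bool, delivered: bool) -> bool:
    # Assumed business rule: shipping requires payment, delivery requires shipping.
    return (paid or not shipped) and (shipped or not delivered)


# Three independent booleans can represent 2**3 combinations.
all_states = list(product([False, True], repeat=3))
valid_states = [state for state in all_states if is_valid(*state)]


class OrderStatus(Enum):
    # The enum lists exactly the valid states and nothing else.
    NEW = "new"
    PAID = "paid"
    SHIPPED = "shipped"
    DELIVERED = "delivered"


print(f"booleans: {len(valid_states)} of {len(all_states)} states are valid")
print(f"enum: {len(OrderStatus)} of {len(OrderStatus)} states are valid")
```

With these rules half of the boolean combinations are invalid, and every additional flag doubles the total while the valid set usually grows much more slowly.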
How about using Booleans for binary things? Is the LED on or off, is the button pressed or not, is the microcontroller pin low or high? Using Enums, etc. to represent those values in the embedded world would be a monumental waste of memory, where a single bit would normally suffice.
In C++ you can use enums in bit-fields, not sure what the case is in C.
I'm with you by the way, but you can often think of a way to use enums instead (not saying you should).
enum Bool
{
    True,
    False,
    FileNotFound
};
https://thedailywtf.com/articles/What_Is_Truth_0x3f_
edit: The 24th of October will be the 20th anniversary of that post.
And usually you use operations to isolate the bit from a status byte or word, which is how it's also stored and accessed in registers anyway.
So it's still not a boolean type, despite expressing boolean things.
Enums also help keep the state machine clear. {Init, on, off, error} capture a larger part of the program behavior in a clear format than 2-3 binary flags, despite describing the same function. Every new boolean flag is a two state composite state machine hiding edgecases.
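A minimal sketch of that {Init, on, off, error} machine in Python (on a microcontroller this would be C, but the shape is the same). The `ALLOWED` transition table is an assumption for illustration, not taken from any real device.

```python
from enum import Enum, auto


class LedState(Enum):
    INIT = auto()
    ON = auto()
    OFF = auto()
    ERROR = auto()


# Assumed transition table: every legal edge is written down in one place.
ALLOWED = {
    LedState.INIT: {LedState.OFF, LedState.ERROR},
    LedState.OFF: {LedState.ON, LedState.ERROR},
    LedState.ON: {LedState.OFF, LedState.ERROR},
    LedState.ERROR: {LedState.INIT},
}


def transition(current: LedState, target: LedState) -> LedState:
    # Reject edges that are not in the table instead of silently flipping flags.
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

With separate `initialized`, `on` and `error` booleans the same table would be implicit, spread across whichever code happens to set each flag.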
I'd prefer if they just added std::bitvector.
If embedded projects start using C standards from the past quarter century, they can join in on type discourse.
Sure, we could store the data by logging a start timestamp and a stop timestamp, but our data is stored on a time series basis (i.e. in a time series DB, where the timestamp is already the primary key for each record). When you are viewing the trend (such as on a control room screen) you get a nice square-wave type effect, so you can easily see when the state changes.
This also makes things like total run time easy to compute: just sum the flag value over 1 second increments to get the number of seconds in a shift the conveyor was running for.
Sure, in my example you could just store something like motor current in Amps (and we do) and use this to infer the conveyor state, but hopefully I've illustrated why an on/off flag is cleaner.
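The run-time calculation really is that simple. A tiny Python sketch, assuming one sample per second and made-up data (the function names are hypothetical too):

```python
def run_seconds(samples: list[int]) -> int:
    # One sample per second, so summing the 0/1 flag gives seconds running.
    return sum(samples)


def state_changes(samples: list[int]) -> int:
    # Count the edges of the square wave: places where consecutive samples differ.
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


# Hypothetical eight seconds of conveyor data: 1 = running, 0 = stopped.
conveyor_running = [0, 1, 1, 1, 0, 0, 1, 1]
print(run_seconds(conveyor_running))    # 5
print(state_changes(conveyor_running))  # 3
```

Doing the same from start/stop timestamps means pairing events and handling intervals that cross the shift boundary.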
That boolean should probably be something else - https://news.ycombinator.com/item?id=44423995 - June 2025 (1 comment, but it's solid)
Nullable helps a lot here but not all languages support that the same way.
> But, you're throwing away data: when the confirmation happened. You can instead store when the user confirmed their email in a nullable column. You can still get the same information by checking whether the column is null. But you also get richer data for other purposes.
So the Boolean should be something else + NULL?
Now we have another problem ...
It's a surprisingly useful piece of data to have.
So, keep the Boolean, and use a log.
That's a terrible database design.
You can easily search through history. The point is, it is better to do this in the design of the database than in the design of the schema.
So: "No?" -> "Yes!"
If you're using a type system that is so poor that it won't easily detect statically places where you're not correctly handling the absent values, you do have a much bigger problem than using bool.
Booleans beget more booleans. Once you have one or two argument flags, they tend to proliferate, as programmers try to cram more and more modalities into the same function signature. The set of possible inputs grows with 2^N, but usually not all of them are valid combinations. This is a source of bugs. Again, enums / sum-types solve this because you can make the cardinality of the input space precisely equal to the number of valid inputs.
Oddly, almost no one has tried providing actual state machines where you have to prove you've figured out what the state transitions are.
@Nullable Optional<Boolean> foo;
For when 3 values for a boolean just aren't enough.
Here are two rules I learned from data modelling and APIs many years ago:
1. If you don't do arithmetic on it, it's not a number. ID int columns and foreign keys are excluded from this. But a phone number or an SSN or an employee ID (that is visible to people) should never be a number; and
2. It's almost never a boolean. It's almost always an enum.
Enums are just better. You can't accidentally pass a strong enum into the wrong parameter. Enums can be extended. There's nothing more depressing than seeing:
do_stuff(id, true, true, false, true, false, true);
This goes for returning success from a function too.
To be (somewhat facetiously) fair, that's just JSON. The key can be not-present, present but null, or it can have a value. I usually use nested Options for that, not nulls, but it's still annoying to represent.
In Rust I could also do
enum JsonValue<T> {
    Missing,
    Null,
    Present(T),
}
But then I'd end up reinventing Option semantics, and would need to do a bunch of conversions when interacting with other stuff.
If you know you actually care about the event, there are probably more fields to stuff into an event record, and then maybe you could save the event record's id instead?
But going too far in this direction based on speculation about what information you might need later is going to complicate the schema.
See Paul's comment in the other thread for more: https://news.ycombinator.com/item?id=44423995
Turning boolean database values into timestamps is a weird hack that wastes space. Why do you want to record when an email was verified, but not when any other fields that happen to be strings or numbers or blobs were changed? Either implement proper event logging or not, but don't do some weird hack where only booleans get fake-logged but nothing else does.
Should booleans turn into enums when a third mutually-exclusive state gets added? Yes, of course, so go refactor, easy. But don't start with an enum before you need it. The same way we don't start with floats rather than ints "just in case" we need fractional values later on.
Booleans are a cornerstone of programming and logic. They're great. I don't know where this "booleans are bad" idea came from, but it's the opposite of communicating intention clearly in code. That boolean should probably stay a boolean unless there's an actual reason to change it.
Although it always depends on what exactly you're really doing.
KISS, YAGNI, and then actually analyze your requirements to understand what the mature schema looks like. A boolean is the simplest thing that can possibly work, usually. Do that first and see how your requirements evolve, then build the database schema that reflects your actual requirements.
I think a lot of people misunderstand KISS, believing everything should be primitives or surface-level simplicity. Instead, I interpret "simple" not as something like golang's surface-level readability, but as infosec's "principle of least privilege". Pick the option that minimizes possible state and captures the requirement logic, rather than primitives just because they're "simple" or "familiar".
Even then, sometimes it's fine to violate it. In this case, a (nullable) date time might be preferable to a boolean for future-proofing purposes. It's trivial to optimize space by mapping a date time to a boolean, while it's a total pain to migrate from a boolean to a date time.
Also, doesn't "... a weird hack that wastes space" contradict "Avoid premature optimization"?
Depending on the complexity of the system and its user requirements, hard-coding roles as an enum could be anywhere from a good to a bad idea. It would be a terrible thing if user-defined roles were a requirement, because an enum can't model a dynamic set of ad-hoc, user-defined groups. Carefully and defensively planning for the evolution of requirements without over-optimizing, over-engineering, or adding too much extra code is part of the balance that must be struck. It could be a good thing if it were a very simple site that just needed to ship ASAP.