e.g. you might have some big CSV uploaded and a row-count field for it; your app could insert the record without this number, and an async process would fill it in later.
there might even be some corner case where a null value is possible
I believe the solution here isn't to check which fields never use null, but to improve the process of creating such migrations: either create a second ticket for the next release to update the db, or commit the new migration to some kind of "next" branch (a rough sketch of that two-release flow follows).
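A minimal sketch of that two-release flow in Postgres syntax; the `orders.row_count` column is hypothetical:

-- Release N: add the column nullable, so existing code and the async
-- backfill process keep working.
ALTER TABLE orders ADD COLUMN row_count integer;

-- Release N+1 (the follow-up ticket): once the async process has
-- filled in the old rows, tighten the column.
ALTER TABLE orders ALTER COLUMN row_count SET NOT NULL;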
The harm is the same as with any other unreliable linter rule.
Each such rule is almost harmless on its own. But in most places that use that kind of rule, taken together they are extremely harmful.
I envy your team, whose only mistake is to forget setting NULLABLE. Rainbows and unicorns ;)
It's more common in string fields, which in many cases just get rendered on a web form that doesn't differentiate blank (empty string) from the null state. Therefore, in the database we should in most cases set it up as follows:
- Value required: the field is non-NULL + at least length 1 (via a check constraint)
- Value optional: either the field is non-NULL (with the empty string meaning "absent"), or the field is nullable + at least length 1 (so NULL is the only way to say "absent")
I'm curious whether you prefer to store optional strings as non-NULL and rely on the length, or as nullable with a length constraint (both variants are sketched below).
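A minimal sketch of both variants in Postgres syntax; the `users` table and its columns are hypothetical:

CREATE TABLE users (
    -- Required: non-NULL and at least one character.
    display_name text NOT NULL CHECK (length(display_name) >= 1),
    -- Optional, variant 1: non-NULL, the empty string means "absent".
    nickname text NOT NULL DEFAULT '',
    -- Optional, variant 2: nullable but never the empty string,
    -- so NULL is the only way to say "absent".
    bio text CHECK (bio IS NULL OR length(bio) >= 1)
);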
I've had firsthand experience building a consumer for that kind of DB, and it's hell to get anything running correctly without literally writing a dedicated layer to sanity-check everything.
It's not attractive to developers for a variety of reasons, but encapsulation is generally cheaper (in every sense) than the alternatives by orders of magnitude. If a system is hard to alter and other people rely on it to do their jobs, thinking about it as set in stone during my planning process will save me grief and heartache.
Starting with an independent client in mind makes it easier to plan around things like reconstructing tables from composed exports or dealing with authoritatively incorrect data. It's a good collection point for hard-coded edge case handling, and a great location for all of your discovery notes.
Each process can either have predefined fields (which take time to implement) or a key-to-string-value store (yes, they chose the latter).
Either the BPMN logic, or some JS snippets (some, as not all the people involved can code) in BPMN nodes, or some Java hooks (a lot of them, written at those other people's request) edit those values.
So when something's wrong, we swim in a sea of key-values, searching all the code for whatever updates this field, and for why its value is neither JSON nor null nor boolean but the string "Not used, waiting for policy 1234".
After that project, I'm happy every time I can use a DB to store data, more so if it can run locally, and much more so if it's relational.
ALTER TABLE foo WITH NOCHECK ADD CONSTRAINT CheckNotnull CHECK (id IS NOT NULL)
any new values coming in cannot be null, but the rows already in the table with null are fine... then you can update them to non-null values over time
This reminds me of frozen/nonfrozen enums in Swift. You can do exhaustive case analysis on frozen enums, but case analysis on nonfrozen enums requires adding an `@unknown default` case.
https://docs.swift.org/swift-book/documentation/the-swift-pr...
Could you give an example? Null _is_ absence, the way I see it
> Many other useful performance improvements, including the ability to avoid a table rewrite for ALTER TABLE ... ADD COLUMN with a non-null column default
https://www.postgresql.org/docs/release/11.0/
I think there is some restriction there, like the default can't be "volatile". I can't remember the precise definition, but I think current_timestamp would be volatile, while any static value would not be.
Also interesting, `now()` is non-volatile because it's defined as "start of the transaction". So if you add a column with `DEFAULT now()` all rows will get the same value. But `timeofday()` is not volatile, so `DEFAULT timeofday()` is going to lock the table for a long time. A bit of a subtle gotcha.
> But `timeofday()` is not volatile, so `DEFAULT timeofday()` is going to lock the table for a long time.
Perhaps the “not” was a typo?
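For what it's worth, a minimal sketch of the corrected claim on Postgres 11+; the `events` table is hypothetical:

-- now() is stable (evaluated once per transaction), so this is a cheap
-- metadata-only change: every existing row gets the same value.
ALTER TABLE events ADD COLUMN created_at timestamptz DEFAULT now();

-- timeofday() is volatile, so every existing row must be rewritten with
-- its own value, holding an exclusive lock for the duration.
ALTER TABLE events ADD COLUMN noted_at text DEFAULT timeofday();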
There should also be a [the-asteroid-has-hit-y'all-are-so-stupid] value, and global data systems should just pass that around after impact until the power goes out for good.
for example:
create table foo(id int)
insert foo values (1), (2), (3)
insert foo values (null)
select * from foo
id
1
2
3
NULL
ALTER TABLE foo with nocheck ADD CONSTRAINT CheckNotnull check (id IS NOT NULL)
insert foo values (null)
Msg 547, Level 16, State 0, Line 13
The INSERT statement conflicted with the CHECK constraint "CheckNotnull". The conflict occurred in database "tempdb", table "dbo.foo", column 'id'.
The statement has been terminated.
However, be aware that if you update an existing value to NULL, you will still get the error:
update foo set id = null where id = 2
Msg 547, Level 16, State 0, Line 20
The UPDATE statement conflicted with the CHECK constraint "CheckNotnull". The conflict occurred in database "tempdb", table "dbo.foo", column 'id'.
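A possible follow-up in the same T-SQL session, sketching the "update them over time" step; the sentinel value 0 is just an illustration, not part of the original example:

-- Backfill the remaining NULLs, then promote the column to a real
-- NOT NULL and drop the now-redundant check constraint.
update foo set id = 0 where id is null
ALTER TABLE foo ALTER COLUMN id int NOT NULL
ALTER TABLE foo DROP CONSTRAINT CheckNotnull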
Firstly, it’s more practical and brings fewer surprises to the people involved (especially when there are many of them).
Secondly, if we’re being pedantic, it is a mere acknowledgement of the fact that the field was not there before and things worked without it. Now the field is here, so in all senses the field _is actually_ optional, and all the code touching it should know how to handle that.
lblume•16h ago
This seems to be the central claim. But just as lies require intent, so does database design to some degree.
A column that is nullable but never null does not conclusively say anything, really. That's like saying a birthday column that is never before 1970 in the current data should be restricted to years after that date.
A nullable column signals that the data may be left empty, which is entirely different from the column actually being empty in the current data. It's the is-ought distinction: the potentiality of null ("nullability") is not to be confused with the actuality of current vacancy ("null values"). As a principle of robust, fault-tolerant code, the programmer extracting data from the database must check for null whenever the column is nullable, not just when it happens to be null now.
tuyiown•16h ago
But even non-nullable does not always resist time. I'd argue that use cases where the field _has_ to be null eventually emerge and somehow have to be mitigated. There is no easy solution to safely work around that without either tons of work that duplicates lots of things, or taking risks by adapting the code base.
jerf•16h ago
One could write a separate piece about maybe using that as a clue that the field could be NOT NULL'd in the future, but that's not what this post is about.
dataflow•16h ago
No it's not, because they specifically started with the premise that the field was initially intentionally non-null and was only temporarily made nullable for migration purposes. That is obviously not the situation you are describing, right?