I can imagine them going "I had a perfect database schema that covered every edge case, and then..." with each bullet point.
This had never happened before.
Like, you don't even _change_ the IATA code of a live airport. To switch them was a huuuuuuuuuge assumption breaker for the industry.
https://www.youtube.com/watch?v=jfOUVYQnuhw
including (attempts at) a few in-depth reasons for why these quirks exist
My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
> Airports never move
Also, runways never move. Also, if runways move, they don't change direction. Also, if airports or runways move, there will be some construction work beforehand.
I'd add "aircraft only land on runways" there too. And "ok, aircraft only land on runways and heliports".
Can you elaborate more?
It's pretty cool to be on a ferry and see a plane land basically next to you in the middle of the river.
Some of my favourite planes were the Grumman Mallards still owned and operated by Paspaley Pearling out of Mungalalu Truscott and other Kimberley airbases.
They're classic 1950s twin-engined amphibious aircraft that landed anywhere up and down the Kimberley Coast for pearling transfers.
Myths programmers believe about cars:
Cars in the same lane always travel in the same direction.
Each street has a name.
Each street has a unique name.
Each street has only one name.
Cars have four wheels.
Cars never move vertically.
Roads never move.
Roads never cross water without bridges.
When two roads cross, they do so at an intersection.
Take any field in human experience and one can make such a list.
All boats float. Ships are bigger than boats. Boats are slower than airplanes. Boats only travel on water.
That's exactly the point. The famous example (Falsehoods Programmers Believe About Names) has examples I have encountered in medical databases. If a programmer somewhere didn't fall into the trap, patient names in a medical database would have been better managed and may have avoided duplication, lost records, etc.
To the best of my recollection, the only way to tell a ship from a boat is to watch it make a "high" speed turn: ships lean out, boats lean in. But this is probably incorrect, just like all of my education was.
https://www.flightaware.com/squawks/view/1/7_days/popular_ne...
From that dead comment quoting a chat bot that clearly did not understand the question at all, I think maybe we can extract a single bullet point:
* “Edge cases” live only at the edges; they never creep into the middle.
But that's not much to build a post with.
1. I'll never need to learn a falsehood list, so I can skip it.
2. A falsehood list is complete at the time of writing.
3. OK, but it will surely get updated with new falsehoods and clarifications.
4. Skimming the falsehood list is all I need to do to learn it.
5. OK, but surely I'll remember to recheck the falsehood list once I actually need to, right?
6. If a falsehood doesn't immediately make sense to me, there must be something wrong with it, despite the author having domain expertise that I don't.
Literally had to point out just last night how UTC is not sufficient in all scenarios. I swear it happens every 6 mos on Reddit.
I know that there is an ICAO code on Mars (since I had read about it before).
I think there are some airports that have an ICAO code but no IATA code and vice versa, and some have a "pseudo-ICAO" code with letters and numbers together.
Software unfortunately follows rigid rules so the challenge is finding a set of rigid rules that can encompass reality. It would be pretty natural if you were writing a database schema that a flight would have a departing airport and an arriving airport— but alas.
A theme running through the article is "this value is unique" and "this value does not change". And of course those are both wrong.
So when designing databases now I assume "everything changes, nothing is unique" (even when the domain "expert" professes it is).
This approach solves so many problems and saves so much time later on when it turns out that that "absolutely, positively, unique for ever" natural key isn't.
UUID keys PLUS some form of versioning with creation dates will let you change an airport name and let you know what the airport name was on some arbitrary date in the past. Useful for backfills and debugging
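A rough sketch of what that could look like, assuming an in-memory model rather than any particular database. The class names, the `rename` and `name_on` helpers, and the airport names are all made up for illustration:

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class AirportNameVersion:
    """One version of an airport's name, valid from `valid_from` onward."""
    valid_from: date
    name: str


@dataclass
class Airport:
    """Hypothetical record keyed by a surrogate UUID instead of a natural key."""
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    name_versions: list[AirportNameVersion] = field(default_factory=list)

    def rename(self, name: str, valid_from: date) -> None:
        # Append a new version instead of overwriting the old name.
        self.name_versions.append(AirportNameVersion(valid_from, name))
        self.name_versions.sort(key=lambda v: v.valid_from)

    def name_on(self, day: date) -> str | None:
        # Walk the versions in date order and keep the last one
        # that was already valid on `day`.
        current = None
        for version in self.name_versions:
            if version.valid_from <= day:
                current = version.name
        return current


# Invented names and dates, purely for illustration.
airport = Airport()
airport.rename("Old Name Field", date(1990, 1, 1))
airport.rename("New Name International", date(2015, 6, 1))
```

Asking `airport.name_on(date(2000, 1, 1))` then answers with the name that was in force at the time, which is the part that helps with backfills. In a real schema the versions would presumably live in their own table keyed by the airport's UUID.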
So now some information comes in from outside the system that something happened with a plane, and you still have to find which surrogate id that plane has in your system.
You may decide two things happened to two different planes whereas another system consider it the same plane both times, and vice versa.
A UUID is at best 2x larger than even a BIGINT, thus the index size is 2x larger. If you aren’t using v1 or v7, it’s also not k-sortable. But most importantly for MySQL (and optionally SQL Server) if the table contains things related to a common entity, like a user’s purchases, the rows are now scattered around the clustering index’s B+tree. That incurs a huge amount of I/O on large tables, and short of a covering index and/or partitioning (which only masks the problem by shrinking the search space), there is no way to improve it. If instead the PK was (user_id, some_other_identifier), all records for a given user are physically co-located.
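To make the two points concrete, here is a small simulation, not a benchmark. It assumes a clustering index simply stores rows in key order; the `purchases` rows and the `runs_for_user` helper are invented for the example:

```python
import struct
import uuid

# A UUID is 16 bytes; a signed 64-bit integer (BIGINT) is 8 bytes.
UUID_BYTES = len(uuid.uuid4().bytes)
BIGINT_BYTES = struct.calcsize("<q")

# Hypothetical purchase rows: (user_id, purchase_no).
purchases = [(2, 1), (1, 1), (2, 2), (1, 2), (3, 1)]


def runs_for_user(ordered_rows, user_id):
    """Count the separate contiguous runs that hold this user's rows."""
    runs, inside = 0, False
    for row_user, _ in ordered_rows:
        if row_user == user_id and not inside:
            runs += 1
        inside = row_user == user_id
    return runs


# Composite key (user_id, purchase_no): each user's rows are adjacent.
by_composite = sorted(purchases)

# Random UUID key: each row gets an unrelated key, so a user's rows
# can end up anywhere in the key order.
by_random_uuid = sorted(purchases, key=lambda row: uuid.uuid4().bytes)
```

With the composite key every user occupies exactly one contiguous run. With random UUIDs the number of runs varies from execution to execution, which is a toy version of the scattered I/O described above.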
It is by no means specific to programmers. Ask someone who is learning French, for instance: rules with too many arbitrary exceptions.
What is specific to programmers is that their tools perform best with simpler rules, so their job is to find the necessary and sufficient set of rules, and they will dismiss most of the cases pointed out by this article as unimportant exceptions the software won't handle.
I took French in middle school, and it was always a running joke that the teacher spent the first 5 minutes on the rule, and the next 40 minutes on the exceptions.
I'd argue that programmers are indeed much more aware of how many exceptions and edge cases most real world domains have. Ask a lay person about such a simple thing as leap seconds, for instance, and they'll often believe you're making shit up.
In the map everything is clear. It is clear what a "plane" is, what "airports" are, and what their relationship is. And transferring that into a computer program is straightforward.
In the territory everything is fuzzy. None of the definitions are without edge cases and the expected relationships are often violated in surprising ways.
Aviation isn't unique here; every system suffers from the distinction between its actual function and the abstract description of that system.
Aircraft do not have a singular unique identifier that is time invariant.
While it is true that aircraft have serial numbers issued to their airframe, by itself, aircraft serial numbers are not unique.
The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
The combination of ICAO aircraft type designator + serial number approximately is the most permanent identifier for an airframe - and even then - if an airframe is modified significantly enough that it no longer is the previous type - even then this identifier can change.
Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
P.S. For those who might ask - aircraft registration numbers are like license plates, so they change - tail numbers can be ambiguous and misinterpreted depending on what is painted on the aircraft where, and ICAO 24-bit aircraft addresses are tied to ADS-B transponder boxes, which technically can be moved and reprogrammed between aircraft also.
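As a sketch of that idea (not the patented method, which isn't described here), a composite key could be modelled as a frozen dataclass, with registrations kept as changeable attributes. Every name and value below is hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AirframeKey:
    """Composite identity: no single field is unique on its own."""
    manufacturer: str
    make: str
    serial_number: str  # kept as text, since serial formats vary


# Registrations behave like license plates, so they are stored as a
# history under the composite key rather than used as the key itself.
registrations: dict[AirframeKey, list[str]] = {}


def register(key: AirframeKey, registration: str) -> None:
    registrations.setdefault(key, []).append(registration)


# Made-up airframes: same serial number, different manufacturers.
a = AirframeKey("ExampleAir", "EX-100", "0042")
b = AirframeKey("OtherWorks", "OW-7", "0042")

register(a, "N-EXAMPLE1")
register(a, "N-EXAMPLE2")  # the same airframe, re-registered later
```

Here `a` and `b` stay distinct even though they share a serial number, and re-registering `a` adds to its history instead of creating a second aircraft. It does not cover the case mentioned above where a heavily modified airframe changes type.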
Is that allowed?
It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.
In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?
Plus year of production if necessary.
I’ve seen programmers attempt to deduplicate humans by language spoken.
(No racist intentions here, but you bring up both points and I thought that to be interesting)
The son of John who is a smith
I'm only joking a little. Funny thing, surnames aren't actually that old for Europeans. Most of history there'd be maybe two people with the same name. They solved it back then very much the same way we solve it now.
https://en.wikipedia.org/wiki/Category:Occupational_surnames
If you've ever spent time in old car forums, you learn that even this isn't enough because of production-line sloppiness.
Serial number re-use is rare, but it happens. Usually because a product had something detected that resulted in remanufacturing, but sometimes other things slip.
How is that supposed to help? If two people have the same name, it's overwhelmingly likely that they also speak the same language.
The operators, such as Delta, do not actually own engines on the aircraft they fly, even though they own the aircraft. The engines are rented from e.g. Pratt & Whitney along with a maintenance contract. That said, the engines are in fact installed at the factory.
> patent that involves defining that as a unique identifier for aircraft.
Now I got mighty curious what makes this novel enough to be a patent.
It boggles my mind that despite not having some sort of universal system things work as well as they do.
Aviation grew up relatively insular, and each country that had any sort of aircraft manufacturing did things their own way until fairly recently. Arguably, the first half of the history of aviation is a kind of free-for-all. The fact that we now have a globalized airline industry that mostly follows some kind of standards is the mind-blowing part to me. And I suspect if we weren't mostly down to a dozen or so manufacturers for the vast majority of airliners, even that wouldn't be the case.
What if a new aircraft were made 50/50 from the parts of two older aircraft?
Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.
That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.
And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).
It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.
When designing data I think these questions (skepticisms) should be front of mind:
1) natural values are not unique.
2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.
3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.
4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.
And (especially today) never optimize design for "size". Y2K showed that folly once and for all.
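Point 2 is easy to show with a ZIP code. A tiny sketch, where the two helper functions are just illustrative names:

```python
def parse_as_number(raw: str) -> int:
    """Treat the identifier as a number (the approach being warned against)."""
    return int(raw)


def keep_as_identifier(raw: str) -> str:
    """Treat the identifier as opaque text; only trim surrounding whitespace."""
    return raw.strip()


zip_raw = "02134"
as_number = parse_as_number(zip_raw)   # the leading zero is lost
as_text = keep_as_identifier(zip_raw)  # the value survives unchanged
```

Converting `as_number` back to text gives `"2134"`, which is no longer the value that was entered, and an identifier like `"A-1009"` cannot be parsed as a number at all.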
This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.
> 3)
I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
> never optimize for size
Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
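For anyone who wants to see the conversion itself, here is a sketch using Python's standard `ipaddress` module. The function names are made up; in MySQL the equivalent built-ins are, as far as I know, `INET_ATON()` and `INET_NTOA()`:

```python
import ipaddress


def ip_to_uint32(dotted: str) -> int:
    """Convert a dotted-quad IPv4 string to its 32-bit unsigned integer value."""
    return int(ipaddress.IPv4Address(dotted))


def uint32_to_ip(value: int) -> str:
    """Convert a 32-bit unsigned integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


stored = ip_to_uint32("192.168.0.1")
restored = uint32_to_ip(stored)
```

The integer fits in 4 bytes, versus up to 15 characters for the dotted-quad string, and invalid input such as `"999.1.1.1"` raises an error instead of being stored. This only covers IPv4; IPv6 addresses need 128 bits.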
These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.
Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-89" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
And those forms where you have to enter a credit card (or bank account number) in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.
So while some examples may count as "just usability" it all stems from naive assumptions by programmers who think one size fits all (it doesn't).
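A lenient version of that input handling might look like the sketch below. The function names and the exact set of separators to ignore are my assumptions, not something taken from an RFC:

```python
import re

# Whitespace plus a few common separators that people type or paste.
_SEPARATORS = re.compile(r"[\s\-().]+")


def normalize_phone(raw: str) -> str:
    """Drop whitespace and separators instead of rejecting the input."""
    return _SEPARATORS.sub("", raw)


def looks_like_email(raw: str) -> bool:
    """Only check that there is a non-empty local part and domain part."""
    local, at_sign, domain = raw.strip().rpartition("@")
    return bool(local) and bool(at_sign) and bool(domain)


phone_a = normalize_phone("123 456 789")
phone_b = normalize_phone(" 12-3456-89\t")  # stray whitespace from copy/paste
plus_ok = looks_like_email("first+tag@example.com")
```

The same normalization would work for card or account numbers pasted with spaces. The email check deliberately says nothing about which characters are allowed in the local part, so a plus sign passes.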
> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.
Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.
It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.
I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.
Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.
[1] https://travel.stackexchange.com/questions/149323/my-name-ca...
Eventually you end up having to make choices and deal with the consequences. Otherwise Jordan Peterson would have you chasing your tail for days about what a "choice" is, and nothing would ever get done.
tl;dr: just make your best guess and always include an extra "notes" column where things can get leaky.
Notes / data / extra et al. columns are the worst, as a DBRE. People inevitably shove various shit into them over time instead of making an effort to properly fix past mistakes, and at some point, they practically contain their own table.
Aside: is there a notation for such constraints?
Not all the things in the list, because I am aware of those. I might have missed the runway numbers changing based on shifting magnetic field of the earth, but that's a thing too. Runway 22? That's now Runway 21.
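The renumbering follows from how runways are usually numbered: the magnetic heading rounded to the nearest 10 degrees, with the last digit dropped. A sketch under that assumption (the function name is made up, and the actual decision to repaint a runway is made by the authorities, not by a formula):

```python
def runway_number(magnetic_heading_deg: float) -> int:
    """Round the magnetic heading to the nearest 10 degrees and drop the last digit."""
    # Add 0.5 and truncate so that halves always round up
    # (Python's built-in round() rounds halves to the nearest even number).
    number = int((magnetic_heading_deg % 360) / 10 + 0.5)
    number %= 36
    # A heading near north is written as 36, never as 0.
    return 36 if number == 0 else number


before_drift = runway_number(218)  # heading when the runway was named
after_drift = runway_number(214)   # same runway after the magnetic field shifted
```

With these illustrative headings the same strip of concrete goes from 22 to 21 without moving, so the runway number cannot be treated as a stable key.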
But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
I don't read it as programmers specifically believing that; it's that they're specifically treating these things as invariants in their projects.
Also, feeling myself stupid very quickly. Very nice summary, bravo!
Isn't that blindingly obvious? If so, how did it get to be a patent? And is someone now extracting rent from it?
> A method for inducing cats to exercise consists of directing a beam of invisible light produced by a hand-held laser apparatus onto the floor or wall or other opaque surface in the vicinity of the cat, then moving the laser so as to cause the bright pattern of light to move in an irregular way fascinating to cats, and to any other animal with a chase instinct.
How on earth is anyone supposed to be able to take the patent system as a whole seriously when there are 100s (if not 1000s) of examples like that, which obviously shouldn't be approved if "novel" or "non-obvious" ideas are required.
https://patents.google.com/patent/US6360693B1/en
The US patent system seems profoundly broken. Given that the patent system seems much less broken in other developed countries and the vast wealth and resources of the US, I assume it is broken on purpose?
It's not just the technology, it's the employment of it too. In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
Put another way, the change can be incremental. Building upon what is. Without this, pretty much all incremental science would lose funding, for the moment you invent, regardless of cost, it'd just be copied.
If you've ever done hardware, even a toy, it's not simple.
Extensive prototypes, testing for drops, hand fit, assembly at the factory, and more.
Devs today can't even conceive of making a 100% stable product to be shipped on floppy and never updated. Reshipping for bugfixes could break a company in the old days.
Now try that with hardware!
And all those tweaks, fixes, tests can be copied in a second without patents.
I think separating software and hardware patent discussions would be better here, because hardware patents are required.
I think your timescale is slightly off, but I don't know enough about laser history to say definitely. But judging by what I could find, in 1981 Popular Science seems to have run an ad for laser pointer devices, aimed (no pun intended) towards consumers:
> It wasn’t until the 1980s that lasers became small enough, and required so little energy, that they finally became cheap enough to be used in consumer electronics — take this funky laser pointer from the early 1980s, for example. The November 1981 edition of Popular Science features a Lasers Unlimited advertisement for an assortment of laser pointing devices, including a ruby laser ray gun, a visible red laser lightgun, multi-color lasers and laser light shows, all of which were selling for less than $15 (equivalent to about $42 today) - https://melmagazine.com/en-us/story/a-dazzling-history-of-th...
So if they became usable by consumers in the 1980s, I'm about 99% confident at least one individual used it for playing with their cats.
But since the author of the patent just happened to have spent the time (10 years later) to write the patent, they got it awarded to them.
Hehe, I was once told we couldn't land at our destination A, so we got diverted to B; while on our way to B we were told we are actually going to C; and, while on our way to C, A became available again so the plane did a U-turn and we flew back to A, landing with a ~3 hour delay.
The cause was snow and wind.
What made the corresponding lists for names and time interesting was that it was genuinely surprising to realise that their statements were actually false. I don't get that feeling with these.
Like the top level comment about identifiers for airplanes -- why would they have them? That sounds baffling to me. With ownership changes, continuous upgrades, extending airframes, repurposing etc. I would be surprised if there was a stable identity.
* Programmers believe they are handling all possible configurations of the universe when putting something into production.
* Programmers don't handle all possible configurations of the universe when putting code into production because they don't know any better.
Falsehoods people believe about the universe:
* There exists a constant.
* SI units are constant at all times or everywhere.
* When a new corner case appears, it is easy to adjust the program to handle it.
We detached this comment from https://news.ycombinator.com/item?id=44207171 and marked it off-topic.
Things like flight numbers not having reasonable semantics, or conceptual pollution of what a flight is to include multiple take offs and landings are bad design, plain and simple. Just model the problem correctly e.g. maybe a Trip is multiple Flights, or Flights have multiple Legs. This isn't aviation specific. These are generic problems that programmers can and should get right.
Some of it is intrinsic to the domain, like flights not all having gates, or not landing at airports. That was a new tidbit for me.
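One way that modelling could be sketched, assuming a flight is a list of legs and that places and gates are optional free-form values. The class names, the `stops` helper, and the codes are all hypothetical:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Leg:
    """One takeoff and one landing; endpoints need not be airports."""
    origin: str
    destination: str
    gate: Optional[str] = None  # not every flight has a gate


@dataclass
class Flight:
    """A flight number covering one or more legs."""
    flight_number: str
    legs: list[Leg] = field(default_factory=list)

    def stops(self) -> list[str]:
        # The first origin followed by every leg's destination, in order.
        if not self.legs:
            return []
        return [self.legs[0].origin] + [leg.destination for leg in self.legs]


# Made-up flight with an intermediate stop and no gate on the second leg.
flight = Flight("XX123", [Leg("AAA", "BBB", gate="12"), Leg("BBB", "CCC")])
```

A Trip could then hold several such flights. Keeping the gate optional and the endpoints as plain places is what lets the intrinsic oddities fit without special cases.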
it doesn't matter whose failure it is
the point of the article (just as with the one about names) is that there are "reasonable defaults" many people would believe - that don't work in practice and become gotchas
whether you have enough knowledge to know that something is unreasonable doesn't mean it doesn't seem reasonable for many others
The fact remains that software that models real-life events or information is making normative assumptions about what can and cannot happen in the domain, due to the very nature of software, and these assumptions are knowingly-or-not being introduced by programmers. If for any given domain we had hundreds of human notaries, scribes or typists managing information instead of software, their mistaken assumptions wouldn't matter—they would simply go "Oh, that's odd", make the necessary adjustments, and learn from the experience. But as long as software is a prescriptive model of what it is representing, it will be valuable to highlight the "falsehoods" that its creators may accidentally prescribe into it.
[1] https://www.ncei.noaa.gov/news/airport-runway-names-shift-ma...
Having said that, many of the links are very informative. For example the crater on Mars that has an ICAO airport code [2]: "On 19 April 2021, Ingenuity performed the first powered flight on Mars from Jezero, which received the commemorative ICAO airport code JZRO."
[1] https://www.flightaware.com/live/flight/PDT5965/history/2025...
[2] https://en.wikipedia.org/wiki/Jezero_(crater)