The reason the author finds data modeling 'dead' is that the Modern Data Stack promised you could transform your data later, and many people never got around to it. Long live the data swamp!
What is forgotten is the data governance and the data quality, which results in, yes, data swamps as far as the eye can see and hordes of "data scientists" roaming around hoping to find actionable "gems".
With the advent of unlimited storage and separation of compute and storage, dimensional data modeling would only be possible if there were strong data governance in a system like SAP or a COE.
Now what is this dump good for? It's just a bunch of bytes of information which now needs to be interpreted. There are different perspectives (sales vs manufacturing vs procurement vs finance etc). There are data quality issues that need to be identified and resolved. There's PII and other compliance stuff. You have to watch out for giving permissions to sensitive information (ever dealt with payroll data? It's fun). Your data dump isn't doing any of that by itself. And I think people tend to simply stop at the data dump stage, then give access to analysts and data scientists and tell them to go do reports and outbound data feeds.
With obvious results.
Then you look under the hood of the dashboards, only to see that not a single one follows the official definition of the business.
tietjens•4mo ago
tremon•4mo ago
me_bx•4mo ago
Well-thought-out, sophisticated ways of modeling data for analytics purposes, using established approaches, are being replaced by just pulling data from the data sources, with barely any change to the source structure, into cloud data platforms.
In the past we used to model layers in a data-warehousing infrastructure, each with a purpose and a data modelling methodology. For instance, an operational data store (ODS) layer, integrating data from all the sources, with a normalized data structure. Then a set of datamarts, each containing a subset of the ODS content in a denormalized format and each focused on a specific functional domain.
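To make the layering concrete, here is a minimal sketch in Python with an in-memory SQLite database. Everything in it is an assumption for illustration: the table names (`ods_customer`, `ods_product`, `ods_order`, `mart_sales`), the columns, and the sample rows are made up, and a real warehouse would have far more sources and history handling. It only shows the shape of the idea: normalized ODS tables first, then a denormalized datamart derived from them for one functional domain (sales).

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Build a toy two-layer warehouse: a normalized ODS and a sales datamart."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- ODS layer: integrated, normalized tables (hypothetical schema).
        CREATE TABLE ods_customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            region      TEXT NOT NULL
        );
        CREATE TABLE ods_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            category   TEXT NOT NULL
        );
        CREATE TABLE ods_order (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES ods_customer (customer_id),
            product_id  INTEGER NOT NULL REFERENCES ods_product (product_id),
            quantity    INTEGER NOT NULL,
            unit_price  REAL NOT NULL
        );

        -- Made-up sample rows.
        INSERT INTO ods_customer VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'APAC');
        INSERT INTO ods_product VALUES (10, 'Widget', 'Hardware'),
                                       (11, 'Support plan', 'Services');
        INSERT INTO ods_order VALUES (100, 1, 10, 3, 20.0),
                                     (101, 1, 11, 1, 100.0),
                                     (102, 2, 10, 5, 20.0);

        -- Datamart layer: a denormalized subset of the ODS for the sales domain.
        -- The joins are done once here so reports do not have to repeat them.
        CREATE TABLE mart_sales AS
        SELECT o.order_id,
               c.region,
               p.category,
               o.quantity * o.unit_price AS revenue
        FROM ods_order AS o
        JOIN ods_customer AS c ON c.customer_id = o.customer_id
        JOIN ods_product  AS p ON p.product_id  = o.product_id;
        """
    )
    return conn


def revenue_by_region(conn: sqlite3.Connection) -> dict:
    """Report query that reads only the datamart, never the ODS tables."""
    rows = conn.execute(
        "SELECT region, SUM(revenue) FROM mart_sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_region(build_warehouse()))
```

The point of the split is that the revenue calculation lives in one place (the datamart definition), so every report built on `mart_sales` shares it instead of re-deriving it from the raw tables.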
We had rules, methods to structure data in order to get performant reporting, and a customer orientation.
Coming from this world, it seems like data governance principles are gone, and it feels like some organisations use the modern data stack the same way analysts would each maintain their own Excel files in their own corner, without any safeguards.
icedchai•4mo ago