Do we still need data models? What is the value of models? Why did data modeling fail in the past? How can we create data models? How can we create value through data modeling? I try to answer all of this in the following presentation.
The interesting question is: Is data modeling dead? And this is not a hypothetical question. It started some years ago when I was at a conference in the US, and some very famous data modelers, who had been working for big insurance companies for 30-40 years, asked: ‘Is data modeling dead?’ They were doing their job, creating value, but they didn’t feel appreciated anymore. The value of data modeling wasn’t seen as much as before.
So let’s start by understanding what data modeling is and why we need it. For context, I work on a data warehouse automation tool called Data Vault Builder. We believe in working on business models and automatically converting them into working code. What follows is my personal perspective.
Now, let me share my professional experience and a simplified history of data modeling. I began working in the data warehousing area in the year 2000. Initially, I worked for a financial institution where we focused on data reconciliation. Later, I transitioned to working on telecommunications variables and led a data management team for a telecom provider. We dealt with more complex tasks.
In 2012, I faced challenges with agile data warehousing and near real-time data warehousing. I tried using data vault as a modeling paradigm, and it worked well. However, we realized that automation was necessary for it to be effective. This realization in 2012 laid the foundation for Data Vault Builder. Since then, I have been with the company, and we have been developing this tool as a software vendor.
Today, my role involves pre- and post-sales activities, as well as giving presentations to train people and enable them to exchange ideas within the community.
So what do models mean to me? They provide meaning to complex things by simplifying them. A model is usually a simplified representation of something more complicated. If you look up “ontology” on Wikipedia, you will find different kinds of elements. We have classes or concepts, properties, relationships, axioms, and constraints. These basic elements of data modeling have remained consistent throughout history, although there may be variations. The core ideas remain the same. This means that if you learn data modeling at one point in time, you will be well-prepared for the future, for many years to come.
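To make these elements concrete, here is a minimal sketch in Python. The class names (`Concept`, `Relationship`, `Constraint`), the `conforms` helper, and the customer/policy example are all assumptions made up for illustration; they do not come from any particular tool or standard. Axioms are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Concept:
    """A class or concept, with its properties given as name -> expected type."""
    name: str
    properties: Dict[str, type] = field(default_factory=dict)


@dataclass
class Relationship:
    """A named link from one concept to another."""
    name: str
    source: Concept
    target: Concept


@dataclass
class Constraint:
    """A rule that a record must satisfy."""
    description: str
    check: Callable[[dict], bool]


# Hypothetical example model: a customer holds a policy.
customer = Concept("Customer", {"name": str, "birth_year": int})
policy = Concept("Policy", {"premium": float})
holds = Relationship("holds", customer, policy)
positive_premium = Constraint(
    "premium must be positive", lambda record: record["premium"] > 0
)


def conforms(concept, record, constraints=()):
    """Return True if the record has exactly the concept's properties,
    each of the expected type, and satisfies every given constraint."""
    if set(record) != set(concept.properties):
        return False
    if not all(isinstance(record[key], expected)
               for key, expected in concept.properties.items()):
        return False
    return all(rule.check(record) for rule in constraints)
```

With these definitions, `conforms(policy, {"premium": 120.0}, [positive_premium])` accepts the record, while a negative premium or a premium given as text is rejected. The same handful of building blocks would describe a model in almost any domain, which is why they age so well.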
Let’s take the example of a tree. A real tree with all its leaves and branches has thousands of different aspects. Capturing a real tree in a database or a file would be an enormous effort, and often unnecessary for analytical purposes or in a business context. Instead, we can create a simpler model that represents the key characteristics. For example, we can use a model with green parts, brown parts, and the shape of a tree. Even with this simplified representation, if I show you the picture, you will understand that it represents a tree. The simplest representation would be just the outline, and yet you would still recognize it as a tree. Counting the number of trees would be straightforward. This level of simplicity is sufficient for many applications.
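The same idea can be sketched in a few lines of Python. This is only an illustration: the class names, the attributes, and the sample values are invented for the example, and a real tree would of course have far more aspects than the "detailed" version here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetailedTree:
    """A richer model of a tree (still far simpler than a real one)."""
    tree_id: int
    species: str
    height_m: float
    leaf_count: int
    branch_count: int


@dataclass(frozen=True)
class SimpleTree:
    """The 'outline' model: it only records that a tree exists."""
    tree_id: int


def simplify(tree: DetailedTree) -> SimpleTree:
    """Drop every detail that is not needed for counting."""
    return SimpleTree(tree_id=tree.tree_id)


# Made-up sample data, purely for illustration.
forest = [
    DetailedTree(1, "oak", 18.5, 200_000, 340),
    DetailedTree(2, "pine", 22.0, 150_000, 120),
    DetailedTree(3, "oak", 9.3, 60_000, 95),
]

simple_forest = [simplify(tree) for tree in forest]
tree_count = len(simple_forest)
```

Counting gives the same answer on either model, so for this question the outline model is enough and the extra detail adds only cost.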
There is a great picture that illustrates the relationship between the real thing and its model.
It says, ‘This is not a pipe’; it’s a picture of a pipe. It’s a simplified, two-dimensional representation, but you can still recognize what it is. However, it’s important to remember that it’s not the real thing. We can use these kinds of models to simplify and explain things.
Now, let’s consider the forest in this picture. If I show you this picture, you might say it’s a nice forest, but what is it all about?