The Datavault Builder follows a standardized Data Vault integration model. The core itself contains data from different sources and can therefore be split into three categories:
Persistent Staging Area (PSA): Historization of the source in the Datavault using the technical identifier (Source Vault). This is needed if the business key is not available in the source columns. The PSA is optional.
Raw Vault: Object historization on the business key.
Business Vault: Historization/Persisting of applied business logic.
The staging area holds a 1:1 copy of the data extracted from the source. It is transient, meaning that the data is cleared and repopulated with the latest extract from the source on the next load.
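The transient behaviour described above can be illustrated with a minimal sketch. It is not the Datavault Builder implementation: the `StagingTable` class and its `load` method are hypothetical names, and an in-memory list stands in for a database table.

```python
from typing import Dict, List

Row = Dict[str, object]


class StagingTable:
    """Hypothetical in-memory stand-in for a transient staging table."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def load(self, extract: List[Row]) -> None:
        # Clear the previous load first, then keep a 1:1 copy of the new extract.
        self.rows.clear()
        self.rows.extend(dict(row) for row in extract)


staging = StagingTable()
staging.load([{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}])
staging.load([{"id": 2, "name": "Bob"}, {"id": 3, "name": "Carol"}])

# Only the rows of the latest extract remain after the second load.
ids_after_second_load = [row["id"] for row in staging.rows]
```

Because nothing survives between loads, any history has to be kept further downstream, in the PSA or the Raw Vault.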
Persistent Staging Area
The PSA is optional.
Alternatively, a PSA can also be built for purely technical reasons, for instance if the Business Keys are not yet defined, or to automatically build a Source Vault based on the Primary Keys of the source.
As a basic principle, NO RULES are applied between PSA and Raw Vault. The only exception is the cleaning of Business Keys (for instance removing duplicates, which otherwise could not be loaded into the Raw Vault). In this case, however, a link should be built to document which records from the PSA were integrated into the Raw Vault.
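The exception for duplicate Business Keys could be sketched as follows. This is an illustration under assumptions, not the product's algorithm: the function `integrate_psa`, the column names `row_id` and `customer_no`, and the "keep the first occurrence" rule are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def integrate_psa(
    psa_rows: List[Row], technical_key: str, business_key: str
) -> Tuple[Dict[object, Row], List[Tuple[object, object]]]:
    """Load PSA rows into a Raw Vault stand-in, skipping duplicate business keys.

    Returns the Raw Vault content and a link that records which PSA rows
    (identified by their technical key) were integrated.
    """
    raw_vault: Dict[object, Row] = {}
    link: List[Tuple[object, object]] = []
    for row in psa_rows:
        key = row[business_key]
        if key in raw_vault:
            # Assumption: the first occurrence wins; later duplicates stay in the PSA only.
            continue
        raw_vault[key] = row
        link.append((row[technical_key], key))
    return raw_vault, link


psa_rows = [
    {"row_id": 1, "customer_no": "C-100", "name": "Alice"},
    {"row_id": 2, "customer_no": "C-100", "name": "Alice A."},
    {"row_id": 3, "customer_no": "C-200", "name": "Bob"},
]
raw_vault, link = integrate_psa(psa_rows, "row_id", "customer_no")
```

In this sketch the link holds one entry per integrated PSA row, so the skipped duplicate (`row_id` 2) can be identified by its absence.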
The primary path from the source into the Datavault is, however, the Raw Vault, where the integration into the hub is done using the business key.
After the Raw Vault, business logic may be applied.
Based on these explanations, the Raw Vault (mainly) and the PSA (in addition) together form the single source of facts.
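To make the integration on the business key more concrete, here is a small sketch of two sources feeding one hub. The function `load_hub`, the source names `crm` and `erp`, and the column `customer_no` are hypothetical and only serve the illustration; a dictionary stands in for the hub table.

```python
from typing import Dict, List

Row = Dict[str, object]


def load_hub(hub: Dict[str, List[str]], source: str, rows: List[Row], key_column: str) -> None:
    """Register each business key once and note every source that delivered it."""
    for row in rows:
        business_key = str(row[key_column])
        sources = hub.setdefault(business_key, [])
        if source not in sources:
            sources.append(source)


hub: Dict[str, List[str]] = {}
# Both systems use different technical ids but share the business key customer_no.
load_hub(hub, "crm", [{"crm_id": 17, "customer_no": "C-100"}], "customer_no")
load_hub(
    hub,
    "erp",
    [{"erp_id": 903, "customer_no": "C-100"}, {"erp_id": 904, "customer_no": "C-200"}],
    "customer_no",
)
```

The customer delivered by both systems ends up as a single hub entry, which is the point of integrating on the business key rather than on source-specific technical identifiers.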
Business Object / Business Rules
Business Objects and Business Rules should, however, not be mistaken for the Business Vault, as the next paragraph shows.
The Business Vault is a materialization of the business logic applied by the Business Rules. For this purpose, a new source based on the business rules is created within the Datavault Builder itself, which makes it possible to loop the output of the applied business logic back and store it in a historized manner.
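The idea of looping rule output back and historizing it might look like the following sketch. Everything here is assumed for illustration: the rule `customer_segment`, the revenue threshold, and the helper `persist_rule_output` are hypothetical, and a dictionary of version lists stands in for the historized storage.

```python
from datetime import datetime
from typing import Dict, List, Tuple

History = Dict[str, List[Tuple[datetime, str]]]


def customer_segment(revenue: float) -> str:
    """Hypothetical business rule evaluated on Raw Vault data."""
    return "gold" if revenue >= 1000 else "standard"


def persist_rule_output(history: History, key: str, value: str, loaded_at: datetime) -> None:
    """Append a new version only when the rule output changed for this key."""
    versions = history.setdefault(key, [])
    if not versions or versions[-1][1] != value:
        versions.append((loaded_at, value))


history: History = {}
daily_revenue = [
    (datetime(2024, 1, 1), 500.0),
    (datetime(2024, 1, 2), 800.0),
    (datetime(2024, 1, 3), 1500.0),
]
for loaded_at, revenue in daily_revenue:
    # The rule output is treated like a new source and stored with history.
    persist_rule_output(history, "C-100", customer_segment(revenue), loaded_at)

segments = [value for _, value in history["C-100"]]
```

The second load produces the same segment as the first, so only two versions are stored; the rule itself stays virtual, and only its output is materialized.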
The Accesslayer is the virtualized interface layer for a target system. Here the data is presented as flat tables, as a dimensional model, or as a star schema.
The Errormart is the virtualized interface layer for data quality analysis. Based on error views from the business rules layer, faulty data is presented in the access errormart.
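As a rough illustration of a virtualized error view, the sketch below computes faulty rows on demand instead of storing them. The checks on an `email` column and the function name `error_view` are hypothetical; real error views would be defined in the business rules layer.

```python
from typing import Dict, Iterable, Iterator, Optional

Row = Dict[str, Optional[str]]


def error_view(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield faulty rows with an error description; nothing is persisted."""
    for row in rows:
        email = row.get("email")
        if email is None:
            yield {**row, "error": "missing email"}
        elif "@" not in email:
            yield {**row, "error": "invalid email"}


customers = [
    {"customer_no": "C-100", "email": "alice@example.com"},
    {"customer_no": "C-200", "email": None},
    {"customer_no": "C-300", "email": "bob.example.com"},
]
errors = list(error_view(customers))
```

Because the view is evaluated at read time, correcting the data at the source makes the corresponding row disappear from the result after the next load.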