Loading

| Problem | Possible Cause | Possible Resolution | Module |
| --- | --- | --- | --- |
| Job failed to complete | Data structure in the source changed | Hotfix: in the web GUI, go to Operations / Jobs, select the affected job, change to the selection mode, and temporarily deactivate the failing part of the job, if it makes sense to run the job without that part. | operations |
| Job failed to complete | Duplicate key | The business process changed, or the assumption about the business process was wrong. Involve a Data Analyst from this subject area. | operations |
| In the staging, I am getting the message "Error decrypting source password" | 1. A rollout was performed from one environment onto the next one without setting any passwords on the target environment. 2. The system encryption key and/or password was changed in the configuration of the environment, so the encrypted passwords can no longer be decrypted. | Connect to the environment and manually set the system password. | staging |

How is the raw data historized after loading, so it can be reloaded in an error case?
  • In the data integration flow, the data is only historized and persisted once it is loaded into the data vault (persistent staging / raw vault / business vault). The staging area itself is always cleared when a new data set is loaded. If there is a need to historize the data as it arrived in the staging area, there are two approaches:

    • Define a persistent staging load (based on the technical key of the source) to load the data into the data vault before doing the integration on the business key.

    • Write a manual process that is triggered after each staging run to archive the loaded data (see the sketch below).
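
For the second approach, a minimal sketch of such an archiving step is shown below. It assumes a PostgreSQL database accessed via psycopg2 and uses hypothetical table names (stage.customer, archive.customer); the actual schema and the mechanism that triggers it after each staging run depend on your environment.

```python
# Minimal sketch of the archiving approach (assumptions: PostgreSQL via
# psycopg2, hypothetical table names stage.customer / archive.customer).
import psycopg2
from datetime import datetime, timezone

def archive_staging_run(conn, staging_table: str, archive_table: str) -> None:
    """Copy the freshly loaded staging rows into an archive table, stamped
    with the archiving time, before the next staging run clears them."""
    archived_at = datetime.now(timezone.utc)
    with conn.cursor() as cur:
        # Table names are trusted identifiers from our own configuration;
        # the archive table has the staging table's columns plus a leading
        # archived_at timestamp column.
        cur.execute(
            f"INSERT INTO {archive_table} "
            f"SELECT %s AS archived_at, s.* FROM {staging_table} s",
            (archived_at,),
        )
    conn.commit()

if __name__ == "__main__":
    conn = psycopg2.connect("dbname=dwh user=etl")  # hypothetical DSN
    archive_staging_run(conn, "stage.customer", "archive.customer")
```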

What is the difference between a general subset load and a delta load?
  • The difference between a general subset and a delta subset is how the Datavault Builder handles implicit deletions in the source. This mainly has an impact on the historization of data in the data vault.

  • With an active general subset clause, the loaded data set is treated as if it were still the complete data set from the source, i.e. as a full load. This means that if a key value is no longer provided by the source, it will be marked as deleted in the tracking satellite. This is the same behavior as a normal full load, just on a consistently reduced data set, which is always the same slice.

  • An example of this type of load is customer data: we are the country branch of an international company and always extract only the customer data of our own country.

  • With a delta subset, only part of the data is extracted from the source, mainly for performance reasons. In this case, the loaded data is regarded as an effective delta. For historization, this means that if a key value does not appear in a load, it cannot be regarded as deleted, because the key may simply not be part of the loaded subset. Therefore, the "last seen" information is carried in the tracking satellite instead. This information can also be used for permanent delta loading. The sketch below illustrates the difference.
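
As a toy illustration of the difference, the sketch below mimics how a tracking satellite might treat missing keys under the two load types. The data structures and names are assumptions made for illustration, not the Datavault Builder's internals:

```python
# Toy model of deletion detection in a tracking satellite under the two
# subset types (illustrative only; not the Datavault Builder's internals).
from datetime import datetime

def update_tracking(tracked: dict, loaded_keys: set, load_ts: datetime,
                    is_delta: bool) -> None:
    # Every key in this load is "seen" now and definitely not deleted.
    for key in loaded_keys:
        tracked[key] = {"last_seen": load_ts, "deleted": False}
    if not is_delta:
        # General subset / full load: the load is considered complete, so
        # any previously tracked key that is absent counts as deleted.
        for key, state in tracked.items():
            if key not in loaded_keys:
                state["deleted"] = True
    # Delta subset: absent keys are left untouched; we only know they were
    # not part of this delta, not that they vanished from the source.

tracked: dict = {}
update_tracking(tracked, {"C-1", "C-2"}, datetime(2024, 1, 1), is_delta=False)
update_tracking(tracked, {"C-1"}, datetime(2024, 1, 2), is_delta=True)
assert not tracked["C-2"]["deleted"]  # delta: absence proves nothing
update_tracking(tracked, {"C-1"}, datetime(2024, 1, 3), is_delta=False)
assert tracked["C-2"]["deleted"]      # general subset: absence means deleted
```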

How can bitemporal data be loaded (e.g. there is a business validity already in the source)?
  • If there is already a business validity in the source, there are two different timelines: the business validity in the source and the loading time (the knowledge period of the DWH). To load such a business validity, you can create two hubs in the model:

    • The first for the main object (for example, a contract; business key: contract number).

    • The second for the business versioned object with business validity (for example, contract version; business key: contract number + valid from).

  • In this example, if you then have “claims”, those will always point to a specific “business validity” version of the contract and will therefore be linked to the business versioned object (see the sketch below).
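
To make the two business keys concrete, here is a minimal sketch using hypothetical names; in the actual model these are two hubs, typically connected by a link, not Python classes:

```python
# Minimal sketch of the two hubs' business keys (hypothetical names; in the
# Datavault Builder these are modeled as hubs and links, not Python classes).
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ContractKey:
    """Hub 1: the main object, identified by the contract number alone."""
    contract_number: str

@dataclass(frozen=True)
class ContractVersionKey:
    """Hub 2: the business-versioned object; the business validity (valid
    from) is part of the key, so each validity slice is its own hub entry."""
    contract_number: str
    valid_from: date

# A claim points at one specific validity slice of the contract, so it is
# linked to the business-versioned object, not to the plain contract hub:
claim_contract_version = ContractVersionKey("C-1001", date(2024, 1, 1))
```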