In this part of the documentation, we will walk you step by step through all modules and dialogues in the Datavault Builder GUI.

Staging

The staging area is used to connect the Datavault Builder to sources, such as databases, files or even REST-Apis.

../_images/staging.png
  1. Search-Bar

    Allows to filter the Staging-List for System Name, System ID or Staging Table.

  2. New Source

    Allows to define a new source connection by opening up the New Source Dialogue.

  3. Add Table

    Opens up the Add Table Dialogue to add a new staging table for a previously defined source system.

  4. System Actions
    ../_images/systemActions.png
    • Plus: Opens up the Add Table Dialogue with a prefilled source system.

    • Circled arrow: Initiates a full load of all the defined staging tables of the system. Color/Hovering indicates loading status.

    • Screw nut: Opens up the Edit Source Dialogue, which allows to change all editable properties of the New Source Dialogue.

    • Trash bin: Removes the Source System. This action is only available if no dependent staging tables or jobs are defined.

  5. Table Actions
    ../_images/tableActions.png
    • Circled arrow: Initiates a full load of the specific staging table. Color/Hovering indicates loading status. While loading, the button can be used for cancelling.

    • Screw nut: Opens up the Edit Staging Table Dialogue, which allows to change all editable properties of the Add Table Dialogue.

    • Trash bin: Removes the Staging Table. This action is only available if no dependent data vault loads are defined.

  6. Table Row

    Lists the Source Table, Source Schema, Staging Table and last succeeded load date and duration. Clicking onto the Source Table* or Staging Table opens up the Data Preview. (*if supported)

  7. System Row

    Lists the System Name and System ID.

  8. Subset Status
    ../_images/subsetStatus1.png

    Load definition has/had a General Subset Where Clause.

    ../_images/subsetStatus2.png

    Load definition has/had Delta load Subset Where Clause.

    Coloring:
    • black: no load has yet been executed but a subset where clause is defined

    • green: a load has been executed with the current subset where clause definition

    • orange: a load has been executed but in the meantime the subset definition has changed

    • red: a load has been executed but in the meantime the subset definition has been removed

Connecting a new source

Definition

A source is a connection to a data origin. This origin can be either a database, files or even a webservice.

Steps
  1. Navigate to the Staging Module.

  2. Click onto New Source.

  3. Fill out the Base-Tab.

  4. Fill out the Connection-Tab.

  5. Confirm the creation on the Review-Tab.

Base

../_images/base5.png
  1. Source System Name

    Declaration of the Name to be displayed for the System. Mandatory; Editable.

  2. Source System ID

    Declaration of the System ID, which is used for the naming of the technical implementation (such as tables, views, …) related with that system on the database. Mandatory.

  3. Comment

    Individual notes about the system. Will as well appear in the documentation. Editable.

Connection

In this step, the connection properties are declared. The Datavault Builder can connect to any source providing a JDBC-Driver. You can add you own specific drivers.

Note

Please directly contact us, if you are missing a certain default driver

../_images/connection.png
  1. Connection Type

    Declaration of the source type to connect to.

  2. Source Type Parameters

    Depending on the chosen Connection Type, the Datavault Builder will require you to fill in the connection properties.

  3. Connection String

    Based on the declared connection properties, the jdbc-connection string is put together. The string can directly be manipulated to use properties, which are not yet possible by using the source type parameters block.

  4. Test Connection

    By pressing this button, the Datavault Builder tries to connect to the source by using the specified connection string. The test result will return next to the button within a couple of seconds.

Adding a staging table

Adding a staging table will automatically define a staging load.

Definition

A staging load is a 1-to-1 copy of the data from the source.

Usage

These loads can be done as full or as delta loads. The underlying logic to pull a delta from the source can be specified within the creation dialogue of each staging table.

The result of the dialogue is a table in the staging area, consisting of fields with the same or a most similar datatype as the original type in the source. Length-restrictions for certain datatypes are removed, so the data can also be retrieved if a fieldtype is edited in the source. In connection with the built in loading logic, the added staging table can immediately be loaded.

Steps
  1. Navigate to the staging module and click onto Add Table.

  2. Select an existing system on the base tab.

  3. Select the columns to load on the columns tab.

  4. Fill out the details tab.

  5. Confirm the creation on the review tab.

Base

../_images/base6.png
  1. Source System

    Select a previously defined Source System from the list.

Columns

../_images/columns1.png
  1. Source Schema

    By scanning the source, the Datavault Builder supplies a list of available schemas. When using Flat Files as a source, the placeholder “Default Schema” shall be used.

  2. Source Table

    A specific table within the chosen schema can be chosen. When using Flat Files, this relates to a specific file.

  3. Available Columns

    List of the available columns in the source table, with column ID and column type

  4. Selected Columns

    List of the chosen columns which should be added to the staging table.

  5. Available Columns Actions
    ../_images/columnsActions.png
    • Ordering of the columns either by order or name

    • Filtering of columns by name

    • Magnifying Glass: Open up Data Preview (if supported).

  6. Column Selection

    Columns can be added to the selected columns using the buttons in the middle, by double-clicking or drag&dropping.

Details

../_images/details.png
  1. Staging Table Name

    Declaration of the displayed name for the staging table. Editable.

  2. Staging Table ID

    Declaration of the ID used in the technical implementation related with the staging table.

  3. Batch Size

    Activation of batch-based loading from the source as well as specification of the batch-size to load in. Editable.

  4. Delta Load Clause

    Specification of a SQL-based where clause statement to perform delta-loads. The where clause can be parameterized using {{parameters}}, to which values can be assigned on execution in the jobs module. This way, it is as well possible to declare different delta loads for the same staging table. Editable.

  5. Comment

    Comment for the staging table, which will as well appear in the documentation. Editable.

Source Specific Configuration

For certain sources, specifc configuration questions may arise.

Generally, you can use the configuration wizard to connect to a source. Based on the wizard, the connection string is automatically derived. If you are missing options in the wizard, then the connection string can as well be directly edited, offering you more advanced configuration options.

CSV

Detailed documentation and advanced parameters to load from a CSV files is available at http://csvjdbc.sourceforge.net/doc.html - To load from a TAB-Delimited file, use %09 as separator.