Staging / Ingestion Module

Build your own integrated data model by fetching data from various internal and external data sources.

Databases

Connect to any database or other source that provides a JDBC driver.

Oracle, MSSQL, BigQuery, MySQL, Azure Synapse Analytics, DB2, Amazon Redshift, MariaDB, PostgreSQL, SAP HANA and many more...
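As a minimal sketch of what a JDBC connection string for such a source looks like (the dialect, host and database names below are hypothetical examples, and the actual connection would be made through a JDBC driver):

```python
# Minimal sketch: assembling a JDBC connection URL for a staging source.
# Host/database values are made-up examples.

def build_jdbc_url(dialect: str, host: str, port: int, database: str) -> str:
    """Build a standard JDBC URL such as jdbc:postgresql://host:5432/db."""
    return f"jdbc:{dialect}://{host}:{port}/{database}"

print(build_jdbc_url("postgresql", "db.example.com", 5432, "sales"))
# jdbc:postgresql://db.example.com:5432/sales
```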

Files

Connect to all common file formats such as CSV, Excel, Iceberg and Parquet.

CSV, JSON, XML, TSV, fixed-width files, Parquet, Iceberg and, most importantly, Excel files 🙂
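To illustrate the simplest of these, here is a sketch of staging a delimiter-separated file using only Python's standard library (the sample data is made up):

```python
import csv
import io

# Minimal sketch: parsing a delimiter-separated file into staging rows.
sample = "id;name;amount\n1;Alice;10.5\n2;Bob;7.0\n"

def stage_csv(text: str, delimiter: str = ";") -> list[dict]:
    """Parse CSV text into a list of row dicts, one per source record."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

rows = stage_csv(sample)
print(rows[0])  # {'id': '1', 'name': 'Alice', 'amount': '10.5'}
```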

NoSQL

Connect to NoSQL sources such as Delta Lake, Kafka or MongoDB.

Delta Lake, Elasticsearch, Hive, Kafka, Kinesis, MongoDB, Prometheus, Redis, Thrift and others...

Python Classes

Connect to online applications that provide a Python class for the connection.

Google Analytics, Salesforce, Exact Online and whatever else you find as a Python class.
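The general pattern is a Python class that yields rows from the application's API. A minimal sketch follows; the class and method names are hypothetical, not Datavault Builder's actual connector interface:

```python
# Minimal sketch of the pattern: a source wrapped in a Python class whose
# method returns rows. Names here are hypothetical illustrations.

class InMemorySource:
    """Stand-in for e.g. a Salesforce or Exact Online client class."""

    def __init__(self, records: list[dict]):
        self._records = records

    def fetch_rows(self) -> list[dict]:
        # A real connector would call the application's REST API here.
        return list(self._records)

source = InMemorySource([{"account": "ACME", "revenue": 1200}])
print(source.fetch_rows())  # [{'account': 'ACME', 'revenue': 1200}]
```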

Compatible with other ingestion tools

Combine Datavault Builder with the databases' built-in ingestion tools or with third-party ETL tools.

Use the native ingestion of Snowflake, Exasol or Microsoft Synapse Analytics, or use ETL tools like Informatica, Matillion or Talend.

Batch Loading

Connect to any data source out of the box and let Datavault Builder calculate which data has changed since the last load.
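Conceptually, such change detection can be sketched as a hash-based diff between the previous and the current full snapshot (a simplified illustration, not Datavault Builder's internal algorithm):

```python
import hashlib

def row_hash(row: dict) -> str:
    """Stable fingerprint over the row's sorted key/value pairs."""
    payload = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode()).hexdigest()

def diff_snapshots(previous: list[dict], current: list[dict], key: str):
    """Classify current rows as inserts/updates and detect deleted keys."""
    prev_hashes = {r[key]: row_hash(r) for r in previous}
    inserts, updates = [], []
    for row in current:
        old = prev_hashes.get(row[key])
        if old is None:
            inserts.append(row)
        elif old != row_hash(row):
            updates.append(row)
    current_keys = {r[key] for r in current}
    deletes = sorted(k for k in prev_hashes if k not in current_keys)
    return inserts, updates, deletes
```

Only rows whose fingerprint changed need to be written to the Data Vault, which keeps repeated full loads cheap downstream.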

Delta Loads

Based on a source column, Datavault Builder can filter the data before staging it.
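The idea is to push a watermark filter down to the source so only new or changed rows are downloaded. A minimal sketch, with illustrative table and column names (a production implementation would use bind parameters rather than string formatting):

```python
# Minimal sketch: building a delta-load query from a watermark column.
# Table/column names are hypothetical examples.

def delta_query(table: str, watermark_col: str, last_seen: str) -> str:
    """Build a source query that only selects rows newer than the last load."""
    return f"SELECT * FROM {table} WHERE {watermark_col} > '{last_seen}'"

print(delta_query("orders", "modified_at", "2024-01-31 23:59:59"))
# SELECT * FROM orders WHERE modified_at > '2024-01-31 23:59:59'
```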

CDC Loads

Datavault Builder can consume CDC streams from MSSQL Server, Qlik Replicate, Oracle GoldenGate, Kafka and others, and interprets the ingestion time order correctly.
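Why the time order matters can be sketched by replaying a stream of insert/update/delete events: applied in the wrong order, an old update would overwrite a newer one. The event shape below (`op`, `ts`, `key`, `row`) is a hypothetical simplification, not any specific tool's format:

```python
# Minimal sketch: replaying a CDC stream in ingestion-time order,
# where later timestamps win and deletes remove the key.

def apply_cdc(events: list[dict]) -> dict:
    """Replay insert (I), update (U) and delete (D) events into final state."""
    state: dict = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        if e["op"] in ("I", "U"):
            state[e["key"]] = e["row"]
        elif e["op"] == "D":
            state.pop(e["key"], None)
    return state

events = [
    {"op": "U", "ts": 2, "key": 1, "row": {"name": "Alice B."}},
    {"op": "I", "ts": 1, "key": 1, "row": {"name": "Alice"}},
    {"op": "I", "ts": 1, "key": 2, "row": {"name": "Bob"}},
    {"op": "D", "ts": 3, "key": 2, "row": None},
]
print(apply_cdc(events))  # {1: {'name': 'Alice B.'}}
```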

Near Real Time

Data can also be consumed from enterprise service buses like Kafka.
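For near-real-time staging, each message from the bus is decoded into a row as it arrives. A minimal sketch of the pure message handling; the consuming loop itself (shown only as a comment, assuming the kafka-python package's `KafkaConsumer`) is omitted:

```python
import json

# Minimal sketch: turning JSON-encoded messages from a bus like Kafka
# into staging rows. Topic and broker names are hypothetical.

def message_to_row(payload: bytes) -> dict:
    """Decode one JSON-encoded message into a staging row."""
    return json.loads(payload.decode("utf-8"))

# for message in KafkaConsumer("orders", bootstrap_servers="broker:9092"):
#     row = message_to_row(message.value)  # then write row to staging
print(message_to_row(b'{"order_id": 7, "status": "shipped"}'))
```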


Staging / Ingestion Features

| Module | Feature |
| --- | --- |
| Staging | Load from any JDBC source database, e.g. Amazon Aurora, Amazon RDS, Amazon Redshift, Apache Derby, Apache Spark, IBM DB2, EXASOL, eXist-db, Firebird, Google Cloud, Greenplum, H2, HSQLDB, BigQuery, Informix, Ingres, InterBase, JavaDB, MariaDB, MaxDB, Microsoft SQL Server, MySQL, Netezza, Oracle, ParAccel, PostgreSQL, PostgresPlus, Redshift, SAP HANA, SAS, SQLite, SQL Server, SingleStore, Sybase, Sybase IQ, Teradata, VectorWise, Vertica, Windows Azure |
| Staging | Load from NoSQL data sources such as Accumulo, Atop, Black Hole, Cassandra, ClickHouse, Delta Lake, Druid, Elasticsearch, Google Sheets, Hive, Hudi, Iceberg, JMX, Kafka, Kinesis, Kudu, MongoDB, Phoenix, Pinot, Prometheus, Redis |
| Staging | Stage and read metadata from SAP ERP and SAP BW via the Theobald connector |
| Staging | Connect to Python to use any Python module to source data from applications like Salesforce, Exact Online, Microsoft Dynamics 365 and similar |
| Staging | Connect to Enterprise Service Buses (ESB) like Kafka for near-real-time warehousing |
| Staging | Load from CSV, TSV and other delimiter-separated files |
| Staging | Load fixed-width files |
| Staging | Load from MS Access and MS Excel files |
| Staging | Load from REST services (JSON, XML) |
| Staging | Load from webpages |
| Staging | Load from big data formats like Parquet, Delta Lake and Iceberg |
| Staging | Preview actual source data |
| Staging | Perform full loads into staging and CDC into the Data Vault |
| Staging | Perform delta loads by dynamically filtering the source download |
| Staging | Accept CDC streams as a data source (e.g. Qlik Replicate, Oracle GoldenGate or MSSQL CDC) |
| Staging | Filter data before downloading |
| Staging | Write custom queries for sources |
| Staging | Convert data types while staging |
| Staging | Read metadata from the source system if present |
| Staging | Accept external timelines for bi-temporal and multi-temporal loads |