Staging / Ingestion Module

Build your own integrated data model by fetching data from various internal and external data sources.

Databases

Connect to all databases and other sources that provide a JDBC driver: Oracle, MSSQL, BigQuery, MySQL, Azure Synapse Analytics, DB2, Amazon Redshift, MariaDB, PostgreSQL, SAP HANA and many more.
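
As an illustration of what such a JDBC-based extract looks like in practice, here is a minimal Python sketch using the jaydebeapi library; the driver class, connection URL, credentials and query are placeholders and not part of Datavault Builder's configuration.

```python
# Minimal sketch of a JDBC-based extract from Python via jaydebeapi.
# Driver class, URL, credentials, jar path and query are placeholders.
import jaydebeapi

conn = jaydebeapi.connect(
    "org.postgresql.Driver",                      # JDBC driver class
    "jdbc:postgresql://source-host:5432/sales",   # JDBC URL of the source system
    ["reader", "secret"],                         # user / password
    "postgresql-42.7.3.jar",                      # path to the JDBC driver jar
)
try:
    cur = conn.cursor()
    cur.execute("SELECT customer_id, name, updated_at FROM customers")
    rows = cur.fetchall()                         # source rows as tuples
    cur.close()
finally:
    conn.close()
```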

Files

Connect to all common file formats: CSV, JSON, XML, TSV, fixed-width files, Parquet, Iceberg and, most importantly, Excel files 🙂

NoSQL

Connect to NoSQL sources such as Delta Lake, Elasticsearch, Hive, Kafka, Kinesis, MongoDB, Prometheus, Redis, Thrift and others.

Python Classes

Connect to online applications that provide a Python class for connection: Google Analytics, Salesforce, Exact Online and whatever else you find available as a Python class.
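
To give an idea of what a source exposed as a Python class can look like, here is a hypothetical connector sketch built on the simple_salesforce library; the class name, method name and returned fields are illustrative assumptions, not the interface Datavault Builder actually expects.

```python
# Hypothetical sketch of a Python connector class pulling data from Salesforce.
# Class name, method name and field mapping are illustrative assumptions.
from simple_salesforce import Salesforce

class SalesforceAccountsSource:
    def __init__(self, username, password, security_token):
        self.sf = Salesforce(
            username=username,
            password=password,
            security_token=security_token,
        )

    def fetch_rows(self):
        """Return source rows as a list of dicts, one per Salesforce Account."""
        result = self.sf.query("SELECT Id, Name, LastModifiedDate FROM Account")
        return [
            {"id": r["Id"], "name": r["Name"], "last_modified": r["LastModifiedDate"]}
            for r in result["records"]
        ]
```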

Compatible with other ingestion tools

Combine Datavault Builder with the ingestion tools built into databases such as Snowflake, Exasol or Microsoft Azure Synapse Analytics, or with third-party ETL tools like Informatica, Matillion or Talend.

Batch Loading

Connect to any data source out of the box and let Datavault Builder calculate which data has changed since the last load.
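
As a conceptual sketch of change detection after a full load (not Datavault Builder's actual algorithm), rows can be compared against the previous load via a hash per business key:

```python
# Conceptual illustration of change detection after a full load: compare a
# row hash per business key against the previous load. Field and key names
# are placeholders.
import hashlib

def row_hash(row: dict) -> str:
    payload = "|".join(str(row[k]) for k in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def detect_changes(previous: dict, current_rows: list, key: str):
    """previous maps business key -> row hash from the last load."""
    inserts, updates = [], []
    seen = set()
    for row in current_rows:
        k, h = row[key], row_hash(row)
        seen.add(k)
        if k not in previous:
            inserts.append(row)
        elif previous[k] != h:
            updates.append(row)
    deletes = [k for k in previous if k not in seen]
    return inserts, updates, deletes
```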

Delta Loads

Based on a source column, Datavault Builder can already filter the data before staging.
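
A minimal sketch of the idea, assuming a high-water-mark column such as updated_at; table and column names are placeholders, and the actual filter is generated by the tool:

```python
# Sketch of a delta extract driven by a source column (a high-water mark).
# Table and column names are placeholders; only rows changed since the last
# staging run are downloaded.
def build_delta_query(table: str, delta_column: str, last_loaded_value: str) -> str:
    return (
        f"SELECT * FROM {table} "
        f"WHERE {delta_column} > '{last_loaded_value}'"
    )

query = build_delta_query("orders", "updated_at", "2024-05-01 00:00:00")
```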

CDC Loads

Datavault Builder can consume CDC streams from MSSQL Server, Qlik Replicate, Oracle GoldenGate, Kafka and others and interpret the ingestion time order correctly.
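
The following simplified Python sketch shows what interpreting CDC events in ingestion order means: events are applied per key in sequence order, so the latest operation wins. The event fields are assumptions, not a specific CDC tool's format.

```python
# Simplified illustration of applying CDC events in ingestion order.
# Events carry a change sequence (e.g. an LSN or commit timestamp); sorting
# by it ensures the latest operation per key wins. Field names are assumed.
def apply_cdc_events(events: list) -> dict:
    """events: dicts with 'key', 'sequence', 'op' ('I'/'U'/'D') and 'data'."""
    state = {}
    for event in sorted(events, key=lambda e: e["sequence"]):
        if event["op"] == "D":
            state.pop(event["key"], None)
        else:  # insert or update
            state[event["key"]] = event["data"]
    return state
```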

Near Real Time

Data can also be consumed from enterprise service buses like Kafka.
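
For illustration, here is a minimal consumer sketch using the kafka-python library; the topic name, broker address and message fields are placeholders, and Datavault Builder's own Kafka integration is configured in the tool rather than hand-coded like this.

```python
# Sketch of near-real-time consumption from a Kafka topic with kafka-python.
# Topic, broker and message fields are placeholders.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "customer-events",
    bootstrap_servers="broker:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value            # one source change as a dict
    # hand the record over to staging here
    print(record.get("customer_id"), record.get("event_type"))
```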

Request your complimentary personal presentation today.

Staging / Ingestion Features

| Module | Feature | Supported |
| --- | --- | --- |
| Staging | Load from any JDBC source database, such as Amazon Aurora, Amazon RDS, Amazon Redshift, Apache Derby, Apache Spark, IBM DB2, Exasol, eXist-db, Firebird, Google Cloud, Greenplum, H2, HSQLDB, BigQuery, Informix, Ingres, InterBase, JavaDB, MariaDB, MaxDB, Microsoft SQL Server, MySQL, Netezza, Oracle, ParAccel, PostgreSQL, Postgres Plus, Redshift, SAP HANA, SAS, SQLite, SQL Server, SingleStore, Sybase, Sybase IQ, Teradata, VectorWise, Vertica, Windows Azure | Yes |
| Staging | Load from NoSQL data sources such as Accumulo, Atop, Black Hole, Cassandra, ClickHouse, Delta Lake, Druid, Elasticsearch, Google Sheets, Hive, Hudi, Iceberg, JMX, Kafka, Kinesis, Kudu, MongoDB, Phoenix, Pinot, Prometheus, Redis | Yes |
| Staging | Stage and read metadata from SAP ERP and SAP BW via the Theobald connector | Yes |
| Staging | Connect to Python to use all Python modules to source data from applications like Salesforce, Exact Online, Microsoft Dynamics 365 and similar | Yes |
| Staging | Connect to Enterprise Service Buses (ESB) like Kafka for near-real-time warehousing | Yes |
| Staging | Load from CSV, TSV and other delimiter-separated files | Yes |
| Staging | Load fixed-width files | Yes |
| Staging | Load from MS Access and MS Excel files | Yes |
| Staging | Load from REST services (JSON, XML) | Yes |
| Staging | Load from webpages | Yes |
| Staging | Load from big data formats like Parquet, Delta Lake and Iceberg | Yes |
| Staging | Preview actual source data | Yes |
| Staging | Perform full loads into staging and CDC into the Data Vault | Yes |
| Staging | Perform delta loads by dynamically filtering the source download | Yes |
| Staging | Accept CDC streams as data source (like Qlik Replicate, Oracle GoldenGate or MSSQL CDC) | Yes |
| Staging | Filter data before downloading | Yes |
| Staging | Write custom queries for sources | Yes |
| Staging | Convert data types while staging | Yes |
| Staging | Read metadata from the source system if present | Yes |
| Staging | Accept external timelines for bi-temporal and multi-temporal loads | Yes |
