Post-Job (ELT)


Post-Job (ELT)


What is ELT?

ELT (Extract, Load, Transform) is a method of data extraction where modifications to the data, such as Transformations, take place after the loading process. The vastly greater storage and scaling capabilities of modern cloud data warehouses allow data to be moved in its entirety and then modified after the fact. The primary difference between ETL and ELT is the order in which the steps take place.

ELT Transformations allow you to rename columns, change data values by applying SQL functions, and map them to destination columns. You can also add new values that are computed directly within the Sync data pipeline.

ELT provides a number of benefits:

  • Performance: The larger storage limits of modern databases allow ELT to move the entirety of updated data before implementing changes. Because the data doesn’t need to be cleaned or modified ahead of time, the speed and efficiency of the Transformation is greatly increased when compared to traditional ETL methods.
  • Simplicity: In situations where you need to move a large amount of data, ELT streamlines and simplifies the Transformation process.

Configuring Transformations

To create and configure a Transformation, begin by clicking Transformations from the top menu bar in CData Sync.

Nav Bar

This will take you to the Transformations screen. Here you can see a list of previously-configured Transformations, edit and delete Transformations, run queries, and add new Transformations.

To add a new Transformation, click the Add Transformation button. This will open the Create New Transformation modal. Here you can enter a Transformation name and select an associated Connection. Once you’ve named your Transformation and selected a Connection, click Create to continue.

Add Transformations

Transformation Detail Screen

The Transformation Detail screen is divided into two sections: Connections and Job Settings.

Transformation Settings

The Job Settings section is divided into several components:

Task

The Task section lets you manage the queries that execute as part of your Transformation(s). Click Configure to enter your Transformation(s). You can enter multiple statements by separating with a ‘;’. When you enter multiple statements, the statements are parsed into separate tasks. These tasks will be run sequentially, and logs will be written for each statement that is executed.

Schedule

The Schedule section lets you set Transformations to run on a schedule. See Scheduling Jobs for more information.

Notifications

Transformations can be configured to send Email notifications on completion of a run. For more information on setting up Email notifications and configuring your mail server, See Notifications.

Logging and History

The Logging section allows you to monitor current and review the previous Transformation runs. See Logging and History for more information regarding what information each level of Verbosity contains and how to retrieve your log files.

Sample queries for all databases

Deduplicate Records

CREATE OR REPLACE TABLE dbo.Accounts_Deduplicated as (
  SELECT *, row_number() over (partition by id order by modified desc) as duplicate_rank 
  FROM dbo.Accounts
);	
DELETE FROM dbo.Accounts_Deduplicated	WHERE duplicate_rank > 1;