In-Flight (ETL)
Version 24.2.9064
Version 24.2.9064
In-Flight (ETL)
Extract, Transform, Load (ETL) is a method of data extraction in which data is moved from a source and transformed in-flight (midstream) before delivery to its destination.
Differences between ETL and ELT
Traditionally, ETL has been the most popular way to move and transform data. However, with the rise of cloud computing and the greatly expanded storage capacity of databases, ETL is being overtaken gradually by Extra, Load, Transform (ELT) as the preferred method of data delivery and transformation.
The primary difference between the ETL and ELT is the order in which the steps take place. However, ELT also has other benefits:
-
Performance: The larger storage limits of modern databases enable ELT to move the entirety of updated data before implementing changes. Because the data does not need to be cleaned or modified ahead of time, the speed and efficiency of the transformation is greatly increased when compared to traditional ETL methods.
-
Simplicity: In situations where you need to move a large amount of data, ELT streamlines and simplifies the transformation process.
There are still cases where you might want to modify data before pushing it to the destination. In such cases, CData Sync makes it easy to perform the transformations.
Renaming Tables
In moving data from the source, Sync automatically creates a table in the destination. Alternatively, you can map to an existing table by clicking the table name on the Task tab, selecting Map to Existing Table from the drop-down menu, and choosing the destination table. Sync attempts to map the columns automatically via same names. You can then navigate into the Column Mapping section and make any adjustments.
Mapping Columns
The Columns tab provides control over which data to retrieve from your source table and how you want that data to be represented in the target table. From here, you can remove or adjust column mappings, rename destination columns, apply a transformation on the source, and more, as explained in the following sections.
Renaming Columns
To rename a column:
-
Select a task. Then select … > Edit.
-
Click the Columns tab. Then, click Edit Mapping.
-
Select the column row that you want to edit. Then select … > Edit Destination.
-
Edit the column names in the Edit Destination Column dialog box and click Done.
Applying SQL Transformations
To apply an SQL transformation:
-
Select a task. Then select … > Edit.
-
Click the Columns tab. Then, click Edit Mapping.
-
Select the column row that you want. Then, select … > Apply Transformation to open the Transform Source dialog box. In that dialog box, the column that you selected is listed in the text box under Apply a SQL Transformation to your source data.
-
Click the Apply SQL Transformation button to choose from the following transformations:
-
MASK() - Masks characters in the specified column by replacing each character with an asterisk. For more details about masking, see Masking for Personally Identifiable Data.
-
TRIM() - Removes leading and trailing spaces or a specified string of characters from a string.
-
ISNULL() - Returns a specified value if the expression is NULL. Otherwise, the expression is returned.
-
UPPER() - Converts the text to uppercase characters.
-
LOWER() - Converts the text to lowercase characters.
The following example illustrates the application of the MASK() transformation once you select it from the list.
Note: You can create other transformations manually as well by entering an SQL function around your column. For example, the following image shows the application of the LEN() function to the name column:
-
-
Click Done to save your selection.
Adding Columns
To add a column to your destination table:
-
Select a column row and click the edit icon at the end of the row.
-
Click the Columns tab. Then, click Edit Mapping.
-
Click the add icon (+) to open the Add Column dialog box
-
Add column names for your source and destination. Then click Add Column.
Removing Columns
To remove a column, clear the checkbox to the left of the row that you want to remove. That row becomes inactive, which indicates that the column is removed from your query and will not be replicated.