SQL Transformations
Version 23.4.8843
Version 23.4.8843
SQL Transformations
An SQL transformation processes SQL queries midstream (ETL) in a pipeline. This type of transformation processes external SQL scripts or SQL queries that you create in an SQL editor. You can use these queries to insert, delete, update, and retrieve rows from a database. The SQL transformation processes the queries, returning rows and database errors.
Creating an SQL Transformation
To add an SQL transformation:
-
Open Sync and navigate to the Transformations page.
-
Select the job on which you want to run a transformation. Then, click Add Transformations.
-
Fill out the fields in the Add Transformation dialog box, as follows:
-
Name: Enter a name for your transformation.
-
Type: Select SQL as the transformation type.
-
Destination: Select a destination on which to run the transformation.
-
-
Click Add Transformation.
After you create your transformation, you can manage your transformation from the options menu (…).
Understanding the Transformations Page
The Transformations page contains a table that lists the details of your transformations including the destination used, whether the transformation is scheduled, the last run status, and the next run. You can also filter the table by transformation type, as shown below.
To use bulk operations on your job (Run, Disable Schedule, Enable Schedule, and Deselect), select the checkbox next to one or more jobs.
You can manage each transformations from the options menu (…). To see that menu, position your cursor at the end of the transformation row. The following options are available from that menu:
-
Edit - Enables you to edit your transformation settings, including modifying general settings, setting notification alerts, scheduling the transformation to run, and viewing the history for a transformation.
-
Download Log File - Downloads the log file for your last transformation run.
-
Run Details - Opens a dialog box that displays information about your transformation run. The dialog box contains two tabs of information:
-
Details: This tab displays the run date, the run status, the amount of time it took to run the transformation, and the number of rows affected.
-
Logs: This tab lists the logs that are available for your transformation. You can download individual logs here by clicking the download icon to right of the log name. You can also download all logs by clicking Download All Log Files.
-
-
Remove - Permanently deletes the transformation.
Specifying Transformation Settings
To modify transformation settings, select … > Edit at the end of the transformation row. The settings page contains two tabs from which you can manage settings:
-
Overview: This tab provides Settings, Notifications, and Triggers categories where you can modify those setting types.
-
Queries: This tab enables you to add and manage queries that execute as part of your transformations.
-
Transformations History: This tab provides details about your transformation job run. Details include Run Date, Status, Records Affected, and Run Time. You can also filter your report to show transformations by status (All Status, SUCCESS, FAILED, or ERROR). In addition, you can export the history by clicking Export History.
Overview Tab
The Overview tab enables you to manage general settings, triggers, and notifications. You can also see details from the last run of your transformation on this page.
Settings Category
The Settings category enables you to set logfile verbosity and command timeout, as follows.
-
Select … > Edit at the end of the transformation row. This action opens the Edit Settings dialog box.
-
Select a value for Logfile Verbosity.
-
Enter a timeout value in the Command Timeout field.
-
Click Save.
Trigger Category
The Triggers category enables you to set a specific schedule by which to run your transformation, as follows.
-
Click Configure in the Triggers category. This action opens the Trigger dialog box.
-
Specify the type of trigger that you want. You can select from the following types:
-
None - Turns off scheduling options, allowing you to start a transformation manually.
-
Scheduled - Enables you to set a specific schedule by which to run your transformation. This setting also activates the Intervals and Minutes Past The Hour fields. You can choose more granular settings for the Intervals setting:
-
Minute - Activates the Every option where you can specify increments in minutes (every 10, 15, 20, or 30 minutes) in which to run your transformation.
-
Hourly - Activates a Minutes Past The Hour field so you can set your transformation to run hourly at specific numbers of minutes past the hour (0-59). For example, you might want to run the transformation every hour at 15 minutes past the hour.
-
Daily - Activates a Time of Day field so you can set the specific time each day that you want your transformation to run.
-
Weekly - Activates two fields (Days and Time of Day) where you can specify a specific day of the week and time to run your transformation.
-
Monthly - Activates Day and Time fields. However, for Day, you choose a specific day of each month (values of 1 through 31 or Last Day) along with the time.
-
Cron - Activates a Cron Statement field where you can write a cron job to initiate a transformation run.
The Scheduled type is useful when you run multiple jobs. You cannot use the After Job option for multiple jobs because you cannot specify after which job to run the transformation.
-
-
After Job - Activates a Job drop-down list that presents a list of all available jobs. You can choose the specific job after which you want the transformation to run.
You can also specify various tasks to run on your data by clicking Select Specific Tasks under the Job field. Simply select the task or tasks that you want to run.
-
-
Click Save.
Note: To disable schedule triggers, slide the Trigger toggle switch to the left (the switch turns gray when it is disabled). The toggle switch appears to the right of Trigger in the category box.
Notifications Category
The Notifications category enables you to set email notifications for transformations, as follows.
-
Select … > Edit at the end of the transformation row for which you want to set notifications.
-
Click Configure in the Notifications category.
-
Set the following options in the Notifications dialog box:
-
To - The email to which to send notifications.
-
Subject - The subject for notification emails.
-
Only send notifications on errors - An option that you can select to specify that notification emails should be sent only when the job encounters an error. If you do not select this option, Sync sends notifications after each job run.
-
-
Click Save.
Note: To disable notifications, slide the Notifications toggle switch to the left (the switch turns gray when it is disabled). The toggle switch appears to the right of Notifications in the category box.
Queries Tab
This tab enables you to add and manage queries that execute as part of your transformations.
Adding a Query
To add a query:
-
Click Add Query to open the Add Query dialog box.
-
Enter your query in the text box. If you have multiple statements, make sure that you separate each statement with a semicolon.
-
Click Add Query. If the addition of the query is successful, Sync displays a message saying it is successful at the top of the Queries page.
If you want to run your query, click Run at the top of the Queries page.
After you add and run queries, the Queries page displays a table of the queries as well as the LAST RUN date.
Editing a Query
To edit a query:
-
Select … > Edit at the end of the query row.
-
Modify your query, as necessary, in the Edit Query dialog box.
-
Click Save.
Transformations History Tab
This tab provides details about your transformation job run. Details include Run Date, Status, Records Affected, and Run Time. You can also filter your report to show transformations by status (All Status, SUCCESS, FAILED, or ERROR). In addition, you can export the history by clicking Export History.
Sample Query for All Databases
The following SQL query deduplicates records:
CREATE OR REPLACE TABLE dbo.Accounts_Deduplicated as (
SELECT *, row_number() over (partition by id order by modified desc) as duplicate_rank
FROM dbo.Accounts
);
DELETE FROM dbo.Accounts_Deduplicated WHERE duplicate_rank > 1;