Using the Source Component
After Establishing a Connection to the data source, you can use the CData Databricks source component to pull data into your Data Flow task.
Querying Databricks Data with the Source Component
Follow the procedure below to connect to Databricks, retrieve data, and provide data to other components in the workflow.- In the SSIS Toolbox, drag the CData Databricks source component into the Data Flow task.
- Double-click the CData Databricks source component. The CData Databricks Source Editor will display.
- In the Connection Managers menu, select an available CData Databricks connection manager, or create a new connection manager with the New... button if one is not already available.
- Choose your access mode in the Data access mode dropdown menu. Select Table or View to use the GUI to select a table or view. Select SQL command to author a custom SQL statement in the SQL command text box.
- Select the Columns tab and rename any output columns as desired.
See SELECT Statements for the supported SQL syntax and examples. The process for calling stored procedures is detailed in Calling Stored Procedures.
Using Parameterized Queries
Parameterized statements provide an efficient way to execute queries dynamically and mitigate SQL injection attacks. When using a custom SQL Command, the Source Component provides a Parameters button that launches the Set Query Parameters dialog. This interface contains the following columns:- Parameters: An alias for an in-query parameter in the custom SQL command box. An entry is generated for each positional parameter (represented by a ?) and named parameter (in the format @ParameterName) present in the query.
- ColumnGlobalVariables: The system-level or package-level parameter you would like to map the in-query parameter to. The in-query parameter will resolve to the value of the mapped parameter.
In order to use the Parameters option, your query must contain parameters, which can be either defined as a positional parameter or as a named parameter:
- Positional parameters: When setting up the parameter mapping in the Set Query Parameters dialog, the names in the Parameters column must be the index (starting from 1) of that parameter in the query. Consider this example query:
SELECT * FROM Table WHERE FirstName = ? AND Date > ?
Here, the names in the Parameters column must be set to '1' and '2' for 'FirstName' and 'Date' respectively.
- Named parameters: When setting up the parameter mapping in the Set Query Parameters dialog, the names in the Parameters column must exactly match the names of the parameters in the query, without the preceding '@' symbol. Consider this example query:
SELECT * FROM Table WHERE FirstName = @FirstName AND Date > @Date
Here, the names in the Parameters column must be set to 'FirstName' and 'Date'.
Building Dynamic Queries in the Expression Builder
After configuring a source component, you can then use the SSIS Expression Builder to access the SQL statement that the source component executes at run time.
The component builds and executes these dynamically defined queries at runtime.
- In SSIS Designer, click the Control Flow tab.
- In the Properties pane, click in the value box for the Expressions property and click the '...' button that appears in the box. The Property Expressions Editor window appears.
- In the resulting Property Expressions Editor window, click an empty row in the Property box and select the SQLStatement property of the CData Databricks Source component in the drop-down menu.
- Click the "..." button at the end of the row where you just selected the SQLStatement property. The Expression Builder window appears.
- In the Expression box, you can create new SQL commands that use the variables available at run time as input parameters. Ensure that you enclose the expression in quotes. For example:
"SELECT * FROM Table WHERE FirstName = '" + @[User::Name] + "' AND Date > '" + (DT_WSTR, 50) DATEADD("day", -30, GETDATE()) + "'"