Using the Source Component
After Establishing a Connection to the data source, you can use the CData Apache Hive source component to pull data into your Data Flow task.
Querying Apache Hive Data with the Source ComponentFollow the procedure below to connect to Apache Hive, retrieve data, and provide data to other components in the workflow.
- In the SSIS Toolbox, drag the CData Apache Hive source component into the Data Flow task.
- Double-click the CData Apache Hive source component. The CData Apache Hive Source Editor will display.
- In the Connection Managers menu, select an available CData Apache Hive connection manager, or create a new instance if one is not already available.
- Choose your Access Mode: "Table or View" or "SQL Statement". Select "Table or View" to use the GUI to select a table or view. Select "SQL Statement" to configure a statement of your choice.
- Select the Columns tab and rename any output columns as desired.
Using Parameterized QueriesParameterized statements provide an efficient way to execute queries dynamically and mitigate SQL injection attacks. The Source Component provides a Parameters button that can be used to map parameters defined in the query to variables in the data flow when using a custom SQL Command. The component will execute these queries as parameterized statements at runtime.
In order to use the Parameters option, your query must contain parameters, which can be either defined positionally as a named prarameter:
- Positional parameters: When setting up the parameter mapping, the names in the Parameter list must be the index (starting from 1) of that parameter in the query. Example query:
SELECT * FROM Table WHERE FirstName = ? AND Date > ?
Here, the Parameter names must be set to '1' and '2' for 'FirstName' and 'Date' respectively.
- Named parameters: When setting up the parameter mapping, the names in the Parameter list must exactly match the names of the parameters in the query without the preceding '@' symbol. Example query:
SELECT * FROM Table WHERE FirstName = @FirstName AND Date > @Date
Here, the Parameter names must be set to 'FirstName' and 'Date'.
Building Dynamic Queries in the Expression Builder
After configuring a source component, you can then use the SSIS Expression Builder to access the SQL statement that the source component executes at run time.
The component will build and execute these dynamically defined queries at runtime.
- In SSIS Designer, click the Control Flow tab.
- In the Properties pane, click the button in the box for the Expressions property.
- In the resulting Property Expressions Editor, click an empty row in the Property box and select the SQLStatement property of the CData Apache Hive source component from the drop-down menu. Then click the button in the row you just added. This displays the Expression Builder.
- In the Expression box, you can create new SQL commands that use the variables available at run time as input parameters. Ensure that you enclose the expression in quotes. For example:
"SELECT * FROM Table WHERE FirstName = '" + @[User::Name] + "' AND Date > '" + (DT_WSTR, 50) DATEADD("day", -30, GETDATE()) + "'"