Snowflake Connector Setup
Snowflake Connector Setup
The Snowflake connector allows you to integrate Snowflake into your data flow by pushing or pulling data from Snowflake. Follow the steps below to connect CData Arc to Snowflake.
Watch this short video for an overview of how you can integrate Snowflake into your B2B automations using Arc.
Establish a Connection
To allow Arc to use data from Snowflake, you must first establish a connection to Snowflake. There are two ways to establish this connection:
- Add a Snowflake connector to your flow. Then, in the settings pane, click Create next to the Connection drop-down list.
- Open the Arc Settings page, then open the Connections tab. Click Add, select Snowflake, and click Next.
- The login process is required only the first time that the connection is created.
- Connections to Snowflake can be re-used across multiple Snowflake connectors.
Enter Connection Settings
After you open a new connection dialogue, follow these steps:
- Enter the connection information:
- Name—the static name of the connection. Set this as desired.
- Type—this is always set to Snowflake.
- Auth Scheme—the authentication scheme to use for the connection. Options are Password, OAuth, Okta, PrivateKey, Azure AD, and PingFederate.
- Create the connection according to your selected Auth scheme:
- Password-enter the password for your account.
- OAuth—enter the OAuth client Id and OAuth client secret for your account.
- Okta-enter the username and password for your account.
- PrivateKey-enter the private key and the private-key type for your account.
- Azure AD-enter the username for your account.
- PingFederate-enter the username for your account.
If necessary, click Advanced to open the drop-down menu of advanced connection settings. You should not need these settings in most cases.
Click Test Connection to ensure that Arc can connect to Snowflake with the provided information. If an error occurs, check all fields and try again.
Click Add Connection to finalize the connection.
In the Connection drop-down list of the connector configuration pane, select the newly-created connection.
- Click Save Changes.
Note: There are also data source-specific authentication and configuration options on the Advanced tab. They are not all described in this documentation, but you can find detailed information for your data source on the Online Help Files page of the CData website.
Select an Action
After establishing a connection to Snowflake, you must choose the action that the Snowflake connector will perform. The table below outlines each action and where it belongs in an Arc flow.
|Action||Description||Position in Flow|
|Upsert||Inserts or updates Snowflake data. By default, if a record already exists in Snowflake, an update is performed on the existing data in Snowflake using the values provided from the input.||End|
|Lookup||Retrieves a value from Snowflake and inserts that value into an already-existing Arc message in the flow.
The Lookup Query determines what value the connector will retrieve from Snowflake. It should be formatted as a SQL query against the Snowflake tables.
|Select||Retrieves data from Snowflake and brings it into Arc.
You can use the Filter panel to add filters to the Select. These filters function similarly to WHERE clauses in SQL.
|Execute Stored Procedures||Treats data coming into the connector as input for a stored procedure, and then passes the result down the flow.
You can click the Show Sample Data button to provide sample inputs to the selected Stored Procedure and preview the results.
Settings related to the automatic processing of files by the connector.
- Send Whether files arriving at the connector are automatically sent.
- Retry Interval The number of minutes before a failed send is retried.
- Max Attempts The maximum number of times the connector processes the file. Success is measured based on a successful server acknowledgement. If you set this to 0, the connector retries the file indefinitely.
- Receive Whether the connector should automatically query the data source.
- Receive Interval The interval between automatic query attempts.
- Minutes Past the Hour The minutes offset for an hourly schedule. Only applicable when the interval setting above is set to Hourly. For example, if this value is set to 5, the automation service downloads at 1:05, 2:05, 3:05, etc.
- Time The time of day that the attempt should occur. Only applicable when the interval setting above is set to Daily, Weekly, or Monthly.
- Day The day on which the attempt should occur. Only applicable when the interval setting above is set to Weekly or Monthly.
- Minutes The number of minutes to wait before attempting the download. Only applicable when the interval setting above is set to Minute.
- Cron Expression A five-position string representing a cron expression that determines when the attempt should occur. Only applicable when the interval setting above is set to Advanced.
Settings related to the allocation of resources to the connector.
- Max Workers The maximum number of worker threads consumed from the threadpool to process files on this connector. If set, this overrides the default setting on the Settings > Automation page.
- Max Files The maximum number of files sent by each thread assigned to the connector. If set, this overrides the default setting on the Settings > Automation page.
Many of the settings on the Advanced tab are dynamically loaded from the data source you are working with, so they are not covered in this documentation. You can find detailed information for your data source on the Online Help Files page of the CData website.
The options described below are available for all data sources.
- Save to Sent Folder Check this to copy files processed by the connector to the Sent folder for the connector.
- Sent Folder Scheme Instructs the connector to group messages in the Sent folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all messages for the week in that folder. The blank setting tells the connector to save all messages directly in the Sent folder. For connectors that process many messages, using subfolders helps keep messsages organized and improves performance.
Advanced settings are common settings for connectors that rely on arbitrary database drivers (for example, Database, CData, and API connectors) in order to make connections to various data sources.
- Auto Truncate When enabled, the connector automatically truncates any string or binary column values that are longer than the allowed limit.
- Command Timeout The command execution timeout duration in seconds.
- Flat Mapping Scheme Governs which Input Mapping is used to process incoming flat files based on the filename of the document. Glob matching is supported, and you can define multiple mappings by separating each mapping with a semicolon. For example, setting this field to ‘INVOICE=Invoice;ORDER=Order;’ causes the connector to use the Invoice Input Mapping to process any files with ‘INVOICE’ in the filename, and files with ‘ORDER’ in the name are processed using the Order Input Mapping. Comparisons are case-insensitive.
- Last Inserted Id Query Provide a query to execute to retrieve the auto-increased Id for the last inserted record.
- Log Level The verbosity of logs generated by the connector. When you request support, set this to Debug.
- Local File Scheme A filemask for determining local file names as they are downloaded by the connector. You can use the following macros to reference contextual information:
%ConnectorId%, %Filename%, %FilenameNoExt%, %Ext%, %ShortDate%, %LongDate%, %RegexFilename:%, %DateFormat:%.
For example: %FilenameNoExt%_%ShortDate%%Ext%
- Log Messages When checked, the connector keeps a copy of the message next to the logs for the message in the Logs directory. If you disable this, you might not be able to download a copy of the file from the Input or Output tabs.
- Max Failed Records The maximum number of records that are allowed to fail during insertion to allow processing to continue. The default value of 0 means that any errors cause the input message to be marked as an Error, and any uncommitted transactions are rolled back. A value of -1 means that all errors are ignored and the connector continues attempting to insert subsequent records. A positive value means that the connector continues attempting to insert records until the threshold is reached.
- Output File Format The format in which output data is represented. The default value (XML) causes the connector to output an XML file for every record that is processed, while optionally combining multiple records into a single data structure (depending on the value of Max Records). The CSV and TSV options output data in their respective file formats. This option is not available for complex table structures that include child tables.
- Process Changes Interval Unit Applicable to Select action only. When Use column columnname for processing new or changed records is checked in the Advanced portion of the Select Configuration section of the Settings tab, this controls how to interpret the Process Changes Interval setting (for example, Hours, Days, or Weeks). See Only Process New or Changed Records for details on this and the following two settings.
- Process Changes Interval When Use column columnname for processing new or changed records is checked on the Select Configuration section of the Settings tab, this controls how much historical data Arc attempts to process on the first attempt. For example, keeping the default of 180 (Days) means Arc only attempts to process data that has been created or modified in the last 180 days.
- Reset History Resets the cache stored when Use column columnname for processing new or changed records is checked.
- Batch Input Size The maximum number of queries in a batch if batches are supported by the data source.
- Batch Size The maximum number of batch messages in a batch group.
- Max Records The maximum number of records to include in a single output message. Use -1 to indicate that all output records should be put in a single file, and use 0 to indicate that the connector can decide based on the configured Output File Format. By default, XML outputs one record per file, and flat file formats include all records in one file.
- Transaction Size The maximum number of queries in a transaction.
- Parent Connector When set to the Connector Id of an existing connector of the same type, the settings from that connector are used, except where they are overridden in the existing port configuration. Settings that are not explicitly overridden are highlighted and are automatically updated with changes to the parent connector.
- Processing Delay The amount of time (in seconds) by which the processing of files placed in the Input folder is delayed. This is a legacy setting. Best practice is to use a File connector to manage local file systems instead of this setting.
- Lookup Caching Applicable to Lookup action only. Check this to have queries performed against a local cache before they are performed directly against the data source (if no records are returned from the cache). If left unchecked, each query is performed directly against the data source. See Cache for details on this and the following five settings.
- Lookup Cache Provider Choose which driver to use to facilitate the connection to the cache database when caching is enabled.
- Lookup Cache Connection String When set, the connector uses the database specified here for caching.
- Lookup Cache Tolerance Controls how often the connector issues a query to the data source to refresh the cache.
- Lookup Cache Tolerance Unit The Lookup Cache Tolerance unit. Choose Minute, Hour, or Day.
- Reset Lookup Cache Resets the cache.
- Log Subfolder Scheme Instructs the connector to group files in the
Logsfolder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the
Logsfolder. For connectors that process many transactions, using subfolders can help keep logs organized and improve performance.
Miscellaneous settings are for specific use cases.
- Other Settings Enables you to configure hidden connector settings in a semicolon-separated list (for example,
setting1=value1;setting2=value2). Normal connector use cases and functionality should not require the use of these settings.