File Connector
Version 24.2.9039
The File connector can:
- Pull files from external directories into the CData Arc flow.
- Push files from the Arc flow into external directories.
Overview
Each File connector is configured with a path on disk that determines what folder it reads files from or writes files to. This can be a local file path or a UNC path to access other file locations on the network.
When a File connector receives a file from the configured path, it passes the received file off to the next connector in the Arc flow without modifying the file. In this way, File connectors can be used to poll an external folder location for files to pull into the Arc flow.
When a file in the Arc flow is processed by a File connector, that file is written to the folder at the configured path. In this way, File connectors can be used to drop off processed files at a location external to the application.
File connectors can be configured with username/password credentials to provide access to external filepaths that otherwise would not be accessible to the application. This allows Arc to pull files from protected folder locations without needing to grant the entire application access to the protected location.
The path configured in the File connector can be dynamically populated using macros. For more information, see the Macros section.
Connector Configuration
This section contains all of the configurable connector properties.
Settings Tab
Connector Settings
Settings related to the core operation of the connector.
- Connector Id The static, unique identifier for the connector.
- Connector Type Displays the connector name and a description of what it does.
- Connector Description An optional field to provide a free-form description of the connector and its role in the flow.
- Path The external filepath where the connector pulls files from or pushes files to. The path can include macros for dynamic evaluation, as described in Macros.
Receive
Settings related to pulling files from the external filepath.
- File Mask A glob pattern for filtering files that should be pulled from the configured path. Only files matching the filemask are pulled. This setting can be combined with Receive Filter on the Advanced tab to specify multiple filters.
- Delete files (after received) Whether to remove files from the external path after they are pulled into the Arc flow.
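As a rough illustration of how a glob-style File Mask narrows the set of files that are pulled, here is a minimal Python sketch. The helper name `matches_file_mask` and the sample filenames are hypothetical, and the case-insensitive comparison is an assumption made for the example, not a statement about how Arc evaluates masks.

```python
from fnmatch import fnmatchcase


def matches_file_mask(filename: str, file_mask: str) -> bool:
    """Return True if filename matches the glob-style mask (hypothetical helper)."""
    # Assumption: compare case-insensitively by lowercasing both sides.
    return fnmatchcase(filename.lower(), file_mask.lower())


# Hypothetical directory listing and mask.
candidates = ["orders_001.xml", "orders_002.XML", "readme.txt"]
pulled = [name for name in candidates if matches_file_mask(name, "orders_*.xml")]
print(pulled)
```

With the mask `orders_*.xml`, only the two `orders_` files are selected and `readme.txt` is left in place.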
Authentication
- Username The Windows user account that is used to access the files. If this is not specified, the connector uses the current user account.
- Password The password for the specified user account.
- Domain The domain for the Windows user account. This can be empty if the specified user is a local user account.
Caching
Settings related to caching and comparing files between multiple downloads.
- File Size Comparison Check this to keep a record of downloaded file names and sizes. Previously downloaded files are skipped unless the file size is different than the last download.
- Timestamp Comparison Check this to keep a record of downloaded file names and last-modified timestamps. Previously downloaded files are skipped unless the timestamp is different than the last download.
Note: When you enable caching, the file names are case-insensitive. For example, the connector cannot distinguish between TEST.TXT and test.txt.
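The comparison logic described in this section can be sketched as follows. This is only an illustration under stated assumptions: the `DownloadCache` class, its method names, and the in-memory dictionary are hypothetical and do not reflect how Arc stores its cache.

```python
from typing import Dict, Tuple


class DownloadCache:
    """Hypothetical in-memory record of previously downloaded files."""

    def __init__(self, compare_size: bool, compare_timestamp: bool) -> None:
        self.compare_size = compare_size
        self.compare_timestamp = compare_timestamp
        self._seen: Dict[str, Tuple[int, float]] = {}

    def should_download(self, name: str, size: int, mtime: float) -> bool:
        # With both comparisons off, caching is disabled and every file is pulled.
        if not (self.compare_size or self.compare_timestamp):
            return True
        # Names are keyed case-insensitively, so TEST.TXT and test.txt collide.
        previous = self._seen.get(name.lower())
        if previous is None:
            return True
        prev_size, prev_mtime = previous
        if self.compare_size and size != prev_size:
            return True
        if self.compare_timestamp and mtime != prev_mtime:
            return True
        return False

    def record(self, name: str, size: int, mtime: float) -> None:
        self._seen[name.lower()] = (size, mtime)


cache = DownloadCache(compare_size=True, compare_timestamp=False)
cache.record("TEST.TXT", 10, 1000.0)
print(cache.should_download("test.txt", 10, 2000.0))  # same size, so skipped
print(cache.should_download("test.txt", 11, 2000.0))  # size changed, so pulled
```

Because only File Size Comparison is enabled in this usage, a changed timestamp alone does not trigger a new download.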
Automation Tab
Automation Settings
Settings related to the automatic processing of files by the connector.
- Send A toggle that instructs the connector to automatically send files when they are ready.
- Retry Interval The interval the connector waits before retrying a failed send.
- Max Attempts The number of attempts the connector makes to send the message. Setting this value to 1 instructs the connector to only make the initial send attempt without retrying. The connector waits the duration specified by Retry Interval between each attempt.
- Receive A toggle that instructs the connector to automatically process files when they are ready and send them to the Output tab.
- Receive Interval The interval at which the connector processes all pending files and sends them to the Output tab. The next field depends on the selection here:
  - Hourly: A Minutes Past the Hour dropdown menu allows you to specify the number of minutes past the hour to process receive files.
  - Daily: A Time field appears to specify the time of day (in UTC) to process receive files.
  - Weekly: Two fields appear. Day allows you to select the day of the week for processing, and Time allows you to specify the time (in UTC) to process receive files.
  - Monthly: Two fields appear. Day allows you to select the day of the month for processing, and Time allows you to specify the time (in UTC) to process receive files.
  - Minute: A Minutes field appears to specify the number of minutes between processing intervals.
  - Advanced: A five-position Cron Expression field allows you to specify exact processing intervals. Highlight the field in the connector for more information about these expressions.
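The interaction between Max Attempts and Retry Interval in the send settings above can be pictured with a small retry loop. This is a sketch only: `send_with_retry` and its parameters are hypothetical names, and the interval is expressed in seconds purely for the example.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], None],
    max_attempts: int,
    retry_interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() up to max_attempts times; return the attempt that succeeded."""
    attempts = max(1, max_attempts)  # a value of 1 means no retries
    for attempt in range(1, attempts + 1):
        try:
            send()
            return attempt
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(retry_interval)  # wait before the next attempt
    raise AssertionError("unreachable")


# Usage: a hypothetical send that fails twice before succeeding.
state = {"calls": 0}


def flaky_send() -> None:
    state["calls"] += 1
    if state["calls"] < 3:
        raise IOError("destination temporarily unavailable")


waits = []
print(send_with_retry(flaky_send, max_attempts=3, retry_interval=5, sleep=waits.append))
```

Passing `sleep` as a parameter keeps the example fast to run; a real caller would leave it at the default.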
Performance
Settings related to the allocation of resources to the connector.
- Max Workers The maximum number of worker threads consumed from the threadpool to process files on this connector. If set, this overrides the default setting on the Settings > Automation page.
- Max Files The maximum number of files sent by each thread assigned to the connector. If set, this overrides the default setting on the Settings > Automation page.
Alerts Tab
Settings related to configuring alerts and Service Level Agreements (SLAs).
Connector Email Settings
Before you can execute SLAs, you need to set up email alerts for notifications. Clicking Configure Alerts opens a new browser window to the Settings page where you can set up system-wide alerts. See Alerts for more information.
Service Level Agreement (SLA) Settings
SLAs enable you to configure the volume you expect connectors in your flow to send or receive, and to set the time frame in which you expect that volume to be met. CData Arc sends emails to warn the user when an SLA is not met, and marks the SLA as At Risk, which means that if the SLA is not met soon, it will be marked as Violated. This gives the user an opportunity to step in and determine the reasons the SLA is not being met, and to take appropriate actions. If the SLA is still not met at the end of the at-risk time period, the SLA is marked as violated, and the user is notified again.
To define an SLA, click Add Expected Volume Criteria.
- If your connector has separate send and receive actions, use the radio buttons to specify which direction the SLA pertains to.
- Set Expect at least to the minimum number of transactions (the volume) you expect to be processed, then use the Every fields to specify the time frame.
- By default, the SLA is in effect every day. To change that, uncheck Everyday then check the boxes for the days of the week you want.
- Use And set status to ‘At Risk’ to indicate when the SLA should be marked as at risk.
- By default, notifications are not sent until an SLA is in violation. To change that, check Send an ‘At Risk’ notification.
The following example shows an SLA configured for a connector that expects to receive 1000 files every day Monday-Friday. An at-risk notification is sent 1 hour before the end of the time period if the 1000 files have not been received.
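To make the status transitions concrete, the following Python sketch evaluates an SLA like the one in this example. The function `sla_status` and the labels "Met" and "Pending" are hypothetical; only "At Risk" and "Violated" come from the description above, and the sketch assumes the at-risk window is measured back from the end of the period.

```python
from datetime import datetime, timedelta


def sla_status(
    received: int,
    expected: int,
    period_start: datetime,
    period_length: timedelta,
    at_risk_before_end: timedelta,
    now: datetime,
) -> str:
    """Classify an SLA for one time period (illustrative logic only)."""
    if received >= expected:
        return "Met"
    period_end = period_start + period_length
    if now >= period_end:
        return "Violated"
    if now >= period_end - at_risk_before_end:
        return "At Risk"
    return "Pending"


# 1000 files expected per day, at-risk status 1 hour before the period ends.
start = datetime(2024, 1, 22, 0, 0)
day = timedelta(days=1)
hour = timedelta(hours=1)
print(sla_status(400, 1000, start, day, hour, datetime(2024, 1, 22, 12, 0)))
print(sla_status(400, 1000, start, day, hour, datetime(2024, 1, 22, 23, 30)))
print(sla_status(400, 1000, start, day, hour, datetime(2024, 1, 23, 0, 0)))
```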
Advanced Tab
Advanced Settings
Settings not included in the previous categories.
- Max Receive Files The maximum number of files that are pulled in a single receive interval. If this is not a positive integer, no limit is applied.
- Overwrite Option Specifies how the connector should handle the case when it attempts to write a file to the external path and that file already exists. The connector can rename the current file to a unique filename, overwrite the existing file, append the current file to the existing file, skip the operation without error, or stop the operation and throw an error.
- Processing Delay The amount of time (in seconds) by which the processing of files placed in the Input folder is delayed. This is a legacy setting. Best practice is to use a File connector to manage local file systems instead of this setting.
- Receiving Delay The amount of time (in seconds) that the connector waits before receiving files from the remote path.
- Recurse Set this to true to download files in all subfolders of the target remote path. The directories are preserved for the received files.
- Temp Send Path A directory to use as a staging folder before moving the file to its configured destination path.
- Temp Send Prefix If specified, the connector uploads the file with the temporary prefix, then renames the file to its original filename after the operation is complete.
- Temp Send Extension If specified, the connector uploads the file with the temporary extension, then renames the file to its original filename and extension after the operation is complete.
- Local File Scheme A scheme for assigning filenames to messages that are output by the connector. You can use macros in your filenames dynamically to include information such as identifiers and timestamps. For more information, see Macros.
- Receive Filter A glob pattern filter to determine which files should be downloaded from the remote storage (e.g. *.txt). Use negative patterns to indicate files that should not be downloaded (for example, -*.tmp). Use this setting when you need multiple File Mask patterns. Separate multiple patterns by commas; later filters take priority except when an exact match is found.
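The Temp Send Extension behavior described above follows a common write-then-rename pattern, which keeps downstream consumers from picking up a partially written file. A minimal Python sketch is shown here; the function name is hypothetical, and appending the temporary extension after the original filename is an assumption, since the exact temporary name Arc uses is not specified in this section.

```python
import os
import shutil


def send_with_temp_extension(source_path: str, dest_dir: str, temp_extension: str = ".tmp") -> str:
    """Copy a file under a temporary name, then rename it to its final name."""
    final_path = os.path.join(dest_dir, os.path.basename(source_path))
    temp_path = final_path + temp_extension  # assumption: extension is appended
    shutil.copyfile(source_path, temp_path)
    # os.replace overwrites an existing destination file; a real implementation
    # would first apply whatever overwrite policy is configured.
    os.replace(temp_path, final_path)
    return final_path
```

A consumer that filters on the final extension (for example, `*.xml`) never sees the file while it is still being written as `invoice.xml.tmp`.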
Message
Message settings determine how the connector searches for messages and manages them after processing. You can save messages to your Sent folder or you can group them based on a Sent folder scheme, as described below.
- Save to Sent Folder Check this to copy files processed by the connector to the Sent folder for the connector.
- Sent Folder Scheme Instructs the connector to group files in the Sent folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all sent files for the week in that folder. The blank setting instructs the connector to save all files directly in the Sent folder. For connectors that process many transactions, using subfolders can help keep files organized and improve performance.
Logging
Settings that govern the creation and storage of logs.
- Log Level The verbosity of logs generated by the connector. When you request support, set this to Debug.
- Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders helps keep logs organized and improves performance.
- Log Messages Check this to have the log entry for a processed file include a copy of the file itself. If you disable this, you might not be able to download a copy of the file from the Input or Output tabs.
Miscellaneous
Miscellaneous settings are for specific use cases.
- Other Settings Enables you to configure hidden connector settings in a semicolon-separated list (for example, setting1=value1;setting2=value2). Normal connector use cases and functionality should not require the use of these settings.
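For readers who generate or validate the Other Settings string in scripts, the semicolon-separated format can be parsed as sketched below. The helper `parse_other_settings` is hypothetical, and the setting names are the placeholders from the example above, not real connector settings.

```python
from typing import Dict


def parse_other_settings(raw: str) -> Dict[str, str]:
    """Split 'name=value;name=value' into a dictionary (illustrative only)."""
    settings: Dict[str, str] = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, separator, value = part.partition("=")
        if not separator:
            raise ValueError(f"missing '=' in setting: {part!r}")
        settings[name.strip()] = value.strip()
    return settings


print(parse_other_settings("setting1=value1;setting2=value2"))
```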
Establishing a Connection
File connectors must have the appropriate permissions to read/write from the configured path. Permissions issues are primarily a concern when the configured path is a UNC path to another server on the network, but can also arise when pushing or pulling files from a protected folder on the local disk.
If the user running Arc does not have permission to access the path, set the Username, Password, and Domain fields in the Authentication section of the Settings tab to a specific user with the appropriate permissions.
Sending and Receiving Files
Sending Files
The File connector sends files from the Input Folder to the external folder specified in the configured path. Files are automatically sent if Send is enabled on the Automation tab.
Receiving Files
The File connector receives files from the external folder specified in the configured path and places them in the connector’s Output folder. If the File connector is connected to another connector in the flow, the file does not remain in this folder and is instead passed along to the next connected connector.
The connector automatically polls the external folder for files if Receive is enabled on the Automation tab.
The connector only pulls files that match the specified File Mask. If Timestamp Comparison or File Size Comparison is enabled in the Caching section of the Settings tab, the connector caches the names of files pulled from the configured path and only receives files that are new or have been modified since they were last received.
Subfolder Headers
If Recurse (on the Advanced tab) is set to true, when receiving a file:
- The message for the received file contains a Subfolder header in its metadata.
- This header contains the subfolder, relative to the configured path, that the file was received from.
- This subfolder header is supported by other connectors that support a Subfolder header in Send operations.
When sending files, if a Subfolder header is present on the message that is sent to the configured path in the File connector:
- The file is placed in the subfolder (relative to the path setting in the File connector) that is specified in the subfolder header.
- The subfolder is created if permissions are available.
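The relationship between the configured path and the Subfolder header can be sketched in Python as follows. Both helper functions are hypothetical, the paths are sample values, and using an empty header value for files found directly in the configured path is an assumption made for the example.

```python
from pathlib import PurePosixPath


def subfolder_header(configured_path: str, file_path: str) -> str:
    """Return the file's folder relative to the configured path (receive side)."""
    relative_parent = PurePosixPath(file_path).relative_to(configured_path).parent
    # Assumption: files directly in the configured path get an empty value.
    return "" if str(relative_parent) == "." else str(relative_parent)


def destination_path(configured_path: str, subfolder: str, filename: str) -> str:
    """Build the target path for a send that honors a Subfolder header."""
    return str(PurePosixPath(configured_path) / subfolder / filename)


header = subfolder_header("/data/inbound", "/data/inbound/2024/jan/orders.xml")
print(header)
print(destination_path("/data/outbound", header, "orders.xml"))
```

In this usage, a file received from `2024/jan` under one connector's path is written to the same relative subfolder under the sending connector's path.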
Macros
Using macros in file naming strategies can enhance organizational efficiency and contextual understanding of data. By incorporating macros into filenames, you can dynamically include relevant information such as identifiers, timestamps, and header information, providing valuable context to each file. This helps ensure that filenames reflect details important to your organization.
CData Arc supports these macros, which all use the following syntax: %Macro%.
Macro | Description |
---|---|
ConnectorID | Evaluates to the ConnectorID of the connector. |
Ext | Evaluates to the file extension of the file currently being processed by the connector. |
Filename | Evaluates to the filename (extension included) of the file currently being processed by the connector. |
FilenameNoExt | Evaluates to the filename (without the extension) of the file currently being processed by the connector. |
MessageId | Evaluates to the MessageId of the message being output by the connector. |
RegexFilename:pattern | Applies a RegEx pattern to the filename of the file currently being processed by the connector. |
Header:headername | Evaluates to the value of a targeted header (headername) on the current message being processed by the connector. |
LongDate | Evaluates to the current datetime of the system in long-handed format (for example, Wednesday, January 24, 2024). |
ShortDate | Evaluates to the current datetime of the system in a yyyy-MM-dd format (for example, 2024-01-24). |
DateFormat:format | Evaluates to the current datetime of the system in the specified format (format). See Sample Date Formats for the available datetime formats. |
Vault:vaultitem | Evaluates to the value of the specified vault item. |
Examples
Some macros, such as %Ext% and %ShortDate%, do not require an argument, but others do. All macros that take an argument use the following syntax: %Macro:argument%
Here are some examples of the macros that take an argument:
- %Header:headername%: Where headername is the name of a header on a message.
  - %Header:mycustomheader% resolves to the value of the mycustomheader header set on the input message.
  - %Header:ponum% resolves to the value of the ponum header set on the input message.
- %RegexFilename:pattern%: Where pattern is a regex pattern. For example, %RegexFilename:^([\w][A-Za-z]+)% matches and resolves to the first word in the filename and is case insensitive (test_file.xml resolves to test).
- %Vault:vaultitem%: Where vaultitem is the name of an item in the vault. For example, %Vault:companyname% resolves to the value of the companyname item stored in the vault.
- %DateFormat:format%: Where format is an accepted date format (see Sample Date Formats for details). For example, %DateFormat:yyyy-MM-dd-HH-mm-ss-fff% resolves to the date and timestamp on the file.
You can also create more sophisticated macros, as shown in the following examples:
- Combining multiple macros in one filename:
%DateFormat:yyyy-MM-dd-HH-mm-ss-fff%%EXT%
- Including text outside of the macro:
MyFile_%DateFormat:yyyy-MM-dd-HH-mm-ss-fff%
- Including text within the macro:
%DateFormat:'DateProcessed-'yyyy-MM-dd_'TimeProcessed-'HH-mm-ss%
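To show how %Macro% and %Macro:argument% placeholders might be expanded, here is a small Python sketch that handles a subset of the macros in the table above. It is not Arc's implementation: the `resolve_macros` function is hypothetical, macro names are matched case-insensitively as an assumption, %Ext% is assumed to include the leading dot, and macros outside the subset (such as DateFormat and Vault) are left untouched.

```python
import os
import re
from datetime import date
from typing import Dict, Optional

# Matches %Name% or %Name:argument%; the argument may not contain '%'.
MACRO = re.compile(r"%([A-Za-z]+)(?::([^%]*))?%")


def resolve_macros(
    template: str,
    filename: str,
    headers: Dict[str, str],
    today: Optional[date] = None,
) -> str:
    """Expand a subset of macros in template (illustrative only)."""
    stem, ext = os.path.splitext(filename)
    current = today or date.today()

    def replace(match: "re.Match[str]") -> str:
        name, arg = match.group(1).lower(), match.group(2)
        if name == "filename":
            return filename
        if name == "filenamenoext":
            return stem
        if name == "ext":
            return ext  # assumption: includes the leading dot
        if name == "shortdate":
            return current.strftime("%Y-%m-%d")
        if name == "header" and arg:
            return headers.get(arg, "")
        if name == "regexfilename" and arg:
            found = re.search(arg, filename)
            return found.group(0) if found else ""
        return match.group(0)  # unsupported macro: leave as written

    return MACRO.sub(replace, template)


print(resolve_macros("PO_%Header:ponum%_%ShortDate%%Ext%", "test_file.xml",
                     {"ponum": "4500"}, today=date(2024, 1, 24)))
print(resolve_macros(r"%RegexFilename:^([\w][A-Za-z]+)%", "test_file.xml", {}))
```

The second call uses the same pattern as the RegexFilename example above and yields the first word of the filename.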