S3 Connector

Version 22.0.8473



The S3 Connector integrates with Amazon’s S3 (Simple Storage Service) and other S3-compatible services (Google Storage, Wasabi, etc.).

Overview

Each S3 Connector can automatically upload to and download from a single configured S3 Bucket. An Amazon account (or Google Storage account, Wasabi account, etc.) with the appropriate credentials is required. Upload and download paths can be specified within the Bucket, and files can be filtered by filename before download.

Connector Configuration

This section contains all of the configurable connector properties.

Settings Tab

Host Configuration

Settings related to the remote connection target.

  • Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
  • Connector Description An optional field to provide free-form description of the connector and its role in the flow.
  • Bucket Name The S3 Bucket that should be polled or uploaded to.

Amazon Account Settings

Settings related to the Amazon Account with permission to access the configured Bucket Name.

  • Access Key The Access Key account credential acquired from Amazon (or Google, Wasabi, etc).
  • Secret Key The Secret Key account credential acquired from Amazon (or Google, Wasabi, etc).
  • Region The Region where the specified Bucket Name is stored.

SSL Settings

Settings related to the SSL negotiation with the S3 server.

  • Use SSL when connecting with Amazon servers Whether SSL negotiation is enabled.
  • Server Public Certificate The public key certificate to trust when connecting to the S3 server. Can be set to Any Certificate to implicitly trust the server.

Upload

Settings related to the path within the specified bucket where files will be uploaded.

  • Remote Path The path within the Bucket where files will be uploaded.
  • Overwrite remote files Whether files that already exist in the specified Bucket should be overwritten during upload.

Download

Settings related to the path within the specified bucket from which files will be downloaded.

  • Remote Path The path within the Bucket from which files will be downloaded. Multiple paths can be specified in a comma-delimited list.
  • File Filter A glob pattern that determines which files within the Remote Path should be downloaded. Multiple patterns may be specified in a comma-delimited list.
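
As a rough illustration of how a comma-delimited list of glob patterns selects files, the following Python sketch uses the standard fnmatch module. The function name matches_file_filter and the sample filenames are hypothetical, and the connector's actual matching rules (for example, case sensitivity or the treatment of an empty filter) may differ.

```python
from fnmatch import fnmatchcase


def matches_file_filter(filename: str, file_filter: str) -> bool:
    """Return True if filename matches any glob in a comma-delimited filter."""
    patterns = [p.strip() for p in file_filter.split(",") if p.strip()]
    # Assumption: an empty filter means "download everything".
    if not patterns:
        return True
    return any(fnmatchcase(filename, p) for p in patterns)


# Hypothetical listing of a Remote Path
remote_files = ["orders.xml", "invoice.edi", "notes.tmp"]
selected = [f for f in remote_files if matches_file_filter(f, "*.xml, *.edi")]
print(selected)
```

With the filter above, only the .xml and .edi files are selected and notes.tmp is skipped.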

Automation Tab

Automation Settings

Settings related to the automatic processing of files by the connector.

  • Upload Whether files arriving at the connector will automatically be uploaded.
  • Retry Interval The amount of time before a failed upload is retried.
  • Max Attempts The maximum number of times the connector will process the input file. Success is measured based on a successful server acknowledgement. If this is set to 0, the connector will retry the file indefinitely.
  • Download Whether the connector should automatically poll the remote download path for files to download.
  • Download Interval The interval between automatic download attempts.
  • Minutes The number of minutes to wait before downloading. Only applicable when Download Interval is set to Minute.
  • Minutes Past the Hour The minutes offset for an hourly schedule. Only applicable when Download Interval is set to Hourly. For example, if this value is set to 5, the automation service will download at 1:05, 2:05, 3:05, etc.
  • Time The time within a given day that the download should occur. Only applicable when Download Interval is set to Daily, Weekly, or Monthly.
  • Day The day on which the download should occur. Only applicable when Download Interval is set to Weekly or Monthly.
  • Cron Expression An arbitrary string representing a cron expression that determines when the download should occur. Only applicable when Download Interval is set to Advanced.

Performance

Settings related to the allocation of resources to the connector.

  • Max Workers The maximum number of worker threads that will be consumed from the threadpool to process files on this connector. If set, overrides the default setting from the Profile tab.
  • Max Files The maximum number of files that will be processed by the connector each time worker threads are assigned to the connector. If set, overrides the default setting from the Profile tab.

Advanced Tab

Other Settings

Settings not included in the previous categories.

  • Access Policy The access policy set on objects after they are uploaded to the S3 server.
  • Enable Size Comparison Whether to cache downloaded file names and sizes; if True then files will only be downloaded if they have not been downloaded before or have changed in size.
  • Enable Timestamp Comparison Whether to cache downloaded file names and last-modified timestamps; if True then files will only be downloaded if they have not been downloaded before or have been modified since they were downloaded.
  • Encryption Password If set, object data will be encrypted on the client side before upload, and downloaded objects will be automatically decrypted.
  • UseServerSideEncryption Whether to request that the S3 server encrypt object data server-side.
  • Send Filter A glob pattern filter to determine which files in the Send folder will be uploaded by the connector (e.g. *.txt). Negative patterns may be used to indicate files that should not be processed by the connector (e.g. -*.tmp). Multiple patterns may be separated by commas, with later filters taking priority except when an exact match is found.
  • Local File Scheme A filemask for determining local file names as they are downloaded by the connector. The following macros may be used to reference contextual information:
    %ConnectorId%, %Filename%, %FilenameNoExt%, %Ext%, %ShortDate%, %LongDate%, %RegexFilename:%, %DateFormat:%.
    As an example: %FilenameNoExt%_%ShortDate%%Ext%
  • Log Level The verbosity of logs generated by the connector. When requesting support, it is recommended to set this to Debug.
  • Parent Connector The connector from which settings should be inherited, unless explicitly overwritten within the existing connector configuration. Must be set to a connector of the same type as the current connector.
  • Recurse Subdirectories Whether to download files in subfolders of the target remote path.
  • Use Virtual Hosting Whether to use virtual-hosted-style addressing (bucket name in the hostname) rather than path-style addressing (bucket name in the URL path) when referencing the Bucket endpoint.
  • Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders can help keep logs organized and improve performance.
  • Log Messages Whether the log entry for a processed file will include a copy of the file itself.
  • Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.
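
To make the Local File Scheme macros more concrete, here is a minimal Python sketch that expands a few of them. The function expand_scheme, the connector id S3_1, and the date format used for %ShortDate% are assumptions made for illustration; the connector's real expansion logic and date formats may differ. The sketch also assumes that %Ext% includes the leading dot, as the example scheme above suggests.

```python
import os
import re
from datetime import datetime


def expand_scheme(scheme: str, filename: str, connector_id: str, now: datetime) -> str:
    """Expand a subset of Local File Scheme macros (illustrative only)."""
    stem, ext = os.path.splitext(filename)  # ext keeps the leading dot, e.g. ".xml"
    values = {
        "ConnectorId": connector_id,
        "Filename": filename,
        "FilenameNoExt": stem,
        "Ext": ext,
        # Assumption: the real %ShortDate% format may differ from ISO dates.
        "ShortDate": now.strftime("%Y-%m-%d"),
    }
    # Replace each known %Name% macro; unknown macros are left untouched.
    return re.sub(r"%(\w+)%", lambda m: values.get(m.group(1), m.group(0)), scheme)


print(expand_scheme("%FilenameNoExt%_%ShortDate%%Ext%", "invoice.xml",
                    "S3_1", datetime(2024, 1, 15)))
```

Under these assumptions, a downloaded invoice.xml would be saved locally as invoice_2024-01-15.xml.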

Miscellaneous

Settings for specific use cases.

  • Other Settings Allows configuration of hidden connector settings in a semicolon-separated list, like setting1=value1;setting2=value2. Normal connector use cases and functionality should not require use of these settings.
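
The semicolon-separated format can be pictured with a short Python sketch. The function parse_other_settings is hypothetical, and setting1 and setting2 are the placeholder names from the description above rather than real connector settings.

```python
def parse_other_settings(raw: str) -> dict[str, str]:
    """Parse 'name=value;name=value' into a dictionary."""
    settings: dict[str, str] = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {pair!r}")
        settings[name.strip()] = value.strip()
    return settings


print(parse_other_settings("setting1=value1;setting2=value2"))
```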

Establishing a Connection

The requirements for establishing an S3 connection are simple:

  • Amazon account credentials (or Google, Wasabi, etc.)
    • Access Key
    • Secret Key
  • An S3 Bucket that can be accessed by the above account

For Amazon S3 specifically, this link can be used to obtain Access Key and Secret Key information from Amazon.
Optionally, the connection with S3 servers can be secured by SSL by enabling the Use SSL when connecting with Amazon servers option.

Uploading

Uploading to Remote Folders

The Remote Path setting within the Upload section specifies the path within the Bucket to which files are uploaded. This allows for the logical separation of files into virtual folders within the same Bucket.

Note that S3 servers do not maintain a real folder structure, and CData Arc uses application logic to present a pseudo folder structure. Slashes in the Remote Path (/ or \) are interpreted as representing a folder hierarchy. This allows for uploading to or downloading from ‘subfolders’ within the Bucket based on the slashes in the path.
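
The idea of a pseudo folder structure over flat object keys can be sketched in Python as follows. This is not the application's implementation; the function list_folder and the sample keys are hypothetical, and the sketch only shows how slash-delimited key prefixes can be presented as folders.

```python
def list_folder(keys: list[str], remote_path: str) -> tuple[list[str], list[str]]:
    """Split flat object keys into (files, subfolders) directly under remote_path."""
    # Treat both slash styles as folder separators.
    prefix = remote_path.replace("\\", "/").strip("/")
    prefix = prefix + "/" if prefix else ""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # a key equal to the prefix is a placeholder, not a file
        head, sep, _ = rest.partition("/")
        if sep:
            folders.add(head)   # deeper key: show only the first path segment
        else:
            files.append(head)  # key sits directly in this 'folder'
    return files, sorted(folders)


# Hypothetical flat keys stored in a bucket
keys = ["inbox/a.xml", "inbox/2024/b.xml", "outbox/c.xml", "readme.txt"]
print(list_folder(keys, "inbox"))
print(list_folder(keys, "\\inbox\\2024"))
```

Here the bucket holds only four flat keys, yet listing inbox shows one file and one subfolder, and either slash style resolves to the same prefix.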

Upload Automation

The S3 Connector supports automatic upload via the Automation tab in the connector configuration panel. When Upload automation is enabled, files that reach the Input folder for the connector will be automatically uploaded to the specified Bucket Name at the specified Remote Path.

If a file fails to upload, the application will attempt to send it again after the Retry Interval has elapsed. This process continues until Max Attempts is reached, after which the connector raises an error.
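
The retry behavior described above can be approximated with the following Python sketch. It is an illustration under stated assumptions, not the connector's code: upload_with_retry and its parameters are hypothetical, the interval is assumed to be in seconds, and a Max Attempts value of 0 is treated as "retry indefinitely" as described in the Automation settings.

```python
import time
from typing import Callable


def upload_with_retry(upload: Callable[[], bool], retry_interval: float,
                      max_attempts: int,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call upload() until it succeeds and return the number of attempts used.

    max_attempts == 0 means retry indefinitely. Raises RuntimeError once
    max_attempts failed attempts have been made.
    """
    attempts = 0
    while True:
        attempts += 1
        if upload():
            return attempts
        if max_attempts and attempts >= max_attempts:
            raise RuntimeError(f"upload failed after {attempts} attempts")
        sleep(retry_interval)  # wait for the Retry Interval before trying again


# Hypothetical upload that fails twice, then succeeds
results = iter([False, False, True])
print(upload_with_retry(lambda: next(results), retry_interval=0, max_attempts=5))
```

Passing the sleep function as a parameter keeps the sketch easy to test without real delays.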

Downloading

Downloading from Remote Folders

The Remote Path setting within the Download section specifies the path within the Bucket from which files are downloaded. This allows for the logical separation of files into virtual folders within the same Bucket.

The File Filter setting provides a way to only download specific filenames within the specified path.

Note that S3 servers do not maintain a real folder structure, and Arc uses application logic to present a pseudo folder structure. Slashes in the Remote Path (/ or \) are interpreted as representing a folder hierarchy. This allows for uploading to or downloading from ‘subfolders’ within the Bucket based on the slashes in the path.

Download Automation

The S3 Connector supports automatic download via the Automation tab in the connector configuration panel. When Download automation is enabled, the connector will automatically poll the remote Bucket according to the specified Download Interval.
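
When Enable Size Comparison or Enable Timestamp Comparison is turned on in the Advanced tab, each poll only needs to fetch files that are new or have changed. The Python sketch below shows one way such a decision could be made. The function files_to_download, the cache layout, and the sample data are all hypothetical; the sketch also assumes that every file is selected when both comparisons are disabled, which may not match the connector's behavior.

```python
def files_to_download(remote: dict[str, tuple[int, float]],
                      cache: dict[str, tuple[int, float]],
                      compare_size: bool, compare_timestamp: bool) -> list[str]:
    """Pick files to fetch; both maps go from file name to (size, last-modified)."""
    selected = []
    for name, (size, mtime) in remote.items():
        seen = cache.get(name)
        if seen is None or not (compare_size or compare_timestamp):
            selected.append(name)  # never downloaded, or comparisons disabled
        elif compare_size and size != seen[0]:
            selected.append(name)  # size changed since the last download
        elif compare_timestamp and mtime > seen[1]:
            selected.append(name)  # modified since the last download
    return selected


# Hypothetical remote listing and cache of earlier downloads
remote = {"a.xml": (100, 1000.0), "b.xml": (250, 2000.0), "c.xml": (50, 3000.0)}
cache = {"a.xml": (100, 1000.0), "b.xml": (200, 2000.0)}
print(files_to_download(remote, cache, compare_size=True, compare_timestamp=False))
print(files_to_download(remote, cache, compare_size=False, compare_timestamp=True))
```

In this sample, size comparison picks up b.xml (its size changed) and c.xml (new), while timestamp comparison alone picks up only c.xml because b.xml's last-modified time is unchanged.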