Google Storage Connector
The Google Storage Connector uploads to and downloads from the Google Storage cloud storage service.
Each Google Storage Connector connects to a specified Google Storage service account. Within the remote storage, a single bucket is set as the upload and/or download target. Upload and download paths can be specified within the bucket, and filters can be applied so that only files with specified names and/or extensions are sent and received.
Files that reach the Google Storage Connector in the CData Arc flow are uploaded to the upload target folder, and files found in the download target folder are downloaded and entered into the Arc flow.
Account authorization is OAuth-based, and OAuth credentials must be acquired from Google prior to connecting with the Google Storage Connector. The Connector handles the process of refreshing OAuth tokens to maintain successful authorization over time.
This section contains all of the configurable connector properties.
Settings related to authorizing access to the remote storage.
- Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
- Connector Description An optional field to provide a free-form description of the connector and its role in the flow.
- OAuth Client ID The Client ID credential provided by Google for the target storage account. This value should be acquired directly from Google via the Google Console.
- OAuth Client Secret The Client Secret credential provided by Google for the target storage account. This value should be acquired directly from Google via the Google Console.
Note: Google may require an OAuth callback/redirect URL when generating OAuth credentials. The Arc callback/redirect URL is the same as the address/port where the application is hosted, plus the following resource path: src/oauthCallback.rst. For example, if Arc is hosted on ‘mydomain.com’ on port 8001, the following URL should be specified as the callback/redirect URL: http://mydomain.com:8001/src/oauthCallback.rst.
Settings related to the target storage host.
- Bucket Name The bucket within Google Storage to which files are uploaded and/or from which files are downloaded.
Settings related to uploading to the remote storage.
- Remote Folder Files processed by the connector will be uploaded to this specified remote folder.
- Overwrite remote files Whether files that already exist in the remote folder should be uploaded (overwritten) or skipped.
Settings related to downloading from the remote storage.
- Remote Folder The remote folder from which files will be downloaded. Multiple folders can be specified in a comma-delimited list. To restrict which files are downloaded, use the Receive Filter advanced field.
- Delete files (after received) Whether files that are successfully downloaded should be deleted from the remote storage afterwards.
Settings related to the automatic processing of files by the connector.
- Send Whether files arriving at the connector will automatically be uploaded.
- Retry Interval The amount of time before a failed upload is retried.
- Max Attempts The maximum number of times the connector will process the input file. Success is measured based on a successful server acknowledgement. If this is set to 0, the connector will retry the file indefinitely.
- Receive Whether the connector should automatically poll the remote download path for files to download.
- Receive Interval The interval between automatic download attempts.
- Minutes The number of minutes to wait before downloading. Only applicable when Receive Interval is set to Minute.
- Minutes Past the Hour The minutes offset for an hourly schedule. Only applicable when Receive Interval is set to Hourly. For example, if this value is set to 5, the automation service will download at 1:05, 2:05, 3:05, etc.
- Time The time within a given day that the download should occur. Only applicable when Receive Interval is set to Daily, Weekly, or Monthly.
- Day The day on which the download should occur. Only applicable when Receive Interval is set to Weekly or Monthly.
- Cron Expression An arbitrary string representing a cron expression that determines when the download should occur. Only applicable when Receive Interval is set to Advanced.
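The Minutes Past the Hour behavior described above can be illustrated with a short calculation. This is a minimal sketch, not Arc's scheduler: the function name `next_hourly_run` is hypothetical, and the sketch assumes that a download falling exactly on the current minute has already run.

```python
from datetime import datetime, timedelta


def next_hourly_run(now: datetime, minutes_past_hour: int) -> datetime:
    """Return the next time strictly after `now` that falls on the given minute offset."""
    if not 0 <= minutes_past_hour <= 59:
        raise ValueError("minutes_past_hour must be between 0 and 59")
    # Candidate run time within the current hour.
    candidate = now.replace(minute=minutes_past_hour, second=0, microsecond=0)
    # If that moment has already passed (or is right now), move to the next hour.
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


# With an offset of 5, runs land on 1:05, 2:05, 3:05, and so on.
print(next_hourly_run(datetime(2024, 1, 1, 1, 30), 5))
```

The same shape applies to the Daily, Weekly, and Monthly intervals, with the offset expressed as a time of day or a day rather than a minute.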
Settings related to the allocation of resources to the connector.
- Max Workers The maximum number of worker threads that will be consumed from the threadpool to process files on this connector. If set, overrides the default setting from the Profile tab.
- Max Files The maximum number of files that will be processed by the connector each time worker threads are assigned to the connector. If set, overrides the default setting from the Profile tab.
Settings not included in the previous categories.
- Log Level The verbosity of logs generated by the connector. When requesting support, it is recommended to set this to Debug.
- Recurse Subdirectories Whether to download files in subfolders of the target remote path.
- Timeout The duration to wait for a server response before throwing a Timeout error.
- Parent Connector The connector from which settings should be inherited, unless explicitly overwritten within the existing connector configuration. Must be set to a connector of the same type as the current connector.
- Receive Filter A glob pattern filter to determine which files should be downloaded from the remote storage (e.g. *.txt). Negative patterns may be used to indicate files that should not be downloaded (e.g. -*.tmp). This setting should be used when multiple File Mask patterns are desired. Multiple patterns may be separated by commas, with later filters taking priority except when an exact match is found.
- Send Filter A glob pattern filter to determine which files in the Send folder will be uploaded by the connector (e.g. *.txt). Negative patterns may be used to indicate files that should not be uploaded (e.g. -*.tmp). Multiple patterns may be separated by commas, with later filters taking priority except when an exact match is found.
- Log Subfolder Scheme Instructs the connector to group files in the Logs folder according to the selected interval. For example, the Weekly option instructs the connector to create a new subfolder each week and store all logs for the week in that folder. The blank setting tells the connector to save all logs directly in the Logs folder. For connectors that process many transactions, using subfolders can help keep logs organized and improve performance.
- Log Messages Whether the log entry for a processed file will include a copy of the file itself.
- Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.
Settings for specific use cases.
- Other Settings Allows configuration of hidden connector settings in a semicolon-separated list, like setting1=value1;setting2=value2. Normal connector use cases and functionality should not require use of these settings.
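To make the Other Settings format concrete, the following sketch splits a semicolon-separated string into name/value pairs. The function `parse_other_settings` is hypothetical and only illustrates the format. How Arc itself parses this field (for example, values that contain a semicolon) is not described here, so treat the handling of such edge cases as an assumption.

```python
def parse_other_settings(raw: str) -> dict[str, str]:
    """Parse a string like 'setting1=value1;setting2=value2' into a dict."""
    settings: dict[str, str] = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon or empty segments
        name, sep, value = pair.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected name=value, got {pair!r}")
        settings[name.strip()] = value.strip()
    return settings


print(parse_other_settings("setting1=value1;setting2=value2"))
```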
Establishing a Connection
OAuth credentials are required to establish a connection with the Google Storage Connector. OAuth credentials should be acquired directly from Google via the Google Console, then specified in the following connector fields:
- OAuth Client ID
- OAuth Client Secret
Google may require an OAuth callback/redirect URL when generating OAuth credentials. The Arc callback/redirect URL is the same as the address/port where the application is hosted, plus the following resource path: src/oauthCallback.rst. For example, if Arc is hosted on ‘mydomain.com’ on port 8001, the following URL should be specified as the callback/redirect URL: http://mydomain.com:8001/src/oauthCallback.rst.
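The callback/redirect URL rule above amounts to simple string assembly. The helper below is a hypothetical illustration (it is not part of Arc); the `scheme` parameter is an assumption added so that an HTTPS-hosted instance can be represented as well.

```python
def arc_callback_url(host: str, port: int, scheme: str = "http") -> str:
    """Build the OAuth callback URL from the address and port where Arc is hosted."""
    return f"{scheme}://{host}:{port}/src/oauthCallback.rst"


# Matches the example in the text: Arc hosted on mydomain.com, port 8001.
print(arc_callback_url("mydomain.com", 8001))
```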
Once authorization is successfully completed with Google, the Google Storage Connector handles the process of refreshing the OAuth tokens to ensure that authentication persists over time.
Uploading and Downloading
To upload files, set the Bucket Name field to the target bucket, then set the Send -> Remote Folder field to the folder to which files should be uploaded. Each Google Storage Connector uploads to a single specified folder location.
The Overwrite remote files option can be toggled to determine whether files that already exist in the remote folder should be overwritten or skipped. The Send Filter field in the Advanced tab can be used to determine which files should be uploaded by the connector, based on filename or file extension.
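The effect of the Overwrite remote files option can be sketched as a decision over file names. This is an illustration only: `plan_uploads` is a hypothetical helper, and it assumes that "already exists" means an exact name match in the remote folder.

```python
def plan_uploads(local_names: list[str], remote_names: list[str], overwrite: bool) -> list[str]:
    """Return the local file names that would be uploaded."""
    existing = set(remote_names)
    # With overwrite enabled every file is uploaded; otherwise existing names are skipped.
    return [name for name in local_names if overwrite or name not in existing]


print(plan_uploads(["a.txt", "b.txt"], ["a.txt"], overwrite=False))
```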
After configuring, any files placed into the Send/Input Folder of the connector will be uploaded to the remote storage. If Send Automation is enabled, the upload will happen automatically; otherwise, individual files can be Sent via the Input tab of the connector settings.
To download files, set the Bucket Name field to the target bucket, then set the Receive -> Remote Folder field to the folder where files should be downloaded from. Multiple download folders can be specified in a comma-delimited list.
The Receive Filter field in the Advanced tab can be set to a glob filter (e.g. *.txt), and only files matching this filter will be downloaded. Multiple filters can be specified in a comma-delimited list.
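The sketch below shows one possible reading of how a comma-delimited filter with negative patterns could be evaluated. It is not Arc's implementation. The function `should_download` is hypothetical, and three points are assumptions: matching is case-sensitive, a filter made only of negative patterns accepts everything it does not exclude, and "exact match" means a pattern that equals the file name literally.

```python
from fnmatch import fnmatchcase


def should_download(filename: str, filter_spec: str) -> bool:
    """Decide whether `filename` passes a filter such as '*.txt,-*.tmp'."""
    patterns = [p.strip() for p in filter_spec.split(",") if p.strip()]
    if not patterns:
        return True  # no filter configured: accept everything
    has_positive = any(not p.startswith("-") for p in patterns)
    decision = not has_positive  # negative-only filters accept by default (assumption)
    for pattern in patterns:
        negative = pattern.startswith("-")
        glob = pattern[1:] if negative else pattern
        if glob == filename:
            return not negative  # an exact match decides immediately
        if fnmatchcase(filename, glob):
            decision = not negative  # a later matching pattern overrides earlier ones
    return decision


print(should_download("report.txt", "*.txt,-*.tmp"))
print(should_download("scratch.tmp", "*.txt,-*.tmp"))
```

Under the same assumptions, the logic applies equally to the Send Filter field.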
The Delete files (after received) option toggles whether or not successfully downloaded files should be removed from the remote storage afterwards.
After configuring, files will be downloaded according to the Receive Automation settings, or by manually clicking the Receive button within the Output tab.