The AWS S3 connector, known by the type name s3, exposes stored procedures for working with resources stored in AWS S3.

Connector-specific Connection Properties

  • keyId: S3 key ID
  • secretKey: S3 secret key
  • bucketName: S3 bucket name to work with
  • region: S3 region (optional)
  • prefix: pathAndPattern prefix to be used when handling files

Example

CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1,keyId=<id>,secretKey="<secret>",bucketName=dv-redshift-upload-test');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;

IAM Role Authorization

When IAM Role authorization is configured, the keyId and secretKey connector parameters can be omitted:

CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1, bucketName=dv-redshift-upload-test');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;

Example

This example shows an IAM policy configured on the AWS side:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccountLevelS3Actions",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:HeadBucket"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowListAndReadS3ActionOnMyBucket",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::mk-s3-test/*",
                "arn:aws:s3:::mk-s3-test"
            ]
        }
    ]
}

Multi-part Upload

The AWS S3 connector can be configured to perform multi-part upload using the following properties:

  • multipartUpload: TRUE to perform multi-part upload (optional); default: FALSE
  • numberOfThreads: number of threads for multi-part upload (optional); default: 5
  • partSize: part size for multi-part upload, in bytes (optional); default: 5 MB

The partSize value can range from 5 MB to 5 TB. If the specified value falls outside this range, it is automatically adjusted to 5 MB or 5 TB, respectively.

Example

CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1,keyId=<id>,secretKey="<secret>",bucketName=dv-redshift-upload-test,multipartUpload=true,partSize=1024,numberOfThreads=5');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;
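
Note that in this example, partSize=1024 is below the 5 MB minimum, so it would automatically be raised to 5 MB according to the rule described above.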

Prefix

The prefix property enables limiting the result set (see the SDK documentation):

  • The prefix property value gets passed in connectionOrResourceAdapterProperties;
  • All procedures of the connector automatically take the prefix into consideration (e.g. calling listFiles(pathAndPattern => NULL) still applies the prefix from the data source settings);
  • If the data source has a prefix configured and a pathAndPattern gets passed, the values are concatenated. For example, if the data source is configured with prefix a/b, and listFiles(pathAndPattern => 'c/d') gets called, this results in a/b/c/d (see the sketch below).
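
As a minimal sketch of this behavior (the alias, credentials, and bucket name are placeholders, and the exact procedure invocation syntax may differ in your environment):

CALL SYSADMIN.createConnection('s3alias', 's3', 'region=eu-west-1,keyId=<id>,secretKey="<secret>",bucketName=<bucket>,prefix=a/b');;
CALL SYSADMIN.createDatasource('s3alias', 'ufile', 'importer.useFullSchemaName=false', null);;

-- The data source prefix a/b is prepended to the passed pathAndPattern,
-- so this call effectively lists objects under a/b/c/d:
CALL "s3alias.listFiles"(pathAndPattern => 'c/d');;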

Ceph Support

Ceph is an open-source distributed storage solution that can expose an S3-compatible API. Please note that for the CData Virtuality S3 connector to work with Ceph, the RADOS Gateway (RGW) service must be configured.

A data source connected to Ceph via the S3 API can be configured with the following properties:

  • endPoint: mandatory in the case of Ceph; otherwise, the S3 API will use its Amazon endpoints by default
  • passStyleAccess: mandatory in the case of Ceph if DNS is not configured on the server running it; otherwise, by default, the S3 library will add the bucket name to the initial endpoint

Example

CALL SYSADMIN.createConnection(name => 'test_ceph_rgw', jbossCLITemplateName => 's3', connectionOrResourceAdapterProperties => 'endPoint=<endPoint>,keyId=<keyID>,secretKey=<secretKey>,bucketName=<bucketName>,passStyleAccess=true');;
 
CALL SYSADMIN.createDataSource(name => 'test_ceph_rgw', translator => 'ufile', modelProperties => 'importer.useFullSchemaName=false', translatorProperties => '');;