The AWS S3 connector, known by the type name `s3`, exposes stored procedures for working with resources stored in AWS S3.
Connector-specific Connection Properties
Name | Description |
---|---|
`keyId` | S3 key id |
`secretKey` | S3 secret key |
`bucketName` | S3 bucket name to work with |
`region` | S3 region (optional) |
Example
```sql
CALL SYSADMIN.createConnection(
    's3alias',
    's3',
    'region=eu-west-1, keyId=<id>, secretKey="<secret>", bucketName=dv-redshift-upload-test'
);;
CALL SYSADMIN.createDatasource(
    's3alias',
    'ufile',
    'importer.useFullSchemaName=false',
    null
);;
```
IAM Role Authorization
When IAM Role authorization is configured, the `keyId` and `secretKey` connector parameters can be omitted:
```sql
CALL SYSADMIN.createConnection(
    's3alias',
    's3',
    'region=eu-west-1, bucketName=dv-redshift-upload-test'
);;
CALL SYSADMIN.createDatasource(
    's3alias',
    'ufile',
    'importer.useFullSchemaName=false',
    null
);;
```
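IAM Role authorization typically relies on credentials supplied by the runtime environment (for example, an EC2 instance profile or an ECS task role), so it is applicable when the server itself runs on AWS infrastructure.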
Example
This example shows an IAM policy configured on the AWS side:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccountLevelS3Actions",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets",
                "s3:HeadBucket"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowListAndReadS3ActionOnMyBucket",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::mk-s3-test/*",
                "arn:aws:s3:::mk-s3-test"
            ]
        }
    ]
}
```
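Note that for the connector to access the bucket under such a policy, the bucket referenced in the `Resource` ARNs (`mk-s3-test` above) must be the one configured via the `bucketName` connection property.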
Multi-part Upload
The AWS S3 connector can be configured to perform multi-part uploads using the following properties:
Name | Description | Default value |
---|---|---|
`multipartUpload` | Enables multi-part upload (optional) | false |
`numberOfThreads` | Number of threads for multi-part upload (optional) | 5 |
`partSize` | Part size for multi-part upload in bytes (optional) | 5MB |
The `partSize` can be set to any value between 5 MB and 5 TB. If the specified value is outside this range, it is automatically adjusted to 5 MB or 5 TB, respectively.
Example
```sql
CALL SYSADMIN.createConnection(
    's3alias',
    's3',
    'region=eu-west-1,keyId=<id>,secretKey="<secret>",bucketName=dv-redshift-upload-test,multipartUpload=true,partSize=1024,numberOfThreads=5'
);;
CALL SYSADMIN.createDatasource(
    's3alias',
    'ufile',
    'importer.useFullSchemaName=false',
    null
);;
```
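Note that in this example, `partSize=1024` (bytes) is below the 5 MB minimum, so by the rule above it is effectively raised to 5 MB.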
Prefix
The Prefix property enables limiting the result set (see the SDK documentation):

- The `Prefix` property value gets passed in `connectionOrResourceAdapterProperties`;
- All procedures of the connector automatically take the prefix into consideration (e.g. calling `listFiles(pathAndPattern => NULL)` still applies the prefix from the data source settings);
- If the data source has a prefix configured and a `pathAndPattern` gets passed, the values are concatenated: for example, if the data source is configured with the prefix `a/b` and `listFiles(pathAndPattern => 'c/d')` gets called, this results in `a/b/c/d` (see the sketch after this list).
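A minimal sketch of this behavior, assuming the prefix is set as a `prefix` key inside `connectionOrResourceAdapterProperties` (the exact property name is an assumption, as is the `CALL "alias.procedure"` invocation style) and reusing the bucket and credentials from the examples above:

```sql
-- Hypothetical connection configured with the prefix a/b
-- (the property name 'prefix' is an assumption)
CALL SYSADMIN.createConnection(
    's3prefixed',
    's3',
    'region=eu-west-1, keyId=<id>, secretKey="<secret>", bucketName=dv-redshift-upload-test, prefix=a/b'
);;
CALL SYSADMIN.createDatasource(
    's3prefixed',
    'ufile',
    'importer.useFullSchemaName=false',
    null
);;
-- Per the concatenation rule above, this call lists objects under a/b/c/d
CALL "s3prefixed.listFiles"(pathAndPattern => 'c/d');;
```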
Ceph Support
Ceph is an open-source distributed storage solution that can expose an S3-compatible API. Please note that for the CData Virtuality S3 connector to work with Ceph, the RADOS Gateway (RGW) service must be configured.
A data source connected to Ceph via the S3 API can be configured with the following properties:
Name | Description |
---|---|
`endPoint` | Mandatory in the case of Ceph; otherwise, the S3 API will use its Amazon endpoints by default |
`pathStyleAccess` | Mandatory in the case of Ceph if the DNS is not configured on the server running it; otherwise, by default, the S3 library will add the bucket name to the initial endpoint |
Example
```sql
CALL SYSADMIN.createConnection(
    name => 'test_ceph_rgw',
    jbossCLITemplateName => 's3',
    connectionOrResourceAdapterProperties => 'endPoint=<endPoint>,keyId=<keyID>,secretKey=<secretKey>,bucketName=<bucketName>,pathStyleAccess=true'
);;
CALL SYSADMIN.createDataSource(
    name => 'test_ceph_rgw',
    translator => 'ufile',
    modelProperties => 'importer.useFullSchemaName=false',
    translatorProperties => ''
);;
```