CData Python Connector for Parquet

Build 24.0.9060

Connection String Options

The connection string properties are the options you can use to establish a connection. This section lists every option you can configure in the connection string for this provider.

For more information on establishing a connection, see Establishing a Connection.
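
As a minimal sketch, assuming the usual semicolon-separated Name=Value layout for connection strings, the options listed in this section could be assembled from a dictionary as shown below. The helper name build_connection_string and the sample values are hypothetical and not part of the connector. The commented-out import and connect call reflect typical connector usage and should be checked against Establishing a Connection.

```python
def build_connection_string(options):
    """Join option names and values into a 'Name=Value;' string.

    Hypothetical helper for illustration; not part of the connector.
    """
    parts = []
    for name, value in options.items():
        if value is None:
            continue  # skip options that were left unset
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{name}={value}")
    return ";".join(parts) + ";" if parts else ""


conn_str = build_connection_string({
    "URI": "C:/folder/table.parquet",  # sample path, an assumption
    "MaxRows": 100,
    "Readonly": True,
})
print(conn_str)

# Assumed usage with the connector installed (not executed here):
# import cdata.parquet as mod
# conn = mod.connect(conn_str)
```

Keeping the options in a dictionary makes it easy to leave a property unset (None) without editing the string by hand.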

Authentication


Property             Description
AuthScheme           The type of authentication to use when connecting to remote services.
AccessKey            Your account access key. This value is accessible from your security credentials page.
SecretKey            Your account secret key. This value is accessible from your security credentials page.
ApiKey               The API Key used to identify the user to IBM Cloud.
User                 The user account used to authenticate.
Password             The password used to authenticate the user.
SharePointEdition    The edition of SharePoint being used. Set either SharePointOnline or SharePointOnPremise.
ImpersonateUserMode  Specifies the type of user impersonation. Set this to either the User mode or the Admin mode.

Connection


Property           Description
ConnectionType     Specifies the file storage service, server, or file access protocol through which your Parquet files are stored and retrieved.
URI                The Uniform Resource Identifier (URI) for the Parquet resource location.
DataModel          Specifies the data model to use when parsing Parquet documents and generating the database metadata.
Region             The hosting region for your S3-like Web Services.
ProjectId          The Id of the project where your Google Cloud Storage instance resides.
OracleNamespace    The Oracle Cloud Object Storage namespace to use.
StorageBaseURL     The URL of a cloud storage service provider.
UseVirtualHosting  If true (default), buckets are referenced in the request using the hosted-style request: http://yourbucket.s3.amazonaws.com/yourobject. If set to false, the provider uses the path-style request: http://s3.amazonaws.com/yourbucket/yourobject. Note that this property is set to false for an S3-based custom service when the CustomURL is specified.
UseLakeFormation   When this property is set to true, the AWS Lake Formation service is used to retrieve temporary credentials, which enforce access policies against the user based on the configured IAM role. The service can be used when authenticating through OKTA, ADFS, AzureAD, or PingFederate while providing a SAML assertion.

AWS Authentication


Property                   Description
AWSAccessKey               Your AWS account access key. This value is accessible from your AWS security credentials page.
AWSSecretKey               Your AWS account secret key. This value is accessible from your AWS security credentials page.
AWSRoleARN                 The Amazon Resource Name of the role to use when authenticating.
AWSPrincipalARN            The ARN of the SAML Identity provider in your AWS account.
AWSRegion                  The hosting region for your Amazon Web Services.
AWSCredentialsFile         The path to the AWS Credentials File to be used for authentication.
AWSCredentialsFileProfile  The name of the profile to be used from the supplied AWSCredentialsFile.
AWSSessionToken            Your AWS session token.
AWSExternalId              A unique identifier that might be required when you assume a role in another account.
MFASerialNumber            The serial number of the MFA device if one is being used.
MFAToken                   The temporary token available from your MFA device.
CredentialsLocation        The location of the settings file where MFA credentials are saved.
TemporaryTokenDuration     The amount of time (in seconds) a temporary token will last.
AWSWebIdentityToken        The OAuth 2.0 access token or OpenID Connect ID token that is provided by an identity provider.
ServerSideEncryption       When activated, file uploads into Amazon S3 buckets will be server-side encrypted.
SSEContext                 A BASE64-encoded UTF-8 string holding JSON which represents a string-string (key-value) map.
SSEEnableS3BucketKeys      Configures whether an S3 Bucket Key is used at the object level when encrypting data with AWS KMS. Enabling this reduces the cost of server-side encryption by lowering calls to AWS KMS.
SSEKey                     A symmetric encryption Key Management Service (KMS) key that is used to protect the data when using ServerSideEncryption.
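
The SSEContext format (BASE64 over UTF-8 JSON holding a string-to-string map) can be produced with the Python standard library. This is a sketch under that reading of the description; the helper name encode_sse_context and the sample keys are hypothetical.

```python
import base64
import json


def encode_sse_context(context):
    """Encode a str -> str map as BASE64 over UTF-8 JSON.

    Hypothetical helper; the key names used below are sample values.
    """
    for key, value in context.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("SSEContext keys and values must be strings")
    raw = json.dumps(context).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


sample_context = {"department": "finance", "project": "parquet"}
encoded = encode_sse_context(sample_context)
print(encoded)

# Decoding reverses the steps and recovers the original map.
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
print(decoded)
```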

Azure Authentication


Property                    Description
AzureStorageAccount         The name of your Azure storage account.
AzureAccessKey              The storage key associated with your Azure account.
AzureSharedAccessSignature  A shared access key signature that may be used for authentication.
AzureTenant                 The Microsoft Online tenant being used to access data. If not specified, your default tenant is used.
AzureEnvironment            The Azure Environment to use when establishing a connection.

SSO


Property        Description
SSOLoginURL     The identity provider's login URL.
SSOProperties   Additional properties required to connect to the identity provider, formatted as a semicolon-separated list.
SSOExchangeUrl  The URL used for consuming the SAML response and exchanging it for service specific credentials.

JWT OAuth


Property              Description
OAuthJWTCert          The JWT Certificate store.
OAuthJWTCertType      The type of key store containing the JWT Certificate.
OAuthJWTCertPassword  The password for the OAuth JWT certificate.
OAuthJWTCertSubject   The subject of the OAuth JWT certificate.
OAuthJWTSubjectType   The SubType for the JWT authentication.
OAuthJWTPublicKeyId   The Id of the public key for JWT.

Kerberos


Property              Description
KerberosKDC           The Kerberos Key Distribution Center (KDC) service used to authenticate the user.
KerberosRealm         The Kerberos Realm used to authenticate the user.
KerberosSPN           The service principal name (SPN) for the Kerberos Domain Controller.
KerberosUser          The principal name for the Kerberos Domain Controller. Used in the format host/user@realm.
KerberosKeytabFile    The Keytab file containing your pairs of Kerberos principals and encrypted keys.
KerberosServiceRealm  The Kerberos realm of the service.
KerberosServiceKDC    The Kerberos KDC of the service.
KerberosTicketCache   The full file path to an MIT Kerberos credential cache file.

OAuth


Property                 Description
InitiateOAuth            Set this property to initiate the process to obtain or refresh the OAuth access token when you connect.
OAuthVersion             The version of OAuth being used.
OAuthClientId            The client Id assigned when you register your application with an OAuth authorization server.
OAuthClientSecret        The client secret assigned when you register your application with an OAuth authorization server.
OAuthAccessToken         The access token for connecting using OAuth.
OAuthAccessTokenSecret   The OAuth access token secret for connecting using OAuth.
SubjectId                The user subject for which the application is requesting delegated access.
SubjectType              The Subject Type for the Client Credentials authentication.
OAuthSettingsLocation    The location of the settings file where OAuth values are saved when InitiateOAuth is set to GETANDREFRESH or REFRESH. Alternatively, you can hold this location in memory by specifying a value starting with 'memory://'.
CallbackURL              The OAuth callback URL to return to when authenticating. This value must match the callback URL you specify in your app settings.
Scope                    Specifies the scope used to obtain the initial access and refresh token.
OAuthGrantType           The grant type for the OAuth flow.
OAuthPasswordGrantMode   Specifies how the OAuth Client Id and Client Secret should be passed. Supported options: BASIC and POST.
OAuthIncludeCallbackURL  Whether to include the callback URL in an access token request.
OAuthAuthorizationURL    The authorization URL for the OAuth service.
OAuthAccessTokenURL      The URL to retrieve the OAuth access token from.
OAuthRefreshTokenURL     The URL to refresh the OAuth token from.
OAuthRequestTokenURL     The URL the service provides to retrieve request tokens from. This is required in OAuth 1.0.
OAuthVerifier            The verifier code returned from the OAuth authorization URL.
AuthToken                The authentication token used to request and obtain the OAuth Access Token.
AuthKey                  The authentication secret used to request and obtain the OAuth Access Token.
OAuthParams              A comma-separated list of other parameters to submit in the request for the OAuth access token in the format paramname=value.
OAuthRefreshToken        The OAuth refresh token for the corresponding OAuth access token.
OAuthExpiresIn           The lifetime in seconds of the OAuth AccessToken.
OAuthTokenTimestamp      The Unix epoch timestamp in milliseconds when the current Access Token was created.
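
OAuthExpiresIn is expressed in seconds while OAuthTokenTimestamp is expressed in milliseconds, so the two need a unit conversion before they can be compared. The sketch below shows one way an application might check whether a stored token has expired; the function name and the skew_s safety margin are hypothetical, and the provider's own refresh logic is not documented here.

```python
import time


def oauth_token_expired(token_timestamp_ms, expires_in_s,
                        now_ms=None, skew_s=0):
    """Return True once the token lifetime has elapsed.

    token_timestamp_ms: OAuthTokenTimestamp (Unix epoch, milliseconds)
    expires_in_s:       OAuthExpiresIn (lifetime, seconds)
    skew_s:             optional safety margin in seconds (an assumption)
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    expiry_ms = token_timestamp_ms + expires_in_s * 1000
    return now_ms >= expiry_ms - skew_s * 1000


created_ms = 1_700_000_000_000  # sample timestamp
print(oauth_token_expired(created_ms, 3600, now_ms=created_ms + 1_800_000))
print(oauth_token_expired(created_ms, 3600, now_ms=created_ms + 3_600_000))
```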

SSL


Property               Description
SSLClientCert          The TLS/SSL client certificate store for SSL Client Authentication (2-way SSL).
SSLClientCertType      The type of key store containing the TLS/SSL client certificate.
SSLClientCertPassword  The password for the TLS/SSL client certificate.
SSLClientCertSubject   The subject of the TLS/SSL client certificate.
SSLMode                The authentication mechanism to be used when connecting to the FTP or FTPS server.
SSLServerCert          The certificate to be accepted from the server when connecting using TLS/SSL.

SSH


Property               Description
SSHAuthMode            The authentication method used when establishing an SSH Tunnel to the service.
SSHClientCert          A certificate to be used for authenticating the SSHUser.
SSHClientCertPassword  The password of the SSHClientCert key if it has one.
SSHClientCertSubject   The subject of the SSH client certificate.
SSHClientCertType      The type of SSHClientCert private key.
SSHUser                The SSH user.
SSHPassword            The SSH password.

Firewall


Property          Description
FirewallType      The protocol used by a proxy-based firewall.
FirewallServer    The name or IP address of a proxy-based firewall.
FirewallPort      The TCP port for a proxy-based firewall.
FirewallUser      The user name to use to authenticate with a proxy-based firewall.
FirewallPassword  A password used to authenticate to a proxy-based firewall.

Proxy


Property         Description
ProxyAutoDetect  When this connection property is set to True, the provider checks your system proxy settings for existing proxy server configurations (no need to manually supply proxy server details). Set to False if you want to manually configure the provider to connect to a specific proxy server.
ProxyServer      The hostname or IP address of the proxy server that you want to route HTTP traffic through.
ProxyPort        The TCP port that the proxy server (specified in the ProxyServer connection property) is running on.
ProxyAuthScheme  The authentication method the provider uses when authenticating to the proxy server specified in the ProxyServer connection property.
ProxyUser        The username of a user account registered with the proxy server specified in the ProxyServer connection property.
ProxyPassword    The password associated with the user specified in the ProxyUser connection property.
ProxySSLType     The SSL type to use when connecting to the ProxyServer proxy.
ProxyExceptions  A semicolon-separated list of destination hostnames or IPs that are exempt from connecting through the ProxyServer.

Logging


Property         Description
Logfile          A filepath which designates the name and location of the log file.
Verbosity        The verbosity level that determines the amount of detail included in the log file.
LogModules       Core modules to be included in the log file.
MaxLogFileSize   A string specifying the maximum size in bytes for a log file (for example, 10 MB).
MaxLogFileCount  A string specifying the maximum file count of log files.

Schema


Property          Description
Location          A path to the directory that contains the schema files defining tables, views, and stored procedures.
BrowsableSchemas  This property restricts the schemas reported to a subset of the available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
Tables            This property restricts the tables reported to a subset of the available tables. For example, Tables=TableA,TableB,TableC.
Views             This property restricts the views reported to a subset of the available views. For example, Views=ViewA,ViewB,ViewC.
FlattenObjects    Set FlattenObjects to true to flatten object properties into columns of their own. Otherwise, objects nested in arrays are returned as strings of JSON.
FlattenArrays     By default, nested arrays are returned as strings. The FlattenArrays property can be used to flatten the elements of nested arrays into columns of their own. Set FlattenArrays to the number of elements you want to return from nested arrays.
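
The effect of FlattenObjects and FlattenArrays can be pictured with a small stand-alone sketch. It is not the provider's implementation: the function flatten_row, the sample record, and in particular the dot-separated column names are assumptions made for illustration, and the names the provider actually generates may differ.

```python
import json


def flatten_row(row, flatten_objects=False, flatten_arrays=0, sep="."):
    """Flatten nested values into columns (illustrative only).

    flatten_objects: expand object properties into their own columns
    flatten_arrays:  number of array elements to expand (0 = none)
    sep:             column-name separator, an assumption
    """
    columns = {}

    def visit(name, value):
        if isinstance(value, dict) and flatten_objects:
            for key, child in value.items():
                visit(f"{name}{sep}{key}", child)
        elif isinstance(value, list) and flatten_arrays > 0:
            for index, child in enumerate(value[:flatten_arrays]):
                visit(f"{name}{sep}{index}", child)
        elif isinstance(value, (dict, list)):
            # Anything left unflattened is returned as a JSON string.
            columns[name] = json.dumps(value)
        else:
            columns[name] = value

    for name, value in row.items():
        visit(name, value)
    return columns


record = {"id": 1, "address": {"city": "Oslo"}, "tags": ["a", "b", "c"]}
print(flatten_row(record))
print(flatten_row(record, flatten_objects=True, flatten_arrays=2))
```

With both options off, nested values stay in a single string column; with FlattenArrays set to 2, only the first two array elements become columns and the third is dropped.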

Caching


Property         Description
AutoCache        Automatically caches the results of SELECT queries into a cache database specified by either CacheLocation or both of CacheConnection and CacheProvider.
CacheProvider    The name of the provider to be used to cache data.
CacheDriver      The database driver used to cache data.
CacheConnection  The connection string for the cache database. This property is always used in conjunction with CacheProvider. Setting both properties will override the value set for CacheLocation for caching data.
CacheLocation    Specifies the path to the cache when caching to a file.
CacheTolerance   The tolerance, in seconds, for stale data in the cache when using AutoCache.
Offline          Use offline mode to get the data from the cache instead of the live source.
CacheMetadata    This property determines whether or not to cache the table metadata to a file store.
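
The precedence between the cache settings (CacheConnection together with CacheProvider overrides CacheLocation) can be summarized in a short sketch. The function resolve_cache_target, the returned dictionary shape, and the sample values are hypothetical; they only restate the rule from the descriptions above.

```python
def resolve_cache_target(options):
    """Pick the cache target implied by the caching properties.

    Hypothetical helper: CacheProvider plus CacheConnection wins over
    CacheLocation, as stated in the property descriptions.
    """
    provider = options.get("CacheProvider")
    connection = options.get("CacheConnection")
    location = options.get("CacheLocation")
    if provider and connection:
        return {"kind": "database", "provider": provider,
                "connection": connection}
    if location:
        return {"kind": "file", "location": location}
    return None  # no explicit cache target configured


# Sample values below are placeholders, not real provider names.
print(resolve_cache_target({"CacheLocation": "C:/cache"}))
print(resolve_cache_target({
    "CacheLocation": "C:/cache",
    "CacheProvider": "ExampleProvider",
    "CacheConnection": "Server=localhost",
}))
```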

Miscellaneous


Property                     Description
AggregateFiles               When set to true, the provider aggregates all the files in the URI directory into a single result. With this option enabled, AggregatedFiles is exposed, which can be used to query the dataset.
Charset                      Specifies the session character set for encoding and decoding character data transferred to and from the Parquet file. The default value is UTF-8.
ClientCulture                This property can be used to specify the format of data (e.g., currency values) that is accepted by the client application. This property can be used when the client application does not support the machine's culture settings. For example, Microsoft Access requires 'en-US'.
Compression                  Specifies the compression encoding to use when creating .parquet files using Create Table statements and bulk inserts.
Culture                      This setting can be used to specify culture settings that determine how the provider interprets certain data types that are passed into the provider. For example, setting Culture='de-DE' will output German formats even on an American machine.
DeleteDownloadedFiles        When set to true, the provider will delete parsed .parquet files downloaded from cloud sources.
DirectoryRetrievalDepth      Limits the subfolders recursively scanned when IncludeSubdirectories is enabled.
EnableDictionary             When set to true, the provider enables dictionary encoding when creating .parquet files using Create Table statements and bulk inserts.
ExcludeFiles                 Comma-separated list of file extensions to exclude from the set of the files modeled as tables.
FolderId                     The ID of a folder in Google Drive. If set, the resource location specified by the URI is relative to the Folder ID for all operations.
IncludeDropboxTeamResources  Indicates if you want to include Dropbox team files and folders.
IncludeFiles                 Comma-separated list of file extensions to include in the set of the files modeled as tables.
IncludeItemsFromAllDrives    Whether Google Drive shared drive items should be included in results. If not present or set to false, then shared drive items are not returned.
IncludeSubdirectories        Whether to read files from nested folders. In the case of a name collision, table names are prefixed by the underscore-separated folder names.
InsertMode                   The behavior when using bulk inserts to create Parquet files.
MaxRows                      Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This takes precedence over LIMIT clauses.
MetadataDiscoveryURI         Used when aggregating multiple files into one table, this property specifies a specific file to read to determine the aggregated table schema.
Other                        These hidden properties are used only in specific use cases.
PageSize                     (Optional) PageSize value.
PathSeparator                Determines the character which will be used to replace the file separator.
PseudoColumns                Specify a set of pseudocolumns to expose as columns.
Readonly                     You can use this property to enforce read-only access to Parquet from the provider.
RTK                          The runtime key used for licensing.
TemporaryLocalFolder         The path, or URI, to the folder that is used to temporarily download parquet file(s).
Timeout                      The value in seconds until the timeout error is thrown, canceling the operation.
UserDefinedViews             A filepath pointing to the JSON configuration file containing your custom views.
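
The IncludeSubdirectories description says that colliding table names are prefixed by the underscore-separated folder names. The sketch below is one possible reading of that rule, not the provider's implementation: the function table_names is hypothetical, and details such as dropping the file extension and leaving non-colliding or top-level files unprefixed are assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath


def table_names(relative_paths):
    """Map file paths to table names, prefixing only on collisions.

    Hypothetical helper; paths are relative to the URI folder.
    """
    stems = Counter(PurePosixPath(p).stem for p in relative_paths)
    names = {}
    for p in relative_paths:
        path = PurePosixPath(p)
        folders = path.parent.parts  # empty tuple for top-level files
        if stems[path.stem] > 1 and folders:
            names[p] = "_".join(folders + (path.stem,))
        else:
            names[p] = path.stem
    return names


files = ["sales.parquet", "2023/sales.parquet", "2023/q1/orders.parquet"]
for path, name in table_names(files).items():
    print(path, "->", name)
```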

Copyright (c) 2024 CData Software, Inc. - All rights reserved.