Build 24.0.9175
  • Google BigQuery
    • Getting Started
      • Establishing a Connection
      • Advanced Integrations
      • Minimum Required Roles
      • SSL Configuration
      • Firewall and Proxy
    • Data Model
      • Tables
      • Views
        • Datasets
        • PartitionsList
        • PartitionsValues
        • Projects
      • Stored Procedures
        • CancelJob
        • DeleteTable
        • GetJob
        • InsertJob
        • InsertLoadJob
      • System Tables
        • sys_catalogs
        • sys_schemas
        • sys_tables
        • sys_tablecolumns
        • sys_procedures
        • sys_procedureparameters
        • sys_keycolumns
        • sys_foreignkeys
        • sys_primarykeys
        • sys_indexes
        • sys_connection_props
        • sys_sqlinfo
        • sys_identity
        • sys_information
      • Data Type Mapping
    • Connection String Options
      • Authentication
        • AuthScheme
        • ProjectId
        • DatasetId
        • BillingProjectId
      • BigQuery
        • AllowLargeResultSets
        • DestinationTable
        • UseQueryCache
        • PageSize
        • PollingInterval
        • AllowUpdatesWithoutKey
        • FilterColumns
        • UseLegacySQL
        • PrivateEndpointName
      • Storage API
        • UseStorageAPI
        • UseArrowFormat
        • StorageThreshold
        • StoragePageSize
      • Uploading
        • InsertMode
        • WaitForBatchResults
        • TempTableDataset
      • OAuth
        • OAuthClientId
        • OAuthClientSecret
        • DelegatedServiceAccounts
        • RequestingServiceAccount
      • JWT OAuth
        • OAuthJWTCert
        • OAuthJWTCertType
        • OAuthJWTCertPassword
        • OAuthJWTCertSubject
        • OAuthJWTIssuer
        • OAuthJWTSubject
      • SSL
        • SSLServerCert
      • Logging
        • Verbosity
      • Schema
        • BrowsableSchemas
        • RefreshViewSchemas
        • ShowTableDescriptions
        • PrimaryKeyIdentifiers
        • AllowedTableTypes
        • FlattenObjects
      • Miscellaneous
        • StorageTimeout
        • EmptyArraysAsNull
        • HidePartitionColumns
        • AllowAggregateParameters
        • ApplicationName
        • AuditLimit
        • AuditMode
        • AWSWorkloadIdentityConfig
        • BigQueryOptions
        • MaximumBillingTier
        • MaximumBytesBilled
        • MaxRows
        • PseudoColumns
        • SupportCaseSensitiveTables
        • TableSamplePercent
        • Timeout
        • WorkloadPoolId
        • WorkloadProjectId
        • WorkloadProviderId
    • Third Party Copyrights

Google BigQuery - CData Cloud

Overview

CData Cloud offers access to Google BigQuery across several standard services and protocols, in a cloud-hosted solution. Any application that can connect to a MySQL or SQL Server database can connect to Google BigQuery through CData Cloud.

CData Cloud allows you to standardize and configure connections to Google BigQuery as though it were any other OData endpoint, or standard SQL Server/MySQL database.

Key Features

  • Full SQL Support: Google BigQuery appears as a standard relational database, allowing you to perform operations - Filter, Group, Join, etc. - using standard SQL, regardless of whether these operations are supported by the underlying API.
  • CRUD Support: Both read and write operations are supported, restricted only by security settings that you can configure in Cloud or downstream in the source itself.
  • Secure Access: The administrator can create users and define their access to specific databases and read-only operations or grant full read & write privileges.
  • Comprehensive Data Model & Dynamic Discovery: CData Cloud provides comprehensive access to all of the data exposed in the underlying data source, including full access to dynamic data and easily searchable metadata.

Getting Started

This page provides a guide to Establishing a Connection to Google BigQuery in CData Cloud, as well as information on the available resources, and a reference to the available connection properties.

Connecting to Google BigQuery

Establishing a Connection shows how to authenticate to Google BigQuery and configure any necessary connection properties to create a database in CData Cloud.

Accessing Data from CData Cloud Services

Accessing data from Google BigQuery through the available standard services and CData Cloud administration is documented in further detail in the CData Cloud Documentation.

Establishing a Connection

Connect to Google BigQuery by selecting the corresponding icon in the Database tab. Required properties are listed under Settings. The Advanced tab lists connection properties that are not typically required.

Connecting to Google BigQuery

By default, the CData Cloud connects to all available projects in your account. To limit the scope of your connection, set combinations of the following properties:

  • ProjectId: specifies which projects the driver connects to
  • BillingProjectId: specifies which projects are billed
  • DatasetId: specifies which datasets the driver accesses
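
For example, a connection scoped to a single project and dataset might set properties like the following (all values here are placeholders, not real project or dataset names):

AuthScheme=OAuth;ProjectId=my-project;DatasetId=my_dataset;BillingProjectId=my-billing-project;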

Authenticating to Google BigQuery

The Cloud supports using user accounts, service accounts, and GCP instance accounts for authentication.

The following sections discuss the available authentication schemes for Google BigQuery:

  • User Accounts (OAuth)
  • Service Account (OAuthJWT)
  • GCP Instance Account

User Accounts (OAuth)

AuthScheme must be set to OAuth in all user account flows.

Web Applications

When connecting via a Web application, you need to create and register a custom OAuth application with Google BigQuery. You can then use the Cloud to acquire and manage the OAuth token values. See Creating a Custom OAuth App for more information about custom applications.

Get an OAuth Access Token

Set the following connection properties to obtain the OAuthAccessToken:

  • OAuthClientId: Set this to the Client Id in your application settings.
  • OAuthClientSecret: Set this to the Client Secret in your application settings.

Then call stored procedures to complete the OAuth exchange:

  1. Call the GetOAuthAuthorizationURL stored procedure. Set the CallbackURL input to the Callback URL you specified in your application settings. The stored procedure returns the URL to the OAuth endpoint.
  2. Navigate to the URL that the stored procedure returned in Step 1. Log in to the custom OAuth application and authorize the web application. Once authenticated, the browser redirects you to the callback URL.
  3. Call the GetOAuthAccessToken stored procedure. Set AuthMode to WEB and the Verifier input to the "code" parameter in the query string of the callback URL.
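
As a sketch, the calls in this exchange might look like the following (the callback URL and verifier code are placeholders, and your application may accept additional inputs):

EXEC GetOAuthAuthorizationURL CallbackURL = 'http://localhost:33333/'

EXEC GetOAuthAccessToken AuthMode = 'WEB', Verifier = '<code from the callback URL>'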

Once you have obtained the access and refresh tokens, you can connect to data and refresh the OAuth access token either automatically or manually.

Automatic Refresh of the OAuth Access Token

To have the driver automatically refresh the OAuth access token, set the following on the first data connection:

  • InitiateOAuth: Set this to REFRESH.
  • OAuthClientId: Set this to the Client Id in your application settings.
  • OAuthClientSecret: Set this to the Client Secret in your application settings.
  • OAuthAccessToken: Set this to the access token returned by GetOAuthAccessToken.
  • OAuthRefreshToken: Set this to the refresh token returned by GetOAuthAccessToken.
  • OAuthSettingsLocation: Set this to the location where the Cloud saves the OAuth token values, which persist across connections.

On subsequent data connections, the values for OAuthAccessToken and OAuthRefreshToken are taken from OAuthSettingsLocation.
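
Assembled into a single connection string, the first connection might look like this sketch (every value is a placeholder):

InitiateOAuth=REFRESH;OAuthClientId=MyClientId;OAuthClientSecret=MyClientSecret;OAuthAccessToken=<access token>;OAuthRefreshToken=<refresh token>;OAuthSettingsLocation=/path/to/OAuthSettings.txt;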

Manual Refresh of the OAuth Access Token

The only value needed to manually refresh the OAuth access token when connecting to data is the OAuth refresh token.

To manually refresh the OAuthAccessToken with the RefreshOAuthAccessToken stored procedure after the ExpiresIn period returned by GetOAuthAccessToken has elapsed, first set the following connection properties:

  • OAuthClientId: Set this to the Client Id in your application settings.
  • OAuthClientSecret: Set this to the Client Secret in your application settings.

Then call RefreshOAuthAccessToken with OAuthRefreshToken set to the OAuth refresh token returned by GetOAuthAccessToken. After the new tokens have been retrieved, open a new connection by setting the OAuthAccessToken property to the value returned by RefreshOAuthAccessToken.
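
For illustration, the refresh call itself might look like this (the token value is a placeholder):

EXEC RefreshOAuthAccessToken OAuthRefreshToken = '<refresh token from GetOAuthAccessToken>'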

Finally, store the OAuth refresh token so that you can use it to manually refresh the OAuth access token after it has expired.

Headless Machines

To configure the driver to use OAuth with a user account on a headless machine, you need to authenticate on another device that has an internet browser.

  1. Choose one of two options:
    • Option 1: Obtain the OAuthVerifier value as described in "Obtain and Exchange a Verifier Code" below.
    • Option 2: Install the Cloud on a machine with an Internet browser and transfer the OAuth authentication values after you authenticate through the usual browser-based flow, as described in "Transfer OAuth Settings" below.
  2. Then configure the Cloud to automatically refresh the access token on the headless machine.

Option 1: Obtain and Exchange a Verifier Code

To obtain a verifier code, you must authenticate at the OAuth authorization URL.

Follow the steps below to authenticate from the machine with an Internet browser and obtain the OAuthVerifier connection property.

  1. Choose one of these options:
    • If you are using the Embedded OAuth Application, click Google BigQuery OAuth endpoint to open the endpoint in your browser.
    • If you are using a custom OAuth application, create the Authorization URL by setting the following properties:
      • InitiateOAuth: Set to OFF.
      • OAuthClientId: Set to the client Id assigned when you registered your application.
      • OAuthClientSecret: Set to the client secret assigned when you registered your application.
      Then call the GetOAuthAuthorizationURL stored procedure with the appropriate CallbackURL. Open the URL returned by the stored procedure in a browser.
  2. Log in and grant permissions to the Cloud. You are then redirected to the callback URL, which contains the verifier code.
  3. Save the value of the verifier code. Later you will set this in the OAuthVerifier connection property.

Next, you need to exchange the OAuth verifier code for OAuth refresh and access tokens. On the headless machine, set the following connection properties to obtain the OAuth authentication values:

  • InitiateOAuth: Set this to REFRESH.
  • OAuthVerifier: Set this to the verifier code.
  • OAuthClientId: (custom applications only) Set this to the Client Id in your custom OAuth application settings.
  • OAuthClientSecret: (custom applications only) Set this to the Client Secret in the custom OAuth application settings.
  • OAuthSettingsLocation: Set this to persist the encrypted OAuth authentication values to the specified location.
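
As a connection string, this one-time exchange might look like the following sketch (all values are placeholders; the client Id and secret apply to custom applications only):

InitiateOAuth=REFRESH;OAuthVerifier=<verifier code>;OAuthClientId=MyClientId;OAuthClientSecret=MyClientSecret;OAuthSettingsLocation=/path/to/OAuthSettings.txt;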

After the OAuth settings file is generated, you need to re-set the following properties to connect:

  • InitiateOAuth: Set this to REFRESH.
  • OAuthClientId: (custom applications only) Set this to the client Id assigned when you registered your application.
  • OAuthClientSecret: (custom applications only) Set this to the client secret assigned when you registered your application.
  • OAuthSettingsLocation: Set this to the location containing the encrypted OAuth authentication values. Make sure this location gives read and write permissions to the Cloud to enable the automatic refreshing of the access token.

Option 2: Transfer OAuth Settings

Prior to connecting on a headless machine, you need to create and install a connection with the driver on a device that supports an Internet browser. Set the connection properties as described in "Desktop Applications" above.

After completing the instructions in "Desktop Applications", the resulting authentication values are encrypted and written to the location specified by OAuthSettingsLocation. The default filename is OAuthSettings.txt.

Once you have successfully tested the connection, copy the OAuth settings file to your headless machine.

On the headless machine, set the following connection properties to connect to data:

  • InitiateOAuth: Set this to REFRESH.
  • OAuthClientId: (custom applications only) Set this to the client Id assigned when you registered your application.
  • OAuthClientSecret: (custom applications only) Set this to the client secret assigned when you registered your application.
  • OAuthSettingsLocation: Set this to the location of your OAuth settings file. Make sure this location gives read and write permissions to the Cloud to enable the automatic refreshing of the access token.

GCP Instance Accounts

When running on a GCP virtual machine, the Cloud can authenticate using a service account tied to the virtual machine. To use this mode, set AuthScheme to GCPInstanceAccount.

Advanced Integrations

The following sections detail Cloud settings that may be needed in advanced integrations.

Saving Result Sets

Large result sets must be saved in a temporary or permanent table. You can use the following properties to control table persistence:

Automatic Result Tables

Enable the AllowLargeResultSets property to make the Cloud automatically create destination tables when needed. If a query result is too large to fit in the BigQuery query cache, the Cloud creates a hidden dataset within the data project and re-executes the query with a destination table in that dataset. The dataset is configured so that all tables created within it expire in 24 hours.

In some situations you may want to change the name of the dataset created by the Cloud: for example, if multiple users share the Cloud but do not have permissions to write to datasets created by other users. See TempTableDataset for details on how to do this.
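
For example, a connection that enables automatic result tables and renames the hidden dataset might include properties like these (the dataset name is a placeholder):

AllowLargeResultSets=true;TempTableDataset=my_temp_tables;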

Explicit Result Tables

Set the DestinationTable property to make the Cloud write query results to the given table. Writing query results to a single table imposes several limitations that you should keep in mind when using this option:

  • Two query results cannot be read at the same time on the same connection. If two queries are executed and their results are read at the same time, the last query to finish executing overwrites the data from the other query.
  • The dataset must be created in the same region as your tables. BigQuery does not support writing a destination table in a different region than where a query was executed.
  • Do not rely on the Cloud to create a temporary table for every query. Some queries are processed internally or read directly from a table without executing a query job on BigQuery.

Limiting Billing

Set MaximumBillingTier to override your project limits on the maximum cost for any given query in a connection.

Bulk Modes

Google BigQuery provides several interfaces for operating on batches of rows. The Cloud supports these methods through the InsertMode option, each of which is specialized for different use cases:

  • The Streaming API is intended for use where the most important factor is being able to insert quickly. However, rows which are inserted via the API are queued and only appear in the table after a delay. Sometimes this delay can be as high as 20-30 minutes which makes this API incompatible with cases where you want to insert data and then run other operations on it immediately. You should avoid modifying the table while any rows are in the streaming queue: Google BigQuery prevents DML operations from running on the table while any rows are in the streaming queue, and changing the table's metadata (name, schema, etc.) may cause streamed rows that haven't been committed to be lost.
  • The DML mode uses Standard SQL INSERT queries to upload data. This is by far the most robust method of uploading data because any errors in the uploaded rows are reported immediately. The Cloud also uses this API in a synchronous way, so once the INSERT is processed, any rows can be used by other operations without waiting. However, it is by far the slowest insert method and should only be used for small data volumes.
  • The Upload mode uses the multipart upload API for uploading data. This method is intended for performing low-cost medium to large data loads within a reasonable time. When using this mode the Cloud uploads the inserted rows to Google-managed storage and then creates a load job for them. The Cloud can either wait for this job (see WaitForBatchResults) or let it run asynchronously. Waiting for the job reports any errors that the job encounters but takes more time. Determining whether the job failed without waiting for it requires manually checking the job status via the job stored procedures.
  • The GCSStaging mode is the same as Upload except that it uses your Google Cloud Storage account to store staged data instead of Google-managed storage. The Cloud cannot act asynchronously in this mode because it must delete the file after the load is complete, which means that WaitForBatchResults has no effect.
    Because this depends on external data, you must set the GCSBucket to the name of your bucket and ensure that Scope (a space delimited set of scopes) contains at least the scopes https://www.googleapis.com/auth/bigquery and https://www.googleapis.com/auth/devstorage.read_write. The devstorage scope used for GCS also requires that you connect using a service account because Google BigQuery does not allow user accounts to use this scope.
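
A sketch of the properties involved in GCSStaging mode, assuming a hypothetical bucket name:

InsertMode=GCSStaging;GCSBucket=my-staging-bucket;Scope="https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/devstorage.read_write";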

In addition to bulk INSERTs, the Cloud also supports performing bulk UPDATE and DELETE operations. This requires the Cloud to upload the data containing the filters and rows to set into a new table in BigQuery, then perform a MERGE between the two tables and drop the temporary table. InsertMode determines how the rows are inserted into the temporary table but the Streaming and DML modes are not supported.

In most cases the Cloud can determine what columns need to be part of the SET vs. WHERE clauses of a bulk update. If you receive an error like "Primary keys must be defined for bulk UPDATE support," you can use PrimaryKeyIdentifiers to tell the Cloud what columns to treat as keys. In an update the values of key columns are used only to find matching rows and cannot be updated.
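
For illustration, assuming a hypothetical Customers table whose CustomerId column has been designated a key via PrimaryKeyIdentifiers, a bulk update would use the key values only to match the rows being changed:

UPDATE Northwind.Customers SET ContactName = 'Maria Anders' WHERE CustomerId = 'ALFKI'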

Minimum Required Roles

Minimum Required Roles for Service Accounts

The following roles allow SELECT queries to work with a service account:

  • BigQuery Data Viewer (roles/bigquery.dataViewer): read data and metadata
  • BigQuery Filtered Data Viewer (roles/bigquery.filteredDataViewer): view filtered table data
  • BigQuery Job User (roles/bigquery.jobUser): run jobs, including queries, within the project

SSL Configuration

Customizing the SSL Configuration

By default, the Cloud attempts to negotiate TLS with the server. The server certificate is validated against the default system trusted certificate store. You can override how the certificate gets validated using the SSLServerCert connection property.

To specify another certificate, see the SSLServerCert connection property.

Firewall and Proxy

Connecting Through a Firewall or Proxy

HTTP Proxies

To authenticate to an HTTP proxy, set the following:

  • ProxyServer: the hostname or IP address of the proxy server that you want to route HTTP traffic through.
  • ProxyPort: the TCP port that the proxy server is running on.
  • ProxyAuthScheme: the authentication method the Cloud uses when authenticating to the proxy server.
  • ProxyUser: the username of a user account registered with the proxy server.
  • ProxyPassword: the password associated with the ProxyUser.

Other Proxies

Set the following properties:

  • To use a proxy-based firewall, set FirewallType, FirewallServer, and FirewallPort.
  • To tunnel the connection, set FirewallType to TUNNEL.
  • To authenticate, specify FirewallUser and FirewallPassword.
  • To authenticate to a SOCKS proxy, additionally set FirewallType to SOCKS5.

Data Model

The CData Cloud models the data as defined within Google BigQuery for the ProjectId and DatasetId configured.

Views

Views are client-side tables that cannot be modified. The Cloud uses these to report metadata about the Google BigQuery projects and datasets it is connected to.

In addition, the Cloud supports server-side views defined within Google BigQuery. These views may be used in SELECT statements the same way as tables. However, view schemas can easily become out of date and require the Cloud to refresh them. Please see RefreshViewSchemas for more details.

External Data Sources

Google BigQuery allows creating external datasets that store data in Amazon S3 regions (like aws-us-east-1) or Azure Storage regions (like azure-eastus2). The Cloud supports these datasets with two major limitations:

  1. Google BigQuery treats external tables as read-only. You cannot execute INSERT, UPDATE or DELETE queries on them. They are also incompatible with DestinationTable because Google BigQuery cannot create destination tables in an external dataset.
  2. Google BigQuery does not support the Storage API for external datasets. You must disable the UseStorageAPI option in order to query them, as shown in the sketch below. This limits the read throughput of the Cloud, so if you are executing large queries it is recommended that you copy your data into Google BigQuery for the best performance.
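
For example, to query a table in an external dataset you might disable the Storage API on the connection and then query it as usual (the project, dataset, and table names below are hypothetical):

UseStorageAPI=false;

SELECT * FROM `my-project`.aws_dataset.external_table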

Stored Procedures

Stored Procedures are function-like interfaces to the data source. The Cloud uses these to manage Google BigQuery tables and jobs and to perform OAuth operations.

In addition to the client-side stored procedures offered by the Cloud, there is also support for server-side stored procedures defined in Google BigQuery. The Cloud supports both CALL and EXEC using the procedure's parameter names. Note that the Cloud only supports IN parameters and resultset return values.

CALL `psychic-valve-137816`.Northwind.MostPopularProduct()
CALL `psychic-valve-137816`.Northwind.GetStockedValue(24, 0.75)

EXEC `psychic-valve-137816`.Northwind.MostPopularProduct
EXEC `psychic-valve-137816`.Northwind.GetStockedValue productId = 24, discountRate = 0.75

Additional Metadata

Table Descriptions

Google BigQuery supports setting descriptions on tables but the Cloud does not report these by default. ShowTableDescriptions can be used to report table descriptions.

Primary Keys

Google BigQuery does not support primary keys natively, but the Cloud allows you to define them so they can be used in environments that require primary keys to modify data. Primary keys can be defined using the PrimaryKeyIdentifiers option.

Policy Tags

If policy tags from the Data Catalog service are defined on a table, they can be retrieved from the system tables using the PolicyTags column:

SELECT ColumnName, PolicyTags FROM sys_tablecolumns
WHERE CatalogName = 'psychic-valve-137816'
AND SchemaName = 'Northwind'
AND TableName = 'Customers'

Tables

Table definitions are dynamically generated based on the table definitions within Google BigQuery for the Project and Dataset specified in the connection string options.

Views

Views are similar to tables in the way that data is represented; however, views are read-only.

Queries can be executed against a view as if it were a normal table.

CData Cloud - Google BigQuery Views

Name Description
Datasets Lists all the accessible datasets for a given project.
PartitionsList Lists the partitioning definitions for tables.
PartitionsValues Lists the partitioning ranges for tables.
Projects Lists all the projects for the authorized user.

Datasets

Lists all the accessible datasets for a given project.

Columns

Name Type Description
Id [KEY] String The fully qualified, unique, opaque Id of the dataset.
Kind String The resource type.
FriendlyName String A descriptive name for the dataset.
DatasetReference_ProjectId String A unique reference to the container project.
DatasetReference_DatasetId String A unique reference to the dataset, without the project name.

PartitionsList

Lists the partitioning definitions for tables.

Columns

Name Type Description
Id [KEY] String A unique identifier for the partition.
ProjectId String The project that the table belongs to.
DatasetId String The dataset that the table belongs to.
TableName String The name of the table.
ColumnName String The name of the column used for partitioning.
ColumnType String The type of the partitioning column.
Kind String The type of partitioning used by the table. One of DATE, RANGE or INGESTION.
RequireFilter Boolean Whether a filter on the partition column is required to query the table.

PartitionsValues

Lists the partitioning ranges for tables.

Columns

Name Type Description
Id String A unique identifier for the partition.
RangeLow String The lowest value of the partition column. Either an integer when Kind is RANGE, or a date otherwise.
RangeHigh String The highest value of the partition column. Either an integer when Kind is RANGE, or a date otherwise.
RangeInterval String The range of values which are included in each partition. Only valid when Kind is RANGE.
DateResolution String How much of the date is significant to a TIME or INGESTION partition column. One of DAY, HOUR, MONTH or YEAR.

Projects

Lists all the projects for the authorized user.

Columns

Name Type Description
Id [KEY] String The unique identifier of the project.
Kind String The resource type.
FriendlyName String A descriptive name for the project.
NumericId String The numeric Id of the project.
ProjectReference_ProjectId String A unique reference to the project.

Stored Procedures

Stored procedures are function-like interfaces that extend the functionality of the Cloud beyond simple SELECT/INSERT/UPDATE/DELETE operations with Google BigQuery.

Stored procedures accept a list of parameters, perform their intended function, and then return any relevant response data from Google BigQuery, along with an indication of whether the procedure succeeded or failed.

CData Cloud - Google BigQuery Stored Procedures

Name Description
CancelJob Cancels a running BigQuery job.
DeleteTable Deletes the specified table from Google BigQuery.
GetJob Retrieves the configuration information and execution state for an existing job.
InsertJob Inserts a Google BigQuery job, which can then be selected later to retrieve the query results.
InsertLoadJob Inserts a Google BigQuery load job, which adds data from Google Cloud Storage into an existing table.

CancelJob

Cancels a running BigQuery job.

Input

Name Type Description
JobId String The Id of the job you wish to cancel.
Region String The region where the job is executing. Not required if the job is in the US or EU region.

Result Set Columns

Name Type Description
JobId String The JobId of the cancelled Job.
Region String The region where the job was executing.
Configuration_query_query String The query of the cancelled job.
Configuration_query_destinationTable_tableId String The destination table tableId of the cancelled job.
Configuration_query_destinationTable_projectId String The destination table projectId of the cancelled job.
Configuration_query_destinationTable_datasetId String The destination table datasetId of the cancelled job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.

DeleteTable

Deletes the specified table from Google BigQuery.

Input

Name Type Description
TableId String The TableId of the table you wish to delete. ProjectId and DatasetId can come from the connection properties; to override them, use the format projectId:datasetId.TableId.

Result Set Columns

Name Type Description
Success String Returns true if the operation is successful; otherwise, an exception is thrown.

GetJob

Retrieves the configuration information and execution state for an existing job.

Input

Name Type Description
JobId String The Id of the job you wish to return.
Region String The region where the job is executing. Not required if the job is in the US or EU region.

Result Set Columns

Name Type Description
JobId String The JobId of the requested job.
Region String The region where the job is executing.
Configuration_query_query String The query of the newly inserted Job.
Configuration_query_destinationTable_tableId String The destination table tableId of the newly inserted Job.
Configuration_query_destinationTable_projectId String The destination table projectId of the newly inserted Job.
Configuration_query_destinationTable_datasetId String The destination table datasetId of the newly inserted Job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.

InsertJob

Inserts a Google BigQuery job, which can then be selected later to retrieve the query results.

Input

Name Type Description
Query String The query to submit to Google BigQuery.
IsDML String Should be true if the query is a DML statement and false otherwise.

The default value is false.

DestinationTable String The destination table for the query, in the format DestProjectId:DestDatasetId.DestTable
WriteDisposition String How to write data to the destination table, such as truncate existing results, appending existing results, or writing only when the table is empty.

The allowed values are WRITE_TRUNCATE, WRITE_APPEND, WRITE_EMPTY.

The default value is WRITE_TRUNCATE.

DryRun String Whether or not this is a dry run of the job.
MaximumBytesBilled String If provided, BigQuery will cancel the job if it attempts to process more than this many bytes.
Region String The region to start executing the job in.

Result Set Columns

Name Type Description
JobId String The JobId of the newly inserted Job.
Region String The region where the job is executing.
Configuration_query_query String The query of the newly inserted Job.
Configuration_query_destinationTable_tableId String The destination table tableId of the newly inserted Job.
Configuration_query_destinationTable_projectId String The destination table projectId of the newly inserted Job.
Configuration_query_destinationTable_datasetId String The destination table datasetId of the newly inserted Job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.
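
As a sketch, inserting a job and later polling its state with GetJob might look like the following (the project, dataset, and table names are placeholders):

EXEC InsertJob Query = 'SELECT * FROM Northwind.Customers', DestinationTable = 'my-project:Northwind.CustomersCopy'

EXEC GetJob JobId = '<JobId returned by InsertJob>'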

InsertLoadJob

Inserts a Google BigQuery load job, which adds data from Google Cloud Storage into an existing table.

Input

Name Type Description
SourceURIs String A space-separated list of Google Cloud Storage URIs.
SourceFormat String The source format that the files are formatted in.

The allowed values are AVRO, NEWLINE_DELIMITED_JSON, DATASTORE_BACKUP, PARQUET, ORC, CSV.

DestinationTable String The destination table for the query, in the format DestProjectId.DestDatasetId.DestTable
DestinationTableProperties String A JSON object containing the table friendlyName, description and list of labels.
DestinationTableSchema String A JSON list containing the fields used to create the table.
DestinationEncryptionConfiguration String A JSON object giving the KMS encryption settings for the table.
SchemaUpdateOptions String A JSON list giving the options to apply when updating the destination table schema.
TimePartitioning String A JSON object giving the time partitioning type and field.
RangePartitioning String A JSON object giving the range partitioning field and buckets.
Clustering String A JSON object giving the fields to be used for clustering.
Autodetect String Whether options and schema should be automatically determined for JSON and CSV files.
CreateDisposition String Whether to create the destination table if it does not exist.

The allowed values are CREATE_IF_NEEDED, CREATE_NEVER.

The default value is CREATE_IF_NEEDED.

WriteDisposition String How to write data to the destination table, such as truncate existing results, appending existing results, or writing only when the table is empty.

The allowed values are WRITE_TRUNCATE, WRITE_APPEND, WRITE_EMPTY.

The default value is WRITE_APPEND.

Region String The region to start executing the job in. Both the GCS resources and the BigQuery dataset must be in the same region.
DryRun String Whether or not this is a dry run of the job.

The default value is false.

MaximumBadRecords String If provided, the number of records that can be invalid before the entire job is canceled. By default all records must be valid.

The default value is 0.

IgnoreUnknownValues String Whether to ignore unknown fields in the input file or treat them as errors. By default they are treated as errors.

The default value is false.

AvroUseLogicalTypes String Whether to use Avro logical types when converting Avro data into BigQuery types.

The default value is true.

CSVSkipLeadingRows String How many rows to skip at the start of CSV files. Usually used for skipping header rows.
CSVEncoding String The name of the encoding used for CSV files.

The allowed values are ISO-8859-1, UTF-8.

The default value is UTF-8.

CSVNullMarker String If provided, this string is used for NULL values within CSV files. By default CSV files cannot use NULL.
CSVFieldDelimiter String The character used to separate columns within CSV files.

The default value is ,.

CSVQuote String The character used for quoted fields in CSV files. May be set to empty to disable quoting.

The default value is ".

CSVAllowQuotedNewlines String Whether CSV files can contain newlines within quoted fields.

The default value is false.

CSVAllowJaggedRows String Whether lines in CSV files may contain missing fields.

The default value is false.

DSBackupProjectionFields String A JSON list of fields to load from a Cloud datastore backup.
ParquetOptions String A JSON object giving the Parquet-specific import options.
DecimalTargetTypes String A JSON list giving the preference order applied to numeric types.
HivePartitioningOptions String A JSON object giving the source-side partitioning options.

Result Set Columns

Name Type Description
JobId String The JobId of the newly inserted Job.
Region String The region where the job is executing.
Configuration_load_destinationTable_tableId String The destination table tableId of the newly inserted Job.
Configuration_load_destinationTable_projectId String The destination table projectId of the newly inserted Job.
Configuration_load_destinationTable_datasetId String The destination table datasetId of the newly inserted Job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.
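
For illustration, loading a CSV file from a hypothetical bucket into an existing table might look like this:

EXEC InsertLoadJob SourceURIs = 'gs://my-bucket/data.csv', SourceFormat = 'CSV', DestinationTable = 'my-project.Northwind.Customers', CSVSkipLeadingRows = '1'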

System Tables

You can query the system tables described in this section to access schema information, information on data source functionality, and batch operation statistics.

Schema Tables

The following tables return database metadata for Google BigQuery:

  • sys_catalogs: Lists the available databases.
  • sys_schemas: Lists the available schemas.
  • sys_tables: Lists the available tables and views.
  • sys_tablecolumns: Describes the columns of the available tables and views.
  • sys_procedures: Describes the available stored procedures.
  • sys_procedureparameters: Describes stored procedure parameters.
  • sys_keycolumns: Describes the primary and foreign keys.
  • sys_indexes: Describes the available indexes.

Data Source Tables

The following tables return information about how to connect to and query the data source:

  • sys_connection_props: Returns information on the available connection properties.
  • sys_sqlinfo: Describes the SELECT queries that the Cloud can offload to the data source.

Query Information Tables

The following table returns query statistics for data modification queries, including batch operations:

  • sys_identity: Returns information about batch operations or single updates.

sys_catalogs

Lists the available databases.

The following query retrieves all databases determined by the connection string:

SELECT * FROM sys_catalogs

Columns

Name Type Description
CatalogName String The database name.

sys_schemas

Lists the available schemas.

The following query retrieves all available schemas:

SELECT * FROM sys_schemas

Columns

Name Type Description
CatalogName String The database name.
SchemaName String The schema name.

sys_tables

Lists the available tables and views.

The following query retrieves the available tables and views:

SELECT * FROM sys_tables

Columns

Name Type Description
CatalogName String The database containing the table or view.
SchemaName String The schema containing the table or view.
TableName String The name of the table or view.
TableType String The table type (table or view).
Description String A description of the table or view.
IsUpdateable Boolean Whether the table can be updated.

sys_tablecolumns

Describes the columns of the available tables and views.

The following query returns the columns and data types for the [publicdata].[samples].github_nested table:

SELECT ColumnName, DataTypeName FROM sys_tablecolumns WHERE TableName='github_nested' AND CatalogName='publicdata' AND SchemaName='samples'

Columns

Name Type Description
CatalogName String The name of the database containing the table or view.
SchemaName String The schema containing the table or view.
TableName String The name of the table or view containing the column.
ColumnName String The column name.
DataTypeName String The data type name.
DataType Int32 An integer indicating the data type. This value is determined at run time based on the environment.
Length Int32 The storage size of the column.
DisplaySize Int32 The designated column's normal maximum width in characters.
NumericPrecision Int32 The maximum number of digits in numeric data. The column length in characters for character and date-time data.
NumericScale Int32 The column scale or number of digits to the right of the decimal point.
IsNullable Boolean Whether the column can contain null.
Description String A brief description of the column.
Ordinal Int32 The sequence number of the column.
IsAutoIncrement String Whether the column value is assigned in fixed increments.
IsGeneratedColumn String Whether the column is generated.
IsHidden Boolean Whether the column is hidden.
IsArray Boolean Whether the column is an array.
IsReadOnly Boolean Whether the column is read-only.
IsKey Boolean Indicates whether a field returned from sys_tablecolumns is the primary key of the table.

sys_procedures

Lists the available stored procedures.

The following query retrieves the available stored procedures:

SELECT * FROM sys_procedures

Columns

Name Type Description
CatalogName String The database containing the stored procedure.
SchemaName String The schema containing the stored procedure.
ProcedureName String The name of the stored procedure.
Description String A description of the stored procedure.
ProcedureType String The type of the procedure, such as PROCEDURE or FUNCTION.

sys_procedureparameters

Describes stored procedure parameters.

The following query returns information about all of the input parameters for the RefreshOAuthAccessToken stored procedure:

SELECT * FROM sys_procedureparameters WHERE ProcedureName='RefreshOAuthAccessToken' AND (Direction=1 OR Direction=2)

Columns

Name Type Description
CatalogName String The name of the database containing the stored procedure.
SchemaName String The name of the schema containing the stored procedure.
ProcedureName String The name of the stored procedure containing the parameter.
ColumnName String The name of the stored procedure parameter.
Direction Int32 An integer corresponding to the type of the parameter: input (1), input/output (2), or output (4). Input/output type parameters can be both input and output parameters.
DataTypeName String The name of the data type.
DataType Int32 An integer indicating the data type. This value is determined at run time based on the environment.
Length Int32 The number of characters allowed for character data. The number of digits allowed for numeric data.
NumericPrecision Int32 The maximum precision for numeric data. The column length in characters for character and date-time data.
NumericScale Int32 The number of digits to the right of the decimal point in numeric data.
IsNullable Boolean Whether the parameter can contain null.
IsRequired Boolean Whether the parameter is required for execution of the procedure.
IsArray Boolean Whether the parameter is an array.
Description String The description of the parameter.
Ordinal Int32 The index of the parameter.

sys_keycolumns

Describes the primary and foreign keys.

The following query retrieves the primary key for the [publicdata].[samples].github_nested table:

SELECT * FROM sys_keycolumns WHERE IsKey='True' AND TableName='github_nested' AND CatalogName='publicdata' AND SchemaName='samples'

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
IsKey Boolean Whether the column is a primary key in the table referenced in the TableName field.
IsForeignKey Boolean Whether the column is a foreign key referenced in the TableName field.
PrimaryKeyName String The name of the primary key.
ForeignKeyName String The name of the foreign key.
ReferencedCatalogName String The database containing the primary key.
ReferencedSchemaName String The schema containing the primary key.
ReferencedTableName String The table containing the primary key.
ReferencedColumnName String The column name of the primary key.

sys_foreignkeys

Describes the foreign keys.

The following query retrieves all foreign keys which refer to other tables:

SELECT * FROM sys_foreignkeys WHERE ForeignKeyType = 'FOREIGNKEY_TYPE_IMPORT'

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
PrimaryKeyName String The name of the primary key.
ForeignKeyName String The name of the foreign key.
ReferencedCatalogName String The database containing the primary key.
ReferencedSchemaName String The schema containing the primary key.
ReferencedTableName String The table containing the primary key.
ReferencedColumnName String The column name of the primary key.
ForeignKeyType String Designates whether the foreign key is an import (points to other tables) or export (referenced from other tables) key.

sys_primarykeys

Describes the primary keys.

The following query retrieves the primary keys from all tables and views:

SELECT * FROM sys_primarykeys

Columns

Name Type Description
CatalogName String The name of the database containing the key.
SchemaName String The name of the schema containing the key.
TableName String The name of the table containing the key.
ColumnName String The name of the key column.
KeySeq String The sequence number of the primary key.
KeyName String The name of the primary key.

sys_indexes

Describes the available indexes. By filtering on indexes, you can write more selective queries with faster query response times.

The following query retrieves all indexes that are not primary keys:

SELECT * FROM sys_indexes WHERE IsPrimary='false'

Columns

Name Type Description
CatalogName String The name of the database containing the index.
SchemaName String The name of the schema containing the index.
TableName String The name of the table containing the index.
IndexName String The index name.
ColumnName String The name of the column associated with the index.
IsUnique Boolean True if the index is unique. False otherwise.
IsPrimary Boolean True if the index is a primary key. False otherwise.
Type Int16 An integer value corresponding to the index type: statistic (0), clustered (1), hashed (2), or other (3).
SortOrder String The sort order: A for ascending or D for descending.
OrdinalPosition Int16 The sequence number of the column in the index.

sys_connection_props

Returns information on the available connection properties and those set in the connection string.

The following query retrieves all connection properties that have been set in the connection string or set through a default value:

SELECT * FROM sys_connection_props WHERE Value <> ''

Columns

Name Type Description
Name String The name of the connection property.
ShortDescription String A brief description.
Type String The data type of the connection property.
Default String The default value if one is not explicitly set.
Values String A comma-separated list of possible values. A validation error is thrown if another value is specified.
Value String The value you set or a preconfigured default.
Required Boolean Whether the property is required to connect.
Category String The category of the connection property.
IsSessionProperty String Whether the property is a session property, used to save information about the current connection.
Sensitivity String The sensitivity level of the property. This informs whether the property is obfuscated in logging and authentication forms.
PropertyName String A camel-cased truncated form of the connection property name.
Ordinal Int32 The index of the parameter.
CatOrdinal Int32 The index of the parameter category.
Hierarchy String Shows dependent properties associated that need to be set alongside this one.
Visible Boolean Informs whether the property is visible in the connection UI.
ETC String Various miscellaneous information about the property.

sys_sqlinfo

Describes the SELECT query processing that the Cloud can offload to the data source.

See SQL Compliance for SQL syntax details.

Discovering the Data Source's SELECT Capabilities

Below is an example data set of SQL capabilities. Some aspects of SELECT functionality are returned in a comma-separated list if supported; otherwise, the column contains NO.

Name Description Possible Values
AGGREGATE_FUNCTIONS Supported aggregation functions. AVG, COUNT, MAX, MIN, SUM, DISTINCT
COUNT Whether COUNT function is supported. YES, NO
IDENTIFIER_QUOTE_OPEN_CHAR The opening character used to escape an identifier. [
IDENTIFIER_QUOTE_CLOSE_CHAR The closing character used to escape an identifier. ]
SUPPORTED_OPERATORS A list of supported SQL operators. =, >, <, >=, <=, <>, !=, LIKE, NOT LIKE, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR
GROUP_BY Whether GROUP BY is supported, and, if so, the degree of support. NO, NO_RELATION, EQUALS_SELECT, SQL_GB_COLLATE
OJ_CAPABILITIES The supported varieties of outer joins. NO, LEFT, RIGHT, FULL, INNER, NOT_ORDERED, ALL_COMPARISON_OPS
OUTER_JOINS Whether outer joins are supported. YES, NO
SUBQUERIES Whether subqueries are supported, and, if so, the degree of support. NO, COMPARISON, EXISTS, IN, CORRELATED_SUBQUERIES, QUANTIFIED
STRING_FUNCTIONS Supported string functions. LENGTH, CHAR, LOCATE, REPLACE, SUBSTRING, RTRIM, LTRIM, RIGHT, LEFT, UCASE, SPACE, SOUNDEX, LCASE, CONCAT, ASCII, REPEAT, OCTET, BIT, POSITION, INSERT, TRIM, UPPER, REGEXP, LOWER, DIFFERENCE, CHARACTER, SUBSTR, STR, REVERSE, PLAN, UUIDTOSTR, TRANSLATE, TRAILING, TO, STUFF, STRTOUUID, STRING, SPLIT, SORTKEY, SIMILAR, REPLICATE, PATINDEX, LPAD, LEN, LEADING, KEY, INSTR, INSERTSTR, HTML, GRAPHICAL, CONVERT, COLLATION, CHARINDEX, BYTE
NUMERIC_FUNCTIONS Supported numeric functions. ABS, ACOS, ASIN, ATAN, ATAN2, CEILING, COS, COT, EXP, FLOOR, LOG, MOD, SIGN, SIN, SQRT, TAN, PI, RAND, DEGREES, LOG10, POWER, RADIANS, ROUND, TRUNCATE
TIMEDATE_FUNCTIONS Supported date/time functions. NOW, CURDATE, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, MONTH, QUARTER, WEEK, YEAR, CURTIME, HOUR, MINUTE, SECOND, TIMESTAMPADD, TIMESTAMPDIFF, DAYNAME, MONTHNAME, CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, EXTRACT
REPLICATION_SKIP_TABLES Indicates tables skipped during replication.
REPLICATION_TIMECHECK_COLUMNS A string array containing a list of columns which will be checked (in the given order) for use as a modified column during replication.
IDENTIFIER_PATTERN String value indicating what string is valid for an identifier.
SUPPORT_TRANSACTION Indicates if the provider supports transactions such as commit and rollback. YES, NO
DIALECT Indicates the SQL dialect to use.
KEY_PROPERTIES Indicates the properties which identify the uniform database.
SUPPORTS_MULTIPLE_SCHEMAS Indicates if multiple schemas may exist for the provider. YES, NO
SUPPORTS_MULTIPLE_CATALOGS Indicates if multiple catalogs may exist for the provider. YES, NO
DATASYNCVERSION The CData Data Sync version needed to access this driver. Standard, Starter, Professional, Enterprise
DATASYNCCATEGORY The CData Data Sync category of this driver. Source, Destination, Cloud Destination
SUPPORTSENHANCEDSQL Whether enhanced SQL functionality beyond what is offered by the API is supported. TRUE, FALSE
SUPPORTS_BATCH_OPERATIONS Whether batch operations are supported. YES, NO
SQL_CAP All supported SQL capabilities for this driver. SELECT, INSERT, DELETE, UPDATE, TRANSACTIONS, ORDERBY, OAUTH, ASSIGNEDID, LIMIT, LIKE, BULKINSERT, COUNT, BULKDELETE, BULKUPDATE, GROUPBY, HAVING, AGGS, OFFSET, REPLICATE, COUNTDISTINCT, JOINS, DROP, CREATE, DISTINCT, INNERJOINS, SUBQUERIES, ALTER, MULTIPLESCHEMAS, GROUPBYNORELATION, OUTERJOINS, UNIONALL, UNION, UPSERT, GETDELETED, CROSSJOINS, GROUPBYCOLLATE, MULTIPLECATS, FULLOUTERJOIN, MERGE, JSONEXTRACT, BULKUPSERT, SUM, SUBQUERIESFULL, MIN, MAX, JOINSFULL, XMLEXTRACT, AVG, MULTISTATEMENTS, FOREIGNKEYS, CASE, LEFTJOINS, COMMAJOINS, WITH, LITERALS, RENAME, NESTEDTABLES, EXECUTE, BATCH, BASIC, INDEX
PREFERRED_CACHE_OPTIONS A string value that specifies the preferred cacheOptions.
ENABLE_EF_ADVANCED_QUERY Indicates if the driver directly supports advanced queries coming from Entity Framework. If not, queries will be handled client side. YES, NO
PSEUDO_COLUMNS A string array indicating the available pseudo columns.
MERGE_ALWAYS If the value is true, the Merge Mode is forcibly executed in Data Sync. TRUE, FALSE
REPLICATION_MIN_DATE_QUERY A select query to return the replicate start datetime.
REPLICATION_MIN_FUNCTION Allows a provider to specify the formula name to use for executing a server-side min.
REPLICATION_START_DATE Allows a provider to specify a replicate start date.
REPLICATION_MAX_DATE_QUERY A select query to return the replicate end datetime.
REPLICATION_MAX_FUNCTION Allows a provider to specify the formula name to use for executing a server-side max.
IGNORE_INTERVALS_ON_INITIAL_REPLICATE A list of tables which will skip dividing the replicate into chunks on the initial replicate.
CHECKCACHE_USE_PARENTID Indicates whether the CheckCache statement should be done against the parent key column. TRUE, FALSE
CREATE_SCHEMA_PROCEDURES Indicates stored procedures that can be used for generating schema files.

The following query retrieves the operators that can be used in the WHERE clause:

SELECT * FROM sys_sqlinfo WHERE Name = 'SUPPORTED_OPERATORS'

Note that individual tables may have different limitations or requirements on the WHERE clause; refer to the Data Model section for more information.

Columns

Name Type Description
NAME String A component of SQL syntax, or a capability that can be processed on the server.
VALUE String Detail on the supported SQL or SQL syntax.

sys_identity

Returns information about attempted modifications.

The following query retrieves the Ids of the modified rows in a batch operation:

SELECT * FROM sys_identity

Columns

Name Type Description
Id String The database-generated Id returned from a data modification operation.
Batch String An identifier for the batch. 1 for a single operation.
Operation String The result of the operation in the batch: INSERTED, UPDATED, or DELETED.
Message String SUCCESS or an error message if the update in the batch failed.

sys_information

Describes the available system information.

The following query retrieves all columns:

SELECT * FROM sys_information

Columns

Name Type Description
Product String The name of the product.
Version String The version number of the product.
Datasource String The name of the datasource the product connects to.
NodeId String The unique identifier of the machine where the product is installed.
HelpURL String The URL to the product's help documentation.
License String The license information for the product. (If this information is not available, the field may be left blank or marked as 'N/A'.)
Location String The file path location where the product's library is stored.
Environment String The version of the environment or runtime the product is currently running under.
DataSyncVersion String The tier of CData Sync required to use this connector.
DataSyncCategory String The category of CData Sync functionality (e.g., Source, Destination).

Data Type Mapping

Data Type Mappings

The Cloud maps types from the data source to the corresponding data type available in the schema. The table below documents these mappings.

Google BigQuery CData Schema
STRING string
BYTES binary
INTEGER long
FLOAT double
NUMERIC decimal
BIGNUMERIC decimal
BOOLEAN bool
DATE date
TIME time
DATETIME datetime
TIMESTAMP datetime
STRUCT See below
ARRAY See below
GEOGRAPHY string
JSON string
INTERVAL string

Note that the NUMERIC type supports 38 digits of precision and the BIGNUMERIC type supports 76 digits of precision. Most platforms do not have a decimal type that supports the full precision of these values (.NET decimal supports 28 digits, and Java BigDecimal supports 38 by default). If this is the case, you can cast these columns to a string when queried, or the connection can be configured to ignore them by setting IgnoreTypes=decimal.
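
For example, a high-precision NUMERIC or BIGNUMERIC column can be read as text by casting it (the table and column here are hypothetical):

SELECT CAST(total_value AS VARCHAR) AS total_value FROM Northwind.Orders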

STRUCT and ARRAY Types

Google BigQuery supports two kinds of types for storing compound values in a single row, STRUCT and ARRAY. In some places within Google BigQuery these are also known as RECORD and REPEATED types.

A STRUCT is a fixed-size group of values that are accessed by name and can have different types. The Cloud flattens structs so their individual fields can be accessed using dotted names. Note that these dotted names must be quoted.

-- trade_value STRUCT<currency STRING, value FLOAT>
SELECT CONCAT([trade_value.value], ' ', NULLIF([trade_value.currency], 'USD'))
FROM trades

An ARRAY is a group of values with the same type that can have any size. The Cloud treats the array as a single compound value and reports it as a JSON aggregate.

These types may be combined such that a STRUCT type contains an ARRAY field, or an ARRAY field is a list of STRUCT values. The outer type takes precedence in how the field is processed:

/* Table contains fields: 
  stocks STRUCT<symbol STRING, prices ARRAY<FLOAT>>
  offers: ARRAY<STRUCT<currency STRING, value FLOAT>> 
*/

SELECT [stocks.symbol], /* ARRAY field can be read from STRUCT, but is converted to JSON */
       [stocks.prices], 
       [offers]         /* STRUCT fields in an ARRAY cannot be accessed */
FROM market

INTERVAL Types

The Cloud represents INTERVAL types as strings. Whenever a query requires an INTERVAL type, it must specify the INTERVAL using the BigQuery SQL INTERVAL format:

YEAR-MONTH DAY HOUR:MINUTE:SECOND.FRACTION

All queries that return INTERVAL values use this format unless they appear in an ARRAY aggregate, where the format depends upon how the Cloud reads the data.

For example, the value "5 years and 11 months, minus 10 days and 3 hours and 2.5 seconds" in the correct format is:

5-11 -10 -3:0:2.5
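
As a sketch, a query supplying an INTERVAL value would pass the string in this format (the table and column are hypothetical):

INSERT INTO events (name, duration) VALUES ('maintenance', '5-11 -10 -3:0:2.5')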

Type Parameters

The Cloud exposes parameters on the following types. In each case the type parameters are optional; Google BigQuery has default values for types that are not parameterized.

  • STRING(length)
  • BYTES(length)
  • NUMERIC(precision) or NUMERIC(precision, scale)
  • BIGNUMERIC(precision) or BIGNUMERIC(precision, scale)

These parameters are primarily for restricting the data written to the table. They are included in the table metadata as the column size for STRING and BYTES, and the numeric precision and scale for NUMERIC and BIGNUMERIC.

Type parameters have no effect on queries and are not reported within query metadata. For example, in the query below the output of CONCAT is a plain STRING even though its inputs a and b are both STRING(100).

SELECT CONCAT(a, b) FROM table_with_length_params

CData Cloud

Connection String Options

The connection string properties are the various options that can be used to establish a connection. This section provides a complete list of the options you can configure in the connection string for this provider. Click the links for further details.

For more information on establishing a connection, see Establishing a Connection.

Authentication


PropertyDescription
AuthSchemeThe type of authentication to use when connecting to Google BigQuery.
ProjectIdThe ProjectId used to resolve unqualified tables and execute jobs.
DatasetIdThe DatasetId used to resolve unqualified tables.
BillingProjectIdThe ProjectId of the billing project for executing jobs.

BigQuery


PropertyDescription
AllowLargeResultSetsWhether or not to allow large result sets to be stored in temporary tables.
DestinationTableThis property determines where query results are stored in Google BigQuery.
UseQueryCacheSpecifies whether to use Google BigQuery's built-in query cache.
PageSizeThe number of results to return per page from Google BigQuery.
PollingIntervalThis determines how long to wait, in seconds, between checks to see if a job has completed.
AllowUpdatesWithoutKeyWhether or not to allow updates without primary keys.
FilterColumnsSet AllowUpdatesWithoutKey to true before using this property.
UseLegacySQLSpecifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard SQL will be used.
PrivateEndpointNameWhen connecting over Private Access, this property specifies the name of the custom endpoint.

Storage API


PropertyDescription
UseStorageAPISpecifies whether to use BigQuery's Storage API for bulk data reads.
UseArrowFormatSpecifies whether to use the Arrow format with BigQuery's Storage API.
StorageThresholdThe minimum number of rows a query must return to invoke the Storage API.
StoragePageSizeSpecifies the page size to use for Storage API queries.

Uploading


PropertyDescription
InsertModeSpecifies what kind of method to use when inserting data. By default streaming INSERTs are used.
WaitForBatchResultsWhether to wait for the job to complete when using the bulk upload API. Only active when InsertMode is set to Upload.
TempTableDatasetThe prefix of the dataset that will contain temporary tables when performing bulk UPDATE or DELETE operations.

OAuth


PropertyDescription
OAuthClientIdSpecifies the client Id that was assigned when the custom OAuth application was created. (Also known as the consumer key.) This ID registers the custom application with the OAuth authorization server.
OAuthClientSecretSpecifies the client secret that was assigned when the custom OAuth application was created. (Also known as the consumer secret.) This secret registers the custom application with the OAuth authorization server.
DelegatedServiceAccountsA space-delimited list of service account emails for delegated requests.
RequestingServiceAccountA service account email to make a delegated request.

JWT OAuth


PropertyDescription
OAuthJWTCertThe JWT Certificate store.
OAuthJWTCertTypeThe type of key store containing the JWT Certificate.
OAuthJWTCertPasswordThe password for the OAuth JWT certificate used to access a certificate store that requires a password. If the certificate store does not require a password, leave this property blank.
OAuthJWTCertSubjectThe subject of the OAuth JWT certificate used to locate a matching certificate in the store. Supports partial matches and the wildcard '*' to select the first certificate.
OAuthJWTIssuerThe issuer of the JSON Web Token.
OAuthJWTSubjectThe user subject for which the application is requesting delegated access.

SSL


PropertyDescription
SSLServerCertSpecifies the certificate to be accepted from the server when connecting using TLS/SSL.

Logging


PropertyDescription
VerbositySpecifies the verbosity level of the log file, which controls the amount of detail logged. Supported values range from 1 to 5.

Schema


PropertyDescription
BrowsableSchemasOptional setting that restricts the schemas reported to a subset of all available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
RefreshViewSchemasAllows the provider to determine up-to-date view schemas automatically.
ShowTableDescriptionsControls whether table descriptions are returned via the platform metadata APIs and sys_tables / sys_views.
PrimaryKeyIdentifiersSet this property to define primary keys.
AllowedTableTypesSpecifies what kinds of tables will be visible.
FlattenObjectsDetermines whether the provider flattens STRUCT fields into top-level columns.

Miscellaneous


PropertyDescription
StorageTimeoutHow long a Storage API connection may remain active before the provider reconnects.
EmptyArraysAsNullWhether empty arrays are represented as 'null' or as '[]'.
HidePartitionColumnsWhether partition tables will show the columns _PARTITIONDATE and _PARTITIONTIME.
AllowAggregateParametersAllows raw aggregates to be used in parameters when QueryPassthrough is enabled.
ApplicationNameAn application name in the form application/version. For example, AcmeReporting/1.0.
AuditLimitThe maximum number of rows which will be stored within an audit table.
AuditModeWhat provider actions should be recorded to audit tables.
AWSWorkloadIdentityConfigConfiguration properties to provide when using Workload Identity Federation via AWS.
BigQueryOptionsA comma separated list of Google BigQuery options.
MaximumBillingTierThe MaximumBillingTier is a positive integer that serves as a multiplier of the basic price per TB. For example, if you set MaximumBillingTier to 2, the maximum cost for that query will be 2x basic price per TB.
MaximumBytesBilledLimits how many bytes BigQuery will allow a job to consume before it is cancelled.
MaxRowsSpecifies the maximum rows returned for queries without aggregation or GROUP BY.
PseudoColumnsSpecifies the pseudocolumns to expose as table columns. Use the format 'TableName=ColumnName;TableName=ColumnName'. The default is an empty string, which disables this property.
SupportCaseSensitiveTablesBy default, the provider treats table names as case-insensitive, so if multiple tables have the same name but different casing, only one will be reported in the metadata.
TableSamplePercentThis determines what percent of a table is sampled with the TABLESAMPLE operator.
TimeoutThe value in seconds until the timeout error is thrown, canceling the operation.
WorkloadPoolIdThe ID of your Workload Identity Federation pool.
WorkloadProjectIdThe ID of the Google Cloud project that hosts your Workload Identity Federation pool.
WorkloadProviderIdThe ID of your Workload Identity Federation pool provider.
CData Cloud

Authentication

This section provides a complete list of the Authentication properties you can configure in the connection string for this provider.


PropertyDescription
AuthSchemeThe type of authentication to use when connecting to Google BigQuery.
ProjectIdThe ProjectId used to resolve unqualified tables and execute jobs.
DatasetIdThe DatasetId used to resolve unqualified tables.
BillingProjectIdThe ProjectId of the billing project for executing jobs.
CData Cloud

AuthScheme

The type of authentication to use when connecting to Google BigQuery.

Possible Values

OAuth, OAuthJWT, AWSWorkloadIdentity

Data Type

string

Default Value

"OAuth"

Remarks

  • OAuth: Set this to perform OAuth authentication using a standard user account.
  • OAuthJWT: Set this to perform OAuth authentication using an OAuth service account.
  • GCPInstanceAccount: Set this to obtain an access token from a Google Cloud Platform instance.
  • AWSWorkloadIdentity: Set this to authenticate using Workload Identity Federation. The Cloud authenticates to AWS according to the AWSWorkloadIdentityConfig and provides the Google Security Token Service with an authentication token. The Google STS validates this token and produces an OAuth token that can access Google services.

CData Cloud

ProjectId

The ProjectId used to resolve unqualified tables and execute jobs.

Data Type

string

Default Value

""

Remarks

This property and BillingProjectId are used to determine billing for jobs and resolve unqualified table names.

Job Execution

The Cloud must create a job within Google BigQuery to execute certain kinds of queries. For example, complex SELECT statements, UPDATE and DELETE statements, and INSERT statements (when InsertMode is DML) are all executed using jobs. The project where a job executes determines how the job is billed.

The Cloud determines the billing project using these rules. Note that only the first two rules apply when QueryPassthrough is enabled. Either this property or BillingProjectId must be set to execute passthrough queries.

  1. The BillingProjectId is used if that property is not empty.
  2. Then this property is used.
  3. If both properties are empty, the project is determined from the catalog of the first table in the query. The job created for the following query executes in the psychic-valve-137816 project.

SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`

Table Resolution

In addition to setting the billing project, the Cloud also uses this property to determine the default data project. The data project is used to resolve tables included in queries when they are not fully qualified:

/* Unqualified, resolved against connection properties */
SELECT FirstName, LastName FROM `Northwind`.`customers`

/* Qualified, project specified as catalog */
SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`

Any unqualified table references in the query are resolved using the following rules. Note that only methods 1 and 2 are supported when QueryPassthrough is enabled. This means that any tables outside the default data project must be explicitly qualified.

  1. This property is used if it is not empty.
  2. Then the BillingProjectId property is used.
  3. If both properties are empty, the catalog from the first table in the query is used. In the following query the `Northwind`.`orders` table is treated as if it comes from the psychic-valve-137816 project.

SELECT ... FROM `psychic-valve-137816`.`Northwind`.`customers`
INNER JOIN `Northwind`.`orders`
ON ...

CData Cloud

DatasetId

The DatasetId used to resolve unqualified tables.

Data Type

string

Default Value

""

Remarks

When a query refers to a table it can leave the dataset implicit, or qualify the dataset directly as the schema portion of the table:

/* Implicit, resolved against connection string */
SELECT FirstName, LastName FROM `customers`

/* Explicit, dataset specified as schema */
SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`

Any unqualified table references in the query are resolved using the following rules. Note that only method 1 is supported when QueryPassthrough is enabled. This means that passthrough queries must set this property or qualify all tables.

  1. If this property is set then the specified dataset is used.
  2. Otherwise the schema from the first table in the query is used. In the following query the `orders` table is treated as if it comes from the Northwind dataset.

SELECT ... FROM `psychic-valve-137816`.`Northwind`.`customers`
INNER JOIN `orders`
ON ...

CData Cloud

BillingProjectId

The ProjectId of the billing project for executing jobs.

Data Type

string

Default Value

""

Remarks

This property is used with ProjectId to determine the project the Cloud executes jobs under. Please refer to that page for more information.

CData Cloud

BigQuery

This section provides a complete list of the BigQuery properties you can configure in the connection string for this provider.


PropertyDescription
AllowLargeResultSetsWhether or not to allow large result sets to be stored in temporary tables.
DestinationTableThis property determines where query results are stored in Google BigQuery.
UseQueryCacheSpecifies whether to use Google BigQuery's built-in query cache.
PageSizeThe number of results to return per page from Google BigQuery.
PollingIntervalThis determines how long to wait, in seconds, between checks to see if a job has completed.
AllowUpdatesWithoutKeyWhether or not to allow updates without primary keys.
FilterColumnsSet AllowUpdatesWithoutKey to true before using this property.
UseLegacySQLSpecifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard SQL will be used.
PrivateEndpointNameWhen connecting over Private Access, this property specifies the name of the custom endpoint.
CData Cloud

AllowLargeResultSets

Whether or not to allow large result sets to be stored in temporary tables.

Data Type

bool

Default Value

false

Remarks

Whether or not to allow large result sets to be stored in temporary tables.

CData Cloud

DestinationTable

This property determines where query results are stored in Google BigQuery.

Data Type

string

Default Value

""

Remarks

Google BigQuery queries have a maximum amount of data they are allowed to return directly. If this limit is exceeded, then queries will fail with an error message like Response too large to return. When this option is set, the response limit does not apply, because all query results are stored in a Google BigQuery table before being returned.

This option is set differently depending upon whether your connection is using UseLegacySQL or not. By default this option is set using the standard SQL syntax:

DestinationTable=project-name.dataset-name.table-name

When UseLegacySQL is enabled, this option is set using the legacy table syntax:

DestinationTable=project-name:dataset-name.table-name

When using this option with multiple connections, make sure that each connection has its own destination table. Sharing a table between connections can lead to results getting lost, because parallel queries can overwrite each other's results.

CData Cloud

UseQueryCache

Specifies whether to use Google BigQuery's built-in query cache.

Data Type

bool

Default Value

true

Remarks

Google BigQuery will cache the results of recent queries, and will use this cache for queries by default. Google BigQuery automatically updates the cache when a table is modified, so performance is generally better without any risk of queries returning stale data.

If this is set to false, the query is always run against the table directly.

CData Cloud

PageSize

The number of results to return per page from Google BigQuery.

Data Type

string

Default Value

"100000"

Remarks

This property controls the number of results returned per page from Google BigQuery. Setting a higher page size causes more data to come back in a single HTTP request, but each request may take longer to execute. Setting a smaller page size increases the number of HTTP requests needed to retrieve all the data, but is generally recommended to ensure that timeout exceptions do not occur.

Note that this option does not have an effect if UseStorageApi is enabled and the queries being executed can be executed on the Storage API. See StoragePageSize for more information.

CData Cloud

PollingInterval

This determines how long to wait, in seconds, between checks to see if a job has completed.

Data Type

string

Default Value

"1"

Remarks

This property only applies to queries whose results are stored to a table instead of streamed directly to the Cloud. That happens in only three cases:

  • DestinationTable is set.
  • AllowLargeResultSets is true and the query takes longer than Timeout seconds.
  • UseStorageApi is enabled and the query is complex.

This property determines how long to wait between checks of whether or not the query's results are ready. Very large resultsets or complex queries may take longer to process, and a low polling interval may result in many unnecessary requests being made to check the query status.

CData Cloud

AllowUpdatesWithoutKey

Whether or not to allow updates without primary keys.

Data Type

bool

Default Value

false

Remarks

Whether or not to allow updates without primary keys.

CData Cloud

FilterColumns

Set AllowUpdatesWithoutKey to true before using this property.

Data Type

string

Default Value

""

Remarks

Remember to set AllowUpdatesWithoutKey to true before using this property.

Set the property like this:

`filterColumns=col1[,col2[,col3]];`
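
For example, a minimal connection string sketch that enables the prerequisite property and lists two hypothetical filter columns:

AllowUpdatesWithoutKey=true;FilterColumns=region,customer_id;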

CData Cloud

UseLegacySQL

Specifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard SQL will be used.

Data Type

bool

Default Value

false

Remarks

If set to true, the query will use BigQuery's Legacy SQL dialect to rebuild the query. If set to false, the query will use BigQuery's standard SQL: https://cloud.google.com/bigquery/sql-reference/.
When UseLegacySQL is set to false, the value of AllowLargeResultSets is ignored. The query will be run as if AllowLargeResultSets were true.

CData Cloud

PrivateEndpointName

When connecting over Private Access, this property specifies the name of the custom endpoint.

Data Type

string

Default Value

""

Remarks

For example, if this property is set to 'xyz', then the endpoints accessed by the driver will be changed to 'bigquery-xyz.p.googleapis.com', 'bigquerystorage-xyz.p.googleapis.com', 'storage-xyz.p.googleapis.com', etc.

Note that if this property is used, then all endpoints must be part of the private endpoint map. For instance, you cannot use the private access endpoint for one endpoint (as in 'bigquery-xyz.p.googleapis.com') but leave the default endpoint for another (as in 'bigquerystorage.googleapis.com').

CData Cloud

Storage API

This section provides a complete list of the Storage API properties you can configure in the connection string for this provider.


PropertyDescription
UseStorageAPISpecifies whether to use BigQuery's Storage API for bulk data reads.
UseArrowFormatSpecifies whether to use the Arrow format with BigQuery's Storage API.
StorageThresholdThe minimum number of rows a query must return to invoke the Storage API.
StoragePageSizeSpecifies the page size to use for Storage API queries.
CData Cloud

UseStorageAPI

Specifies whether to use BigQuery's Storage API for bulk data reads.

Data Type

bool

Default Value

true

Remarks

By default the Cloud will use the Storage API instead of the default REST API. Depending upon the complexity of the query, the Cloud may execute the query in one of two ways:

  • Simple queries that read all columns from only one table, and have no extra clauses except LIMIT, are executed directly within the Storage API.
  • All other queries are executed as a query job which writes to a temporary table. Once the query is complete, the results are read from the temporary table using the Storage API.
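
As a sketch (with hypothetical table names), the first query below can be read directly through the Storage API, while the second runs as a query job whose results are then read from a temporary table:

-- Simple scan: all columns, one table, no clauses other than LIMIT
SELECT * FROM `proj`.`ds`.`events` LIMIT 1000

-- Aggregation: executed as a query job, then read via the Storage API
SELECT status, COUNT(*) FROM `proj`.`ds`.`events` GROUP BY status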

The BigQuery Storage API can read data faster and more efficiently than the REST API (accessible by setting this option to false), but is priced differently and requires extra OAuth permissions when using your own OAuth app. It also uses the separate StoragePageSize property instead of PageSize.

The BigQuery REST API requires no extra permissions and uses standard pricing, but is slower than the Storage API.

CData Cloud

UseArrowFormat

Specifies whether to use the Arrow format with BigQuery's Storage API.

Data Type

bool

Default Value

false

Remarks

This property only has an effect when UseStorageApi is enabled. When performing reads against the Storage API, the Cloud can request data in different formats. By default it uses Avro but enabling this option makes it use Arrow.

This option should be enabled when working with time series data or other datasets that have many date, time, datetime, or timestamp fields. For these datasets, using Arrow can yield noticeable improvements over Avro. Otherwise, Avro and Arrow read times are very close, and switching between them is unlikely to make a significant difference.

CData Cloud

StorageThreshold

The minimum number of rows a query must return to invoke the Storage API.

Data Type

string

Default Value

"100000"

Remarks

When the Cloud receives a query too complex to be run directly in the Storage API, it creates a query job and uses the Storage API to read from the query results table. If the query job returns fewer than the number of rows provided in this option, then the results are returned directly and the Storage API is not used.

This value should be set between 1 and 100000. Higher values will use the Storage API only for large resultsets, but will be delayed by reading more results from the query job. Lower values will result in smaller delays but will use the Storage API for more queries.

Note that this option only has an effect if UseStorageApi is enabled and the queries being executed cannot be executed directly on the Storage API. Queries which run directly on Storage never create query jobs.

CData Cloud

StoragePageSize

Specifies the page size to use for Storage API queries.

Data Type

string

Default Value

"10000"

Remarks

When UseStorageApi is enabled and the query being executed can be run on the Storage API, this option controls how many rows the Cloud is allowed to buffer on the client.

A higher value will generally make queries faster at the expense of consuming more memory, while lower values will conserve memory but make queries slower.

CData Cloud

Uploading

This section provides a complete list of the Uploading properties you can configure in the connection string for this provider.


PropertyDescription
InsertModeSpecifies what kind of method to use when inserting data. By default streaming INSERTs are used.
WaitForBatchResultsWhether to wait for the job to complete when using the bulk upload API. Only active when InsertMode is set to Upload.
TempTableDatasetThe prefix of the dataset that will contain temporary tables when performing bulk UPDATE or DELETE operations.
CData Cloud

InsertMode

Specifies what kind of method to use when inserting data. By default streaming INSERTs are used.

Possible Values

Streaming, DML, Upload

Data Type

string

Default Value

"Streaming"

Remarks

This section provides only a summary of the mechanisms that each of these modes use. Please see Advanced Integrations for more details on how to use each of these modes.

  • Streaming uses the Google BigQuery streaming API (also called insertAll).
  • DML uses the Google BigQuery query API to generate INSERT SQL statements which insert individual rows.
  • Upload uses the Google BigQuery upload API to create a load job which copies the rows from temporary server-side storage.
  • GCSStaging is similar to the Upload mode except that it uses your Google Cloud Storage account instead of public storage.

When UseLegacySQL is true only Streaming and Upload modes are allowed. The Legacy SQL dialect does not support DML statements.
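
For example, a connection string fragment (a sketch) that selects the bulk upload API while keeping the default behavior of waiting for job results:

InsertMode=Upload;WaitForBatchResults=true;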

CData Cloud

WaitForBatchResults

Whether to wait for the job to complete when using the bulk upload API. Only active when InsertMode is set to Upload.

Data Type

bool

Default Value

true

Remarks

This property determines whether the Cloud will wait for batch jobs to report their status. By default this property is true, and INSERT queries complete only once Google BigQuery has finished executing them. When this property is false, the INSERT query completes as soon as a job is submitted for it.

The default mode is recommended for reliability:

  1. INSERTs will never fail silently. If the Cloud does not wait for the job to finish, it will never receive an error if the job failed to execute.
  2. If the INSERT batch size is small enough, the Cloud may submit jobs quickly enough that it hits Google BigQuery's load job limits. This does not happen when waiting for batch results because the Cloud will not allow more than one job to execute at the same time on the same connection.

You can disable this option to achieve lower delays when inserting, but you must also make sure to obey the Google BigQuery rate limits and check each job's status to determine whether it succeeded or failed.

CData Cloud

TempTableDataset

The prefix of the dataset that will contain temporary tables when performing bulk UPDATE or DELETE operations.

Data Type

string

Default Value

"_CDataTempTableDataset"

Remarks

Internally bulk UPDATE and DELETE use Google BigQuery MERGE queries, which require creating a table to hold all the update operations. This option is used along with the target table's region to determine the name of the dataset where these temporary tables are created. Each region must have its own temporary dataset so that the temporary table and the MERGE table can be stored in the same project/dataset. This avoids unnecessary data transfer charges.

For example, the Cloud would create a dataset called "_CDataTempTableDataset_US" for tables in the US region and a dataset called "_CDataTempTableDataset_asia_southeast_1" for tables in the Singapore region.
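
For example, with a custom prefix (the value is illustrative):

TempTableDataset=StagingTemp

Following the naming convention above, temporary tables for US-region tables would then be created in a dataset called StagingTemp_US.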

CData Cloud

OAuth

This section provides a complete list of the OAuth properties you can configure in the connection string for this provider.


PropertyDescription
OAuthClientIdSpecifies the client Id that was assigned when the custom OAuth application was created. (Also known as the consumer key.) This ID registers the custom application with the OAuth authorization server.
OAuthClientSecretSpecifies the client secret that was assigned when the custom OAuth application was created. (Also known as the consumer secret.) This secret registers the custom application with the OAuth authorization server.
DelegatedServiceAccountsA space-delimited list of service account emails for delegated requests.
RequestingServiceAccountA service account email to make a delegated request.
CData Cloud

OAuthClientId

Specifies the client Id that was assigned when the custom OAuth application was created. (Also known as the consumer key.) This ID registers the custom application with the OAuth authorization server.

Data Type

string

Default Value

""

Remarks

OAuthClientId is one of a handful of connection parameters that need to be set before users can authenticate via OAuth. For details, see Establishing a Connection.

CData Cloud

OAuthClientSecret

Specifies the client secret that was assigned when the custom OAuth application was created. (Also known as the consumer secret.) This secret registers the custom application with the OAuth authorization server.

Data Type

string

Default Value

""

Remarks

OAuthClientSecret is one of a handful of connection parameters that need to be set before users can authenticate via OAuth. For details, see Establishing a Connection.

CData Cloud

DelegatedServiceAccounts

A space-delimited list of service account emails for delegated requests.

Data Type

string

Default Value

""

Remarks

The service account emails must be specified in a space-delimited list.

Each service account must be granted the roles/iam.serviceAccountTokenCreator role on the next service account in the chain.

The last service account in the chain must be granted the roles/iam.serviceAccountTokenCreator role on the requesting service account. The requesting service account is the one specified in the RequestingServiceAccount property.

Note that for delegated requests, the requesting service account must have the permission iam.serviceAccounts.getAccessToken, which can also be granted through the serviceAccountTokenCreator role.

CData Cloud

RequestingServiceAccount

A service account email to make a delegated request.

Data Type

string

Default Value

""

Remarks

The service account email of the account for which the credentials are requested in a delegated request. With the list of delegated service accounts in DelegatedServiceAccounts, this property is used to make a delegated request.

You must have the IAM permission iam.serviceAccounts.getAccessToken on this service account.

CData Cloud

JWT OAuth

This section provides a complete list of the JWT OAuth properties you can configure in the connection string for this provider.


PropertyDescription
OAuthJWTCertThe JWT Certificate store.
OAuthJWTCertTypeThe type of key store containing the JWT Certificate.
OAuthJWTCertPasswordThe password for the OAuth JWT certificate used to access a certificate store that requires a password. If the certificate store does not require a password, leave this property blank.
OAuthJWTCertSubjectThe subject of the OAuth JWT certificate used to locate a matching certificate in the store. Supports partial matches and the wildcard '*' to select the first certificate.
OAuthJWTIssuerThe issuer of the JSON Web Token.
OAuthJWTSubjectThe user subject for which the application is requesting delegated access.
CData Cloud

OAuthJWTCert

The JWT Certificate store.

Data Type

string

Default Value

""

Remarks

The name of the certificate store for the client certificate.

The OAuthJWTCertType field specifies the type of the certificate store specified by OAuthJWTCert. If the store is password protected, specify the password in OAuthJWTCertPassword.

OAuthJWTCert is used in conjunction with the OAuthJWTCertSubject field in order to specify client certificates. If OAuthJWTCert has a value, and OAuthJWTCertSubject is set, a search for a certificate is initiated. Please refer to the OAuthJWTCertSubject field for details.

Designations of certificate stores are platform-dependent.

The following are designations of the most common User and Machine certificate stores in Windows:

MY - A certificate store holding personal certificates with their associated private keys.
CA - Certifying authority certificates.
ROOT - Root certificates.
SPC - Software publisher certificates.

In Java, the certificate store normally is a file containing certificates and optional private keys.

When the certificate store type is PFXFile, this property must be set to the name of the file. When the type is PFXBlob, the property must be set to the binary contents of a PFX file (i.e. PKCS12 certificate store).

CData Cloud

OAuthJWTCertType

The type of key store containing the JWT Certificate.

Possible Values

PFXBLOB, JKSBLOB, PEMKEY_BLOB, PUBLIC_KEY_BLOB, SSHPUBLIC_KEY_BLOB, XMLBLOB, BCFKSBLOB, GOOGLEJSONBLOB

Data Type

string

Default Value

"GOOGLEJSONBLOB"

Remarks

This property can take one of the following values:

USER - For Windows, this specifies that the certificate store is a certificate store owned by the current user. Note: This store type is not available in Java.
MACHINE - For Windows, this specifies that the certificate store is a machine store. Note: This store type is not available in Java.
PFXFILE - The certificate store is the name of a PFX (PKCS12) file containing certificates.
PFXBLOB - The certificate store is a string (base-64-encoded) representing a certificate store in PFX (PKCS12) format.
JKSFILE - The certificate store is the name of a Java key store (JKS) file containing certificates. Note: This store type is only available in Java.
JKSBLOB - The certificate store is a string (base-64-encoded) representing a certificate store in Java key store (JKS) format. Note: This store type is only available in Java.
PEMKEY_FILE - The certificate store is the name of a PEM-encoded file that contains a private key and an optional certificate.
PEMKEY_BLOB - The certificate store is a string (base-64-encoded) that contains a private key and an optional certificate.
PUBLIC_KEY_FILE - The certificate store is the name of a file that contains a PEM- or DER-encoded public key certificate.
PUBLIC_KEY_BLOB - The certificate store is a string (base-64-encoded) that contains a PEM- or DER-encoded public key certificate.
SSHPUBLIC_KEY_FILE - The certificate store is the name of a file that contains an SSH-style public key.
SSHPUBLIC_KEY_BLOB - The certificate store is a string (base-64-encoded) that contains an SSH-style public key.
P7BFILE - The certificate store is the name of a PKCS7 file containing certificates.
PPKFILE - The certificate store is the name of a file that contains a PPK (PuTTY Private Key).
XMLFILE - The certificate store is the name of a file that contains a certificate in XML format.
XMLBLOB - The certificate store is a string that contains a certificate in XML format.
BCFKSFILE - The certificate store is the name of a file that contains a Bouncy Castle keystore.
BCFKSBLOB - The certificate store is a string (base-64-encoded) that contains a Bouncy Castle keystore.
GOOGLEJSON - The certificate store is the name of a JSON file containing the service account information. Only valid when connecting to a Google service.
GOOGLEJSONBLOB - The certificate store is a string that contains the service account JSON. Only valid when connecting to a Google service.

CData Cloud

OAuthJWTCertPassword

The password for the OAuth JWT certificate used to access a certificate store that requires a password. If the certificate store does not require a password, leave this property blank.

Data Type

string

Default Value

""

Remarks

This property specifies the password needed to open the certificate store, but only if the store type requires one. To determine if a password is necessary, refer to the documentation or configuration for your specific certificate store.

This is not required when using the GOOGLEJSON OAuthJWTCertType. Google JSON keys are not encrypted.

CData Cloud

OAuthJWTCertSubject

The subject of the OAuth JWT certificate used to locate a matching certificate in the store. Supports partial matches and the wildcard '*' to select the first certificate.

Data Type

string

Default Value

"*"

Remarks

The value of this property is used to locate a matching certificate in the store. The search process works as follows:

  • If an exact match for the subject is found, the corresponding certificate is selected.
  • If no exact match is found, the store is searched for certificates whose subjects contain the property value.
  • If no match is found, no certificate is selected.

You can set the value to '*' to automatically select the first certificate in the store. The certificate subject is a comma-separated list of distinguished name fields and values. For example: CN=www.server.com, OU=test, C=US, [email protected]. Common fields include:

Field - Meaning
CN - Common Name. This is commonly a host name like www.server.com.
O - Organization
OU - Organizational Unit
L - Locality
S - State
C - Country
E - Email Address

If a field value contains a comma, enclose it in quotes. For example: "O=ACME, Inc.".

CData Cloud

OAuthJWTIssuer

The issuer of the JSON Web Token.

Data Type

string

Default Value

""

Remarks

The issuer of the JSON Web Token. Enter the value of the service account email address.

This is not required when using the GOOGLEJSON OAuthJWTCertType. Google JSON keys contain a copy of the issuer account.

CData Cloud

OAuthJWTSubject

The user subject for which the application is requesting delegated access.

Data Type

string

Default Value

""

Remarks

The user subject for which the application is requesting delegated access. Enter the email address of the user for which the application is requesting delegated access.

CData Cloud

SSL

This section provides a complete list of the SSL properties you can configure in the connection string for this provider.


PropertyDescription
SSLServerCertSpecifies the certificate to be accepted from the server when connecting using TLS/SSL.
CData Cloud

SSLServerCert

Specifies the certificate to be accepted from the server when connecting using TLS/SSL.

Data Type

string

Default Value

""

Remarks

If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate to be accepted from the server. Any other certificate that is not trusted by the machine is rejected.

This property can take the following forms:

Description Example
A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv......Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate C:\cert.cer
The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 Thumbprint (hex values can also be either space or colon separated) ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 Thumbprint (hex values can also be either space or colon separated) 34a929226ae0819f2ec14b4a3d904f801cbb150d

If not specified, any certificate trusted by the machine is accepted.

Use '*' to accept all certificates. Note that this is not recommended due to security concerns.

CData Cloud

Logging

This section provides a complete list of the Logging properties you can configure in the connection string for this provider.


PropertyDescription
VerbositySpecifies the verbosity level of the log file, which controls the amount of detail logged. Supported values range from 1 to 5.
CData Cloud

Verbosity

Specifies the verbosity level of the log file, which controls the amount of detail logged. Supported values range from 1 to 5.

Data Type

string

Default Value

"1"

Remarks

This property defines the level of detail the Cloud includes in the log file. Higher verbosity levels increase the detail of the logged information, but may also result in larger log files and slower performance due to the additional data being captured.

The default verbosity level is 1, which is recommended for regular operation. Higher verbosity levels are primarily intended for debugging purposes. For more information on each level, refer to Logging.

When combined with the LogModules property, Verbosity can refine logging to specific categories of information.

CData Cloud

Schema

This section provides a complete list of the Schema properties you can configure in the connection string for this provider.


PropertyDescription
BrowsableSchemasOptional setting that restricts the schemas reported to a subset of all available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.
RefreshViewSchemasAllows the provider to determine up-to-date view schemas automatically.
ShowTableDescriptionsControls whether table descriptions are returned via the platform metadata APIs and sys_tables / sys_views.
PrimaryKeyIdentifiersSet this property to define primary keys.
AllowedTableTypesSpecifies what kinds of tables will be visible.
FlattenObjectsDetermines whether the provider flattens STRUCT fields into top-level columns.
CData Cloud

BrowsableSchemas

Optional setting that restricts the schemas reported to a subset of all available schemas. For example, BrowsableSchemas=SchemaA,SchemaB,SchemaC.

Data Type

string

Default Value

""

Remarks

Listing all available database schemas can take extra time, thus degrading performance. Providing a list of schemas in the connection string saves time and improves performance.

CData Cloud

RefreshViewSchemas

Allows the provider to determine up-to-date view schemas automatically.

Data Type

bool

Default Value

true

Remarks

When using BigQuery views, BigQuery stores a copy of the view schema with the view itself. However, these stored view schemas are not updated when the tables used by the view change. This means that the stored view schema can easily become out of date and cause queries using the view to fail.

By default, the Cloud will not use the stored view schema and will instead query the view to determine the available columns. This guarantees that the schema will be up to date although it requires the Cloud to start a query job.

You can disable this option to force the Cloud to use the stored view schemas. This prevents the Cloud from running any queries when getting a view schema, but also means that queries using the view will fail if the schema is out of date.

CData Cloud

ShowTableDescriptions

Controls whether table descriptions are returned via the platform metadata APIs and sys_tables / sys_views.

Data Type

bool

Default Value

false

Remarks

By default table descriptions are not shown, since the Google BigQuery API requires an extra request beyond what is usually required for reading tables.

Enabling this option will show table descriptions, but will cost an extra API request for every table when a table list is fetched. This can slow down metadata operations on large datasets.

CData Cloud

PrimaryKeyIdentifiers

Set this property to define primary keys.

Data Type

string

Default Value

""

Remarks

Google BigQuery does not natively support primary keys, but for certain DML operations or database tools you may need to define them. By default this option is disabled and no tables will have primary keys except for the ones defined in schema files (if you set Location).

Primary keys are defined using a list of rules which match tables and provide a list of key columns. For example, PrimaryKeyIdentifiers="*=key;transactions=tx_date,tx_serial;user_comments=" has three rules separated by semicolons:

  1. The first rule *=key means that every table without a more specific rule will have one primary key column called key. Tables that do not have a key column will not have any primary keys.
  2. The second rule transactions=tx_date,tx_serial means that the transactions table will have the two primary key columns tx_date and tx_serial. If any of those columns are missing from the table then they will be ignored.
  3. The third rule user_comments= means that the user_comments table will have no primary keys. The only use that empty key lists have is in overriding the default rule. If there is no default rule present then the only tables with primary keys would be the ones explicitly listed.

Note that the table names can include just the table, the table and dataset or the table, dataset and project. Both column and table names may be quoted using SQL quotes:

/* Rules with just table names use the connection ProjectId (or DataProjectId) and DatasetId. 
   All these rules refer to the same table with a connection where ProjectId=someProject;DatasetId=someDataset */
someTable=a,b,c
someDataset.someTable=a,b,c
someProject.someDataset.someTable=a,b,c

/* Any table or column name may be quoted */
`someProject`."someDataset".[someTable]=`a`,[b],"c"

CData Cloud

AllowedTableTypes

Specifies what kinds of tables will be visible.

Data Type

string

Default Value

"TABLE,EXTERNAL,VIEW,MATERIALIZED_VIEW"

Remarks

This option is a comma-separated list of the table type values that the Cloud displays. Any table-like or view-like entity that doesn't have a matching type will not be reported when listing tables.

  • TABLE Standard tables
  • EXTERNAL Read-only table stored on another service (like GCS or Drive)
  • SNAPSHOT A read-only table that preserves the data of another table at a specific point in time
  • VIEW Standard views
  • MATERIALIZED_VIEW A view that is recalculated and cached each time its base table changes

For example, to restrict the Cloud to listing only simple tables and views, this option would be set to TABLE,VIEW.

CData Cloud

FlattenObjects

Determines whether the provider flattens STRUCT fields into top-level columns.

Data Type

bool

Default Value

true

Remarks

By default the Cloud reports each field in a STRUCT column as its own column while the STRUCT column itself is hidden. This process is recursively applied to nested STRUCT values. For example, if the following table is defined in Google BigQuery then the Cloud reports 3 columns: location.coords.lat, location.coords.lon and location.country:

CREATE TABLE t(location STRUCT<coords STRUCT<lat FLOAT64, lon FLOAT64>, country STRING>);

If this property is disabled, then the top-level STRUCT is not expanded and is left as its own column. The value of this column is reported as a JSON aggregate. In the above example, the Cloud reports only the location column when flattening is disabled.
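
For example, with flattening enabled, the nested fields from the table above can be queried directly using quoted dotted names:

SELECT [location.coords.lat], [location.coords.lon], [location.country] FROM t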

CData Cloud

Miscellaneous

This section provides a complete list of the Miscellaneous properties you can configure in the connection string for this provider.


PropertyDescription
StorageTimeoutHow long a Storage API connection may remain active before the provider reconnects.
EmptyArraysAsNullWhether empty arrays are represented as 'null' or as '[]'.
HidePartitionColumnsWhether partition tables will show the columns _PARTITIONDATE and _PARTITIONTIME.
AllowAggregateParametersAllows raw aggregates to be used in parameters when QueryPassthrough is enabled.
ApplicationNameAn application name in the form application/version. For example, AcmeReporting/1.0.
AuditLimitThe maximum number of rows which will be stored within an audit table.
AuditModeWhat provider actions should be recorded to audit tables.
AWSWorkloadIdentityConfigConfiguration properties to provide when using Workload Identity Federation via AWS.
BigQueryOptionsA comma separated list of Google BigQuery options.
MaximumBillingTierThe MaximumBillingTier is a positive integer that serves as a multiplier of the basic price per TB. For example, if you set MaximumBillingTier to 2, the maximum cost for that query will be 2x basic price per TB.
MaximumBytesBilledLimits how many bytes BigQuery will allow a job to consume before it is cancelled.
MaxRowsSpecifies the maximum rows returned for queries without aggregation or GROUP BY.
PseudoColumnsSpecifies the pseudocolumns to expose as table columns. Use the format 'TableName=ColumnName;TableName=ColumnName'. The default is an empty string, which disables this property.
SupportCaseSensitiveTablesBy default, the provider treats table names as case-insensitive, so if multiple tables have the same name but different casing, only one will be reported in the metadata.
TableSamplePercentThis determines what percent of a table is sampled with the TABLESAMPLE operator.
TimeoutThe value in seconds until the timeout error is thrown, canceling the operation.
WorkloadPoolIdThe ID of your Workload Identity Federation pool.
WorkloadProjectIdThe ID of the Google Cloud project that hosts your Workload Identity Federation pool.
WorkloadProviderIdThe ID of your Workload Identity Federation pool provider.
CData Cloud

StorageTimeout

How long a Storage API connection may remain active before the provider reconnects.

Data Type

string

Default Value

"300"

Remarks

Google BigQuery and many proxies/firewalls restrict the amount of time that idle connections stay alive before they are forcibly closed. This can be a problem when using the Storage API, because the Cloud may stream data faster than it can be consumed. While the consumer is catching up, the Cloud does not use its connection, and it may have been closed by the time the Cloud next uses it.

To avoid this, the Cloud automatically closes and reopens the connection if it has been active for too long. This property controls how many seconds the connection may be active before the Cloud resets it. To disable these resets, set this property to 0 or a negative value.

CData Cloud

EmptyArraysAsNull

Whether empty arrays are represented as 'null' or as '[]'.

Data Type

bool

Default Value

true

Remarks

This property is enabled by default, so empty arrays are represented as 'null' for consistency with how empty aggregates are represented. To mimic the native driver and represent empty arrays as '[]', this property can be disabled.

CData Cloud

HidePartitionColumns

Whether partition tables will show the columns _PARTITIONDATE and _PARTITIONTIME.

Data Type

bool

Default Value

false

Remarks

This property is disabled by default, so partition tables will show the pseudocolumns _PARTITIONDATE and _PARTITIONTIME. To hide these columns, as is done in the native driver and the BigQuery console, this property can be enabled.
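
For example, with this property left at its default, the pseudocolumns can be referenced directly in queries (a sketch; the table name is hypothetical):

SELECT _PARTITIONDATE, COUNT(*) FROM `proj`.`ds`.`events` GROUP BY _PARTITIONDATE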

CData Cloud

AllowAggregateParameters

Allows raw aggregates to be used in parameters when QueryPassthrough is enabled.

Data Type

bool

Default Value

false

Remarks

This option affects how string parameters are handled when using direct queries through QueryPassthrough. For example, consider this query:

INSERT INTO proj.data.tbl(x) VALUES (@x)

By default, this option is disabled and string parameters are quoted and escaped into SQL strings. That means that any value can be safely used as a string parameter, but it also means that parameters cannot be used as raw aggregate values:

/*
 * If @x is set to: test value ' contains quote
 *
 * Result is a valid query
*/
INSERT INTO proj.data.tbl(x) VALUES ('test value \' contains quote')

/*
 * If @x is set to: ['valid', ('aggregate', 'value')]
 *
 * Result contains string instead of aggregate:
*/
INSERT INTO proj.data.tbl(x) VALUES ('[\'valid\', (\'aggregate\', \'value\')]')

When this option is enabled, string parameters are inserted directly into the query. This means that raw aggregates can be used as parameters, but it also means that all simple strings must be escaped:

/*
 * If @x is set to: test value ' contains quote
 *
 * Result is an invalid query
*/
INSERT INTO proj.data.tbl(x) VALUES (test value ' contains quote)

/*
 * If @x is set to: ['valid', ('aggregate', 'value')]
 *
 * Result is an aggregate
*/
INSERT INTO proj.data.tbl(x) VALUES (['valid', ('aggregate', 'value')])

CData Cloud

ApplicationName

An application name in the form application/version. For example, AcmeReporting/1.0.

Data Type

string

Default Value

""

Remarks

The Cloud identifies itself to BigQuery using a Google partner User-Agent header. The first part of the User-Agent is fixed and identifies the client as a specific build of the CData Cloud. The last portion reports the specific application using the Cloud.

CData Cloud

AuditLimit

The maximum number of rows which will be stored within an audit table.

Data Type

string

Default Value

"1000"

Remarks

When auditing is enabled with the AuditMode option, this property is used to determine how many rows will be allowed in the audit table at once.

By default this property is 1000, meaning that only the 1000 most recent audit events will be available within the audit table.

This property can also be set to -1, which places no limits on the size of the audit table. In this mode, the audit table should be periodically cleared to prevent the Cloud from using excessive memory. For example, the following statement clears the audit table:

DELETE FROM AuditJobs#TEMP

CData Cloud

AuditMode

What provider actions should be recorded to audit tables.

Data Type

string

Default Value

""

Remarks

The Cloud can record certain internal actions taken when it runs queries. For each of the actions listed in this option, the Cloud creates a temporary audit table which logs when the action took place, what query caused the action, and any other relevant information.

By default this option is set to 'none' and the Cloud does not record any audit information. This option can also be set to a comma-separated list of the following actions:

Mode Name Audit Table Description Columns
start-jobs AuditJobs#TEMP Records all jobs started by the Cloud Timestamp,Query,ProjectId,Location,JobId

Refer to AuditLimit for more information on how to limit the size of these tables.

CData Cloud

AWSWorkloadIdentityConfig

Configuration properties to provide when using Workload Identity Federation via AWS.

Data Type

string

Default Value

""

Remarks

The properties are formatted as a semicolon-separated list of Key=Value properties, where the value is optionally quoted. For example, this setting authenticates in AWS using a user's root keys:

AWSWorkloadIdentityConfig="AuthScheme=AwsRootKeys;AccessKey='AKIAABCDEF123456';SecretKey=...;Region=us-east-1"

CData Cloud

BigQueryOptions

A comma separated list of Google BigQuery options.

Data Type

string

Default Value

""

Remarks

A list of Google BigQuery options:

Option - Description
gbqoImplicitJoinAsUnion - This option will prevent the driver from converting an IMPLICIT JOIN into a CROSS JOIN as expected by SQL92. Instead, it will leave it as an IMPLICIT JOIN, which Google BigQuery will execute as a UNION ALL.

CData Cloud

MaximumBillingTier

The MaximumBillingTier is a positive integer that serves as a multiplier of the basic price per TB. For example, if you set MaximumBillingTier to 2, the maximum cost for that query will be 2x basic price per TB.

Data Type

string

Default Value

""

Remarks

Limits the billing tier for this job. Queries that have resource usage beyond this tier will fail (without incurring a charge). If unspecified, this will be set to your project default.

If your query is too compute intensive for BigQuery to complete at the standard per TB pricing tier, BigQuery returns a billingTierLimitExceeded error and an estimate of how much the query would cost. To run the query at a higher pricing tier, pass a new value for maximumBillingTier as part of the query request. The maximumBillingTier is a positive integer that serves as a multiplier of the basic price per TB. For example, if you set maximumBillingTier to 2, the maximum cost for that query will be 2x basic price per TB.

CData Cloud

MaximumBytesBilled

Limits how many bytes BigQuery will allow a job to consume before it is cancelled.

Data Type

string

Default Value

""

Remarks

When this value is provided, all jobs will use this value as their default billing cap. If a job uses more than this many bytes, BigQuery will cancel it and it will not be billed. By default there is no cap and all jobs will be billed for however many bytes they consume.

This only has an effect when using DestinationTable or when using the InsertJob stored procedure. BigQuery does not allow standard query jobs to have byte limits.

CData Cloud

MaxRows

Specifies the maximum rows returned for queries without aggregation or GROUP BY.

Data Type

int

Default Value

-1

Remarks

This property sets an upper limit on the number of rows the Cloud returns for queries that do not include aggregation or GROUP BY clauses. This limit ensures that queries do not return excessively large result sets by default.

When a query includes a LIMIT clause, the value specified in the query takes precedence over the MaxRows setting. If MaxRows is set to "-1", no row limit is enforced unless a LIMIT clause is explicitly included in the query.

This property is useful for optimizing performance and preventing excessive resource consumption when executing queries that could otherwise return very large datasets.
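
For example, with MaxRows=1000 (an illustrative value and a hypothetical table):

-- Returns at most 1000 rows because of MaxRows
SELECT * FROM `proj`.`ds`.`events`

-- The explicit LIMIT takes precedence over MaxRows
SELECT * FROM `proj`.`ds`.`events` LIMIT 5000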

CData Cloud

PseudoColumns

Specifies the pseudocolumns to expose as table columns. Use the format 'TableName=ColumnName;TableName=ColumnName'. The default is an empty string, which disables this property.

Data Type

string

Default Value

""

Remarks

This property allows you to define which pseudocolumns the Cloud exposes as table columns.

To specify individual pseudocolumns, use the following format: "Table1=Column1;Table1=Column2;Table2=Column3"

To include all pseudocolumns for all tables use: "*=*"

CData Cloud

SupportCaseSensitiveTables

By default, the provider treats table names as case-insensitive, so if multiple tables have the same name but different casing, only one will be reported in the metadata.

Data Type

bool

Default Value

false

Remarks

When this property is set to true, tables with the same name but different casing will be renamed so they are all reported in the metadata.

CData Cloud

TableSamplePercent

This determines what percent of a table is sampled with the TABLESAMPLE operator.

Data Type

string

Default Value

""

Remarks

This option can be set to make the Cloud use the TABLESAMPLE operator for each table referenced by a query. The value determines what percent is provided to the PERCENT clause. That clause is only generated if this property's value is above zero.

-- Input SQL
SELECT * FROM `tbl`

-- Generated Google BigQuery SQL when TableSamplePercent=10
SELECT * FROM `tbl` TABLESAMPLE SYSTEM (10 PERCENT)

This option is subject to a few limitations:

  • It is applied during query conversion and has no effect when QueryPassthrough is set.
  • More rows may be returned than expected due to how the server implements TABLESAMPLE. Please see the Google BigQuery documentation for more information.
  • TABLESAMPLE is not supported on views. If a view is queried in sampling mode, the Cloud will omit the TABLESAMPLE clause for the view.

CData Cloud

Timeout

The value in seconds until the timeout error is thrown, canceling the operation.

Data Type

string

Default Value

"300"

Remarks

If Timeout = 0, operations do not time out. The operations run until they complete successfully or until they encounter an error condition.

If Timeout expires and the operation is not yet complete, the Cloud throws an exception.

CData Cloud

WorkloadPoolId

The ID of your Workload Identity Federation pool.

Data Type

string

Default Value

""

Remarks

The ID of your Workload Identity Federation pool. This property is used together with WorkloadProjectId and WorkloadProviderId when authenticating through Workload Identity Federation.


WorkloadProjectId

The ID of the Google Cloud project that hosts your Workload Identity Federation pool.

Data Type

string

Default Value

""

Remarks

The ID of the Google Cloud project that hosts your Workload Identity Federation pool. This property is used together with WorkloadPoolId and WorkloadProviderId when authenticating through Workload Identity Federation.


WorkloadProviderId

The ID of your Workload Identity Federation pool provider.

Data Type

string

Default Value

""

Remarks

The ID of your Workload Identity Federation pool provider. This property is used together with WorkloadProjectId and WorkloadPoolId when authenticating through Workload Identity Federation.


Third Party Copyrights

protobuf v. 3.5.1

Copyright 2008 Google Inc. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Code generated by the Protocol Buffer compiler is owned by the owner of the input file used when generating it. This code is not standalone and requires a support library to be linked with it. This support library is itself covered by the above license.

Google API Protobuf Definitions (Arrow)

Apache License Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:

(a) You must give any other recipients of the Work or Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.

You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Google API Protobuf Definitions (Avro)

Apache License Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:

(a) You must give any other recipients of the Work or Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.

You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Copyright (c) 2025 CData Software, Inc. - All rights reserved.
Build 24.0.9175