Google Cloud Storage

Version 25.3.9396



You can use the Google Cloud Storage connector from the CData Sync application to capture data from Google Cloud Storage and move it to any supported destination. To do so, you need to add the connector, authenticate to the connector, and complete your connection.

Supported File Formats

When Sync writes data to Google Cloud Storage, you can choose the file format for the exported data. The following file formats are supported for the Google Cloud Storage destination:

  • (Default) CSV—Plain text comma-separated values.

  • Avro—A row-based binary format that supports schema evolution.

  • Parquet—A columnar storage format that is optimized for analytics.

Prerequisites

If you want to authenticate with a Google Cloud service account (for silent authentication or delegated organization-wide access), do the following before configuring your connection in Sync:

  • Create a service account in Google Cloud and grant it the required permissions on your bucket or project.

  • For OAuth JWT authentication, register a custom OAuth application and download the certificate file (.p12 or .pfx).

Then proceed to the relevant authentication method below.

Add the Google Cloud Storage Connector

To enable Sync to use data from Google Cloud Storage, you must first add the connector, as follows:

  1. Open the Connections page of the Sync dashboard.

  2. Click Add Connection to open the Select Connectors page.

  3. Click the Sources tab and locate the Google Cloud Storage row.

  4. Click the Configure Connection icon at the end of that row to open the New Connection page. If the Configure Connection icon is not available, click the Download Connector icon to install the Google Cloud Storage connector. For more information about installing new connectors, see Connections.

Authenticate to Google Cloud Storage

After you add the connector, you need to set the required properties.

  • Connection Name: Enter a connection name of your choice.

  • File Format: Select the file format that you want to use: CSV (default), Avro, or Parquet.

    Note: Although the Delta Parquet file format is listed in the available file formats, Sync does not support delta file formats for sources. This option appears in the UI for file-based source connectors only because the delta file format cannot be restricted based on whether the connector is a source or a destination.

  • URI: Enter the path to the resource location for your file format.

  • Project Id: Enter the identifier (Id) of the project where your Google Cloud Storage instance resides.
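
To illustrate what the URI property identifies, the following sketch splits a bucket-style URI into a bucket name and an object path. The gs://bucket/path form and the helper name split_storage_uri are assumptions made for illustration only. They are not part of CData Sync, so check the connector reference for the exact URI syntax that Sync expects.

```python
from urllib.parse import urlsplit


def split_storage_uri(uri: str) -> tuple[str, str]:
    """Split a URI such as gs://bucket/folder into (bucket, path).

    Assumption: the URI uses the gs:// scheme. This helper is
    hypothetical and is not part of CData Sync.
    """
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError(f"expected a gs://bucket/path URI, got {uri!r}")
    # urlsplit keeps the leading slash on the path, so strip it.
    return parts.netloc, parts.path.lstrip("/")


bucket, path = split_storage_uri("gs://example-bucket/exports/2024")
print(bucket, path)
```

The bucket is the part that the service account or OAuth user needs permission to read. The path narrows the connection to a folder or to a single file.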

CData Sync supports authenticating to Google Cloud Storage in several ways. Select your authentication method below to proceed to the relevant section that contains the authentication details.

OAuth

To connect with OAuth custom credentials, specify the following properties:

  • Auth Scheme: Select OAuth.

  • OAuth Version: Select the version of OAuth that you want to use. The default version is 2.0.

  • (Optional) Scope: Specify the scope of your access to the application.

  • (Optional) OAuth Authorization URL: Enter the OAuth authorization URL for the OAuth service.

  • (Optional) OAuth Access Token URL: Enter the URL from which to retrieve the access token.

  • (Optional) OAuth Refresh Token URL: Enter the URL from which to refresh the OAuth token.

OAuth PKCE

To connect with the OAuth PKCE extension, specify the following properties:

  • Auth Scheme: Select OAuthPKCE.

  • (Optional) OAuth Client Id: Enter the client identifier (Id) that you were assigned when you registered your application with an OAuth authorization server.
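
Sync handles the PKCE exchange for you, so no code is required. For readers who want to see what the extension adds to a standard OAuth flow, here is a minimal sketch of the code verifier and S256 code challenge defined in RFC 7636. The function names are hypothetical, and this is not Sync's implementation.

```python
import base64
import hashlib
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def new_code_verifier() -> str:
    # 32 random bytes encode to 43 characters, the minimum length
    # that RFC 7636 allows for a code verifier.
    return b64url(secrets.token_bytes(32))


def challenge_for(verifier: str) -> str:
    """Derive the S256 code challenge from a code verifier."""
    return b64url(hashlib.sha256(verifier.encode("ascii")).digest())


verifier = new_code_verifier()
print(verifier, challenge_for(verifier))
```

The client sends the challenge with the authorization request and the verifier with the token request. The authorization server can then confirm that both requests came from the same client without needing a client secret.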

OAuth JWT

To connect with a Google Cloud service account through OAuth JWT, specify the following properties:

  • Auth Scheme: Select OAuthJWT.

  • OAuth JWT Cert: Enter your JSON Web Token (JWT) certificate store.

  • OAuth JWT Cert Type: Enter the type of key store that contains your JWT certificate. The default type is PEMKEY_BLOB.

  • OAuth Client Id: Enter the client identifier (Id) that you were assigned when you registered your application with an OAuth authorization server.

  • OAuth Client Secret: Enter the client secret that you were assigned when you registered your application with an OAuth authorization server.

  • (Optional) Scope: Specify the scope of your access to the application.

  • (Optional) OAuth Authorization URL: Enter the OAuth authorization URL for the OAuth service.

  • (Optional) OAuth Access Token URL: Enter the URL from which to retrieve the access token.

  • (Optional) OAuth Refresh Token URL: Enter the URL from which to refresh the OAuth token.

  • (Optional) OAuth JWT Cert Password: Enter the password for your OAuth JWT certificate.

  • (Optional) OAuth JWT Cert Subject: Enter the subject of your OAuth JWT certificate.

  • (Optional) OAuth JWT Subject: Enter the user subject for which the application is requesting delegated access.

  • (Optional) OAuth JWT Subject Type: Select the subject type (enterprise or user) for the JWT authentication. The default type is enterprise.

  • (Optional) OAuth JWT Public Key Id: Enter the Id of the public key for JWT.
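
The following sketch shows how several of these properties could map onto the JWT that Google's OAuth 2.0 service-account flow expects. It builds only the unsigned header and claim set. Signing with the private key from the certificate store is left out because it requires a cryptography library. The function name, the example issuer, and the mapping of OAuth JWT Subject to the sub claim are assumptions for illustration. Sync builds and signs the token internally.

```python
import base64
import json
import time
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(issuer: str, scope: str,
                        subject: Optional[str] = None,
                        now: Optional[int] = None) -> str:
    """Return '<header>.<claims>', the text that would be signed with RS256."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": issuer,                       # service account email
        "scope": scope,                      # corresponds to the Scope property
        "aud": "https://oauth2.googleapis.com/token",
        "iat": issued_at,
        "exp": issued_at + 3600,             # Google accepts at most one hour
    }
    if subject:
        # Assumed mapping: OAuth JWT Subject -> user to impersonate.
        claims["sub"] = subject
    encode = lambda obj: b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
    return encode(header) + "." + encode(claims)


print(build_signing_input(
    "sync-reader@example-project.iam.gserviceaccount.com",  # hypothetical account
    "https://www.googleapis.com/auth/devstorage.read_only",
))
```

The sub claim is needed only when the service account acts on behalf of a user through domain-wide delegation.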

GCP Instance Account

When you run CData Sync on a GCP virtual machine, Sync can authenticate by using the service account that is tied to the virtual machine. Select GCPInstanceAccount for Auth Scheme to use that account. No additional properties are required.
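
For background, a process on a Compute Engine virtual machine can obtain a token for the attached service account from the instance metadata server. The sketch below only builds that request, so it can be inspected on any machine. Whether Sync uses exactly this request is an assumption, and the function name is hypothetical.

```python
import urllib.request

# Metadata server endpoint for the VM's default service account token.
METADATA_TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Build (but do not send) the token request for the metadata server."""
    # The metadata server rejects requests that lack this header.
    return urllib.request.Request(
        METADATA_TOKEN_URL, headers={"Metadata-Flavor": "Google"}
    )


request = build_token_request()
print(request.full_url)
```

On a GCP virtual machine, passing this request to urllib.request.urlopen would return a JSON document that contains an access token. Outside GCP, the host name does not resolve, which is why this authentication scheme works only when Sync itself runs on a GCP virtual machine.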

Complete Your Connection

To complete your connection:

  1. Specify the following properties:

    For all file formats:

    • (Optional) Project Id: Enter the identifier (Id) of the project where your Google Cloud Storage instance resides.

      Note: This property is required for the Avro file format, and it is set in the Authenticate to Google Cloud Storage section.

    For the CSV file format:

    • FMT: Enter the format that you want to use to parse all text files. The default format is CsvDelimited.

    • Aggregate Files: Specify whether you want to aggregate all the files that are located in the URI directory and that have the same schema into a single table named AggregatedFiles. The default option is False.

    • Include Column Headers: Specify whether you want to obtain column headers from the first lines of the specified files. The default option is True.

    For the Avro and Parquet file formats:

    • Data Model: Select the data model that you want to use to parse documents for your format and to generate the database metadata. The default data model is Document.

    • Aggregate Files: Specify whether you want to aggregate all the files that are located in the URI directory and that have the same schema into a single table named AggregatedFiles. The default option is False.

  2. Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)

  3. If you authenticate with OAuth or OAuthPKCE, click Connect to Google Cloud Storage to connect to your Google Cloud Storage account.

  4. Click Create & Test to create your connection.
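
The effect of the Include Column Headers and Aggregate Files properties in step 1 can be pictured with a small sketch. It is a conceptual illustration, not Sync's implementation. The function names, the generated col1, col2 names, and the rule that the first file defines the shared schema are all assumptions.

```python
import csv
import io


def read_rows(text: str, include_column_headers: bool = True) -> list[dict]:
    """Parse CSV text into rows keyed by column name."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    if include_column_headers:
        # The first line supplies the column names.
        names, data = rows[0], rows[1:]
    else:
        # No header line: generate placeholder names (hypothetical scheme).
        names = [f"col{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(names, row)) for row in data]


def aggregate_files(files: dict[str, str]) -> list[dict]:
    """Combine rows from every file whose columns match the first file's."""
    combined: list[dict] = []
    schema = None
    for name in sorted(files):
        rows = read_rows(files[name])
        if not rows:
            continue
        columns = tuple(rows[0].keys())
        if schema is None:
            schema = columns  # assumption: the first file defines the schema
        if columns == schema:
            combined.extend(rows)
    return combined


files = {
    "jan.csv": "id,name\n1,Ann\n",
    "feb.csv": "id,name\n2,Bob\n",
    "other.csv": "x,y\n9,9\n",
}
print(aggregate_files(files))
```

With Aggregate Files set to False, each file would instead appear as its own table.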

More Information

For more information about interactions between CData Sync and Google Cloud Storage, see Google Cloud Storage Connector for CData Sync.