Databricks

Version 24.2.8980



You can use the Databricks connector from the CData Sync application to capture data from Databricks and move it to any supported destination. To do so, you need to add the connector, authenticate to Databricks, and complete your connection.

Add the Databricks Connector

To enable Sync to use data from Databricks, you must first add the connector, as follows:

  1. Open the Connections page of the Sync dashboard.

  2. Click Add Connection to open the Select Connectors page.

  3. Click the Sources tab and locate the Databricks row.

  4. Click the Configure Connection icon at the end of that row to open the New Connection page. If the Configure Connection icon is not available, click the Download Connector icon to install the Databricks connector. For more information about installing new connectors, see Connections.

Authenticate to Databricks

After you add the connector, you need to set the required properties.

  • Connection Name - Enter a connection name of your choice.

  • Server - Enter the host name or the IP address of the server that hosts the Databricks database.

  • HTTP Path - Enter the HTTP path of your Databricks cluster.
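If you copy the workspace address from a browser, it usually includes a scheme and a trailing path, whereas the Server property expects only the host name or IP address. The following Python sketch shows one way to extract the host name before you paste it into Sync. The function name and the sample URL are hypothetical and are not part of CData Sync.

```python
from urllib.parse import urlparse


def server_from_workspace_url(value: str) -> str:
    """Return only the host name from a workspace URL or a bare host name.

    Hypothetical helper for illustration; not part of CData Sync.
    """
    value = value.strip()
    # urlparse only recognizes the host when a scheme or "//" prefix is present,
    # so add "//" when the input is a bare host name.
    parsed = urlparse(value if "//" in value else "//" + value)
    if not parsed.hostname:
        raise ValueError(f"no host name found in {value!r}")
    return parsed.hostname


# Example with a made-up workspace URL.
print(server_from_workspace_url("https://adb-1234567890123456.7.azuredatabricks.net/?o=1"))
```

The helper does not touch the HTTP Path value; copy that value from your cluster details as it is shown there.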

CData Sync supports several ways of authenticating to Databricks. The following sections describe the properties to set for each authentication method.

Personal Access Token

To connect with a personal access token (PAT), set the following properties:

  • Auth Scheme - Select PersonalAccessToken.

  • Token - Enter the PAT that you use to connect to the Databricks server.

    To obtain your token:

    1. In your Azure Databricks workspace, click your username in the top bar and select Settings.

    2. Click Developer.

    3. Click Manage (next to Access tokens). Then, click Generate new token > Generate.

    Copy and save your new token because it will not be displayed again.
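Before you enter the token in Sync, you might want to confirm that the workspace accepts it. The following Python sketch sends the token as a bearer token to the Databricks REST API endpoint that lists clusters. It is a minimal check under stated assumptions, not how Sync itself connects: the function names are hypothetical, and the sketch assumes that your workspace is reachable over HTTPS at the host name you use for the Server property.

```python
import urllib.error
import urllib.request


def build_token_check_request(server: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that presents the PAT as a bearer token."""
    url = f"https://{server}/api/2.0/clusters/list"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def token_is_accepted(server: str, token: str, timeout: float = 10.0) -> bool:
    """Return True if the workspace answers the request with HTTP 200.

    Hypothetical helper for illustration. A 401 or 403 response means that the
    workspace did not accept the token for this call. Other errors (for example,
    DNS or network failures) are raised to the caller.
    """
    request = build_token_check_request(server, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            return False
        raise
```

To run the check, call token_is_accepted with your own host name and token, for example token_is_accepted("example.cloud.databricks.com", "<your token>"). Avoid storing the token in scripts or shell history.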

Basic

To connect with your user credentials, set the following properties:

  • Auth Scheme - Select Basic.

  • User - Enter the username that you use to authenticate to your Databricks account.

  • Password - Enter the password that you use to authenticate to your Databricks account.

Azure Service Principal

To connect with an Azure service principal, set the following properties:

  • Auth Scheme - Select AzureServicePrincipal.

  • Azure Tenant Id - Enter the tenant identifier (Id) for your Microsoft Azure Active Directory application.

  • Azure Client Id - Enter the client Id for your Microsoft Azure Active Directory application.

  • Azure Client Secret - Enter the client secret for your Microsoft Azure Active Directory application.

  • Azure Subscription Id - Enter the subscription Id for your Microsoft Azure Databricks service workspace.

  • Azure Resource Group - Enter the resource-group name for your Microsoft Azure Databricks service workspace.

  • Azure Workspace - Enter the name of your Microsoft Azure Databricks service workspace.

Azure Active Directory

To connect with an Azure Active Directory user account, set the following properties:

  • Auth Scheme - Select AzureAD.

  • Azure Tenant - Enter the Microsoft Online tenant that you use to access data. If you do not specify a tenant, Sync uses the default tenant.

  • OAuth Client Id - Enter the client Id that you were assigned when you registered your application with an OAuth authorization server.
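The properties that each authentication method requires can be summarized as a lookup table. The following Python sketch is an illustrative checklist only: it uses the property labels from this page as dictionary keys, treats Azure Tenant as optional (as described above), and assumes that OAuth Client Id is required for AzureAD. The names REQUIRED_ALWAYS, REQUIRED_BY_SCHEME, and missing_properties are hypothetical and are not part of CData Sync.

```python
# Properties that every connection needs, as listed under "Authenticate to Databricks".
REQUIRED_ALWAYS = ("Connection Name", "Server", "HTTP Path")

# Additional properties per Auth Scheme value, taken from the sections above.
REQUIRED_BY_SCHEME = {
    "PersonalAccessToken": ("Token",),
    "Basic": ("User", "Password"),
    "AzureServicePrincipal": (
        "Azure Tenant Id",
        "Azure Client Id",
        "Azure Client Secret",
        "Azure Subscription Id",
        "Azure Resource Group",
        "Azure Workspace",
    ),
    # Azure Tenant is optional; Sync uses the default tenant when it is omitted.
    "AzureAD": ("OAuth Client Id",),
}


def missing_properties(settings: dict) -> list:
    """Return the labels of required properties that are absent or blank."""
    scheme = settings.get("Auth Scheme")
    if scheme not in REQUIRED_BY_SCHEME:
        raise ValueError(f"unknown Auth Scheme: {scheme!r}")
    required = REQUIRED_ALWAYS + REQUIRED_BY_SCHEME[scheme]
    missing = []
    for name in required:
        value = settings.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example with placeholder values: a Basic connection that lacks a password.
draft = {
    "Connection Name": "databricks-source",
    "Server": "example.cloud.databricks.com",
    "HTTP Path": "<your HTTP path>",
    "Auth Scheme": "Basic",
    "User": "someone@example.com",
}
print(missing_properties(draft))
```

A checklist like this can help you gather all values before you open the New Connection page, but Sync performs its own validation when you click Create & Test.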

Complete Your Connection

To complete your connection:

  1. (Optional) For Database, enter the name of your Databricks database.

  2. Define advanced connection settings on the Advanced tab. (In most cases, though, you should not need these settings.)

  3. If you authenticate with AzureAD, click Connect to Databricks to connect to your Databricks account.

  4. Click Create & Test to create your connection.

More Information

For more information about interactions between CData Sync and Databricks, see Databricks Connector for CData Sync.