ODBC Driver for Azure Data Lake Storage

Build 24.0.9060

Linux DSN Configuration

This section describes how to set up ODBC connectivity and configure DSNs on several Linux distributions: Debian-based systems, like Ubuntu, and Red Hat Linux platforms, like Red Hat Enterprise Linux (RHEL) and Fedora.

Minimum Linux Versions

Here are the minimum supported versions for Debian-based, Red Hat-based, and SUSE systems:

OS        Min. Version
Ubuntu    18.04
Debian    10
RHEL      8
Fedora    28
SUSE      15
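
As a rough illustration, the table above can be checked against the contents of /etc/os-release. This is a minimal sketch: the mapping from os-release ID values (such as "sles" and "opensuse-leap" for SUSE) to the table rows is an assumption, and the function names are hypothetical.

```python
# Minimum versions from the table above, keyed by os-release ID.
# The ID strings (especially the two SUSE entries) are assumptions.
MIN_VERSIONS = {
    "ubuntu": (18, 4),
    "debian": (10,),
    "rhel": (8,),
    "fedora": (28,),
    "sles": (15,),
    "opensuse-leap": (15,),
}


def parse_os_release(text):
    """Parse KEY=VALUE lines from os-release content into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def version_tuple(version):
    """Turn '18.04' into (18, 4); stop at the first non-numeric part."""
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def meets_minimum(info):
    """Return True/False, or None when the OS or version is not recognized."""
    minimum = MIN_VERSIONS.get(info.get("ID", ""))
    version = version_tuple(info.get("VERSION_ID", ""))
    if minimum is None or not version:
        return None
    return version >= minimum
```

On a target machine, the text of /etc/os-release could be read and passed to parse_os_release, and the result to meets_minimum.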

Installing the Driver Dependencies

Run the following commands as root or with sudo to install the necessary dependencies:

  • Debian/Ubuntu:
    apt-get install libc6 libstdc++6 zlib1g libgcc1
  • RHEL/Fedora:
    yum install glibc libstdc++ zlib libgcc

Installing the Driver

You can use standard package management systems to install the driver.

On Debian-based systems, like Ubuntu, run the following command with root or sudo:

dpkg -i /path/to/driver/setup/ADLSODBCDriverforUnix.deb 

On systems that support the RPM package format, run the following command with root or sudo:

rpm -ivh /path/to/driver/ADLSODBCDriverforUnix.rpm 

Licensing the Driver

Run the following commands to license the driver. To activate a trial, omit the <key> input.

cd /opt/cdata/cdata-odbc-driver-for-adls/bin/
sudo ./install-license.sh <key>

Connecting through the Driver Manager

The driver manager loads the driver and passes function calls from the application to the driver. The driver must be registered with the driver manager, and DSNs are defined in the driver manager's configuration files.

The driver installation registers the driver with the unixODBC driver manager and creates a system DSN. The unixODBC driver manager can be used from Python and from many other applications. Your application may embed another driver manager.
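
To see which DSNs a driver manager configuration file defines, the file can be read as INI text. The sketch below uses Python's standard configparser module; it assumes the file follows the usual odbc.ini layout (one section per DSN, plus an optional "ODBC Data Sources" index section). The function name is hypothetical.

```python
import configparser


def list_dsns(ini_text):
    """Return {dsn_name: {property: value}} for each DSN section in odbc.ini text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep property names case-sensitive
    parser.read_string(ini_text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name != "ODBC Data Sources"  # index section, not a DSN itself
    }
```

With unixODBC, the system file is commonly /etc/odbc.ini and the per-user file is ~/.odbc.ini; running `odbcinst -j` reports the paths actually in use.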

Creating the DSN

See Using unixODBC to install unixODBC and configure DSNs. See Using the DataDirect Driver Manager to create a DSN to connect to OBIEE, Informatica, and SAS.

Connecting to Azure Data Lake Storage Gen 2

To connect to an Azure Data Lake Storage Gen 2 account, set the following properties:

  • Account: The name of the storage account.
  • FileSystem: The file system name used for this account. For example, the name of an Azure Blob Container.
  • Directory (Optional): The path to the location where the replicated file should be stored. If no path is specified, the file is stored in the root directory.
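
The properties above can also be supplied in an ODBC connection string. The following is a minimal sketch, assuming the driver accepts these properties next to a DSN keyword; the DSN name and all values are placeholders, and the function name is hypothetical.

```python
def build_connection_string(dsn, account, file_system, directory=None):
    """Assemble 'Key=Value;' pairs for the Gen 2 connection properties."""
    props = {"DSN": dsn, "Account": account, "FileSystem": file_system}
    if directory:
        # Directory is optional; when omitted, the root directory is used.
        props["Directory"] = directory
    return "".join(f"{key}={value};" for key, value in props.items())
```

If the third-party pyodbc package is installed, the resulting string could be passed to pyodbc.connect(). Authentication properties, described next, would be appended in the same way.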

Authenticating to Azure Data Lake Storage Gen 2

CData ODBC Driver for Azure Data Lake Storage supports four ways to authenticate: an access key (AccessKey), a Shared Access Signature (SAS), Azure Active Directory OAuth (AzureAD), and Managed Service Identity (AzureMSI).

Access Key

To connect using an access key, you must first obtain an available access key for the ADLS Gen2 storage account.

At the Azure portal:

  1. Go to your ADLS Gen2 Storage Account.
  2. Under Settings, select Access keys.
  3. Copy the value for one of the available access keys to the AccessKey connection property.

When you are ready to connect, set these properties:

  • AuthScheme: AccessKey.
  • AccessKey: The access key value you just retrieved from the Azure Portal.
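
Because an access key grants full access to the storage account, one option is to keep it out of scripts and read it from the environment. This is only a sketch: the environment variable name ADLS_ACCESS_KEY and the function name are hypothetical, not something the driver defines.

```python
import os


def access_key_properties(env_var="ADLS_ACCESS_KEY", environ=None):
    """Build the access-key authentication properties from an environment variable."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to an access key copied from the Azure portal")
    return {"AuthScheme": "AccessKey", "AccessKey": key}
```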

Shared Access Signature (SAS)

To connect using a Shared Access Signature, you must first generate one using the Azure Storage Explorer tool.

When you are ready to connect, set these properties:

  • AuthScheme: SAS.
  • SharedAccessSignature: The value of the Shared Access Signature you just generated.
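
A Shared Access Signature has a limited lifetime, so it can help to check the expiry before connecting. The sketch below assumes the token is the usual Azure SAS query string, in which the `se` field holds the expiry time in ISO 8601 form; the function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def sas_expiry(sas_token):
    """Return the token's 'se' (expiry) field as a UTC datetime, or None if absent."""
    fields = parse_qs(sas_token.lstrip("?"))
    values = fields.get("se")
    if not values:
        return None
    # fromisoformat() on older Python versions does not accept a trailing 'Z'.
    expiry = datetime.fromisoformat(values[0].replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry.astimezone(timezone.utc)


def sas_properties(sas_token, now=None):
    """Build the SAS authentication properties, rejecting an expired token."""
    now = now or datetime.now(timezone.utc)
    expiry = sas_expiry(sas_token)
    if expiry is not None and expiry <= now:
        raise ValueError(f"Shared Access Signature expired at {expiry.isoformat()}")
    return {"AuthScheme": "SAS", "SharedAccessSignature": sas_token}
```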

Azure AD

Azure AD is Microsoft’s multi-tenant, cloud-based directory and identity management service. Authentication is user-based and requires that you set AuthScheme to AzureAD.

Authentication to Azure AD over a Web application always requires the creation of a custom OAuth application. For details, see Creating an Azure AD Application.

Desktop Applications

CData provides an embedded OAuth application that simplifies connection to Azure AD from a Desktop application.

You can also authenticate from a desktop application using a custom OAuth application. (For further information, see Creating an Azure AD Application.) To authenticate via Azure AD, set these parameters:

  • AuthScheme: AzureAD.
  • Custom applications only:

    • OAuthClientId: The client Id assigned when you registered your custom OAuth application.
    • OAuthClientSecret: The client secret assigned when you registered your custom OAuth application.
    • CallbackURL: The redirect URI you defined when you registered your custom OAuth application.

When you connect, the driver opens Azure Data Lake Storage's OAuth endpoint in your default browser. Log in and grant permissions to the application.

The driver completes the OAuth process, obtaining an access token from Azure Data Lake Storage and using it to request data. The OAuth values are saved in the path specified in OAuthSettingsLocation. These values persist across connections.

When the access token expires, the driver refreshes it automatically.
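
The difference between the embedded and custom OAuth applications comes down to which properties are set. A minimal sketch of that rule, with a hypothetical function name: no extra values selects the embedded application, while a custom application needs all three custom properties.

```python
def azure_ad_properties(client_id=None, client_secret=None, callback_url=None):
    """Build Azure AD properties for the embedded or a custom OAuth application."""
    custom = {
        "OAuthClientId": client_id,
        "OAuthClientSecret": client_secret,
        "CallbackURL": callback_url,
    }
    supplied = {key: value for key, value in custom.items() if value}
    props = {"AuthScheme": "AzureAD"}
    if not supplied:
        return props  # embedded OAuth application: AuthScheme alone
    missing = sorted(set(custom) - set(supplied))
    if missing:
        raise ValueError("Custom OAuth application also needs: " + ", ".join(missing))
    props.update(supplied)
    return props
```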

Headless Machines

To configure the driver with a user account on a headless machine, you must authenticate on another device that has an internet browser.

You can do this in either of the following ways:

  • Obtain the OAuthVerifier value as described below in Option 1: Obtain and Exchange a Verifier Code.
  • Install the driver on another machine as described below in Option 2: Transfer OAuth Settings. After you authenticate via the usual browser-based flow, transfer the OAuth authentication values.

Option 1: Obtain and Exchange a Verifier Code

  1. Find the authorization endpoint.

    Custom applications only: Set these properties to create the Authorization URL:

    • OAuthClientId: The client Id assigned when you registered your application.
    • OAuthClientSecret: The client secret assigned when you registered your application.

    Custom and embedded applications: Call the GetOAuthAuthorizationURL stored procedure.

    1. Open the URL returned by the stored procedure in a browser.
    2. Log in and grant permissions to the driver. You are redirected to the callback URL, which contains the verifier code.
    3. Save the value of the verifier code. You will use this later to set the OAuthVerifier connection property.

  2. Exchange the OAuth verifier code for OAuth refresh and access tokens.

    At the headless machine, set these properties:

    • AuthScheme: AzureAD.
    • OAuthVerifier: The verifier code.
    • OAuthSettingsLocation: The location of the file that holds the OAuth token values that persist across connections.
    • Custom applications only:

      • OAuthClientId: The client Id in your custom OAuth application settings.
      • OAuthClientSecret: The client secret in the custom OAuth application settings.

  3. After the OAuth settings file is generated, reset the following properties to connect:

    • OAuthSettingsLocation: The location containing the encrypted OAuth authentication values. Make sure this location grants read and write permissions to the driver to enable the automatic refreshing of the access token.
    • Custom applications only:

      • OAuthClientId: The client Id assigned when you registered your application.
      • OAuthClientSecret: The client secret assigned when you registered your application.
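
For step 1 above, the verifier code can be pulled out of the redirected URL copied from the browser's address bar. This sketch assumes the code arrives in the `code` query parameter, as in the standard OAuth 2.0 authorization code flow; the text above does not name the parameter, so it is left configurable. The function name is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def extract_verifier(callback_url, param="code"):
    """Return the verifier code from the redirected callback URL."""
    query = parse_qs(urlparse(callback_url).query)
    values = query.get(param)
    if not values:
        raise ValueError(f"No '{param}' parameter found in the callback URL")
    return values[0]
```

The returned value is what would be assigned to the OAuthVerifier connection property on the headless machine.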

Option 2: Transfer OAuth Settings

Before you can connect via a headless machine, you must install the driver and create a connection on a device that has an internet browser. Set the connection properties as described above, in Desktop Applications.

After you complete the instructions in Desktop Applications, the resulting authentication values are encrypted and written to the location specified by OAuthSettingsLocation. The default filename is OAuthSettings.txt.

Once you have successfully tested the connection, copy the OAuth settings file to your headless machine.

At the headless machine, set these properties:

  • AuthScheme: AzureAD.
  • OAuthSettingsLocation: The location of your OAuth settings file. Make sure this location gives read and write permissions to the driver to enable the automatic refreshing of the access token.
  • Custom applications only:

    • OAuthClientId: The client Id assigned when you registered your application.
    • OAuthClientSecret: The client secret assigned when you registered your application.
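
After copying the settings file, it is worth confirming that the account running the driver can read and write it, since token refresh rewrites the file. A minimal sketch under that assumption, with hypothetical function and argument names:

```python
import os


def headless_properties(settings_path, client_id=None, client_secret=None):
    """Build headless Azure AD properties after checking the copied settings file."""
    if not os.path.isfile(settings_path):
        raise FileNotFoundError(f"OAuth settings file not found: {settings_path}")
    if not os.access(settings_path, os.R_OK | os.W_OK):
        raise PermissionError(f"Need read and write access to: {settings_path}")
    props = {"AuthScheme": "AzureAD", "OAuthSettingsLocation": settings_path}
    if client_id or client_secret:
        # Custom applications only: both values are required together.
        if not (client_id and client_secret):
            raise ValueError("Set both OAuthClientId and OAuthClientSecret")
        props["OAuthClientId"] = client_id
        props["OAuthClientSecret"] = client_secret
    return props
```

The check runs as the current user, so it is only meaningful when executed under the same account that loads the driver.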

Managed Service Identity (MSI)

If you are running the driver on an Azure VM and want to use MSI to connect, set AuthScheme to AzureMSI.

User-Managed Identities

To obtain a token for a user-assigned managed identity, set the OAuthClientId property to the managed identity's "client_id". This is required when your VM has multiple user-assigned managed identities.

Refreshing OAuth Values

The driver can refresh the temporary OAuth access tokens obtained during the browser-based OAuth authentication exchange. By default, the driver saves the encrypted tokens in the odbc.ini file corresponding to the DSN. For system DSNs, access to this odbc.ini file may be restricted.

To enable the automatic token refresh, either give the driver write access to the system odbc.ini or set the OAuthSettingsLocation connection property to an alternate file path to which the driver has read and write access.

    OAuthSettingsLocation=/tmp/oauthsettings.txt
    

Installing Dependencies for OAuth Authentication

The OAuth authentication standard requires the authenticating user to interact with Azure Data Lake Storage through a web browser. If the first OAuth interaction takes place on the machine where the driver is installed (for example, when connecting from a desktop application), the driver needs access to the xdg-open program, which opens the default browser.

To satisfy this dependency, install the corresponding package with your package manager:

Debian/Ubuntu Package    RHEL/Fedora Package    File
xdg-utils                xdg-utils              xdg-open
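
Whether xdg-open is already available can be checked before attempting the browser-based flow. A small sketch using the standard library's shutil.which; the function name is hypothetical.

```python
import shutil


def browser_launcher_path(program="xdg-open"):
    """Return the full path to the launcher on PATH, or None if it is not installed."""
    return shutil.which(program)
```

A result of None suggests the xdg-utils package still needs to be installed.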

Set the Driver Encoding

The ODBC driver needs to know which encoding to use with the ODBC driver manager. By default, the CData ODBC Drivers for Unix are configured to use UTF-16, which is compatible with unixODBC, but other driver managers may require a different encoding.

Alternatively, if you are using the ODBC driver from an application that uses the ANSI ODBC API, it may be necessary to set the ANSI code page. For example, to import Japanese characters in an ANSI application, you can specify the code page in the config file '/opt/cdata/cdata-odbc-driver-for-adls/lib/cdata.odbc.adls.ini':

[Driver]
AnsiCodePage = 932
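
The same setting can be applied programmatically. The sketch below edits INI text with Python's standard configparser module; it assumes the file is plain INI with a [Driver] section as shown above, and note that configparser discards comments when it rewrites the text. The function name is hypothetical.

```python
import configparser
import io


def set_ansi_code_page(ini_text, code_page):
    """Return ini_text with AnsiCodePage set in the [Driver] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the 'AnsiCodePage' capitalization
    parser.read_string(ini_text)
    if not parser.has_section("Driver"):
        parser.add_section("Driver")
    parser.set("Driver", "AnsiCodePage", str(code_page))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Back up the original file before writing the result over it.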

Copyright (c) 2024 CData Software, Inc. - All rights reserved.