Establishing a Connection
With the CData Cmdlets, users can install a data module, set the connection properties, and start scripting. This section provides examples of using our ADLS Cmdlets with native PowerShell cmdlets, like the CSV import and export cmdlets.
Connecting to Azure Data Lake Storage Gen 2
To connect to a Gen 2 DataLakeStorage account, set the following properties:
- Account: The name of the storage account.
- FileSystem: The file system name used for this account. For example, the name of an Azure Blob Container.
- Directory (Optional): The path to the location where the replicated file should be stored. If no path is specified, the file is stored in the root directory.
Authenticating to Azure Data Lake Storage Gen 2
Azure Data Lake Storage Gen 2 supports five different ways to authenticate: using an Access key (AccessKey), using a Shared Access Signature (SAS), via Azure Active Directory OAuth (AzureAD), via Azure Service Principal (AzureServicePrincipal or AzureServicePrincipalCert), or via Managed Service Identity (AzureMSI).
Access Key
To connect using an access key, you must first obtain an available access key for the ADLS Gen2 storage account. In the Azure portal:
- Go to your ADLS Gen2 Storage Account.
- Under Settings, select Access keys.
- Copy the value for one of the available access keys to the AccessKey connection property.
When you are ready to connect, set these properties:
- AuthScheme: AccessKey.
- AccessKey: The access key value you just retrieved from the Azure Portal.
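As a minimal sketch, the two properties above might be passed to Connect-ADLS as follows. It assumes that each connection property maps to a same-named cmdlet parameter (as -Account, -FileSystem, and -AccessKey do elsewhere in this section); ADLS_ACCESS_KEY is a hypothetical environment variable used here so the key is not hard-coded in the script.

```powershell
# Assumption: connection properties are exposed as same-named Connect-ADLS parameters.
# ADLS_ACCESS_KEY is a hypothetical environment variable holding the key copied
# from the Azure portal.
$accessKeyParams = @{
    Account    = "MyStorageAccount"
    FileSystem = "MyBlobContainer"
    AuthScheme = "AccessKey"
    AccessKey  = $env:ADLS_ACCESS_KEY
}

# Splat the hashtable so each key becomes a named parameter.
$conn = Connect-ADLS @accessKeyParams
```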
Shared Access Signature (SAS)
To connect using a Shared Access Signature, you must first generate one using the Azure Storage Explorer tool.
When you are ready to connect, set these properties:
- AuthScheme: SAS.
- SharedAccessSignature: The value of the Shared Access Signature you just generated.
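A SAS connection could look like the sketch below. It assumes the AuthScheme and SharedAccessSignature properties are accepted as Connect-ADLS parameters; ADLS_SAS is a hypothetical environment variable that holds the signature you generated.

```powershell
# Assumption: -AuthScheme and -SharedAccessSignature exist as Connect-ADLS parameters.
# ADLS_SAS is a hypothetical environment variable holding the generated signature.
$sasParams = @{
    Account               = "MyStorageAccount"
    FileSystem            = "MyBlobContainer"
    AuthScheme            = "SAS"
    SharedAccessSignature = $env:ADLS_SAS
}

$conn = Connect-ADLS @sasParams
```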
Entra ID (Azure AD)
Note: Microsoft has rebranded Azure AD as Entra ID. In topics that require the user to interact with the Entra ID Admin site, we use the same names Microsoft does. However, there are still CData connection properties whose names or values reference "Azure AD".
Microsoft Entra ID is a multi-tenant, cloud-based identity and access management platform. It supports OAuth-based authentication flows that enable the driver to access Azure Data Lake Storage endpoints securely.
Authentication to Entra ID via a web application always requires that you first create and register a custom OAuth application. This enables your application to define its own redirect URI, manage credential scope, and comply with organization-specific security policies.
For full instructions on how to create and register a custom OAuth application, see Creating an Entra ID (Azure AD) Application.
After setting AuthScheme to AzureAD, the steps to authenticate vary, depending on the environment. For details on how to connect from desktop applications, web-based workflows, or headless systems, see the following sections.
Desktop Applications
You can authenticate from a desktop application using either the driver's embedded OAuth application or a custom OAuth application registered in Microsoft Entra ID.
Option 1: Use the Embedded OAuth Application
This is a pre-registered application included with the driver. It simplifies setup by eliminating the need to register your own credentials, and is ideal for development environments, single-user tools, or any setup where quick and easy authentication is preferred.
Set the following connection properties:
- AuthScheme: AzureAD
- InitiateOAuth:
  - GETANDREFRESH – Use for the initial login. Launches the login page and saves tokens.
  - REFRESH – Use this setting when you have already obtained valid access and refresh tokens. Reuses stored tokens without prompting the user again.
When you connect, the driver opens the Microsoft Entra sign-in page in your default browser. After signing in and granting access, the driver retrieves the access and refresh tokens and saves them to the path specified by OAuthSettingsLocation.
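The embedded-application flow might be scripted as shown in this sketch. It assumes the properties above are available as Connect-ADLS parameters; the OAuthSettingsLocation path is a hypothetical example.

```powershell
# Assumption: connection properties map to same-named Connect-ADLS parameters.
# The settings path below is a hypothetical location for the saved tokens.
$embeddedParams = @{
    Account               = "MyStorageAccount"
    FileSystem            = "MyBlobContainer"
    AuthScheme            = "AzureAD"
    InitiateOAuth         = "GETANDREFRESH"   # first login: opens the browser sign-in page
    OAuthSettingsLocation = "C:\Users\Public\adls\OAuthSettings.txt"
}

$conn = Connect-ADLS @embeddedParams
```

On later runs, once valid tokens are stored, InitiateOAuth could be switched to REFRESH so the stored tokens are reused without a prompt.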
Option 2: Use a Custom OAuth Application
If your organization requires more control, such as managing security policies, redirect URIs, or application branding, you can instead register a custom OAuth application in Microsoft Entra ID and provide its values during connection.
During registration, record the following values:
- OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
- OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
- CallbackURL: A redirect URI you defined during application registration.
For full instructions on how to register a custom OAuth application and configure redirect URIs, see Creating an Entra ID (Azure AD) Application.
Set the following connection properties:
- AuthScheme: AzureAD
- InitiateOAuth:
  - GETANDREFRESH – Use for the initial login. Launches the login page and saves tokens.
  - REFRESH – Use this setting when you have already obtained valid access and refresh tokens. Reuses stored tokens without prompting the user again.
- OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
- OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
- CallbackURL: A redirect URI you defined during application registration.
After authentication, tokens are saved to OAuthSettingsLocation. These values persist across sessions and are used to automatically refresh the access token when it expires, so you don't need to log in again on future connections.
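With a custom OAuth application, the connection might be built as in the following sketch. The parameter names are assumed to mirror the connection properties; the client Id, callback URL, settings path, and the ADLS_CLIENT_SECRET environment variable are placeholders, not values from your registration.

```powershell
# Assumption: connection properties map to same-named Connect-ADLS parameters.
# All values below are placeholders; use the ones recorded during app registration.
$customAppParams = @{
    Account               = "MyStorageAccount"
    FileSystem            = "MyBlobContainer"
    AuthScheme            = "AzureAD"
    InitiateOAuth         = "GETANDREFRESH"
    OAuthClientId         = "MyClientId"
    OAuthClientSecret     = $env:ADLS_CLIENT_SECRET    # hypothetical environment variable
    CallbackURL           = "http://localhost:33333"   # must match a registered redirect URI
    OAuthSettingsLocation = "C:\Users\Public\adls\OAuthSettings.txt"
}

$conn = Connect-ADLS @customAppParams
```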
Headless Machines
Headless environments like CI/CD pipelines, background services, or server-based integrations do not have an interactive browser. To authenticate using AzureAD, you must complete the OAuth flow on a separate device with a browser and transfer the authentication result to the headless system.
Setup options:
- Obtain and exchange a verifier code
  - Use another device to sign in and retrieve a verifier code, which the headless system uses to request tokens.
- Transfer an OAuth settings file
  - Authenticate on another device, then copy the stored token file to the headless environment.
Using a Verifier Code
- On a device with a browser:
  - If using a custom OAuth app, set the following properties:
    - OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
    - OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
  - Call the GetOAuthAuthorizationURL stored procedure to generate a sign-in URL.
  - Open the returned URL in a browser. Sign in and grant permissions to the driver. You are redirected to the callback URL, which contains the verifier code.
  - After signing in, save the value of the code parameter from the redirect URL. You will use this later to set the OAuthVerifier connection property.
- On the headless machine:
  - Set the following properties:
    - AuthScheme: AzureAD
    - OAuthVerifier: The verifier code you saved.
    - OAuthSettingsLocation: The path of the file that holds the OAuth token values.
    - For custom applications:
      - OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
      - OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
  - After tokens are saved, reuse them by setting:
    - OAuthSettingsLocation: Make sure this location grants read and write permissions to the driver to enable the automatic refreshing of the access token.
    - For custom applications:
      - OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
      - OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
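The step of saving the code parameter from the redirect URL can be sketched in plain PowerShell. Get-OAuthVerifierFromUrl is a hypothetical helper written for this illustration, not part of the cmdlets, and the URL is a made-up example.

```powershell
# Hypothetical helper: extracts the "code" query parameter from a redirect URL.
function Get-OAuthVerifierFromUrl {
    param([Parameter(Mandatory)][string]$RedirectUrl)

    # Take the query string without its leading "?".
    $query = ([System.Uri]$RedirectUrl).Query.TrimStart('?')

    foreach ($pair in ($query -split '&')) {
        if (-not $pair) { continue }
        # Split each pair into a name and a value at the first "=" only.
        $name, $value = $pair -split '=', 2
        if ($name -eq 'code') {
            # URL-decode the value before returning it.
            return [System.Uri]::UnescapeDataString([string]$value)
        }
    }
    throw "No 'code' parameter found in the redirect URL."
}

# Example redirect URL (made up for illustration).
$verifier = Get-OAuthVerifierFromUrl 'http://localhost:33333/?code=EXAMPLE_VERIFIER&state=xyz'
```

The resulting $verifier value is what you would supply as OAuthVerifier on the headless machine, together with AuthScheme and OAuthSettingsLocation.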
Transferring OAuth Settings
- On a device with a browser:
  - Connect using the instructions in the Desktop Applications section.
  - After connecting, tokens are saved to the file path in OAuthSettingsLocation. The default filename is OAuthSettings.txt.
- On the headless machine:
  - Copy the OAuth settings file to the machine.
  - Set the following properties:
    - AuthScheme: AzureAD
    - OAuthSettingsLocation: Make sure this location grants read and write permissions to the driver to enable the automatic refreshing of the access token.
    - For custom applications:
      - OAuthClientId: The client Id that was generated when you registered your custom OAuth application.
      - OAuthClientSecret: The client secret that was generated when you registered your custom OAuth application.
After setup, the driver uses the stored tokens to refresh the access token automatically. No browser or manual login is required.
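On the headless machine, the transferred settings file might be used as in this sketch. It assumes the connection properties are exposed as Connect-ADLS parameters; the settings path is a hypothetical location for the copied file.

```powershell
# Hypothetical location of the OAuth settings file copied from the other device.
$settingsPath = "C:\adls\OAuthSettings.txt"

# Fail early if the file was not copied, rather than at connection time.
if (-not (Test-Path -Path $settingsPath -PathType Leaf)) {
    throw "OAuth settings file not found at $settingsPath"
}

# Assumption: connection properties map to same-named Connect-ADLS parameters.
$headlessParams = @{
    Account               = "MyStorageAccount"
    FileSystem            = "MyBlobContainer"
    AuthScheme            = "AzureAD"
    OAuthSettingsLocation = $settingsPath
}

$conn = Connect-ADLS @headlessParams
```

For a custom OAuth application, OAuthClientId and OAuthClientSecret would be added to the same hashtable.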
Azure Service Principal
Note: Microsoft has rebranded Azure AD as Entra ID. In topics that require the user to interact with the Entra ID Admin site, we use the same names Microsoft does. However, there are still CData connection properties whose names or values reference "Azure AD".
Service principals are security objects within a Microsoft Entra ID (Azure AD) application that define what that application can do within a specific tenant.
Service principals are created in the Entra admin center, also accessible through the Azure portal.
As part of the creation process, you also specify whether the service principal will access Entra resources via a client secret or a certificate.
Depending on the service you are connecting to, a tenant administrator may need to enable Service Principal authentication or assign the Service Principal to the appropriate roles or security groups.
Instead of being tied to a particular user, service principal permissions are based on the roles assigned to them. These roles determine which resources the application can access and which operations it can perform.
When authenticating using a service principal, you must register an application with an Entra tenant, as described in Creating a Service Principal App in Entra ID (Azure AD).
This subsection describes properties you must set before you can connect. These vary, depending on whether you will authenticate via a client secret or a certificate.
Authentication with Client Secret
- AuthScheme: AzureServicePrincipal.
- AzureTenant: The Azure AD tenant to which you will connect.
- OAuthClientId: The client Id in your application settings.
- OAuthClientSecret: The client secret in your application settings.
- InitiateOAuth: GETANDREFRESH. You can use InitiateOAuth to avoid repeating the OAuth exchange and manually setting the OAuthAccessToken.
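A client-secret connection might be assembled as in the sketch below. Parameter names are assumed to mirror the properties listed above; the tenant, client Id, and the ADLS_SP_SECRET environment variable are placeholders.

```powershell
# Assumption: connection properties map to same-named Connect-ADLS parameters.
# Tenant, client Id, and secret are placeholders for your own application settings.
$spSecretParams = @{
    Account           = "MyStorageAccount"
    FileSystem        = "MyBlobContainer"
    AuthScheme        = "AzureServicePrincipal"
    AzureTenant       = "MyTenantId"
    OAuthClientId     = "MyClientId"
    OAuthClientSecret = $env:ADLS_SP_SECRET   # hypothetical environment variable
    InitiateOAuth     = "GETANDREFRESH"
}

$conn = Connect-ADLS @spSecretParams
```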
Authentication with Certificate
- AuthScheme: AzureServicePrincipalCert.
- AzureTenant: The Azure AD tenant to which you will connect.
- OAuthClientId: The client Id in your application settings.
- OAuthJWTCert: The JWT Certificate store.
- OAuthJWTCertType: The JWT Certificate store type.
- InitiateOAuth: GETANDREFRESH. You can use InitiateOAuth to avoid repeating the OAuth exchange and manually setting the OAuthAccessToken.
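The certificate variant might look like the following sketch. The parameter names are assumed to mirror the properties above. The certificate path and the PFXFILE store type are illustrative assumptions; check the property reference for the store types your version supports and for any additional property needed by a password-protected store.

```powershell
# Assumption: connection properties map to same-named Connect-ADLS parameters.
# The certificate path and store type are illustrative placeholders.
$spCertParams = @{
    Account          = "MyStorageAccount"
    FileSystem       = "MyBlobContainer"
    AuthScheme       = "AzureServicePrincipalCert"
    AzureTenant      = "MyTenantId"
    OAuthClientId    = "MyClientId"
    OAuthJWTCert     = "C:\certs\MyServicePrincipal.pfx"   # hypothetical certificate store
    OAuthJWTCertType = "PFXFILE"                           # assumed store type value
    InitiateOAuth    = "GETANDREFRESH"
}

$conn = Connect-ADLS @spCertParams
```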
Managed Service Identity (MSI)
If you are running Azure Data Lake Storage on an Azure VM and want to automatically obtain Managed Service Identity (MSI) credentials to connect, set AuthScheme to AzureMSI.
User-Managed Identities
To obtain a token for a managed identity, use the OAuthClientId property to specify the managed identity's client_id. If your VM has multiple user-assigned managed identities, you must also specify OAuthClientId.
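On an Azure VM, an MSI connection might be set up as in this sketch. It assumes AuthScheme and OAuthClientId are accepted as Connect-ADLS parameters; the client Id value is a placeholder.

```powershell
# Assumption: connection properties map to same-named Connect-ADLS parameters.
$msiParams = @{
    Account    = "MyStorageAccount"
    FileSystem = "MyBlobContainer"
    AuthScheme = "AzureMSI"
}

# Only add OAuthClientId when selecting a specific user-assigned identity.
# The value is a placeholder for that identity's client_id.
$userAssignedClientId = "MyManagedIdentityClientId"
if ($userAssignedClientId) {
    $msiParams["OAuthClientId"] = $userAssignedClientId
}

$conn = Connect-ADLS @msiParams
```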
Creating a Connection Object
You can then use the Connect-ADLS cmdlet to create a connection object that can be passed to other cmdlets:
$conn = Connect-ADLS -Account "MyStorageAccount" -FileSystem "MyBlobContainer" -AccessKey "MyAccessKey"
Retrieving Data
The Select-ADLS cmdlet provides a native PowerShell interface for retrieving data:
$results = Select-ADLS -Connection $conn -Table "Resources" -Columns @("FullPath", "Permission") -Where "Type='FILE'"
The Invoke-ADLS cmdlet provides an SQL interface. This cmdlet can be used to execute an SQL query via the Query parameter.
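A query through Invoke-ADLS might look like the sketch below. It assumes that Invoke-ADLS accepts the same -Connection parameter as Select-ADLS and that $conn was created with Connect-ADLS; the table and column names are taken from the Select-ADLS example.

```powershell
# Assumes $conn was created with Connect-ADLS.
# Assumption: Invoke-ADLS takes -Connection in the same way Select-ADLS does.
$query = "SELECT FullPath, Permission FROM Resources WHERE Type = 'FILE'"
$results = Invoke-ADLS -Connection $conn -Query $query

# Show the first few rows with only the selected columns.
$results | Select-Object -First 5 -Property FullPath, Permission
```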
Piping Cmdlet Output
The cmdlets return row objects to the pipeline one row at a time. The following line exports results to a CSV file:
Select-ADLS -Connection $conn -Table Resources -Where "Type = 'FILE'" | Select -Property * -ExcludeProperty Connection,Table,Columns | Export-Csv -Path c:\myResourcesData.csv -NoTypeInformation
You will notice that we piped the results from Select-ADLS into a Select-Object cmdlet and excluded some properties before piping them into an Export-CSV cmdlet. We do this because the CData Cmdlets append Connection, Table, and Columns information onto each row object in the result set, and we do not necessarily want that information in our CSV file.
However, this makes it easy to pipe the output of one cmdlet to another. The following is an example of converting a result set to JSON:
PS C:\> $conn = Connect-ADLS -Account "MyStorageAccount" -FileSystem "MyBlobContainer" -AccessKey "MyAccessKey"
PS C:\> $row = Select-ADLS -Connection $conn -Table "Resources" -Columns @("FullPath", "Permission") -Where "Type = 'FILE'" | select -first 1
PS C:\> $row | ConvertTo-Json
{
    "Connection": {
    },
    "Table": "Resources",
    "Columns": [
    ],
    "FullPath": "MyFullPath",
    "Permission": "MyPermission"
}
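If the Connection, Table, and Columns metadata is not wanted in the JSON, the same Select-Object exclusion used for the CSV export can be applied first. The sketch below uses a hand-built stand-in object in place of a real Select-ADLS row, so it runs without a connection; the property values are placeholders.

```powershell
# Stand-in for a row returned by Select-ADLS (values are placeholders).
$row = [pscustomobject]@{
    Connection = @{}
    Table      = "Resources"
    Columns    = @()
    FullPath   = "MyFullPath"
    Permission = "MyPermission"
}

# Drop the metadata properties, then serialize what remains.
$clean = $row | Select-Object -Property * -ExcludeProperty Connection, Table, Columns
$json  = $clean | ConvertTo-Json
$json
```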