CData Cmdlets for Google BigQuery 2019 - Online Help

Establishing a Connection

CData Cmdlets for Google BigQuery 2019 - Build 19.0.7354

With the CData Cmdlets, users can install a data module, set the connection properties, and start scripting. This section provides examples of using the GoogleBigQuery Cmdlets with native PowerShell cmdlets, such as the CSV import and export cmdlets.

Installing and Connecting

If you have PSGet, you can install the cmdlets from the PowerShell Gallery with the following command. You can also obtain a setup program from the CData site.

Install-Module GoogleBigQueryCmdlets

The following line is then added to your profile, loading the cmdlets on the next session:

Import-Module GoogleBigQueryCmdlets;

You can then use the Connect-GoogleBigQuery cmdlet to create a connection object that can be passed to other cmdlets:

$conn = Connect-GoogleBigQuery -ProjectId 'NameOfProject' -DatasetId 'NameOfDataset'

Authenticate via OAuth Authentication

Use the OAuth authentication standard to connect to Google BigQuery. You can authenticate with a user account or with a service account. A service account is required to grant organization-wide access scopes to the cmdlet. The cmdlet facilitates these authentication flows as described below.

Authenticate with a User Account

You can connect without setting any connection properties for your user credentials. After setting the following, you are ready to connect:

  • DatasetId: Set this to the Id of the dataset you want to connect to.
  • ProjectId: Set this to the Id of the project you want to connect to.

When you connect, the cmdlet opens the OAuth endpoint in your default browser. Log in and grant permissions to the application. The cmdlet then completes the OAuth process.

See Using OAuth Authentication for other OAuth authentication flows.

Authenticate with a Service Account

Service accounts authenticate silently, with no user interaction in the browser. You can also use a service account to delegate enterprise-wide access scopes to the cmdlet.

You need to create an OAuth application in this flow. See Creating a Custom OAuth App in the Getting Started section to create and authorize an app. You can then connect to Google BigQuery data that the service account has permission to access.

After setting the following connection properties, you are ready to connect:

  • OAuthJWTCertType: Set this to "PFXFILE".
  • OAuthJWTCert: Set this to the path to the .p12 file you generated.
  • OAuthJWTCertPassword: Set this to the password of the .p12 file.
  • OAuthJWTCertSubject: Set this to "*" to pick the first certificate in the certificate store.
  • OAuthJWTSubject: Set this to the email address of the user for whom the application is requesting delegate access. Note that delegate access must be granted by an administrator.
  • OAuthJWTIssuer: In the service accounts section, click Manage Service Accounts and set this field to the email address displayed in the service account Id field.
  • DatasetId: Set this to the Id of the dataset you want to connect to.
  • ProjectId: Set this to the Id of the project you want to connect to.

When you connect, the cmdlet completes the OAuth flow for a service account.
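
The service account properties listed above could be passed to Connect-GoogleBigQuery roughly as follows. This is a minimal sketch, not a confirmed signature: it assumes that each connection property is accepted as a parameter of the same name, and the file path, password, and issuer email are hypothetical placeholders. OAuthJWTSubject is left out because it applies only when requesting delegated access.

```powershell
# Hypothetical values throughout; replace them with your own.
# Assumption: each connection property is accepted as a parameter of the same name.
$serviceAccount = @{
    ProjectId            = 'NameOfProject'
    DatasetId            = 'NameOfDataset'
    OAuthJWTCertType     = 'PFXFILE'
    OAuthJWTCert         = 'C:\keys\service-account.p12'
    OAuthJWTCertPassword = 'p12-password'
    OAuthJWTCertSubject  = '*'
    OAuthJWTIssuer       = 'my-service-account@nameofproject.iam.gserviceaccount.com'
}

# Splatting passes each hashtable key as a named parameter.
$conn = Connect-GoogleBigQuery @serviceAccount
```

Keeping the properties in a hashtable makes it easier to load them from a secured location instead of hard-coding the password in a script.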

Retrieving Data

The Select-GoogleBigQuery cmdlet provides a native PowerShell interface for retrieving data:

$results = Select-GoogleBigQuery -Connection $conn -Table "publicdata.samples.github_nested" -Columns @("actor.attributes.email", "repository.name") -Where "repository.name='EntityFramework'"

The Invoke-GoogleBigQuery cmdlet provides a SQL interface. Use its Query parameter to execute a SQL query.
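
A call through the Query parameter might look like the sketch below. It assumes that $conn was created with Connect-GoogleBigQuery as shown earlier and that Invoke-GoogleBigQuery accepts a -Connection parameter in the same way as Select-GoogleBigQuery. The table name is written as in the Select-GoogleBigQuery example; the quoting that the SQL dialect expects may differ.

```powershell
# Assumes $conn was created with Connect-GoogleBigQuery.
# Assumption: -Connection works here as it does for Select-GoogleBigQuery.
$query = "SELECT * FROM publicdata.samples.github_nested WHERE repository.name = 'EntityFramework'"

# Run the query and collect the returned row objects.
$results = Invoke-GoogleBigQuery -Connection $conn -Query $query
```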

Piping Cmdlet Output

The cmdlets return row objects to the pipeline one row at a time. The following line exports results to a CSV file:

Select-GoogleBigQuery -Connection $conn -Table publicdata.samples.github_nested -Where "repository.name = 'EntityFramework'" | Select -Property * -ExcludeProperty Connection,Table,Columns | Export-Csv -Path c:\mypublicdata.samples.github_nestedData.csv -NoTypeInformation

You will notice that we piped the results from Select-GoogleBigQuery into a Select-Object cmdlet and excluded some properties before piping them into an Export-CSV cmdlet. We do this because the CData Cmdlets append Connection, Table, and Columns information onto each row object in the result set, and we do not necessarily want that information in our CSV file.

However, this extra information makes it easy to pipe the output of one cmdlet to another. The following example converts a result set to JSON:

 
PS C:\> $conn = Connect-GoogleBigQuery -ProjectId 'NameOfProject' -DatasetId 'NameOfDataset'
PS C:\> $row = Select-GoogleBigQuery -Connection $conn -Table "publicdata.samples.github_nested" -Columns @("actor.attributes.email", "repository.name") -Where "repository.name = 'EntityFramework'" | select -first 1
PS C:\> $row | ConvertTo-Json
{
  "Connection":  {

  },
  "Table":  "publicdata.samples.github_nested",
  "Columns":  [

  ],
  "actor.attributes.email":  "Myactor.attributes.email",
  "repository.name":  "Myrepository.name"
} 
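
The Connection, Table, and Columns properties in the JSON output are the same ones excluded before the CSV export. The sketch below reproduces that filtering step without a live connection. The row is a stand-in built with [pscustomobject], and its values are hypothetical.

```powershell
# Stand-in for a row returned by Select-GoogleBigQuery (hypothetical values).
$row = [pscustomobject]@{
    Connection               = 'connection-placeholder'
    Table                    = 'publicdata.samples.github_nested'
    Columns                  = @()
    'actor.attributes.email' = 'someone@example.com'
    'repository.name'        = 'EntityFramework'
}

# Drop the metadata properties and keep only the data columns.
$clean = $row | Select-Object -Property * -ExcludeProperty Connection, Table, Columns

# Collect the remaining property names; these would become the CSV headers.
$names = $clean.PSObject.Properties | ForEach-Object { $_.Name }
$names
```

Only the two data columns remain, so a later Export-Csv or ConvertTo-Json call sees just the query results.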

Deleting Data

The following line deletes any records that match the criteria:

Select-GoogleBigQuery -Connection $conn -Table publicdata.samples.github_nested -Where "repository.name = 'EntityFramework'" | Remove-GoogleBigQuery

Updating Data

The cmdlets make data transformation and data cleansing easy. The following example loads data from a CSV file into Google BigQuery, checking first whether a record already exists and needs to be updated instead of inserted.

Import-Csv -Path C:\Mypublicdata.samples.github_nestedUpdates.csv | ForEach-Object {
  # $(...) is needed to expand a property inside a double-quoted string
  $where = "Id = '$($_.Id)'"
  # Quote property names that contain dots so each is read as a single CSV column name
  $values = @($_.'actor.attributes.email', $_.'repository.name')
  $record = Select-GoogleBigQuery -Connection $conn -Table publicdata.samples.github_nested -Where $where
  if ($record) {
    Update-GoogleBigQuery -Connection $conn -Table publicdata.samples.github_nested -Columns @("actor.attributes.email","repository.name") -Values $values -Where $where
  } else {
    Add-GoogleBigQuery -Connection $conn -Table publicdata.samples.github_nested -Columns @("actor.attributes.email","repository.name") -Values $values
  }
}
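
When building a Where filter from a pipeline record, keep in mind that PowerShell expands only the variable name inside a double-quoted string. A property access needs the $(...) subexpression operator. The sketch below shows the difference with a stand-in record; the values are hypothetical and no connection is needed.

```powershell
# Stand-in for one record from Import-Csv (hypothetical values).
$record = [pscustomobject]@{ Id = '42'; 'repository.name' = 'EntityFramework' }

# Only $record is expanded here; ".Id" is left as literal text.
$naive = "Id = '$record.Id'"

# The subexpression operator evaluates the property access first.
$filter = "Id = '$($record.Id)'"

# A property name containing dots must be quoted to be read as one name.
$name = $record.'repository.name'

$filter   # Id = '42'
```

The same quoting rule applies to $_ inside a ForEach-Object block.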

 
 
Copyright (c) 2020 CData Software, Inc. - All rights reserved.