Cmdlets for Apache HBase

Build 21.0.7930

Establishing a Connection

With the CData Cmdlets, users can install a data module, set the connection properties, and start scripting. This section provides examples of using the Apache HBase Cmdlets with native PowerShell cmdlets, such as the CSV import and export cmdlets.

Installing and Connecting

If you have PSGet, you can install the cmdlets from the PowerShell Gallery with the following command. You can also obtain a setup from the CData site.

Install-Module ApacheHBaseCmdlets

The following line is then added to your profile, loading the cmdlets on the next session:

Import-Module ApacheHBaseCmdlets;

You can then use the Connect-ApacheHBase cmdlet to create a connection object that can be passed to other cmdlets:

$conn = Connect-ApacheHBase -Server '127.0.0.1' -Port 8080

Connecting to Apache HBase

The CData Cmdlets PowerShell Module for Apache HBase connects to Apache HBase via the HBase REST (Stargate) server. Set the Port and Server properties to connect to Apache HBase.

The Server property will typically be the host name or IP address of the server hosting Apache HBase. If there are multiple nodes, you will use the host name or IP address of the machine running the REST (Stargate) server.

Starting the Server

Different Hadoop distributions contain different interfaces and means of starting and stopping the HBase REST server, along with different default port settings.

In most distributions, the HBase REST server can be started in the foreground by running the following command: "hbase rest start -p <port>". Please consult your Hadoop distribution's documentation for further information regarding the HBase REST server.
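
Before connecting, it can help to confirm that the REST server is reachable. The following is a minimal sketch, assuming the server listens on 127.0.0.1:8080 (the same values used in the connection example above) and exposes the /version/cluster endpoint of the HBase REST API; adjust the host and port for your distribution.

```powershell
# Assumed host and port; replace with the values for your REST (Stargate) server.
$server = '127.0.0.1'
$port   = 8080

try {
    # /version/cluster reports the HBase version when the REST server is up.
    $response = Invoke-WebRequest -Uri "http://${server}:${port}/version/cluster" -Headers @{ Accept = 'text/xml' } -UseBasicParsing
    Write-Host "REST server responded with status $($response.StatusCode): $($response.Content)"
}
catch {
    # Any connection or HTTP error ends up here.
    Write-Host "REST server not reachable on ${server}:${port}: $($_.Exception.Message)"
}
```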

Authenticating to Apache HBase

The CData Cmdlets PowerShell Module for Apache HBase supports authentication over Basic and Negotiate.

No Authentication

By default, no authentication (or anonymous auth) is used. Set AuthScheme to None to explicitly enforce no authentication.

Authenticating with Basic

Basic authentication may be used by setting AuthScheme to Basic. In addition, set the following:

  • User: The Apache HBase user.
  • Password: The Apache HBase password.
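
As a sketch, a Basic connection might look like the following. It assumes that the AuthScheme, User, and Password connection properties are exposed as Connect-ApacheHBase parameters of the same names, in the same way as Server and Port; the credential values are placeholders.

```powershell
# Placeholder credentials; parameter names are assumed to mirror the connection properties.
$conn = Connect-ApacheHBase -Server '127.0.0.1' -Port 8080 -AuthScheme 'Basic' -User 'myuser' -Password 'mypassword'
```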

Authenticating with Kerberos

To authenticate with Kerberos, set AuthScheme to NEGOTIATE and set the User and Password. Please see Using Kerberos for details on how to authenticate with Kerberos.

Retrieving Data

The Select-ApacheHBase cmdlet provides a native PowerShell interface for retrieving data:

$results = Select-ApacheHBase -Connection $conn -Table "Account" -Columns @("Id", "Name") -Where "Industry='Floppy Disks'"

The Invoke-ApacheHBase cmdlet provides an SQL interface: pass an SQL query through the Query parameter to execute it.
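
For instance, the Select-ApacheHBase call above could be expressed as a query along these lines. This is a sketch that assumes $conn is the connection object created earlier and that Invoke-ApacheHBase accepts the same -Connection parameter as the other cmdlets.

```powershell
# Assumes $conn was created with Connect-ApacheHBase.
$query = "SELECT Id, Name FROM Account WHERE Industry = 'Floppy Disks'"
$results = Invoke-ApacheHBase -Connection $conn -Query $query

# Show the returned rows as a table.
$results | Format-Table Id, Name
```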

Piping Cmdlet Output

The cmdlets return row objects to the pipeline one row at a time. The following line exports results to a CSV file:

Select-ApacheHBase -Connection $conn -Table Account -Where "Industry = 'Floppy Disks'" | Select -Property * -ExcludeProperty Connection,Table,Columns | Export-Csv -Path c:\myAccountData.csv -NoTypeInformation

You will notice that we piped the results from Select-ApacheHBase into a Select-Object cmdlet and excluded some properties before piping them into an Export-CSV cmdlet. We do this because the CData Cmdlets append Connection, Table, and Columns information onto each row object in the result set, and we do not necessarily want that information in our CSV file.

However, this makes it easy to pipe the output of one cmdlet to another. The following is an example of converting a result set to JSON:

 
PS C:\> $conn = Connect-ApacheHBase -Server '127.0.0.1' -Port 8080
PS C:\> $row = Select-ApacheHBase -Connection $conn -Table "Account" -Columns @("Id", "Name") -Where "Industry = 'Floppy Disks'" | select -first 1
PS C:\> $row | ConvertTo-Json
{
  "Connection":  {

  },
  "Table":  "Account",
  "Columns":  [

  ],
  "Id":  "MyId",
  "Name":  "MyName"
} 
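
The property exclusion described above can be tried without a live connection. The sketch below builds a hypothetical row object shaped like the cmdlet output (the property values are made up for illustration) and strips the metadata properties before converting to CSV.

```powershell
# Hypothetical stand-in for a row returned by Select-ApacheHBase.
$row = [PSCustomObject]@{
    Connection = 'connection-object'
    Table      = 'Account'
    Columns    = @('Id', 'Name')
    Id         = 'MyId'
    Name       = 'MyName'
}

# Drop the metadata properties, keeping only the data columns.
$clean = $row | Select-Object -Property * -ExcludeProperty Connection, Table, Columns

# ConvertTo-Csv shows the text that Export-Csv would write to the file.
$csv = $clean | ConvertTo-Csv -NoTypeInformation
$csv
```

Only the Id and Name columns remain in the CSV text, which is the same shape Export-Csv writes to disk.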

Deleting Data

The following line deletes any records that match the criteria:

Select-ApacheHBase -Connection $conn -Table Account -Where "Industry = 'Floppy Disks'" | Remove-ApacheHBase
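
Because the deletion is driven by the pipeline, the same Select-ApacheHBase call can be run on its own first to preview what would be removed. A minimal sketch, assuming $conn is an open connection; the confirmation prompt is an illustrative addition, not part of the cmdlets.

```powershell
# Collect the matching rows first so they can be inspected before deletion.
$toDelete = @(Select-ApacheHBase -Connection $conn -Table Account -Where "Industry = 'Floppy Disks'")
Write-Host "Rows matching the criteria: $($toDelete.Count)"

# Delete only after confirming interactively.
if ($toDelete.Count -gt 0 -and (Read-Host 'Delete these rows? (y/n)') -eq 'y') {
    $toDelete | Remove-ApacheHBase
}
```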

Updating Data

The cmdlets make data transformation and data cleansing easy. The following example loads data from a CSV file into Apache HBase, first checking whether each record already exists and should be updated rather than inserted.

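Import-Csv -Path C:\MyAccountUpdates.csv | %{
  # Look up the record by Id to decide between an update and an insert.
  $record = Select-ApacheHBase -Connection $conn -Table Account -Where "Id = '$($_.Id)'"
  if($record){
    # The record exists: update it. $($_.Id) is needed so that the Id property, not just $_, is expanded in the string.
    Update-ApacheHBase -Connection $conn -Table Account -Columns @("Id","Name") -Values @($_.Id, $_.Name) -Where "Id = '$($_.Id)'"
  }else{
    # No match: insert a new record.
    Add-ApacheHBase -Connection $conn -Table Account -Columns @("Id","Name") -Values @($_.Id, $_.Name)
  }
}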

Copyright (c) 2021 CData Software, Inc. - All rights reserved.