ADO.NET Provider for Google BigQuery

Build 23.0.8839

Batch Processing

The CData ADO.NET Provider for Google BigQuery enables you to take advantage of the bulk load support in Google BigQuery through GoogleBigQueryDataAdapters. You can use the Batch API to execute related SQL data manipulation statements simultaneously. The provider translates all SQL queries in the batch into a single request.

Using the ADO.NET Batch API

Performing a batch update consists of the following basic steps:

  1. Define custom parameterized SQL statements in GoogleBigQueryCommand objects.
  2. Set the UpdatedRowSource property of the GoogleBigQueryCommand object to "UpdateRowSource.None".
  3. Assign the GoogleBigQueryCommand objects to the GoogleBigQueryDataAdapter.
  4. Add the parameters to the command.
  5. Call the GoogleBigQueryDataAdapter's Update method. Pass in a DataSet or DataTable containing your changes.
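
The Update call in step 5 chooses a command for each row from the row's RowState: "Added" rows go through the InsertCommand, "Modified" rows through the UpdateCommand, and "Deleted" rows through the DeleteCommand, while "Unchanged" rows are skipped. The following is a minimal sketch, using only System.Data and no connection, of how a DataTable ends up with rows in each state. The table layout and values are illustrative assumptions, not part of the provider.

```csharp
using System;
using System.Data;
using System.Linq;

// Build a small in-memory table; no provider or connection is needed here.
DataTable table = new DataTable("github_nested");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("repository.name", typeof(string));

// Rows that have been added and accepted are in the Unchanged state.
table.Rows.Add(1, "EntityFramework");
table.Rows.Add(2, "CoreCLR");
table.Rows.Add(3, "Roslyn");
table.AcceptChanges();

table.Rows[0]["repository.name"] = "EFCore"; // Unchanged -> Modified
table.Rows[1].Delete();                      // Unchanged -> Deleted
table.Rows.Add(4, "MSBuild");                // new row   -> Added

// Count the rows in each state to preview what an Update call would process.
var counts = table.Rows.Cast<DataRow>()
    .GroupBy(r => r.RowState)
    .ToDictionary(g => g.Key, g => g.Count());

foreach (var pair in counts)
    Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
```

Checking the counts before calling Update is a quick way to confirm that the adapter has a command assigned for every state present in the table.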

Controlling Batch Size

Depending on factors such as the size of the request, your network resources, and the performance of the server, you may gain performance by executing several smaller batch requests. You can control the size of each batch by setting the GoogleBigQueryDataAdapter's UpdateBatchSize property to a positive integer.
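
As a rough illustration of the trade-off, the sketch below estimates how many requests a given UpdateBatchSize produces. It assumes each batch of up to UpdateBatchSize pending rows becomes one request, as described above. BatchCount is a hypothetical helper written for this example, not part of the provider.

```csharp
using System;

// Hypothetical helper: the number of batches needed to send rowCount pending
// rows when each batch holds at most batchSize rows (ceiling division).
static int BatchCount(int rowCount, int batchSize)
{
    if (rowCount < 0) throw new ArgumentOutOfRangeException(nameof(rowCount));
    if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
    return (rowCount + batchSize - 1) / batchSize;
}

Console.WriteLine(BatchCount(5, 2)); // 3 (batches of 2, 2, and 1 rows)
Console.WriteLine(BatchCount(4, 2)); // 2
```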

Bulk Insert

The following code prepares a single batch that inserts records in bulk, then reads the new records' Ids from the LastResultInfo#TEMP table. The batch INSERT covers the new DataRows, which have the "Added" state.

C#

GoogleBigQueryDataAdapter adapter = new GoogleBigQueryDataAdapter();

using (GoogleBigQueryConnection conn = new GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")) {
  conn.Open();
  adapter.InsertCommand = conn.CreateCommand();
  adapter.InsertCommand.CommandText = "INSERT INTO [publicdata].[samples].github_nested (repository.name) VALUES (@repository.name)";
  adapter.InsertCommand.UpdatedRowSource = UpdateRowSource.None;
  adapter.InsertCommand.Parameters.Add("@repository.name", "repository.name");

  DataTable batchDataTable = new DataTable();
  batchDataTable.Columns.Add("repository.name", typeof(string));
  batchDataTable.Rows.Add("EntityFramework");
  batchDataTable.Rows.Add("CoreCLR");
  adapter.UpdateBatchSize = 2;
  adapter.Update(batchDataTable);

  GoogleBigQueryCommand cmd = new GoogleBigQueryCommand("SELECT * FROM LastResultInfo#TEMP", conn);
  adapter = new GoogleBigQueryDataAdapter(cmd);
  DataTable res = new DataTable();
  adapter.Fill(res);
  foreach (DataRow row in res.Rows)
    foreach (DataColumn col in res.Columns)
      Console.WriteLine("{0}: {1}", col.ColumnName, row[col]);
}

VB.NET

Dim adapter As New GoogleBigQueryDataAdapter()

Using conn As New GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")
  conn.Open()
  adapter.InsertCommand = conn.CreateCommand()
  adapter.InsertCommand.CommandText = "INSERT INTO [publicdata].[samples].github_nested (repository.name) VALUES (@repository.name)"
  adapter.InsertCommand.UpdatedRowSource = UpdateRowSource.None
  adapter.InsertCommand.Parameters.Add("@repository.name", "repository.name")

  Dim batchDataTable As New DataTable()
  batchDataTable.Columns.Add("repository.name", GetType(String))
  batchDataTable.Rows.Add("CoreCLR")
  batchDataTable.Rows.Add("EntityFramework")
  adapter.UpdateBatchSize = 2
  adapter.Update(batchDataTable)

  Dim cmd As New GoogleBigQueryCommand("SELECT * FROM LastResultInfo#TEMP", conn)
  adapter = New GoogleBigQueryDataAdapter(cmd)
  Dim res As New DataTable()
  adapter.Fill(res)
  For Each row As DataRow In res.Rows 
    For Each col As DataColumn In res.Columns
      Console.WriteLine("{0}: {1}", col.ColumnName, row(col))
    Next
  Next
End Using

Bulk Update

A batch update additionally requires the primary key of each row to update. The following example executes a batch for all DataRow records with a "Modified" state:

C#

GoogleBigQueryDataAdapter adapter = new GoogleBigQueryDataAdapter();

using (GoogleBigQueryConnection conn = new GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")) { 
  conn.Open();
  adapter.UpdateCommand = conn.CreateCommand();
  adapter.UpdateCommand.CommandText = "UPDATE [publicdata].[samples].github_nested SET repository.name=@repository.name WHERE Id=@Id";
  adapter.UpdateCommand.Parameters.Add("@repository.name", "repository.name");
  adapter.UpdateCommand.Parameters.Add("@Id", "Id");
  adapter.UpdateCommand.UpdatedRowSource = UpdateRowSource.None;
  adapter.UpdateBatchSize = 2;
  // dataTable is an existing DataTable whose changed rows are in the "Modified" state.
  adapter.Update(dataTable);
}

VB.NET

Dim adapter As New GoogleBigQueryDataAdapter()

Using conn As New GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")
  conn.Open()
  adapter.UpdateCommand = conn.CreateCommand()
  adapter.UpdateCommand.CommandText = "UPDATE [publicdata].[samples].github_nested SET repository.name=@repository.name WHERE Id=@Id"
  adapter.UpdateCommand.Parameters.Add("@repository.name", "repository.name")
  adapter.UpdateCommand.Parameters.Add("@Id", "Id")
  adapter.UpdateCommand.UpdatedRowSource = UpdateRowSource.None
  adapter.UpdateBatchSize = 2
  ' dataTable is an existing DataTable whose changed rows are in the "Modified" state.
  adapter.Update(dataTable)
End Using

Bulk Delete

The following example prepares a single batch that deletes records in bulk. The primary key of each row is required. The batch covers all DataRow records with a "Deleted" state:

C#

GoogleBigQueryDataAdapter adapter = new GoogleBigQueryDataAdapter();

using (GoogleBigQueryConnection conn = new GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")) {
  conn.Open();
  adapter.DeleteCommand = conn.CreateCommand();
  adapter.DeleteCommand.CommandText = "DELETE FROM [publicdata].[samples].github_nested WHERE Id=@Id";
  adapter.DeleteCommand.Parameters.Add("@Id", "Id");
  adapter.DeleteCommand.UpdatedRowSource = UpdateRowSource.None;
  adapter.UpdateBatchSize = 2;
  // table is an existing DataTable containing rows in the "Deleted" state.
  adapter.Update(table);
}

VB.NET

Dim adapter As New GoogleBigQueryDataAdapter()

Using conn As New GoogleBigQueryConnection("InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;")
  conn.Open()
  adapter.DeleteCommand = conn.CreateCommand()
  adapter.DeleteCommand.CommandText = "DELETE FROM [publicdata].[samples].github_nested WHERE Id=@Id"
  adapter.DeleteCommand.Parameters.Add("@Id", "Id")
  adapter.DeleteCommand.UpdatedRowSource = UpdateRowSource.None
  adapter.UpdateBatchSize = 2
  ' table is an existing DataTable containing rows in the "Deleted" state.
  adapter.Update(table)
End Using
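
The delete examples assume a DataTable that already holds rows in the "Deleted" state. The following is a minimal sketch, using only System.Data, of how rows reach that state and how the key of a deleted row can still be read. The table layout and values are illustrative assumptions.

```csharp
using System;
using System.Data;

DataTable table = new DataTable("github_nested");
table.Columns.Add("Id", typeof(int));
table.Columns.Add("repository.name", typeof(string));
table.Rows.Add(1, "EntityFramework");
table.Rows.Add(2, "CoreCLR");
table.AcceptChanges(); // both rows are now Unchanged

// Deleting an accepted row only marks it; it stays in table.Rows
// so that an adapter can later issue the DELETE for it.
table.Rows[0].Delete();

// Deleting a row that was never accepted removes it outright,
// so there is nothing for a DeleteCommand to process.
DataRow scratch = table.Rows.Add(3, "Roslyn");
scratch.Delete();

// Current values of a deleted row are inaccessible; the key is read
// from the Original version instead.
int deletedId = (int)table.Rows[0]["Id", DataRowVersion.Original];

Console.WriteLine("State of row 0: {0}", table.Rows[0].RowState);
Console.WriteLine("Original Id of row 0: {0}", deletedId);
Console.WriteLine("State of scratch row: {0}", scratch.RowState);
```

Calling AcceptChanges on the table before Update would discard the "Deleted" rows, so it should be called only after the batch has been sent.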

Copyright (c) 2024 CData Software, Inc. - All rights reserved.