Batch Processing
The CData ADO.NET Provider for Spark SQL enables you to take advantage of the bulk load support in Spark SQL through SparkSQLDataAdapter objects. You can use the Batch API to execute related SQL data manipulation statements together. The provider translates all SQL statements in the batch into a single request.
Using the ADO.NET Batch API
Performing a batch update consists of the following basic steps:
- Define custom parameterized SQL statements in SparkSQLCommand objects.
- Set the UpdatedRowSource property of each SparkSQLCommand object to UpdateRowSource.None (the enumeration value, not a string).
- Assign the SparkSQLCommand objects to the SparkSQLDataAdapter.
- Add the parameters to each command, binding each parameter to the DataTable column that supplies its value.
- Call the SparkSQLDataAdapter's Update method. Pass in a DataSet or DataTable containing your changes.
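The steps above can be sketched in C# as follows. This is a minimal illustration rather than a definitive implementation. It makes several assumptions that are not stated in this section: the namespace System.Data.CData.SparkSQL, the SparkSQLConnection class, the SparkSQLCommand (text, connection) constructor, and that parameters expose the standard ADO.NET ParameterName and SourceColumn properties. The Customers table, its Name column, and the connection string are hypothetical.

```csharp
using System.Data;
using System.Data.CData.SparkSQL; // assumed provider namespace

public static class BatchInsertSketch
{
    // Inserts two hypothetical rows into a hypothetical Customers table.
    // Returns the row count reported by the adapter's Update method.
    public static int InsertCustomers(string connectionString)
    {
        using (var connection = new SparkSQLConnection(connectionString)) // assumed class
        {
            connection.Open();

            // Define a parameterized statement in a SparkSQLCommand.
            // The (text, connection) constructor is assumed to follow the usual ADO.NET pattern.
            var insert = new SparkSQLCommand(
                "INSERT INTO Customers (Name) VALUES (@Name)", connection);

            // Do not map command results back onto the DataRow.
            insert.UpdatedRowSource = UpdateRowSource.None;

            // Assign the command to the adapter.
            var adapter = new SparkSQLDataAdapter();
            adapter.InsertCommand = insert;

            // Add the parameter and bind it to the "Name" column of the DataTable.
            var name = insert.CreateParameter();
            name.ParameterName = "@Name";
            name.SourceColumn = "Name";
            insert.Parameters.Add(name);

            // Build the changes: newly added rows have RowState.Added,
            // so Update runs the InsertCommand for each of them.
            var changes = new DataTable("Customers");
            changes.Columns.Add("Name", typeof(string));
            changes.Rows.Add("Jon Doe");
            changes.Rows.Add("Jane Doe");

            // Let both rows travel in the same batch (see Controlling Batch Size).
            adapter.UpdateBatchSize = 2;

            return adapter.Update(changes);
        }
    }
}
```

The same pattern should apply to UpdateCommand and DeleteCommand: each command gets its own parameters and its own UpdatedRowSource setting before Update is called.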
Controlling Batch Size
Depending on factors such as the size of the request, your network resources, and the performance of the server, you may gain performance by executing several smaller batch requests. You can control the size of each batch by setting the SparkSQLDataAdapter's UpdateBatchSize property to a positive integer.
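As a rough sketch of how the batch size might be applied, the helper below sets UpdateBatchSize and then submits the changes. It is written against the standard System.Data.Common.DbDataAdapter base class and assumes that SparkSQLDataAdapter derives from it. The class name, method name, and the value 100 in the usage note are hypothetical.

```csharp
using System;
using System.Data;
using System.Data.Common;

public static class BatchSizeSketch
{
    // Submits the pending changes in `changes` with the adapter's batch size
    // set to `batchSize`. Returns the row count reported by Update.
    public static int UpdateInBatches(DbDataAdapter adapter, DataTable changes, int batchSize)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(batchSize), "Batch size must be a positive integer.");
        }

        // Limit how many statements are grouped into each batch request.
        adapter.UpdateBatchSize = batchSize;

        return adapter.Update(changes);
    }
}
```

With an adapter whose commands are already configured, a call such as BatchSizeSketch.UpdateInBatches(adapter, changes, 100) would submit the changes in groups of up to 100 rows. The best value depends on your environment, so it is worth measuring a few sizes rather than assuming that larger is faster.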