SSIS Components for Amazon DynamoDB

Build 24.0.9060

Performance

Setting a Retry Interval

You can set the following properties to retry queries instead of returning a temporary error such as "maximum throughput exceeded":

  • RetryWaitTime: The minimum number of milliseconds the component will wait to retry a request.
  • MaximumRequestRetries: The maximum number of times to retry a request.
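The way these two properties interact can be pictured with a minimal sketch. It is not the component's actual implementation: the `ThrottledError` class, the `call_with_retries` helper, the default values, and the doubling of the wait between attempts are all assumptions made for illustration. The source only states that RetryWaitTime is the minimum wait and MaximumRequestRetries is the retry cap.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a temporary error such as
    "maximum throughput exceeded"."""


def call_with_retries(request, retry_wait_time_ms=2000,
                      maximum_request_retries=4, sleep=time.sleep):
    """Call request() and retry on ThrottledError.

    retry_wait_time_ms mirrors RetryWaitTime (the minimum wait) and
    maximum_request_retries mirrors MaximumRequestRetries. The default
    values here are illustrative, not the component's defaults.
    """
    wait_ms = retry_wait_time_ms
    for attempt in range(maximum_request_retries + 1):
        try:
            return request()
        except ThrottledError:
            if attempt == maximum_request_retries:
                raise  # retries exhausted: surface the temporary error
            sleep(wait_ms / 1000.0)
            wait_ms *= 2  # assumption: the wait doubles after each failure
```

With a larger MaximumRequestRetries, a throttled query is more likely to complete eventually, at the cost of a longer worst-case delay before an error is reported.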

The CData SSIS Components for Amazon DynamoDB also have two separate APIs available, PartiQL and Scan. The API that is used depends on the query that is executed.

PartiQL

PartiQL is used for any INSERT, UPDATE, or DELETE query, as well as any SELECT that contains a filter. This is because the PartiQL API has more advanced filtering capabilities than the older Scan endpoint. In general, a query where a significant portion of the result is filtered out can be expected to execute faster than a query where very little is filtered out.
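The routing rule can be summarized in a short sketch. The `choose_api` function is hypothetical, and it treats the presence of a WHERE keyword as the only sign of a filter, which is a simplification: it does not parse the statement and would be fooled by the word appearing inside a string literal.

```python
import re


def choose_api(sql: str) -> str:
    """Return "PartiQL" or "Scan" for a statement, following the rule
    described in this section. Hypothetical helper for illustration."""
    statement = sql.strip().upper()
    # Data-modification statements always go through PartiQL.
    if not statement.startswith("SELECT"):
        return "PartiQL"
    # Assumption: a WHERE clause is the only kind of filter considered.
    if re.search(r"\bWHERE\b", statement):
        return "PartiQL"
    # An unfiltered SELECT must read every item, so Scan is used.
    return "Scan"
```

For example, `choose_api("SELECT * FROM Orders")` returns `"Scan"`, while adding `WHERE Status = 'Open'` to the same statement returns `"PartiQL"`. The table and column names here are made up.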

Using Paging Effectively

You can use the Pagesize property to optimize use of your provisioned throughput, based on the size of your items and Amazon DynamoDB's 1 MB page size. Set this property to the number of items to return per page.

Generally, a smaller page size reduces spikes in throughput that cause throttling. A smaller page size also inserts pauses between requests. This interval evens out the distribution of requests and allows more requests to be successful by avoiding throttling.
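As a rough aid for choosing a value, the sketch below estimates how many items of a given average size fit into one 1 MB page. The `items_per_page` helper is hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption. The result is only an upper bound: per the guidance above, a smaller Pagesize may still be preferable to reduce throttling.

```python
# Assumption: the 1 MB page limit is taken as 1024 * 1024 bytes.
DYNAMODB_PAGE_BYTES = 1024 * 1024


def items_per_page(average_item_bytes: int) -> int:
    """Estimate the most items of the given average size that fit in
    a single 1 MB page. Hypothetical helper for illustration."""
    if average_item_bytes <= 0:
        raise ValueError("average_item_bytes must be positive")
    # At least one item is always returned per page in this estimate.
    return max(1, DYNAMODB_PAGE_BYTES // average_item_bytes)
```

For items averaging 4 KB (4096 bytes), the estimate is 256 items per page, so a Pagesize above 256 would not be expected to return more items per request under these assumptions.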

Scans

A Scan is used for any SELECT query that contains no filter. In this case, all results must be retrieved, so there is no advantage in using the PartiQL API. The Scan API also has a key feature that gives it better performance than an unfiltered PartiQL query: multiple threads.

The ThreadCount connection property controls how many threads are used when executing a Scan request. Using more threads consumes more memory but returns results faster. The default is 4. This works best on tables where a high or variable throughput is provisioned.
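The idea of a multi-threaded Scan can be sketched with an in-memory stand-in for a table. This is a simulation, not the component's code: `scan_segment` and `parallel_scan` are hypothetical, and splitting the table into one segment per thread is an assumption about how the work might be divided.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(table, segment, total_segments):
    """Hypothetical worker: return the items belonging to one segment.
    Here an item's position in the list decides its segment."""
    return [item for index, item in enumerate(table)
            if index % total_segments == segment]


def parallel_scan(table, thread_count=4):
    """Read every item in `table` using `thread_count` worker threads,
    one segment per thread (an assumed division of work)."""
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        futures = [pool.submit(scan_segment, table, segment, thread_count)
                   for segment in range(thread_count)]
        results = []
        for future in futures:
            # Each worker holds its own partial result, which is why
            # more threads also means more memory in use at once.
            results.extend(future.result())
    return results
```

Every item is returned exactly once, but the order of the combined result follows the segments rather than the original table order.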

In cases where the maximum throughput for a table would be exceeded on a single thread, there is no benefit to using a Scan over the single-threaded PartiQL API. Amazon DynamoDB will simply throttle all threads until the maximum throughput is no longer exceeded.

Copyright (c) 2024 CData Software, Inc. - All rights reserved.