SSIS Components for Amazon Athena

Build 21.0.7929

Performance

Cleaning Query Results

Amazon Athena stores the results of every query you execute as CSV files in S3StagingDirectory; these files can quickly consume a large amount of space in Amazon S3. The CleanQueryResults property, enabled by default, deletes these files after each query is executed.

Note that this cleanup adds a minor performance cost when you close the last connection in a process.
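
As a rough illustration, the two properties named above could be set together in a connection string. The sketch below assumes the common semicolon-separated Name=Value format; the helper name build_connection_string and the bucket path are hypothetical and used only for illustration.

```python
def build_connection_string(props):
    """Join property names and values into 'Name=Value;' pairs.

    Hypothetical helper: it only formats text and does not open a connection.
    """
    parts = []
    for name, value in props.items():
        # Render Python booleans as the strings True / False.
        if isinstance(value, bool):
            value = "True" if value else "False"
        parts.append(f"{name}={value}")
    return ";".join(parts) + ";"


conn_str = build_connection_string({
    # Hypothetical staging location; replace with your own bucket and folder.
    "S3StagingDirectory": "s3://example-bucket/athena-results/",
    # Enabled by default according to the text above; shown here explicitly.
    "CleanQueryResults": True,
})
print(conn_str)
```

Setting CleanQueryResults explicitly is redundant when the default is kept, but it makes the intended behavior visible to anyone reading the configuration.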

Using Athena's Query Caching

You can configure QueryCachingLevel to control how the query results stored in S3StagingDirectory are reused; note that you must keep the connection open to benefit from this feature. Caching is especially helpful when you execute the same query multiple times: Amazon Athena does not scan the same data again and simply returns the results of the previous execution. Cached results are cleaned after the number of seconds specified in QueryTolerance.

Note that failing to properly close the connection when QueryCachingLevel is set to Cloud may leave a large number of saved queries in Athena. For most use cases, setting QueryCachingLevel to Local should be enough.
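
The idea of reusing a previous result until a tolerance expires can be sketched as follows. This is a conceptual illustration only, not the component's implementation: the class QueryResultCache, its method get_or_run, and the injectable clock are all hypothetical names introduced for the example.

```python
import time


class QueryResultCache:
    """Reuse the result of an identical query until it is older than the tolerance."""

    def __init__(self, tolerance_seconds, clock=time.monotonic):
        self.tolerance_seconds = tolerance_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._entries = {}          # query text -> (stored_at, rows)

    def get_or_run(self, query, run_query):
        now = self.clock()
        entry = self._entries.get(query)
        # Serve the stored rows while they are younger than the tolerance.
        if entry is not None and now - entry[0] < self.tolerance_seconds:
            return entry[1]
        # Otherwise execute the query again and remember the fresh result.
        rows = run_query(query)
        self._entries[query] = (now, rows)
        return rows
```

In this sketch the cache lives inside one object, which mirrors the requirement above that the connection stay open: once the object is discarded, the stored results are no longer available for reuse.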

Fine Tuning Performance

You can use the PageSize property to optimize throughput, based on the size of your rows and Amazon Athena's maximum page size of 1000 results. Set this property to the maximum number of results to return per page.

Generally, a smaller page size reduces the spikes in throughput that cause throttling. A smaller page size also inserts pauses between requests. These pauses even out the distribution of requests and let more requests succeed by avoiding throttling.
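
The trade-off can be made concrete with a little arithmetic: a smaller page size means more, smaller requests for the same result set. The sketch below uses a hypothetical helper, requests_needed, and illustrative row counts and page sizes that are not taken from the component.

```python
import math


def requests_needed(total_rows, page_size):
    """Return how many page requests are needed to fetch total_rows rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_rows / page_size)


# Illustrative values only: the same 10000-row result set at two page sizes.
for page_size in (1000, 250):
    print(page_size, requests_needed(10000, page_size))
```

Each request in the smaller-page case carries a quarter of the rows, so the load is spread over four times as many round trips instead of arriving in larger bursts.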

Copyright (c) 2021 CData Software, Inc. - All rights reserved.