This property enhances SQL functionality beyond what can be supported through the API directly, by enabling in-memory client-side processing.
When SupportEnhancedSQL = true, the provider offloads as much of the SELECT statement processing as possible to HDFS and then processes the rest of the query in memory. In this way, the provider can execute predicates, joins, and aggregations that HDFS does not support directly.
When SupportEnhancedSQL = false, the provider limits SQL execution to what is supported by the HDFS API.
Execution of Predicates
The provider determines which clauses the data source supports and pushes them to the source, which returns the smallest superset of rows that could satisfy the query. It then applies the remaining clauses locally to filter out rows that do not match. The filter operation is streamed, which enables the provider to filter effectively even for very large datasets.
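The split between pushed-down and locally evaluated clauses can be illustrated with a minimal Python sketch. It is a conceptual model only, not the provider's implementation: the clause names, the sample rows, and the helper functions `split_predicates` and `apply_predicates` are all hypothetical, and the "source" is simulated with an in-memory list.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

Row = Dict[str, object]
Predicate = Tuple[str, Callable[[Row], bool]]  # (hypothetical clause name, row test)


def split_predicates(
    predicates: List[Predicate], supported: Set[str]
) -> Tuple[List[Predicate], List[Predicate]]:
    """Separate clauses the source can evaluate from those that must run locally."""
    pushed = [p for p in predicates if p[0] in supported]
    local = [p for p in predicates if p[0] not in supported]
    return pushed, local


def apply_predicates(rows: Iterable[Row], predicates: List[Predicate]) -> Iterator[Row]:
    """Yield matching rows one at a time, so the full input is never held in memory."""
    for row in rows:
        if all(test(row) for _, test in predicates):
            yield row


# Hypothetical sample data standing in for rows held by the data source.
SOURCE_ROWS: List[Row] = [
    {"name": "a.csv", "size": 10},
    {"name": "b.log", "size": 500},
    {"name": "c.csv", "size": 900},
]

predicates: List[Predicate] = [
    ("size_gt_100", lambda r: r["size"] > 100),
    ("name_ends_csv", lambda r: str(r["name"]).endswith(".csv")),
]

# Assumption for this sketch: the source can only evaluate the size comparison.
pushed, local = split_predicates(predicates, supported={"size_gt_100"})

superset = apply_predicates(SOURCE_ROWS, pushed)  # simulates work done at the source
result = list(apply_predicates(superset, local))  # streamed client-side filter
print(result)
```

Because both stages are generators, each row passes through the local filter as soon as it arrives from the simulated source, which is the property that keeps memory use flat as the dataset grows.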
Execution of Joins
The provider uses various techniques to join in memory. In doing so, it trades off memory utilization against the need to read the same table more than once.
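The memory-versus-rereading trade-off can be sketched with two generic join strategies. The source does not say which techniques the provider actually uses, so the hash join and rescanning nested-loop join below are assumptions chosen for illustration; `CountingTable` and the sample rows are likewise hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def hash_join(left: Iterable[Row], scan_right: Callable[[], Iterable[Row]], key: str) -> Iterator[Row]:
    """Read the right table once and index it in memory (more memory, fewer reads)."""
    index: Dict[object, List[Row]] = defaultdict(list)
    for r in scan_right():
        index[r[key]].append(r)
    for l in left:
        for r in index.get(l[key], []):
            yield {**l, **r}


def rescan_join(left: Iterable[Row], scan_right: Callable[[], Iterable[Row]], key: str) -> Iterator[Row]:
    """Re-read the right table for every left row (less memory, more reads)."""
    for l in left:
        for r in scan_right():
            if l[key] == r[key]:
                yield {**l, **r}


class CountingTable:
    """Hypothetical table wrapper that counts how many times it is scanned."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.scans = 0

    def scan(self) -> Iterator[Row]:
        self.scans += 1
        return iter(self.rows)


files: List[Row] = [
    {"owner_id": 1, "path": "/a"},
    {"owner_id": 2, "path": "/b"},
    {"owner_id": 1, "path": "/c"},
]
owners = CountingTable([
    {"owner_id": 1, "owner": "alice"},
    {"owner_id": 2, "owner": "bob"},
])

hashed = list(hash_join(files, owners.scan, "owner_id"))
hash_scans = owners.scans

owners.scans = 0
rescanned = list(rescan_join(files, owners.scan, "owner_id"))
rescan_scans = owners.scans

print(hash_scans, rescan_scans)
```

Both strategies return the same joined rows. The hash join scans the right-hand table once but keeps an index of it in memory, while the rescanning join holds almost nothing in memory but scans the right-hand table once per left-hand row.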
Execution of Aggregates
The provider retrieves all rows needed for the aggregation and then computes the aggregate in memory.
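As a rough illustration of client-side aggregation, the following sketch computes a grouped COUNT and SUM over rows that have already been retrieved. It is one possible approach under stated assumptions, not the provider's actual algorithm: the `aggregate` function, the column names, and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def aggregate(rows: Iterable[Row], group_key: str, value_key: str) -> Dict[object, Tuple[int, int]]:
    """Compute (COUNT, SUM) per group by consuming every retrieved row once."""
    totals: Dict[object, List[int]] = {}
    for row in rows:
        state = totals.setdefault(row[group_key], [0, 0])
        state[0] += 1                # running COUNT(*)
        state[1] += row[value_key]   # running SUM(value_key)
    return {group: (count, total) for group, (count, total) in totals.items()}


# Hypothetical rows, as if already fetched from the data source.
fetched: List[Row] = [
    {"owner": "alice", "size": 10},
    {"owner": "bob", "size": 500},
    {"owner": "alice", "size": 900},
]

# Roughly: SELECT owner, COUNT(*), SUM(size) FROM files GROUP BY owner
summary = aggregate(fetched, "owner", "size")
print(summary)
```

Every row that contributes to the result has to reach the client, so the cost of an aggregate computed this way grows with the number of rows retrieved rather than with the number of groups returned.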