Excel Add-In for Parquet

Build 24.0.9060

Query Processing

The add-in library includes a client-side SQL engine built by CData. This engine enables support for the full capabilities of SQL-92, including filters, aggregations, and functions.

For sources that do not support SQL-92, the add-in offloads as much of the SQL statement processing as possible to Parquet and then processes the rest of the query in memory (client-side). Pushing work to the source limits how much data must be processed in memory, which improves performance.
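
As a rough illustration of this split, the Python sketch below pushes equality filters to a stand-in data source and evaluates the remaining predicate and the aggregation in memory. It is not the add-in's implementation: `source_scan`, `run_query`, and the sample rows are hypothetical names invented for this example, and the assumption that the source supports only equality filters is made purely for illustration.

```python
# Hypothetical sample data standing in for rows held by the data source.
ROWS = [
    {"city": "Oslo", "year": 2023, "sales": 120},
    {"city": "Oslo", "year": 2024, "sales": 150},
    {"city": "Bergen", "year": 2024, "sales": 90},
    {"city": "Oslo", "year": 2024, "sales": 30},
]


def source_scan(equality_filters):
    """Stand-in for the data source. Assumption: it can only apply
    `column = value` filters and nothing else."""
    return [
        row for row in ROWS
        if all(row[col] == val for col, val in equality_filters.items())
    ]


def run_query(equality_filters, client_predicate, sum_column):
    """Push down what the source supports, then finish the query in memory."""
    # Step 1: the supported part of the WHERE clause is sent to the source.
    fetched = source_scan(equality_filters)
    # Step 2: the unsupported predicate and the SUM are computed client-side.
    remaining = [row for row in fetched if client_predicate(row)]
    total = sum(row[sum_column] for row in remaining)
    return len(fetched), total


# Roughly: SELECT SUM(sales) FROM t WHERE city = 'Oslo' AND sales > 100
fetched_count, total = run_query(
    {"city": "Oslo"},                # pushed down to the source
    lambda row: row["sales"] > 100,  # evaluated in memory
    "sales",
)
```

In this sketch the pushed-down filter reduces the rows transferred from four to three before any client-side work begins, which is the effect that pushdown aims for.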

For data sources with limited query capabilities, the add-in transforms the SQL query into a simpler form. The goal is to make smart decisions based on the query capabilities of the data source to push down as much of the computation as possible. The Parquet Query Evaluation component examines SQL queries and returns information indicating which parts of the query the add-in cannot execute natively.
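
The kind of decision described above can be sketched as a simple partition of query predicates by capability. The Python below is a minimal illustration only, not the Query Evaluation component itself: `SUPPORTED_OPS`, `evaluate_query`, and the tuple representation of predicates are all assumptions made for this example.

```python
# Assumption: the operators this hypothetical source can evaluate natively.
SUPPORTED_OPS = {"=", "<", ">"}


def evaluate_query(predicates, supported_ops=SUPPORTED_OPS):
    """Split predicates into those the source can run natively and those
    that must be evaluated client-side.

    Each predicate is a (column, operator, value) tuple.
    """
    pushdown, client_side = [], []
    for predicate in predicates:
        _column, operator, _value = predicate
        if operator in supported_ops:
            pushdown.append(predicate)
        else:
            client_side.append(predicate)
    return {"pushdown": pushdown, "client_side": client_side}


# Roughly: WHERE year = 2024 AND name LIKE 'A%'
plan = evaluate_query([("year", "=", 2024), ("name", "LIKE", "A%")])
```

Here `plan["client_side"]` holds the `LIKE` predicate, which is the part of the query that would have to be processed in memory under the stated assumption.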

The Parquet Query Slicer component is used in more specific cases to separate a single query into multiple independent queries. The client-side Query Engine decides when to simplify a query, when to break it into multiple queries, and whether to push aggregations down or compute them on the client side, while minimizing the size of the result set.
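
One way to picture query slicing is an `IN` filter that the source cannot evaluate directly, split into one equality query per value. The Python sketch below illustrates that idea under stated assumptions; it is not how the Query Slicer is implemented. `source_query`, `slice_in_filter`, `run_sliced`, and the sample rows are hypothetical, as is the premise that the source accepts only a single equality filter per query.

```python
# Hypothetical sample data standing in for rows held by the data source.
ROWS = [
    {"city": "Oslo", "sales": 120},
    {"city": "Oslo", "sales": 150},
    {"city": "Bergen", "sales": 90},
    {"city": "Tromso", "sales": 40},
]


def source_query(column, value):
    """Stand-in for the data source. Assumption: it accepts only one
    `column = value` filter per query."""
    return [row for row in ROWS if row[column] == value]


def slice_in_filter(column, values):
    """Turn `column IN (values)` into independent equality queries.
    Duplicate values are dropped so that no row is fetched twice."""
    return [(column, value) for value in dict.fromkeys(values)]


def run_sliced(column, values):
    """Run each slice separately and merge the results client-side."""
    slices = slice_in_filter(column, values)
    merged = []
    for col, value in slices:
        merged.extend(source_query(col, value))
    return slices, merged


# Roughly: SELECT * FROM t WHERE city IN ('Oslo', 'Bergen', 'Oslo')
slices, merged = run_sliced("city", ["Oslo", "Bergen", "Oslo"])
# An aggregation over the merged rows is then computed on the client side.
total_sales = sum(row["sales"] for row in merged)
```

The design choice worth noting is the de-duplication of slice values: because each slice runs independently, overlapping slices would otherwise return the same rows more than once and distort any client-side aggregation.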

There is a significant trade-off in evaluating queries, even partially, client-side. Some queries are impossible to execute efficiently in this model, and some are particularly expensive to compute this way. CData always pushes down as much of the query as the data source can feasibly handle, in order to generate the most efficient query possible and provide the most flexible query capabilities.

More Information

For a full discussion of how CData handles query processing, see CData Architecture: Query Execution.

Copyright (c) 2024 CData Software, Inc. - All rights reserved.