CData Python Connector for Azure Analysis Services

Build 24.0.9060

Retrieving Analysis Services Data

Azure Analysis Services is an OLAP database that exposes data as cubes, which you query with MDX (multidimensional expressions). The connector models these cubes in relational views that you can query with SQL-92. The following mapping is for the layout of the model:

  • Catalog - Displayed in the connector as a Catalog.
  • Cube - Displayed in the connector as a Schema.
  • Measure - Available in the connector under the special Measures view.
  • Dimension - Each dimension is exposed as a view.
  • Level - Each individual level of a hierarchy is exposed as a column on the appropriate dimension view.

Joining Measures and Dimensions

In order to retrieve measures per specific level value, issue a join between the Measure view and any Dimension or set of dimensions. For example, issuing the following will retrieve the number of customers in each city:

SELECT m.[Customer Count], c.[City] 
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 
INNER JOIN [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Measures AS m

Note that there is no ON condition necessary. That is because tables are already related appropriately in Azure Analysis Services. If you are using a tool that requires ON conditions, set IncludeJoinColumns to true. This will append a number of foreign key columns to each view which will relate them to one another another. These columns will not return data on their own, but may be picked up on automatically with tools to construct the ON conditions for joins where needed.

Aggregating Data

Data stored in Azure Analysis Services is already aggregated. In many cases, attempting to retrieve an aggregate may be syntactically equivalent to not specifying anything. For example, the following query will return the exact same data as the previous:

SELECT SUM(m.[Customer Count]), c.[City] 
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 
INNER JOIN [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Measures AS m
GROUP BY c.[City]

The exception to this rule is when an aggregation of filtered results is requested. In such cases, a calculation will be requested from Azure Analysis Services. For example, to calculate the sum and average of customers in France and Germany:

SELECT SUM(m.[Customer Count]), AVG(m.[Customer Count]), c.[Country] 
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 
INNER JOIN [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Measures AS m
WHERE c.[Country] IN ('France', 'Germany')
GROUP BY c.[Country]

Selecting Multiple Hierarchies

In Azure Analysis Services, individual dimensions are made up of hierarchies which may have one or more levels. For instance, the AdventureWorks Customers table has City, Country and Gender. City and Country are part of the same hierarchy while Gender is its own hierarchy.

When selecting multiple hierarchies, the method to support this is to cross join the values in MDX. While not obvious from a relational table model of the data as the connector presents, this can cause for very expensive queries to be executed. For example, executing the following:

SELECT c.[Country], m.[Customer Count]
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 
INNER JOIN [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Measures AS m

Will result in 6 rows. However, selecting Gender as well:

SELECT c.[Country], c.[Gender], m.[Customer Count]
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 
INNER JOIN [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Measures AS m

Will now result in 12 rows. It is because Gender and Country are on different hierarchies, thus a crossjoin is required in order to return both together. Each additional hierarchy added to the SELECT will multiply the total results by the number of available values in that hierarchy. Thus to get a count of how many rows to expect, one can execute the following:

SELECT (Count(c.[Country])*Count(c.[Gender])) AS totalrows
FROM [AdventureWorksDW2012Multidimensional-SE].[Adventure Works].Customer AS c 

Response Row Limit

Due to how selecting multiple hierarchies will multiply the total number of result rows, it is possible to balloon the number of response rows very quickly, which will result in timeouts. In order to try and give some visibility into what queries will be very expensive, the ResponseRowLimit connection property has been added as a mechanism to try and guide users into better practices. When set, it will calculate how many rows to expect before any query is executed. If the number of predicted rows exceeds the limit, an error will be thrown indicating how many rows to expect back with the query.

It is recommended to select only the columns required or to apply a WHERE criteria. Both can significantly reduce the number of response rows, which will have a huge impact on performance. If you are already familiar with the connector and what queries may be expensive, ResponseRowLimit may be disabled by setting it to 0.

Copyright (c) 2024 CData Software, Inc. - All rights reserved.
Build 24.0.9060