Window Functions

Window functions allow you to create computed fields from a group of rows (a window) that return a result for each row, as opposed to one computed result for a set of rows, as is the case with aggregate functions. The component supports the following window function syntax.

Note: Window function support is an experimental feature of the component. This functionality extends beyond the component's core scope of being SQL-92 compliant. As such, performance with window functions may not be optimal.

Window Function Clauses

OVER

The OVER clause defines the window over which window functions are performed.

SELECT A, B, <window function> OVER (<window frame>) FROM TableName

The <window function> refers to any supported window function clause, and the <window frame> refers to one or more clauses that specify the logic by which the window is defined.

PARTITION BY

The PARTITION BY clause subdivides a window into sub-windows called partitions. For each unique value in the column specified in the PARTITION BY clause, every record with that value collectively forms an individual partition.

SELECT A, B, <window function> OVER (PARTITION BY A ORDER BY B) From Test_xlsx_Sheet1

The <window function> refers to any supported window function clause.

Window Functions

The component supports math, ranking, and analytic window functions.

Math

These window functions perform mathematical operations on the records within the window.

COUNT()

Calculates the number of records in each partition. The calculated column is of the data type "int".

In each partition, every record will display the total number of records in that partition.

SELECT Name, Role, Earnings, COUNT() OVER (PARTITION BY Role) FROM Employees

COUNT_BIG()

Calculates the number of records in each partition. The calculated column is of the data type "bigint".

In each partition, every record will display the total number of records in that partition.

SELECT Name, Role, Earnings, COUNT_BIG() OVER (PARTITION BY Role) FROM Employees

MIN(numeric_column)

Calculates the minimum value of a numerical column per partition.

In each partition, every record will display the minimum value of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, MIN(Earnings) OVER (PARTITION BY Role) FROM Employees

MAX(numeric_column)

Calculates the maximum value of a numerical column per partition.

In each partition, every record will display the maximum value of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, MAX(Earnings) OVER (PARTITION BY Role) FROM Employees

SUM(numeric_column)

Calculates the sum of a numerical column per partition.

In each partition, every record will display the sum of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, SUM(Earnings) OVER (PARTITION BY Role) FROM Employees

AVG(numeric_column)

Calculates the average value of a numerical column per partition.

In each partition, every record will display the average value of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, AVG(Earnings) OVER (PARTITION BY Role) FROM Employees

MEDIAN(numeric_column)

Calculates the median value of a numerical column per partition.

In each partition, every record will display the median value of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, MEDIAN(Earnings) OVER (PARTITION BY Role) FROM Employees

STDEV(numeric_column)

Calculates the standard deviation of a numerical column per partition.

In each partition, every record will display the standard deviation of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, STDEV(Earnings) OVER (PARTITION BY Role) FROM Employees

STDEVP(numeric_column)

Calculates the population standard deviation of a numerical column per partition.

In each partition, every record will display the population standard deviation of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, STDEVP(Earnings) OVER (PARTITION BY Role) FROM Employees

VAR(numeric_column)

Calculates the statistical standard variance of a numerical column per partition.

In each partition, every record will display the statistical standard variance of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, VAR(Earnings) OVER (PARTITION BY Role) FROM Employees

VARP(numeric_column)

Calculates the variance population of a numerical column per partition.

In each partition, every record will display the variance population of numeric_column across the records in that partition.

SELECT Name, Role, Earnings, VARP(Earnings) OVER (PARTITION BY Role) FROM Employees

Ranking

These window functions rank records that fall within the window and its partitions.

RANK()

Assigns a rank number to each record in a window based on the value of the column specified in the required ORDER BY clause.

If two or more records have an equal value in the in ranked column, they all receive the same rank number and the rank count increments internally, skipping ahead one rank number for each record with a duplicate value in the ORDER BY column.

SELECT Id, Column1, RANK() OVER (ORDER BY Column1) AS Rank FROM Test_xlsx_Sheet1

If you add a PARTITION BY clause, a separate set of ranks is calculated for each partition.

SELECT Id, Column1, RANK() OVER (PARTITION BY Id ORDER BY Column1) AS Rank FROM Test_xlsx_Sheet1

DENSE_RANK()

Operates like the RANK() function, but it doesn't increment the internal rank counter for each record with a duplicate value in the ranked column.

This means that, while records with identical values in the ORDER BY column still share a rank number, the function never skips a rank number.

SELECT Id, Column1, DENSE_RANK() OVER (PARTITION BY Id ORDER BY Column1) AS Rank FROM Test_xlsx_Sheet1

If you add a PARTITION BY clause, a separate set of ranks is calculated for each partition.

SELECT Id, Column1, DENSE_RANK() OVER (PARTITION BY Id ORDER BY Column1) AS Rank FROM Test_xlsx_Sheet1

ROW_NUMBER()

Calculates a row number for each record. An ORDER BY clause in the OVER clause is required.

SELECT Name, Role, Earnings, ROW_NUMBER() OVER (ORDER BY Role) FROM Employees

If you define multiple partitions with PARTITION BY, a new set of row numbers are calculated for each partition.

SELECT Name, Role, Earnings, ROW_NUMBER() OVER (PARTITION BY Role ORDER BY Earnings) FROM Employees

NTILE()

Distributes rows of an ordered partition into a specified number of approximately equal groups, or buckets. It assigns each group a bucket number starting from one. For each row in a group, the NTILE() function assigns a bucket number representing the group to which the row belongs.

The syntax of NTILE() is:

NTILE(buckets) OVER (
    [PARTITION BY partition_expression, ... ]
    ORDER BY sort_expression [ASC | DESC], ...
)

The following are paramaters that NTILE() supports:

buckets: The number of buckets into which the rows are divided. The buckets can be an expression or subquery that evaluates to a positive integer. It cannot be a window function.
PARTITION BY: distributes rows of a result set into partitions to which the NTILE() function is applied.
ORDER BY is clause that specifies the logical order of rows in each partition to which the NTILE() is applied.

If the number of rows is not divisible by the buckets, the NTILE() function returns groups of two sizes with the difference by one. The larger groups always precede the smaller group in the order set by ORDER BY in the OVER() clause.

If the total of rows is divisible by the number of buckets, the function divides the rows evenly among buckets. The following statement creates a new table named ntile_demo that stores 10 integers:

CREATE TABLE sales.ntile_demo (
	v INT NOT NULL
);
	
INSERT INTO sales.ntile_demo(v) 
VALUES(1),(2),(3),(4),(5),(6),(7),(8),(9),(10);	
	
SELECT * FROM sales.ntile_demo;

This statement uses the NTILE() function to divide ten rows into three groups:

SELECT 
	v, 
	NTILE (3) OVER (
		ORDER BY v
	) buckets
FROM 
	sales.ntile_demo;

Analytical

These window functions perform analytical operations on the records within the window.

PERCENT_RANK()

Calculates the relative rank SQL Percentile of each row. It returns values greater than zero, but the maximum value is one. It does not count any NULL values. This function is nondeterministic.

The syntax of PERCENT_RANK() is:

PERCENT_RANK() OVER (
      [PARTITION BY partition_expression, ... ]
      ORDER BY sort_expression [ASC | DESC], ...
  )

This syntax uses the following parameters.

PARTITION BY: By default, SQL Server treats the whole data set as a single set. You can specify the PARTITION BY clause to divide data into multiple sets. The Percent_Rank function performs the analytical calculations on each set. This parameter is optional.
ORDER BY: Sorts the data in either ascending or descending order. This parameter is required.

SSIS Components for Microsoft Excel Online

Window Functions

Window Function Clauses

OVER

PARTITION BY

Window Functions

Math

COUNT()

COUNT_BIG()

MIN(numeric_column)

MAX(numeric_column)

SUM(numeric_column)

AVG(numeric_column)

MEDIAN(numeric_column)

STDEV(numeric_column)

STDEVP(numeric_column)

VAR(numeric_column)

VARP(numeric_column)

Ranking

RANK()

DENSE_RANK()

ROW_NUMBER()

NTILE()

Analytical

PERCENT_RANK()