CData Python Connector for HDFS

Build 21.0.7930

Aggregate Functions

Certain aggregate functions can also be used within SQLAlchemy through its func namespace. First, import it:

from sqlalchemy.sql import func

Once imported, the following aggregate functions are available:

COUNT

This example counts the number of records in each group using the session object's query() method.

rs = session.query(func.count(Files.Id).label("CustomCount"), Files.FileId).group_by(Files.FileId)
for instance in rs:
	print("Count: ", instance.CustomCount)
	print("FileId: ", instance.FileId)
	print("---------")
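
To try the grouping pattern without an HDFS connection, the following is a minimal sketch that runs the same COUNT query against an in-memory SQLite database. The Files class here is a hypothetical stand-in with only the three columns used in this section, the sample rows are made up, and the import of declarative_base from sqlalchemy.orm assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.sql import func

Base = declarative_base()

class Files(Base):
    # Hypothetical mapping: not the connector's real Files schema.
    __tablename__ = "Files"
    Id = Column(Integer, primary_key=True)
    FileId = Column(String)
    Length = Column(Integer)

# In-memory SQLite stands in for the HDFS connection (assumption).
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Made-up sample rows: two records in group "a", one in group "b".
session.add_all([
    Files(Id=1, FileId="a", Length=10),
    Files(Id=2, FileId="a", Length=30),
    Files(Id=3, FileId="b", Length=5),
])
session.commit()

# Same shape as the COUNT example: one result row per FileId.
rs = session.query(
    func.count(Files.Id).label("CustomCount"), Files.FileId
).group_by(Files.FileId)

counts = {row.FileId: row.CustomCount for row in rs}
print(counts)
```

Each result row exposes the label given to the aggregate ("CustomCount") as an attribute, which is why the loop in the example can read instance.CustomCount.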

Alternatively, you can execute COUNT using the session object's execute() method.

rs = session.execute(Files_table.select().with_only_columns([func.count(Files_table.c.Id).label("CustomCount"), Files_table.c.FileId]).group_by(Files_table.c.FileId))
for instance in rs:
	print("Count: ", instance.CustomCount)
	print("FileId: ", instance.FileId)
	print("---------")
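
The execute() examples in this section pass a list to with_only_columns(), which matches older SQLAlchemy releases. In SQLAlchemy 1.4 and later, columns are normally passed positionally, and newer releases may reject the list form. As a hedged sketch of the positional style, the following builds the same grouped COUNT with select(). The Files_table definition, the in-memory SQLite engine, and the sample rows are assumptions made for illustration, not part of the connector.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select
from sqlalchemy.sql import func

metadata = MetaData()

# Hypothetical table definition with only the columns used in this section.
Files_table = Table(
    "Files", metadata,
    Column("Id", Integer, primary_key=True),
    Column("FileId", String),
    Column("Length", Integer),
)

# In-memory SQLite stands in for the HDFS connection (assumption).
engine = create_engine("sqlite:///:memory:")
metadata.create_all(engine)

with engine.begin() as conn:
    # Made-up sample rows.
    conn.execute(Files_table.insert(), [
        {"Id": 1, "FileId": "a", "Length": 10},
        {"Id": 2, "FileId": "a", "Length": 30},
        {"Id": 3, "FileId": "b", "Length": 5},
    ])

    # Columns are passed positionally to select() (SQLAlchemy 1.4+ style).
    stmt = (
        select(func.count(Files_table.c.Id).label("CustomCount"), Files_table.c.FileId)
        .group_by(Files_table.c.FileId)
        .order_by(Files_table.c.FileId)
    )
    rows = [(row.CustomCount, row.FileId) for row in conn.execute(stmt)]

print(rows)
```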

SUM

This example calculates the total of a numeric column for each group using the session object's query() method.

rs = session.query(func.sum(Files.Length).label("CustomSum"), Files.FileId).group_by(Files.FileId)
for instance in rs:
	print("Sum: ", instance.CustomSum)
	print("FileId: ", instance.FileId)
	print("---------")

Alternatively, you can invoke SUM using the session object's execute() method.

rs = session.execute(Files_table.select().with_only_columns([func.sum(Files_table.c.Length).label("CustomSum"), Files_table.c.FileId]).group_by(Files_table.c.FileId))
for instance in rs:
	print("Sum: ", instance.CustomSum)
	print("FileId: ", instance.FileId)
	print("---------")

AVG

This example calculates the average value of a numeric column for each group using the session object's query() method.

rs = session.query(func.avg(Files.Length).label("CustomAvg"), Files.FileId).group_by(Files.FileId)
for instance in rs:
	print("Avg: ", instance.CustomAvg)
	print("FileId: ", instance.FileId)
	print("---------")

Alternatively, you can invoke AVG using the session object's execute() method.

rs = session.execute(Files_table.select().with_only_columns([func.avg(Files_table.c.Length).label("CustomAvg"), Files_table.c.FileId]).group_by(Files_table.c.FileId))
for instance in rs:
	print("Avg: ", instance.CustomAvg)
	print("FileId: ", instance.FileId)
	print("---------")

MAX and MIN

This example finds the maximum and minimum values of a numeric column for each group using the session object's query() method.

rs = session.query(func.max(Files.Length).label("CustomMax"), func.min(Files.Length).label("CustomMin"), Files.FileId).group_by(Files.FileId)
for instance in rs:
	print("Max: ", instance.CustomMax)
	print("Min: ", instance.CustomMin)
	print("FileId: ", instance.FileId)
	print("---------")
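
Several aggregates can be combined in a single query, as the MAX and MIN example does. The sketch below extends that idea to SUM, AVG, MAX, and MIN together, using an in-memory SQLite database so the results can be checked by hand. The Files class is a hypothetical stand-in, the rows are made up, and the declarative_base import assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.sql import func

Base = declarative_base()

class Files(Base):
    # Hypothetical mapping: not the connector's real Files schema.
    __tablename__ = "Files"
    Id = Column(Integer, primary_key=True)
    FileId = Column(String)
    Length = Column(Integer)

# In-memory SQLite stands in for the HDFS connection (assumption).
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Made-up sample rows: lengths 10 and 30 in group "a", 5 in group "b".
session.add_all([
    Files(Id=1, FileId="a", Length=10),
    Files(Id=2, FileId="a", Length=30),
    Files(Id=3, FileId="b", Length=5),
])
session.commit()

# One query, four labeled aggregates, one result row per FileId.
rs = session.query(
    func.sum(Files.Length).label("CustomSum"),
    func.avg(Files.Length).label("CustomAvg"),
    func.max(Files.Length).label("CustomMax"),
    func.min(Files.Length).label("CustomMin"),
    Files.FileId,
).group_by(Files.FileId)

stats = {
    r.FileId: (r.CustomSum, r.CustomAvg, r.CustomMax, r.CustomMin)
    for r in rs
}
print(stats)
```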

Alternatively, you can invoke MAX and MIN using the session object's execute() method.

rs = session.execute(Files_table.select().with_only_columns([func.max(Files_table.c.Length).label("CustomMax"), func.min(Files_table.c.Length).label("CustomMin"), Files_table.c.FileId]).group_by(Files_table.c.FileId))
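for instance in rs:
	print("Max: ", instance.CustomMax)
	print("Min: ", instance.CustomMin)
	print("FileId: ", instance.FileId)
	print("---------")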

Copyright (c) 2021 CData Software, Inc. - All rights reserved.