Aggregation Functions

Aggregation functions are used in SELECT statements to return one single value for every group of tuples obtained as result of a grouping operation.

These functions receive as a parameter an expression indicating the name of the field to which it is applied. This parameter can optionally be preceded by one of two modifiers: ALL or DISTINCT. These modifiers affect the semantics of certain aggregation functions, applying them to all tuples in a group or only to those with a different value for the field in question.

Aggregation functions supported by Virtual DataPort

AVG

Description

The AVG function returns the average of the non-null values of a field of the table.

If all the values of the field are NULL, the function returns NULL.

Syntax

AVG( <expression> ) : double
  • expression. Required. The type of the expression has to be int, long, float, double or decimal.

Examples

Example 1

Consider the following view ITEMS:

item price
A 3.45
B 9.99
C 4.99
SELECT AVG(price) AS average_price
FROM ITEMS
average_price
6.1433333333333335

Example 2

Consider the following view ITEMS_2:

item price
A 3.45
B 9.99
C 4.99
D NULL
SELECT AVG(price) AS average_price
FROM ITEMS_2
average_price
6.1433333333333335

COUNT

Description

The COUNT function returns the number of tuples of the result of a selection operation.

If the parameter is *, it returns the number of tuples.

If the parameter is a field, it returns the number of non-null values

It is applicable to any type of field.

This function can be used in queries without a GROUP BY clause, but in that case, it can only be used with *.

Syntax

COUNT( <count parameter> )

<count parameter> ::=
    <field name:identifier>
  | *
  • count parameter. Required. It can be either a field name or *.

Examples

Consider the following view ITEMS:

item price
A 3.45
B 9.99
C 4.99
D NULL

Example 1

SELECT COUNT(*)
FROM items
count
4

Example 2

SELECT COUNT(price)
FROM items
count
3

FIRST

Description

The FIRST function returns the first value of a field of each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

FIRST ( <field name:identifier> )
  • field name. Required. A field of the view.

Examples

Consider the following view V:

A B
group1 one
group1 two
group1 NULL
group2 four

Example 1

SELECT FIRST(b)
FROM v
first
one

Example 2

SELECT a, FIRST(b)
FROM v
GROUP BY a
a first
group1 one
group2 four

GROUP_CONCAT

Description

The GROUP_CONCAT function returns, for each group, a string with the concatenation of all the field/fields values of each group.

Syntax

GROUP_CONCAT( [ <row separator:text> [, <field separator:text> ] ],
    <field name:identifier> [, <field name:identifier>]* )
GROUP_CONCAT( <ignore null:boolean> , <row separator:text>, <field separator:text>,
    <field name:identifier> [, <field name:identifier>]* )
  • ignore null. Optional. If true and any of the fields field name of a row are NULL, GROUP_CONCAT ignores all the fields of that row. If false, no rows are ignored and NULL values are treated as empty characters. The default value is true.
  • row separator. Optional. Literal used to separate the values of each row. Default value: ,.
  • field separator. Optional. Literal used to separate the values of the fields of the same row. The default value is a whitespace.
  • field name. Required. Field which contains the values to concatenate.

Examples

Consider the following view V:

a b c
group1 1 one
group1 2 two
group1 NULL three
group2 4 four

Example 1

SELECT A, GROUP_CONCAT(':', c)
FROM v
GROUP BY a
a group_concat
group1 one:two:three
group2 four

Example 2

SELECT A, GROUP_CONCAT(':', ';', b, c)
FROM v
GROUP BY a
a group_concat
group1 1;one:2;two
group2 4:four

As the field B is NULL in the third row, GROUP_CONCAT ignores all the fields of that row. That is, it ignores the value three.

Example 3

SELECT a, GROUP_CONCAT(false, ':', ';', b, c)
FROM v
GROUP BY a
a group_concat
group1 1;one:2;two:;three
group2 four

As the parameter ignoreNulls is false, GROUP_CONCAT does not ignore the value of the field C of the third row (three), even if the value of the field B of that row is NULL. In this case, NULL values are treated like empty characters.

LAST

Description

The LAST function returns the last value of a field of each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

LAST ( <field name:identifier> )
  • field name. Required. Field name of the view.

Examples

Consider the following view V:

a b
group1 one
group1 two
group1 NULL
group2 four

Example 1

SELECT LAST(b)
FROM v
last
four

Example 2

SELECT a, LAST(b)
FROM v
GROUP BY a
a last
group1 NULL
group2 four

LIST

Description

The LIST function returns an array with all the values of a specified field.

It has the same behavior as the function NEST (section NEST) when invoked with a single field as argument.

Syntax

LIST ( <field name:identifier> )
  • field name. Required. Field of the view.

Examples

Consider the following view V:

a b
group1 one
group1 two
group1 NULL
group2 four

Example 1

SELECT LIST(b)
FROM v
list
Array { one, two, NULL, four }

Example 2

SELECT a, LIST(b)
FROM v
GROUP BY a
a list
group1 Array { one, two, NULL }
group2 Array { four }

MAX

Description

The MAX function returns the highest value of a field for each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

MAX ( <expression> )
  • expression. Required. Expression of type date (deprecated type), intervaldaysecond, intervalyearmonth, time, timestamp, timestamptz or any of the numeric data types.

The return type is the type of the input expression.

Examples

Consider the following view V:

a b
group1 1
group1 2
group1 NULL
group2 4

Example 1

SELECT MAX(b)
FROM v
max
4

Example 2

SELECT a, MAX(b)
FROM v
GROUP BY a
a max
group1 2
group2 4

MIN

Description

The MIN function returns the lowest value of a field for each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

MIN ( <expression> )
  • expression. Required. Expression of type date (deprecated type), intervaldaysecond, intervalyearmonth, time, timestamp, timestamptz or any of the numeric data types.

Examples

Consider the following view V:

a b
group1 1
group1 2
group1 NULL
group2 4

Example 1

SELECT MIN(b)
FROM V
min
1

Example 2

SELECT a, MIN(b)
FROM v
GROUP BY a
a min
group1 1
group2 4

NEST

Description

The NEST function returns an array with the values of the selected fields. Its result is inverse to the result of the FLATTEN views (see section FLATTEN View (Flattening Data Structures) for more information about FLATTEN views).

Syntax

NEST( <field name:identifier> [, <field name:identifier> ]*):array
NEST(*)
  • field name. The name of a field. Using (*) is equivalent to pass all the fields of the view to the function.

Example

Consider the view V:

int_sample text_sample register_sample
1 A Register { hello , how’re you }
1 B Register { hello, good bye }
2 C Register { another string, last string }
SELECT int_sample, NEST(text_sample, register_sample) AS nest_sample
FROM V
GROUP BY int_sample;
int_sample nest_sample
1

Array [ A, Register { hello , how’re you }

B, Register { hello, good bye }

]

2 Array [ C, Register { another string, last string } ]

STDEV

Description

The STDEV function returns the sample standard deviation of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

STDEV( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region state revenue
South Alabama 53168
South Missisipi 5681
South Tennessee 80166
Northeast New York 12945
Northeast Pennsylvania 69284
Northeast New Hampshire 53168
SELECT region, STDEV(revenue)
FROM revenue
GROUP BY region
region stdev
South 37709.24377832752
Northeast 29016.36924794922

STDEVP

Description

The STDEVP function returns the population standard deviation of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

STDEVP( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region state revenue
South Alabama 53168
South Missisipi 5681
South Tennessee 80166
Northeast New York 12945
Northeast Pennsylvania 69284
Northeast New Hampshire 53168
SELECT region, STDEVP(revenue)
FROM revenue
GROUP BY region
region stdevp
South 30789.46861437455
Northeast 23691.76628188695

SUM

Description

The SUM function returns the sum of all non-null values of a field for each group of values.

Syntax

SUM ( <expression> )
  • expression. Required. Any numeric data type.

The return type is the type of the input expression.

Examples

Consider the following view V:

a b
group1 1
group1 2
group1 NULL
group2 4

Example 1

SELECT SUM(b)
FROM v
sum
7

Example 2

SELECT a, MIN(b)
FROM v
GROUP BY a
a sum
group1 3
group2 4

VAR

Description

The VAR function returns the sample variance of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

VAR( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region state revenue
South Alabama 53168
South Missisipi 5681
South Tennessee 80166
Northeast New York 12945
Northeast Pennsylvania 69284
Northeast New Hampshire 53168
SELECT region, VAR(revenue)
FROM revenue
GROUP BY region
region var
South 1.4219870663333333E9
Northeast 8.419496843333333E8

VARP

Description

The VARP function returns the population variance of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

VARP( <expression> ) : double
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region state revenue
South Alabama 53168
South Missisipi 5681
South Tennessee 80166
Northeast New York 12945
Northeast Pennsylvania 69284
Northeast New Hampshire 53168
SELECT region, VARP(revenue)
FROM revenue
GROUP BY region
region varp
South 9.479913775555555E8
Northeast 5.612997895555555E8