USER MANUALS


Aggregation Functions

Aggregate functions return a single result row for each group of rows of the result. Usually, you use aggregation function in the SELECT clause of queries that also have the GROUP BY clause. In these queries, the Execution Engine splits the rows in groups and then, applies the aggregation function over each group. The function returns one row for each group. If the query does not have the GROUP BY clause, the Execution Engine applies the aggregation functions over all the rows and returns one row.

In many aggregation functions, the input parameters can be preceded by one of these modifiers: ALL or DISTINCT. These modifiers affect the semantics of certain aggregation functions, applying them to all tuples in a group or only to those with a different value for the field in question.

The aggregation functions are:

AVG

Description

The AVG function returns the average of the non-null values of a field of the table.

If all the values of the field are NULL, the function returns NULL.

Syntax

AVG( <expression> ) : double
  • expression. Required. The type of the expression has to be int, long, float, double or decimal.

Examples

Example 1

Consider the following view ITEMS:

item

price

A

3.45

B

9.99

C

4.99

SELECT AVG(price) AS average_price
FROM ITEMS

average_price

6.1433333333333335

Example 2

Consider the following view ITEMS_2:

item

price

A

3.45

B

9.99

C

4.99

D

NULL

SELECT AVG(price) AS average_price
FROM ITEMS_2

average_price

6.1433333333333335

COUNT

Description

The COUNT function returns the number of tuples of the result of a selection operation.

If the parameter is *, it returns the number of tuples.

If the parameter is a field, it returns the number of non-null values

It is applicable to any type of field.

This function can be used in queries without a GROUP BY clause, but in that case, it can only be used with *.

Syntax

COUNT( <count parameter> )

<count parameter> ::=
    <field name:identifier>
  | *
  • count parameter. Required. It can be either a field name or *.

Examples

Consider the following view ITEMS:

item

price

A

3.45

B

9.99

C

4.99

D

NULL

Example 1

SELECT COUNT(*)
FROM items

count

4

Example 2

SELECT COUNT(price)
FROM items

count

3

FIRST

Description

The FIRST function returns the first value of a field of each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

FIRST ( <field name:identifier> )
  • field name. Required. A field of the view.

Examples

Consider the following view V:

A

B

group1

one

group1

two

group1

NULL

group2

four

Example 1

SELECT FIRST(b)
FROM v

first

one

Example 2

SELECT a, FIRST(b)
FROM v
GROUP BY a

a

first

group1

one

group2

four

GROUP_CONCAT

Description

The GROUP_CONCAT function returns, for each group, a string with the concatenation of all the field/fields values of each group.

Syntax

GROUP_CONCAT( [ DISTINCT | ALL ] <field name:identifier>)
GROUP_CONCAT( [ <row separator:text> [, <field separator:text> ] ],
    <field name:identifier> [, <field name:identifier>]* )
GROUP_CONCAT( <ignore null:boolean> , <row separator:text>, <field separator:text>,
    <field name:identifier> [, <field name:identifier>]* )
  • ignore null. Optional. If true and any of the fields field name of a row are NULL, GROUP_CONCAT ignores all the fields of that row. If false, no rows are ignored and NULL values are treated as empty characters. The default value is true.

  • row separator. Optional. Literal used to separate the values of each row. Default value: ,.

  • field separator. Optional. Literal used to separate the values of the fields of the same row. The default value is a whitespace.

  • field name. Required. Field which contains the values to concatenate.

Examples

Consider the following view V:

a

b

c

group1

1

one

group1

2

two

group1

NULL

three

group2

4

four

Example 1

SELECT A, GROUP_CONCAT(':', c)
FROM v
GROUP BY a

a

group_concat

group1

one:two:three

group2

four

Example 2

SELECT A, GROUP_CONCAT(':', ';', b, c)
FROM v
GROUP BY a

a

group_concat

group1

1;one:2;two

group2

4:four

As the field B is NULL in the third row, GROUP_CONCAT ignores all the fields of that row. That is, it ignores the value three.

Example 3

SELECT a, GROUP_CONCAT(false, ':', ';', b, c)
FROM v
GROUP BY a

a

group_concat

group1

1;one:2;two:;three

group2

four

As the parameter ignoreNulls is false, GROUP_CONCAT does not ignore the value of the field C of the third row (three), even if the value of the field B of that row is NULL. In this case, NULL values are treated like empty characters.

LAST

Description

The LAST function returns the last value of a field of each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

LAST ( <field name:identifier> )
  • field name. Required. Field name of the view.

Examples

Consider the following view V:

a

b

group1

one

group1

two

group1

NULL

group2

four

Example 1

SELECT LAST(b)
FROM v

last

four

Example 2

SELECT a, LAST(b)
FROM v
GROUP BY a

a

last

group1

NULL

group2

four

LIST

Description

The LIST function returns an array with all the values of a specified field.

It has the same behavior as the function NEST (section NEST) when invoked with a single field as argument.

Syntax

LIST ( <field name:identifier> )
  • field name. Required. Field of the view.

Examples

Consider the following view V:

a

b

group1

one

group1

two

group1

NULL

group2

four

Example 1

SELECT LIST(b)
FROM v

list

Array { one, two, NULL, four }

Example 2

SELECT a, LIST(b)
FROM v
GROUP BY a

a

list

group1

Array { one, two, NULL }

group2

Array { four }

MAX

Description

The MAX function returns the highest value of a field for each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

MAX ( <expression> )
  • expression. Required. Expression of type date (deprecated type), intervaldaysecond, intervalyearmonth, text, time, timestamp, timestamptz or any of the numeric data types.

The return type is the type of the input expression.

When the field is of text type, the function compares the Unicode value of each character of the value.

Examples

Consider the following view V:

a

b

c

group1

1

Virtual DataPort

group1

2

Data Catalog

group2

4

denodo

Example 1

SELECT MAX(b)
FROM v

max

4

Example 2

SELECT a, MAX(b)
FROM v
GROUP BY a

a

max

group1

2

group2

4

Example 3

SELECT MAX(c)
FROM v

max

denodo

In this example, the result is “denodo” because the first letter has the highest Unicode value of all the values: d = 100, V = 86, etc.

MEDIAN

Description

The MEDIAN function returns the middle number of a field for each group of values, taken as the average of the two middle numbers when the group has an even number of values.

This function does not take into account NULL values to calculate the result.

This function ignores the modifiers ALL and DISTINCT.

Syntax

MEDIAN ( <expression> )
  • expression. Required. Expression of type date (deprecated type) , time, timestamp, timestamptz or any of the numeric data types.

Examples

Consider the following view employee:

id

first_name

start_date

salary

01

Jason

25-JUL-96

1234.56

02

Alison

21-MAR-76

6661.78

03

James

12-DEC-78

6544.78

04

Celia

24-OCT-82

2344.78

05

Robert

24-OCT-82

2334.78

05

Jason

30-JUL-87

2224.50

Example 1

SELECT MEDIAN(salary), MEDIAN(start_date)
FROM employee

median(salary)

median(start_date)

2339.78

1983-06-05

Example 2

SELECT first_name, MEDIAN(salary), MEDIAN(start_date)
FROM employee
GROUP BY first_name

first_name

median(salary)

median(start_date)

Alison

6661.78

1976-03-21

Celia

2344.78

1982-10-24

James

6544.78

1978-12-12

Jason

1729.53

1992-01-26

Robert

2334.78

1984-01-15

MIN

Description

The MIN function returns the lowest value of a field for each group of values.

This function ignores the ALL/DISTINCT modifier.

Syntax

MIN ( <expression> )
  • expression. Required. Expression of type date (deprecated type), intervaldaysecond, intervalyearmonth, text, time, timestamp, timestamptz or any of the numeric data types.

When the field is of text type, the function compares the Unicode value of each character of the value.

Examples

Consider the following view V:

a

b

c

group1

1

Virtual DataPort

group1

2

Data Catalog

group2

4

denodo

Example 1

SELECT MIN(b)
FROM V

min

1

Example 2

SELECT a, MIN(b)
FROM v
GROUP BY a

a

min

group1

1

group2

4

Example 3

SELECT MIN(c)
FROM v

min

Data Catalog

In this example, the result is “Data Catalog” because the first letter has the lowest Unicode value of all the values: D = 68, d = 100, V = 86, etc.

NEST

Description

The NEST function returns an array with the values of the selected fields. Its result is inverse to the result of the FLATTEN views (see section FLATTEN View (Flattening Data Structures) for more information about FLATTEN views).

Syntax

NEST( <field name:identifier> [, <field name:identifier> ]*):array
NEST(*)
  • field name. The name of a field. Using (*) is equivalent to pass all the fields of the view to the function.

Example

Consider the view V:

int_sample

text_sample

register_sample

1

A

Register { hello , how’re you }

1

B

Register { hello, good bye }

2

C

Register { another string, last string }

SELECT int_sample, NEST(text_sample, register_sample) AS nest_sample
FROM V
GROUP BY int_sample;

int_sample

nest_sample

1

Array [ A, Register { hello , how’re you }

B, Register { hello, good bye }

]

2

Array [ C, Register { another string, last string } ]

STDEV

Description

The STDEV function returns the sample standard deviation of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

STDEV( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region

state

revenue

South

Alabama

53168

South

Mississippi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

SELECT region, STDEV(revenue)
FROM revenue
GROUP BY region

region

stdev

South

37709.24377832752

Northeast

29016.36924794922

STDEVP

Description

The STDEVP function returns the population standard deviation of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

STDEVP( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region

state

revenue

South

Alabama

53168

South

Mississippi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

SELECT region, STDEVP(revenue)
FROM revenue
GROUP BY region

region

stdevp

South

30789.46861437455

Northeast

23691.76628188695

SUM

Description

The SUM function returns the sum of all non-null values of a field for each group of values.

Syntax

SUM ( <expression> )
  • expression. Required. Any numeric data type.

The return type is the type of the input expression.

Examples

Consider the following view V:

a

b

group1

1

group1

2

group1

NULL

group2

4

Example 1

SELECT SUM(b)
FROM v

sum

7

Example 2

SELECT a, SUM(b)
FROM v
GROUP BY a

a

sum

group1

3

group2

4

VAR

Description

The VAR function returns the sample variance of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

VAR( <expression> ) : decimal
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region

state

revenue

South

Alabama

53168

South

Mississippi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

SELECT region, VAR(revenue)
FROM revenue
GROUP BY region

region

var

South

1.4219870663333333E9

Northeast

8.419496843333333E8

VARP

Description

The VARP function returns the population variance of each group of values.

This function does not take into account NULL values to calculate the result.

Syntax

VARP( <expression> ) : double
  • expression. Required. Any numeric data type.

Example

Consider the view sales_by_region:

region

state

revenue

South

Alabama

53168

South

Mississippi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

SELECT region, VARP(revenue)
FROM revenue
GROUP BY region

region

varp

South

9.479913775555555E8

Northeast

5.612997895555555E8

Add feedback