# Aggregation Functions¶

Aggregation functions are used in `SELECT` statements to return one single value for every group of tuples obtained as result of a grouping operation.

These functions receive as a parameter an expression indicating the name of the field to which it is applied. This parameter can optionally be preceded by one of two modifiers: `ALL` or `DISTINCT`. These modifiers affect the semantics of certain aggregation functions, applying them to all tuples in a group or only to those with a different value for the field in question.

Aggregation functions supported by Virtual DataPort

## AVG¶

Description

The `AVG` function returns the average of the non-null values of a field of the table.

If all the values of the field are `NULL`, the function returns `NULL`.

Syntax

```AVG( <expression> ) : double
```
• `expression`. Required. Expression of type `int`, `long`, `float`, `double` or `decimal`.

Examples

Example 1

Consider the following view `ITEMS`:

item

price

A

3.45

B

9.99

C

4.99

```SELECT AVG(price) AS average_price
FROM ITEMS
```

average_price

6.1433333333333335

Example 2

Consider the following view `ITEMS_2`:

item

price

A

3.45

B

9.99

C

4.99

D

NULL

```SELECT AVG(price) AS average_price
FROM ITEMS_2
```

average_price

6.1433333333333335

## COUNT¶

Description

The `COUNT` function returns the number of tuples of the result of a selection operation.

If the parameter is `*`, it returns the number of tuples.

If the parameter is a field, it returns the number of non-null values

It is applicable to any type of field.

This function can be used in queries without a `GROUP BY` clause, but in that case, it can only be used with `*`.

Syntax

```COUNT( <count parameter> )

<count parameter> ::=
<field name:identifier>
| *
```
• `count parameter`. Required. It can be either a field name or `*`.

Examples

Consider the following view `ITEMS`:

item

price

A

3.45

B

9.99

C

4.99

D

NULL

Example 1

```SELECT COUNT(*)
FROM items
```

count

4

Example 2

```SELECT COUNT(price)
FROM items
```

count

3

## FIRST¶

Description

The `FIRST` function returns the first value of a field of each group of values.

This function ignores the `ALL`/`DISTINCT` modifier.

Syntax

```FIRST ( <field name:identifier> )
```
• `field name`. Required. A field of the view.

Examples

Consider the following view `V`:

A

B

group1

one

group1

two

group1

NULL

group2

four

Example 1

```SELECT FIRST(b)
FROM v
```

first

one

Example 2

```SELECT a, FIRST(b)
FROM v
GROUP BY a
```

a

first

group1

one

group2

four

## GROUP_CONCAT¶

Description

The `GROUP_CONCAT` function returns, for each group, a string with the concatenation of all the field/fields values of each group.

Syntax

```GROUP_CONCAT( [ <row separator:text> [, <field separator:text> ] ],
<field name:identifier> [, <field name:identifier>]* )
GROUP_CONCAT( <ignore null:boolean> , <row separator:text>, <field separator:text>,
<field name:identifier> [, <field name:identifier>]* )
```
• `ignore null`. Optional. If `true` and any of the fields `field name` of a row are `NULL`, `GROUP_CONCAT` ignores all the fields of that row. If `false`, no rows are ignored and `NULL` values are treated as empty characters. The default value is `true`.

• `row separator`. Optional. Literal used to separate the values of each row. Default value: `,`.

• `field separator`. Optional. Literal used to separate the values of the fields of the same row. The default value is a whitespace.

• `field name`. Required. Field which contains the values to concatenate.

Examples

Consider the following view `V`:

a

b

c

group1

1

one

group1

2

two

group1

NULL

three

group2

4

four

Example 1

```SELECT A, GROUP_CONCAT(':', c)
FROM v
GROUP BY a
```

a

group_concat

group1

one:two:three

group2

four

Example 2

```SELECT A, GROUP_CONCAT(':', ';', b, c)
FROM v
GROUP BY a
```

a

group_concat

group1

1;one:2;two

group2

4:four

As the field `B` is `NULL` in the third row, `GROUP_CONCAT` ignores all the fields of that row. That is, it ignores the value `three`.

Example 3

```SELECT a, GROUP_CONCAT(false, ':', ';', b, c)
FROM v
GROUP BY a
```

a

group_concat

group1

1;one:2;two:;three

group2

four

As the parameter `ignoreNulls` is `false`, `GROUP_CONCAT` does not ignore the value of the field `C` of the third row (`three`), even if the value of the field `B` of that row is `NULL`. In this case, `NULL` values are treated like empty characters.

## LAST¶

Description

The `LAST` function returns the last value of a field of each group of values.

This function ignores the `ALL`/`DISTINCT` modifier.

Syntax

```LAST ( <field name:identifier> )
```
• `field name`. Required. Field name of the view.

Examples

Consider the following view `V`:

a

b

group1

one

group1

two

group1

NULL

group2

four

Example 1

```SELECT LAST(b)
FROM v
```

last

four

Example 2

```SELECT a, LAST(b)
FROM v
GROUP BY a
```

a

last

group1

NULL

group2

four

## LIST¶

Description

The `LIST` function returns an array with all the values of a specified field.

It has the same behavior as the function `NEST` (section NEST) when invoked with a single field as argument.

Syntax

```LIST ( <field name:identifier> )
```
• `field name`. Required. Field of the view.

Examples

Consider the following view `V`:

a

b

group1

one

group1

two

group1

NULL

group2

four

Example 1

```SELECT LIST(b)
FROM v
```

list

Array { one, two, NULL, four }

Example 2

```SELECT a, LIST(b)
FROM v
GROUP BY a
```

a

list

group1

Array { one, two, NULL }

group2

Array { four }

## MAX¶

Description

The `MAX` function returns the highest value of a field for each group of values.

This function ignores the `ALL`/`DISTINCT` modifier.

Syntax

```MAX ( <expression> )
```
• `expression`. Required. Expression of type `date`, `text` or any of the numeric data types.

The return type is the type of the input expression.

When the field is of text type, the function compares the Unicode value of each character of the value.

Examples

Consider the following view `V`:

a

b

c

group1

1

Virtual DataPort

group1

2

Data Catalog

group1

NULL

ITPilot

group2

4

denodo

Example 1

```SELECT MAX(b)
FROM v
```

max

4

Example 2

```SELECT a, MAX(b)
FROM v
GROUP BY a
```

a

max

group1

2

group2

4

Example 3

```SELECT MAX(c)
FROM v
```

max

denodo

In this example, the result is “denodo” because the first letter has the highest Unicode value of all the values: d = 100, V = 86, etc.

## MIN¶

Description

The `MIN` function returns the lowest value of a field for each group of values.

This function ignores the `ALL`/`DISTINCT` modifier.

Syntax

```MIN ( <expression> )
```
• `expression`. Required. Expression of type `date`, `text` or any of the numeric data types.

The return type is the type of the input expression.

When the field is of text type, the function compares the Unicode value of each character of the value.

Examples

Consider the following view `V`:

a

b

c

group1

1

Virtual DataPort

group1

2

Data Catalog

group1

NULL

ITPilot

group2

4

denodo

Example 1

```SELECT MIN(b)
FROM V
```

min

1

Example 2

```SELECT a, MIN(b)
FROM v
GROUP BY a
```

a

min

group1

1

group2

4

Example 3

```SELECT MIN(c)
FROM v
```

min

Data Catalog

In this example, the result is “Data Catalog” because the first letter has the lowest Unicode value of all the values: D = 68, d = 100, V = 86, etc.

## NEST¶

Description

The `NEST` function returns an array with the values of the selected fields. Its result is inverse to the result of the FLATTEN views (see section FLATTEN View (Flattening Data Structures) for more information about FLATTEN views).

Syntax

```NEST( <field name:identifier> [, <field name:identifier> ]*):array
NEST(*)
```
• `field name`. The name of a field. Using `(*)` is equivalent to pass all the fields of the view to the function.

Example

Consider the view `V`:

int_sample

text_sample

register_sample

1

A

Register { hello , how’re you }

1

B

Register { hello, good bye }

2

C

Register { another string, last string }

```SELECT int_sample, NEST(text_sample, register_sample) AS nest_sample
FROM V
GROUP BY int_sample;
```

int_sample

nest_sample

1

Array [ A, Register { hello , how’re you }

B, Register { hello, good bye }

]

2

Array [ C, Register { another string, last string } ]

## STDEV¶

Description

The `STDEV` function returns the sample standard deviation of each group of values.

This function does not take into account `NULL` values to calculate the result.

Syntax

```STDEV( <expression> ) : double
```
• `expression`. Required. Expression of type `int`, `long`, `float` or `double`.

Example

Consider the view `sales_by_region`:

region

state

revenue

South

Alabama

53168

South

Missisipi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

```SELECT region, STDEV(revenue)
FROM revenue
GROUP BY region
```

region

stdev

South

37709.24377832752

Northeast

29016.36924794922

## STDEVP¶

Description

The `STDEVP` function returns the population standard deviation of each group of values.

This function does not take into account `NULL` values to calculate the result.

Syntax

```STDEVP( <expression> ) : double
```
• `expression`. Required. Expression of type `int`, `long`, `float` or `double`.

Example

Consider the view `sales_by_region`:

region

state

revenue

South

Alabama

53168

South

Missisipi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

```SELECT region, STDEVP(revenue)
FROM revenue
GROUP BY region
```

region

stdevp

South

30789.46861437455

Northeast

23691.76628188695

## SUM¶

Description

The `SUM` function returns the sum of all non-null values of a field for each group of values.

Syntax

```SUM ( <expression> )
```
• `expression`. Required. Expression of type `int`, `long`, `float`, `double` or `decimal`.

Examples

Consider the following view `V`:

a

b

group1

1

group1

2

group1

NULL

group2

4

Example 1

```SELECT SUM(b)
FROM v
```

sum

7

Example 2

```SELECT a, SUM(b)
FROM v
GROUP BY a
```

a

sum

group1

3

group2

4

## VAR¶

Description

The `VAR` function returns the sample variance of each group of values.

This function does not take into account `NULL` values to calculate the result.

Syntax

```VAR( <expression> ) : double
```
• `expression`. Required. Expression of type `int`, `long`, `float` or `double`.

Example

Consider the view `sales_by_region`:

region

state

revenue

South

Alabama

53168

South

Missisipi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

```SELECT region, VAR(revenue)
FROM revenue
GROUP BY region
```

region

var

South

1.4219870663333333E9

Northeast

8.419496843333333E8

## VARP¶

Description

The `VARP` function returns the population variance of each group of values.

This function does not take into account `NULL` values to calculate the result.

Syntax

```VARP( <expression> ) : double
```
• `expression`. Required. Expression of type `int`, `long`, `float` or `double`.

Example

Consider the view `sales_by_region`:

region

state

revenue

South

Alabama

53168

South

Missisipi

5681

South

Tennessee

80166

Northeast

New York

12945

Northeast

Pennsylvania

69284

Northeast

New Hampshire

53168

```SELECT region, VARP(revenue)
FROM revenue
GROUP BY region
```

region

varp

South

9.479913775555555E8

Northeast

5.612997895555555E8