UNION Clause

The operator UNION combines the result of two queries into a single result set that contains all the rows of both queries.

The rules of SQL standard define that:

  • The number and the order of the columns of the queries involved in the UNION has to be the same.

  • The data types of each field have to be compatible.

  • The UNION does not return duplicated rows. If there are duplicated rows, the UNION removes them. If you want to keep the duplicates, you have to use UNION ALL.

In Denodo, there exist two types of UNION operations:

  • The SQL Union: this operation is compliant with the SQL standard union operator.

  • The EXTENDED Union: the behavior of the EXTENDED UNION operator is different from the SQL standard:

    • The association of fields of the queries is done by name, not by position.

    • If a field is present in one query but not the other, the UNION is executed anyway. The field is added to the result set as well. In the rows of the query that does not return this field, the value of this field is null.

    • The UNION operator of Denodo does not remove duplicates (it behaves like UNION ALL). If you want to remove duplicates, add the clause DISTINCT.

You can specify the type of UNION operation in the query by explicitly using the syntax SQL UNION and SQL UNION ALL for standard SQL Union, and EXTENDED UNION ALL for the extended union.

Since Denodo 8, the ANSI SQL syntax UNION [DISTINCT] and UNION ALL defaults to the SQL UNION operation. This is a change from previous versions, where the ANSI SQL syntax defaults to the Extended Union. You can change this default mapping to make ANSI SQL syntax to be mapped to extended union. To configure the UNION clause to behave as extended union, follow these steps:

  1. Log in to the administration tool with an administration user.

  2. Open the VQL Shell and execute this command:

    SET 'com.denodo.vdb.union.enableStandardSQLUnion' = 'false';
    

    You do not need to restart the Virtual DataPort server to apply this change; it is applied instantly.

Finally, always consider the performance implications of using SQL UNION vs SQL UNION ALL and EXTENDED UNION ALL: on any database and in Denodo, the process of removing the duplicate rows of a UNION is costly because the execution engine has to evaluate all the rows of the result of the union looking for duplicates, instead of returning them immediately.

If you know that all the rows returned by each branch of the UNION are unique, execute SQL UNION ALL or EXTENDED UNION ALL instead of SQL UNION, because the query will be run faster.