Using the Create Base View From Query Option

Applies to: Denodo 8.0 , Denodo 7.0 , Denodo 6.0
Last modified on: 10 Aug 2020
Tags: Best practices JDBC data sources ODBC data sources View creation

Download document

You can translate the document:

Goal

Denodo allows to create base views on top of JDBC and ODBC data sources that, when queried, will execute the SQL query provided in the definition of the base view. This document explains when and how to use this option. The limitations of this functionality will also be explained.

Introduction

Developers have two options to create base views in Denodo from JDBC and ODBC sources:

  • Graphically: this is the usual way of creating JDBC/ODBC views. After creating the data source, the user has to select the tables to create base views.
  • SQL Sentence: by using the option Create From Query, the user specifies a SQL query to be executed when querying the base view. When writing this SQL query, consider the following:
  • This SQL query has to use the syntax of the queried database and not the syntax of Denodo.
  • The query may have interpolation variables, which allow the SQL query sent to the database to be parameterized according to the conditions of the query executed over the base view.
  • The SQL query should specify an alias to all the columns of the query that are the result of an expression that has one or more input parameters and at least one is an interpolation variable. For instance:

SELECT ABS( ROUND( @value ) ) AS value_abs_round… 

If an alias is not defined,  Denodo will automatically generate a name for the corresponding column in the created base view, which can be counter-intuitive and change from one data source to another. 

  • If the SQL query invokes a PL/SQL function using an interpolation variable, you also need to assign an alias to the return value. For instance:

SELECT function(@INPUTVAR) AS value FROM sales

This document is focused on the ‘SQL Sentence’ option. The following sections explain pros and cons of using this option and some advice on when and how to use it.

Interpolation Variables

Interpolation variables allow the SQL query sent to the database to be parameterized according to the conditions of the query executed on the base view. By using these variables, you do not have to provide the value of a field in an expression at design time (when creating the base view). Instead, you will provide the values of these variables at runtime, when you query the view.

For instance, suppose you have a SALES table in a database and you want to create a base view with a SQL Sentence. You could create a base view SALES_VDB by writing a query like this one:

SELECT Function(amount) as AVERAGE FROM SALES

WHERE product_id = @product_id

and then execute it with different values for the product_id field. Note that the query uses a function in the SELECT clause, this is a typical example of the use of the Create Base View From Query option when using a database-proprietary function.

This way, if we execute the following query on the base view in Denodo:

SELECT * FROM SALES_VDB WHERE product_id = 2

The Server will substitute the variable @product_id with the value from the WHERE clause of the query and execute:

SELECT Function(amount) AS average FROM SALES

WHERE product_id = 2

Each interpolation variable must be related with an attribute belonging to the generated base view, so that the variable can obtain its value at run time. If the SQL query used to access the data source returns any field with the same name as the variable, then the variable will be associated to the corresponding base view attribute. If the SQL query does not return any field with the same name as the variable, Denodo will add a new attribute to the base view, with the same name.

WHEREEXPRESSION

Denodo provides a predefined interpolation variable called WHEREEXPRESSION that simplifies the process of creating a base view from an SQL query. At runtime, the Server will replace WHEREEXPRESSION with the condition sent to the base view.

If we create a base view named SALES_VDB with the following SQL query:

SELECT sales_id, product_id, customer_id, profit_margin as margin, StoredProcedure(amount), sales_date

FROM sales

WHERE @WHEREEXPRESSION

we will be able to execute queries like the following on the base view in Denodo:

SELECT *

FROM SALES_VDB

WHERE margin > 0 AND getyear(sales_date) > 2010

Note that the user has to specify the actual name of the fields for the aliases used in the SELECT clause of the query. For instance, the SQL query defines the margin alias for profit_margin. Therefore, the user must specify the profit_margin value for the attribute named margin in the VQL query.

The Server will substitute the variable WHEREEXPRESSION with the condition of the WHERE clause of the query:

SELECT sales_id, product_id, customer_id, profit_margin as margin, StoredProcedure(amount), sales_date

FROM sales

WHERE profit_margin > 0 AND extract (year from sales_date) > 2010

Note that the alias name is translated to its original name when the query is  finally  executed.

Sometimes, you may need to filter the results of the SQL query using fields that are not projected and, therefore, will not appear in the base view schema. Denodo provides a way to do this. A typical example happens when the SQL query used to create the base view uses GROUP BY. For instance, suppose the SALES table has a field store_country that is not included in the schema of a new virtual view SALES_VDB2 defined as:

SELECT product_id, customer_id, sum(profit_margin) as total_margin

FROM sales

WHERE @WHEREEXPRESSION

GROUP BY product_id, customer_id

but you want to filter the results of the SALES table by store_country, you can execute queries like this over the Denodo base view:

SELECT *

FROM SALES_VDB2

WHERE store_country = ‘US’

and they will be translated to queries like this executed on the data source:

SELECT product_id, customer_id, sum(profit_margin) as total_margin

FROM sales

WHERE store_country = ‘US’

GROUP BY product_id, customer_id

In order to make this work, you have to specify STORE_COUNTRY in the option ‘List of field names not included in the above list that can appear in the delegated conditions’ when you create a base view with WHEREEXPRESSION.

To avoid forcing the queries to always have the WHERE clause, Denodo provides the interpolation function ExecuteIfIsNotNull. For more information about this interpolation function and the WHEREEXPRESSION variable, you can check the section Using the WHEREEXPRESSION Variable of the Virtual DataPort Administration Guide.

Please note that the interpolation variable WHEREEXPRESSION cannot be used with ODBC data sources. Only JDBC data sources are allowed.

Best Practices

This section will explain the main pros and cons of using the ‘Create from query’ option and some recommendations about when and how to use it.

Pros

You should use this option when:

  1. You have complex queries already built. Development process can be faster if you want to import the results of a complex query you have already written (for instance, because you use it for another application, as some BI tool for reporting). You will only have to copy & paste it instead of replicating the hierarchy of views in Denodo.
  2. You have to use database-proprietary functions or constructions.
  • If the reason to create a view with a SQL Sentence is because you want to use a function not included in Denodo, and you are going to use it in several views, it is better to create a delegable custom function instead (when possible). Read the section Developing Custom Functions of the Developer Guide to learn how to develop custom functions.
  1. You have an SQL query that has been tuned to perform better than the query that Denodo will execute. For instance, you may want to provide a certain optimization hint to the database using its proprietary syntax.
  2. You have to execute a stored procedure of the database or a query that uses a stored procedure, for databases where this is not graphically supported (currently, only Oracle, SQL Server and DB2 are supported).

Keep also in mind that, when possible you should enable the option ‘Delegate SQL sentence as subquery’(enabled by default starting from Denodo 6.0), located in Options > Search methods > wrapper source configuration in VDP Admin Tool, for the base views created using this option in order to let Denodo delegate more queries to the data source (as shown in Figure 4 and Figure 5). One limitation on this option is explained in the following section.

Cons and software limitations

  1. Using a SQL sentence may be harder to maintain and reuse:
  • For instance, suppose that you have many views that use the same subquery in their definitions; if the subquery changes, you will need to manually modify it in every view. Conversely, if the subquery were a Denodo view, you would only have to do the modification in one place. A better alternative in this case could be using an hybrid approach where the common subquery is encapsulated in a different view (see Figure 1).

Figure 1. Lack of encapsulation vs. hybrid approach

  1. You will not benefit from the tree view, used by, data lineage and propagation of changes functionalities among the tables used in the SQL Sentence. The query is a ‘black box’ for Denodo.
  2. As stated before, in order to delegate more queries over the base view, it is recommended to enable the option ‘Delegate SQL sentence as subquery’. This option has some limitations:
  • Limitations in delegation: if the SQL Sentence includes mandatory fields (i.e, it has interpolation variables) joins with other views from the same data source will not be delegated.
  • With this option enabled, you should not use interpolation variables or WHEREEXPRESSION. If you do, please, make sure you need them because of one of the following two reasons (otherwise, delete them):
  • When you want to parameterize expressions not supported by Denodo. Eg: parameterize the number of divisions in the percentile functions, as shown in Figure 6.
  • When you want to parameterize queries with conditions whose result is different if executed above or under the query (e.g: group by, outer joins). This way you have two options:
  • Use interpolation variables: has the inconvenience that the condition is wired up.
  • Use a WHEREEXPRESSION: the condition is not wired up, but you cannot use the ‘Delegate SQL sentence as subquery’ option.

Examples of use

Analytical function (RANK)

In this example we want to retrieve the sales grouped by customer, and for each group, list the results in a ranking ordered by each sale’s profit. This way, the first tuples of each group will show the best customer’s sales. Our data source is an Oracle database and we want to make use of the rank analytical function. Figure 2 shows how to create a base view in Denodo for this:

Figure 2. Example of SQL Sentence with rank function

Now, we can execute queries like:

SELECT *

FROM rank_of_sales_by_customer

WHERE ranking <= 5

in order to get the top #5 sales grouped by customer.

Figure 3. Result of query: top #5 sales by customer

Figure 4 and Figure 5 show the difference in delegation depending on whether the option ‘Delegate SQL Sentence as Sub Query’ is enabled or not:

Figure 4. Execution trace with ‘Delegate SQL Sentence as Sub Query’ set to false: filtering executed by Denodo

Figure 5. Tree view with ‘Delegate SQL Sentence as Sub Query’ set to true: filtering executed by Oracle

When the ‘Delegate SQL Sentence as Sub Query’ option is enabled, the WHERE clause is delegated to Oracle, so the filter is done directly in the source, with the corresponding positive impact on performance, because less tuples are moved through the network from the sources to Denodo.

Analytical function (NTILE)

In this example we use the function ntile to divide the results of the previous example in a number of tiles (this number is specified by the user). So the query will be:

Figure 6. NTILE function with an interpolation variable

Figure 7. Results of query with 4 tiles

Difference filtering above or under the SQL Sentence view

Results from queries may differ if the query conditions are directly defined in the base view created by means of a SQL Sentence or if they are applied over that query.

For instance, consider the base view ntile_of_sales_by_customer_parameterized (where the filtering by ss_net_profit is directly defined in the SQL query) defined as:

SELECT

        ss_item_sk,

        ss_ticket_number,

      c_customer_sk customer_id,

        c_first_name customer_first_name,

            c_last_name customer_last_name,

            ss_net_profit,

      rank() OVER(partition by c_customer_sk ORDER BY ss_net_profit DESC) AS ranking,

        ntile(@num_of_tiles) OVER(partition by c_customer_sk ORDER BY ss_net_profit DESC) AS ntile

FROM EDW_TPCDS.CUSTOMER

            INNER JOIN ADMIN.STORE_SALES

            ON (c_customer_sk = ss_customer_sk)

WHERE ss_net_profit > @net_profit

GROUP BY

            c_customer_sk,

            c_first_name,

            c_last_name,

            ss_item_sk,

            ss_ticket_number,

            ss_net_profit

If we execute the query:

SELECT * FROM ntile_of_sales_by_customer_parameterized

WHERE num_of_tiles = 4 AND ss_net_profit > 1000

We get the following results:

Figure 8. Filtering before creating groups

The result is different if we execute the same query over the previously created view ntile_of_sales_by_customer:

SELECT * FROM ntile_of_sales_by_customer

WHERE num_of_tiles = 4 AND ss_net_profit > 1000

Figure 9. Filtering after creating groups

This is because in the first case, the tiles are calculated without the tuples with ss_net_profit <= 1000 and in the second case with them (although they are filtered later from the final results).

Questions

Ask a question
You must sign in to ask a question. If you do not have an account, you can register here

Featured content

DENODO TRAINING

Ready for more? Great! We offer a comprehensive set of training courses, taught by our technical instructors in small, private groups for getting a full, in-depth guided training in the usage of the Denodo Platform. Check out our training courses.

Training