This document explains best practices for integrating Tableau and Denodo. It was created using Denodo 8.0 20210209 and Tableau 2020.3, so certain capabilities may differ in other versions of the products.
If you already have a set of Tableau dashboards built and pointing to data sources other than Denodo, the first step should be to replace those data sources with a Denodo data source. This is an overview of the steps needed to go through that process:
As mentioned above, after you have your Denodo data sources and views created, you will need to change the data source for your Tableau Dashboard. The best way to do this replacement is using the steps described below:
When using Denodo as a data source for Tableau, the data can be obtained from different places, depending on the features each product provides.
In the case of Tableau, the data can be obtained from Denodo in real time or it can be materialized in a Tableau Data Extract. In the case of Denodo, the data can also be obtained from the sources in real time or it can be materialized using different techniques such as caching or summaries.
The functionalities can be mixed and matched, for example:
Although it will depend on the use case, the recommended architecture is a “Live connection” from Tableau to Denodo in combination with the different Denodo optimizations provided, such as the cost-based optimizer, summaries, and cache. This recommendation enables the maximum level of reusability and minimum maintenance when the architecture includes client applications other than Tableau.
When Tableau accesses a data source in real time, this option is called a “Live connection”. Live connections offer the convenience of real-time updates, with any changes in the data source reflected in Tableau. Live connections also rely on the database for all queries, and unlike extracts, databases are not always optimized for fast performance. With live connections, your data queries are only as fast as the database itself.
When using this option to access Denodo, a query is sent to Denodo and the Denodo optimizer is in charge of sending queries to the data sources or leveraging other Denodo optimizations such as caching and summaries.
Extract files are a local copy of a subset or of the entire data set that you can use to share data with others, work offline, and improve performance. When creating a new data source in Tableau, it is possible to use this option or the “Live connection” explained above.
A very common problem with Tableau dashboards occurs when users try to extract very large data sets from their sources into a Tableau Extract. This can generate a complex query that is hard for the data source to handle and may become a performance issue.
Denodo’s recommendation is that if the “Live connection” performance is satisfactory, there is no need to use Extracts. Also, before using Extracts it is possible to leverage other Denodo optimizations, as mentioned in the section “Avoiding extracts using summaries”, so the optimization can be reused by client applications other than Tableau.
If a Tableau Extract is really needed, Denodo recommends generating it as an Aggregate Extract based only on the fields that will be used (clicking the “Hide All Unused Fields” button) in combination with the “Aggregate data for visible dimensions” feature in Tableau. It is also possible to add extract filters and create incremental extracts by indicating a field, such as a code or a date, that controls the “delta” increment. This considerably reduces the amount of data requested from Denodo, so the extracts are generated more efficiently.
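As an illustration (the view and field names below are hypothetical), an aggregate extract with an extract filter and an incremental date field turns a full-table pull into a much smaller aggregated query against Denodo, roughly like:

```sql
-- Full extract: pulls every row and column from the Denodo view
SELECT * FROM sales_view;

-- Aggregate extract limited to the visible fields, with an extract filter
-- and a date field ("sale_date") acting as the incremental "delta" column
SELECT region, product_category, SUM(amount) AS amount
FROM sales_view
WHERE sale_date > '2021-01-01'
GROUP BY region, product_category;
```

The second query retrieves only the aggregated rows added since the last refresh, which is what makes the extract generation efficient.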
More information on this topic can be found here.
Certain Tableau data sources (like Denodo) support the creation of temporary tables inside the physical data source to improve the performance of a dashboard. An example of when Tableau creates a temporary table is the creation of context filters that are reused in different parts of the dashboard.
By default, Tableau’s Denodo connector has this option enabled, although Tableau will automatically switch to a different method if any error occurs when creating the table or inserting the data.
As Denodo does not store any data by default (Temporary tables are created in the Denodo Cache database), in order to leverage this option in Denodo, the cache needs to be enabled for the server or the virtual database where Tableau is connected. If the cache is disabled, the query will still work as mentioned above but it won’t use the temporary table feature but a different type of query.
If the cache is enabled in Denodo, Tableau will try to insert data into the temporary table in two different ways. One of them is using a SELECT INTO statement which is faster and the second one is running a set of individual INSERT statements which could become a bottleneck when the amount of data that Tableau needs to insert into the temporary table is very big.
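As a rough sketch (table and field names are hypothetical), the two strategies Tableau may use to populate the temporary table look like this:

```sql
-- Strategy 1: a single bulk statement (faster)
SELECT customer_id, region
INTO temp_table_123
FROM customers
WHERE region = 'EMEA';

-- Strategy 2: one INSERT per row (can become a bottleneck for large volumes)
INSERT INTO temp_table_123 (customer_id, region) VALUES (1001, 'EMEA');
INSERT INTO temp_table_123 (customer_id, region) VALUES (1002, 'EMEA');
-- ...repeated for every row Tableau needs to insert
```

When the row count is large, the per-row strategy multiplies round trips to the server, which is why it can dominate dashboard load time.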
Certain functionalities in Tableau (and the way queries are generated) can be modified by creating a custom TDC file. For this particular scenario, the creation of temporary tables can be disabled, forcing Tableau to use a different strategy, by disabling the capability CAP_CREATE_TEMP_TABLES.
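A minimal sketch of such a TDC file is shown below. The exact class, vendor, and driver attribute values must match your connector and Tableau version; the values used here are assumptions, as is the inclusion of CAP_SELECT_INTO alongside CAP_CREATE_TEMP_TABLES:

```xml
<?xml version='1.0' encoding='utf-8' ?>
<connection-customization class='denodo' enabled='true' version='10.0'>
  <vendor name='denodo' />
  <driver name='denodo' />
  <customizations>
    <!-- Disable temporary-table creation so Tableau falls back to another strategy -->
    <customization name='CAP_CREATE_TEMP_TABLES' value='no' />
    <customization name='CAP_SELECT_INTO' value='no' />
  </customizations>
</connection-customization>
```

The file is placed in Tableau’s Datasources directory so the connector picks up the overridden capabilities; see the TDC section referenced below for the exact steps.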
More information on creating a custom TDC File can be found in the “Modifying Denodo queries using TDC” section.
As it is not possible to know beforehand whether the amount of data will be large, we cannot provide a rule of thumb for when this option should be disabled. This document tries to provide as much information as possible so the reader can take the best approach for their scenario.
The Denodo optimizer parses the queries that Tableau sends and sends the optimal queries to the different data sources, in real time.
Due to Denodo’s first-class optimization techniques, in most of the cases this is the only option required and no materialization is needed.
Denodo incorporates a Cache Engine that can store a local copy of the data retrieved from the data sources in an external database such as Oracle, SQL Server, DB2, MySQL, PostgreSQL, Netezza, Teradata, Vertica, Redshift, SQL DW, Athena, Hive, Impala, Presto, SAP HANA, Snowflake, Spark, YellowBrick, etc. This may reduce the impact of repeated queries hitting the data source and speed up data retrieval, especially with certain types of sources.
Although this option may sound similar to Tableau’s Data Extract functionality, it has many benefits such as being able to reuse the contents of the cached data among other queries or other client tools.
This option can be used in combination with Tableau Data Extract although the data will be stored in two different places (Tableau Data Extract and Denodo Cache) and will need to be refreshed accordingly.
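As a hedged illustration (the view name is hypothetical, and the exact options depend on your Denodo version), enabling and preloading the full cache for a view in VQL looks roughly like:

```sql
-- Enable full cache mode for the view
ALTER VIEW sales_view CACHE FULL;

-- Preload the cache and wait until the load finishes before returning
SELECT * FROM sales_view
CONTEXT('cache_preload' = 'true', 'cache_wait_for_load' = 'true');
```

Once loaded, subsequent queries from Tableau (or any other client) are answered from the cache database instead of hitting the original source.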
With Denodo, a user can store the result of a query in a table of an external database with the purpose of providing smart query acceleration when executing other queries. This feature, called summaries, is similar to the concept of aggregate awareness used by reporting tools (BO, MSTR), materialized views in Oracle, or indexed views in SQL Server.
When building the execution plan of a query, the optimizer can identify parts that match the query used to create the summary. If possible, these matches are substituted with accesses to the summary. With a well-defined summary, the performance of the query will be greatly boosted.
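A hedged VQL sketch (view and field names are hypothetical): a summary that pre-aggregates sales by region lets the optimizer rewrite matching queries to read the much smaller materialized table instead of the base data:

```sql
-- Create a summary that materializes the aggregation in the cache data source
CREATE SUMMARY sales_by_region AS
SELECT region, SUM(amount) AS total_amount
FROM sales_view
GROUP BY region;

-- A Tableau-generated query like this one can be transparently
-- rewritten by the optimizer to use the summary instead
SELECT region, SUM(amount) FROM sales_view GROUP BY region;
```

Note that, unlike a Tableau Extract, no change is needed on the Tableau side: the rewrite happens inside Denodo and benefits every client application.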
One of the main use cases of Summaries is to accelerate the execution of analytical queries like the queries sent by Tableau for most of the dashboards. By creating the right summaries, the Tableau dashboards can benefit from reusing them and improve the performance.
The main benefits are:
From a developer perspective there are additional benefits:
More information about Summaries including when to use the Denodo Cache or Summaries can be found here.
In order to identify which summaries should be created based on the current Tableau Dashboards it is possible to follow these simple steps:
Normally, extracts are created for a specific purpose and will not cover all the usage needs of a data source. It is very common to end up with duplicated or almost identical extracts with minor differences, such as a few fields or filters. This requires a lot of processing in Tableau Server and many schedules to keep those extracts updated.
In many scenarios it is possible to create a single summary in Denodo that is reused, rather than independent Tableau Data Extracts. This minimizes the processing done by Tableau Server, simplifies maintenance, and maximizes reusability among Tableau dashboards and queries coming from other client applications.
From a high-level perspective, Denodo’s Optimizer core objectives are:
The navigation for a query execution is as follows:
Based on this picture we can highlight some areas where a few possible “Bottlenecks” may take place:
From a methodology standpoint, to detect bottlenecks in a query, a source (data source + network), the Denodo server, or the client itself, refer to the following article: Best Practices to Maximize Performance IV: Detecting Bottlenecks in a Query.
Essentially the process to follow is:
When using Denodo in combination with Tableau, there are two different ways of debugging what is happening. One is using Tableau’s monitoring tools and the other one is using Denodo’s.
For example, Tableau’s Performance Recorder shows the queries that Tableau sends to Denodo and the total time the dashboard took to render. Denodo’s monitoring tools show the queries sent to the actual data sources and the optimizations Denodo uses to resolve the query sent by Tableau.
The first step to detect issues in a Tableau dashboard is to run a performance recording in Tableau to determine which step is taking longer than expected.
This can be done in two places depending on the tool in use: Tableau Desktop or Tableau Server.
The first step to debug a dashboard on Tableau Desktop is to create a “blank” sheet, because it has to be the first sheet to load when the workbook is opened.
Now you should have a blank sheet as part of your dashboard.
Save the file with the blank sheet, open and close Tableau Desktop.
Note: This step is very important so we can ensure that cache is not being used.
Reopen your Tableau workbook. The blank sheet should be the first thing you see.
Click on Help > Settings and Performance > Start Performance Recording.
Click on the dashboard you want to debug and wait for it to load completely.
After the dashboard loads, stop the performance recorder by clicking Help > Settings and Performance > Stop Performance Recording.
A new Tableau Desktop window will open showing the recorded trace.
The green bars are the queries sent to Denodo. Click on a green bar and the report will show the executed query along with the execution details.
More info on how this process is done on Tableau Desktop can be found here.
In order to record performance in Tableau Server, your user must have the proper permissions, and performance recording needs to be enabled in Tableau Server. More details can be found in the link at the end of this section.
These are the steps required to record the performance of our dashboard:
More info on how this process is done on Tableau Server can be found here.
Denodo provides different tools to understand the queries that are being executed. This section covers them.
The Query Monitor shows the queries actively running in the server. For the purpose of debugging Tableau dashboards this method is not sufficient, as queries are removed from the monitor once they finish.
This method will store in files or a database (depending on the configuration used) all the queries executed in the Denodo Server. The Denodo Monitor can be enabled in the Solution Manager or a standalone Monitor can also be started depending on your testing needs and configuration.
Before running the Tableau dashboard that we want to debug, it is useful to execute the following command in Denodo (e.g. using the VQL Shell) to delimit the beginning of the test:
CALL WRITELOGERROR('Starting Test Dashboard 1')
Although the Diagnostic & Monitoring Tool (DMT) allows you to check the queries actively running (similar to the Query Monitor), it can also be used to generate a diagnostic and visualize the logs obtained from the Denodo Monitor.
The combination of both the Denodo Monitor and the DMT will give you a good understanding of the queries executed by the Denodo Server.
For certain scenarios, it will be necessary to obtain the full query plan generated by the Denodo Server. In order to store individual plans for every single query executed, you will have to enable the saveTrace feature as described here.
When reviewing the Denodo logs, you will find queries other than the SELECT / CREATE TEMP TABLE / INSERT commands reaching the Denodo server. These include DESC MAP, DESC SESSION, CALL GET_VIEW_COLUMNS, DESC VIEWSTATSSUMMARY, etc. These queries appear in the log because Tableau sends additional metadata queries to Denodo so it can optimize its queries and obtain the column types, statistics, and metadata such as PK/FK information.
This is an example of the queries that will be sent by Tableau to Denodo for a common dashboard:
Although it is possible to disable some of the metadata queries using a custom TDC file, the recommendation is not to do so. The reason is that the performance of the Tableau dashboards could be affected by disabling them and it could lead to Tableau generating less optimal queries for Denodo.
Keep in mind that sometimes the problem is not related to the databases or extracts; sometimes the performance issue lies in the Tableau dashboard itself and how it was designed.
It is important to identify whether the majority of the problems occur during query execution or during the compute-layout phase. Dashboards that render many marks typically have a high compute-layout time. In addition, a high connection time (especially on Tableau Server) may mean there are too many network layers between Denodo and Tableau Server, which will impact execution performance. Execution times should normally be similar on Desktop and on Server; differences can indicate that the server is undersized or has hardware problems.
The integration between Tableau and Denodo supports different authentication methods. When using Denodo as part of our architecture, there are different authentication processes happening at different levels:
As seen in number 1 of the diagram above, a Tableau User first connects to Tableau Server. This can be configured using different mechanisms that can be found in the matrix described here.
To do Single Sign On (SSO) between Tableau Server and Denodo (as described in the next section) the options supported are SAML and Integrated Windows Authentication (Kerberos).
After a user is authenticated in Tableau Server, Tableau Server needs to make a connection to the Denodo data source (number 2 of the diagram above). This can also be done using different methods such as Kerberos, a Service account or prompting for the user credentials once per session. You can see the different options below classified by using SSO or not.
Kerberos delegation enables Tableau Server to use the Kerberos credentials of the viewer of a workbook or view to execute a query on behalf of the viewer. This will enable SSO to access the Denodo data source.
To benefit from this option, SAML or Kerberos must be used as the authentication method when the Tableau user logs in to Tableau Server.
Note: the user account of Tableau Server in Active Directory needs to have “constrained delegation” enabled (there is an exception described below) and you have to add the Denodo server to the list of allowed services. Otherwise, Tableau will not be able to obtain the Kerberos credential to connect to Denodo on behalf of the end user.
It is possible to use “standard delegation” instead of “constrained delegation” in the user account of Tableau when the following circumstances apply:
Detailed information about this topic can be found here.
It is possible to use a “service account” to authenticate to Denodo from Tableau Server. This option is easier to implement but all users authenticated by Tableau will log in to Denodo under the same username. This is the option used when the Denodo data source uses “Username and Password” as the “Authentication” option as shown in the screenshot below:
With this method, when a user opens a dashboard for the first time, Tableau will request the username and password to access Denodo. The user only has to enter this information once per session, as Tableau caches the credentials for the duration of the session. This option is configured when publishing a data source separately or embedded in a dashboard, as shown below:
When using this option in a Denodo data source, the Denodo server will use the credentials of the user to connect to this database. The behavior changes depending on the authentication method used to connect to the Server:
Example of a full “pass-through” flow:
Denodo will connect to the data source using a Kerberos ticket generated at the Denodo Server level (not at the Tableau Server level like when using the Pass-Through option explained above).
The Denodo server will connect to the data source using the login and password specified.
This section covers some of the common errors that can be found when integrating Tableau and Denodo together.
There could be scenarios where the query generated by Tableau does not work as expected and returns an error when accessing Denodo. In those scenarios it will be useful to provide your Denodo or Tableau Support Agents the queries that are being generated by Tableau, the versions of both products, a VQL file containing the Denodo metadata and a sample of Tableau’s .TWB file if possible.
For these scenarios, one useful test is to use a generic ODBC connection to Denodo instead of the certified Denodo adapter. The purpose of this test is to validate whether the query syntax Tableau uses with the generic ODBC adapter makes the query work, so the Support Agent can pinpoint the issue faster.
Using the generic ODBC connector is only recommended for testing/debugging purposes and Tableau will not provide support to customers if this generic connector is used when accessing Denodo instead of the official Denodo connector.
It is very common for users to share Tableau dashboard URLs after a dashboard loads (e.g. during a POC or testing phase). Those URLs usually contain an iid parameter with the session id, similar to:
When the iid parameter is included as part of the URL, Tableau may try to reuse a session that no longer exists and will return an error.
Instead, remove the iid parameter before sharing the URL so Tableau generates a new session id that works as expected. The URL should look like this:
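As a hypothetical illustration (the server and workbook names below are invented), sharing the URL without the trailing :iid parameter:

```
Before: https://tableau.example.com/#/views/SalesWorkbook/Dashboard1?:iid=3
After:  https://tableau.example.com/#/views/SalesWorkbook/Dashboard1
```

With the parameter removed, opening the link starts a fresh Tableau session instead of trying to reuse an expired one.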
Tableau allows you to modify how queries are sent to Denodo by creating a custom TDC file containing different configuration parameters. By default, Tableau’s Denodo connector has the proper values and there is no need to modify them.
For certain scenarios, such as the creation of temporary tables (as described above), the modification of the fetch size (as explained here), or when debugging a report or dashboard that fails, it can be useful to know how to modify those properties following the steps in the Denodo Community here or in Tableau’s official documentation.
Although it is not official, this website has a comprehensive list of parameters that can be used in a TDC file.
As mentioned in the section How to migrate a Tableau Dashboard to use Denodo as data source of this document, the recommended way to change a Dashboard’s Data Source is using the Replace Datasource Method, mapping the new data source and then use the Replace function.
Sometimes, users try to edit the original data source and change the connection parameters to point to the new source without following the suggested steps. Doing this may result in errors, as Tableau may use the wrong connector to map the new source, and formulas may not be migrated correctly.
To do this operation safely, always use the Replace Datasource technique described in this document.
Although Tableau and Denodo work together continuously to minimize the impact of upgrading either product, the way certain queries are generated by Tableau may change between versions. In most cases these changes improve or fix the queries Tableau sends to Denodo, although there could be scenarios where that is not the case. Denodo’s recommendation is to always test your Tableau dashboards after an upgrade (of Tableau or Denodo) following your current testing procedures.
In the same way, an issue with a dashboard or report may already be fixed in a more recent version of Tableau (or Denodo), so when debugging an issue it is always recommended to test on the latest Tableau/Denodo version if possible.
If you find any type of issue in your dashboards or reports, please always specify the exact versions of the products that worked / stopped working to your Denodo or Tableau Support Agents when reporting a support case in order to identify the issue faster.
Additional information about Tableau and Denodo can be found in the links provided below: