Queries Optimization
In this dialog, you can enable several features of the Execution Engine that optimize the execution of queries:
The cost-based optimizations. See section Cost-Based Optimization.
The automatic simplification of queries. If enabled, the Execution Engine tries to simplify queries before executing them. The goal is to decrease the number of operations in the query plan and increase the number of operations that are delegated to the data sources. This optimization is enabled by default. The section Automatic Simplification of Queries explains how this feature works.
The summary rewrite optimization. If there exist summary views enabled for query rewriting, the query optimizer will try to take advantage of the data stored in those summaries.
The data movement optimization.
The parallel processing query optimization. If enabled, the query optimizer can create temporary tables in the database of the selected data source, insert data into them, and execute operations using massively parallel processing (MPP). You can select:
The cache data source
Another JDBC data source (Custom data source)
Whichever you select, the cache data source or the custom JDBC data source, it has to connect to a database that supports parallel processing. The parallel processing section includes several examples of how to set up the MPP engine combined with different cache configurations.
If you enable this optimization, make sure that:
The Virtual DataPort server and the database are in the same network segment, so that data is transferred quickly between both systems.
The check box Use bulk data load APIs is selected in the configuration of the cache or the data source, so that the data is inserted into this database as fast as possible.
The execution engine to process window functions (Only available in Design Studio). Virtual DataPort tries to push down these functions to a data source to delegate them. When that is not possible, Denodo can execute the window functions in an external system by transferring the necessary data to a parallel processing system or to the cache. You can select:
The engine configured for parallel processing (either embedded or external).
The data source configured for caching, if there is no parallel processing engine configured. As window functions are resource-consuming operations, we only recommend enabling this option if the cache database has a parallel processing engine. If there are different cache systems for different databases, the optimizer chooses the one configured for the database that the user executing the query is connected to.
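The selection order described for window functions can be summarized in a small sketch. This is only an illustration of the decision logic as stated in the text; the class `EngineConfig`, its fields, and the function `choose_window_function_engine` are hypothetical names and are not part of any Virtual DataPort API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EngineConfig:
    """Hypothetical model of the dialog settings; not a Denodo API."""
    # Name of the parallel processing engine, if one is configured.
    mpp_engine: Optional[str] = None
    # Assumption: the "cache" option for window functions is modelled as a flag.
    use_cache_for_window_functions: bool = False
    # Cache systems configured per database, plus an optional global cache.
    cache_by_database: Dict[str, str] = field(default_factory=dict)
    default_cache: Optional[str] = None


def choose_window_function_engine(
    config: EngineConfig,
    can_push_down: bool,
    user_database: str,
) -> Optional[str]:
    """Return where the window function would run, following the text above.

    Returns None when no external system applies; what happens in that
    case is not modelled in this sketch.
    """
    # 1. Pushing the function down to the data source is tried first.
    if can_push_down:
        return "data source"
    # 2. Otherwise, the parallel processing engine is used if configured.
    if config.mpp_engine is not None:
        return config.mpp_engine
    # 3. Otherwise, the cache may be used, choosing the cache configured
    #    for the database the user is connected to.
    if config.use_cache_for_window_functions:
        return config.cache_by_database.get(user_database, config.default_cache)
    return None
```

For example, with `cache_by_database={"sales": "sales_cache"}` and the cache option enabled, a user connected to the hypothetical database `sales` would get `sales_cache`, while a user connected to any other database would fall back to `default_cache`.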
Warning
Use Design Studio instead of the Administration Tool to configure the parallel processing engine, because some options are only available in Design Studio.