Virtual DataPort

Using Elements from Other Database

You can now create elements that reference elements from other Virtual DataPort databases.

For example, you can create a join view in a database with views that belong to other databases.

For example,

CREATE VIEW full_order_detail AS SELECT ...
FROM
    order_detail
INNER JOIN
    warehouse.product AS product
ON order_detail.product_id = product.product_id

This statement creates the view full_order_detail in the database that you are connected. This view joins the view product of the database warehouse with the view order_detail of the current database.

This feature applies to views and other elements of Virtual DataPort. E.g., web services can publish views from several databases; associations can link two views from different databases, etc.

One of the uses of this feature is having a database with the canonical views of the organization and, be able to reference these views from other databases.

This feature removes the need of creating a JDBC data source to the Virtual DataPort server itself and from now on is discouraged because the performance is worse.

This change has certain implications for the import and export process:

  • When you import a VQL file, the commands are processed from top to bottom, as in previous versions. However, now you may have a database A that has a view V1 that is a derived view of a view V2, which belongs to the database B. The command that creates the view V1 fails because the underlying view (V2) does not exist yet.

    Virtual DataPort handles this scenario by storing the commands that fail and re-executing them once the entire VQL file has been processed.

    The section Importing Metadata into a Server of the Virtual DataPort administration guide explains this process in detail.

  • The process of importing an entire database from a Version Control System (VCS), or update some changes has changed from Denodo 5.5.

    When doing a check out from the VCS and a statement fails, the Server continues executing the next one. This is necessary to make it possible to update a database that has elements that depend on elements from another database that does not exist yet.

    In previous versions, when a statement obtained from a VCS failed, nothing was imported.

Administration Tool

The administration tool of Virtual DataPort has been redesigned to simplify the development with Virtual DataPort. The main changes are:

  • Tabs: each element is opened in a different tab so users can easily switch from the definition of one view to another.
  • The Elements Tree displays all the databases of the Server.

Other minor features are:

  • By default, the elements tree now lists the associations.
  • The dialogs to create/edit derived views have an option to project all the fields of a register field (“Project subfields of…”). This is useful if the register has many subfields and the user has to project them all.

Execution Trace

  • When executing a query from VQL Shell, the Tool automatically adds the clause TRACE to the queries so you do not have to.
  • The text trace (not the graphical one) includes the attributes “data source name” and the name of the Virtual DataPort database when the data source is located.

Data Lineage: Show the Implementation Expressions of Interface Views

The dialogs Data lineage and Tree view now show the implementation expressions of interface views.

“Please Wait Dialog” Indicates User is waiting for Transaction

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

When you run a query that is blocked by the Execution Engine because there is an active transaction, the administration tool now displays the message “Waiting for active transaction to finish”. In previous versions it said “Waiting for single user mode to finish”, which was misleading.

This occurs when, after another client started a transaction, you execute an operation that needs to switch the entire Server - not just a database - into single user mode. For example, when you create, modify or delete a user, role, i18n map, etc. When this happens, the Server is not in single user mode, but the Server tries to switch to global single user mode and it cannot do so because there is an active transaction.

The appendix “Single User Mode” of the Virtual DataPort administration guide explains this.

Date and Time Values

In Denodo 6.0, the values coming from fields whose type “Source type property” is DATE or TIME are always treated like date or time values and not like timestamps. Because of it, to process these values, the i18n of the query, the database or the Server will be ignored.

In JDBC base views the source type property is automatically set, but not in the other type of base views. Therefore, it is important to set it if you are dealing with dates or time fields that are not timestamps.

This enhancement will avoid problems in scenarios where the i18n of the Server and the client does not match.

If you are migrating from a previous version and you do not run into these issues or you already fixed them, this enhancement will not affect you. All the queries will continue working correctly.

Example where this feature is useful. Let us say this:

  • You have a JDBC base view that has a field that has the type “DATE” in the database.
  • The value of this field is always “2016-03-14”.
  • The i18n of the Virtual DataPort server is “us_est”.
  • The i18n of the administration tool is “us_pst”.

In Denodo 5.5:

  • If from the administration tool, you run SELECT * FROM view, the option “Internationalize query results” is cleared and the administration tool is located in California, you will obtain “Wed Mar 13 21:00:00 PST 2016” (which is three hours earlier of the day before).
  • When executing the same query from the administration tool and “Internationalize query results” is selected, the value of these field changes depending on the parameter “i18n” of the CONTEXT clause.
  • If you query this view from a JDBC client, you will obtain “2016-03-13” (one day earlier).

In Denodo 6.0:

  • If you query this view from the administration tool and the option “Internationalize query results” is selected, the value will always be “2016-03-14” regardless of the i18n of the query, the database or the Server.
  • If you query this view from the administration tool and “Internationalize query results” is clear, the value will always be “2016-03-14 00:00:00” regardless of the i18n of the query, the database or the Server.
  • If you query this view from a JDBC client, you will always obtain “2016-03-14”.

Support for Windows 10

Virtual DataPort officially supports Windows 10.

JDBC Data Sources

Support for Amazon Redshift

Virtual DataPort now includes a JDBC adapter for Amazon Redshift.

Support for Teradata Version 14 and 15

Virtual DataPort now includes a JDBC adapter for the versions 14 and 15 of Teradata. It can be used in JDBC data sources and for the cache.

Upgrade of the JTDS Driver

The JTDS Driver included with the Denodo Platform has been upgraded to the version 1.3.1 that among other things, add support Kerberos authentication in Linux.

Oracle Adapter Indicates Escape Character

The Oracle adapter now indicates the escape character when delegating the operator LIKE to Oracle.

IBM DB2: Support for Nickname Elements

The IBM DB2 adapter now has support for nickname elements.

Microsoft SQL Server: Support for Pass-Through Authentication Using the Domain of the User

The JDBC data sources to Microsoft SQL Server configured with “pass-through authentication” can now use the domain name that the user provided when connecting to Virtual DataPort.

This works in combination with the Kerberos authentication where sometimes the user name is “user@domain” or “domain/user”.

In previous versions, “pass through session credentials” used only the login and password. However, the connection to SQL server with windows users also requires a domain name. The domain had to be hardcoded in the URL of the JDBC data sources and was the same for all connections. Therefore, a data source was limited to users that belonged to the same domain.

In this version, if the URL of the data source contains a domain and the user logs in Virtual DataPort entering another domain, the data source uses the one provided by the user and ignores the one from the data source.

Data Sources with OAuth Authentication: Tokens as Interpolation Variables

In data sources that retrieve the data from sources with OAuth 2.0 authentication, the refresh token and the access token can now be an interpolation variable.

Excel Data Sources: Allow Selecting Multiple WorkSheets

Excel sources now allow retrieving data from several worksheets from an Excel file. In previous versions, an Excel base view could either obtain data from one worksheet of the file or from all of them, but not from some of them.

See more details about this feature in the section Excel Sources of the Virtual DataPort Administration Guide.

Execution Engine

The Execution Engine contains several improvements in the static optimizations, which modify the execution plan to make it more efficient.

Automatic Data Movement Based on Statistics

The Execution Engine is now capable of generating an execution plan that involves doing data movements, based on the statistics of the views involved in a query. To achieve this, do the follow:

  1. Enable “Automatic cost-based optimization”.
  2. Enable “Automatic simplification of queries”.
  3. The data sources to which the data will be moved temporarily have to have the check box “Can be data movement target” selected (“Read & Write” tab of the “Configuration” dialog).

In previous versions, the execution engine only performed data movements if the user had explicitly configured it in the execution plan of the view.

Improvements in the Cost-Based Optimizations

Improvements in the calculations of the cost of running certain queries involving massively parallel processing databases.

Support for Analytical Functions Delegable to the Source

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

Support for delegating analytical functions (e.g. RANK, DENSE_RANK, NTILE, etc.) to databases that can process them.

This support is only for delegating them to a source. Virtual DataPort cannot actually execute these functions.

Support for ORDER BY with Expressions

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

Support for using expressions in the ORDER BY clause of the queries.

Canceled Queries Are Terminated after Waiting Five Minutes for all the Threads to Finish

The execution engine now terminates queries that have been in the state “Canceled” for more than five minutes.

To run a query, the execution engine, uses one thread to perform the operations of the query (project, select, join, etc.) and one thread per each source involved in the query. For example, a query that joins data from two sources and then filters it will use three threads: one to perform the join and the filtering and two to retrieve data from each source.

When a user cancels a query that involves a JDBC data source, the execution engine requests the thread that is retrieving data from that source to stop, which in turn invokes the method cancel() of the database’s JDBC driver.

Usually, the request to cancel() finishes almost instantly. However, on some occasions, the cancel operation takes a long time to finish. The cancel() method is blocking so the thread cannot do anything else until this method finishes. A thread that invokes a blocking operation cannot be interrupted either.

In this version, once a user cancels a query, the execution engine waits up to five minutes for all the threads to finish. After that, the execution engine considers that the query has finished anyway and removes it from the list of queries that are currently running. Note that the thread will continue waiting for the cancel() method to finish. Once it finishes, the thread will go back to the pool of threads.

In previous versions, Virtual DataPort would wait indefinitely for the cancel() method to finish.

This enhancement applies to JDBC data sources, ODBC data sources and custom wrappers.

Note

A query is listed in the Query Monitor and in the result of the getActiveRequestList() operation of the VDBServerManagementInfo MBean until all the threads started to execute the query are finished. Therefore, if a driver or a custom wrapper do not stop when instructed, a query will be listed there for up to five minutes. In previous version, it could be listed indefinitely.

Performance Improvements

LDAP Authentication

When authenticating an LDAP user, Virtual DataPort now requests less data to the LDAP server thus making the process of authenticating users faster.

Improvements on the Metadata Cache

Virtual DataPort stores the metadata of its elements (data sources, views, associations, etc.) in a Derby database. In addition, it has a cache in memory that stores this information to avoid reading the metadata of an element from this database every time is used.

In previous versions, this cache was reset and disabled during the execution of an action that modified the metadata.

This behavior has changed to keep in this cache the objects that have not been modified. This change decreases the number of times Virtual DataPort will have to read metadata from the Derby database. This reduces the chances of a “lock” occurring when running concurrently requests that need to modify or access metadata.

Custom Functions API: Provide Access to Contextual Information of the Query

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

The custom functions API now provides a method to obtain the following information about the query that is invoking the function:

  • The user that runs the query.
  • The list of roles granted to this user.
  • The name of the database over which query is executed.

Out-of-the-box Stored Procedures: CATALOG_PERMISSIONS

Now any user can execute CATALOG_PERMISSIONS. In previous versions, only administrators could execute it.

Bear in mind that the result will change depending on the privileges granted to the user that executes it.

See more about this in the section CATALOG_PERMISSIONS of the VQL Guide.

Monitoring

Denodo Monitor

The Denodo Monitor now logs all the statements sent to Virtual DataPort (INSERT, UPDATE, DELETE, BEGIN, COMMIT, CREATE, DROP, DESC, etc.) In previous versions, it only logged SELECT and CALL statements.

The main benefits are:

  1. Administrators do not have to enable the requests log. I.e. changing to INFO the log level of the category com.denodo.vdp.requests is no longer necessary.
  2. Administrators will have all the information about requests in a single file all the requests sent to Virtual DataPort. In earlier versions, this information is split between the “queries.log” files of the Denodo Monitor and the “vdp-requests.log” of the directory <DENODO_HOME>/logs/vdp.

In addition, the Denodo Monitor has the following new features:

  • The “vdp-queries.log” now includes the IP address of the client that executed the query. This makes it easier to track down which application executed a particular statement.
  • There is a new monitor: the “Data sources monitor” that logs information about the state of each data source. I.e. number of active requests, current number of connections to the source, etc.

MBean VDBServerManagementInfo: New Operation getRequestById.

The MBean VDBServerManagementInfo has a new operation: getRequestById. It returns information about a specific request that is currently running or waiting to be executed.

The section Operations of the VDBServerManagementInfo MBean of the Virtual DataPort administration guide lists the operations of this MBean.

MBeans TransactionsManagementInfo Events: New Attribute SessionId

The notifications generated by the MBeans TransactionsManagementInfo now have a new attribute: SessionId.

You can find more information about these MBeans in the section Information and Events on Transactions of the Virtual DataPort administration guide.

Version Control System (VCS)

Support for GIT

GIT is now supported.

Cache

New Database Adapters: Vertica 7 and Teradata 14 and 15

There are two new database adapters for the cache engine: Vertica 7 and Teradata 14.

(The Vertica 7 adapter will be available in the second update of Denodo 6.0, scheduled for April 2016).

Changing a Parameter of the Connection

Changing a parameter in the connection to the cache database is now instant.

In previous versions, when you changed anything in the cache settings, Virtual DataPort did this:

  • Checked that the tables that hold the data on the cache database are created.
  • Recreated the pool of connections to the cache database.

These operations, particularly the first one, could take several minutes.

Virtual DataPort no longer does this when you change any of the following parameters:

  • Maintenance period and Maintenance
  • Default time to live
  • Fetch size
  • Default batch insert size
  • Connection pool configuration: the Server recreates the pool of connections, but it does not check the tables on the cache database.

Changing other parameters still involves checking if the cache database holds the necessary tables.

Resource Manager

In an environment with multiple concurrent user sessions that run queries, not all user sessions have the same importance. You may want to give more priority to queries executed by transactional applications, which need an immediate response, than to queries executed to generate a daily report.

The Resource Manager allows you to classify sessions into groups based on the attributes of the session, and to allocate resources to those groups in a way that optimizes resources utilization for your application environment.

The section Resource Manager of the Virtual DataPort administration guide explains how to use it.

Upgrade of the Java Secure Channel Library (JSCH)

The Java Secure Channel (JSCH) library has been upgraded to version 0.1.53, which provides support for more secure algorithms when connecting to SFTP sources. You can do this from DF, JSON and XML data sources.

JDBC Driver

Support for the JDBC 4.1 API

The JDBC driver supports the Java Database Connectivity (JDBC) API version 4.1.

Support for Ignoring the “setAutoCommit” Parameter

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

The Denodo JDBC driver now has the parameter “autoCommit”. If set to “true”, the driver will ignore the invocations to the methods setAutoCommit(), commit() and rollback() of the JDBC API will be ignored.

E.g. jdbc:vdb://localhost:9999/support?autoCommit=true

Use this parameter if you want to make sure that an application does not start transactions inadvertently. Even with this parameter set to true, an application can start and finish transactions by executing the statements BEGIN, COMMIT and ROLLBACK.

See more about this in the section Parameters of the JDBC Connection URL of the Virtual DataPort Developer Guide.

ODBC Driver

The Denodo Platform now includes an ODBC driver based on the PostgreSQL ODBC driver, with changes specifically tailored to Virtual DataPort that improve the performance of the ODBC interface.

RESTful Web Service: Simpler Format for Conditions with Compound Values

The REST web services now provide a simpler way of filtering by compound values. That is you can provide the value of an array using a JSON document. This is useful if the array has many fields and almost all of them are NULL.

Backward Compatibility Between Updates of the Same Major Version

The Virtual DataPort server now keeps backward compatibility with its clients.

That is, a Virtual DataPort administration tool can connect to a Virtual DataPort server that has a more recent update installed.

This also applies to the clients that connect to Virtual DataPort using the Denodo JDBC driver.

This feature is useful in deployments where there is a large number of developers and clients connecting to the same Server.

In previous versions, the Server and its clients had to have the same update installed.

Consider the following:

  • When the Server has a more recent update installed, the user will not be able to use the new features of the administration tool. However, the existing features will still work.
  • The Server keeps backward compatibility with its clients, but not the other way round. For example, an administration tool with an update installed may not work properly when connecting to a Virtual DataPort server without any updates installed. This scenario is not supported.
  • Connecting to a Virtual DataPort server version 6.0 with an administration of Denodo 5.5 or earlier is not supported either because the compatibility is not kept between major versions.

New Version of the Logging Library: Apache Log4j 2

(Feature available in the second update of the Denodo Platform 6.0, scheduled for April 2016).

In the Virtual DataPort server, the logging library (Log4j) has been upgraded to Log4j version 2, which provides significant improvements over its predecessor Log4j 1.x (used in previous versions).

The main benefit is that it provides asynchronous loggers. This means that storing a log message is no longer a blocking operation. This is very beneficial when the log messages are stored in a database.

In previous versions, logging a message was a blocking operation. This was not an issue when the messages were stored in a text file unless the number of messages was very high (i.e. when the log category com.denodo.vdb is set to TRACE). However, it became a performance issues when the messages were stored on a database.

The syntax of the log4j configuration files, although similar, has changed.

The logging library for the other modules is still Log4j 1.x.