DATA VIRTUALIZATION BASICS

Welcome to the Denodo Platform Basic Tutorials! All new technologies require some kind of training before you can use them capably on your own. Here you will find a complete set of lessons that briefly describe the main topics you need to know to start your Data Virtualization project. To get started, please download the sample data sources included in the Tutorial Files, as we will be using them throughout the tutorials to demonstrate the features of the Denodo Data Virtualization software.

First you have to stop and understand the problem before starting your project. Let's take the following scenario:

  • You are working in a company that has information about its customers (CRM) stored in a MySQL database.
  • The complete billing information for any customer is exposed in an internal Web Service.
  • End users do not want to use different applications to get available information of customers (CRM, Sales Application, etc.).
  • The IT department does not like the idea of creating a specific application for this business need and it would like to reuse this customer global view, if possible, in any other current application.

We said in the previous paragraph: "End users do not want to use different applications to get all the information of customers", so the problem here is to connect disparate data to create a single view of the customers, and the IT deparment wants that single view to be reusable.

If you follow a bad design (not reusing components, creating ad-hoc code/applications, etc.) your systems will grow as in the image below. This is a typical example of exactly what you do not want:

Data virtualization creates a single virtual layer that connects disparate data and provides unified access for consuming applications. These applications will use the semantic components defined in the virtual layer and reuse them as needed. In this way, your applications will be independent from the physical sources where the data is stored.

At the end of this Tutorial you will learn that the Denodo Platform provides:

  • Easy to generate data services.
  • Data Services independent of the physical source(s).
  • A single point to control your data sources.
  • Short and Agile development cycles.
  • Little to no coding!
  • Intuitive solutions to simple needs.
  • Reusabilty of your models by all clients

Before starting this tutorial, make sure you have all the necessary materials by checking the Installation & Bootstrapping section.

If you have already installed all the components, please go ahead to the 5. First Steps section.

Pre-Installation Requirements

This section, Installation & Bootstrapping, shows the prerequisites to start using Denodo and instructions for configuring your Denodo environment and starting to play with Data Virtualization.

Links are mentioned in the notes:

  1. https://community.denodo.com/express
  2. http://dev.mysql.com/downloads/mysql/
  3. http://dev.mysql.com/downloads/connector/j/
  4. http://dev.mysql.com/downloads/tools/workbench/
  5. Tutorial files

Installation Steps

Before you start, be sure you have your development environment set up. You need to:

  1. Install Denodo into a directory (avoid using the %Program Files% folder). This directory will be referred to as <DENODO_HOME> throughout this tutorial.
  2. Import the mysql-connector-java-.jar in to Design Studio. The Denodo Platform does not include the JDBC driver for this adapter. So, upload the jar files of the driver using Design Studio wizard 'File -> Extension management' and selecting "mysql-5" as version.

Refer to the Importing Extensions guide to get more information.

  1. Install & configure the database:
  1. Install MySQL server.
  2. Start MySQL and launch the MySQL Workbench application.
  3. Connect to your MySQL server and then open the /MySQL/schema.sql script by choosing "Open SQL Script" from the File menu.
  1. Once the script has been opened, click on Execute (you can use any other method to load the database).
  2. After doing this, you should see a new database schema called "acme_crm" with three tables defined (address, client and client_type). Test MySQL by logging in to the acme_crm database with the credentials: acme_user / acme_user
  1. Install a web server:
    We are going to use Jetty to run some of the examples in the tutorials:
  1. Go to /jetty and run: java -jar start.jar from the command line. If you do not have a Java Virtual Machine installed on your system you can use the JVM installed with the Denodo Platform under /jre/bin.
  2. Test the billing Web Service to see if it has been properly deployed, direct the web browser to: http://localhost:8080/billing/services. If successfully deployed, you will see there a list of available services "BillProvider", "AdminService", and "Version", along with their exposed methods and an option to view the WSDL descriptor.

This tutorial will incrementally build a set of views in Denodo to learn the basic concepts about Data Virtualization, it is important to follow each step.

Denodo Installation

At this point, you should have already downloaded the Denodo Express installation package from your user account.

The installation package is a .zip file. After decompressing the package you will see the files shown in the image below:

To start the installation GUI:

  • Windows OS: right-click the install.exe file and select Run as administrator.
  • Linux OS: execute install.sh.

The installer will show up.

After accepting the terms of the license, you have to select the installation directory (for example: C:/Denodo/8.0 or /opt/denodo/8.0).

If you already have a Denodo license file, you can select it clicking on the "Browse" button. Otherwise, you can install the license later from the Denodo Control Center

In the next step you have to select the modules to be installed. This tutorial covers every module so we suggest installing them all, but you only need Virtual DataPort to get started.

You can leave the rest of the options with their default values and complete the installation.

Once the installation is complete, you can choose to create a desktop shortcut that can be used to start the Denodo Control Center.

If you did not select any Denodo license during the installation process, you can do it from the Denodo Control Center when you first start the program. (By first clicking on the Configure button and then selecting Use a license file, a new dialog will be opened to select the license file and confirm its installation.


That's all!

Thanks for installing Denodo Express! Now it's time to start playing with Denodo.

Index

In this first tutorial we will cover the most basic functionality to use the Denodo Platform.

The list of topics is the following:

  • Start Denodo server and applications.
  • Discover the Web Design Studio.
  • Create a development database.
  • Create folders for organizing your database.
  • Import relational databases.
  • Execute queries.
  • Create a simple combination view.

Launching Denodo

Denodo is a global solution for heterogeneous and dispersed data source integration using a virtual approach. It can connect to a wide range of data sources like relational databases, web services, XML documents, flat files, multidimensional databases, JSON sources, etc.

Denodo will create wrappers on top of those data sources to create a common interface to access them. Then, a user can combine the data coming from different data sources by defining views, using the Administrator Tool GUI.

The diagram below shows the general architecture of the Denodo Platform:

Control Center

After the Denodo installation, a desktop icon is generated for Denodo 8.0.

First, we have to double-click on that icon to launch the Denodo Platform Control Center.

  1. Click on Virtual DataPort in the top-left.
  2. Click on the blue play button next to Virtual DataPort Server (the first one from the top) to start the Virtual Dataport Server.
  3. When the text before that play button says "Running", start the Web Design Studio by clicking the "Start" button and click on the URL.

Web Design Studio

The Web Design Studio allows the development and administration of your Data Virtualization projects. Specifically, you can perform the following tasks:

  • Create/Edit/Drop Denodo Virtual Databases.
  • Create/Edit/Drop Data Sources.
  • Create/Edit/Drop Views.
  • Publish Data Services.
  • Execute Queries.
  • Add Extensions.
  • Configure the Cache System.
  • Import/Export Metadata.
  • Configure the Denodo Server.

At the end of the previous section you launched the Design Studio. The first screen that the application shows is a login dialog; the credentials you type here will be used to connect to a running Denodo server.

The installation process of the Denodo Platform creates one user by default: Login: admin / Password: admin. Input these credentials now to connect to the Denodo Server.

Check the URL format: //localhost:9999/admin

  • Server (localhost): server name or IP address where Denodo is installed.
  • Port (9999): Denodo port number (this is the default value).
  • Database (admin): database name (by default, "admin" database was created at installation time).

Once you fill the login form, click on the Connect button. The Web Design Studio will show an empty admin and itpilot databases.

Now you can see the four main areas of the Web Design Studio:

  • Menu Bar: where you can find the options for all the sections of the administration tool.
  • Elements Tree: this panel on the left will show a tree with the different components (data sources, views, etc.) created in the database.
  • Quick Search: this box allows the user to filter specific elements in the Elements Tree.
  • Workspace: this is the main panel where the selected element from the Elements Tree and opened views will be shown.

Creating a Virtual Database

We saw in the previous section that we need valid credentials to connect to a Denodo database. In our case, we used the default admin user to connect to the admin database.

Now, we are going to learn how to create another database.

The Denodo server can contain different virtual databases. A virtual database is a schema comprised of data sources, views, stored procedures, web services, etc. Each virtual database is independent of the rest of the virtual databases created in the Denodo server (and different users can have different privileges for each virtual database).

We are going to create a new database called tutorial.

  1. First, click on Administration in the Menu Bar and select the Database Management option.

  1. In the workspace, you will see the predefined Denodo databases (the databases admin and itpilot cannot be dropped):
  1. admin: default database for Denodo Virtual DataPort.
  2. itpilot: default database for Denodo ITPilot.

  1. You are going to start a new project, so you will need to create a new database (this is a best practice):
  1. Click on the New button.
  2. Specify the name of the database: tutorial.
  3. Click Ok.

That's all!

Now, you can follow this Tutorial using this new database, so disconnect your current session (User Icon > Logout admin) and log in again into the tutorial database:

  • Login: admin
  • Password: admin
  • URI: //localhost:9999/tutorial

When changing the user, it is not necessary to disconnect and reconnect. Another option is to navigate through the Elements Tree and simply selecting the tutorial database.

Organizing the Database

Now it is time to create elements in our virtual database. But wait, first we have to stop and think about what are good practices when creating elements. One of the best practices is to have good organization of elements inside of our database.

We are using the Denodo Web Design Studio which offers the option to organize elements inside folders in the Elements Tree, making it easier to work with them.

Denodo Web Design Studio is the new web interface for developers. You can create data sources, base views, derived views, publish web services through Design studio and the layout is similar to the Virtual DataPort Administration Tool. As we have already started the Design studio, lets continue to create folders.

Creating a folder

To create a new folder follow these steps:

  1. Double-click on the database name and click on the dotted lines > New > Folder

  1. Name the folder 1 - first steps.
  2. Later you can simply drag & drop elements into this folder.

In the next section you will learn how to create elements inside this folder.

Connecting to Sources

The MySQL database that you installed in the Installation & Bootstrapping section contains the data of the CRM of a company. This data is split into several tables:

  • a table for client data,
  • a table for client types (a client can be residential or business)
  • and a table for addresses.

The diagram of the organization of this database is the following:

Your goal here is to combine this data using a Data Virtualization approach, this will enable us to create views that are more meaningful for the consumers of the data without having to modify the underlying data source (in many real-world scenarios we are not the owners of the data, just consumers, so changing the data schema will not be possible).

The first step we need to follow to virtualize this relational database is to connect to it using the Denodo Platform. Connecting to the data source will allow us to introspect it and graphically select which of its tables are to be virtualized within the Denodo Platform. Once connected, we will create one base view per table in the CRM.

A base view is a representation, in the Denodo Platform, of existing data in a remote data source. This base view is only metadata that describes how the information is stored and accessed in the original datasource but it does not contain any data. When a base view is queried, the data is retrieved in real-time from the source, unless it is cached (see the Performance tutorial).

Importing Relational Databases

First, let's create the data source for the CRM database. In this case, we will create a JDBC connection to MySQL, but other possibilities are Oracle, Microsoft SQL Server, DB2, PostgreSQL, Hive, Nettezza, Teradata, Denodo VDP, etc.

The recommended way to connect to databases when using Denodo is through JDBC (this is an acronym referring to Java Database Connectivity), so let's start this tutorial creating a new JDBC data source to import a table with a primary key.

In the Installation & Bootstrapping section, you installed a MySQL database server and imported the libraries into the Virtual DataPort Extension management. With this driver added to the Administration Tool, you are ready to create the JDBC Data Source following these steps:

  1. Create two folders nested under the "1 - first steps" folder you have made, one for data sources called "1 - Data Sources" and another for base views called "2 - Base Views".
  2. Right-click on the "1 - Data Sources" folder and select "New > Data source > JDBC".

  1. Name the data source, for example, acme_crm_ds.
  2. Select "MySQL 5" as Database Adapter.
  3. Change the Database URI to match your MySQL installation, for example, jdbc:mysql://localhost:3306/acme_crm
  4. Enter the login and password fields: acme_user / acme_user.
  5. Click on "Test connections".
  6. Click on the "Save" button.
  7. Click the "Create base view" button at the top.

The Design Studio will show the introspected schema of the relational database:

To incorporate some of the tables into the Denodo virtual schema, you have to check the box near the tables or views you want to import. In this case, check address, client and client_type and then click on the Create selected button.

Later, you will be able to query these base views or combine them with other views.

When the importing process is finished, you will see the new views in the elements tree panel. If you double-click on the view name, the schema of the base view will be shown in the workspace.


As you can see, each new base view will be listed in the Elements Tree :

  • In the folder where the view has been created ("2 - "Base Views").

Finally, let's move the base views to the folder that we created for them by dragging them to the "2 - Base Views" folder.

After these steps are completed, we have a virtual representation of our CRM in Denodo. In the next section, we are going to learn how to query it to see how the data comes in real time from our MySQL database, and, after that, we will start creating data combinations that will add semantic value to the client applications that are consuming this data.

Execution of Queries

After creating the base views in the previous section, we are ready to query our Data Virtualization server to retrieve the data from our CRM. Let's learn how to issue queries to the Denodo server.

There are several ways to perform the execution of a view, we will use the "client" view for this example, from the Administration Tool:

Contect menu

To use the Context menu you have to:

  1. Click on the name of the view in the Elements Tree panel and click on More Options(dotted lines)
  2. Select "Execute...".

  1. Then click the Execute button at the bottom of the panel to send the current sentence (shown at the top) to the Denodo server.

View detail window

In this case you have to:

  1. Double-click on the name of the view.
  2. Press the Execution panel button at the top of the workspace.

VQL Shell

Another option is to use the VQL Shell Client:

  1. In the Menu Bar, go to Tools > VQL Shell

  1. Click on the name of the view and choose More Options > select "VQL Shell > Select ..." (a sentence is created in the top-right panel).

  1. Then click the Execute button.

Show Results

In the three examples above, a simple query is executed:

Code:

 SELECT * FROM client

After receiving that query, the Denodo server will create a physical query plan to query the data sources (in this case, your MySQL database) and get the results:

You can filter the results obtained in the output using a WHERE condition in the sentence. This can be done using the graphical interface in the execution window shown when using the Context menu or View detail window options.

In this panel,you have specify the where condition. Click on Ctrl+space for autocomplete

To summarize, now you have learned how to execute queries over Denodo views and how the server is querying the source database in real-time to return the results. In the next section, we are going to learn how to create new views using combinations between existing Denodo views.

Simple Derived Views

Now you are ready to explore the capabilities of the Denodo Platform that make Data Virtualization a very powerful tool. In the previous sections, you have connected to the CRM database and queried its tables. Client

,

client_type and address data is now available within Denodo, so you can start to build a unified view of a customer that will contain the information of the three tables and remove the need for each consumer of the data to create this unified view.

We create this unified view within the virtualization platform so it can be reused by all clients. Having it defined in a single point will make it easier to manage changes to our definition of the customers (For example, if, in the future, we only want to expose residential customers to the consuming applications or the source database is changed.) With Denodo, you will maintain the proper data definition and your final applications won't require any changes.

The way of creating this unified view of a customer is through a JOIN operation, just like it's done in a traditional relational database.

Join Operation

Let's see an example of a derived view creation process using the join Operation. Before we begin, let's create a new folder named 3 - Derived Views to stay organized. Then, right-click in the elements tree and select New > Join. Now, an empty Join View panel will be shown in the Administration Tool workspace.

To select the views on which the join operation is going to be executed, you have to drag & drop them from the list of views that appear on the Elements Tree. As input views are added, the schema of the resulting join view is generated automatically.

In our example, you have to follow these steps:

  1. Drag & drop the client, client_type and address views into the workspace.
  2. When using the Design Studio for join, first we will map the views and then we will map the fields of the view. Drag the client view and connect the arrow to the address field to set one of the join conditions. Now map the fields by dragging the client.client_id to address.client_id on the panel.
  3. Drag the client view and drop the arrow on the client_type. Map the fiels by dragging the client.client_type to client_type.code

  1. Then, go to the Output tab. Here you will see both address and client views are producing the same field with different names: client_id and client_fid. You have to remove one of them selecting the field and clicking on "Remove" button.

  1. Rename the view "personal_data_crm" by typing in the input box labeled "View name" at the top.

  1. Click on the button to save the view.
  2. Drag & drop the new element into our new 3 - Derived Views folder.

The Design Studio will show the schema of the new derived view:

Once these steps are completed, you will have a derived view (virtual, data is not stored in Denodo) that represents the concept of a customer within your organization. This view can be queried in the same way that you did for the base views (check how to in the previous section). Now that this data is defined, your client applications can just retrieve this information directly from the virtualization server without having to define the data combination themselves.

The Denodo Web Design Stduio also provides a full set of relational operations, in addition to the join, to create new views:

  • UNION(extended)
  • UNION(Standard SQL)
  • PROJECTION
  • SELECTION
  • AGGREGATION
  • MINUS/INTERSECTION
  • FLATTEN

These operations can be used in the same way than the Join operation (Right-click > New > OPERATION), you can try yourself!

Index

In the previous section, you learned how to access a customer database to get personal and contact information about the company's customers.

The billing department of the sample company exposes the billing information using an internal web service that exposes all the open bills for each customer: amount due, due date, etc. In this section you will combine the unified customer view that you have built with this billing information so you can have a report that lists the total amount that is due for each of your customers.

List of topics:

  • Create a Web Service data source.
  • Flatten: a hierarchical structure.
  • See how derived views are constructed using Tree View.
  • Create a join view between heterogeneous sources.
  • Create an Aggregation view.

Importing SOAP Web Services

As part of the installation steps, you have deployed a billing web application that exposes several SOAP web services. The different services are available at http://localhost:8080/billing/services. Make sure the billing web service is up and running before following the steps of this section (Installation & Bootstrapping)

These web services have been created by the billing department and expose the information about the customers' bills using three different operations:

  • getBills: returns a list with all the bills in the system.
  • getBillByCustomerID: returns a list with all the bills for the specified input customer id.
  • getBillByPhoneCenter: returns a list with all the bills for the specified input phone center.

All the different operations will return the billing information using a hierarchical structure: the bills will be returned as part of a list. For instance, if you invoke the getBillByCustomerId operation using a customer id as input parameter, you will get a list with all the bills for that customer. For each item (bill) in the list, you will see: the customer id and ssn, the amount due for that bill, the billing start date, end date and due date, the phone center that provided the service for that bill, and the bill id.

In this case, you are going to use the BillProvider service from the list of available services. The descriptor for this service will be available at http://localhost:8080/billing/services/BillProvider?wsdl.

The Denodo Platform can integrate both SOAP and REST web services. For SOAP web services you will use the Web service type of data source, for REST web services you could use either JSON or XML data sources depending on the output format of the REST web service. Follow these steps to create a SOAP data source:

  1. Click on the '1 - Data Sources' folder and select New > Data source > Web service.
  2. In the New Web service Data Source creation wizard enter:

  1. Click on the button (you can leave the default values for the remaining options).

Now you have to click on Create base view button. You will see on the screen a list with the different operations available in the web service and the option of creating a new base view for each one of them.

In this case, you are interested in the getBillByCustomerId operation, so click on the Create Base View link associated with this operation.

In the Creating Web service View for operation <getBillByCustomerId> screen just choose Do not stream output and click Ok.

After the creation of the view, rename the input parameter for the web service operation from in0 to customer_id by simply clicking on the in0 field name.

Click on after these changes.

The web service operation that you just imported has a mandatory input parameter: the customer id.

To reflect this restriction in the data source, the new base view will also have limited query capabilities and when querying the view (clicking the 'Execute' button) we will have to provide a value for this input parameter (customer_id) as a WHERE condition.

Actually, if you try to execute a query from the VQL shell without providing the WHERE condition you will get an error.

If we execute and query and use 'C003' as the value for the customer id we will get the complete list of bills for that customer as the result of the query.

A list of items is represented with an Array data type. You can see in the results of the query that the return column is displayed as Array... If you click on the icon you can see the contents of the array. In this example, you can see the 2 bills that belong to this customer.

Flatten Operation

In the previous section, you have created a base view on top of a SOAP web service data source. This web service returns a hierarchichal structure that includes a return element of type array.

To represent these hierarchical structures, Denodo supports two different complex data types:

  • Registers or Records: a register or record represents an element that is formed by several subelements or subfields. Each one of the subelements in a record will have their own name, data type and value.
  • Arrays: an array represents multi-valued data that is presented as a list of items. An element of the type array can be thought of as a sub-view included in a main view. An array type always has an associated record type that acts like the schema of the sub-view and defines the columns that make up each item in the list.

To simplify array columns, Denodo has the special Flatten operation.

Let's see how to create a new view that flattens the get_bill_by_customer_id base view:

  1. Click on the get_bill_by_customer_id view and select New > Flatten.
  2. In the Flatten View wizard, we will select the element that needs to be flattened. In this case, the only array that we have in the base view is: return.

  1. Go to the Output tab, rename the view to billing_information and also rename the field customer_id to return_customer_id then click on .

The new derived view has the same query capabilites as get_bill_by_customer_id but if you query the new view using the same customer_id = 'C003' condition, you will get the fields that were part of the array in the base view schema. This customer has two pending bills so there are two rows in the result set of the query, one for each bill.

Tree View

In the previous section, you saw how to create new derived views by combining other views. Once you start combining views the complexity of the new views will grow and it will be useful to visualize how those views are built. For this, you have the Tree View functionality.

In the Simple Derived Views section, you created the personal_data_crm view using two join operations over several base views. To see the Tree View you have to right-click on the view name and select Tree View (see image below).

In the tree view you can see how the view has been created. In this example, you can see that the view personal_data_crm is created as a join Join icon between three views: client, address and client_type.

At the same time that the join is performed, a projection Projection icon operation takes place because only a subset of fields is projected.

In addition to the general overview, you can see the details of the different operations involved in the creation of the view as well as the data sources and wrappers information by clicking on the different elements.

For instance, clicking on a join icon will display information about the join type, method used and the join condition.

If you want to save a snapshot for documentation purposes just click on the download image button.

Combination Between Different Data Sources

At the beginning of the tutorial, you saw how to create new views using the join operation but the views involved in the joins were all coming from the same datasource. In this section, you will see how you can create a new join view using the exact same procedure but coming from two different and heterogeneous data sources.

To create the derived view you can follow these steps:

  1. Click on the personal_data_crm view and select New > Join.
  2. Drag & drop the billing_information view to the Join View wizard Model tab.
  3. We will use client_id = customer_id as the join condition so drag & drop a line from the client_id field in the personal_data_crm view to the customer_id field in the billing_information view.

In the Output tab:

  • Name the view: client_with_bills.
  • Remove the field return_customer_id from the output schema of the view (you already have the customer_id field).

  • click Save.

Now, if you execute the new view we will get the information about the bills from the different clients.

Aggregation Operation

In the previous section,you have created a view that obtained the billing information for all the customers in your database. In this view we have one record for each bill, but we want to calculate the total amount due by the customers instead of having the separate bills.

To do so, you can create a new view that aggregates (group by) the different customers using the customer_id to compute the total amount.

To create the new aggregation view you can follow these steps:

  1. Click on the client_with_bills view and select New > Selection.
  2. In the Projection / Selection View wizard go to the Group by tab and enable the Use group by Checkbox.
  3. Click on Ok in the pop-up dialog.
  4. From the list of fields on the left Add the following to the Selected group by fields and expressions box: name, surname and client_id.

  1. Go to the Output tab and rename the view to amount_due_by_client.
  2. Now, you have to create a new field that computes the addition of the amount due for each client. Click on New > New aggr. field button and define a new field called total_amount that will be calculated using the SUM aggregation function and the amount_due field.

Code:

 SUM(CAST('float', client_with_bills.amount_due))

  1. Click Ok to add the new field to the view.
  2. And click to confirm the creation of the view.

Now, if you execute the new view, you will get as result one record per customer with the total billing balance:

Index

In previous sections, you created several views with Denodo to allow client applications to retrieve the information directly from the Denodo server. In particular, you have created the view 'amount_due_by_client' which combines data from several sources and exposes the information about the billing balance of a company's clients.

You already know how to execute queries over that view, but now it's time to connect to the Denodo server from your external applications. Denodo Platform is based on a client-server architecture, where clients issue requests to the server. These requests can be sent using one of the following interfaces:

  • JDBC: Denodo provides its own JDBC driver.
  • ODBC: Denodo provides an ODBC interface (requires the installation of additional components).
  • ADO .Net: Denodo is compatible with the Npgsql ADO.Net provider for PostgreSQL.
  • RESTful Web service (XML, JSON, HTML outputs): useful for applications that cannot use the JDBC or ODBC interfaces to connect to Denodo.

In this section, you will see several ways to access Denodo from external applications:

  • Using a Third Party JDBC Client.
  • Using ODBC.
  • Consume a Denodo RESTful Web service.
  • Browse Linked Data.

JDBC Clients

JDBC (Java DataBase Connectivity) is a Java data access technology from Oracle Corporation. JDBC provides an API for the Java programming language for database-independent connectivity, and it is based on the use of drivers for each database. A client application requires separate drivers, usually vendor supplied, to connect to different types of databases.

Denodo includes a JDBC driver jar file named denodo-vdp-jdbcdriver.jar, and it is located under the /tools/client-drivers/jdbc/ directory.

Accessing Denodo using a JDBC Client

In this section, you are going to see how to access to the Denodo server using a JDBC client. This information is valid for any Java-based application. For example, we will use DBVisualizer (a generic database management tool for developers) but feel free to use any other JDBC client.

The first thing that you have to do when connecting using JDBC is to add the Denodo's JDBC driver to the client application.

To use the JDBC driver in your client, you have to add the .jar file to the classpath of your application.

In DBVisualizer, you have to go to Tools > Driver Manager... and in the Driver Manager window go to Driver > Create Driver and then browse to the VDP's driver file. Use the following driver settings and close the window to save the configuration:

  • Name: Denodo 8.0
  • URL Format: jdbc:vdb://<hostname>:<port>/<database>
  • Driver Class: com.denodo.vdp.jdbc.Driver

Now that you have added the driver, you can configure a connection to your Denodo virtual database. Go to Database > Create Database Connection and use the following settings for the connection:

  • Driver (JDBC): Denodo 8.0
  • Database URL: jdbc:vdb://localhost:9999/tutorial
  • Database Userid: admin
  • Database Password: admin

Click on the Connect button, and you will establish a connection to the tutorial database. In the left panel of the window, you will see that the base views are listed as tables and the derived views are listed as views.

If you run any query from the SQL Commander, you will get the same results that you were seeing before from the Administration Tool. Try, for example:

Code:

 SELECT * FROM client

ODBC Clients

ODBC (Open DataBase Connectivity) is a standard to access databases originally developed by Microsoft. ODBC provides an API to make the code independent of database systems and operating systems.

Denodo provides an ODBC interface, but it requires the installation of the ODBC driver. Like any other ODBC driver, you have to install it on the machine where the client application is running.

In this section you will learn how to access to the Denodo server using an ODBC client. This information is also valid for any other ODBC connection. For the example, we will use MS Excel but feel free to use any other ODBC client.

Creating the DSN

The first thing that we have to do when connecting using ODBC is to install the Denodo ODBC driver. Denodo Platform 8.0 includes an ODBC driver named DenodoODBC and it is located under the \tools\client-drivers\odbc directory. Extract the folders in this directory and run the programs inside to install the drivers. Once this is complete, restart your Virtual DataPort Server from the Denodo Control Panel.

Once you've installed the ODBC driver you will need to add a new user data source:

  1. Go to Control Panel > Administrative Tools > Data Sources (ODBC).
  2. Select Add User DSN or Add System DSN. The difference is that "User DSN" can only be used by the current user and "System DSN" can be used by all the users of the system.
  3. Select the DenodoODBC ANSI or Unicode driver, and click on the Finish button.

In the configuration dialog fill in the following information:

  1. Data Source: name of the ODBC source (e.g. Denodo Tutorial).
  2. Database: database in Denodo. (e.g. tutorial).
  3. Server: host name of the Denodo server (e.g. localhost).
  4. Port: port of the Denodo server (e.g. 9996).
  5. User Name / Password: credentials to connect to Denodo (e.g. admin/admin).

Now, you have to configure some of the Advanced properties by clicking on the Datasource button (a pop-up will open). Select the same options shown in the screenshots below and write "SET QUERYTIMEOUT TO 3600000" in the Connect Settings Box on page 2:

Finally, click the Ok button and then click Save button to finish.

Accessing Denodo using an ODBC client

Now, you have your environment ready to connect to Denodo using ODBC (remember than the previous steps are only valid to connect to the "tutorial" virtual database, so if you want to connect to another database you will have to create a new DSN).

For an example of an ODBC client application you can use the well-known Microsoft Excel. You will only have to select this DSN as a data provider to import the customer data into the spreadsheet.

Please, follow these steps and see the results:

  1. Open a new workbook.
  2. Select Data > From Other Sources > From Microsoft Query
  3. In the pop-up, select "Denodo Tutorial" and click on Ok.

  1. Select "amount_due_by_client" view.

  1. Click the > button in the middle and you should see name, surname, client_id, and total_amount appear underneath "Columns in your query:"
  2. Click on Next (three times) and then Finish.

And there! The results from Denodo are populated into the MS Excel spreadsheet!

RESTFUL Web Service

RESTful Web service apply the ideas of the web to data-delivery, by providing scalable, flexible and stateless access to data assets based on well-known protocols and formats like HTTP, HTML, XML and JSON. Additionally to traditional SQL type access methods such as JDBC or ODBC, all the views in Denodo can be accessed using the RESTful interface.

The Denodo RESTful Web service is an HTTP service deployed by default in the URL http://localhost:9090/denodo-restfulws that exposes resources like databases and views in the following standard representation formats:

  • XML
  • JSON
  • HTML (more user-friendly)

This Web service allows Denodo to work inside applications following the REST service architecture style and provides support for linked data in the enterprise deployment (see next section for more information about this).

When accessed from a browser, the Denodo RESTful endpoint will look like this:

See the examples below to know how to query this service:

Representations

We mentioned at the beginning of this section that Denodo supports three representation formats (XML, JSON and XHTML). You saw in the previous examples the XHTML format using a browser. How can you get the response in the other formats? The answer is easy, by adding a query parameter to the URL:

In the next section, you will learn how to link your Denodo views to allow browsing linked data through the RESTful Web service.

Linked Data

In the previous section, you learned how to issue queries from the RESTful interface of Denodo. Now you will see how to enable linked data using a new Denodo element: associations.

Associations in Denodo

Associations represent a relationship between elements of two Denodo views. The concept is very similar to the Primary Key / Foreign Key restrictions in relational databases.

For example, the elements of the client view can be related with the elements of the address view (every customer is related with an address).

Based on the definition of the associations, the Denodo RESTful Web service will show links that will allow you to traverse the associations. Let's see how it works with an example, by using the views created in previous sections of the tutorial.

How to Create a New Association

To create a new association in Denodo, you can follow these steps:

  1. Right-click inside the elements tree and select New > Association.

  1. Drag & drop the client and address views involved in the association into the workspace.
  2. Link the client_id and client_fid fields that map the association.
  3. Go to the Output tab and give a name for the association: client_address.
  4. You have to provide names for each of the endpoints (Role name fields).
  1. Endpoint 'client': address.
  2. Endpoint 'address': belongs_to_client.

  1. Mark the Referential contraint checkbox: this will allow the association to be exposed like as a Primary Key / Foreign Key restriction (for JDBC and ODBC clients).
  2. Finally, click Save.

Using the Association

Now it's time to return to the RESTful Web service and get the results of the client view: http://localhost:9090/denodo-restfulws/tutorial/views/client (see the previous section for more information about how to query using the RESTful Web Sevice).

As you can see in the screenshot below, a new column with a link is added to the output table and the text of the link is the Role name configured in the association for the endpoint. In this example, if we click on address link for the customer John Smith, Denodo will follow the association and display his address.

In the same way, since we made the association, a belongs_to_client column appears with the address that loads the associated customer information when clicked.

Index

Data Virtualization software accesses and extracts information from target sources at runtime and combines them in real-time to get the results. As you know, no local copy of the data will be available within Denodo.

With this in mind, it is clear that some of the more traditional performance optimization practices used in database and data warehouse implementations, such as index construction, will fall outside the scope of a real-time Data Virtualization framework but strategies such as caching can help to improve the performance of real-time source access and combination goals.

The Denodo advanced cache system is based on a relational database (traditional or in-memory database).

Denodo is an important component of any data management infrastructure, but not the only one. When measuring performance, it is important to make sure which of the elements are bottlenecks. For example, a data source might be returning data in a slow fashion; in some cases you will be able to increase the performance by adding a new index to that source. If these actions cannot be perfomed, you can configure an intelligent caching system in Denodo to speed up your queries.

What are the motivations for using a Cache?

  1. Some data sources might be slow and you want to speed up your queries.
  2. You want to avoid workloads in the data sources.
  3. Pre-computed transformations are done in the Denodo layer, so they do not need to be recomputed every time.
  4. You want to delegate some queries with data coming from several different data sources.
  5. Data sources temporal unavailability (especially when they are external sources).

In this section you will see how to:

  • Configure Denodo to Use Cache.
  • Activate Cache in Base/Derived Views
  • View Some Examples of Different Cache Modes

Cache at Server Level

As you already know, Denodo includes a module to store local copies of the data as required. This cache will use a Relational Database accessible through JDBC protocol (MySQL, Microsoft SQL Server, Oracle, DB2, Netezza, Oracle TimesTen, etc).

To use the cache system it can be enabled at server level or at database level. For this tutorial we will configure the cache at server level. Let's see how to configure the Denodo server to use cache:

  • Note: We will continue to use the Design Studio to perform the cache operations.
  1. Log-in to the Design studio using a global administrator user (for example, the default admin).
  2. In the Menu Bar, go to Administration > Server configuration and then click Cache.

  1. Set the Cache status to On.

  1. The default Embedded Derby server is the Database adapter that will be used this tutorial.

  1. Click Ok.

In the next section, you will learn how to configure what views have to be cached.

Cache at View Level

In the previous section, you activated the cache module in your Denodo server. Now, you need to configure your views to make use of the cache.

For example, let's activate the cache in the client_with_bills view. We have two main reasons to select this view as a cached view:

  • It queries two data sources (a MySQL database and a SOAP Web Service).
  • Usually Web Service response times are worse than traditional databases response times.

How to activate cache for a view from Web Design Studio

Please, follow these steps:

  1. In the Elements Tree, double-click on the view name client_with_bills. The view schema will show in the Workspace.
  2. Click on the Options tab at the top (third tab).
  3. Under Cache mode:, check Partial. It will enable a partial cache for the client_with_bills view.

  1. Leave the rest of fields with the default values.
  2. Finally, click Save.

Now test if the cache works as expected by performing the following test:

  1. Open Tools > VQL Shell.
  2. Execute the following query (make sure you have the tutorial database selected from the drop-down Database menu): SELECT * FROM client_with_bills TRACE
  3. After the execution, click on the Execution Trace button (above the results) to see the query execution plan. You can see that the data comes directly from the different data sources.

  1. Click Execute and perform the same query again.
  2. Click on the Execution Trace button again and you will see that the data comes from the cache (displayed with a box a icon), and the execution time of the query is faster.

In the next section, you will learn more about the available cache modes. In the above example, you configured the view to use Partial mode, in the next section we will learn the the behavior of the different cache modes.

Cache Modes

In the previous section, you activated the cache in one of your views. Now, it's time to learn more about how the Denodo Cache works. Denodo has the following cache modes:

  • Partial:
    the first time a query over the view is executed, the cache table will be populated with the tuples in the output from the datasource. At runtime, when a user queries the view, the Denodo server checks if the cache contains the data required to answer the query. If it does not have this data, Denodo will query the data source and populate the cache with that output.

    When the Time To Live (TTL) of the data has passed, the cache system will invalidate the cached data of the view so the next query will hit the data source.

    This mode supports the following options:
  • With explicit loads: if this option is selected, the cache has to be loaded explicitly.
  • Match exact queries only: if this option is selected, the cache stores the result of each query. Then, if the same query is executed, and the entries of this query in cache have not expired (TTL), the data returned to the client is retrieved from the cache.
  • Full:
    The data of the view is always retrieved from the cache engine instead of from the source, this mode always requires explicit cache loads.

    The main benefit of this mode over the partial cache is that complex operations (joins, unions, group by...) involving several views (even from different data sources) can be delegated to the cache database. Therefore, the performance of these operations is significantly improved.

Cache Examples

Let's see how the cache works using the following example (make sure you have the tutorial database selected from the drop-down Database menu):

Code:

 SELECT * FROM client_with_bills
     WHERE client_id = 'C077'

Run this query and check the Execution Trace with these different cache configurations:

  • With the Cache Off
    The data sources are always queried.
  • With a Partial Cache
    The data source will be queried the first time and subsequent queries will retrieve the data from the cache, if it is available.

    To demonstrate, execute these queries:
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C066' will query the data source again.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query the cache.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C066' queries the cache now.
  • Partial Cache with Explicit loads
    Denodo server will query the data source until the cache is told to be loaded with CONTEXT('cache_preload' = 'true').
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will query the data source.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' CONTEXT('cache_preload' = 'true') will query the data source and load the cache.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will now query the cache.
  • Partial cache with Match exact queries only
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' will query the cache.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query the data source and load the cache for this query.
  • SELECT * FROM client_with_bills WHERE CLIENT_ID = 'C077' AND CODE = '02' will query the cache now.
  • Full Cache
    All of the view's underlying data will be loaded into the cache, and all queries will only go to the cache.

Congratulations! You have completed the Denodo Basics Tutorial.

This is just the first step in understanding the Denodo Data Virtualization software. Now, you are prepared to go even further with the rest of our tutorials and become a Master Data Ninja!