Denodo Governance Bridge - User Manual

Overview

Denodo Governance Bridge exports Virtual DataPort (VDP) elements, including the lineage, to IBM® InfoSphere® Information Governance Catalog (IGC) to support governance across Denodo databases.

To achieve this, Denodo Governance Bridge, using the Open IGC API, extends IGC by registering new types of assets from the Denodo catalog:

  • Databases
  • Data Sources
  • Base Views
  • Derived Views
  • Interface Views
  • Columns
  • Associations
  • Stored Procedures
  • Parameters
  • Folders

These Denodo assets:

  • Are available in IGC via the “detailed browse” option and appear in the query tool.

  • Can be governed using Glossary Terms, Governance Rules, Data Stewards, Custom Properties and Collections.

  • Support Data Lineage.

Figure 1 Denodo Governance Bridge architecture

The Denodo Governance Bridge bridges the gap between IBM IGC and the Denodo VDP metadata repository.

It offers two execution modes:

  1. Interactive mode. The tool loads the VDP elements, and the flow of data that defines their lineage, into IGC using the IGC REST API. Under the hood, Denodo Governance Bridge generates XML documents from the metadata of the VDP elements, which is accessible through the Denodo JDBC API. This process is hidden from the user, who only interacts with the web interface.

See the Synchronize VDP with IGC: interactive mode section for a more detailed explanation.

  2. Batch mode. The tool generates the XMLs from the VDP elements and their lineage, but it does not communicate with the IBM IGC server. This mode is useful if you want to post-process the XMLs (e.g. enrich them) before submitting them manually to the IBM IGC server.

See the Export VDP elements to XML: batch mode section for a more detailed explanation.

IGC Requirements

Denodo Governance Bridge requires Open IGC, the technology that allows you to define new asset types within the Information Governance Catalog, with their own names, icons, custom properties and containment relationships.

Denodo Governance Bridge requires at least:

  • Information Server 11.3.1.2 release 19, or

  • Information Server 11.5 rollup 1.1

as these are the first versions that allow more than one parent type to be assigned to an asset in Open IGC.

More information about Open IGC is available at

http://www-01.ibm.com/support/docview.wss?uid=swg21699130

Installation

The Denodo Governance Bridge distribution consists of:

  • Command-line executable scripts for Windows and Linux (/bin folder)

  • The IGC bundle file that defines the Denodo asset types (/bundle-igc folder)

  • Configuration files: application.properties and log4j2.xml (/conf folder)

  • A sample JSON file required for the batch mode (/conf folder); see the Export VDP elements to XML: batch mode section.

  • Java libraries (/lib folder):

      • Denodo Governance Bridge application jar: denodo-governance-bridge-<version>.jar

      • Denodo driver jar: denodo-vdp-jdbcdriver-dist-<version>-full.jar

If you need to use a Denodo driver version other than the one distributed, replace this jar with the Denodo driver of the proper version.

To install the Denodo Governance Bridge, download the .zip file and extract the tool into the desired folder.

To run it, you need Java 8 and the JAVA_HOME and PATH environment variables correctly configured.

After running the script denodo-governance-bridge.sh|bat in the /bin folder, point your browser to http://localhost:10099 to access the home page:

       

 Figure 2 Denodo Governance Bridge home page

Enable SSL Connection

When SSL is enabled in the IGC server, Denodo Governance Bridge has to trust the public key of the IGC server. For this to happen, you need to import the certificate into the trust store of the Denodo Governance Bridge.

  1. Copy the certificate file from the IGC Server to the Denodo Governance Bridge computer.

  2. Locate the Java Runtime Environment (JRE) the Denodo Governance Bridge uses.

  3. Open a command line there and execute this:

keytool -importcert -alias some_alias -file IGC_SERVER_PUBLIC_KEY.cer -keystore jre\lib\security\cacerts -storepass "changeit" -noprompt

Usage

Register Denodo Asset Types in IGC

In the interactive mode, the Denodo Governance Bridge registers the Denodo asset types in IGC transparently, before the first synchronization between a VDP database and IGC.

In the batch mode, the user must register the Denodo asset types manually before submitting the XML files to IGC.

To do so, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on bundle and choose POST /bundles/ to register the Denodo assets. The bundle file is located in the bundle-igc folder of the Denodo Governance Bridge distribution.

Figure 3  Register bundle in IGC Rest explorer

After the registration (manual or transparent), Denodo asset types are displayed in the IGC browser in the Denodo Models group:

Figure 4 Denodo type assets in IGC

Synchronize VDP with IGC: interactive mode

First of all, the Denodo Governance Bridge needs the connection details to communicate with the Denodo VDP server:

  • Host

  • Port

  • A database the login user has privileges to connect to.

  • Login/password

Once the authentication is complete, the user can choose a database to synchronize with IGC among all the databases the user can connect to.

Take into account that, for the elements involved in the synchronization process, the login user should have:

  • connect privileges over the databases

  • for views and stored procedures:

      • write privileges in Denodo 7.0

      • read privileges in Denodo 6.0

  • for the rest of elements:

      • metadata privileges in Denodo 7.0

      • read privileges in Denodo 6.0
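The privilege rules above can also be read as a small lookup table. The following is only an illustrative sketch: the keys ("7.0", "6.0", "view_or_procedure", "other") and the function name are labels invented here, not part of the Denodo Governance Bridge.

```python
# Privileges listed above, keyed by (Denodo version, kind of element).
# The key names are hypothetical labels chosen for this sketch.
REQUIRED_PRIVILEGE = {
    ("7.0", "view_or_procedure"): "write",
    ("6.0", "view_or_procedure"): "read",
    ("7.0", "other"): "metadata",
    ("6.0", "other"): "read",
}


def required_privileges(version, kind):
    """Return the privileges the login user needs for one element.

    'connect' on the database is always required, whatever the element.
    """
    return ["connect", REQUIRED_PRIVILEGE[(version, kind)]]
```

For example, required_privileges("7.0", "view_or_procedure") yields connect plus write, which matches the list above for views and stored procedures in Denodo 7.0.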

Figure 5 VDP server connection

Synchronizing a Denodo VDP model with IGC is then a five-step process.

Step 1. Users select which database in the VDP server they are interested in.

Figure 6 Databases listing

Step 2. The tool inspects the database and lets users choose the elements to export to IGC.

Figure 7 Database elements tree

Figure 8 Database elements selection

Step 3. The tool informs the users about the transitive dependencies of the selected elements. These dependencies will also be exported to IGC.

In the picture below we can see that the derived view web_logs_all_with_blockinfo depends on:

  • Derived views:
      • web_logs_all
      • linux_web_logs
      • nonlinux_web_logs_impala

  • Base views:
      • impala_web_logs
      • hbase_blocklistip

  • DataSources:
      • impala_ds
      • hbase_ds

All these VDP elements will be present in IGC at the end of the process.
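The idea of a transitive dependency can be sketched as a graph traversal. This is not the code of the Denodo Governance Bridge. In particular, the direct edges between the elements are an assumption made for illustration, because the manual only lists the resulting set of dependencies and not which element depends directly on which.

```python
from collections import deque

# ASSUMED direct dependencies; the manual only gives the final set.
DIRECT_DEPENDENCIES = {
    "web_logs_all_with_blockinfo": ["web_logs_all", "hbase_blocklistip"],
    "web_logs_all": ["linux_web_logs", "nonlinux_web_logs_impala"],
    "linux_web_logs": ["impala_web_logs"],
    "nonlinux_web_logs_impala": ["impala_web_logs"],
    "impala_web_logs": ["impala_ds"],
    "hbase_blocklistip": ["hbase_ds"],
}


def transitive_dependencies(element, graph):
    """Return every element reachable from `element` (breadth-first)."""
    seen = set()
    queue = deque(graph.get(element, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already collected through another path
        seen.add(current)
        queue.extend(graph.get(current, ()))
    return seen
```

With the assumed edges, the traversal for web_logs_all_with_blockinfo collects the three derived views, two base views and two data sources listed above.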

Figure 9 Transitive dependencies are also synchronized

Step 4. Users identify which IGC Database assets are connected to Denodo JDBC DataSources to complete their lineage path. This is an optional step.

For Denodo JDBC BaseViews and Columns, it is possible to identify which other Database assets, previously imported into the Information Server Repository, participate in the lineage report. These other Database assets are Host, Database, Database Schema, Database Table, View and Database Column.

Figure 10 IGC Database assets

To achieve this, the user can identify the Hosts and the Databases in IGC that are related to each Denodo DataSource of type JDBC that will be exported, as the Denodo Governance Bridge cannot deduce this information from the JDBC URI of Denodo DataSources.

Figure 11 Map Denodo JDBC assets with IGC Database assets

For each Denodo JDBC DataSource that the user maps to IGC Host and Database assets, the Denodo Governance Bridge will export the lineage from its Denodo JDBC BaseViews to IGC Database Tables (or Views) and from its Denodo Columns to IGC Database Columns. This applies only to Denodo BaseViews that were not built using a SQL statement in Denodo VDP.

Important

When the property enable.virtual.assets of the conf/application.properties file is false (the default value), the IGC Hosts and Databases entered in this step must refer to existing assets in IGC. In the same way, the Database Tables, Views and Database Columns referred to by Denodo JDBC BaseViews and Columns must exist in IGC and belong to those Hosts/Databases; otherwise the IGC REST API will reject the synchronization with an error (the Export VDP elements to XML: batch mode section shows the error returned in the equivalent batch scenario).

Instead, when the property enable.virtual.assets of the conf/application.properties file is true, a virtual asset will be created in IGC for each IGC Database asset that is referred to by Denodo assets but does not exist in IGC.

Virtual assets are only displayed in lineage reports and in the Usage Information section of the asset's Details page. The icons for virtual assets are lighter-colored than the ones used for assets in the catalog.

Step 5. Users give the details to connect to an IGC server.

The registration of the Denodo bundle in IGC is done transparently by the Governance Bridge, before the first synchronization between a VDP database and IGC. Later updates of the Governance Bridge may need to modify the Denodo bundle; the “Update Denodo bundle version in IGC?” checkbox forces the update of the Denodo bundle in IGC. Check it only the first time a synchronization process is executed after installing a Governance Bridge update.

Figure 12 IGC server connection

Finally, the tool lists all the elements that were synchronized to IGC, both the selected ones and their dependencies.

The flow of data that defines the lineage is not explicitly displayed, but it will also be loaded in the IGC server.

Figure 13 Synchronization complete

The following images show the result of the synchronization process on the IGC side.

The image below shows the details of the web_logs_all_with_blockinfo derived view:

  • the database it belongs to

  • the column assets that it contains

  • the VQL expression that defined it

  • etc.

Figure 14 Derived view in IGC

The two images below display the data lineage for the same view, web_logs_all_with_blockinfo.

Figure 15 Derived view lineage in IGC

Figure 16  After drilling down on the derived view lineage

Data lineage view in IGC is similar to the Data lineage view available in the VDP Admin Tool. The following image shows the VDP data lineage for the web_logs_all_with_blockinfo derived view.

Figure 17  Data lineage view in VDP

Export VDP elements to XML: batch mode

The batch mode of the Denodo Governance Bridge offers a service endpoint that exports the assets and flows to XML, but it does not communicate with the IBM IGC server. This execution mode does not involve user interaction with its web interface.

This mode is useful in case you want to post-process the XMLs (e.g. enriching them with additional data in an ad-hoc process), before submitting them to the IBM IGC server.

The batch mode requires the Denodo Governance Bridge to be up and running (script denodo-governance-bridge.sh|bat). Then, execute the script denodo-export-governance-bridge.sh|bat, which you will find in the bin directory of the distribution:

$ cd denodo-governance-bridge-<VERSION>

$ bin/denodo-export-governance-bridge.sh conf/input.json

This script uses curl to invoke the export service endpoint of the Denodo Governance Bridge. You can check whether curl is installed on your system with the command:

$ curl --version

If curl is not installed, you can get it from https://curl.haxx.se/dlwiz/.

The denodo-export-governance-bridge.sh|bat script needs a JSON file with the configuration:

input.json sample file

{
  "vdpServerUrl": "//host:port/db",
  "login": "userlogin",
  "password": "ENC(encrypted_password)",
  "outputFile": "dir/file.xml",
  "startDateISO8601": "2019-06-04T14:20:00+02:00",
  "endDateISO8601": "date in ISO 8601 format",
  "mappings": [
    {"datasource": "denodo_datasource1", "host": "igc_host1", "database": "igc_database1"},
    {"datasource": "denodo_datasource2", "host": "igc_host2", "database": "igc_database2"}
  ]
}
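Before running the script, it can help to check the file locally. The sketch below is not part of the distribution. It relies only on the field names of the sample above and on the required/optional split described in the parameter list that follows. The function names are hypothetical. Note that datetime.fromisoformat accepts only a subset of ISO 8601 in Python versions before 3.11, so a date it rejects is not necessarily rejected by the Denodo Governance Bridge.

```python
import json
from datetime import datetime

REQUIRED_FIELDS = ("vdpServerUrl", "login", "password", "outputFile")
DATE_FIELDS = ("startDateISO8601", "endDateISO8601")
MAPPING_KEYS = {"datasource", "host", "database"}


def check_input(config):
    """Return a list of problems found in a parsed input.json dictionary."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not config.get(field):
            problems.append(f"missing required field: {field}")
    for field in DATE_FIELDS:
        value = config.get(field)
        if value is None:
            continue  # both dates are optional
        try:
            datetime.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{field} is not an ISO 8601 date: {value!r}")
    for mapping in config.get("mappings", []):
        missing = MAPPING_KEYS - set(mapping)
        if missing:
            problems.append(f"mapping {mapping} lacks {sorted(missing)}")
    return problems


def check_input_file(path):
    """Load a JSON file from disk and run check_input on its content."""
    with open(path, encoding="utf-8") as handle:
        return check_input(json.load(handle))
```

Run against the sample file as distributed, check_input_file("conf/input.json") would report the endDateISO8601 placeholder, which is a reminder to replace it with a real date or remove the field.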

The parameters in the JSON file are:

  • vdpServerUrl (required): //host:port/database. All the database elements will be exported, filtered by date if any is supplied.

  • login (required): the login user, who must have privileges to connect to the database that is being exported.

  • password (required): the user's password. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.

  • outputFile (required): the file and directory where the XML files will be written.

The export process will generate two files:

  • one for the assets, adding the suffix "_assets_timestamp"

  • one for the flows, adding the suffix "_flows_timestamp"

For example, if outputFile is C:/export/denodo_sakila.xml, the resulting files will be:

  • C:/export/denodo_sakila_assets_2019-10-23-124942.xml

  • C:/export/denodo_sakila_flows_2019-10-23-124942.xml

Note that the user account running the Denodo Governance Bridge tool needs write privileges in the output folder.

  • startDateISO8601 (optional): The tool will export VDP elements that were created or modified after the specified date. The date should be specified in ISO 8601 format, e.g. 2019-06-04T14:20:00+02:00.

  • endDateISO8601 (optional): The tool will export VDP elements that were created or modified before the specified date. The date should be specified in ISO 8601 format, e.g. 2019-06-05T14:20:00+02:00.

When both startDateISO8601 and endDateISO8601 are provided, the VDP elements created or modified between the two dates will be exported.

If startDateISO8601 is not provided, all the VDP elements that were created or modified before endDateISO8601 will be exported.

If endDateISO8601 is not provided, all the VDP elements that were created or modified after startDateISO8601 will be exported.

If neither startDateISO8601 nor endDateISO8601 is provided, all the VDP elements of the database will be exported.

  • mappings (optional): the IGC Host and the IGC Database that are related to each Denodo DataSource of type JDBC that will be exported. This information is required for linking Denodo lineage with other existing assets, as the Governance Bridge cannot deduce it from the JDBC URI of the Denodo DataSource.

This field is mandatory if you want to complete the lineage path between Denodo JDBC DataSources and other IGC Database assets that were previously imported into the Information Server Repository. These other Database assets are Host, Database, Database Schema, Database Table, View and Database Column.

For each Denodo JDBC DataSource that the user maps to IGC Host and Database assets, the Denodo Governance Bridge will export the lineage from its Denodo JDBC BaseViews to IGC Database Tables (or Views) and from its Denodo Columns to IGC Database Columns. This applies only to Denodo BaseViews that were not built using a SQL statement in Denodo VDP.

Important

When the property enable.virtual.assets of the conf/application.properties file is false (the default value), the IGC Hosts and Databases entered in the input.json file must refer to existing assets in IGC. In the same way, the Database Tables, Views and Database Columns referred to by Denodo JDBC BaseViews and Columns must exist in IGC and belong to those Hosts/Databases; otherwise, when the XML files are submitted to the IGC server, the IGC REST API will respond with the following error:

{
  "errors": [
    {
      "code": "CDIGC1072E",
      "message": "Tried uploading flows of a flow unit which is missing from the metadata server. Asset ID=\"ID243\".",
      "explanation": "A flow unit asset must get imported before uploading its flows.",
      "userResponse": "First, import this flow unit, and then try to upload its flows again."
    }
  ]
}

 

Instead, when the property enable.virtual.assets of the conf/application.properties file is true, a virtual asset will be created in IGC for each IGC Database asset that is referred to by Denodo assets but does not exist in IGC.

Virtual assets are only displayed in lineage reports and in the Usage Information section of the asset's Details page. The icons for virtual assets are lighter-colored than the ones used for assets in the catalog.
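Two of the rules described for the input.json parameters lend themselves to a short sketch: how the two result file names are derived from outputFile, and how startDateISO8601 and endDateISO8601 bound the export. This is an illustration with hypothetical function names, not the tool's implementation. Whether the date boundaries themselves are included is not stated in this manual, so treating them as inclusive here is an assumption.

```python
import os.path
from datetime import datetime


def export_file_names(output_file, when):
    """Derive the assets and flows file names from the outputFile value."""
    base, extension = os.path.splitext(output_file)
    # Timestamp layout taken from the example: 2019-10-23-124942
    stamp = when.strftime("%Y-%m-%d-%H%M%S")
    return (
        f"{base}_assets_{stamp}{extension}",
        f"{base}_flows_{stamp}{extension}",
    )


def in_export_window(modified, start=None, end=None):
    """Tell whether an element changed at `modified` would be exported.

    A missing bound leaves that side of the window open.
    ASSUMPTION: both boundaries are inclusive.
    """
    if start is not None and modified < start:
        return False
    if end is not None and modified > end:
        return False
    return True
```

All datetime values passed to in_export_window should carry a time zone offset (as the ISO 8601 examples above do), because Python refuses to compare values with an offset against values without one.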

The output of the script includes the HTTP response headers. You can check the HTTP status code to see whether the export process succeeded:

$ bin/denodo-export-governance-bridge.sh conf/input.json
HTTP/1.1 200
Content-Type: text/plain;charset=UTF-8
Content-Length: 21
Date: Tue, 8 June 2019 15:44:24 GMT

Successfully exported

or whether it failed:

$ bin/denodo-export-governance-bridge.sh conf/input.json
HTTP/1.1 500
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Date: Tue, 8 June 2019 15:58:35 GMT
Connection: close

{"timestamp":1559059115009,"status":500,"error":"Internal Server Error","exception":"org.springframework.jdbc.CannotGetJdbcConnectionException","message":"Could not get JDBC Connection; nested exception is java.sql.SQLException: authentication error: Database 'test' not found","path":"/export"}
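When the script is launched from an automated job, the status line shown above can be checked programmatically. The following is a minimal sketch with hypothetical function names. It assumes the script output has already been captured as text, for example with subprocess.run(..., capture_output=True, text=True). Treating any 2xx code as success is also an assumption, since this manual only shows 200 and 500.

```python
import re

# Matches status lines such as "HTTP/1.1 200" at the start of a line.
STATUS_LINE = re.compile(r"^HTTP/\d(?:\.\d)?\s+(\d{3})", re.MULTILINE)


def export_status(script_output):
    """Return the last HTTP status code in the output, or None.

    The last status line is used so that an interim response
    (e.g. "100 Continue") does not hide the final one.
    """
    codes = STATUS_LINE.findall(script_output)
    return int(codes[-1]) if codes else None


def export_succeeded(script_output):
    """True when the final HTTP status is in the 2xx range."""
    status = export_status(script_output)
    return status is not None and 200 <= status < 300
```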

How to submit XMLs to the IBM IGC server

Before the XML files are submitted to the IBM IGC server, the Denodo asset types must have been registered in IGC. See the Register Denodo Asset Types in IGC section for a detailed explanation.

Assets must be uploaded to IGC before flows. To submit the assets, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on bundles, choose POST /bundles/assets and paste the content of the file generated by the export process whose name ends with the suffix _assets_timestamp.

Figure 18  Upload assets in IGC Rest explorer

To submit the flows, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on flows, choose POST /flows/uploads and paste the content of the file generated by the export process whose name ends with the suffix _flows_timestamp.

Figure 19  Upload flows in IGC Rest explorer
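If several exported files are handled by a script, the rule that assets go before flows can be derived from the file names alone. The sketch below uses a hypothetical helper name and does not call IGC. It assumes the files keep the _assets_<timestamp> and _flows_<timestamp> suffixes produced by the export and an .xml extension.

```python
import re

# Suffix layout from the export step, e.g. _assets_2019-10-23-124942.xml
KIND = re.compile(r"_(assets|flows)_\d{4}-\d{2}-\d{2}-\d{6}\.xml$")
ORDER = {"assets": 0, "flows": 1}


def submission_order(file_names):
    """Sort exported files so every assets file precedes every flows file."""
    def sort_key(name):
        match = KIND.search(name)
        if match is None:
            raise ValueError(f"not an export file: {name}")
        return (ORDER[match.group(1)], name)

    return sorted(file_names, key=sort_key)
```

Submitting the files in the returned order avoids uploading flows whose assets are not yet in IGC, which is the situation behind the CDIGC1072E error shown earlier.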

How to encrypt passwords

The Denodo Governance Bridge expects encrypted passwords in the input.json file to be surrounded by ENC(...). You can compute these values using the Jasypt CLI tools, and use the DENODO_EXPORT_ENCRYPTION_PASSWORD environment variable, or Java system property, to communicate the encryption password to the Denodo Governance Bridge.

This way, you can use encrypted passwords in the input.json file:

...

"password":"ENC(s2FdirMK4QORq1HZ6tcTTQ==)"

...

 

These are the steps for encrypting passwords:

  1. Download Jasypt CLI tools.

  2. Choose an encryption password, e.g., mypassword.

  3. Go to jasypt/bin.

  4. Run encrypt.bat with the input parameter and password parameter:

  • input parameter - this is the string you want to encrypt.

  • password parameter - this is the password that Jasypt is going to use to encrypt and decrypt the input parameter.

Your command should look like this:

encrypt.bat input="admin" password=mypassword

Take note of the output. Example output: zrass64ls4LIx5hdFoXXyA==.

  5. Open your input.json file and replace the password you want to encrypt with the output from Step 4: ENC(zrass64ls4LIx5hdFoXXyA==).

Before Jasypt:

"password":"admin"

After Jasypt:

"password":"ENC(zrass64ls4LIx5hdFoXXyA==)"

  6. Add an environment variable, or a Java system property, named DENODO_EXPORT_ENCRYPTION_PASSWORD to the Denodo Governance Bridge start script. Its value must be your real encryption password (mypassword in this example).

  7. Run the denodo-export-governance-bridge.sh|bat script.
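The ENC(...) convention is easy to check mechanically, for instance to spot a clear-text password left in input.json. The following sketch only handles the wrapper. It does not encrypt or decrypt anything (that remains the job of Jasypt), and the function names are hypothetical.

```python
import re

# A value counts as encrypted when it is entirely wrapped in ENC(...).
ENC_PATTERN = re.compile(r"ENC\((.+)\)")


def is_encrypted(value):
    """True if the password value uses the ENC(...) wrapper."""
    return ENC_PATTERN.fullmatch(value) is not None


def wrap(ciphertext):
    """Wrap the Jasypt output so it can be pasted into input.json."""
    return f"ENC({ciphertext})"


def unwrap(value):
    """Return the ciphertext inside ENC(...), or raise ValueError."""
    match = ENC_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError("value is not wrapped in ENC(...)")
    return match.group(1)
```

With the values from the steps above, wrap("zrass64ls4LIx5hdFoXXyA==") produces the string used in the "After Jasypt" example, while is_encrypted("admin") is false.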

Limitations

View deletion

Successive exports of the same database (or folder) performed by the Denodo Governance Bridge do not delete views in IGC that no longer exist in Denodo VDP. In these cases, the user should delete these views manually in IGC.

Lineage between Denodo elements and other IGC elements

The Denodo Governance Bridge exclusively manages Denodo elements and, in the case of Denodo JDBC elements, their relation with native IGC Database assets. It is not aware of other metadata elements for the Denodo data source, which may or may not be present in IGC and may be managed in ways that external tools such as the Denodo Governance Bridge cannot recognize.

If this limitation affects your scenario, see the Export VDP elements to XML: batch mode section for an alternative that could fit your needs.