Denodo Governance Bridge for IGC - User Manual
This is the user manual for Denodo Governance Bridge for IGC. If you are using Collibra Platform, check the Denodo Governance Bridge for Collibra - User Manual. Alternatively, if you are using Microsoft Purview, refer to the Denodo Governance Bridge for Microsoft Purview - User Manual.
Overview
The Denodo Governance Bridge for IGC exports Virtual DataPort (VDP) elements, including the lineage, to IBM® InfoSphere® Information Governance Catalog (IGC).
To achieve this, the Denodo Governance Bridge for IGC, using the Open IGC API, extends IGC by registering new types of assets from the Denodo catalog:
- Databases
- Data Sources
- Base Views
- Derived Views
- Interface Views
- Columns
- Associations
- Stored Procedures
- Parameters
- Folders
These Denodo assets:
- Are available in IGC via the “detailed browse” option and they appear in the query tool.
- Can be governed using Glossary Terms, Governance Rules, Data Stewards, Custom Properties and Collections.
- Support Data Lineage.
Figure 1 Architecture of the Denodo Governance Bridge for IGC
The Denodo Governance Bridge for IGC closes the gap between IBM IGC and the Denodo VDP metadata repository.
It offers two execution modes:
- Interactive mode. The tool loads the VDP elements, and the flow of data that defines their lineage, into IGC using the IGC REST API with Basic Authentication. Under the hood, the Denodo Governance Bridge for IGC generates XMLs from the metadata of the VDP elements, which it reads through the Denodo JDBC API. This process is hidden from the user, who only interacts with the web interface.
See the Synchronize Denodo VDP with IGC: interactive mode section for a more detailed explanation.
- Batch mode. The tool generates the XMLs from the VDP elements and their lineage, but it does not communicate with the IBM IGC server. This mode is useful when you want to post-process the XMLs (e.g. to enrich them) before manually submitting them to the IBM IGC server.
See the Export VDP elements to XML: batch mode section for a more detailed explanation.
Requirements
The Denodo Governance Bridge for IGC requires Java 8 or later, with the JAVA_HOME and PATH environment variables correctly configured. It also needs Open IGC, the technology that allows you to define new asset types within the Information Governance Catalog, with their own names, icons, custom properties and containment relationships.
The Denodo Governance Bridge for IGC requires at least:
- Information Server 11.3.1.2 release 19, or
- Information Server 11.5 rollup 1.1
as these are the first versions that allow assigning more than one parent type to an asset in Open IGC.
More information about Open IGC is available at
http://www-01.ibm.com/support/docview.wss?uid=swg21699130
Installation
The Denodo Governance Bridge distribution consists of three folders, each containing the distribution of one module:
- denodo-igc-governance-bridge-<version> for the Denodo Governance Bridge for IGC.
- denodo-collibra-governance-bridge-<version> for the Denodo Governance Bridge for Collibra.
- denodo-purview-governance-bridge-<version> for the Denodo Governance Bridge for Microsoft Purview.
Installation of the Denodo Governance Bridge for IGC
The distribution of the Denodo Governance Bridge for IGC consists of:
- Command-line executable scripts for Windows and Linux (/bin folder).
- The IGC bundle file that defines the Denodo asset types (/bundle-igc folder)
- Configuration files: application.properties and log4j2.xml (/conf folder)
- A sample JSON file required for the batch mode (/conf folder), see Export VDP elements to XML: batch mode section.
- Java libraries (/lib folder)
- Governance Bridge application jar: denodo-igc-governance-bridge-<version>.jar
- Denodo driver jar: denodo-vdp-jdbcdriver-dist-<version>-full.jar
If you need to use a different Denodo driver version than the one that is distributed, you have to replace this jar with the Denodo driver of the proper version.
In order to install the Denodo Governance Bridge for IGC, just download the .zip file and extract the tool into the desired folder.
After running the script denodo-igc-governance-bridge.sh|bat in the /bin folder, point your browser to http://localhost:10099 to access the home page:
Figure 2 Home page of the Denodo Governance Bridge for IGC
Note that the /bin folder also contains some deprecated scripts (denodo-governance-bridge.sh|bat, denodo-export-governance-bridge.sh|bat), kept to avoid compatibility issues. They will be removed in future versions.
Enable SSL Connection
SSL in IGC server
When SSL is enabled in the IGC server, the Denodo Governance Bridge for IGC has to trust the public key of the IGC server. For this to happen, you need to import the certificate into the trust store of the Denodo Governance Bridge for IGC.
- Copy the certificate file from the IGC Server to the Denodo Governance Bridge for IGC computer.
- Locate the Java Runtime Environment (JRE) the Denodo Governance Bridge for IGC uses.
- Open a command line there and execute this:
keytool -importcert -alias some_alias -file IGC_SERVER_PUBLIC_KEY.cer -keystore jre\lib\security\cacerts -storepass "changeit" -noprompt
SSL in the Denodo Governance Bridge for IGC
Configuring the Denodo Governance Bridge for IGC with HTTPS requires two steps:
- Obtaining an SSL certificate
- Configuring SSL in the Denodo Governance Bridge for IGC
In this section we focus on the second step, as it is the one that affects the Governance Bridge configuration.
To enable HTTPS in the Denodo Governance Bridge for IGC, add the Spring Boot server.ssl.* properties to the conf/application.properties file. Here is an example:
# custom port instead of the default 10099
server.port=8443
# path to the key store that holds the SSL certificate
server.ssl.key-store=path_to_keystore.jks
# password used to generate the certificate
server.ssl.key-store-password=secret
Usage
Register Denodo Asset Types in IGC
In the interactive mode, the Denodo Governance Bridge for IGC registers in IGC the Denodo asset types transparently, before the first synchronization between a VDP database and IGC.
In the batch mode, the user should register, manually, the Denodo asset types before submitting the XML files to IGC.
For this, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on bundle and choose POST /bundles/ to register the Denodo asset types. The bundle file is located in the bundle-igc folder of the Denodo Governance Bridge for IGC distribution.
Figure 3 Register bundle in IGC Rest explorer
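In scripted environments, the manual registration described above could also be sent directly to the REST API instead of going through the REST explorer. The following Python sketch only builds the request. It assumes that the bundles endpoint lives under the same /ibm/iis/igc-rest/v1 base path as the other endpoints mentioned in this manual, and that the upload is a multipart form with a field named file; both points are assumptions to verify against your IGC version. The function name, server name and credentials are illustrative.

```python
import base64
import uuid
import urllib.request


def build_bundle_request(igc_server, user, password, bundle_name, bundle_bytes):
    """Build (but do not send) a multipart POST that uploads the bundle file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        # Assumption: the REST API expects the bundle in a form field named "file".
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{bundle_name}"\r\n'.encode("utf-8"),
        b"Content-Type: application/octet-stream\r\n\r\n",
        bundle_bytes,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    # Basic Authentication header built from the IGC credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        # Assumption: same REST base path as /bundles/assets and /flows/upload.
        url=f"https://{igc_server}/ibm/iis/igc-rest/v1/bundles",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, read the bundle file from the bundle-igc folder in binary mode, pass its bytes to build_bundle_request and hand the result to urllib.request.urlopen. If the IGC server uses a certificate that Python does not trust, urlopen will also need a suitable ssl.SSLContext.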
After the registration (manual or transparent), Denodo asset types are displayed in the IGC browser in the Denodo Models group:
Figure 4 Denodo type assets in IGC
Synchronize Denodo VDP with IGC: interactive mode
First of all, the Denodo Governance Bridge for IGC needs the connection details to communicate with the Denodo VDP server:
- Host
- Port
- A database the login user has privileges to connect to. Once the authentication is complete, the user can choose which database to synchronize with IGC among all the databases the user can connect to.
- Login/password
Take into account that, for the elements involved in the synchronization process, the login user should have:
- connect privileges over the databases
- for views and stored procedures:
  - write privileges in Denodo 7.0
  - read privileges in Denodo 6.0
- for the rest of the elements:
  - metadata privileges in Denodo 7.0
  - read privileges in Denodo 6.0
Figure 5 VDP server connection
Then, the process to synchronize a Denodo VDP model with IGC is a five-step process.
Step 1. Users select which database in the VDP server they are interested in.
Figure 6 Databases listing
Step 2. The tool inspects the database and lets users choose the elements to export to IGC.
Figure 7 Database elements tree
Figure 8 Database elements selection
Step 3. The tool informs the users about the transitive dependencies for the selected elements. These dependencies will also be exported to IGC.
In the picture below we can see that the derived view web_logs_all_with_blockinfo depends on:
- Derived views:
  - web_logs_all
  - linux_web_logs
  - nonlinux_web_logs_impala
- Base views:
  - impala_web_logs
  - hbase_blocklistip
- DataSources:
  - impala_ds
  - hbase_ds
All these VDP elements will be present in IGC at the end of the process.
Figure 9 Transitive dependencies are also synchronized
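The notion of transitive dependencies used in Step 3 can be illustrated with a small graph traversal. The Python sketch below is not the tool's implementation. The element names come from the example above, but the direct dependency edges between them are hypothetical, chosen only so that the traversal reaches the same set of elements.

```python
def transitive_dependencies(element, direct_deps):
    """Return every element reachable from `element` by following direct_deps."""
    seen = set()
    stack = list(direct_deps.get(element, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, also protects against cycles
        seen.add(current)
        stack.extend(direct_deps.get(current, []))
    return seen


# Hypothetical direct dependencies: the manual lists the elements involved,
# not how they are wired together.
DIRECT_DEPS = {
    "web_logs_all_with_blockinfo": ["web_logs_all", "hbase_blocklistip"],
    "web_logs_all": ["linux_web_logs", "nonlinux_web_logs_impala"],
    "linux_web_logs": ["impala_web_logs"],
    "nonlinux_web_logs_impala": ["impala_web_logs"],
    "impala_web_logs": ["impala_ds"],
    "hbase_blocklistip": ["hbase_ds"],
}

deps = transitive_dependencies("web_logs_all_with_blockinfo", DIRECT_DEPS)
for name in sorted(deps):
    print(name)
```

Under these assumed edges, selecting only web_logs_all_with_blockinfo pulls in the three derived views, two base views and two data sources listed above.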
Step 4. Users identify which IGC Database assets are connected to Denodo JDBC DataSources to complete their lineage path. This is an optional step.
For Denodo JDBC BaseViews and Columns, it is possible to identify which other Database assets, previously imported into the Information Server Repository, participate in the lineage report. These other Database assets are Host, Database, Database Schema, Database Table, View and Database Column.
Figure 10 IGC Database assets
To achieve this, the user can identify the IGC Hosts and Databases related to each Denodo DataSource of type JDBC that will be exported. This mapping is entered by hand because the Denodo Governance Bridge for IGC cannot deduce it from the JDBC URI of the Denodo DataSources.
Figure 11 Map Denodo JDBC assets with IGC Database assets
For each Denodo JDBC DataSource that the user maps to IGC Host and Database assets, the Denodo Governance Bridge for IGC will export the lineage from its Denodo JDBC BaseViews to IGC Database Tables (or Views) and from its Denodo Columns to IGC Database Columns. This applies only to Denodo BaseViews that were not built using a SQL statement in Denodo VDP.
Important
When the property enable.virtual.assets of the conf/application.properties file is false (the default value), the IGC Hosts and Databases entered in this step must refer to existing assets in IGC. In the same way, the Database Tables, Views and Database Columns referred to by Denodo JDBC BaseViews and Columns must exist in IGC and belong to those Hosts/Databases; otherwise the IGC REST API will return an error.
When the property enable.virtual.assets is true instead, a virtual asset is created in IGC for each IGC Database asset that is referred to by a Denodo asset but does not exist in IGC. Virtual assets are only displayed in lineage reports and in the Usage Information section of the asset's Details page. The icons for virtual assets are lighter-colored than the ones used for assets in the catalog.
Step 5. Users give the details to connect to an IGC server, using Basic Authentication under the hood.
The registration of the Denodo bundle in IGC is done transparently by the Governance Bridge, before the first synchronization between a VDP database and IGC. Later updates of the Governance Bridge may need to modify the Denodo bundle; the "Update Denodo bundle version in IGC?" checkbox forces the update of the Denodo bundle in IGC. This checkbox should only be checked the first time a synchronization process is executed after installing a Governance Bridge update.
Figure 12 IGC server connection
In the end, the tool shows all the elements that were synchronized to IGC: both the selected ones and their dependencies.
The flow of data that defines the lineage is not explicitly displayed, but it will also be loaded in the IGC server.
Figure 13 Synchronization complete
Let’s show the synchronization process result on the IGC side.
The image below shows the details of the web_logs_all_with_blockinfo derived view:
- the database it belongs to
- the column assets that it contains
- the VQL expression that defined it
- etc.
Figure 14 Derived view in IGC
The two images below display the data lineage for the same view, web_logs_all_with_blockinfo.
Figure 15 Derived view lineage in IGC
Figure 16 After drilling down on the derived view lineage
The data lineage view in IGC is similar to the data lineage view available in the VDP Admin Tool. The following image shows the VDP data lineage for the web_logs_all_with_blockinfo derived view.
Figure 17 Data lineage view in VDP
Export VDP elements to XML: batch mode
The batch mode of the Denodo Governance Bridge for IGC offers a service endpoint that exports the assets and flows to XML, but it does not communicate with the IBM IGC server. This execution mode does not involve user interaction with its web interface.
This mode is useful in case you want to post-process the XMLs (e.g. enriching them with additional data in an ad-hoc process), before submitting them to the IBM IGC server.
The batch mode requires the Denodo Governance Bridge for IGC to be up and running (script denodo-igc-governance-bridge.sh|bat). Then, execute the script denodo-igc-export-governance-bridge.sh|bat, located in the bin directory of the distribution:
$ cd denodo-governance-bridge-<VERSION>
$ bin/denodo-igc-export-governance-bridge.sh conf/input.json
This script uses curl for invoking the export service endpoint of the Denodo Governance Bridge for IGC. You can check if you have curl installed in your system using the command:
$ curl --version
If curl is not installed, you can get it from https://curl.haxx.se/dlwiz/.
The denodo-igc-export-governance-bridge.sh|bat script needs a JSON file with the configuration:
input.json sample file
{
  "vdpServerUrl": "//host:port/db",
  "login": "userlogin",
  "password": "ENC(encrypted_password)",
  "outputFile": "dir/file.xml",
  "startDateISO8601": "2019-06-04T14:20:00+02:00",
  "endDateISO8601": "date in ISO 8601 format",
  "mappings": [
    {"datasource": "denodo_datasource1", "host": "igc_host1", "database": "igc_database1"},
    {"datasource": "denodo_datasource2", "host": "igc_host2", "database": "igc_database2"}
  ]
}
The parameters in the JSON file are:
- vdpServerUrl (required): //host:port/database. All the database elements will be exported, filtered by date if any is supplied.
- login (required): the login user, that must have privileges to connect to the database that is being exported.
- password (required): the user's password. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.
- outputFile (required): the file and directory where the XML files will be written.
The export process will generate two files:
- one for the assets, adding the suffix "_assets_timestamp"
- one for the flows, adding the suffix "_flows_timestamp"
E.g., if the outputFile is C:/export/denodo_sakila.xml, the resulting files will be:
- C:/export/denodo_sakila_assets_2019-10-23-124942.xml
- C:/export/denodo_sakila_flows_2019-10-23-124942.xml
Note that the user account running the Denodo Governance Bridge for IGC needs write privileges in the output folder.
- startDateISO8601 (optional): The tool will export VDP elements that were created or modified after the specified date. The date should be specified in ISO 8601 format, e.g. 2019-06-04T14:20:00+02:00.
- endDateISO8601 (optional): The tool will export VDP elements that were created or modified before the specified date. The date should be specified in ISO 8601 format, e.g. 2019-06-05T14:20:00+02:00.
When values are provided for both startDateISO8601 and endDateISO8601, the VDP elements created or modified between the two dates will be exported.
If startDateISO8601 is not provided, all the VDP elements that were created or modified before endDateISO8601 will be exported.
If endDateISO8601 is not provided, all the VDP elements that were created or modified after startDateISO8601 will be exported.
If neither startDateISO8601 nor endDateISO8601 is provided, all the VDP elements of the database will be exported.
- mappings (optional): the IGC Host and the IGC Database related to each Denodo DataSource of type JDBC that will be exported. This information is required for linking Denodo lineage with other existing assets, as the Governance Bridge cannot deduce it from the JDBC URI of the Denodo DataSource.
This field is mandatory if you want to complete the lineage path between Denodo JDBC DataSources and other IGC Database assets that were previously imported into the Information Server Repository. These other Database assets are Host, Database, Database Schema, Database Table, View and Database Column.
For each Denodo JDBC DataSource that the user maps to IGC Host and Database assets, the Denodo Governance Bridge for IGC will export the lineage from its Denodo JDBC BaseViews to IGC Database Tables (or Views) and from its Denodo Columns to IGC Database Columns. This applies only to Denodo BaseViews that were not built using a SQL statement in Denodo VDP.
Important
When the property enable.virtual.assets of the conf/application.properties file is false (the default value), the IGC Hosts and Databases entered in the input.json file must refer to existing assets in IGC. In the same way, the Database Tables, Views and Database Columns referred to by Denodo JDBC BaseViews and Columns must exist in IGC and belong to those Hosts/Databases; otherwise, when the XML files are submitted to the IGC server, the IGC REST API will return an error.
When the property enable.virtual.assets is true instead, a virtual asset is created in IGC for each IGC Database asset that is referred to by a Denodo asset but does not exist in IGC. Virtual assets are only displayed in lineage reports and in the Usage Information section of the asset's Details page. The icons for virtual assets are lighter-colored than the ones used for assets in the catalog.
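When exports are scheduled, it can be convenient to generate input.json from a script rather than edit it by hand. The Python sketch below assembles the parameters described above, predicts the two output file names and mirrors the documented date filter. All function names are illustrative, the sample server URL and credentials are placeholders, and treating the date boundaries as exclusive is an assumption: the manual only says "after" and "before".

```python
import json
import os.path
from datetime import datetime


def build_export_config(vdp_server_url, login, password, output_file,
                        start_date=None, end_date=None, mappings=()):
    """Return the content of input.json as a dict."""
    required = {"vdpServerUrl": vdp_server_url, "login": login,
                "password": password, "outputFile": output_file}
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    config = dict(required)
    for name, value in (("startDateISO8601", start_date),
                        ("endDateISO8601", end_date)):
        if value:
            # Raises ValueError if the text is not a date such as
            # 2019-06-04T14:20:00+02:00.
            datetime.fromisoformat(value)
            config[name] = value
    if mappings:
        config["mappings"] = [
            {"datasource": datasource, "host": host, "database": database}
            for datasource, host, database in mappings
        ]
    return config


def expected_output_files(output_file, timestamp):
    """Names of the assets and flows files derived from outputFile."""
    root, ext = os.path.splitext(output_file)
    return (f"{root}_assets_{timestamp}{ext}", f"{root}_flows_{timestamp}{ext}")


def in_export_window(modified, start=None, end=None):
    """Documented date filter; exclusive boundaries are an assumption."""
    if start is not None and modified <= start:
        return False
    if end is not None and modified >= end:
        return False
    return True


config = build_export_config(
    "//host:9999/sakila", "userlogin", "ENC(encrypted_password)",
    "C:/export/denodo_sakila.xml",
    start_date="2019-06-04T14:20:00+02:00",
    mappings=[("denodo_datasource1", "igc_host1", "igc_database1")],
)
print(json.dumps(config, indent=2))
```

Writing the result of json.dumps to conf/input.json produces a file that can be passed to the export script. The in_export_window function only documents the intended semantics; the filtering itself is done by the Denodo Governance Bridge for IGC.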
The output of the script includes the HTTP response headers. You can check the HTTP status code to see if the export process was OK:
$ bin/denodo-igc-export-governance-bridge.sh conf/input.json
HTTP/1.1 200
Content-Type: text/plain;charset=UTF-8
Content-Length: 21
Date: Tue, 8 June 2019 15:44:24 GMT

Successfully exported
Or it failed:
$ bin/denodo-igc-export-governance-bridge.sh conf/input.json
HTTP/1.1 500
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Date: Tue, 8 June 2019 15:58:35 GMT
Connection: close

{"timestamp":1559059115009,"status":500,"error":"Internal Server Error","exception":"org.springframework.jdbc.CannotGetJdbcConnectionException","message":"Could not get JDBC Connection; nested exception is java.sql.SQLException: authentication error: Database 'test' not found","path":"/export"}
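If the export script is called from a scheduler, the status check can be automated. The Python sketch below extracts the HTTP status code and the body from the captured output of the script. It assumes that the output follows the usual layout of an HTTP response (status line, headers, an empty line, then the body); the function names are illustrative.

```python
import re


def parse_export_output(output):
    """Return (status_code, body) from the text printed by the export script."""
    lines = [line.strip() for line in output.strip().splitlines()]
    for index, line in enumerate(lines):
        # The status line looks like "HTTP/1.1 200".
        match = re.match(r"HTTP/\d(?:\.\d)?\s+(\d{3})", line)
        if match:
            status = int(match.group(1))
            break
    else:
        raise ValueError("no HTTP status line found in the output")
    rest = lines[index + 1:]
    body = ""
    if "" in rest:
        # Assumption: headers and body are separated by an empty line.
        body = "\n".join(rest[rest.index("") + 1:])
    return status, body


def export_succeeded(output):
    """True when the export endpoint answered with a 2xx status code."""
    status, _ = parse_export_output(output)
    return 200 <= status < 300
```

The output can be captured with subprocess.run(..., capture_output=True, text=True) and passed to export_succeeded. For a failed export, the body is the JSON error document, which json.loads can parse to read its message field.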
How to submit XMLs to the IBM IGC server
Before submitting the XML files to the IBM IGC server, the Denodo asset types must have been registered in IGC. See the Register Denodo Asset Types in IGC section for a detailed explanation.
Assets should be uploaded before flows to IGC. To submit the assets, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on bundles, choose POST /bundles/assets and paste the content of the file generated by the exportation process that ends with the suffix: _assets_timestamp.
- REST endpoint is https://<IGC-SERVER>/ibm/iis/igc-rest/v1/bundles/assets.
Figure 18 Upload assets in IGC Rest explorer
To submit the flows, go to https://<IGC-SERVER>/ibm/iis/igc-rest-explorer, click on flows, choose POST /flows/upload and paste the content of the file generated by the export process whose name ends with the suffix _flows_timestamp.
- REST endpoint is https://<IGC-SERVER>/ibm/iis/igc-rest/v1/flows/upload.
Figure 19 Upload flows in IGC Rest explorer
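The two uploads can also be scripted against the REST endpoints listed above. The following Python sketch posts the assets file first and the flows file second, using Basic Authentication. The application/xml content type is an assumption to check against your IGC version, and the function names, server name and credentials are illustrative.

```python
import base64
import urllib.request

IGC_REST_BASE = "https://{server}/ibm/iis/igc-rest/v1"


def build_xml_post(server, path, xml_text, user, password):
    """Build (but do not send) a POST request carrying an XML document."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=IGC_REST_BASE.format(server=server) + path,
        data=xml_text.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            # Assumption: the endpoints accept the XML as application/xml.
            "Content-Type": "application/xml",
        },
    )


def submit_export(server, assets_file, flows_file, user, password):
    """Upload the assets file first, then the flows file."""
    uploads = (("/bundles/assets", assets_file), ("/flows/upload", flows_file))
    for path, filename in uploads:
        with open(filename, encoding="utf-8") as handle:
            request = build_xml_post(server, path, handle.read(), user, password)
        # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses,
        # so a failed assets upload stops the loop before the flows upload.
        with urllib.request.urlopen(request) as response:
            print(path, response.status)
```

A call such as submit_export("igc.example.com:9445", "denodo_sakila_assets_2019-10-23-124942.xml", "denodo_sakila_flows_2019-10-23-124942.xml", "user", "password") would perform both uploads in the required order. If the IGC server uses a certificate that Python does not trust, urlopen also needs a suitable ssl.SSLContext.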
Limitations
View deletion
Successive exports of the same database (or folder) performed by the Denodo Governance Bridge for IGC do not delete views in IGC that no longer exist in Denodo VDP. In these cases, the user should delete these views manually in IGC.
Lineage between Denodo elements and other IGC elements
The Denodo Governance Bridge for IGC exclusively manages Denodo elements and, in case of Denodo JDBC elements, their relation with native IGC Database assets. It is not aware of the presence of other metadata elements for the Denodo data source, elements that could be present in IGC or not, and managed in ways that might not be recognizable by external tools such as the Denodo Governance Bridge for IGC.
If this limitation affects your scenario, see the Export VDP elements to XML: batch mode section for an alternative that could fit your needs.
How to encrypt passwords
The Denodo Governance Bridge for IGC expects encrypted passwords in the input.json to appear surrounded by ENC(...). You can compute these values using the Jasypt CLI tools, and use the DENODO_EXPORT_ENCRYPTION_PASSWORD environment variable, or Java system property, to communicate the encryption password to the Denodo Governance Bridge.
This way, you can use encrypted passwords in the input.json file:
...
"password":"ENC(s2FdirMK4QORq1HZ6tcTTQ==)"
...
These are the steps for encrypting passwords:
- Download Jasypt CLI tools.
- Choose an encryption password, e.g., mypassword.
- Go to jasypt/bin.
- Run encrypt.bat with the input parameter and password parameter:
- input parameter - this is the string you want to encrypt.
- password parameter - this is the password that Jasypt is going to use to encrypt and decrypt the input parameter.
Your command should look like this:
Take note of the output. Example output: zrass64ls4LIx5hdFoXXyA==.
- Open your input.json file, replace the password you want to encrypt with the output from Step 4: ENC(zrass64ls4LIx5hdFoXXyA==).
Example in the input.json file:
Before Jasypt: "password":"admin"
After Jasypt: "password":"ENC(zrass64ls4LIx5hdFoXXyA==)"
- Add an environment variable, or a Java system property in the Denodo Governance Bridge start script, named DENODO_EXPORT_ENCRYPTION_PASSWORD, whose value is your real encryption password (mypassword in this example).
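The last two steps can be scripted once the Jasypt output is available. The Python sketch below wraps that output in ENC(...) inside the input.json content and checks that the encryption password is present in the environment. The function names are illustrative, and the check only covers the environment variable, not the Java system property alternative.

```python
import os


def with_encrypted_password(config, jasypt_output):
    """Return a copy of the input.json content with the password in ENC(...)."""
    updated = dict(config)
    updated["password"] = f"ENC({jasypt_output})"
    return updated


def uses_encrypted_password(config):
    """True when the password value is surrounded by ENC(...)."""
    password = config.get("password", "")
    return password.startswith("ENC(") and password.endswith(")")


def check_encryption_setup(config, environ=os.environ):
    """Fail early if the password is encrypted but the key is not configured."""
    if (uses_encrypted_password(config)
            and "DENODO_EXPORT_ENCRYPTION_PASSWORD" not in environ):
        raise RuntimeError(
            "DENODO_EXPORT_ENCRYPTION_PASSWORD is not set in the environment"
        )


config = with_encrypted_password(
    {"login": "admin", "password": "admin"}, "zrass64ls4LIx5hdFoXXyA=="
)
print(config["password"])
```

Note that the environment variable must be visible to the process that runs the Denodo Governance Bridge for IGC, so a check like this is only meaningful when executed in that same environment.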