Denodo Governance Bridge for Microsoft Purview - User Manual
This is the user manual for Denodo Governance Bridge for Microsoft Purview. If you are using IBM® InfoSphere® Information Governance Catalog (IGC), check the Denodo Governance Bridge for IGC - User Manual. Alternatively, if you are using Collibra Platform, check the Denodo Governance Bridge for Collibra - User Manual.
Overview
Denodo Governance Bridge for Microsoft Purview retrieves metadata from Denodo Platform, transforms it, and upserts it into Microsoft Purview as assets and relationships.
The following metadata is fetched from Denodo Platform:
- Databases
- Data Sources
- Base Views and Derived Views
- Columns
To carry out this integration, Denodo Governance Bridge for Microsoft Purview requires a metamodel in Purview that provides the structure for representing Denodo elements as Purview assets. The table below shows how the elements are mapped:
Denodo | Purview Definition Type |
Database | Denodo_DB |
Data Source | Denodo_Datasource |
Base View and Derived Views | Denodo_Denodo_View |
Column | Denodo_Column |
Installation
The Denodo Governance Bridge distribution consists of three folders, each containing the distribution of one module:
- denodo-igc-governance-bridge-<version> for the Denodo Governance Bridge for IGC.
- denodo-collibra-governance-bridge-<version> for the Denodo Governance Bridge for Collibra.
- denodo-purview-governance-bridge-<version> for the Denodo Governance Bridge for Microsoft Purview.
Installation of the Denodo Governance Bridge for Purview
The distribution of the Denodo Governance Bridge for Purview consists of:
- Command-line executable scripts for Windows and Linux (/bin folder).
- Configuration files (/conf folder): application.properties, log4j2.xml and input.json (the latter is only an example input for running the synchronization service).
- Java libraries (/lib folder):
  - Governance Bridge application jar: denodo-purview-governance-bridge-<version>.jar
  - Denodo driver jar: denodo-vdp-jdbcdriver-dist-<version>-full.jar
If you need to use a different Denodo driver version than the one that is distributed, you have to replace this jar with the Denodo driver of the proper version.
In order to install the Denodo Governance Bridge for Purview, just download the .zip file and extract the tool into the desired folder.
Before running the Denodo Governance Bridge for Microsoft Purview, the user has to review and complete the application.properties file. It has some default properties that can be reset and some connection information that has to be configured. See the section on the Configuration of the Denodo Governance Bridge for Microsoft Purview for more detailed information.
After completing the configuration, start the application by running the script denodo-purview-governance-bridge.sh|bat, available in the /bin folder. Once it is running, you can trigger the synchronization.
Configuration of the Denodo Governance Bridge for Microsoft Purview
The application.properties file, available in the /conf folder, allows the user to set the properties required to run the application.
Purview connection properties
- purview.tenant.id: unique identifier of the Azure Active Directory instance
- purview.client.id: client identifier of the Azure application used for the access
- purview.client.secret: client secret of the Azure application used for the access
- purview.endpoint: URL of the Microsoft Purview service
Denodo connection properties
- denodo.driver-class-name: the class name of the Denodo JDBC driver
- denodo.host: Denodo host name
- denodo.username: username used to connect to Denodo Platform
- denodo.password: password used to connect to Denodo Platform. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.
- denodo.url: Denodo JDBC connection URL
- denodo.catalog.url: Denodo Data Catalog URL
Governance bridge for Purview properties
- prefix.qualilified.name: Prefix for building the qualified name parameter of the assets in Purview.
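As a pre-flight check, the properties listed above can be verified before starting the bridge. The following Python sketch is not part of the distribution. It parses a Java-style properties file in a simplified way (no line continuations or escape sequences) and reports which of the keys named above are missing or empty. Treating every listed key as mandatory is an assumption, and the key names are copied from the lists above, with the endpoint key assumed to be the lower-case purview.endpoint, so adjust them to match your file.

```python
from pathlib import Path

# Keys named in this manual; treating all of them as mandatory is an assumption.
EXPECTED_KEYS = [
    "purview.tenant.id",
    "purview.client.id",
    "purview.client.secret",
    "purview.endpoint",
    "denodo.driver-class-name",
    "denodo.host",
    "denodo.username",
    "denodo.password",
    "denodo.url",
    "denodo.catalog.url",
    "prefix.qualilified.name",  # spelled exactly as in the list above
]


def parse_properties(text):
    """Parse simple key=value (or key:value) lines, skipping comments and blanks."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' or ':' separator, whichever comes first.
        positions = [p for p in (line.find("="), line.find(":")) if p != -1]
        if not positions:
            continue
        cut = min(positions)
        props[line[:cut].strip()] = line[cut + 1:].strip()
    return props


def missing_keys(props, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or have an empty value."""
    return [key for key in expected if not props.get(key)]


def check_file(path="conf/application.properties"):
    """Read the file at `path` and return the list of missing keys."""
    return missing_keys(parse_properties(Path(path).read_text(encoding="utf-8")))
```

Calling check_file() from the root of the extracted distribution returns an empty list when every expected key has a value.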
Usage
Denodo Governance Bridge for Microsoft Purview
Create Purview metamodel
The Denodo Governance Bridge for Microsoft Purview needs a specific metamodel on the Microsoft Purview instance. The first time the synchronization service is executed, several entity and relationship definitions are created in Purview.
Asset Types
Asset Type | Description | Parent Asset Type |
Denodo_db | Represents a Denodo database | azure_resource |
Denodo_Datasource | Represents a Denodo data source | azure_resource |
Denodo_view | Represents a Denodo base view or derived view | Purview_Table, DataSet |
Denodo_column | Represents a Denodo column | DataSet |
Denodo_process | Represents the lineage between two entities | Process |
Attribute Types
Denodo_db
Attribute Type | Description |
vdbname | Database name in VDP. |
Denodo_view
Attribute Type | Description |
name | View name in VDP |
vdbName | Database name in VDP. |
qualifiedName | View identifier in Purview |
catalogURL | Data Catalog URL of the view |
viewType | Base view or derived view |
Denodo_datasource
Attribute Type | Description |
name | Data source name in VDP |
vdbName | Database name in VDP |
qualifiedName | Data source identifier in Purview |
Denodo_column
Attribute Type | Description |
name | Column name in VDP |
vdbName | Database name in VDP |
viewName | View name in VDP |
qualifiedName | Column identifier in Purview |
userDescription | Column description |
data_type | Column data type |
Relationship Types
Name | End1 name | End1 type | End1 cardinality | End2 name | End2 type | End2 cardinality | Category |
denodo_db_view | database | denodo_db | SET | db | denodo_view | SINGLE | Composition |
denodo_view_column | columns | denodo_column | SET | view | denodo_view | SINGLE | Composition |
denodo_view_dependency | dependencies | denodo_view | SET | used_by | denodo_view_dependency | SINGLE | AGGREGATION |
denodo_db_datasource | datasources | denodo_db | SET | db | denodo_datasource | SINGLE | Association |
denodo_datasource_view | views | denodo_datasource | SET | datasource | denodo_view | SINGLE | Composition |
Collections
Collections in Purview are used to organize assets. The synchronized assets can be inserted into a Purview collection that already exists; the target collection is set with the collectionId input parameter.
Delete Outdated Assets
During synchronization, the Governance Bridge for Microsoft Purview may find obsolete assets, that is, assets that exist in Purview but are no longer in VDP. The forceDeletion input parameter determines what happens in that case. When its value is false, the application only warns that there are assets that could be deleted, and the synchronization is not carried out. When its value is true, the synchronization is carried out and the obsolete assets are deleted from Microsoft Purview.
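The rule above can be summarised as a small decision function. This Python sketch only models the behaviour described in this section: the function name and return values are illustrative and not part of the bridge, and it assumes that a run with no obsolete assets proceeds normally.

```python
def plan_synchronization(obsolete_assets, force_deletion):
    """Model the forceDeletion rule described above.

    Returns a (synchronize, delete) pair:
      synchronize -- whether the synchronization is carried out
      delete      -- the obsolete assets that would be removed from Purview
    """
    if not obsolete_assets:
        # Assumption: nothing is obsolete, so the run proceeds normally.
        return True, []
    if force_deletion:
        # forceDeletion = true: synchronize and delete the obsolete assets.
        return True, list(obsolete_assets)
    # forceDeletion = false: only a warning is issued, nothing is synchronized.
    return False, []
```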
Synchronize Denodo VDP with Microsoft Purview Platform instance
Once the Denodo Governance Bridge for Microsoft Purview is up and running (script denodo-purview-governance-bridge.sh|bat), it offers two ways to trigger the synchronization of Denodo Platform metadata with Microsoft Purview.
Service endpoint
The Denodo Governance Bridge for Microsoft Purview exposes a service endpoint that must be called with an HTTPS POST request:
https://<server-host>:<server.port>/api/sync |
The server.port is 8442 by default but the user can configure it in the application.properties file.
The request body should be a JSON object defining the collectionId where the Purview assets will be stored, the Denodo databases that will be synchronized, and the forceDeletion flag. If no collection is specified in the input, the assets are created without being assigned to a collection. Example:
{ "collectionId": "8dxanz", "databases": ["db1","db2"], "forceDeletion": "false" } |
Accordingly, the Content-Type header must be added:
Content-Type: application/json |
The response reports the number of created and updated assets of each type, together with the list of deleted assets. Example:
{ "databases" : { "created" : 2, "updated" : 0 }, "datasources" : { "created" : 4, "updated" : 0 }, "views" : { "created" : 12, "updated" : 0 }, "columns" : { "created" : 266, "updated" : 0 }, "deletedAssets": [] } |
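The same request can be sent from any HTTP client. As an illustration, the Python sketch below builds the POST request with the standard library only. The helper names and the host name used in the usage note are assumptions, and forceDeletion is sent as a string only because the request example above does so. Certificate handling for HTTPS is left at the Python defaults, which may need adjusting if the bridge uses a self-signed certificate.

```python
import json
import urllib.request


def build_sync_request(host, databases, collection_id=None,
                       force_deletion=False, port=8442):
    """Build (but do not send) the POST request for the /api/sync endpoint."""
    body = {
        "databases": list(databases),
        # The manual's example passes the flag as a string, so do the same.
        "forceDeletion": "true" if force_deletion else "false",
    }
    if collection_id is not None:
        body["collectionId"] = collection_id
    return urllib.request.Request(
        url=f"https://{host}:{port}/api/sync",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def total_changes(summary):
    """Sum the created and updated counters of a synchronization response."""
    totals = {"created": 0, "updated": 0}
    for counters in summary.values():
        if isinstance(counters, dict):  # skips the "deletedAssets" list
            for key in totals:
                totals[key] += counters.get(key, 0)
    return totals


def synchronize(host, databases, **options):
    """Send the request and return the decoded JSON response."""
    request = build_sync_request(host, databases, **options)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, synchronize("bridge.example.com", ["db1", "db2"], collection_id="8dxanz") would post the body shown in the request example, where bridge.example.com is a placeholder host. Passing the result to total_changes gives one overall created/updated count.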
After the synchronization, the user will have the Denodo metadata available in Microsoft Purview:
Figure 1 Denodo metadata in Microsoft Purview
Synchronization script
In the bin directory of the distribution, you will find the script denodo-purview-synchronize-governance-bridge.sh|bat. You can execute it to trigger the synchronization:
$ cd denodo-purview-governance-bridge-<VERSION>
$ bin/denodo-purview-synchronize-governance-bridge.sh conf/input.json |
This script uses curl for invoking the synchronization service endpoint of the Denodo Governance Bridge for Microsoft Purview. You can check if you have curl installed in your system using the command:
$ curl --version |
If curl is not installed, you can get it from https://curl.haxx.se/dlwiz/.
The denodo-purview-synchronize-governance-bridge.sh|bat script needs a JSON file with a JSON object defining the collectionId where the Purview assets will be stored and the Denodo databases that will be synchronized.
Example:
{ "collectionId": "8dxanz", "databases": ["db1","db2"], "forceDeletion": "false" } |
The output shows information about created, updated and deleted asset counts. Example:
$ bin/denodo-purview-synchronize-governance-bridge.sh conf/input.json
HTTP/1.1 200
Content-Type: application/json
Transfer-Encoding: chunked
Date: Fri, 28 Jun 2024 08:12:17 GMT

{"databases":{"created":2,"updated":0,"deleted":0},"datasources":{"created":4,"updated":0},"views":{"created":12,"updated":0},"columns":{"created":266,"updated":0},"deletedAssets": []} |
Note that the output of the script also includes the HTTP response headers. You can check the HTTP status code to see whether the synchronization succeeded.
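Because the script prints the response headers followed by the JSON body, a wrapper can check the status code automatically. The Python sketch below assumes the output has the shape shown in the example above (a status line, header lines, then the JSON body on a single line). The function name is illustrative and not part of the distribution.

```python
import json


def parse_script_output(output):
    """Split the script output into (status_code, headers, body)."""
    lines = [line.strip() for line in output.strip().splitlines()]
    # Status line, e.g. "HTTP/1.1 200": the code is the second token.
    status_code = int(lines[0].split()[1])
    headers, body = {}, None
    for line in lines[1:]:
        if line.startswith("{"):
            # The first line starting with '{' is taken as the JSON body.
            body = json.loads(line)
            break
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip()] = value.strip()
    return status_code, headers, body
```

A calling script could, for example, treat any status code other than 200 as a failed synchronization and inspect body["deletedAssets"] otherwise.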
Limitations
- The user is responsible for using the same collection consistently when synchronizing a database. If the same databases are synchronized against different collections, assets are created in several Purview collections.
- The REST API of Purview does not allow inserting relationships in batch mode, which makes the synchronization slower.
- The lineage is only supported at the view level.
- The synchronization of tags is not supported by Governance Bridge for Microsoft Purview.
How to encrypt passwords
The Denodo Governance Bridge for Microsoft Purview expects encrypted passwords in the application.properties to appear surrounded by ENC(...). You can compute these values using the Jasypt CLI tools, and use the DENODO_EXPORT_ENCRYPTION_PASSWORD environment variable, or Java system property, to communicate the encryption password to the Denodo Governance Bridge.
This way, you can use encrypted passwords in the application.properties file:
... password=ENC(s2FdirMK4QORq1HZ6tcTTQ==) ... |
These are the steps for encrypting passwords:
- Download Jasypt CLI tools.
- Choose an encryption password, e.g., mypassword.
- Go to jasypt/bin.
- Run encrypt.bat with the input parameter and password parameter:
- input parameter - this is the string you want to encrypt.
- password parameter - this is the password that Jasypt is going to use to encrypt and decrypt the input parameter.
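Your command should look like this (here admin is the string to encrypt and mypassword is the encryption password):
encrypt.bat input=admin password=mypassword |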
Take note of the output. Example output: zrass64ls4LIx5hdFoXXyA==.
- Open your application.properties file and replace the password you want to encrypt with the output of the encrypt command, surrounded by ENC(...): ENC(zrass64ls4LIx5hdFoXXyA==).
Example in the application.properties file:
Before Jasypt: password=admin
After Jasypt: password=ENC(zrass64ls4LIx5hdFoXXyA==) |
- Add an environment variable, or a Java system property, to the Denodo Governance Bridge start script, with the name DENODO_EXPORT_ENCRYPTION_PASSWORD and your real encryption password as its value (mypassword in this example).
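Before starting the bridge, it can be useful to confirm that the encryption password is available whenever the configuration contains ENC(...) values. The Python sketch below is an illustrative helper, not part of the distribution. It only inspects text and the environment, does not decrypt anything, and checks the environment variable only (not the Java system property alternative).

```python
import os
import re

ENC_PATTERN = re.compile(r"^ENC\((.+)\)$")
ENV_VAR = "DENODO_EXPORT_ENCRYPTION_PASSWORD"


def wrap_encrypted(ciphertext):
    """Wrap a Jasypt output string in the ENC(...) marker."""
    return f"ENC({ciphertext})"


def encrypted_keys(properties_text):
    """Return the property keys whose value is wrapped in ENC(...)."""
    keys = []
    for line in properties_text.splitlines():
        # Split at the first '=' only, since the encoded value may end in '='.
        key, sep, value = line.partition("=")
        if sep and ENC_PATTERN.match(value.strip()):
            keys.append(key.strip())
    return keys


def encryption_password_missing(properties_text, environ=os.environ):
    """True if ENC(...) values are present but the variable is not set."""
    return bool(encrypted_keys(properties_text)) and not environ.get(ENV_VAR)
```

If encryption_password_missing returns True for the contents of application.properties, the bridge would have no password to decrypt the ENC(...) values with, unless it is supplied as a Java system property instead.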