Denodo Governance Bridge for Collibra - User Manual

This is the user manual for Denodo Governance Bridge for Collibra. If you are using IBM® InfoSphere® Information Governance Catalog (IGC), check the Denodo Governance Bridge for IGC - User Manual. Alternatively, if you are using Microsoft Purview, refer to the Denodo Governance Bridge for Microsoft Purview - User Manual.

Overview

Denodo Governance Bridge for Collibra retrieves metadata from Denodo Platform, transforms it, and upserts it into a Collibra Platform instance as assets and complex relations.

Collibra Data Lineage does not currently support Denodo as a data source. However, it allows you to create a custom technical lineage that follows a specification called batch definition. Conforming to this specification, the Denodo Governance Bridge for Collibra can create a folder with the files needed to load the Denodo technical lineage into Collibra via Edge or via the Lineage Harvester.

Denodo Governance Bridge for Collibra Synchronization Overview

Denodo Governance Bridge for Collibra provides a service that keeps the Collibra Platform instance synchronized with the Denodo metadata.

The following metadata is fetched from Denodo Platform:

  • Databases
  • Folders
  • Data Sources
  • Base Views
  • Derived Views
  • Interface Views
  • Columns

To carry out this integration, Denodo Governance Bridge for Collibra requires a metamodel that provides the structure for representing Denodo elements as Collibra assets. The table below shows how the data is mapped:

| Denodo | Collibra Platform Asset Type |
|---|---|
| Database | Schema |
| Folder | Denodo Folder |
| Data Source | Denodo Data Source |
| Base View | Table |
| Derived View | Database View (Table type attribute: Derived View) |
| Interface View | Database View (Table type attribute: Interface View) |
| Materialized View | Database View (Table type attribute: Materialized View) |
| Private View | Database View (Table type attribute: Private View) |
| Column | Column |

The following images show a Denodo database sample and its representation in Collibra Platform:

Figure 1 Denodo Platform database sample

Figure 2 Diagram example in Collibra Platform result of a Denodo Governance Bridge synchronization

Denodo Governance Bridge for Collibra is compatible with Collibra Edge since version 20240123. This means that a synchronization performed via Edge can be completed with the metadata imported using the Denodo Governance Bridge for Collibra, and vice versa. However, the changes required to achieve this compatibility, which align structures and naming conventions with those of Edge, break the backward compatibility of the Denodo Governance Bridge for Collibra.

Denodo Governance Bridge for Collibra Technical Lineage Overview

Denodo Governance Bridge for Collibra provides a service endpoint that generates a folder with the custom technical lineage for Denodo according to the batch definition format. This technical lineage can be loaded into Collibra via the Lineage Harvester or via Edge.

Requirements

The Denodo Governance Bridge for Collibra requires Java 8 or later, with the JAVA_HOME and PATH environment variables correctly configured.

If the names of the elements to be synchronized, or for which you are going to create technical lineage, contain Unicode characters (for example, space characters), Unicode support must be enabled in the Denodo Virtual DataPort server for the synchronization in Collibra to be correct.

Installation

The distribution of the Denodo Governance Bridge for Collibra consists of:

  • Command-line executable scripts for Windows and Linux (/bin folder)

  • Configuration files: application.properties and log4j2.xml (/conf folder)

  • A sample JSON file, input-sync.json, required for the execution using the script to synchronize (/conf folder). See the Synchronization script subsection under the Synchronize Denodo VDP with Collibra Platform instance section.

  • A sample JSON file, input-technical-lineage.json, required for the execution using the script to generate technical lineage (/conf folder). See the Technical lineage generation script subsection under the Generate technical lineage section.

  • Java libraries (/lib folder)
  • Denodo Governance Bridge application jar:
    denodo-collibra-governance-bridge-<version>-jar

  • Denodo driver jar: denodo-vdp-jdbcdriver-<version>-full.jar

If you need a different Denodo driver version from the one distributed, replace this jar with the Denodo driver of the required version.

In order to install the Denodo Governance Bridge for Collibra, just download the .zip file and extract the tool into the desired folder.

Before running the Denodo Governance Bridge for Collibra, the user has to review and complete the application.properties file. It has some default properties that can be reset and some connection information that should be set. Consult the section Configuration of the Denodo Governance Bridge for Collibra for more detailed information.

After completing the configuration, run the script denodo-collibra-governance-bridge.sh|bat, available in the /bin folder, to start the application. Once it is running, you can trigger its operations.

Configuration of the Denodo Governance Bridge for Collibra

The application.properties file, available in the /conf folder, allows the user to set the properties required to run the application.

Collibra connection properties

  • collibra.url: URL of the Collibra Platform instance.
  • collibra.username: username for the Collibra Platform instance.
  • collibra.password: password for the Collibra Platform instance. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.

Denodo connection properties

  • denodo.driver-class-name: the class name of the Denodo JDBC driver.
  • denodo.host: Denodo host name.
  • denodo.username: username used to connect to Denodo Platform.
  • denodo.password: password used to connect to Denodo Platform. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.
  • denodo.url: Denodo JDBC connection URL.
  • denodo.restful.base-url: URL where the Denodo RESTful web service is deployed.
  • denodo.restful.views-path: path to the views.
  • denodo.datacatalog.base-url: URL for accessing the Denodo Data Catalog (available since Denodo 8.0).
  • denodo.datacatalog.views-path: path to the views (available since Denodo 8.0).

Proxy connection properties (optional)

  • collibra.proxy.host: host of the proxy server. Set it if the user wants to connect to the Collibra Platform instance through a proxy.
  • collibra.proxy.port: port of the proxy server. Set it if the user wants to connect to the Collibra Platform instance through a proxy.
  • collibra.proxy.username: proxy username. Set it if the user wants to connect to the Collibra Platform instance through a proxy with authentication.
  • collibra.proxy.password: proxy password. Set it if the user wants to connect to the Collibra Platform instance through a proxy with authentication. It can be encrypted or in clear text. See the How to encrypt passwords section for a detailed explanation.

Collibra custom constants

These properties map the custom types that the application requires to represent the Denodo elements in Collibra (see the Create Collibra metamodel section for further information) to their Collibra Platform IDs.

If the user set up the Collibra Platform instance by executing the script available in the Import Collibra Operating Model (Python) package, using the resources folder generated by the Denodo Governance Bridge for Collibra (see the Create the metamodel with the script offered by Collibra to import the Operating Model section), these values can be left as is. Otherwise, check the Mapping Collibra Platform type identifiers to custom constants subsection in the Create Collibra metamodel section.

Technical lineage properties

There are two optional properties regarding the generation of technical lineage:

  • technical.lineage.destination.folder: the destination folder name. This folder will be available in the denodo-collibra-governance-bridge-<version> directory, after triggering the technical lineage generation, and will contain the files and folders necessary to create the technical lineage in Collibra. The default value is “TechnicalLineage”.

  • technical.lineage.flows.number.file: the number of flows per lineage.json file. This lineage file is created to define the lineage relation between two or more data objects. Each flow contains the path from a source to a target and defines the transformation code or transformation references to be processed by the Collibra Data Lineage service. If the user does not set a number of flows per file, the service will create a single lineage_1.json file.

Metamodel resources property

There is an optional property related to the generation of resources to create the necessary metamodel in Collibra:

  • metamodel.resources.destination.folder: the destination folder name. This folder will be available in the denodo-collibra-governance-bridge-<version> directory, after triggering the metamodel resources generation, and will contain the folders and files necessary to create the metamodel in Collibra using a script offered by Collibra in its Marketplace (Import Collibra Operating Model). The default value is “resources”.

Private views synchronization property

There is a property related to the synchronization of private views in Collibra:

  • private.views.sync: indicates whether private views are synchronized in Collibra. The default value is false; to synchronize private views in Collibra, set it to true.

Usage

Create Collibra metamodel

The Denodo Governance Bridge for Collibra needs a specific metamodel on the Collibra Platform instance. This expected metamodel includes some out-of-the-box types but it also includes custom types that the user should create before running this application.

This section outlines the custom elements that must be available on the Collibra Platform instance, so that the user can create the metamodel on their own. Alternatively, this Denodo-Collibra integration can generate a resources folder with the files needed to create the required metamodel using the script included in the Import Collibra Operating Model (Python) package, available on the Collibra Marketplace. See the Create the metamodel with the script offered by Collibra to import the Operating Model section for more detailed information on the latter option.

Domain

The user must create one or more domains of the Physical Data Dictionary type, where the Denodo integration assets will be upserted.

The files included in the resources folder that can be generated using the Denodo Governance Bridge for Collibra (see the Create the metamodel with the script offered by Collibra to import the Operating Model section) create a root-level community named Denodo-Collibra Integration that has a domain named Denodo Data Dictionary. Note that only one domain (Denodo Data Dictionary) is created. However, this integration supports separate domains for each Denodo database.

Asset Types

| Asset Type | Description | Parent Asset Type |
|---|---|---|
| Denodo Folder | Represents a Denodo folder | BI Folder |
| Denodo Data Source | Represents a Denodo data source | BI Data Source |

Attribute Types

| Attribute Type | Description | Kind |
|---|---|---|
| Dependency Type | The dependency (e.g. Union) between two views | Text |
| Multiplicity | The multiplicity of the column-to-column associations | Text |
| Left Role Description | The left role description of the column-to-column associations | Text |
| Right Role Description | The right role description of the column-to-column associations | Text |
| Cache Status | The cache status of the respective view | Text |
| View Status | The status of the respective view (available since Denodo 8.0) | Text |

Relation Types

| Head | Role | Co-Role | Tail |
|---|---|---|---|
| Schema | contains | is part of | Denodo Folder |

Complex Relation Types

| Head | Leg 1 | Leg 2 | Attributes | Description |
|---|---|---|---|---|
| Denodo View Dependency | Name: source; Asset Type: Table (1:N) | Name: target; Asset Type: Table (1:1) | Dependency Type | Complex relation type used to provide the Denodo views lineage |
| Denodo Column Association | Name: source; Asset Type: Column (1:1) | Name: target; Asset Type: Column (1:1) | Multiplicity, Left Role Description, Right Role Description | Complex relation type used to represent the column-to-column associations |

Assignments

The following assignments should be created:

| Asset Type | Add |
|---|---|
| Schema | Relations: contains Denodo Folder |
| Denodo Folder | Relations: is part of Schema |
| Table | Attributes: URL, Cache Status, View Status; Complex Relations: Denodo View Dependency |
| Database View | Attributes: URL, Cache Status, View Status; Complex Relations: Denodo View Dependency |
| Column | Complex Relations: Denodo Column Association |

Note that even after you have correctly created the metamodel in Collibra, you should verify that the corresponding asset layouts include the attributes and relations you want to see.

Mapping Collibra Platform type identifiers to custom constants

If the metamodel was created without using the resources directory generated by the Denodo Governance Bridge for Collibra, the custom constant identifiers in the application.properties file have to be updated.

The table below shows the mapping between Collibra Platform type and custom constants:

| Collibra Platform type | Custom constant |
|---|---|
| Denodo Folder | customconstant.denodo.folder |
| Denodo Data Source | customconstant.denodo.data.source |
| Dependency Type | customconstant.denodoview.complexrelation.dependencytype |
| Multiplicity | customconstant.denodoassociation.complexrelation.multiplicity |
| Left Role Description | customconstant.denodoassociation.complexrelation.leftroledescription |
| Right Role Description | customconstant.denodoassociation.complexrelation.rightroledescription |
| Cache Status | customconstant.cache.status |
| View Status | customconstant.view.status |
| Schema contains/is part of Denodo Folder | customconstant.schema.contains.denodofolder |
| Denodo View Dependency | customconstant.denodoview.dependency |
| Denodo View Dependency: source | customconstant.denodoview.complexrelation.source |
| Denodo View Dependency: target | customconstant.denodoview.complexrelation.target |
| Denodo Column Association | customconstant.denodocolumn.association |
| Denodo Column Association: source | customconstant.denodoassociation.complexrelation.source |
| Denodo Column Association: target | customconstant.denodoassociation.complexrelation.target |
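
When the metamodel has been created manually, it can help to confirm that every custom constant in the mapping above has a value before starting the application. The following Python sketch is illustrative only and is not part of the distribution. It assumes simple key=value lines in application.properties, and the helper names read_properties and missing_custom_constants are hypothetical.

```python
# Illustrative check, not part of the Denodo Governance Bridge distribution.
# The property names are copied from the mapping table in this section.
REQUIRED_CONSTANTS = [
    "customconstant.denodo.folder",
    "customconstant.denodo.data.source",
    "customconstant.denodoview.complexrelation.dependencytype",
    "customconstant.denodoassociation.complexrelation.multiplicity",
    "customconstant.denodoassociation.complexrelation.leftroledescription",
    "customconstant.denodoassociation.complexrelation.rightroledescription",
    "customconstant.cache.status",
    "customconstant.view.status",
    "customconstant.schema.contains.denodofolder",
    "customconstant.denodoview.dependency",
    "customconstant.denodoview.complexrelation.source",
    "customconstant.denodoview.complexrelation.target",
    "customconstant.denodocolumn.association",
    "customconstant.denodoassociation.complexrelation.source",
    "customconstant.denodoassociation.complexrelation.target",
]


def read_properties(text):
    """Parse simple key=value lines; blank lines and comment lines are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_custom_constants(text):
    """Return the required constants that are absent or empty in the given text."""
    props = read_properties(text)
    return [name for name in REQUIRED_CONSTANTS if not props.get(name)]
```

To use it, pass the contents of conf/application.properties to missing_custom_constants; an empty list means that every constant has a value. The sketch does not verify that the values are valid Collibra identifiers.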

Create the metamodel with the script offered by Collibra to import the Operating Model

Collibra offers, in its Marketplace, a script designed to streamline the import of a Collibra Operating Model. This script reads the files in a resources directory, which the user must place in the same path as the script, and creates or updates the Collibra metamodel elements required by the Denodo Governance Bridge for Collibra.

Generate the metamodel resources folder

Once the Denodo Governance Bridge for Collibra is up and running (script denodo-collibra-governance-bridge.sh|bat), it offers two methods to generate a folder with the resource files:

  • Service endpoint
  • Metamodel resources generation script

Service endpoint

The Denodo Governance Bridge for Collibra offers an endpoint service that should be requested using an HTTPS GET method:

 https://<server-host>:<server.port>/api/generateMetamodelResources

The server.port is 8442 by default but the user can configure it in the application.properties file.
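
As an illustration of the request described above, the following Python sketch builds the GET request with the standard library. The host name localhost and the helper name metamodel_resources_request are assumptions made for the example; only the path and the default port come from this manual.

```python
import urllib.request


def metamodel_resources_request(host, port=8442):
    """Build (but do not send) the GET request for the metamodel resources endpoint."""
    url = f"https://{host}:{port}/api/generateMetamodelResources"
    return urllib.request.Request(url, method="GET")


request = metamodel_resources_request("localhost")
print(request.get_method(), request.full_url)

# Sending the request requires a running bridge and a certificate that the
# client trusts:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```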

Metamodel resources generation script

In the bin directory of the distribution you will find the script denodo-collibra-generate-metamodel-resources-governance-bridge.sh|bat. You can execute it to generate the resources folder:

$ cd denodo-collibra-governance-bridge-<VERSION>
$ bin/denodo-collibra-generate-metamodel-resources-governance-bridge.sh

This script uses curl for invoking the service endpoint of the Denodo Governance Bridge for Collibra that generates the resources. You can check if you have curl installed in your system using the command:

$ curl --version

        

If curl is not there you can install it from https://curl.haxx.se/dlwiz/.

Metamodel resources result folder and metamodel creation using it

After the metamodel resources generation, the user will have a folder in the denodo-collibra-governance-bridge-<version> directory with the name specified in the metamodel.resources.destination.folder property of the application.properties file. By default the destination folder name is resources.  Note that if this folder already exists it will be deleted during the process of creating the new metamodel resources files for the current request.

The result directory will contain a subfolder for each type of element that the script can create or update:

  • AssetType
  • Assignment
  • AttributteType
  • Community
  • ComplexRelationType
  • Domain
  • RelationType
  • Role
  • Status

The JSON files in each subfolder contain detailed information about every element of the Collibra metamodel, indicated as required in the Create Collibra metamodel section. This allows for prior inspection and verification before creation/updating. The user can make changes directly in the JSON files or create new ones to fit their specific requirements. After that, following the Import Collibra Operating Model documentation, the metamodel resources folder can be used by the install_operating_model.py script to create the necessary Collibra metamodel.

Synchronize Denodo VDP with Collibra Platform instance

Once the Denodo Governance Bridge for Collibra is up and running (script denodo-collibra-governance-bridge.sh|bat), it offers four methods that trigger the synchronization of Denodo Platform metadata with Collibra Platform:

  • Service endpoint
  • Synchronization script
  • Collibra Platform Workflow
  • Cron Scheduler

Note that, because the Denodo Governance Bridge for Collibra is compatible with Collibra Edge since version 20240123, there is a mandatory property that specifies the name of the Collibra Edge connection created to communicate with the Denodo instance that holds the data to be synchronized. This property must be specified in every synchronization so that the features offered by Edge and by the Denodo Governance Bridge for Collibra can be unified. Depending on the method used to trigger the process, it is set in the request body or through properties defined in the application.properties file. See the corresponding section for more detailed information.

The Collibra Edge connection whose name is required can be created in Collibra after performing the synchronization with the Denodo Governance Bridge for Collibra. If it is created with the name specified in that synchronization, and the same pair (Denodo database, Collibra domain ID) is specified in both synchronizations, the metadata will be synchronized via both Edge and the Denodo Governance Bridge for Collibra. Note, however, that the changes required to achieve this compatibility, which align structures and naming conventions with those of Edge, break the backward compatibility of the Denodo Governance Bridge for Collibra.

Service endpoint

The Denodo Governance Bridge for Collibra offers an endpoint service that should be requested using an HTTPS POST method:

 https://<server-host>:<server.port>/api/sync

The server.port is 8442 by default but the user can configure it in the application.properties file.

The request body should be a JSON object that defines the name of the Collibra Edge connection created to communicate with the Denodo instance that holds the data to be synchronized and, for each database, its name and the Collibra domain ID that should be used. Example:

{
   "edgeConnectionName":"DenodoEdge",
   "databaseDomains":{
          "denodo_database_01":"b6d78bb4-8905-4f6c-88a7-039540eeeff3",
          "denodo_database_02":"b6d78bb4-8905-4f6c-88a7-039540eeeff3"
   }
}

Accordingly, the Content-Type header must be added:

 Content-Type: application/json
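
The request described above could be assembled as in the following Python sketch, which uses only the standard library. The host name localhost and the helper name build_sync_request are assumptions made for the example; the field names, path, default port and header come from this section. The request is built but not sent.

```python
import json
import urllib.request


def build_sync_request(host, edge_connection_name, database_domains, port=8442):
    """Build (but do not send) the POST request for the synchronization endpoint."""
    body = {
        "edgeConnectionName": edge_connection_name,
        # Maps each Denodo database name to the Collibra domain ID to use.
        "databaseDomains": database_domains,
    }
    return urllib.request.Request(
        f"https://{host}:{port}/api/sync",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


sync_request = build_sync_request(
    "localhost",
    "DenodoEdge",
    {"denodo_database_01": "b6d78bb4-8905-4f6c-88a7-039540eeeff3"},
)
print(sync_request.get_method(), sync_request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step needs a running bridge and a certificate that the client trusts.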

The output shows information about created and un/modified asset counts. Example:

{
    "Column": {
        "created": 15,
        "un/modified": 95
    },
    "Database View": {
        "created": 0,
        "un/modified": 1
    },
    "Database": {
        "created": 0,
        "un/modified": 1
    },
    "Table": {
        "created": 5,
        "un/modified": 11
    },
    "Denodo Data Source": {
        "created": 1,
        "un/modified": 1
    },
    "Schema": {
        "created": 0,
        "un/modified": 1
    },
    "Denodo Folder": {
        "created": 1,
        "un/modified": 2
    }
}
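
A client that calls the endpoint may want to reduce this output to overall totals. The Python sketch below shows one way to do so, assuming the response has exactly the shape of the example above (one object per asset type with the keys created and un/modified). The helper name summarize_sync_output is hypothetical.

```python
import json


def summarize_sync_output(raw_json):
    """Return (total created, total un/modified) across all asset types."""
    counts = json.loads(raw_json)
    created = sum(entry["created"] for entry in counts.values())
    unmodified = sum(entry["un/modified"] for entry in counts.values())
    return created, unmodified


# Two of the entries from the example output above.
sample = '{"Column": {"created": 15, "un/modified": 95}, "Table": {"created": 5, "un/modified": 11}}'
print(summarize_sync_output(sample))
```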

After the synchronization, the user will have the Denodo metadata available in the Collibra Platform instance:

Figure 3  Domain in Collibra Platform instance with Denodo Platform metadata

Synchronization script

In the bin directory of the distribution you will find the script denodo-collibra-synchronize-governance-bridge.sh|bat. You can execute it to trigger the synchronization:

$ cd denodo-collibra-governance-bridge-<VERSION>
$ bin/denodo-collibra-synchronize-governance-bridge.sh conf/input-sync.json

This script uses curl for invoking the synchronization service endpoint of the Denodo Governance Bridge for Collibra. You can check if you have curl installed in your system using the command:

$ curl --version

        

If curl is not there you can install it from https://curl.haxx.se/dlwiz/.

The denodo-collibra-synchronize-governance-bridge.sh|bat script needs a file containing a JSON object that defines the name of the Collibra Edge connection created to communicate with the Denodo instance that holds the data to be synchronized and, for each Denodo database, its name and the Collibra domain ID that should be used.

input-sync.json sample file

{
   "edgeConnectionName":"DenodoEdge",
   "databaseDomains":{
          "Database1":"b6d78bb4-8905-4f6c-88a7-039540eeeff3",
          "Database2":"b6d78bb4-8905-4f6c-88a7-039540eeeff3"
   }
}


The output shows information about created and un/modified asset counts. Example:

$ bin/denodo-collibra-synchronize-governance-bridge.sh conf/input-sync.json
HTTP/1.1 200
Content-Type: application/json
Transfer-Encoding: chunked
Date: Fri, 30 Jun 2023 09:49:12 GMT

{"Column":{"created":99,"un/modified":0},"Table":{"created":3,"un/modified":0},"Database":{"created":0,"un/modified":1},"Schema":{"created":1,"un/modified":0},"Denodo Folder":{"created":2,"un/modified":0},"Denodo Data Source":{"created":2,"un/modified":0}}

Note that the output of the script also includes the HTTP response headers. You can check the HTTP status code to see whether the synchronization succeeded.
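
If the script is run from an automated job, the status code can be extracted from the captured output. The Python sketch below assumes the output has the layout of the example above, with an HTTP status line before the headers; the helper name http_status_from_output is hypothetical.

```python
def http_status_from_output(output):
    """Return the status code of the first 'HTTP/...' status line, or None."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
            return int(parts[1])
    return None


captured = "HTTP/1.1 200\nContent-Type: application/json\n\n{\"Schema\":{\"created\":1,\"un/modified\":0}}"
status = http_status_from_output(captured)
# Treat any 2xx status as success.
print(status is not None and 200 <= status < 300)
```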

Collibra Platform Workflow

Whenever the user wants a Collibra Workflow to communicate with the Denodo Governance Bridge for Collibra, they can create a user task for the integration to pick it up.

The Collibra Workflow can be a simple workflow that creates a user task called Sync Data User Task. This workflow should be configured to be applied to specific domain types that reside under a particular Collibra Community. After that, the user can start the workflow by simply pressing a button on the domain:

Figure 4 Sync Data User Task button

Check the Collibra Workflow Event-Driven Triggering for Spring Boot Integrations documentation, available in the Collibra Marketplace, for more detailed information.

Note that the user has to configure, in the application.properties file, the trigger.collibra.workflow properties:

  • trigger.collibra.workflow.enabled: true to enable the Collibra Workflow triggering
  • trigger.collibra.workflow.polling-frequency.ms: frequency in milliseconds used by the Denodo Governance Bridge for Collibra to check for Collibra Workflow triggers
  • trigger.collibra.workflow.user-task: name of the Collibra Workflow user task that should be used
  • trigger.collibra.workflow.edge.connection.name: name of the Collibra Edge connection created to communicate with the Denodo instance that holds the data to be synchronized
  • trigger.collibra.workflow.database: Denodo database name that should be processed once the Collibra Workflow is triggered using the Sync Data User Task
  • trigger.collibra.workflow.domain.id: ID of the Collibra domain that must be used when the Collibra Workflow is triggered using the Sync Data User Task

Cron Scheduler

The synchronization, using the Denodo Governance Bridge for Collibra, can be triggered using a cron scheduler. To accomplish this, there are two properties in the application.properties file:

  • trigger.scheduler.cron.enabled: true if the user wants to use this triggering mode
  • trigger.scheduler.cron.expression: cron expression to establish the frequency when the synchronization should be automatically triggered. This cron expression has 6 fields:

  1. The second at which it should be triggered. (Valid values are: 0 to 59)
  2. The minute at which it should be triggered. (Valid values are: 0 to 59)
  3. The hour at which it should be triggered. (Valid values are: 0 to 23)
  4. The day of the month at which it should be triggered. (Valid values are: 1 to 31)
  5. The month at which it should be triggered. (Valid values are: 1 to 12, or JAN-DEC)
  6. The day of the week at which it should be triggered. (Valid values are: 0 to 7, or MON-SUN)

Example:

trigger.scheduler.cron.enabled=true
trigger.scheduler.cron.expression=0 0 9-17 * * MON-FRI

Based on this cron configuration example, the Denodo Governance Bridge for Collibra would be triggered every hour on the hour, from 9:00 to 17:00, on weekdays.

For more information about the cron expression, please refer to the Spring Boot documentation.
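
To make the six-field layout concrete, the Python sketch below splits the example expression into named fields and expands the hour field. It is a simplified illustration, not the parser used by the application: it handles only *, single numbers, ranges and comma-separated lists, and it does not interpret names such as MON-FRI. The helper names are hypothetical.

```python
FIELD_NAMES = ["second", "minute", "hour", "day_of_month", "month", "day_of_week"]


def split_cron(expression):
    """Split a six-field cron expression into a dict keyed by field name."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


def expand_numeric_field(field, low, high):
    """Expand '*', 'N', 'A-B' and comma-separated lists of those into sorted values."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(low, high + 1))
        elif "-" in part:
            start, end = (int(item) for item in part.split("-"))
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    if any(value < low or value > high for value in values):
        raise ValueError(f"value out of range {low}-{high}: {field}")
    return sorted(values)


fields = split_cron("0 0 9-17 * * MON-FRI")
hours = expand_numeric_field(fields["hour"], 0, 23)
print(hours)
```

For the example expression, the hour field expands to the nine values 9 through 17, which is why the synchronization also runs at 17:00.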

Note that this synchronization mode uses the values set in the collibra.edge.connection.name and collibra.domain.ids properties of the application.properties file:

collibra.edge.connection.name=edgeConnectionName
collibra.domain.ids={\
    "Denodo_database_01": "12d2a285-6dc3-4368-ba87-6a9295377fce",\
    "Denodo_database_02": "08245c84-10bc-4a33-a36b-8f71715a123f"\
}
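
The trailing backslashes in collibra.domain.ids continue the value on the next line, so the property holds a single JSON object that maps each Denodo database to a Collibra domain ID. The Python sketch below illustrates that reading under simplifying assumptions (it only joins lines ending in a backslash and splits on the first = sign); it is not the parser used by the application, and the helper name read_property is hypothetical.

```python
import json

SAMPLE = r"""
collibra.edge.connection.name=edgeConnectionName
collibra.domain.ids={\
    "Denodo_database_01": "12d2a285-6dc3-4368-ba87-6a9295377fce",\
    "Denodo_database_02": "08245c84-10bc-4a33-a36b-8f71715a123f"\
}
"""


def read_property(text, key):
    """Return the value of `key`, joining lines that end with a backslash."""
    logical_lines = []
    buffer = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            buffer += line[:-1]  # drop the continuation marker and keep reading
            continue
        logical_lines.append(buffer + line)
        buffer = ""
    for line in logical_lines:
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            return value.strip()
    return None


domain_ids = json.loads(read_property(SAMPLE, "collibra.domain.ids"))
print(sorted(domain_ids))
```

Checking that the joined value parses as JSON is a quick way to catch a missing comma or backslash before enabling the scheduler.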

Denodo technical lineage

The Denodo Governance Bridge for Collibra shows relations between Denodo elements that give rise to Denodo Derived Views.

Figure 5 Denodo View Dependency example

However, the technical lineage is not created as a result of a synchronization. Instead, the Denodo Governance Bridge for Collibra can generate technical lineage that can then be loaded into Collibra.

Considerations

The Denodo assets are related to the corresponding connection in Collibra via Edge. Before synchronizing a Denodo database via Edge, the user has to register a data source. This registration creates an initial structure in a selected community in Collibra Data Catalog and enables both the synchronization and the generation of the relation of type Technology Asset groups / is grouped by Technology Asset between a System asset and the Denodo database asset.

Without this synchronization via Edge, the assets have no connection to the System asset, so the technical lineage cannot stitch the columns and tables to their corresponding assets in the Collibra Data Catalog. This applies both to assets that correspond to Denodo elements and to assets from other sources (the data sources of the synchronized Denodo databases) for which you want to establish technical lineage through the files generated by the Denodo Governance Bridge for Collibra.

To stitch the elements, the Denodo technical lineage generated by the Denodo Governance Bridge for Collibra specifies the system data object name. Consequently, the useCollibraSystemName property in the lineage harvester configuration file should be set to true.

Note that the metadata synchronizations between Collibra and Denodo made via Edge and via the Denodo Governance Bridge for Collibra are compatible. This means that, even though a synchronization via Edge is necessary, a synchronization via the Denodo Governance Bridge for Collibra should also be performed in order to have more complete information.

Generate technical lineage

Once the Denodo Governance Bridge for Collibra is up and running (script denodo-collibra-governance-bridge.sh|bat), it offers two methods to generate a folder with the Denodo technical lineage:

  • Service endpoint
  • Technical lineage generation script

In addition, the Denodo Governance Bridge for Collibra offers a service endpoint which generates a compressed .zip file with all the Denodo technical lineage files and sources. This endpoint returns the data via payload and avoids writes on the server file system. 

Service endpoint

The Denodo Governance Bridge for Collibra offers two endpoint services that should be requested using an HTTPS POST method.

The first one generates a folder with the Denodo technical lineage:

 https://<server-host>:<server.port>/api/generateTechnicalLineage

The second endpoint allows the user to download the compressed .zip file generated with all the Denodo technical lineage:

 https://<server-host>:<server.port>/api/downloadTechnicalLineage

The server.port is 8442 by default but the user can configure it in the application.properties file.

The request body should be a JSON object defining:

  • The System asset name that was selected when registering the data source related to the Denodo databases for which you want to generate the technical lineage information.

  • A JSON array with the Denodo database names the technical lineage is to be generated for.

Example:

{
   "systemAssetName":"systemAssetName",
   "denodoDatabases":["db01", "db02", "db03"]
}

Optionally, if you want to customize the elements for which Denodo generates Technical Lineage, you should add a filter in the request body. This filter must specify the views that you want to be part of the Technical Lineage using ‘*’ as a wildcard character.

Example:

{
   "systemAssetName":"systemAssetName",
   "denodoDatabases":["db01", "db02", "db03"],
   "filters":["view*", "*test"]
}

In this example, the technical lineage will be generated only for views whose names start with ‘view’ or end with ‘test’.
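
The effect of such filters can be previewed with a small Python sketch. It treats ‘*’ as the only wildcard and compares names case-sensitively; both points are assumptions, since this manual does not describe the matching rules in more detail. The helper name matches_any_filter is hypothetical.

```python
import re


def matches_any_filter(view_name, filters):
    """Return True if the view name matches at least one '*' wildcard filter."""
    for pattern in filters:
        # Escape everything, then turn the escaped '*' back into "any characters".
        regex = re.escape(pattern).replace(r"\*", ".*")
        if re.fullmatch(regex, view_name):
            return True
    return False


filters = ["view*", "*test"]
for name in ["view_customers", "orders_test", "orders"]:
    print(name, matches_any_filter(name, filters))
```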

Optionally, if you want the technical lineage to include data sources that are also synchronized with Collibra, add a mapping to the request body. This mapping must specify the System asset and database related to each Denodo database and data source that will be exported.

Because the Denodo Governance Bridge for Collibra cannot deduce the default schema of a source database, the default schema can also be included in the mapping. This is required for Denodo base views built from a SQL statement that does not specify the schema because the source applies a default one.

Example:

{
   "systemAssetName":"systemAssetName",
   "denodoDatabases":["db01", "db02", "db03"],
   "mappings":[
    {"denodoDatabase":"db01", "dataSource":"ds01", "sourceSystemAssetName":"CollibraSystemAsset01", "sourceDatabase":"sourceDB01", "defaultSchemaSourceDatabase":"public"},
    {"denodoDatabase":"db02", "dataSource":"ds02", "sourceSystemAssetName":"CollibraSystemAsset02", "sourceDatabase":"sourceDB02", "defaultSchemaSourceDatabase":"dbo"}
  ]
}

The mappings field is mandatory if you want to complete the lineage path between Denodo data sources and other Collibra Database assets that were previously imported into Collibra.

Because the request body is JSON, the Content-Type header must be set accordingly:

 Content-Type: application/json
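As an illustration of how the URL, the JSON body and the Content-Type header fit together, the following Python sketch builds the request with the standard library without sending it. The function name build_lineage_request and the host localhost are hypothetical, and the POST method is an assumption, since this section does not state the HTTP method. Any authentication the service may require is omitted.

```python
import json
import urllib.request


def build_lineage_request(host: str, port: int, body: dict) -> urllib.request.Request:
    """Build (but do not send) a request for the downloadTechnicalLineage endpoint."""
    url = f"https://{host}:{port}/api/downloadTechnicalLineage"
    payload = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the manual section does not name the method
    )


body = {
    "systemAssetName": "systemAssetName",
    "denodoDatabases": ["db01", "db02", "db03"],
    "filters": ["view*", "*test"],
}
# 8442 is the default server.port; "localhost" is a placeholder host.
request = build_lineage_request("localhost", 8442, body)
print(request.get_method(), request.full_url)
```

Passing the object to urllib.request.urlopen would send it; the response body of this endpoint is the .zip file.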

With the first endpoint, the output shows information about the Denodo databases processed and included in the technical lineage generated. Example:

{

   "Technical Lineage generated for Denodo databases : ": "[db01, db02, db03]"

}

With the second endpoint, the output is the generated .zip file.

Technical lineage generation script

In the bin directory of the distribution you will find the script denodo-collibra-generate-technical-lineage-governance-bridge.sh|bat. You can execute it in order to generate the Denodo technical lineage for Collibra:

$ cd denodo-collibra-governance-bridge-<VERSION>

$ bin/denodo-collibra-generate-technical-lineage-governance-bridge.sh conf/input-technical-lineage.json

Alternatively, you can execute it with a second argument in order to download a .zip file with the Denodo technical lineage for Collibra:

$ cd denodo-collibra-governance-bridge-<VERSION>

$ bin/denodo-collibra-generate-technical-lineage-governance-bridge.sh conf/input-technical-lineage.json  /download-directory

In this case, the second parameter of the script is the path to the directory where you want the .zip file to be downloaded.

 

This script uses curl for invoking the service endpoint of the Denodo Governance Bridge for Collibra that generates the Denodo technical lineage. You can check if you have curl installed in your system using the command:

$ curl --version

        

If curl is not installed, you can download it from https://curl.haxx.se/dlwiz/.

The denodo-collibra-generate-technical-lineage-governance-bridge.sh|bat script needs a JSON file containing a JSON object that defines two things: the System asset that identifies the assets linked to a previously synchronized Denodo connection, and the names of the Denodo databases for which the technical lineage is to be generated.

Optionally, to extend the technical lineage to data sources that are also synchronized with Collibra, add a mappings array to the JSON file. Each mapping must specify the System asset and the database that correspond to a Denodo database and data source being exported.

The Denodo Governance Bridge for Collibra cannot deduce the default schema of a source database, so the mapping can also include it. This is required for Denodo base views that were built in Denodo from a SQL statement that does not specify the schema because the source applies a default one.

input-technical-lineage.json sample file

{

   "systemAssetName":"systemAssetName",

   "denodoDatabases":["bookstore", "test01"],

   "mappings":[

    {"denodoDatabase":"bookstore", "dataSource":"ds01", "sourceSystemAssetName":"CollibraSystemAsset01", "sourceDatabase":"sourceDB01", "defaultSchemaSourceDatabase":"public"},

    {"denodoDatabase":"test01", "dataSource":"ds02", "sourceSystemAssetName":"CollibraSystemAsset02", "sourceDatabase":"sourceDB02", "defaultSchemaSourceDatabase":"dbo"}

  ]

}
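If the input file is produced by another tool, a small Python sketch such as the following could assemble the same structure. The function name build_lineage_input is hypothetical, and the check that every mapping refers to a listed Denodo database is an assumption added here as a sanity check, not a rule stated by the bridge.

```python
import json


def build_lineage_input(system_asset_name, denodo_databases, mappings=()):
    """Return the JSON object expected in input-technical-lineage.json."""
    body = {
        "systemAssetName": system_asset_name,
        "denodoDatabases": list(denodo_databases),
    }
    checked = []
    for mapping in mappings:
        # Sanity check (an assumption, not a documented rule): a mapping should
        # point at one of the Denodo databases listed in the same file.
        if mapping["denodoDatabase"] not in body["denodoDatabases"]:
            raise ValueError(
                f"mapping refers to unknown Denodo database: {mapping['denodoDatabase']}"
            )
        checked.append(dict(mapping))
    if checked:
        body["mappings"] = checked
    return body


lineage_input = build_lineage_input(
    "systemAssetName",
    ["bookstore", "test01"],
    [
        {
            "denodoDatabase": "bookstore",
            "dataSource": "ds01",
            "sourceSystemAssetName": "CollibraSystemAsset01",
            "sourceDatabase": "sourceDB01",
            "defaultSchemaSourceDatabase": "public",
        },
    ],
)
# Print the JSON; redirect it to conf/input-technical-lineage.json if desired.
print(json.dumps(lineage_input, indent=3))
```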


The output shows the name of the Denodo databases for which the technical lineage has been created. Example:

$ bin/denodo-collibra-generate-technical-lineage-governance-bridge.sh conf/input-technical-lineage.json

HTTP/1.1 200

Content-Type: application/json

Transfer-Encoding: chunked

Date: Tue, 18 Jun 2024 10:53:14 GMT

{"Technical Lineage generated for Denodo databases : ":"[bookstore, test01]"}

Note that the output of the script also includes the HTTP response headers. You can check the HTTP status code to verify that the process succeeded.
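When the script is run from an automated job, the status code can be extracted from the captured output. The following Python sketch shows one way to do it; the function name last_http_status is hypothetical and the sample text is the output shown above.

```python
def last_http_status(script_output: str):
    """Return the last HTTP status code found in the script output, or None."""
    status = None
    for line in script_output.splitlines():
        parts = line.split()
        # A status line looks like "HTTP/1.1 200".
        if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
            status = int(parts[1])
    return status


sample_output = """HTTP/1.1 200
Content-Type: application/json
Transfer-Encoding: chunked
Date: Tue, 18 Jun 2024 10:53:14 GMT

{"Technical Lineage generated for Denodo databases : ":"[bookstore, test01]"}
"""

status = last_http_status(sample_output)
succeeded = status is not None and 200 <= status < 300
print("succeeded" if succeeded else "failed or unknown")
```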

If the user chooses the download zip option of the script, the output shows the progress of the .zip file download:

$ bin/denodo-collibra-generate-technical-lineage-governance-bridge.sh conf/input-technical-lineage.json C:/Collibra

 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

100   527  100    22  100   505     21    488  0:00:01  0:00:01 --:--:--   510

Technical lineage result folder

After the technical lineage is generated, the denodo-collibra-governance-bridge-<version> directory contains a folder with the name specified in the technical.lineage.destination.folder property of the application.properties file. By default, the destination folder name is TechnicalLineage. Note that if this folder already exists, it is deleted before the new Denodo technical lineage files for the current request are created.

If the user chooses the download zip option of the endpoint or the script, the user will have a .zip file, called TechnicalLineage.zip by default, in the directory chosen when downloading the file instead of the files generated in the technical.lineage.destination.folder.

The destination folder or .zip file will contain:

  • The metadata file (metadata.json). This file is used to provide the JSON architecture version, the data source type, and asset type UUIDs of the assets you want to include in the technical lineage.
  • One or more lineage_<id>.json files. These files are used to define the lineage relation between two or more data objects.
  • The source_codes folder. This subfolder contains the transformation code in several VQL files.

These files follow the Collibra batch definition format.
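Before pointing the Lineage Harvester or Edge at the result, the expected layout can be checked with a short Python sketch like the one below. The function name check_lineage_folder is hypothetical; the sketch only verifies that the three items listed above exist and does not validate their contents against the batch definition format.

```python
from pathlib import Path


def check_lineage_folder(folder) -> list[str]:
    """Return a list of problems found in a generated technical lineage folder."""
    root = Path(folder)
    if not root.is_dir():
        return [f"folder does not exist: {root}"]
    problems = []
    if not (root / "metadata.json").is_file():
        problems.append("missing metadata.json")
    # At least one lineage_<id>.json file is expected.
    if not any(root.glob("lineage_*.json")):
        problems.append("no lineage_<id>.json files")
    if not (root / "source_codes").is_dir():
        problems.append("missing source_codes folder")
    return problems


# TechnicalLineage is the default destination folder name.
print(check_lineage_folder("TechnicalLineage"))
```

An empty list means that the three expected items are present.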

Create the Denodo Technical Lineage in Collibra

The user can synchronize the generated custom technical lineage (see Generate technical lineage section), available in the TechnicalLineage folder or in the folder specified in the technical.lineage.destination.folder property, via the Lineage Harvester or via Edge.

The directory with the Denodo technical lineage should be referred to by the lineage harvester configuration file, when the process is going to be triggered via Lineage Harvester, or by the Shared Storage connection, when the process is going to be triggered via Edge. As a result of the synchronization, Collibra Data Lineage generates a technical lineage based on the definitions included in the Denodo technical lineage folder.

Figure 6 Denodo technical lineage example

If you select the "Show code" button, the VQL corresponding to the view on which you are viewing the technical lineage is displayed at the bottom, as shown in the image above.

Note that, to stitch assets in Collibra Data Catalog to data objects collected by the lineage harvester, the assets must have been added with an Edge synchronization, because Edge establishes the connection between the assets and the System asset. The System asset is what identifies the assets linked to a given Denodo connection, or to another type of data source involved in the technical lineage generated by the Denodo Governance Bridge for Collibra. See the Considerations section for more detailed information.

If you have added the mapping between the data sources used by Denodo and the same sources synchronized in Collibra (see the Generate technical lineage section for further details), you will be able to see the complete information in the technical lineage viewer.

Figure 7 Denodo technical lineage including lineage to sources

Note that Denodo private views can be synchronized in Collibra if you set the private.views.sync property to true or if you are using a version prior to 20251106. These Denodo internal views are not linked to a System asset, therefore they are not stitched by the Collibra Data Lineage.

Figure 8 Denodo technical lineage with a private view example

For the sake of simplicity, since the 20251106 version, the technical lineage does not include the Denodo private views.

Limitations

Sample data

The Denodo Governance Bridge for Collibra transforms and upserts Denodo Platform metadata to a Collibra Platform instance but it does not collect sample data. You can use Edge in order to establish a connection to Denodo using the JDBC driver and then register, profile and classify Denodo data via Edge.

How to encrypt passwords

The Denodo Governance Bridge for Collibra expects encrypted passwords in the application.properties to appear surrounded by ENC(...). You can compute these values using the Jasypt CLI tools, and use the DENODO_EXPORT_ENCRYPTION_PASSWORD environment variable, or Java system property, to communicate the encryption password to the Denodo Governance Bridge.

This way, you can use encrypted passwords in the application.properties file:

...

password=ENC(s2FdirMK4QORq1HZ6tcTTQ==)

...

 

These are the steps for encrypting passwords:

  1. Download Jasypt CLI tools.

  2. Choose an encryption password, e.g., mypassword.

  3. Go to jasypt/bin.

  4. Run encrypt.bat with the input parameter and password parameter:

  • input parameter - this is the string you want to encrypt.

  • password parameter - this is the password that Jasypt is going to use to encrypt and decrypt the input parameter.

Your command should look like this:

        encrypt.bat input="admin" password=mypassword

Take note of the output. Example output: zrass64ls4LIx5hdFoXXyA==.

  5. Open your application.properties file and replace the password you want to encrypt with the output from step 4, surrounded by ENC(...): ENC(zrass64ls4LIx5hdFoXXyA==).

        Example in the application.properties file:

Before Jasypt:

password=admin

After Jasypt:

password=ENC(zrass64ls4LIx5hdFoXXyA==)

  6. Add an environment variable, or a Java system property in the Denodo Governance Bridge start script, with the name DENODO_EXPORT_ENCRYPTION_PASSWORD and your real encryption password as its value (mypassword in this example).
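After completing these steps, you may want to confirm that no clear-text password is left in the application.properties file. The following Python sketch is one way to check it. The function name plain_password_keys is hypothetical, and the sketch assumes that the relevant property names end in "password", as in the example above; adapt the test to the property names of your file.

```python
import re

# A value counts as encrypted when it is entirely wrapped in ENC(...).
ENC_VALUE = re.compile(r"ENC\(.+\)")


def plain_password_keys(properties_text: str) -> list[str]:
    """Return the password property names whose values are not wrapped in ENC(...)."""
    keys = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and lines without a key=value pair.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower().endswith("password") and value and not ENC_VALUE.fullmatch(value):
            keys.append(key)
    return keys


before = "password=admin"
after = "password=ENC(zrass64ls4LIx5hdFoXXyA==)"
print(plain_password_keys(before))
print(plain_password_keys(after))
```

The first call reports the password property and the second reports nothing, because its value already follows the ENC(...) convention.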