Denodo Governance Bridge - Collibra Protect Integration - User Manual

Overview

Protect is a Collibra service aimed at protecting data and granting varying levels of access to it depending on the groups to which users belong. Collibra allows data to be masked and/or filtered depending on the “data classification” and “data categories” applied to it, and implements this through Data Protection Standards and Data Access Rules.

Data Protection Standards define the rules for masking columns based on their data classification and data category, and these rules are applied to specific groups of users.

Data Access Rules define a group of users and a set of assets (data categories, data classifications, data sets) to which a series of filters and masks are applied. Data filtering operates on rows (tuples) based on specified conditions, while data masking is determined by data classification and data category.

The integration with Collibra Protect offered by the Denodo Governance Bridge is a stored procedure that creates Global Security Policies (GSPs) in VDP by translating, to the extent possible, the Data Access Rules and Data Protection Standards specified in the parameters of the stored procedure.

The execution of the stored procedure attempts to create Denodo GSPs and returns a row for each GSP it attempts to create. These rows contain a column with the VQL statement associated with the created GSP and another column with an error message if GSP creation failed.

Note that, in the case of Data Access Rules, a Global Security Policy is created for each masking and another GSP is created for all filters.


How it Works


The stored procedure maps each Data Protection Standard (DPS) from Collibra to a corresponding Global Security Policy (GSP) in Denodo. Similarly, it maps each Data Access Rule (DAR) from Collibra to one or more GSPs.
 

Both Data Protection Standards (DPS) and Data Access Rules (DAR) are applied to users and assets, and both are ultimately enforced by Global Security Policies in VDP. Furthermore, the masking and filter options are transformed into their native equivalents in VDP.

The databases targeted by the DPS and DAR must be synchronized with VDP. You can use the Governance Bridge for Collibra for this; see its documentation for details.

Users

The set of users affected by a Data Access Rule (DAR) or Data Protection Standard (DPS) is defined using a Group or several groups in Collibra. A Collibra Group corresponds to a user role in VDP.

A user having multiple roles can influence how a Global Security Policy (GSP) from the Collibra Protect integration controls the visibility or masking of a column. Denodo’s permission model is cumulative: a user's effective permissions are the union of all the privileges granted by their roles. This means the least restrictive permission always applies.

For example, imagine a user has two roles:

  • Role A has a Global Security Policy (GSP) applied that masks a specific column.
  • Role B has no GSP applied, allowing unrestricted access to that same column.

In this case, the user's effective permission is the union of both. Since Role B grants unrestricted access, the user will see the column unmasked.
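The cumulative rule above can be illustrated with a minimal sketch. The Role class and the column_is_masked function are hypothetical names used only for illustration; they are not part of Denodo or of the stored procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical stand-in for a VDP role; not a Denodo API."""
    name: str
    masked_columns: frozenset  # columns that a GSP masks for this role


def column_is_masked(user_roles, column):
    """A column stays masked only if every role of the user masks it.

    This models the 'least restrictive permission applies' rule: a single
    role without a masking policy is enough to expose the column.
    """
    return all(column in role.masked_columns for role in user_roles)


role_a = Role("role_a", frozenset({"email"}))  # a GSP masks 'email'
role_b = Role("role_b", frozenset())           # no GSP applied

print(column_is_masked([role_a], "email"))          # user holding only Role A
print(column_is_masked([role_a, role_b], "email"))  # user holding Role A and Role B
```

When reviewing imported policies, it can therefore be worth checking which other roles the affected users hold.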

Assets

Data Protection Standards (DPS) and Data Access Rules (DAR) are used to protect data assets. An asset can be either a data category or a specific data set. When a DPS or DAR is applied to an asset, it automatically affects all columns related to that asset. The link between columns and their assets is defined through Protect Prescriptive Paths.

Columns related to assets are marked in VDP with tags, which are then used in the 'Elements' field of Global Security Policies. These tags must first be imported into VDP using the Import tags from external catalog tool. Prior to this step, it is necessary to run the Init resource security tags attribute and Update resource security tags attribute workflows in Collibra. These workflows copy the tags into the DATA_CLASSIFICATION and DATA_DATASETS attributes. See the Usage section for detailed instructions.

Masking Options

Masking rules, whether they are part of a Data Access Rule (DAR) or a Data Protection Standard (DPS), have two inputs:

  1. Asset: This defines the target of the masking rule. The asset can be either a Data Category or a Data Classification. As explained in the previous section, these assets are mapped in Denodo as tags.
  2. Masking Type: This specifies the masking method to be applied. There are three available types:
  • Default Masking
  • Show Last
  • Hashing


These three masking types are translated into Global Security Policies (GSPs) in Denodo as follows:

  • Default Masking is converted to the Hide function in the GSP.
  • Show Last and Hashing are each converted into a Custom policy. These policies are generated with functions equivalent to the "Show Last" and "Hashing" methods, respectively.
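As a rough illustration of what the three masking types do to a value, consider the sketch below. It is conceptual only: the functions that the stored procedure actually generates inside the Custom policies are not shown in this manual, so the number of visible characters, the mask character, the SHA-256 algorithm and the use of None for a hidden value are all assumptions.

```python
import hashlib


def default_masking(value):
    """Default Masking maps to HIDE; modeled here (assumption) as None."""
    return None


def show_last(value, visible=4, mask_char="X"):
    """Keep the last `visible` characters and mask the rest (assumed behavior)."""
    text = str(value)
    keep = max(0, min(visible, len(text)))
    return mask_char * (len(text) - keep) + text[len(text) - keep:]


def hashing(value):
    """Replace the value with a SHA-256 hex digest (assumed algorithm)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


sample = "4111111111111111"
print(default_masking(sample))
print(show_last(sample))
print(hashing(sample))
```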

In Denodo, a Global Security Policy (GSP) will be created for each DPS and for each masking rule contained within a DAR.

These GSPs apply to the configured user roles and to the assets described in the 'Assets' section, and they mask the columns that carry the corresponding tags.

Filter Data

Within a Data Access Rule (DAR), the 'Filter Data' component includes a configurable Filter Action. The available options are:

  • Show Everything
  • Hide Everything
  • Show Some
  • Hide Some

For the 'Show Some' and 'Hide Some' options, you must also specify the particular conditions or elements to be shown or hidden. You must select a Data Classification, which, as previously explained, must also have been imported into Denodo as a tag. These tags determine the columns to which the filter condition will be applied. The Code Set has no equivalent in Denodo because a Code Set is used in Collibra to define a group of values, whereas the GSP condition only needs the specific value selected in the Code Value field.

The VQL statements for 'Show Everything' filters are not executed in Denodo. A Global Security Policy is designed exclusively to restrict access (e.g., via column masking or row filtering), so a 'Show Everything' filter, which is not a restriction, cannot be translated into a functional GSP. For these filters, the GSP name returned in the response carries the suffix "NO OP", and the returned VQL statement has its ENABLED parameter explicitly set to false.

Data Access Rules without masking or filtering are likewise incompatible with Denodo's GSP logic, which is only restrictive, and are treated as a 'Show Everything' policy. Consequently, while a VQL CREATE statement is generated for these rules, its ENABLED parameter is set to false, and the policy is not created in VDP.

For each 'Filter Data' component within a Data Access Rule (DAR), a Global Security Policy (GSP) is generated in Denodo.

Although the restriction's condition is evaluated against a specific column, the policy's action will either show or hide the entire row.
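The row-level effect of the four Filter Actions can be sketched as follows. This is a conceptual model, not the stored procedure's implementation: apart from SHOW_EVERYTHING, which appears in the Limitations section, the action identifiers are assumed, and rows are represented as plain dictionaries.

```python
def apply_filter_action(rows, action, column=None, value=None):
    """Return the rows that remain visible after applying a Filter Action.

    The condition is evaluated on a single column, but the decision to
    show or hide always applies to the entire row.
    """
    if action == "SHOW_EVERYTHING":
        return list(rows)  # no restriction, so nothing to enforce
    if action == "HIDE_EVERYTHING":
        return []
    if action == "SHOW_SOME":
        return [row for row in rows if row[column] == value]
    if action == "HIDE_SOME":
        return [row for row in rows if row[column] != value]
    raise ValueError(f"Unknown filter action: {action}")


# Hypothetical sample data.
books = [
    {"title": "Cosmos", "category": "Science"},
    {"title": "Dune", "category": "Fiction"},
]

print(apply_filter_action(books, "HIDE_SOME", "category", "Science"))
print(apply_filter_action(books, "SHOW_SOME", "category", "Science"))
```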

Installation

Importing the Stored Procedure

To run the stored procedure, you must first load the denodo_collibra_protect.jar extension file. In Denodo Design Studio, you can do this from the File > Extension Management > Import menu.

Adding the Stored Procedure

VQL Shell

You can add the stored procedure programmatically with a CREATE PROCEDURE statement:

CREATE [OR REPLACE] PROCEDURE <name:identifier>
        CLASSNAME = 'com.denodo.connect.collibra.protect.ImportCollibraProtectPoliciesStoredProcedure'
        JARS 'denodo-collibra-protect'
        [ FOLDER = <literal> ]
        [ DESCRIPTION = <literal> ];

Denodo Design Studio menu

You can also add the stored procedure by selecting File > New > Stored procedure > Java stored procedure.

You must input a name in the Name field and select the Select Jars checkbox in order to use the Jar file, denodo-collibra-protect, previously added.

The IMPORT_COLLIBRA_PROTECT_POLICIES stored procedure creates and executes the VQL statements required to generate Global Security Policies equivalent to the selected Collibra Standards and Rules. This stored procedure returns the name and the VQL statement used to create each Global Security Policy. Note that if an error occurs during the creation of a Global Security Policy, that policy will not be created. In such cases, the VDP error message will be returned in one of the columns of the stored procedure's output.

Stored Procedure Syntax

The stored procedure accepts the following input parameters:

  • input_user: username for authenticating with the Collibra instance. Mandatory.
  • input_password: password for authenticating with the Collibra instance. Mandatory.
  • input_url: URL of the Collibra instance. Mandatory.
  • input_proxy_user: user for the proxy.
  • input_proxy_password: password for the proxy.
  • input_proxy_host: proxy host.
  • input_proxy_port: proxy port.
  • input_proxy_enabled: determines whether the proxy is used.
  • input_policy_names_to_process: filters the Standards and Rules in Collibra Protect that will be imported into VDP. The asterisk (*) character can be used as a wildcard in your search terms, representing any sequence of characters (including no characters at all). For example:
      • A search for *log will return all rules that end with "log".
      • A search for rule* will return all rules that start with "rule".
      • A search for *rule* will return all rules that contain "rule".
  • input_protect_status_filter: optional parameter that specifies the statuses of the Data Protection Standards (DPS) and Data Access Rules (DAR) from Collibra that will be executed in Denodo. Multiple statuses can be provided as a comma-separated list. By default, if this parameter is not set, only active DPS and DAR will be executed.

It returns the following output fields:

  • global_security_policy_name: name of the Global Security Policy created.
  • global_security_policy_vql: VQL statement executed to create the Global Security Policy.
  • global_security_policy_error: error returned by VDP if it cannot create the Global Security Policy.
  • creation_executed: a boolean that indicates whether the VQL was executed in Denodo.
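The wildcard semantics described for input_policy_names_to_process can be reproduced with a short sketch, which may help when checking a pattern before running the procedure. The function names are hypothetical, and case-sensitive matching is an assumption, since the manual does not state how case is handled.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a '*' wildcard pattern into a compiled regular expression.

    Every '*' becomes '.*' (any sequence of characters, including none);
    all other characters are matched literally.
    """
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(parts), re.DOTALL)


def policy_name_matches(pattern, name):
    """Return True if the whole policy name matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(name) is not None


# Hypothetical policy names.
for pattern, name in [("*log", "audit_log"), ("rule*", "rule_1"), ("*rule*", "my_rule_x")]:
    print(pattern, name, policy_name_matches(pattern, name))
```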

Importing Collibra Workflows

Two workflows (apps) need to be imported into the Collibra instance that will be used: one to be executed before the synchronization stored procedure is executed for the first time (“Init resource security tags attribute”) and another one to be executed every time that a change to synchronizable resources affects their Data Classification or Data Category (“Update resource security tags attribute”).

Briefly, these workflows copy the names of each Data Classification and Data Category associated with each column in a synchronizable resource to one attribute. You can use an existing attribute or create a new one, for example, with the name DATA_CLASSIFICATION. They also copy the names of the Datasets that the column belongs to into another attribute. Again, you can use an existing attribute or create a new one, for example, with the name DATA_DATASETS. Denodo VDP can later retrieve these values from those attributes for the synchronizable resources.

For instructions on how to deploy workflows, you can refer to the official Collibra documentation. Alternatively, the next sections will guide you through the process.

Importing the “Init resource security tags attribute” Workflow

In the Collibra Workflow Designer, select Import App and then select the init resource security tags attribute-bar.zip file.

Before deploying the app you can review its configuration, which allows you to modify aspects such as the regular expressions used for transforming the data classification/category/dataset values, or the UUIDs used for the new attributes. It is highly recommended to keep the default configuration values unless you have been instructed to change them in order to avoid a specific issue.


Select Deploy.

Navigate to Settings > Workflows and select the recently imported workflow.

Set the Applies To field to Domain.

Uncheck Perform candidate user check on workflow start.

Edit the variable attributeTypeUUIDToUpdate.

First, you need to decide which Collibra attribute will store the data categories and data classification so they can be imported into Denodo later.

To create a new attribute, navigate to Settings > Operating Model > Attribute Types and click Add. If you create a new attribute, remember that you must assign it to the 'Column' asset type and then update the corresponding asset page layout.

Finally, you must enter the Resource ID of the chosen attribute type (whether it's a new or an existing one) in the attributeTypeUUIDToUpdate value field.

Edit the variable attributeTypeUUIDToUpdate2.

First, you need to decide which Collibra attribute will store the data sets of every column, so they can be imported into Denodo later.

To create a new attribute, navigate to Settings > Operating Model > Attribute Types and click Add. If you create a new attribute, remember that you must assign it to the 'Column' asset type and then update the corresponding asset page layout.

Finally, you must enter the Resource ID of the chosen attribute type (whether it's a new or an existing one) in the attributeTypeUUIDToUpdate2 value field.

Enable the workflow.


Executing the “Init resource security tags attribute” Workflow

The “Init resource security tags attribute” workflow needs to be executed before executing the stored procedure for the first time in VDP. This allows VDP to obtain the data categories and data classifications applied to Collibra entities as VDP tags, and later create Global Security Policies that match these tags.

To do this, navigate to the domain that you want to synchronize with VDP, and apply the action called Init resource security tags attribute.

Note that you must wait for the workflow to finish before importing tags into VDP using the procedure described below.

Importing the “Update resource security tags attribute” Workflow

As with the “Init resource security tags attribute” workflow, in the Collibra Workflow Designer select Import App and then select the update resource security tags attribute-bar.zip file.

Before deploying the app you can review its configuration, which allows you to modify aspects such as the regular expressions used for transforming the data classification/category/dataset values, or the UUIDs used for the new attributes. It is highly recommended to keep the default configuration values unless you have been instructed to change them in order to avoid a specific issue.

Select Deploy. Navigate to Settings > Workflows, select the recently imported workflow and uncheck Perform candidate user check on workflow start.

Add the following events in Start events:

  • Data Classification Added
  • Data Classification Rejected
  • Data Classification Removed
  • Data Classification Updated
  • Data Classification Accepted
  • Relation was added on an asset as head
  • Relation was added on an asset as tail
  • Relation was removed from an asset as head
  • Relation was removed from an asset as tail

Edit the variables attributeTypeUUIDToUpdate and attributeTypeUUIDToUpdate2. Enter the same values that you used in the “Init resource security tags attribute” workflow.

Enable the workflow.


When enabled, this workflow responds to changes in the classification, categories, and data sets of the columns.

Usage

Importing Tags into VDP

In order to be usable by Global Security Policies at VDP, Collibra’s data classifications, data categories, and data sets need to be imported into VDP as tags.

Tags in Virtual DataPort are labels that, in this case, you assign to columns. These tags determine which elements are affected by the Global Security Policies created as a result of synchronizing Standards and Rules from Collibra. Filters and masks are applied to these elements.

In Design Studio, go to Administration > Semantic and governance > Import tags from external catalog. Choose Collibra in the Catalog parameter, and then enter values for URL, User and Password.


In the EXTERNAL CATALOG tab, Collibra should be selected as the Catalog value.


Select Include attributes and then check the boxes for the attributes you are using to store the data classifications/categories and the datasets. From now on, we will refer to these attributes as DATA_CLASSIFICATION and DATA_DATASETS, respectively.

In the DATA_CLASSIFICATION attribute, the Data Classification and Data Categories applied in Collibra are stored (see workflows explained above). If you don't need any synchronization related to datasets, you can leave the DATA_DATASETS attribute unchecked.

Click on "Execute" and a dialog will be displayed showing a preview.


After execution, the tags should be retrieved correctly. Clicking on “Accept changes” will apply the new tags to Denodo view columns.

Executing the Synchronization


The input_policy_names_to_process parameter specifies the Collibra Standards and Rules that will be transformed into VDP Global Security Policies.

The following Collibra connection parameters are required: input_user, input_password and input_url. If necessary, you can also configure a proxy for accessing Collibra with input_proxy_user, input_proxy_password, input_proxy_host and input_proxy_port, and enable it with input_proxy_enabled.

The stored procedure's behavior depends on the status of the Data Access Rules or Data Protection Standards in Collibra Protect. The default behavior is:

  • If the status is Active: The corresponding Global Security Policy is automatically created and enabled in VDP.
  • If it is in any other status: The Global Security Policy is not created in VDP. The stored procedure's response still contains the VQL CREATE statement for the GSP, but with its ENABLED parameter explicitly set to false.

However, if input_protect_status_filter is set, the behavior is as follows:

  • Rules with matching statuses: All DPS and DAR whose status is included in the parameter are automatically created and enabled in VDP.
  • Rules with non-matching statuses: The VQL CREATE statements for these rules will be returned in the response, but they will not be created in VDP. The VQL will contain ENABLED = false.

If an error occurs while creating a GSP, the error message is included in the response, and the stored procedure continues with the remaining GSPs.
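The status rules above can be summarized in a small decision sketch. The function names are hypothetical, and the status literals ('Active', 'Draft') and the case-insensitive comparison are assumptions made for illustration; the stored procedure's actual matching logic is not documented here.

```python
DEFAULT_STATUS_FILTER = "Active"  # assumed literal for the default 'active' status


def parse_status_filter(raw):
    """Split a comma-separated status list into a normalized set.

    When the parameter is not set, only the default status is accepted.
    """
    if raw is None or not raw.strip():
        raw = DEFAULT_STATUS_FILTER
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def plan_policy(status, status_filter=None):
    """Decide whether a GSP would be created and enabled for a given status.

    Non-matching statuses still produce a VQL statement in the response,
    but it is not executed and carries ENABLED = false.
    """
    matches = status.strip().lower() in parse_status_filter(status_filter)
    return {"creation_executed": matches, "enabled": matches}


print(plan_policy("Active"))                  # default filter
print(plan_policy("Draft"))                   # default filter, other status
print(plan_policy("Draft", "Active, Draft"))  # explicit filter
```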

Synchronization Example

SELECT * FROM import_collibra_protect_policies()
WHERE input_user = 'xxx' AND input_password = 'xxx'
  AND input_url = 'https://xxxx.collibra.com'
  AND input_proxy_user = NULL AND input_proxy_password = NULL
  AND input_proxy_host = NULL AND input_proxy_port = NULL
  AND input_proxy_enabled = 0
  AND input_policy_names_to_process = '*tb_bk*'
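Once a call like this returns, its rows can be sorted into created, skipped and failed policies using the output fields listed in the Stored Procedure Syntax section. The sketch below assumes the rows have already been fetched into dictionaries keyed by those field names; the triage_results function and the sample rows are hypothetical.

```python
def triage_results(rows):
    """Group result rows by outcome using the documented output fields."""
    created, skipped, failed = [], [], []
    for row in rows:
        name = row["global_security_policy_name"]
        error = row.get("global_security_policy_error")
        if error:
            failed.append((name, error))  # VDP rejected the VQL
        elif row.get("creation_executed"):
            created.append(name)          # GSP created in VDP
        else:
            skipped.append(name)          # VQL returned but not executed
    return created, skipped, failed


# Hypothetical rows, shaped like the stored procedure's output.
sample_rows = [
    {"global_security_policy_name": "dar_tb_bk_MASKING",
     "global_security_policy_error": None, "creation_executed": True},
    {"global_security_policy_name": "dar_tb_bk_FILTER NO OP",
     "global_security_policy_error": None, "creation_executed": False},
    {"global_security_policy_name": "dps_tb_bk",
     "global_security_policy_error": "example error text", "creation_executed": True},
]

created, skipped, failed = triage_results(sample_rows)
print(created, skipped, failed)
```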

Examples

Data Protection Standard Example

The following is an example of importing a Data Protection Standard from Collibra into VDP.

Below is an example of the Data Protection Standard in Collibra that will be imported into VDP.

After importing the tags into VDP, you can execute the stored procedure from the VQL shell to generate the VQL and import the Data Protection Standard into VDP.

When a Data Protection Standard is imported by itself, the stored procedure always returns a single row of output because it creates only one Global Security Policy for each DPS.

The returned row includes the following fields:

  • The name of the generated Global Security Policy
  • The VQL of the Global Security Policy
  • The status of the Data Protection Standard
  • An error message (if the execution of the VQL fails)

CREATE OR REPLACE GLOBAL_SECURITY_POLICY "data_ protection_standard_ default_ masking"
    DESCRIPTION = 'protect Contact information with Default masking'
    ENABLED = TRUE
    AUDIENCE ( ANY ROLES ( allusers) )
    ELEMENTS ( COLUMNS TAGGED ANY("Contact information") )
    RESTRICTION ( FILTER = '' MASKING ANY ("Contact information") WITH (HIDE) (numbers WITH HIDE, datetimes WITH HIDE, texts WITH HIDE) )

If a Data Protection Standard has an 'Active' status, a Global Security Policy is created in Denodo, as shown in the image below.

Data Access Rule Example

The following is an example of importing a Data Access Rule from Collibra into Denodo.

Below is an example of a Data Access Rule in Collibra that will be imported into VDP.

After importing the tags into Denodo, you can execute the stored procedure from the VQL shell to generate the VQL and import the Data Access Rule into Denodo.

Importing a Data Access Rule generates a Global Security Policy for each filter or data mask, and the stored procedure returns one row per policy. Each row includes the following fields:

  • The name of the generated Global Security Policy. The name is generated based on its origin: It will contain the word 'Masking' if it comes from the Mask Data component and it will contain the word 'Filter' if it comes from the Filter Data component.
  • The VQL of the Global Security Policy
  • The status of the Data Access Rule
  • An error message (if the execution of the VQL fails)

In this example, two Global Security Policies have been created. The first comes from the Mask Data component: "data_Access_Rule_Example_MASKING". The generated VQL is shown below:

CREATE OR REPLACE GLOBAL_SECURITY_POLICY "data_Access_Rule_Example_MASKING"
    DESCRIPTION = ''
    ENABLED = TRUE
    AUDIENCE ( ANY ROLES ( testrole) )
    ELEMENTS ( COLUMNS TAGGED ANY("Bookstore > books") )
    RESTRICTION ( FILTER = '' MASKING ANY ("Book Category (Migrated)") WITH (HIDE) (numbers WITH HIDE, datetimes WITH HIDE, texts WITH HIDE) )

The image below shows the Global Security Policy generated in Denodo.

The second comes from the Filter Data component: "data_Access_Rule_Example_1_FILTER". The generated VQL is shown below:

CREATE OR REPLACE GLOBAL_SECURITY_POLICY "data_Access_Rule_Example_1_FILTER"
    DESCRIPTION = ''
    ENABLED = TRUE
    AUDIENCE ( ANY ROLES ( testrole) )
    ELEMENTS ( COLUMNS TAGGED ANY("Bookstore > books") )
    RESTRICTION ( FILTER = '"Book Category (Migrated)" <> ''Science''' REJECT )

The image below shows the Global Security Policy generated in Denodo.
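The FILTER value in the statement above nests a quoted condition inside a VQL string literal, which is why the single quotes around Science appear doubled. The following sketch reproduces that quoting pattern as observed in the example. The helper names are hypothetical, and this is not necessarily how the stored procedure builds its statements.

```python
def vql_string_literal(text):
    """Wrap text in single quotes, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def quoted_identifier(name):
    """Wrap a tag or column name in double quotes.

    Names containing a double quote are rejected because this sketch makes
    no assumption about how such names would have to be escaped.
    """
    if '"' in name:
        raise ValueError("double quotes in names are not handled by this sketch")
    return '"' + name + '"'


def build_filter_literal(tag, operator, value):
    """Build the FILTER literal: the condition is itself a string literal."""
    condition = f"{quoted_identifier(tag)} {operator} {vql_string_literal(value)}"
    return vql_string_literal(condition)


print(build_filter_literal("Book Category (Migrated)", "<>", "Science"))
```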

Limitations


1. Policies synchronized from Collibra to VDP work at the column level; the table level is not supported. The VDP Global Security Policies created by this stored procedure match tags at the column level.

2. If a 'Business process' asset type is included among the assets defined in Collibra's Rules, it will be ignored (this type is not supported by the synchronization).

3. Denodo's Global Security Policies (GSPs) are designed exclusively to restrict access (e.g., by masking columns or filtering rows). Permissive policies from Collibra, such as 'SHOW_EVERYTHING' filters or Data Access Rules that contain no masking or filter conditions, cannot be translated into a functional GSP. In this scenario, a CREATE VQL statement is returned in the response, but it is not executed and has no effect on Denodo.