CREATE_TAGS_FROM_COLLIBRA¶
Description
The stored procedure CREATE_TAGS_FROM_COLLIBRA
connects with a Collibra instance for extracting the defined information about
attributes and tags assignments to views and columns.
This procedure can be used for reading the Collibra data, and also it can be used to create and assign the attributes and tags at Virtual DataPort.
Note
This feature is only available with the subscription bundle Enterprise Plus. To find out the bundle you have, open the About dialog of the Design Studio or the Administration Tool. See more about this in the section Denodo Platform - Subscription Bundles.
Syntax
CREATE_TAGS_FROM_COLLIBRA (
input_scope : text
, input_action : text
, input_database : text
, input_include_tags : boolean
, input_tag_names : array
, input_include_attributes : boolean
, input_attribute_names : array
, input_search_since_last_execution_time : boolean
, input_search_since : timestamp
, input_unassign_tags_mode : text
, input_unused_collibra_tags_mode : text
, input_attribute_name_as_prefix : boolean
)
input_scope
: name of the Collibra scope where Virtual DataPort will search for metadata. Communities and domains accessible with the scope will be accessed (mandatory).input_action
: indicates what the procedure have to do. There are two possible values:READ
: the procedure accesses to Collibra and returns the retrieved information (default).CREATE
: the procedure accesses to Collibra, returns the retrieved information and stores that information at Virtual DataPort.
input_database
: name of the Virtual DataPort database where the tagging information will be imported. This parameter must be used only when you want to override the database retrieved from Collibra.input_include_tags
: if Collibra tags are retrieved. (default value:false
)input_tag_names
: names of the Collibra tags which we want to retrieve. If not defined, then any tag found on the accessible elements will be retrieved.input_include_attributes
: if Collibra attributes are retrieved. (default value:false
)input_attribute_names
: names of the Collibra attributes which we want to retrieve, such asSecurity Classification
orPersonally Identifiable Information
. Mandatory wheninput_include_attributes=true
.input_search_since_last_execution_time
: indicates if Virtual DataPort must execute an incremental search since latest execution or since the instant indicated byinput_search_since
. This option is ignored in case ofinput_include_tags=true
.input_search_since
: instant since the search of Collibra metadata must be done. In case ofinput_search_since_last_execution_time=true
and no value for this parameter, the search will use the latest time the procedure was executed (recommended method).input_unassign_tags_mode
: indicates what Virtual DataPort will do with the pre-existent tags on the views and columns of the databases found at Collibra metadata. There are two possible values:ALL
: all pre-existent tags will be unassigned from the views or columns. This option is intended to be used when tags assignment is only done at Collibra.ONLY_COLLIBRA
: pre-existent tags imported from an external system, such the imported with this procedure, will be unassigned from the views or columns. This option is useful for allow to define tags manually on the views or columns, but also import the information from Collibra. (default)
input_unused_collibra_tags_mode
: indicates what Virtual DataPort will do with the imported tags from Collibra which are no longer used by views or columns. There are next possible values:RETAIN
: tags are not removed from Virtual DataPort. (default)DROP_UNASSIGNED
: tags no longer used by views or columns are removed from Virtual DataPort. Note that it could exist tags used by other elements, such as Global Security Policies. In that case, the tags will not be removed with this option.DROP_UNASSIGNED_CASCADE
: tags no longer used by views or columns are removed from Virtual DataPort, and other elements, such Global Security Policies, will be also removed in case they are using some removed tag.
input_attribute_name_as_prefix
: indicates if the tags created at Virtual DataPort from Collibra attributes will have the attribute name as prefix. It is useful for identifying the origin of the tags created at Virtual DataPort.For example, a column has
Confidential
value for attributeSecurity Classification
at Collibra. This procedure will create a tag on Virtual DataPort named asSecurity Classification > Confidential
if this parameter istrue
, andConfidential
in case offalse
. (default valuetrue
)
The output schema has the following fields:
execution_time
: the instant when the procedure accesses to the Collibra instance. It could be used for incremental search using also theinput_search_since
parameter.element_type
: the type of the element returned at current tuple. For exampledatabase
,tag
orview
.database_name
: the name of the database, in case the current element is adatabase
, or the database which contains the current element in other case.view_name
: the view name, in case the current element is a view or a column.column_name
: the column name, in case the current element is a column.tag
: the tag name, in case the current element is a tag.tag_description
: the tag description, in case the current element is a tag.assignment_status
: indicates if the tag was correctly assigned or if not, in case the current tuple represents a tag assignment.
Privileges Required
Only administrators can invoke this procedure.
Configuration
Currently, it does not exist a dialog for configuring the access to Collibra. It has to be configured using properties.
Next configuration properties are available:
com.denodo.vdb.contrib.ext.datagov.collibra.connection.url
: the connection URL.com.denodo.vdb.contrib.ext.datagov.collibra.connection.user
: the user name to authenticate.com.denodo.vdb.contrib.ext.datagov.collibra.connection.password.secret
: the password from the user to authenticate. It receives a clear value or an encrypted one. In this case, the value must be prefixed withencrypted:
.For example:
-- Clear SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.password.secret' = 'my_secret_password'; -- Encrypted SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.password.secret' = 'encrypted:<encrypted_value>';
To encrypt the password, execute the statement
ENCRYPT_PASSWORD
. For example:ENCRYPT_PASSWORD 'my_secret_password';
com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyEnabled
: use a proxy to access to Collibra. Possible values areON
,OFF
,AUTOMATIC
orDEFAULT
(use the Default Configuration of HTTP Proxy)com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyHost
: the host name to use as proxy.com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyPort
: the port number to connect to the host used as proxy.com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyUser
: the user name to connect to the proxy.com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyPassword.secret
: the password to connect to the proxy. The value follows the same format than the user password. A clear value or an encrypted one prefixed withencrypted:
.com.denodo.vdb.contrib.ext.datagov.collibra.connection.proxyPacUri
: the PAC URI to use whenAUTOMATIC
status is set.com.denodo.vdb.contrib.ext.datagov.collibra.connection.attributes
: a list of Collibra attribute names which are going to be accessed in case you do not want to specify them when invoking the procedure. The value must be a valid JSON string list. Note that this value is only used when parameterinput_attribute_names
is not specified.For example:
SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.attributes'='["Security Classification", "Personally Identifiable Information"]';
Examples
First of all, we need to configure the access to Collibra. Let us say that the credentials are:
URL
: https://acme.collibra.comUser
: cl-userPassword
: cl-password
We have to set the configuration properties. For securing the password value, we encrypt it.
ENCRYPT_PASSWORD 'cl-password';
Copy returned encrypted value, let us say it was something like: uq2Ldx0oXTIscUU/KmiVtZfFaGs
.
Now, execute these commands:
SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.url'='https://acme.collibra.com';
SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.user'='cl-user';
SET 'com.denodo.vdb.contrib.ext.datagov.collibra.connection.password.secret'='encrypted:uq2Ldx0oXTIscUU/KmiVtZfFaGs';
The following examples assume that you have loaded your Virtual DataPort database to your Collibra instance. This can be done, for example, using the Denodo to Collibra Integration.
For the examples, we have uploaded some metadata to Collibra that contains two views named customers
and employees
.
At Collibra, we have added the following tags and attributes:
customers
view:Added tags
customers_data
andsensitive
.Columns:
phone
: tagsensitive_value
andRestricted
as value forSecurity Classification
.first_name
: marked asPersonally Identifiable Information
.last_name
: marked asPersonally Identifiable Information
.
employees
view:Added tags
sensitive
andemployees_data
.Columns:
first_name
: marked asPersonally Identifiable Information
.last_name
: marked asPersonally Identifiable Information
.password
: tagpassword
.picture
: marked asPersonally Identifiable Information
andHighly Confidential
as value forSecurity Classification
.salary
: valueRestricted
forSecurity Classification
.
Also, let us say that Virtual DataPort metadata was uploaded to a Collibra domain accessible by the scope Denodo to Collibra Integration.
Example 1
Let us say we want to use this procedure to see what information about tags and attributes are at Collibra. Also, we are
interested on tags and values for attributes Security Classification
and Personally Identifiable Information
.
Execute this for obtain the data.
SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
FROM CREATE_TAGS_FROM_COLLIBRA()
WHERE input_scope = 'Denodo to Collibra Integration'
AND input_attribute_names = {row('Security Classification'),
row('Personally Identifiable Information')}
AND input_include_tags = true
AND input_include_attributes = true
AND input_action = 'READ'
AND input_attribute_name_as_prefix = true;
The following image is a screenshot of the result of executing previous query.
Result meaning
Depending on the element_type
column:
database: databases present at Collibra instance. In a
CREATE
execution they will be affected by the changes.tag: when a description is retrieved for a tag is retrieved at Collibra, this tuple contains that value.
view: assignment to a view. It will contain a tag name on column
tag
. If no tag name is found, then this means that the view does no longer have tags or attributes on Collibra. This second scenario can happen when an incremental search is executed.column: assignment to a column. It will contain a tag name on column
tag
.
Example 2
At this example, we want to apply the Collibra tags and attributes on Virtual DataPort metadata. For that, we have to execute previous query
with CREATE
as input_action
.
SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
FROM CREATE_TAGS_FROM_COLLIBRA()
WHERE input_scope = 'Denodo to Collibra Integration'
AND input_attribute_names = {row('Security Classification'),
row('Personally Identifiable Information')}
AND input_include_tags = true
AND input_include_attributes = true
AND input_action = 'CREATE'
AND input_attribute_name_as_prefix = true;
Now, execute a Refresh
Design Studio or the Administration Tool and if we can see the tags created and assigned on Virtual DataPort.
And one of the views.
Example 3
At this example, we only want the information about Collibra attributes, but not the Collibra tags. We have to use the
parameter input_include_tags
with false
value.
SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
FROM CREATE_TAGS_FROM_COLLIBRA()
WHERE input_scope = 'Denodo to Collibra Integration'
AND input_attribute_names = {row('Security Classification'),
row('Personally Identifiable Information')}
AND input_include_tags = false
AND input_include_attributes = true
AND input_action = 'CREATE'
AND input_attribute_name_as_prefix = true
If we Refresh
the client tool, we can see opening a view that only the Collibra attributes are assigned.
For example, employees
view does no longer have the tags employees_data, sensitive and password.
We are not accessing to Collibra tags, so we can test the input_search_since_last_execution_time
parameter. If we
execute now with specifying true
to that parameter, we can see that no results are returned, as there are not changes on Collibra since
last import process.
At Collibra, we remove the Personally Identifiable Information
from column first_name
of the view employees
.
If we execute now the procedure incrementally, only the tagging information for view employees
is retrieved,
as it is the unique modified element since last import.
SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
FROM CREATE_TAGS_FROM_COLLIBRA()
WHERE input_scope = 'Denodo to Collibra Integration'
AND input_attribute_names = {row('Security Classification'),
row('Personally Identifiable Information')}
AND input_include_tags = false
AND input_include_attributes = true
AND input_action = 'CREATE'
AND input_attribute_name_as_prefix = true
AND input_search_since_last_execution_time = true;
The following image is a screenshot of the result of executing previous query.
Note that first_name
does no longer have tags and no changes are applied to other views.