CREATE_TAGS_FROM_VIEW

Description

The stored procedure CREATE_TAGS_FROM_VIEW allows to load tags assignments reading the information from a Virtual DataPort view. That view must follow a predefined schema and it must return data following a predefined structure.

Note that the specified view could be of any type, so, for example, you can have your tags defined on a CSV file, or you can access to an external system using a JSON data source through a REST API.

Note

This feature is only available with the subscription bundle Enterprise Plus. To find out the bundle you have, open the About dialog of the Design Studio or the Administration Tool. See more about this in the section Denodo Platform - Subscription Bundles.

Syntax

CREATE_TAGS_FROM_VIEW (
      input_database_view_name : text
      , input_view_name : text
      , input_action : text
      , input_unassign_tags_mode : text
      , input_unused_external_tags_mode : text
      , input_incremental : boolean
)
  • input_database_view_name: name of the database that contains the view which will be executed by this procedure.

  • input_view_name: name of the view which will be executed by this procedure.

  • input_action: indicates what the procedure have to do with the information read from the view. There are two possible values:

    • READ: the procedure executes the view and returns the retrieved information (default).

    • CREATE: the procedure executes the view, returns the retrieved information and stores that information at Virtual DataPort.

  • input_unassign_tags_mode: indicates what Virtual DataPort will do with the pre-existent tags on the views and columns of the databases found at executed view. There are two possible values:

    • ALL: all pre-existent tags will be unassigned from the views or columns. This option is intended to be used when tags assignment is only done by this procedure.

    • ONLY_EXTERNAL: only pre-existent tags imported from an external system, such the imported with this procedure, will be unassigned from the views or columns. This option is useful for allow to define tags manually on the views or columns, but also import the information using the procedure. (default)

  • input_unused_external_tags_mode: indicates what Virtual DataPort will do with the imported tags from the procedure which are no longer used by views or columns. There are next possible values:

    • RETAIN: tags are not removed from Virtual DataPort. (default)

    • DROP_UNASSIGNED: tags no longer used by views or columns are removed from Virtual DataPort. Note that it could exist tags used by other elements, such Global Security Policies. In that case, the tags will not be removed with this option.

    • DROP_UNASSIGNED_CASCADE: tags no longer used by views or columns are removed from Virtual DataPort, and other elements, such Global Security Policies, will be also removed in case they are using some removed tag.

  • input_incremental: indicates if executed view will return data incrementally, with only the new changes since last loading. Note that, if there are changes on a view or its columns, the full set of tags from that view must be present on the data.

The output schema has the following fields:

  • element_type: the type of the element returned at current tuple. Posible values are database, tag, view and column.

  • database_name: the name of the database, in case the current element is a database, or the database which contains the current element in other case.

  • view_name: the view name, in case the current element is a view or a column.

  • column_name: the column name, in case the current element is a column.

  • tag: the tag name, in case the current element is a tag.

  • tag_description: the tag description, in case the current element is a tag.

  • assignment_status: indicates if the tag was correctly assigned or if not, in case the current tuple represents a tag assignment.

Privileges Required

Only administrators can invoke this procedure.

View Definition

The procedure expects that the input view has a schema with at least certain column names. We recommend to define an interface following the correct schema, and then assign the desired view as implementation.

The expected schema is:

CREATE OR REPLACE INTERFACE VIEW i_tags_definition (
        element_type:text,
        database_name:text,
        view_name:text,
        column_name:text,
        tag:text,
        tag_description:text
    )

The procedure executes next query for extract the data from the view:

SELECT element_type, database_name, view_name, column_name, tag, tag_description
         FROM <input_database_name>.<input_view_name>

Data Structure

The view must return data following a structure that Virtual DataPort expects for understanding what changes must be done.

Returned data must have next sections:

  • Databases: affected databases are returned in a single row per database. That is, a row with element_type value database and the name on column database_name. Note that, When input_action has CREATE as value, the views from these databases can lose their tags in favor of the returned ones by the executed view, depending on the procedure parameters. Row 1 from below table.

    • When input_unassign_tags_mode has value ALL, the views will lose all their tags. If the value is ONLY_EXTERNAL, then the lost tags would be the imported from an external system, like this procedure.

    • When input_incremental has true value, the views only lose their tags if there is an empty tag assignment for the view at returned data. The unassigned tags depends on the value of input_unassign_tags_mode parameter.

  • Tag descriptions: if you want to add a description to a tag, this information is returned on a row per tag. The element_type value is tag, the name must be at column tag and the description at tag_description. If the tag does not have a description, then it is not required to include a tuple for the tag with an empty value. Row 2 from below table.

  • Assignments: tag assignment to views or columns.

    • Views: tags assigned to a view must be returned on a row per assignment. The element_type value is view, the database_name is the database where the view is and the view_name the name of the view. Finally, the tag column contains the name of the tag. Row 3 from below table.

    • Columns: tags assigned to a column must be returned on a row per assignment. The element_type value is column, the database_name is the database where the view is, the view_name the name of the view and the column_name the name of the column. Finally, the tag column contains the name of the tag. Row 4 from below table.

    • Incremental: Notes about data structure when an incremental loading is executed.

      • Only new assignments should appear.

      • When there are changes on the assignment of a view or its columns, the whole tagging information of the view and its columns must appear on the data.

      • If a view no longer have tags, a tuple referring to the view with no tags must appear in order to remove all the tags from the view and its columns. This means a tuple with element_type as view, the database_name and view_name identifying the view and anything else. Row 5 from below table.

Data sections

element_type

database_name

view_name

column_name

tag

tag_description

1

database

<database_name>

2

tag

<tag_name>

<tag_description>

3

view

<database_name>

<view_name>

<tag_name>

4

column

<database_name>

<view_name>

<column_name>

<tag_name>

5

view

<database_name>

<view_name>

Example

The following table shows an example of how data must be returned by the view. Note that the order is not relevant.

  • Rows 1 to 2 are the databases affected by the tagging.

  • Rows 3 to 5 are descriptions for some of the used tags.

  • Rows 6 to 8 are tags assignments to views. For example, row 6 is the assignment of the tag hr_data to the view hr.employees.

  • Rows 9 to the end are tags assignments to columns. For example, row 10 is the assignment of the tag pii to the column first_name from view hr.employees.

element_type

database_name

view_name

column_name

tag

tag_description

1

database

hr

2

database

sales

3

tag

pii

Personally identifiable information

4

tag

hidden

Data that must be hidden

5

tag

zip

Postal code

6

view

hr

employees

hr_data

7

view

sales

customer_payments

customers_data

8

view

sales

customer_payments

payments_data

9

column

hr

employees

first_name

pii

10

column

hr

employees

last_name

pii

11

column

hr

employees

email

pii

12

column

hr

employees

picture

hidden

13

column

hr

employees

password

hidden

14

column

sales

customer_payments

first_name

pii

15

column

sales

customer_payments

last_name

pii

16

column

sales

customer_payments

email

pii

17

column

sales

customer_payments

postal_code

zip

18

column

sales

customer_payments

amount

payment

We can put that data on a CSV file and use a Delimited File data source for creating a Virtual DataPort view. This view, can be used as implementation of the interface that follows the expected schema by the procedure. For example, the i_tags_definition shown before.

Next image shows the interface i_tags_definition with its implementation, a Delimited File base view over previous data. .. note:: Note that the implementation view can be of any type and any complexity.

Interface i_tags_definition

Definition of the interface i_tags_definition

Now, we can execute the procedure CREATE_TAGS_FROM_VIEW for reading the tags definition using the interface i_tags_definition.

SELECT element_type, database_name, view_name, column_name, tag, tag_description
   FROM CREATE_TAGS_FROM_VIEW()
      WHERE input_database_view_name = 'admin'
         AND input_view_name = 'i_tags_definition'
         AND input_action = 'READ';

Note that input_action value is READ, so the procedure will only retrieve data.

Returned data by ``CREATE_TAGS_FROM_VIEW`` procedure accessing to the interface ``i_tags_definition``

Returned data by CREATE_TAGS_FROM_VIEW procedure accessing to the interface i_tags_definition

When we check that returned data are correct, we can persist the tags assignments invoking the procedure with the input parameter input_action with value CREATE.

SELECT element_type, database_name, view_name, column_name, tag, tag_description
   FROM CREATE_TAGS_FROM_VIEW()
      WHERE input_database_view_name = 'admin'
         AND input_view_name = 'i_tags_definition'
         AND input_action = 'CREATE';

And now, the affected views or columns have the tags assignments.

View ``customer_payments`` after executing procedure ``CREATE_TAGS_FROM_VIEW``

View customer_payments after executing procedure CREATE_TAGS_FROM_VIEW with CREATE option.

Now, we are going to execute an incremental loading. We want to:

  • remove tags from view customer_payments and its columns. Data must have a tuple referencing the view without tags assignment.

  • add a new tag with description to the view store_managers. Data must have a tuple with the tag description and a tuple with the assignment to the view.

  • add a new tag without description to the column username of the view store_managers. Data must have a tuple with the assignment to the column.

  • finally, add assignments to the columns from store_managers of the tags pii and hidden. Data must have a the tuples with that assignments.

The new CSV file must have next data.

element_type

database_name

view_name

column_name

tag

tag_description

1

database

hr

2

database

sales

3

tag

store_data

Data about the stores

4

view

sales

customer_payments

5

view

hr

store_managers

store_data

6

column

hr

store_managers

first_name

pii

7

column

hr

store_managers

last_name

pii

8

column

hr

store_managers

email

pii

9

column

hr

store_managers

picture

hidden

10

column

hr

store_managers

password

hidden

Once our Delimited File data source points to the new file, we have to execute the procedure with the input_incremental = true.

SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
   FROM CREATE_TAGS_FROM_VIEW()
      WHERE input_database_view_name = 'admin'
         AND input_view_name = 'i_tags_definition'
         AND input_action = 'CREATE'
         AND input_incremental = true;

After that, the view customer_payments does not have tags, the employees view has the same tags than before and the store_managers view have the new tag assignment.

View ``customer_payments`` after executing procedure ``CREATE_TAGS_FROM_VIEW`` with ``input_incremental`` = true

View customer_payments after executing procedure CREATE_TAGS_FROM_VIEW with input_incremental = true.

Finally, we are going to execute another incremental loading. At previous example, we forgot to add next tags to the view store_managers:

  • a tag id to the staff_id and store_id columns.

  • a tag staff_data to the view.

So, as the modified view is store_managers, we have to add only the tagging information for that view and its columns.

The new CSV file must have next data.

element_type

database_name

view_name

column_name

tag

tag_description

1

database

hr

2

database

sales

3

tag

id

Internal identifier

4

view

hr

store_managers

store_data

5

view

hr

store_managers

staff_data

6

column

hr

store_managers

first_name

pii

7

column

hr

store_managers

last_name

pii

8

column

hr

store_managers

email

pii

9

column

hr

store_managers

picture

hidden

10

column

hr

store_managers

password

hidden

11

column

hr

store_managers

staff_id

id

12

column

hr

store_managers

store_id

id

Remember that the whole tagging information for the view must be included. The previous tags and the new ones.

Once our Delimited File data source points to the new file, we have to execute the procedure with the input_incremental = true.

SELECT element_type, database_name, view_name, column_name, tag, tag_description, assignment_status
   FROM CREATE_TAGS_FROM_VIEW()
      WHERE input_database_view_name = 'admin'
         AND input_view_name = 'i_tags_definition'
         AND input_action = 'CREATE'
         AND input_incremental = true;

After that, the view store_managers has the new tag store_data and the columns staff_id and store_id have the tag id. The other views do not have changes.

View ``store_managers`` after executing procedure ``CREATE_TAGS_FROM_VIEW`` with ``input_incremental`` = true

View store_managers after executing procedure CREATE_TAGS_FROM_VIEW with input_incremental = true.