DATA CATALOG ADMINISTRATION

The Denodo Data Catalog is a web based self service tool included in Denodo Platform that lets both technical and business users query, search and browse information and metadata stored in a Virtual DataPort server. With this tool, users can generate new knowledge and pave the way to make better decisions.

It is strongly advised to complete the Data Services tutorial before starting this one, as we are going to be using the resources (data source, base views and integrations) created on it.

If you have already completed it, please proceed to the next section. Otherwise, follow the subsequent instructions:

  1. Launch the resources needed (check how in the Installation & Bootstrapping tutorial).
  2. Log in Denodo Design Studio (user/password: admin/admin)
  3. Import this VQL to Denodo by clicking into File > Import. Drag ‘n' drop the file, select Use custom password for sensitive data decryption, and enter denodo in the Password field.

If all steps have been executed correctly, you should observe the following in the Design Studio's elements tree:

Great! Now it's time to start the tutorial!

  1. For starting this web tool in a local installation, you have to open the Denodo Platform Control Center, and start the Data Catalog. Once it changes the status to "Running", click the Data Catalog link to open the Web tool (by default: https://127.0.0.1:9090/denodo-data-catalog).

Now login to the Data Catalog using the standard login details (admin/admin):

The first time you login to the Data Catalog, you will notice the Synchronize Metadata popup window. This needs to be run when you open the Data Catalog for the first time, in order to ensure that the Data Catalog reflects the latest state of the Denodo Virtual DataPort server you are connected to.

Run the VDP Synchronization as follows:

  1. Click the Synchronize the metadata now link.
  2. Click Continue on each Synchronization step.
  3. Once the synchronization is complete, you should now see the views and web services available in the tree view when navigating to the Browse > Database / Folders page

Great! Now let's continue with the next setup.

In this section We will explore the features of the Data Catalog Content Search. With this feature you can use Denodo Scheduler to index the content of your views using either ElasticSearch or the Denodo Scheduler Index Server. You can then allow your users to perform Google-like searches on them, and to customize how they see the search results.

We are going to index the fields of the bv_crm_client view to allow more rapid discovery of client details.

Our first Step is to configure an Index. Let's see how to do that using Denodo Scheduler

Creating an Index in the Denodo Scheduler

Start the Scheduler Server and the Scheduler Index Server (you can follow the latest section of the Caching tutorial to know how to start Denodo Scheduler). Once these are all running, open the Scheduler Administration Tool for creating an Index.

Create the Index following these steps:

  1. In the login screen of the Scheduler Administration Tool, provide the login details admin / admin and URI of the Scheduler Server.

  1. In the Denodo Scheduler we need to create a new job to create and maintain the Index. Click Add Job > VDPIndexer

  1. Give the Job a suitable name, in this case index_clients.

  1. Choose the following settings Under the Extraction section, while leaving the rest to default/blank:
  • Data Source: VDP
  • View: tutorial.bv_crm_client
  • Indexing process name: tutorial.bv_crm_client

  1. Under the Exporters section, click Add Exporter > Scheduler-Index and choose the following settings while leaving the rest to default/blank:
  • Data Source: Scheduler-Index
  • Index name: ix_client

  1. Save the Scheduler Job.
  2. Once the job is saved, you can execute the job by clicking three dots and then the Start option.

  1. The job will execute and once successfully complete, the Result status will change to COMPLETE, indicating that the Index has been populated.

Configuring the Index in the Data Catalog

We now need to configure the newly created Index in the Data Catalog, in order to ensure that the Data Catalog includes the Index as part of the searchable content.

  1. Open the Data Catalog and navigate to Administration > Set-up and management.
  2. In the Administration window, click on Search Engines option

  1. Click on + Add server option under Index Servers tab.

  1. Add the details as follows to the Add New Index Server screen.
  • Name: TutorialIndex
  • Type: Scheduler Index
  • Description: Tutorial Index
  • Host: sched (or localhost if you are using a local installation of Denodo)
  • Port: 9000
  • Login: admin
  • Password: admin
  1. Click Ok.

  1. Go to the Content Search tab, Click the Pencil Icon.

  1. In the Search index path screen, add the following details:
  • Index Type: Scheduler Index
  • Index Server: TutorialIndex
  • Index Name: ix_client
  1. Click Ok.

  1. The Index will display a green checkmark under the Configured column to indicate that the Index was added successfully.

Done! In the Using Denodo Data Catalog tutorial you will see the new Index in action. For now, let's continue setting up other configurations of the Data Catalog.

In this section, we will explore the feature of creating Property Groups. With this feature, we can use the Denodo Data Catalog to define custom properties; add them to databases, views and web services; and give them a value on each element in order to improve the available information on them.

In our example, we are going to create a new property group containing custom properties to identify the data owner of a view, and, as an example, assign it to the bv_crm_client view.

Creating a Property Group

Our first step is to create a property group to specify information for Data Ownership

  1. Open the Data Catalog and navigate to the Administration > Set-up and management.
  2. In the Administration window, click on the Property Groups option.

  1. In the Property group management page, click on the + Add Property Group button

  1. In the pop-up page that appears, specify the following details:

    Name: Data Ownership
    Description: This property group is to specify the data owner information of the data asset
    Show in: Summary tab

  1. Click Ok

Adding a Custom Property to the Property Group

Now that we have created the Property Group, it's time to add some Custom Properties to it to specify what kind of information we want to define to enhance the information of the views.

  1. In the Property group management page, click on the three dots button located at the rightmost section of the Data Ownership property group, and select Manage Properties option.

  1. In the next page, click the + Add Property button.

  1. In the pop-up page that appears, specify the following details:

    Name: Data Owner Name
    Description: The name of the data owner
    Property type: Text

  1. Click Ok
  2. Once we are back to the Properties Management page, we can add more custom properties to the property group, but for now, we will just add one property for this tutorial

Assigning the Property Group to a View

We can now assign our newly created Property Group with the relevant Custom Property to our bv_crm_client view to enhance it with the Data Ownership information.

  1. Navigate back to the Administration > Set-up and management > Property Group page.
  2. Click on the three dots button located at the rightmost section of the Data Ownership property group, and select Assign Property Group option.

  1. In the Assign property group page, select the bv_crm_client view, and drag it towards the Views section.

  1. In the pop-up page that appears, we can now specify the value of the Data Owner Name property, which we configured earlier.

    Specify Jane Smith in the text box and click Ok.

  1. Now, navigate to the Browse > Databases /Folders page, and click on the bv_crm_client view to open it.
  1. In the Summary tab of the view, we can now see the Data Ownership information of this view

Great! Let's continue setting up other configurations of the Data Catalog.

In this section, we will explore how to configure one of the best features of Data Catalog: the Assisted Queries. This feature lets you explain your data needs in natural language via the Natural language query input. Providing more details not only refines the search results but also elevates the quality of the displayed information. This extends also to the metadata of the view.

In this tutorial we will set up the required configuration to start using the Assisted Query feature.

Create an OpenAI API Secret Key

The Data Catalog provides 3 options on the API to use for the LLM services so you need the credentials of the service in order to configure it in the Data catalog.

In this tutorial, we will use the official public OpenAI API, as they provide $5 credit for new users (as of June 2024). So you can create a new account if needed to test this feature.

  1. Once you have your OpenAI account, in a web browser, access the https://platform.openai.com/api-keys page
  2. Navigate to API keys menu and click the + Create new secret key button

  1. In the pop up window, specify the name as DenodoAppKey and click the Create secret key

  1. Copy and keep the secret key somewhere for use in the next section

Configure the Assisted Query feature in the Data Catalog

  1. Login to the Data Catalog and navigate to the Administration > Set-up and management, click on Server, and go to the AI Integration tab.


  1. Specify the following information:

    Enable query generation: Enabled (please read the Disclaimer shown in the Data Catalog)
    Enable data usage: disabled
    Language options: User locale
    Execution mode: Manual
    API: OpenAI
    API key: the secret key obtained from previous section
    Organization ID: the service account id added in the previous section (e.g. my-denodo-data-catalog)
    Model: gpt-3.5-turbo-16k

Done! We are going to see the Assisted Query feature in action in the Using Denodo Data Catalog tutorial.

In this tutorial, we have configured various Data Catalog features which we are going to explore in the next. Head on to the next tutorial to see how these features work.

Thanks!