Server Set-Up¶
From the Server set-up dialog you can configure all the settings of the Data Marketplace server. It is divided into the following tabs:
VDP servers. Configure the Virtual DataPort servers available from the login page and the connection settings used in every query.
Authentication. Enable the single sign-on with Kerberos in the Data Marketplace.
Database. Configure an external database to store the metadata of Data Marketplace.
Permissions. Grant power users the privileges to do additional tasks.
LLM Configuration. Configure the LLM provider and proxy for Denodo Assistant features.
Vector DB. Configure the vector database integration to index the metadata and optionally data of Data Marketplace for the Global Assisted Query feature.
Configure the Virtual DataPort Servers¶
In the Servers section of the VDP servers tab you can configure the Virtual DataPort servers that are listed in the login dialog.
Note
VDP Server configuration is disabled in Agora.
Dialog to configure the Virtual DataPort servers listed in the login page¶
To add a new Virtual DataPort server, click on the Add server button and enter the following values:
Name. The name that will be shown in the login page for this server.
URL. The connection URL of the server, which follows the pattern
//<host>:<port>/<database>. You should take into account the following considerations:
Only those users granted the CONNECT privilege on the database will be able to connect to the Data Marketplace.
If the server is configured with LDAP authentication for a specific database, use this database in the URL.
Description. A description for the server.
Dialog to add a new Virtual DataPort server¶
Note
Saved queries are stored per user and Virtual DataPort server. The servers are identified by an internal identifier, so if a server is edited, it will keep the same queries associated with it. However, if the server is removed, the queries are removed too.
Note
For Automated Cloud Mode you can configure the default server by editing the following properties in the file $DENODO_HOME/conf/data-catalog/DataCatalogBackend.properties
com.denodo.dc.vdp.port=<port>
com.denodo.dc.vdp.host=<hostname>
com.denodo.dc.vdp.servername=<servername>
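For example, assuming a hypothetical Virtual DataPort server named production running on vdp.example.com and listening on the default port 9999, these properties would look like this:
com.denodo.dc.vdp.port=9999
com.denodo.dc.vdp.host=vdp.example.com
com.denodo.dc.vdp.servername=production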
Configure the Connection Settings to the Virtual DataPort Servers¶
In the Connection settings section of the VDP servers tab you can configure the global parameters of the connections created to execute queries against the Virtual DataPort Servers.
Dialog to configure the connection settings to the Virtual DataPort servers¶
These parameters are:
Query timeout. Maximum time in milliseconds that the query will wait for the statement to be completed. If not indicated (or you enter 0 as a value), then it will wait until the execution is complete. By default it is 900,000 milliseconds.
Chunk timeout. Maximum time in milliseconds that the query will wait for a new chunk of results to arrive. When this time is exceeded, Virtual DataPort returns the results extracted up to that moment. If not specified (or you enter 0 as a value), Virtual DataPort returns all the results together at the end of the statement run. By default it is 90,000 milliseconds.
Chunk size. Number of results that make up a chunk of results. When Virtual DataPort obtains this number of results, it will return them to the Data Marketplace, even though the Chunk timeout has not been reached. By default it is 1,000 rows.
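To illustrate how these settings interact with the default values: a long-running query returns a partial batch of results as soon as either 1,000 rows accumulate (Chunk size) or 90 seconds pass without a new chunk (Chunk timeout), whichever happens first, and the statement as a whole times out if it has not completed after 15 minutes (Query timeout).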
Enable Single Sign-On with Kerberos¶
In the Authentication tab you can enable single sign-on with Kerberos in the Data Marketplace.
Note
Kerberos configuration is disabled in Agora since SSO is automatically configured.
Dialog to configure Kerberos authentication in the Data Marketplace¶
Follow these steps:
Enable the Use Kerberos option.
Enter the Service Principal Name of the Data Marketplace in the Server Principal field. You should enter the same value used to generate the keytab file (a sample keytab generation command is shown after these steps).
Drag-and-drop the keytab file to the Keytab file field. As an alternative, you can click the field and select it from the file browser.
Consider adding a krb5 file to the Configuration file field if one of the following conditions is met and there is no krb5 file in the default location of your system:
The host where the Data Marketplace runs does not belong to a Kerberos realm (e.g., a Windows Active Directory domain).
The host where the Data Marketplace runs is Linux/Unix.
The service account in Active Directory configured for the Data Marketplace has the option constrained delegation enabled.
You are in a cross-domain scenario. That is, the organization has several domains.
Otherwise, you can leave the field empty.
See also
For more details on the krb5 file, go to Providing a Krb5 File for Kerberos Authentication.
In case you run into any issues, select the Activate Kerberos debug mode option. Otherwise, disable it.
See also
For more details on debugging Kerberos issues, go to How to Debug Kerberos in Web Applications.
Click the Save button to confirm the Kerberos configuration. This configuration will take effect immediately. There is no need to restart the Data Marketplace.
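For reference, on Windows Active Directory the keytab mentioned above is typically generated with the ktpass tool. A minimal sketch, where the host name, realm and service account are hypothetical placeholders that you must replace with your own values:
ktpass -princ HTTP/datacatalog.example.com@EXAMPLE.COM -mapuser svc-datacatalog@EXAMPLE.COM -pass <password> -ptype KRB5_NT_PRINCIPAL -crypto AES256-SHA1 -out datacatalog.keytab
The value passed to -princ is the Service Principal Name that must also be entered in the Server Principal field.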
Important
If no user is able to log in to the Data Marketplace due to an inappropriate configuration of the Kerberos authentication, remember that you can still connect to the Data Marketplace using local authentication or login and password authentication. Then you can modify the Kerberos settings or enable its debug mode.
Use an External Database for the Data Marketplace¶
This section explains how to configure the Data Marketplace to store its settings on an external database. This is necessary if you want to set up a cluster of Data Marketplace servers so all the servers share the same settings and metadata.
Note
This option is disabled when accessing Data Marketplace from Agora since Agora automatically configures an external database.
By default, the Data Marketplace stores the global settings and certain settings for each user in an embedded database (Apache Derby). For example, the saved queries for a user, the fields selected by the administrator to display by default on a view search, etc. You can configure Data Marketplace to store all this in an external database.
The Data Marketplace supports the following databases:
MySQL
Note
Make sure that MySQL (or the database in MySQL you are going to use) is configured with the options Default Charset = utf8mb4 and Collation = utf8mb4_unicode_ci in the case of MySQL 5, or Collation = utf8mb4_0900_ai_ci in the case of MySQL 8.
Oracle
Note
When using Oracle as an external database, it is essential to set the MAX_STRING_SIZE parameter to EXTENDED, as explained in the official Oracle documentation. The default setting, STANDARD, limits text fields (specifically the VARCHAR2 and NVARCHAR2 data types) to 4000 bytes. This restriction significantly impacts the number of characters that can be stored, especially when dealing with multibyte characters.
PostgreSQL
SQL Server
Amazon Aurora for MySQL and PostgreSQL
Azure SQL Server
Note
The minimum version required of each database is:
MySQL 5.7
Oracle 12c
PostgreSQL 11
SQL Server 2014
Amazon Aurora for MySQL 5.7 and Amazon Aurora for PostgreSQL 11
Follow these steps to store the metadata of the Data Marketplace on an external database:
In the external database, create a catalog or a schema for the metadata of the Data Marketplace.
Although you can use an existing schema, we suggest creating one to keep the tables separate from the tables of other applications. We suggest reserving 5 gigabytes of space for this schema. In most cases, less space will be required. However, we recommend a high value to avoid issues due to the lack of space in the database.
In this database, create a service account that has create, read and write privileges on that database.
Consider enabling the high availability features of this database to meet higher uptime requirements.
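As an illustration of this step, in PostgreSQL the schema and service account could be created as follows (a minimal sketch; the schema name, user name and password are placeholders):
-- Dedicated login for the Data Marketplace
CREATE USER data_catalog_user WITH PASSWORD 'changeme';
-- Schema owned by that user, so it has create, read and write privileges on it
CREATE SCHEMA data_catalog AUTHORIZATION data_catalog_user;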
Copy the JDBC driver of the database to the directory <DENODO_HOME>/lib/data-catalog-extensions. Take this into account:
To use Oracle 12, only copy the files ojdbc8.jar and orai18n.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/oracle-12c.
To use Oracle 18, only copy the files ojdbc8.jar and orai18n.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/oracle-18c.
To use Oracle 19, only copy the files ojdbc8.jar and orai18n.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/oracle-19c.
To use Oracle 21, only copy the files ojdbc8.jar and orai18n.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/oracle-21c.
To use SQL Server, you have two different drivers available:
For any version, use the Microsoft driver. This is the recommended option. To use it, only copy the file mssql-jdbc-10.2.0.jre8.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/mssql-jdbc-10.x.
For the 2014 and 2016 versions, the jTDS driver is also available. To use it, copy the file denodo-jtds-1.3.1.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/denodo-jtds-1.3.1.
To use Amazon Aurora for MySQL 5.7, only copy the file mariadb-java-client-2.7.1.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/mariadb-2.7.
To use Amazon Aurora for PostgreSQL 11, only copy the file postgresql-42.7.2 from <DENODO_HOME>/lib/extensions/jdbc-drivers/postgresql-11.
To use Azure SQL Server, only copy the file mssql-jdbc-7.2.2.jre8.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/mssql-jdbc-7.x.
To use Azure SQL Server via Active Directory, copy the files mssql-jdbc-7.2.2.jre8.jar, adal4j-1.6.7.jar, gson-2.9.0.jar and oauth2-oidc-sdk-8.36.2.jar from <DENODO_HOME>/lib/extensions/jdbc-drivers/mssql-jdbc-7.x; and accessors-smart.jar, json-smart.jar, javax.mail.jar and nimbus-jose-jwt.jar from <DENODO_HOME>/lib/contrib.
To use Azure SQL Server as the external metadata database with the ActiveDirectoryPassword authentication method, copy the file <DENODO_HOME>/lib/contrib/content-type.jar to the directory <DENODO_HOME>/lib/data-catalog-extensions, in addition to the files already specified.
You may find the JDBC drivers of other databases in <DENODO_HOME>/lib/extensions/jdbc-drivers.
Log in to the Data Marketplace with an administrator account and export the metadata.
This is necessary because the metadata of the Data Marketplace is not transferred automatically from the current database to the new one. You can skip this step if this is a fresh installation and you have not done any other change to this Data Marketplace.
Stop all the components of the Denodo Platform. Then, execute <DENODO_HOME>/bin/webcontainer_shutdown to make sure the web container is stopped.
Start Data Marketplace and the other components of the Denodo Platform.
Log in to the Data Marketplace with an administrator account and go to Administration > Set-up and management > Server. In the Database tab you will find the dialog to configure the external database.
Dialog to configure the database where the Data Marketplace stores its metadata¶
Provide the following information:
Database. Select the database you want to use.
Driver class. The name of the Java class of the JDBC driver to be used. The default value is usually correct.
URL. The connection URL to the database.
Important
For SQL Server using the Microsoft driver 10.x and above, it is necessary to set the encrypt property to false. To do so, in the database URL, add encrypt=false at the end. For example:
jdbc:sqlserver://host:port;databaseName=database;encrypt=false
Authentication. Authentication method for accessing the external database. It is only available for Oracle and SQL Server 2014 databases. Select one of these methods:
Login and Password: The default authentication method. It uses the credentials of the account used to connect to the database: Username (optional) and Password (optional).
Kerberos with password: Use Kerberos authentication, with the provided Username and Password (from the Active Directory account).
Kerberos with keytab: Use Kerberos authentication, with the provided Username (in this case, the Service Principal Name - SPN -) and Keytab file (no password needed). Drag-and-drop the keytab file to the Keytab file field or, as an alternative, click the field and select it from the file browser.
Kerberos with Windows User: Use Single Sign-On (SSO) with Kerberos, doing pass-through with the user that launched the server (no user name nor password required).
Denodo AWS Instance Credentials. Parameters needed for accessing an Amazon Aurora data source using instance credentials:
AWS IAM role ARN (optional): An IAM identity that you can create in your account that has specific permissions.
AWS region: AWS region where the Aurora database is deployed.
Database user: Database account that you want to access.
Maximum lifetime of a connection in the pool (minutes) (optional): The connection lifetime should be a little less than the lifetime of an AWS authentication token to avoid dropped pool connections. The Data Marketplace assumes 14 minutes by default since, according to IAM database authentication for MariaDB, MySQL, and PostgreSQL, the tokens have a default lifetime of 15 minutes.
AWS IAM Credentials. The parameters to authenticate using IAM credentials are necessary in addition to the Amazon Aurora access parameters defined above:
AWS access key id: A unique identifier that specifies the user or entity making the request.
AWS secret access key: A secret string that is used to sign the requests you make to AWS.
Note
The AWS IAM Credentials and Denodo AWS Instance Credentials options are only available when the database is Amazon Aurora MySQL or Amazon Aurora PostgreSQL. They require SSL, and you have to follow the instructions in Adding AWS RDS Certificate to JRE cacerts.
Note
When the selected database is Derby Embedded, the fields Driver class, URL, Username and Password are not editable.
Note
To configure SQL Server 2014 with the jTDS driver, use the value net.sourceforge.jtds.jdbc.Driver as the Driver class and enter a URL that follows the pattern jdbc:jtds:sqlserver://<host>:<port>/<database>.
Schema. Select the schema you want to use. If the field is empty, the database default schema will be used.
Note
This option is available only for databases that support this concept of schema: PostgreSQL and SQL Server.
To speed up the queries, the Data Marketplace creates a pool with connections to the database. You can configure this pool with the following optional settings:
Maximum pool size. The maximum number of actual connections to the database, including both idle and in-use connections.
Minimum idle connections. The minimum number of idle connections that the Data Marketplace tries to maintain in the pool. If the idle connections dip below this value and total connections in the pool are less than Maximum pool size, the Data Marketplace will make a best effort to add additional connections.
Connection timeout. The maximum number of milliseconds that the Data Marketplace will wait for a connection from the pool. If this time is exceeded without a connection becoming available, an error will be thrown.
Ping query. The query that will be executed just before using a connection from the pool to validate that it is still alive. This is required only for legacy drivers that do not support the JDBC 4.0 Connection.isValid() API. If your driver supports JDBC 4.0, we strongly recommend not setting this property.
Note
The jTDS driver for Microsoft SQL Server is considered legacy since it only supports JDBC 3.0. To use it with the Data Marketplace you need to provide a value for the Ping query field (for example, SELECT 1).
Click the Save button.
The Data Marketplace will check that it can reach the database and that the necessary tables exist:
If the JDBC driver for the selected database cannot be loaded, you will see the following warning.
Warning when the JDBC driver cannot be loaded¶
The configuration can still be saved, but the JDBC driver has to be available before restarting the Data Marketplace.
If for some reason the database cannot be reached, you will see the following warning.
Warning when cannot connect to the database¶
Again, the configuration can be saved, but you have to fix the error before restarting the Data Marketplace; otherwise, saving it is not recommended.
If the database or schema (in PostgreSQL and SQL Server) does not have the necessary tables, you will see the following dialog.
Dialog to create the tables in the database or schema¶
Confirm if you want the Data Marketplace to create the tables for you. It will use the user account you configured for the database, so make sure it has DDL privileges on the database. Alternatively, you can specify another user account with the right privileges.
Click the Cancel button if you want to run the script manually. You will find it in the folder <DENODO_HOME>/scripts/data-catalog/sql/db_scripts.
Important
If the scripts for creating tables are executed manually, then in each future platform update, and before starting the Data Marketplace again, you will have to manually execute any new scripts that update the database schema, since in this case the Data Marketplace will not be able to do it automatically.
If you still want to run the scripts manually, go to the folder <DENODO_HOME>/scripts/data-catalog/sql/db_scripts/<DATABASE_TYPE> and run these scripts in the schema created for this purpose and configured in the Data Marketplace (a sample command is sketched after this list):
In the case of PostgreSQL and SQL Server, the table prefix must be replaced with the desired schema or removed to use the default schema.
In the case of starting from an empty schema, all the scripts present in the indicated folder must be executed in ascending version order.
In the case of a platform update, all the scripts present in the indicated folder that have not been executed in previous updates must be executed in ascending version order from the last script that has been executed.
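As a minimal sketch of this procedure for PostgreSQL with the psql client (host, user and database are placeholders), the scripts can be run in ascending version order with a version-aware sort:
# From <DENODO_HOME>/scripts/data-catalog/sql/db_scripts/<DATABASE_TYPE>
for script in $(ls *.sql | sort -V); do
    psql -h <host> -U <user> -d <database> -f "$script"
done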
If you are setting up a cluster of Data Marketplace servers, all servers will share the same database. Therefore, you only need to create the tables in the database once.
Restart Data Marketplace to apply the changes.
If the Data Marketplace does not start due to an error in the database configuration, you can restore the default database configuration manually. Edit the file <DENODO_HOME>/conf/data-catalog/datasource.properties and modify its content to match the following (replace <DENODO_HOME> with the path to the Denodo installation). The property datasource.url.default will contain the path to the database, so you can copy its content into the spring.datasource.url property.
datasource.type=derby
datasource.url.default=jdbc:derby:<DENODO_HOME>/metadata/data-catalog-database;create=true
spring.datasource.driver-class-name=org.apache.derby.jdbc.EmbeddedDriver
spring.datasource.url=jdbc:derby:<DENODO_HOME>/metadata/data-catalog-database;create=true
spring.datasource.username=
spring.datasource.password.secret=
Restart Data Marketplace again to apply the changes.
Import the metadata you exported in step #3.
If you are building a cluster of instances of Data Marketplace, configure the load balancer to redirect all the requests of the same web session to the same node of the cluster (i.e. sticky sessions).
Configure the Permissions¶
In the Permissions tab you can configure the privileges granted to a role. The privileges are divided into six categories: Management, Administration, Collaboration, User, Request and External element. To grant or revoke privileges to a role you have to:
Navigate through the categories in the dialog.
Select or deselect the privileges you want for the role.
Click the Save button to apply changes.
After that, all users with that role will be able to perform the tasks associated with the selected privileges.
You can find a detailed explanation of each privilege in the Privileges on the Data Marketplace section.
Dialog to configure the user privileges¶
In this dialog a check box can have one of three states. Let us see what each one of them means:
The privilege is not granted to the role.
The privilege has been granted to some elements for this role in the advanced permissions dialog.
The privilege is granted to the role.
If a role has been removed from Virtual DataPort it is displayed in red in the
permissions table. Click the icon next to its name to remove it from
the Data Marketplace too.
Note
To configure privileges in the Permissions dialog, a user needs:
The Permissions privilege in Data Marketplace, which gives access to the Permissions dialog.
One of the following requirements to access all roles available in the Virtual DataPort server:
The ADMIN privilege for a database in Virtual DataPort.
The assignprivileges role.
The create_role role.
The drop_role role.
The assign_all_privileges role.
The assign_all_roles role.
The manage_policies role.
The read_all_privileges role.
Take into account that the allusers role is only available for those users with the role assign_all_privileges, assign_all_roles or read_all_privileges.
By default, all privileges granted in the Permissions tab will apply to all elements in the catalog. However, for
the privileges to manage requests you can be more specific and restrict the databases, views or web services that will
be affected by the privilege. For instance, you can grant the Manage access privilege to a role but only for those
requests in which an element of a particular database participates. To access this advanced view of the Permissions
tab, click the icon under the Advanced column for the role you want to configure.
Dialog to grant a privilege to a role but only on specific elements of the catalog¶
To grant a privilege to a view or web service, go to the VDP element section and just select the corresponding check box for that element. Databases work slightly differently:
Select a check box for a database and the privilege will be granted to all elements under the database, the current ones and the future ones. If the privilege is applicable for a database, it will be granted to the database too.
If you want to grant the privilege only to the database, then click the icon next to the check box and select the Database option.
In this dialog a check box can have one of four states. Let us see what each one of them means:
The privilege is not granted to the element.
The privilege is granted only to the database. Its elements will not inherit the privilege.
Depending on the element, the meaning of this state can be different:
In case of databases, the privilege is granted to the database (if applicable) and all its elements.
In case of views and web services, the privilege is granted to the element.
The privilege is granted to the element and cannot be revoked because it is inherited from the database or from the global privilege. Remember that when a privilege is granted from the main dialog of the Permissions tab it will affect all elements in the catalog.
As you can see in the dialog, folders cannot be the target of a privilege. However, they allow you to grant or revoke
privileges to several elements at once. Click the icon next to the folder name and select one of the
privileges under the Select on all folder elements menu to grant the selected privilege on all current elements
under the folder. If you select a privilege under the Clear on all folder elements menu, the privilege will be
revoked.
The catalog may contain many elements, and finding one in particular could be difficult. Notice that you can filter
your catalog using the tools in the Elements title. Filter by element name by typing a keyword in the search bar or by
element type using the button.
To grant a privilege to an external element, go to the External element section and select the corresponding check box for that element.
Dialog to grant a privilege to a role but only on specific external elements of the catalog¶
The list of external elements in the table could be too large. Notice that you can filter
the external elements using the tools in the Elements title. Filter by element name by typing a keyword in the search bar. You can also filter the external elements by tags, categories or external tool servers using the button.
Large Language Models (LLM) Configuration¶
In the LLM Configuration tab you can configure the LLM-related settings of the Data Marketplace. This configuration is separated into two sections: Provider Configuration and Proxy Configuration.
LLM Configuration tab.¶
Use the buttons at the top to perform one of the following actions:
Save. Stores the LLM configuration in the database.
Clear. Removes temporary changes that were made in the form but are not saved in the database.
Test Configuration. Checks whether the LLM configuration specified in the form is valid. An LLM configuration is considered valid if the LLM provider successfully responds to a test request.
Note
To see the HTTP logs for the Test Configuration process, use the same method described for the Assisted Query in the Log Assisted Query HTTP information section.
Now, these two sections of the LLM configuration are going to be explained in detail.
Provider Configuration¶
In the Provider Configuration pill of the LLM Configuration tab you can configure the large language model provider that the Denodo Assistant will use. Currently, there are four options when configuring a provider: Amazon Bedrock, Azure OpenAI Service, OpenAI or Other (OpenAI API Compatible).
Amazon Bedrock (Anthropic Claude). This option allows you to configure the official public Amazon Bedrock API. At the moment, only Amazon Bedrock models from the Claude family of the model provider Anthropic are supported.
Azure OpenAI Service. This option allows you to configure the official public Azure OpenAI Service API or a custom one.
OpenAI. This option allows you to configure the official public OpenAI API.
Other (OpenAI API Compatible). This option allows you to configure an OpenAI API compatible provider.
Important
If you are configuring a provider for the Assisted Query feature you must take into account that the fixed portion of the prompt provided by Denodo to the model occupies 2300 tokens, with an additional number of tokens reserved for the model’s response.
It’s crucial that the model being used supports a sufficient number of tokens not only for the fixed portion of the prompt and the reserved tokens for the response, but also for the schema (field names and their types) of the views it will interact with. While sending the view schema is essential for system functionality, having a context window token limit that accommodates additional information, such as field descriptions, associations with other views or example values is highly recommended.
Models supporting at least 10,000 tokens of context window could be adequate. However, depending on the length of the view metadata (for example, the length of descriptions), this limit may prove insufficient. If the token count falls short for transmitting all necessary information, the Data Marketplace automatically trims it down using the token reduction algorithm, prioritizing the view schema and essential function information.
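As a worked example: with a model limited to 10,000 context-window tokens and, say, 1,000 tokens reserved for the response, roughly 10,000 - 2,300 - 1,000 = 6,700 tokens remain for the view schema and any additional metadata (descriptions, associations, example values); a handful of views with long field descriptions can exhaust this budget and trigger the token reduction algorithm.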
Amazon Bedrock (Anthropic Claude) Provider¶
In this section, the Amazon Bedrock (Anthropic Claude) provider parameters are going to be enumerated and explained.
Dialog with the Amazon Bedrock (Anthropic Claude) provider configuration parameters¶
Authentication. Authentication method for accessing Amazon Bedrock. Choose between AWS IAM credentials or Denodo AWS instance credentials.
AWS IAM credentials. Use this option if you want to specify the AWS credentials.
AWS access key ID. Unique identifier that specifies the user or entity making the request.
AWS secret access key. Secret string that is used to sign the requests you make to AWS.
AWS IAM role ARN (optional). IAM identity with specific permissions.
AWS region. The AWS region where access to the Amazon Bedrock service is available.
Denodo AWS instance credentials. Use this option if the AWS credentials are configured in the instance where Denodo is running.
AWS IAM role ARN (optional). IAM identity with specific permissions.
AWS region. The AWS region where access to the Amazon Bedrock service is available.
Model ID. The identifier used within the Amazon Bedrock service to specify which model should be used. If you enter a custom Model ID, select a model of the “Claude” family of the “Anthropic” model provider.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). If you configure a model from the list, the Data Marketplace will use the official model’s context window value. If you select a custom model and you do not introduce the context window value, a default value of 16385 tokens will be assigned to this parameter. You can choose the custom option to type the number you want.
Max output tokens. The maximum number of tokens that the model can use to generate the response. If you configure a model from the list, the Data Marketplace will use the official model’s max output tokens. If you select a custom model and you do not introduce a max output tokens value, a default value of 4096 tokens will be assigned to this parameter. You can choose the custom option to type the number you want.
See also
For more information on the AWS authentication parameters go to the Amazon AWS security credentials reference.
Azure OpenAI Service Provider¶
In this section, the Azure OpenAI Service provider parameters are going to be enumerated and explained.
The Specify URI parameter allows configuring a custom Chat Completions URI when enabled. A custom API is eligible if it implements the official Chat Completions Azure OpenAI Service API (see https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#chat-completions). Not all the parameters of the Chat Completions Azure OpenAI Service API are needed for the custom API to be compatible with the Denodo Assistant:
Request body. The Denodo Assistant will make requests with a request body having the following parameters: messages and temperature.
Response body. The Denodo Assistant needs the following parameters in the response body: id, object, created, choices and usage.
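For illustration, a minimal exchange with such an API could look as follows (the message contents and values are hypothetical placeholders; the actual prompts are built by the Denodo Assistant):
Request body:
{
  "messages": [
    {"role": "system", "content": "You generate queries."},
    {"role": "user", "content": "List customers by region"}
  ],
  "temperature": 0
}
Response body (only the fields the Denodo Assistant needs):
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "choices": [{"index": 0, "message": {"role": "assistant", "content": "SELECT ..."}, "finish_reason": "stop"}],
  "usage": {"prompt_tokens": 2350, "completion_tokens": 42, "total_tokens": 2392}
}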
If the Specify URI parameter is disabled the following parameters are available:
Dialog with the Azure OpenAI Service provider configuration parameters when the Specify URI parameter is disabled¶
Azure resource name. The name of your Azure resource.
Azure deployment name. The deployment name you chose when you deployed the model.
API version. The API version to use for this operation. This follows the YYYY-MM-DD format.
API key. The API key.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). The default value is 16385.
Max output tokens. The maximum number of tokens that the model can use to generate the response. The default value is 4096 tokens.
If the Specify URI parameter is enabled the following parameters are available:
Dialog with the Azure OpenAI Service provider configuration parameters when the Specify URI parameter is enabled¶
Authentication. Configure this option depending on whether the custom provider requires authentication.
API key. The API key. If the authentication is turned on, this parameter is required.
Chat Completions URI. The URI of your custom API. This parameter is required.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). The default value is 16385.
Max output tokens. The maximum number of tokens that the model can use to generate the response. The default value is 4096 tokens.
See also
For more information on the Azure OpenAI Service provider parameters go to the Azure OpenAI Service REST API reference.
OpenAI Provider¶
In this section, the OpenAI provider parameters are going to be explained.
Dialog with the OpenAI provider configuration parameters.¶
API key. This is the OpenAI API key. This parameter is required.
Organization ID (optional). A header to specify which organization an API request is sent from. This is useful for users who belong to multiple organizations. This parameter is not required.
Model. The model that will be used to generate the query. The drop-down values are the ones tested by Denodo. However, if you want to try an untested OpenAI model, you can configure it by pressing the edit icon.
Important
Access to the models listed in the Model field depends on your OpenAI account and organization. You may not have access to all of them.
Configuring an untested OpenAI model can potentially result in erroneous behavior within the functionality.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). If you configure a model from the list, the Data Marketplace will use the official model’s context window value. If you select a custom model and you do not introduce the context window value, a default value of 16385 tokens will be assigned to this parameter. You can choose the custom option to type the number you want.
Max output tokens. The maximum number of tokens that the model can use to generate the response. If you configure a model from the list, the Data Marketplace will use the official model’s max output tokens. If you select a custom model and you do not introduce a max output tokens value, a default value of 4096 tokens will be assigned to this parameter. You can choose the custom option to type the number you want.
See also
For more information on the OpenAI provider parameters, go to the OpenAI provider reference website.
Other (OpenAI API Compatible)¶
In this section, the Other (OpenAI API Compatible) provider parameters are going to be enumerated and explained.
The Other (OpenAI API Compatible) option allows the Denodo Assistant to send and process requests from APIs following the OpenAI Chat Completions API approach. We define that an API follows this approach if it implements the official Chat Completions OpenAI API (see https://platform.openai.com/docs/api-reference/chat). Not all the parameters of the Chat Completions OpenAI API are needed for the API to be compatible with the Denodo Assistant:
Request body. The Denodo Assistant will make requests with a request body having the following parameters: model, messages and temperature.
Response body. The Denodo Assistant needs the following parameters in the response body: id, object, created, choices and usage.
Dialog with the Other (OpenAI API Compatible) provider configuration parameters¶
Authentication. Configure this option depending on whether the custom provider requires authentication.
API key. The API key. If the authentication is turned on, this parameter is required.
Organization ID (optional). A header to specify which organization an API request is sent from. This is useful for users who belong to multiple organizations. Only available when the authentication is turned on. This parameter is not required.
Chat Completions URI. The Chat Completions endpoint. This parameter is required.
Model. The model that will be used to generate the query. This parameter is required.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). The default value is 16385.
Max output tokens. The maximum number of tokens that the model can use to generate the response. The default value is 4096 tokens.
In addition to OpenAI models such as gpt-oss-20b, several others support the Chat Completions API, including Google’s Gemini and models that can be served using Ollama, such as Llama 3.3, Mistral, and Phi-4. The following examples demonstrate how to configure both a Gemini model and models served via Ollama.
Vertex AI Platform. Follow these steps to use a deployed Gemini model via the Vertex AI Platform.
Select Other (OpenAI API Compatible) as the Provider.
If authentication is required, enter the API key. You can retrieve a temporary access token using the following command in Google Cloud Shell:
gcloud auth print-access-token
Specify the URI of the chat completions endpoint for the Vertex AI API. This URI must include your specific {location} and {project} values: https://{location}-aiplatform.googleapis.com/v1beta1/projects/{project}/locations/{location}/endpoints/openapi/chat/completions.
In the Model field, enter the full name of your deployed Gemini model, such as google/gemini-2.0-flash.
By following these steps, Denodo can securely connect to and utilize your deployed Vertex AI model.
Ollama. Follow these steps to use a local LLM hosted by the Ollama server.
Before you begin, ensure you have the following:
The Ollama server is installed and running on your machine.
You have pulled the model you wish to use (e.g., gpt-oss:20b or llama3.2:3b). The model must be available on your local Ollama server.
ollama pull gpt-oss:20b
Follow these steps to complete the configuration:
Select Other (OpenAI API Compatible) as the Provider.
Leave the API Key field blank, as a locally running Ollama server typically does not require authentication.
Specify the URI for the Ollama chat completions endpoint. This will usually be your local machine’s address and the default Ollama port (11434), followed by the API path:
http://localhost:11434/v1/chat/completions.
In the Model field, enter the name of the model you have pulled in Ollama, such as gpt-oss:20b or llama3.2:3b.
By following these steps, Denodo can connect to your local Ollama server and utilize the selected model.
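As a quick sanity check before saving the configuration, you can verify that the endpoint responds from a terminal (a sketch assuming the default port and the llama3.2:3b model):
curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "llama3.2:3b", "messages": [{"role": "user", "content": "Hello"}]}'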
HTTP Proxy Configuration¶
In the Proxy Configuration pill of the LLM Configuration tab you can specify the HTTP proxy configuration:
Dialog with the proxy configuration parameters¶
Enable proxy. This toggle enables the connection via proxy. If this is enabled, the requests made to the API are going to be sent to the proxy. This option should be enabled if a proxy has been configured between the Data Marketplace and the provider.
Host. The hostname (IP or DNS name) of the proxy.
Port. The port number. This parameter is mandatory in order to use the proxy.
Proxy name (optional). The username.
Proxy password (optional). The password.
Important
If the proxy configured requires basic authorization, the following property must be added to the JVM properties of the web container:
-Djdk.http.auth.tunneling.disabledSchemes=""
Vector Database Configuration¶
In the Vector DB tab you will be able to set up the vector database configuration of the Data Marketplace.
This configuration is separated into four sections: Connection Configuration, Embedding Model Configuration, Index Configuration and Servers Configuration.
Connection Configuration¶
Use the connection configuration section to enter the connection details of the vector database.
Note
This option is disabled when accessing Data Marketplace from Agora since Agora automatically configures the vector database connection.
First, activate the metadata vectorization process using the Enable the metadata vectorization toggle. This will reveal the connection parameters:
Dialog with the connection configuration parameters¶
Database adapter. The vector database provider. Currently, Denodo only supports configuring PGVector as the database adapter.
Important
Before configuring and using the vector database, the PostgreSQL JDBC driver needs to be located in the <DENODO_HOME>/lib/data-catalog-extensions folder to work with PGVector. A restart of the Data Marketplace is required after placing the driver.
URL. The connection URL to the database.
Username. Database username.
Password. Database user password.
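For reference, with the PGVector adapter the URL follows the standard PostgreSQL JDBC pattern; a sketch with placeholder host, port and database name:
jdbc:postgresql://<host>:5432/<vector_database>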
Embedding Model Configuration¶
Denodo supports these embedding model providers: Amazon Bedrock (Titan), Azure OpenAI Service, OpenAI and Other (OpenAI API Compatible).
Amazon Bedrock (Titan)¶
Dialog with the Amazon Bedrock (Titan) configuration parameters¶
Authentication. Authentication method for accessing Amazon Bedrock. Choose between AWS IAM credentials or Denodo AWS instance credentials.
AWS IAM credentials. Use this option if you want to specify the AWS credentials.
AWS access key ID. Unique identifier that specifies the user or entity making the request.
AWS secret access key. Secret string that is used to sign the requests you make to AWS.
AWS IAM role ARN (optional). IAM identity with specific permissions.
AWS region. The AWS region where access to the Amazon Bedrock service is available.
Denodo AWS instance credentials. Use this option if the AWS credentials are configured in the instance where Denodo is running.
AWS IAM role ARN (optional). IAM identity with specific permissions.
AWS region. The AWS region where access to the Amazon Bedrock service is available.
Model ID. The identifier used within the Amazon Bedrock service to specify which model should be used. If you enter a custom Model ID, select a model of the “Titan” family.
Max tokens. Maximum tokens allowed by the deployed model.
For more details on Amazon Titan embedding models, go to the Amazon Titan embedding models reference. For more information on the AWS authentication parameters, go to the Amazon AWS security credentials reference.
Azure OpenAI Service¶
Dialog with the Azure OpenAI Service configuration parameters¶
Azure resource name. The name of your Azure resource.
Azure deployment name. The deployment name you chose when you deployed the model.
API version. The API version to use for this operation. This follows the YYYY-MM-DD format.
API key. The API key.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. This includes both the input (Data Marketplace request) and the output (model response). The default value is 16385.
Enable proxy. Enable this if there is a proxy between the Data Marketplace and the Azure OpenAI Service.
Host. The hostname (IP or DNS name) of the proxy.
Port. The port number. This parameter is mandatory in order to use the proxy.
Proxy name (optional). The username.
Proxy password (optional). The password.
For more information on the Azure OpenAI Service API parameters, go to the Azure OpenAI Service REST API reference.
OpenAI¶
Dialog with the OpenAI configuration parameters¶
API key. This is the OpenAI API key. This parameter is required.
Organization ID (optional). A header to specify which organization an API request is sent from. This is useful for users who belong to multiple organizations. This parameter is not required.
Model. Embedding model to use. The drop-down values are the ones tested by Denodo. However, if you want to try an untested OpenAI model, you can configure it by pressing the edit icon.
Important
Access to the models listed in the Model field depends on your OpenAI account and organization. You may not have access to all of them.
Configuring an untested OpenAI model can potentially result in erroneous behavior within the functionality.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. If you configure a model from the list, the Data Marketplace will use the official model’s context window value. If you select a custom model and you do not introduce the context window value, a default value of 8191 tokens will be assigned to this parameter. You can choose the custom option to type the number you want.
For more details, see the OpenAI embedding models documentation https://platform.openai.com/docs/guides/embeddings#embedding-models. For OpenAI and Other (OpenAI API Compatible) embedding models, you can set up a proxy without authentication using the following JVM properties of the Virtual DataPort server and the web container:
-Dhttps.proxyHost=proxyHost.com
-Dhttps.proxyPort=1234
Other (OpenAI API Compatible)¶
Dialog with the Other (OpenAI API Compatible) configuration parameters¶
Authentication. Configure this option depending on whether the custom provider requires authentication.
API key. The API key. If the authentication is turned on, this parameter is required.
Embeddings URI. The embeddings endpoint. This parameter is required.
Model. Embedding model to use. This parameter is required.
Context window (tokens). The maximum amount of tokens that the model can process in a single interaction. The default value is 8191.
The Other (OpenAI API Compatible) option allows the Denodo Assistant to send and process requests from APIs following the OpenAI Create Embeddings approach. An API follows this approach if it adheres to the official Create Embeddings API (see https://platform.openai.com/docs/api-reference/embeddings/create).
There are models that adhere to the OpenAI Create Embeddings approach and are not made by OpenAI. These include Ollama models, such as nomic-embed-text, mxbai-embed-large or all-minilm, which can even be run locally. To configure these models, ensure you use the v1 endpoint: /v1.
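For illustration, a Create Embeddings exchange has roughly this shape (the model name and input text are hypothetical placeholders):
Request body:
{"model": "nomic-embed-text", "input": "customer orders by region"}
Response body (abridged):
{"object": "list", "data": [{"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456]}], "model": "nomic-embed-text", "usage": {"prompt_tokens": 5, "total_tokens": 5}}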
Index Configuration¶
The Index Configuration allows users to configure how Denodo handles the indexation:
Dialog with the index configuration parameters¶
Batch Size. Number of elements to index at once.
Refresh metadata interval. Interval in seconds at which the system looks for updates and refreshes the necessary metadata.
Servers Configuration¶
The Servers Configuration section lets users enable and configure the vectorization of data for each server. By default, the vectorization is disabled for all servers. The current vectorization state is represented by a table containing the status for each of the servers.
The servers configuration status is represented by a table that shows the current status of the vectorization for each server.¶
You can start the vectorization of the server’s data following these steps:
First, click on the options button and then click on the Edit button.
Enable the vectorization using the Enable server metadata vectorization toggle and the following options will be visible:
Dialog with the VDP Server vectorization configuration parameters¶
Configure the following parameters:
Username. VDP server username. This user must have metadata permissions for all the views that need to be vectorized. If Use sample data is enabled, the user will also need execute permissions on all columns of all the views that need to be vectorized.
Password. VDP server user password.
Use sample data. This enables the Data Marketplace to vectorize actual data from the view. This can help improve the results when the LLM needs to create queries that contain conditions.
Sample data size. Configure the number of rows to retrieve for sampling. This will create a subset of rows to vectorize, which can improve query results. The maximum value is 500 rows.
You can now check your configuration using the Test Configuration button, or save it.
Important
Whenever you save the server or connection configuration, the previous vectorization will be overwritten. A dialog will prompt you to confirm this action.
Dialog warning the user of the delete process of the previous vectorization.¶
While the vectorization process is ongoing, the table will refresh automatically every 10 seconds to show you the current status. You can also manually refresh the table via the Refresh Table button. If you want to see how many elements are left, you can hover the mouse over the pending status:
Hovering the mouse over the pending status will show the number of elements left to vectorize.¶
Once the vectorization process for the selected server is completed, there are two possible outcomes:
From here, you can choose to delete the current vectorization or recreate it using the options button if needed.
