Denodo AI SDK - User Manual
Introduction
The Denodo AI SDK works with the Denodo Platform to help simplify and accelerate the development of AI agents. It streamlines the process of grounding AI with enterprise data using the Retrieval-Augmented Generation (RAG) pattern. RAG applications combine the power of retrieval-based and generation-based models to provide more accurate and contextually relevant responses. The AI SDK supports multiple configurable LLMs and Vector Databases, enabling you to quickly create AI applications.
Benefits of Using Denodo’s SDK for RAG Applications
- Rapid Development: Build RAG applications quickly and efficiently, reducing time-to-market.
- Enhanced Accuracy: Combine retrieval and generation models to improve the accuracy and relevance of responses.
- Flexibility: Easily integrate with various data sources and adapt to different use cases.
- Improved User Engagement: Provide users with more precise and context-aware information, enhancing their overall experience.
To showcase the AI SDK’s capabilities, a sample chatbot application is included.
How to obtain the Denodo AI SDK
Downloading the source code from GitHub
The AI SDK and sample chatbot source code is available in the following public GitHub repository.
Downloading the Denodo AI SDK container image
If you want to use the Denodo AI SDK container image with a Denodo Platform installation, you can obtain the AI SDK container image from DenodoConnects via:
- Support site. You will find it in the DenodoConnects section.
- Harbor:
- If you have an Enterprise Plus license you can find it here.
- If you have an Express license you can find it here.
NOTE: If downloading the image directly from Harbor, you will also need to obtain the example configuration files from GitHub: sdk_config_example.env, chatbot_config_example.env. If you obtain the AI SDK container image from the support site, these files are already included in the downloaded .zip file.
If you want to use the AI SDK container image together with the Denodo Express container image you can follow this tutorial.
Installing Denodo Express
The AI SDK is included as part of the Denodo Express installer.
Installing the environment
NOTE: This is only necessary if you obtained the AI SDK from GitHub or as part of the Denodo Express installer. The container from DenodoConnects has the Python environment already set up.
To execute the AI SDK, you will need the following environment setup:
- Python 3.10/3.11/3.12
- A Denodo Platform 9.0.5 or higher instance (either an Express or Enterprise Plus license is required), with cache enabled.
- A minimum of 2 cores
- A minimum of 2GB RAM
- A minimum of 5GB in disk space
Please note that if you use the AI SDK on the same machine as the Denodo Platform, make sure to have an extra 2 cores, 2 GB of RAM, and 5 GB of disk space freely available for the AI SDK to use.
Also, as the number of concurrent users increases, so will the required number of cores and amount of RAM.
Once you have installed the correct environment, proceed to create the Python virtual environment and install the required dependencies. The steps are the same for both Windows and Linux, unless otherwise specified.
First, we have to move to the AI SDK root directory:
cd [path to the Denodo AI SDK]
Then, we create the Python virtual environment.
python -m venv venv
This will create a venv folder inside our AI SDK folder where all the specific dependencies for this project will be installed. We now need to activate the virtual environment.
Activating the virtual environment differs depending on the OS. Please check the FAQ for common problems when installing on Windows/Linux.
Windows
.\venv\Scripts\activate
Linux
source venv/bin/activate
Once inside the virtual environment, you can now install the required dependencies.
python -m pip install -r requirements.txt
Finally, we need to install Playwright. The Playwright dependency is used to generate PDFs in DeepQuery. If you do not want to use DeepQuery, you can safely skip this step.
Windows
.\venv\Scripts\playwright install
Linux
venv/bin/playwright install
The required environment for the AI SDK is now correctly set up.
Basic configuration
Virtual DataPort and Data Marketplace
The AI SDK must be able to connect to the Data Marketplace to retrieve the data and execute queries against it. For this reason, the Virtual DataPort server and the Data Marketplace of your Denodo instance must always be running while executing the AI SDK.
NOTE: The AI SDK includes sample banking demo data to help you get started. If you plan on using it, you can skip the following Virtual DataPort and Data Marketplace configuration steps and jump to the AI SDK configuration section.
Enable cache in Virtual DataPort
Cache is needed to retrieve data samples from each view. You can disable this (and remove the need to have cache enabled) by setting examples_per_table = 0 in the getMetadata endpoint.
Choose the databases you want to expose to the AI SDK
In the AI SDK getMetadata endpoint you will be asked to select which databases (or tags) are exposed to the AI SDK through the vdp_database_names (vdp_tag_names) variable (more on the AI SDK’s configuration later on).
NOTE: Remember to synchronize the Data Marketplace with VDP before executing the AI SDK.
Configure users with the correct permissions
If you plan on using your own data, we recommend you create two types of users with specific permissions to work with:
- Metadata Sync User. This is the user account used to run the synchronization (through the getMetadata endpoint) to vectorize the Denodo metadata so it is available for the LLM. This user should have the privileges to access all the assets that will be available via the Denodo AI SDK. This will typically be a service account used for this particular purpose. See the Creating Users section of the Virtual DataPort Administration Guide.
- End User(s). These are the users that will be interacting with the AI SDK to run their questions over the data and metadata exposed in Denodo.
The same access privileges and data protections applied to these users in Denodo will apply when the End Users interact via the AI SDK.
How to configure users in the AI SDK
Credentials are passed directly through the authentication method of your choice (HTTP Basic or OAuth) when calling the endpoint. The credentials sent will be passed to the Denodo instance to authenticate.
The recommended procedure is to call the getMetadata endpoint with Metadata Sync User permissions to load the vector store for all users. Then, end users will call the answerQuestion endpoints with their own credentials, and permissions will be applied.
If you load the vector store with an End User's credentials directly, it will be limited to that user's permissions and may not contain the views that other users need when they query the vector store at the same time.
How to grant privileges for users in Denodo
- To set user privileges in Denodo Design Studio go to: Administration > User Management
- Select Edit Privileges
Metadata Sync User privileges
The vectorization process requires privileges to obtain both metadata and some sample data from Denodo. Therefore the privileges required for this user are:
- Connection privilege in the admin database within Denodo.
- Create view privilege in the admin database within Denodo.
- Connect privilege in your target database within Denodo.
- Metadata privilege in your target database within Denodo.
- Execute privilege in your target database within Denodo.
Including sample data in the vectorization process is highly recommended, because it is shared with the LLM to better understand your schema. If you do not want to include it, please check this guide.
End User privileges
For a user to execute and interact with the AI SDK, they must have:
- Connection privilege in the admin database within Denodo.
- Connect privilege in your target database within Denodo.
- Metadata privilege in your target database within Denodo.
- Execute privilege in your target database within Denodo.
AI SDK
The AI SDK is configured via a .env file located in api/utils/sdk_config.env. With this file you will be able to configure many parameters related to the AI SDK’s behavior.
NOTE: If using the container images, you will find the sdk_config_example.env in the .zip (if you downloaded the container image from the support site) or you will have to download it from GitHub here (if you obtained the image through Harbor).
There is a sample file named sdk_config_example.env in api/utils/ that you can rename to sdk_config.env. This file is already populated with the default values, so the only parameters you have to modify are:
- LLM_PROVIDER, LLM_MODEL and LLM_TEMPERATURE
- THINKING_LLM_PROVIDER, THINKING_LLM_MODEL, and THINKING_LLM_TEMPERATURE
- EMBEDDINGS_PROVIDER and EMBEDDINGS_MODEL
For example, to use OpenAI and the model gpt-4o, you would set "openai" in the PROVIDER parameters and the model name "gpt-4o" in the MODEL parameters.
Then, we would need to configure the credentials to access this model. In most providers, this is the API_KEY, but it depends on the provider.
Please check the Advanced configuration section to look for your provider of choice and how to configure it.
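As an illustration, this part of sdk_config.env could look like the following for OpenAI (all values below are placeholders; the exact provider-specific variables are described in the Advanced configuration section):

LLM_PROVIDER = openai
LLM_MODEL = gpt-4o
LLM_TEMPERATURE = 0
THINKING_LLM_PROVIDER = openai
THINKING_LLM_MODEL = o4-mini
THINKING_LLM_TEMPERATURE = 0
EMBEDDINGS_PROVIDER = openai
EMBEDDINGS_MODEL = text-embedding-3-large
OPENAI_API_KEY = <your OpenAI API key>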
Large Language Model
The AI SDK requires a connection to 2 different types of AI models: Large Language Models (LLMs) and Embedding Models.
An LLM is the model you would normally interact with in any of the main platforms: ChatGPT, Claude Desktop, Perplexity… It takes a natural language string as input and returns a natural language response as a string.
Examples of LLMs, by provider:
- OpenAI: gpt-4o, gpt-4o-mini, gpt-3.5-turbo
- Google: gemini-2.0-flash, gemini-2.5-pro
- Anthropic: claude-3.5-sonnet, claude-3.7-sonnet
You will need access to a powerful LLM to be able to successfully build quality applications with the AI SDK.
Embeddings Models
An embedding model differs from an LLM in many ways, from architecture to purpose. It is likely that you have benefited from an embeddings model without knowing it, because you do not interact with it directly.
An embeddings model takes a natural language string as input and returns an array of numbers (a vector). This vector is used to determine the semantic similarity between multiple strings. These vectors are stored in a vector store.
In the case of the AI SDK, we use embedding models to determine which tables need to be used to answer the input question.
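As a purely illustrative sketch (not part of the AI SDK), the snippet below shows how similarity between such vectors can be computed. The 3-dimensional values are made up; real embedding models return vectors with hundreds or thousands of dimensions.

import math

def cosine_similarity(a, b):
    # Higher values mean the two texts are semantically closer
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for two view descriptions and a user question
loans_view = [0.91, 0.10, 0.05]
weather_view = [0.02, 0.88, 0.40]
question = [0.85, 0.15, 0.10]   # "How many loans do we have?"

print(cosine_similarity(question, loans_view))    # ~0.99 -> relevant view
print(cosine_similarity(question, weather_view))  # ~0.23 -> unrelated view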
Examples of embedding models, by provider:
- OpenAI: text-embedding-3-small, text-embedding-3-large
- Google: text-embedding-004, text-embedding-005
- Amazon: amazon.titan-embed-text-v2:0, amazon.titan-embed-text-v1
Thinking Models
Thinking models are used in the AI SDK at the core of the DeepQuery functionality. A thinking model is an LLM trained to “think longer” and handle complex, multi-step tasks. Before giving a final answer, it generates an internal "chain of thought."
This reasoning process isn't generally visible to the user, but it allows the model to break down problems, consider multiple approaches, and arrive at more accurate and logical conclusions. A thinking model is required to use DeepQuery.
Some examples of thinking models, by provider:
- OpenAI: o4-mini or o3 (which is more expensive and has higher latency).
- Google: gemini-2.5-pro
- Anthropic: claude-4-sonnet
Sample chatbot
The sample chatbot, on the other hand, is configured via another .env file located in sample_chatbot/chatbot_config.env.
NOTE: If using the container images, you will find the chatbot_config_example.env in the .zip (if you downloaded the container image from the support site) or you will have to download it from GitHub here (if you obtained the image through Harbor).
Again, for the basic configuration, you simply need to rename the sample file chatbot_config_example.env to chatbot_config.env; it already has the default parameters populated. Then, the only thing you will need to configure is the LLM/embeddings to be used in the chatbot, through the following parameters:
- CHATBOT_LLM_PROVIDER
- CHATBOT_LLM_MODEL
- CHATBOT_EMBEDDINGS_PROVIDER
- CHATBOT_EMBEDDINGS_MODEL
Execution
From source code
NOTE: The AI SDK includes a sample chatbot. Both processes can be executed using the command python run.py both. To run the processes in the background and regain control of the terminal, execute python run.py both --background.
To stop the foreground processes, type exit and press Enter, press Ctrl+C, or send an EOF signal (Ctrl+D on Linux/macOS; Ctrl+Z + Enter on Windows). If the processes were launched in the background, they can be stopped by executing python stop.py both.
Denodo AI SDK
To execute the AI SDK standalone, you will need to have configured all the basic configuration parameters for the AI SDK in the api/utils/sdk_config.env file.
Once that is done, you can run python run.py api.
- The Swagger documentation for the AI SDK API will be available in {API_HOST}/docs.
- The AI SDK API logs can be found in the file logs/api.log.
There is a sample api/example.py file that showcases how to call the AI SDK endpoints.
Sample chatbot
With the sample chatbot, you’ll be able to test the AI SDK’s functionality with a simple-to-use UI.
To execute the sample chatbot, you must have the API running (or execute both, as per the command above) and have completed the basic sample chatbot configuration.
With the sample data
The sample chatbot comes with a sample banking dataset that you can load onto your Denodo instance to get a feel for the AI SDK.
NOTE: If you have downloaded the AI SDK from GitHub, it is necessary to move the sample files in <AI_SDK_HOME>/sample_chatbot/sample_data/structured/ to <DENODO_HOME>/samples/ai-sdk/sample_chatbot/sample_data/structured/, as the demo data loader expects them to be in that filepath. If you are using the Denodo Express installer, the CSVs will already be in that folder.
To load the sample data, run the following command:
python run.py both --load-demo --host localhost --grpc-port 9994 --dc-port 9090 --server-id 1
NOTE: The --load-demo command only needs to be run the first time to load the demo data in your Denodo instance. After that, you can run python run.py both directly.
This command assumes your Denodo instance is using host localhost, GRPC port 9994, and Tomcat port 9090 (the default installation values). It also assumes that you want to create the sample data in the VDP server registered with identifier 1 in the Data Marketplace (the VDP server registered by default during the installation). If this is not the case, you can modify the parameters to match your current Denodo configuration.
Once the command has executed successfully, a ‘samples_bank’ database will be created in your VDP server, the cache of the VDP server will be enabled, the Data Marketplace will be synchronized with the VDP server, the AI SDK will be launched (on port 8008 by default), and the chatbot will be accessible through the following URL: http://localhost:9992
With the chatbot now running, you will find the logs in the logs/sample_chatbot.log file.
You will also be able to leverage the sample chatbot’s decision-making, by uploading a UTF-8 encoded CSV file that contains unstructured data. To do this, simply click on the green button at the top-right of the header and upload your CSV file. You must include a description of the contents of the CSV file so that the LLM can correctly decide when it needs to search in the CSV file.
In the case of the banking dataset, there is a sample banking reviews CSV file located in <AI_SDK_HOME>/sample_chatbot/sample_data/unstructured/ that you can upload.
In the description field, you can write “Reviews by customers”.
The chatbot will then vectorize the CSV file and decide whether to query the Denodo instance or the unstructured data to answer your questions.
Without the sample data
It is highly recommended to begin experimenting with the AI SDK using the provided sample banking dataset. However, if you have already done so and wish to experiment with your own dataset, we will now explain how to do it.
First, run the command python run.py both to launch both the AI SDK APIs and the sample chatbot.
Then, you can either call the getMetadata endpoint or use the Sync VectorDB button in the sample chatbot UI with the database or tags you want to vectorize. Once the vector store is populated, you can ask questions using either the answerQuestion endpoint or the chatbot itself.
With the chatbot also running, you will find the logs in the logs/sample_chatbot.log file.
You will also be able to leverage the sample chatbot’s decision-making, by uploading a UTF-8 encoded CSV file that contains unstructured data. To do this, simply click on the green button at the top-right of the header and upload your CSV file. You must include a description of the contents of the CSV file so that the LLM can correctly decide when it needs to search in the CSV file.
For example, for the sample banking dataset, we used “Reviews by customers”.
From container
The image contains both the AI SDK and the sample chatbot. The only difference in execution is deciding whether to load the sample banking dataset into your Denodo instance or not. Before diving into the execution from container, make sure that:
- You filled out the configuration files (sdk_config.env and chatbot_config.env) and moved them to a simple-to-remember filepath. For this example, we will use C:\share\sdk_config.env and C:\share\chatbot_config.env.
- Data Marketplace is running and synced with VDP.
- Optional. If you want to use the sample banking dataset, you must first move it to your Denodo instance’s directory. To do so, it is necessary to move the sample files (the sample_data folder in the .zip you downloaded from the support site, or the sample_data folder available through GitHub here, if you are downloading from Harbor) to <DENODO_HOME>/samples/ai-sdk/sample_chatbot/sample_data/structured/, as the demo data loader expects them to be in that filepath.
NOTE: For the following commands, remember that if using the container image available in the .zip file that you obtained from the support site, you will need to load the Docker image instead of pulling from Harbor.
With the sample data
The sample chatbot comes with a sample banking dataset that you can load onto your Denodo instance to get a feel for the AI SDK.
NOTE: Since you are using the AI SDK container, it is necessary to move the sample files as described above. If you are using the Denodo Express installer, the CSVs will already be in the expected folder.
To load the sample data, you first need:
- Data Marketplace host (from the container’s perspective, in this case we’ll use 192.168.1.1 as an example)
- Data Marketplace port (default: 9090)
- VDP GRPC port (default: 9994)
- VDP server id (default: 1)
Then, run the following command (modifying it if your values are different):
docker run --rm -ti -v /mnt/c/share/chatbot_config.env:/opt/ai-sdk/sample_chatbot/chatbot_config.env -v /mnt/c/share/sdk_config.env:/opt/ai-sdk/api/utils/sdk_config.env -p 8008:8008 -p 9992:9992 harbor.open.denodo.com/denodo-connects-9/images/ai-sdk:latest bash -c "python run.py both --load-demo --no-logs --host 192.168.1.1 --grpc-port 9994 --dc-port 9090 --server-id 1"
NOTE: The load demo command only needs to be run the first time to load the demo data in your Denodo instance. After that, you can run the container like in the next section.
Please review the python command being executed after the image name. This command assumes your Denodo instance is reachable from the container at host 192.168.1.1 (the example value above), with GRPC port 9994 and Tomcat port 9090 (the default installation values). It also assumes that you want to create the sample data in the VDP server registered with identifier 1 in the Data Marketplace (the VDP server registered by default during the installation). If this is not the case, you can modify the parameters to match your current Denodo configuration.
Once the command has executed successfully, a ‘samples_bank’ database will be created in your VDP server, the cache of the VDP server will be enabled, the Data Marketplace will be synchronized with the VDP server, the AI SDK will be launched (on port 8008 by default), and the chatbot will be accessible through the following URL: http://localhost:9992
You will also be able to leverage the sample chatbot’s decision-making, by uploading a UTF-8 encoded CSV file that contains unstructured data. To do this, simply click on the green button at the top-right of the header and upload your CSV file. You must include a description of the contents of the CSV file so that the LLM can correctly decide when it needs to search in the CSV file.
In the case of the banking dataset, there is a sample banking reviews CSV file (in the .zip you downloaded from the support site or available through GitHub) that you can upload.
In the description field, you can write “Reviews by customers”.
The chatbot will then vectorize the CSV file and decide whether to query the Denodo instance or the unstructured data to answer your questions.
The Swagger documentation for the AI SDK API will also be available at {API_HOST}/docs.
Without the sample data
It is highly recommended to begin experimenting with the AI SDK using the provided sample banking dataset. However, if you have already done so and wish to experiment with your own dataset, we will now explain how to do it.
To do so, run:
docker run --rm -ti -v /mnt/c/share/chatbot_config.env:/opt/ai-sdk/sample_chatbot/chatbot_config.env -v /mnt/c/share/sdk_config.env:/opt/ai-sdk/api/utils/sdk_config.env -p 8008:8008 -p 9992:9992 harbor.open.denodo.com/denodo-connects-9/images/ai-sdk:latest
You will also be able to leverage the sample chatbot’s decision-making, by uploading a UTF-8 encoded CSV file that contains unstructured data. To do this, simply click on the green button at the top-right of the chatbot’s header and upload your CSV file. You must include a description of the contents of the CSV file so that the LLM can correctly decide when it needs to search in the CSV file.
For example, for the sample banking dataset, we used “Reviews by customers”.
The Swagger documentation for the AI SDK API will also be available at {API_HOST}/docs.
You will need to call the getMetadata endpoint yourself with the name of the database or tag you wish to vectorize (and ask questions about).
Optimal performance
So far we have covered the basics of getting the AI SDK working, but there are a few factors to consider in order to extract maximum performance from it. After all, the AI SDK is a chain of many different components and it will only be as powerful as its weakest link.
For this reason, please ensure that all of these factors are attended to properly:
- Metadata. The AI SDK will pass everything it can find in the schema of a view to the LLM. One of the advantages of using Denodo is that it can enrich a view’s schema in ways that help the LLM better understand the purpose of said views. It’s important to fill out view and column descriptions, associate related views, and provide logical names. Make use of the Denodo Assistant’s features to enhance the LLM’s understanding of your data.
- LLM. The LLM is the brains of the AI SDK. The smarter the LLM, the better the AI SDK’s performance. Striking a balance between speed and accuracy is key. Please check our recommended providers and model names in the AI SDK’s README file.
- Embeddings. Another key component of the operation is the embeddings model. A more precise embeddings model will help the LLM choose the correct and relevant tables to answer your questions accurately. Again, you will find in the project’s README a list of recommended and tested providers and model names.
- Advanced configuration. Custom instructions, set views, mode setting, graph generation… These are only a select few of the AI SDK’s capabilities that you will be able to fine-tune and leverage to obtain the most accurate results. We will review all of these characteristics in the following section.
MCP
You can connect the AI SDK as an MCP server (i.e., a tool) with your MCP-compatible client (Visual Studio Code, Claude Desktop, Cursor) by following these steps, depending on whether you prefer a local or remote MCP deployment.
Local MCP
- First, install uv in your system: https://github.com/astral-sh/uv
- Open the mcp/stdio_server_config.json JSON file. This file contains a template of the AI SDK’s MCP Server configuration. Specifically, you will need to modify the ABSOLUTE_PATH, MCP_AI_SDK_USER, MCP_AI_SDK_PASSWORD and MCP_AI_SDK_ENDPOINT.
- Append the resulting JSON into your MCP client’s configuration file.
Follow these steps to connect VS Code with your local AI SDK MCP server:
- To add the MCP Server, open the Command Palette in VS Code using Ctrl / Cmd + Shift + P, then type “MCP: Add Server” and press Enter. When prompted, choose whether to add the server Globally (available in all projects) or for the Workspace (only this project).
- When VS Code asks for the server details, paste the modified template code provided in api/mcp/local_config_example.json inside the “servers” key. Replace {ABSOLUTE_PATH} with the full path to your project’s api/mcp/local.py file.
{ "Denodo_AI_SDK_Local": { "command": "uv", "args": [ "run", "--with", "fastmcp", "fastmcp", "run", "{ABSOLUTE_PATH}/api/mcp/local.py" ], "env": { "MCP_AI_SDK_ENDPOINT": "http://localhost:8008", "MCP_AI_SDK_AUTH": "Basic YWRtaW46YWRtaW4=", "MCP_AI_SDK_VERIFY_SSL": "false" } } } |
VS Code local configuration
- Make sure your local AI SDK server is running at http://localhost:8008 before you use it in VS Code.
- Open the Chat view in VS Code, switch to Agent Mode, click Tools, and enable Denodo_AI_SDK_Local. You can now chat with the AI agent, and it will automatically use the tools provided by your local MCP server.
Remote MCP (HTTP)
Starting a remote AI SDK MCP server is very simple. Just add "--mcp" to your normal AI SDK runner to start the MCP server together with the normal AI SDK API:
python run.py both --mcp
Connecting to the remote AI SDK MCP server (from the same machine or a different machine) requires configuring your MCP client. The authentication method depends on the MCP_BASIC_AUTH variable set in your sdk_config.env file. Below are the two possible modes.
Dynamic Client Registration (MCP_BASIC_AUTH = 0)
To comply with the Model Context Protocol proposed authorization flow, the remote AI SDK MCP server implements the OAuth 2.0 Protected Resource Metadata (RFC9728) standard.
This enables MCP clients implementing the OAuth 2.0 Dynamic Client Registration Protocol (RFC7591) to be configured to automatically register themselves as authorization server clients, eliminating the need for manual, error-prone configuration steps.
To use this functionality, you must appropriately configure the following properties in your sdk_config.env file:
- MCP_OIDC_ISSUER_URL: The URL of your Identity Provider (e.g., Keycloak).
- MCP_OIDC_JWKS_URI: The URL to your IdP's JSON Web Key Set for token signature verification.
- MCP_OIDC_AUDIENCE: The unique identifier for your AI SDK API.
- MCP_SCOPES_SUPPORTED: A comma-separated list of permissions required in the token.
The base URL used by the MCP server to build metadata URLs for client discovery can be specified in MCP_AI_SDK_DCR_URL. It should be the root URL, without the /mcp path (e.g., http://localhost:8008). If not set, the URL is auto-generated using the AI_SDK_HOST, AI_SDK_PORT, and AI_SDK_ROOT_PATH variables.
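For example, assuming a hypothetical Keycloak realm, this part of sdk_config.env could look like the following (all URLs and values are placeholders for your own identity provider):

MCP_OIDC_ISSUER_URL = https://idp.example.com/realms/denodo
MCP_OIDC_JWKS_URI = https://idp.example.com/realms/denodo/protocol/openid-connect/certs
MCP_OIDC_AUDIENCE = denodo-ai-sdk
MCP_SCOPES_SUPPORTED = openid, profile
MCP_AI_SDK_DCR_URL = http://localhost:8008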
For the authorization flow to be successful, you must ensure that the authorization server specified in the MCP_OIDC_ISSUER_URL property supports both RFC 7591 and the OAuth 2.0 Authorization Server Metadata (RFC8414) standards.
Notable identity providers that fully support both RFC 7591 and RFC 8414 include Okta, Ping Identity or Keycloak. As of October 2025, Microsoft Entra ID and Google Identity do not support these protocols.
NOTE: If your Denodo identity provider doesn't support RFC 7591 or RFC 8414, you can still manually configure your MCP client. Most clients allow you to add custom headers, so you can manually get the access token using the OAuth 2.0 Credentials Wizard. Then, configure the MCP client to send this token in an Authorization header.
Below are two examples of possible configurations for the MCP client with MCP_BASIC_AUTH = 0.
{ "servers": { "Denodo_AI_SDK_Remote": { "url": "http://localhost:8008/mcp", "type": "http", } } } |
VS Code configuration with DCR enabled
{ "inputs": [ { "type": "promptString", "id": "denodo_oauth_token", "description": "Denodo OAuth 2.0 Token", "password": true } ], "servers": { "Denodo_AI_SDK_Remote": { "url": "http://localhost:8008/mcp", "type": "http", "headers": { "Authorization": "Bearer ${input:denodo_oauth_token}" } } } } |
VS Code configuration example without DCR support
Basic authentication (MCP_BASIC_AUTH = 1)
This mode disables the MCP server's built-in security layer, making it act as a simple proxy. It is primarily intended for development, testing, or trusted internal environments where security is handled by another layer.
In this mode, authentication is delegated to the internal logic of the AI SDK tool itself. The tool is responsible for inspecting the Authorization header of the incoming request. This provides flexibility, as the tool can be programmed to accept different credential types.
The AI SDK's default tool implementation supports both HTTP Basic and Bearer token authentication formats. Below are two examples of possible configurations for the MCP client.
{ "servers": { "Denodo_AI_SDK_Remote": { "url": "http://localhost:8008/mcp", "headers": { "Authorization": "Basic YWRtaW46YWRtaW4=" } } } } |
VS Code configuration with HTTP Basic credentials (base64 of admin:admin in this example)
{ "inputs": [ { "type": "promptString", "id": "denodo_oauth_token", "description": "Denodo OAuth 2.0 Token", "password": true } ], "servers": { "Denodo_AI_SDK_Remote": { "url": "http://localhost:8008/mcp", "type": "http", "headers": { "Authorization": "Bearer ${input:denodo_oauth_token}" } } } } |
VS Code configuration with Bearer token authentication
Advanced configuration
Check out the guide on how to deploy the AI SDK in production.
AI SDK
These configuration settings are expected in the api/utils/sdk_config.env file. Please check out the api/utils/sdk_config_example.env template for more information.
- AI_SDK_HOST. The host the AI SDK will run from.
- AI_SDK_PORT. The port the AI SDK will execute from.
- AI_SDK_ROOT_PATH. Allows the AI SDK to run on a different path.
- AI_SDK_SSL_CERT. If you want to use SSL in the AI SDK, set the relative path from the AI SDK root folder to the certificate file here.
- AI_SDK_SSL_KEY. If you want to use SSL in the AI SDK, set the relative path from the AI SDK root folder to the key file here.
- AI_SDK_WORKERS. If using the AI SDK in production, you will want to leverage multiple threads. Use AI_SDK_WORKERS to set the number of threads (workers) to run the AI SDK APIs with.
- SENSITIVE_DATA_LOGGING. Set it to 0 to stop writing the input/output of functions to the logs.
- AI_SDK_DATA_CATALOG_URL. The Data Marketplace URL. Defaults to: http://localhost:9090/denodo-data-catalog/
- DATA_CATALOG_SERVER_ID. The identifier of the Virtual DataPort server you want to work with in the Data Marketplace. Defaults to 1 (which is the identifier assigned to the default VDP server registered in Data Marketplace during installation).
- DATA_CATALOG_VERIFY_SSL. If using SSL in the Data Marketplace with an unsafe certificate, you can disable verification here.
- CUSTOM_INSTRUCTIONS. Include domain-specific knowledge the LLM should use when selecting tables and generating queries.
- RATE_LIMIT_RPM. The maximum number of calls to the embedding model per minute in the vectorization process.
- VQL_EXECUTE_ROWS_LIMIT. The number of rows to return from the Data Marketplace.
- LLM_RESPONSE_ROWS_LIMIT. The number of rows the LLM will receive from the execution result to formulate the natural language response. It defaults to 15 to avoid filling up the context, which would increase latency and response time.
DeepQuery
DeepQuery is an advanced feature of the Denodo AI SDK designed to process complex analytical questions. It operates in two distinct phases: a planning phase, which uses a thinking model to reason about the question and create an execution plan, and an execution phase, which carries out that plan to produce the final answer. Using DeepQuery requires more powerful LLMs. You'll need a thinking model with at least 128k context and a 50 RPM allowance.
DeepQuery leverages an advanced set of configurations to customize its behavior. These configuration settings are expected in the api/utils/sdk_config.env file. Please check out the api/utils/sdk_config_example.env template for more information.
- DEEPQUERY_EXECUTION_MODEL. Defines which LLM to use for the DeepQuery execution phase. This is separate from the planning phase, which always uses the thinking model.
- "thinking": Uses the thinking model for execution (recommended for best performance and accuracy).
- "base": Uses the standard LLM for execution (more cost-effective, but may be less capable of handling complex tasks).
- DEEPQUERY_DEFAULT_ROWS. The number of rows to return by default in database queries during the reasoning process.
- DEEPQUERY_MAX_ANALYSIS_LOOPS. The maximum number of analysis loops the thinking model will perform.
- DEEPQUERY_MAX_REPORTING_LOOPS. The maximum number of internal loops the thinking model will perform to generate a comprehensive final report in PDF.
- DEEPQUERY_MAX_CONCURRENT_TOOL_CALLS. The maximum number of tools that the thinking model can call simultaneously during its reasoning process.
- You can also configure the color schemes for PDF reports generated by DeepQuery.
NOTE: To disable DeepQuery in the API, comment out the thinking model variables (THINKING_PROVIDER/THINKING_MODEL) in the sdk_config.env configuration file.
Vector Stores
The AI SDK is compatible with 3 different vector store providers through their Langchain compatibility libraries:
- Chroma
- PGVector
- OpenSearch
Chroma
Chroma will be used in persistent mode, meaning the vector store is persisted locally on disk. No configuration is needed.
PGVector
PGVector requires the following variable be set in the configuration file:
- PGVECTOR_CONNECTION_STRING. For example, for the demo PGVector Langchain instance, the value of this variable would be
postgresql+psycopg://langchain:langchain@localhost:6024/langchain
where:
postgresql+psycopg://{user}:{pwd}@{host}:{port}/{db_name}
OpenSearch
OpenSearch requires the following variables be set in the configuration file:
- OPENSEARCH_URL. The URL of the OpenSearch instance. Defaults to http://localhost:9200
- OPENSEARCH_USERNAME. The username of the OpenSearch instance. Defaults to admin.
- OPENSEARCH_PASSWORD. The password of the OpenSearch instance. Defaults to admin.
LLMs/Embeddings
The AI SDK is compatible with the following LLM/Embeddings providers through their Langchain compatibility libraries.
NOTE: The model used in the _MODEL variable is the same model ID you would use when making an API call to that model provider. For example, claude-3-sonnet-20240229 in Anthropic and anthropic.claude-3-5-sonnet-20240620-v1:0 in AWS Bedrock refer to the same model, Claude 3.5 Sonnet v1. You should refer to the provider-specific model ID in the configuration file.
OpenAI (LLM/Embeddings)
_PROVIDER value: openai
OpenAI requires the following variables be set in the configuration file:
- OPENAI_API_KEY. Defines the API key for your OpenAI account.
- OPENAI_PROXY_URL. Proxy to use when contacting the OpenAI servers. Set as http://{user}:{pwd}@{host}:{port} format.
- OPENAI_ORG_ID. OpenAI organization ID. If not set it will use the default one set in your OpenAI account.
DeepQuery Configuration:
To use DeepQuery, set the following variables:
- THINKING_PROVIDER = openai
- THINKING_MODEL = o4-mini
Azure (LLM/Embeddings)
_PROVIDER value: azure
For Azure, please set the deployment name in the LLM_MODEL variable. The model (deployment) name used will be appended to the Azure endpoint, like this /deployments/{LLM_MODEL}.
Please follow this guide with an example of how to configure Azure in the AI SDK.
Azure requires the following variables be set in the configuration file:
- AZURE_ENDPOINT. Defines the connection string to your Azure instance. For example, https://example-resource.openai.azure.com/
- AZURE_API_VERSION. Defines the API version to use with your Azure instance. For example, 2024-06-01.
- AZURE_API_KEY. Defines the API key for your Azure account.
- AZURE_PROXY. Azure proxy to use. Set as http://{user}:{pwd}@{host}:{port} format
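As a minimal sketch, assuming an Azure OpenAI deployment named gpt-4o-deployment (the deployment name and all values are placeholders):

LLM_PROVIDER = azure
LLM_MODEL = gpt-4o-deployment
AZURE_ENDPOINT = https://example-resource.openai.azure.com/
AZURE_API_VERSION = 2024-06-01
AZURE_API_KEY = <your Azure API key>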
Google (LLM/Embeddings)
_PROVIDER value: google
Google Vertex AI requires a service account JSON file to be created for use in the AI SDK. Then, the following variable is to be set in the configuration file:
- GOOGLE_APPLICATION_CREDENTIALS. Defines the path to the JSON storing your Google Cloud service account.
DeepQuery Configuration:
To use DeepQuery, you need to configure a thinking model and activate its reasoning capabilities with the following extra variables:
- THINKING_PROVIDER = google
- THINKING_MODEL = gemini-2.5-pro
- GOOGLE_THINKING = 1
- GOOGLE_THINKING_TOKENS = 2048
Google AI Studio (LLM/Embeddings)
_PROVIDER value: googleaistudio
NOTE: Not to be confused with Google Vertex AI, which is the previous listing. Google AI Studio is not intended for production environments.
Google AI Studio requires the following variables be set in the configuration file:
- GOOGLE_AI_STUDIO_API_KEY. Your Google AI Studio API key.
AWS Bedrock (LLM/Embeddings)
_PROVIDER value: bedrock
If your LLM_MODEL is an AWS ARN, specifying the model provider (e.g., 'anthropic') is mandatory. It can be specified directly in the LLM_MODEL variable using the 'provider:arn:...' format.
AWS Bedrock requires a service account for the AI SDK to run with. Once that is obtained, the following variables must be set in the configuration file:
- AWS_REGION. The region where you want to execute the model in. For example, us-east-1.
- AWS_PROFILE_NAME.
- AWS_ROLE_ARN.
- AWS_ACCESS_KEY_ID.
- AWS_SECRET_ACCESS_KEY.
- AWS_STS_REGIONAL_ENDPOINTS. Set to “regional” to use the regional STS endpoint and not the global one, which is the default behavior of the boto3 AWS SDK. Comment out to use the global.
DeepQuery Configuration:
To use DeepQuery, you need to configure a thinking model and activate its reasoning capabilities with the following extra variables:
- THINKING_PROVIDER = bedrock
- THINKING_MODEL = us.anthropic.claude-sonnet-4-20250514-v1:0
- AWS_CLAUDE_THINKING = 1
- AWS_CLAUDE_THINKING_TOKENS = 2048
Groq (LLM)
_PROVIDER value: groq
Groq requires the following variables be set in the configuration file:
- GROQ_API_KEY. The API Key to the GROQ provider.
NVIDIA NIM (LLM/Embeddings)
_PROVIDER value: nvidia
NVIDIA NIM requires the following variables be set in the configuration file:
- NVIDIA_API_KEY. The API Key to your NVIDIA NIM instance.
- NVIDIA_BASE_URL (Optional). If self-hosting NVIDIA NIM, set the base url here, like "http://localhost:8000/v1"
Mistral (LLM/Embeddings)
_PROVIDER value: mistral
Mistral requires the following variables be set in the configuration file:
- MISTRAL_API_KEY. The API Key for your Mistral service.
SambaNova (LLM)
_PROVIDER value: sambanova
Sambanova requires the following variables be set in the configuration file:
- SAMBANOVA_API_KEY. The API Key for your SambaNova service.
OpenRouter (LLM)
_PROVIDER value: openrouter
OpenRouter requires the following variables be set in the configuration file:
- OPENROUTER_API_KEY. The API Key for your OpenRouter service.
Ollama (LLM/Embeddings)
_PROVIDER value: ollama
- You must have the model already installed through Ollama.
- You must use the same model ID in _MODEL as the one you use in Ollama.
There's no need to execute 'ollama run <model-id>' before launching the AI SDK.
Optional parameters:
- OLLAMA_API_BASE_URL. Modify the base URL of the Ollama service.
NOTE: Ollama usually serves models with a default context window size of 2048 tokens. Please check this guide to increase this value. Ollama will not work for DeepQuery, which requires a much larger context length (128k+); the default configuration is not compatible.
OpenAI-Compatible Provider (LLM/Embeddings)
_PROVIDER value: Any custom name that does not start with AZURE_ (e.g., MY_API).
This option allows you to connect to any LLM service that is not natively listed but offers an API compatible with OpenAI.
To configure it:
- Choose a unique name for your provider (e.g., MY_API).
- Set this name in the corresponding _PROVIDER variable in section 4 of the configuration file (e.g., LLM_PROVIDER = MY_API).
- Define the following variables in the configuration file, replacing [PROVIDER_NAME] with the name you chose:
- [PROVIDER_NAME]_API_KEY. The API key for your custom provider (e.g., MY_API_API_KEY).
- [PROVIDER_NAME]_BASE_URL. The base URL of the API endpoint (e.g., MY_API_BASE_URL).
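For instance, if you named your provider MY_API, the configuration could look like this (the URL, model ID, and key are placeholders):

LLM_PROVIDER = MY_API
LLM_MODEL = <model id exposed by your service>
MY_API_BASE_URL = https://my-api.example.com/v1
MY_API_API_KEY = <your API key>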
Custom Azure Provider (LLM/Embeddings)
_PROVIDER value: Any custom name that must start with AZURE_ (e.g., AZURE_LLM).
This is the recommended approach for advanced Azure configurations, such as using separate endpoints or API versions for chat and embeddings.
To configure it:
- Choose unique names for your providers that must start with AZURE_ (e.g., AZURE_LLM, AZURE_EMBEDDING).
- Set these names in the corresponding _PROVIDER variables in section 4.
- Define the following variables for each provider, replacing [PROVIDER_NAME] with the names you chose:
- [PROVIDER_NAME]_ENDPOINT. The endpoint of your Azure instance (e.g., AZURE_LLM_ENDPOINT).
- [PROVIDER_NAME]_API_VERSION. The API version for that specific configuration (e.g., AZURE_LLM_API_VERSION).
- [PROVIDER_NAME]_API_KEY. The API key for that configuration (e.g., AZURE_LLM_API_KEY)
- [PROVIDER_NAME]_EMBEDDINGS_DIMENSIONS. (Optional) Defines the vector dimensions for embedding models (e.g., AZURE_EMBEDDING_EMBEDDINGS_DIMENSIONS).
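As an illustrative sketch using separate chat and embeddings configurations (all values and deployment names are placeholders):

LLM_PROVIDER = AZURE_LLM
LLM_MODEL = <chat deployment name>
EMBEDDINGS_PROVIDER = AZURE_EMBEDDING
EMBEDDINGS_MODEL = <embeddings deployment name>
AZURE_LLM_ENDPOINT = https://chat-resource.openai.azure.com/
AZURE_LLM_API_VERSION = 2024-06-01
AZURE_LLM_API_KEY = <key for the chat resource>
AZURE_EMBEDDING_ENDPOINT = https://embeddings-resource.openai.azure.com/
AZURE_EMBEDDING_API_VERSION = 2024-06-01
AZURE_EMBEDDING_API_KEY = <key for the embeddings resource>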
Sample chatbot
These configuration settings are expected in the sample_chatbot/chatbot_config.env file. Please check out the sample_chatbot/chatbot_config_example.env template for more information.
- CHATBOT_HOST. The host the chatbot will run from.
- CHATBOT_PORT. The chatbot’s port.
- CHATBOT_ROOT_PATH. Allows the sample chatbot to run on a different path.
- CHATBOT_LLM_PROVIDER. The LLM provider for the chatbot.
- CHATBOT_LLM_MODEL. The LLM model id for the specified provider for the chatbot.
- CHATBOT_EMBEDDINGS_PROVIDER. The embeddings provider for the chatbot.
- CHATBOT_EMBEDDINGS_MODEL. The model id for the embeddings model for the specified provider.
- CHATBOT_REPORTING. Set to 1 if you want to enable reporting to a CSV file in reports/. Set to 0 to disable.
- CHATBOT_AUTO_GRAPH. Set to 1 to allow the AI SDK to decide when to generate graphs. Set to 0 to disable this option; in that case, the AI SDK will only generate graphs when explicitly asked to.
- CHATBOT_UNSTRUCTURED_MODE. Set to 1 to give users the option to upload their own CSV files, and 0 to disable.
- CHATBOT_REPORT_MAX_SIZE. Defaults to 10. The max size of the report generated before rotating.
- CHATBOT_FEEDBACK. Set to 1 if you want to enable feedback in the chatbot UI (saved to the report CSV, so CHATBOT_REPORTING set to 1 is needed). Set to 0 to disable.
- CHATBOT_USER_EDIT_LLM. Allow users to edit LLM settings (provider, model, temperature, max_tokens) through the settings modal. Set to 1 to enable LLM editing, 0 to disable. This setting cannot be changed after startup.
- CHATBOT_UNSTRUCTURED_INDEX. Instead of having users upload their own unstructured data through a CSV file, you can connect the chatbot to an already populated vector store. Indicate the index of the unstructured data source.
- CHATBOT_UNSTRUCTURED_DESCRIPTION. The description of the custom vector store, so the LLM knows when to route here.
- CHATBOT_DATA_CATALOG_URL. To allow direct linking to the Data Marketplace page for the views used in the context.
- AI_SDK_URL. The AI SDK base URL. Defaults to: http://localhost:8008
- AI_SDK_USERNAME. The username of the user who will be used to vectorize the data from the chatbot. It will call getMetadata with these credentials. Remove this to disable the “Sync” button in the UI.
- AI_SDK_PASSWORD. The password of the user who will be used to vectorize the data from the chatbot. It will call getMetadata with these credentials. Remove this to disable the “Sync” button in the UI.
- CHATBOT_LLM_RESPONSE_ROWS_LIMIT. Controls the number of rows the LLM will use to formulate a natural language response.
- CHATBOT_SYNC_VDBS_TIMEOUT. Defaults to 600000. Frontend timeout (in ms) for the vector store synchronization call.
- AI_SDK_VERIFY_SSL. Whether to verify SSL certificates for AI SDK requests. Set to 1 to enable SSL verification; 0 to disable.
Observability and tracing
The log files generated in the logs/ folder of the AI SDK are useful to debug specific errors and functionality. However, to properly review the complete flow of the AI SDK, it is compatible with Langfuse, an open-source LLM observability and tracing tool.
To configure it, simply fill out the following variables in the sdk_config.env and chatbot_config.env configuration files:
- LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY identify your Langfuse project. Both are required to enable tracing.
- Set the Langfuse host in LANGFUSE_HOST. It can point either to the Langfuse cloud or to a local (self-hosted) instance.
- Optional: Associate traces with a user identifier (appears as user_id in Langfuse) through LANGFUSE_USER.
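For example (placeholder values shown; use the keys of your own Langfuse project and, if self-hosting, your own host URL):

LANGFUSE_PUBLIC_KEY = <your Langfuse public key>
LANGFUSE_SECRET_KEY = <your Langfuse secret key>
LANGFUSE_HOST = https://cloud.langfuse.com
LANGFUSE_USER = ai-sdk-dev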
Answer Question Endpoints
/answerQuestion
This is the main endpoint for processing natural language questions about your data. It automatically determines whether to use SQL or metadata to answer the question. If you know that you want to query data or metadata directly, you can either set the mode parameter in this endpoint to “data”/ “metadata” or you can call the answerDataQuestion/answerMetadataQuestion endpoints directly.
Parameters
- question (string, required): The natural language question to be answered
- plot (boolean, default: false): Whether to generate a plot with the answer
- plot_details (string, default: ''): Additional plotting instructions
- embeddings_provider (string): Provider for embeddings generation (defaults to env var)
- embeddings_model (string): Model to use for embeddings (defaults to env var)
- vector_store_provider (string): Vector store to use (defaults to env var)
- llm_provider (string): Provider for SQL generation and chat completion (defaults to env var)
- llm_model (string): Model for SQL generation and chat completion (defaults to env var)
- vdp_database_names (string): Comma-separated list of databases to search in
- vdp_tag_names (string): Comma-separated list of tags to search in
- use_views (string, default: ''): Specific views to use for the query. For example, “bank.loans, bank.customers”
- expand_set_views (boolean, default: true): Whether to expand the previously set views. If true, then the context for the question will be “bank.loans, bank.customers” + the result of the vector store search. If set to false, it will be only those 2 views (or whatever views are set in use_views).
- custom_instructions (string): Additional instructions for the LLM
- markdown_response (boolean, default: true): Format response in markdown
- vector_search_k (integer, default: 5): Number of similar tables to retrieve
- vector_search_sample_data_k (integer, default: 3): Number of samples to pass to the LLM for each column
- mode (string, default: "default"): One of "default", "data", or "metadata"
- disclaimer (boolean, default: true): Whether to include disclaimers
- verbose (boolean, default: true): Whether to return a natural language response or simply the SQL query and execution result.
- vql_execute_rows_limit (int, default: 100): The number of rows to return from the Data Marketplace.
- llm_response_rows_limit (int, default: 15): The number of rows the LLM will receive from the execution result to formulate the natural language response. It defaults to 15 to avoid filling up the context, which would increase latency and response time.
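For illustration, here is a minimal sketch of calling this endpoint with the Python requests library. It assumes HTTP Basic authentication, a POST request with a JSON body, and an AI SDK listening on http://localhost:8008; the exact method and schema for your version are documented in the Swagger UI at {API_HOST}/docs.

import requests

AI_SDK_URL = "http://localhost:8008"  # adjust to your deployment

payload = {
    "question": "What is the average loan amount per customer?",
    "mode": "data",            # skip the data/metadata decision step
    "plot": False,
    "markdown_response": True,
    "vector_search_k": 5,
}

# End User credentials are forwarded to Denodo, so their privileges apply
response = requests.post(
    f"{AI_SDK_URL}/answerQuestion",
    json=payload,
    auth=("end_user", "end_user_password"),  # placeholder credentials
    timeout=300,
)
print(response.json()["answer"])

Since mode is forced to "data" here, the same call could also be made against /answerDataQuestion.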
/answerDataQuestion
Identical to /answerQuestion but specifically for questions that require querying data (forces data mode).
It is faster than using default mode on answerQuestion because it doesn’t have to decide if it needs data or metadata.
/answerMetadataQuestion
Identical to /answerQuestion but specifically for questions that require querying metadata (forces metadata mode).
It is faster than using default mode on answerQuestion because it doesn’t have to decide if it needs data or metadata.
/answerQuestionUsingViews
Similar to /answerQuestion but accepts pre-selected views instead of performing vector search.
This endpoint is ONLY intended for clients that want to implement their own vector store provider and not one supported by the AI SDK. To force the LLM to consider a set of views, please review the use_views parameter in answerQuestion.
Additional/Modified Parameters:
- vector_search_tables (list of strings, required): List of pre-selected views to use
Does not use vdp_database_names, use_views, or expand_set_views.
Streaming Endpoints
/streamAnswerQuestion
Streaming version of /answerQuestion that returns the answer as a text stream.
Parameters are identical to /answerQuestion.
/streamAnswerQuestionUsingViews
Streaming version of /answerQuestionUsingViews that returns the answer as a text stream.
Parameters are identical to /answerQuestionUsingViews.
DeepQuery Endpoints
/deepQuery
This endpoint processes a complex analysis question using DeepQuery. It returns the analysis along with metadata that can be used by the /generateDeepQueryPDF endpoint to create a PDF report.
Parameters
- question (string, required): The business question to analyze.
- execution_model (string, default: "thinking"): The model used for executing the analysis. Options are "thinking" (uses the thinking_llm for both planning and execution) or "base" (uses the llm for execution).
- default_rows (integer, default: 10): The number of rows to return in database queries.
- max_analysis_loops (integer, default: 50): The maximum number of analysis loops allowed.
- max_concurrent_tool_calls (integer, default: 5): The maximum number of tools that can be called concurrently.
- thinking_llm_provider (string): The provider for the "thinking" LLM.
- thinking_llm_model (string): The model to use for the "thinking" LLM.
- thinking_llm_temperature (float, default: 0.0): The temperature setting for the "thinking" LLM.
- thinking_llm_max_tokens (integer, default: 10240): The maximum token limit for the "thinking" LLM.
- llm_provider (string): The provider for the "base" LLM.
- llm_model (string): The model to use for the "base" LLM.
- llm_temperature (float, default: 0.0): The temperature setting for the "base" LLM.
- llm_max_tokens (integer, default: 2048): The maximum token limit for the "base" LLM.
- embeddings_provider (string): The provider for embeddings generation.
- embeddings_model (string): The model to use for embeddings.
- vector_store_provider (string): The vector store to use.
- vdp_database_names (string): Comma-separated list of databases to search in
- vdp_tag_names (string): Comma-separated list of tags to search in
- use_views (string, default: ''): Specific views to use for the query. For example, “bank.loans, bank.customers”
- expand_set_views (boolean, default: true): Whether to expand the previously set views. If true, then the context for the question will be “bank.loans, bank.customers” + the result of the vector store search. If set to false, it will be only those 2 views (or whatever views are set in use_views).
- vector_search_k (integer, default: 5): The number of similar views to retrieve during the vector search.
- vector_search_sample_data_k (integer, default: 3): The number of example rows to retrieve for each view during the vector search.
/generateDeepQueryPDF
This endpoint generates a PDF report from the metadata returned by the /deepQuery endpoint. The report includes visualizations and detailed analysis.
Parameters
- deepquery_metadata (object, required): The metadata object returned from the /deepQuery endpoint.
- color_palette (string, default: "red"): The color palette to use for visualizations in the PDF.
- max_reporting_loops (integer, default: 25): The maximum number of loops for the reporting process.
- include_failed_tool_calls_appendix (boolean, default: false): Whether to include an appendix detailing failed tool calls in the report.
- thinking_llm_provider (string): The provider for the "thinking" LLM (used for report generation).
- thinking_llm_model (string): The model for the "thinking" LLM.
- thinking_llm_temperature (float, default: 0.0): The temperature setting for the "thinking" LLM.
- thinking_llm_max_tokens (integer, default: 10240): The maximum token limit for the "thinking" LLM.
- llm_provider (string): The provider for the "base" LLM.
- llm_model (string): The model for the "base" LLM.
- llm_temperature (float, default: 0.0): The temperature setting for the "base" LLM.
- llm_max_tokens (integer, default: 2048): The maximum token limit for the "base" LLM.
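As a hedged sketch of chaining both endpoints with the Python requests library (again assuming POST requests with JSON bodies and HTTP Basic authentication; the field that carries the DeepQuery metadata in the response, and the PDF output format, should be confirmed in the Swagger documentation):

import requests

AI_SDK_URL = "http://localhost:8008"   # hypothetical deployment
AUTH = ("end_user", "end_user_password")  # placeholder credentials

# 1. Run the DeepQuery analysis (this can take several minutes)
analysis = requests.post(
    f"{AI_SDK_URL}/deepQuery",
    json={"question": "Which customer segments are most at risk of loan default?"},
    auth=AUTH,
    timeout=1800,
).json()

# 2. Pass the returned metadata to the PDF generator.
#    "deepquery_metadata" as a response key is an assumption; inspect the
#    actual /deepQuery response to locate the metadata object it returns.
report = requests.post(
    f"{AI_SDK_URL}/generateDeepQueryPDF",
    json={
        "deepquery_metadata": analysis["deepquery_metadata"],
        "color_palette": "red",
    },
    auth=AUTH,
    timeout=1800,
)

# Assuming the endpoint returns the PDF bytes in the response body
with open("deepquery_report.pdf", "wb") as f:
    f.write(report.content)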
Metadata and Search Endpoints
/getMetadata
Retrieves metadata from specified VDP databases and optionally stores it in a vector database.
Parameters
- vdp_database_names (string): Comma-separated list of databases
- vdp_tag_names (string): Comma-separated list of tags
- embeddings_provider (string): Provider for embeddings generation
- embeddings_model (string): Model to use for embeddings
- embeddings_token_limit (int, default: 0): The token limit after which views that surpass it will be chunked in the vector store. If set to 0, no chunking will be done.
- vector_store_provider (string): Vector store to use
- rate_limit_rpm (int, default: 0): Number of maximum embedding requests per minute. If set to 0, no rate limit will be applied.
- examples_per_table (integer, default: 3): Number of example rows per table
- view_descriptions (boolean, default: true): Include view descriptions
- column_descriptions (boolean, default: true): Include column descriptions
- associations (boolean, default: true): Include table associations
- view_prefix_filter (string): Only vectorizes views that contain the specified view prefix.
- view_suffix_filter (string): Only vectorizes views that contain the specified view suffix.
- insert (boolean, default: false): Store metadata in vector store
- incremental (boolean, default: true): On the first run with incremental = true, it will activate incremental loading in the Data Marketplace, which will begin tracking changes. On subsequent runs with incremental = true, only views that have been modified will be returned.
- parallel (boolean, default: true): To process the embedding and insertion process in parallel
- views_per_request (int, default: 50): How many views to ask for per request to the Denodo Platform.
/deleteMetadata
Deletes views from the vector store based on database names or tag names.
Parameters
- vdp_database_names (string): Comma-separated list of databases.
- vdp_tag_names (string): Comma-separated list of tags.
- embeddings_provider (string): Provider for embeddings generation.
- embeddings_model (string): Model to use for embeddings.
- vector_store_provider (string): Vector store to use.
- delete_conflicting (boolean, default: false): If it is set to false, only views that satisfy all of the following conditions will be deleted:
- The associated database is included in the list of databases to delete, or it has not been previously synchronized (view was synchronized through a tag).
- All active tags are included in the list of tags to delete, or those tags have not been previously synchronized.
/similaritySearch
Performs similarity search on previously stored metadata.
Parameters
- query (string, required): Search query
- vdp_database_names (string): Comma-separated list of databases to search
- vdp_tag_names (string): Comma-separated list of tags to search in
- embeddings_provider (string): Provider for embeddings generation
- embeddings_model (string): Model to use for embeddings
- vector_store_provider (string): Vector store to use
- n_results (integer, default: 5): Number of results to return
- scores (boolean, default: false): Include similarity scores
Authentication
All endpoints support two authentication methods:
- HTTP Basic Authentication
- OAuth Bearer Token Authentication
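As an example, both methods can be used with the Python requests library as sketched below; the base URL, credentials and token are placeholders.
import requests

AI_SDK_URL = "http://localhost:8008"   # assumed host/port for the AI SDK API

# HTTP Basic Authentication with Denodo credentials (placeholder values)
requests.get(f"{AI_SDK_URL}/similaritySearch",
             params={"query": "customer loans"},
             auth=("my_denodo_user", "my_denodo_password"))

# OAuth Bearer Token Authentication (placeholder token)
requests.get(f"{AI_SDK_URL}/similaritySearch",
             params={"query": "customer loans"},
             headers={"Authorization": "Bearer <your_oauth_token>"})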
Environment Variables
To avoid passing the same parameters in every call, you can set the following environment variables; the API will use them as defaults whenever the corresponding parameter is not specified.
- EMBEDDINGS_PROVIDER
- EMBEDDINGS_MODEL
- VECTOR_STORE
- LLM_PROVIDER
- LLM_MODEL
- THINK_LLM_PROVIDER
- THINK_LLM_MODEL
- CUSTOM_INSTRUCTIONS
- VQL_EXECUTE_ROWS_LIMIT
- LLM_RESPONSE_ROWS_LIMIT
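For example, a fragment of the configuration setting some of these defaults could look like this; all values below are illustrative placeholders, not recommendations.
EMBEDDINGS_PROVIDER = openai
EMBEDDINGS_MODEL = text-embedding-3-large
VECTOR_STORE = pgvector
LLM_PROVIDER = openai
LLM_MODEL = gpt-4o
VQL_EXECUTE_ROWS_LIMIT = 100
LLM_RESPONSE_ROWS_LIMIT = 15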
Response Format
Most endpoints return a JSON response containing:
- answer: The generated answer
- sql_query: The SQL query used (if applicable)
- query_explanation: Explanation of the query
- tokens: Token usage information
- execution_result: Query execution results
- related_questions: Suggested follow-up questions
- tables_used: Tables used in the query
- raw_graph: Graph data in Base64 SVG format (if applicable)
- Various timing information
NOTE: Streaming endpoints return a text stream containing only the answer.
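As an illustration, a client could read the most relevant fields of a non-streaming response like this (assuming response holds the result of one of the requests sketches above; fields may be absent depending on the endpoint):
data = response.json()
print(data["answer"])             # natural language answer
print(data["sql_query"])          # generated SQL query, if applicable
print(data["execution_result"])   # query execution results
print(data["tables_used"])        # tables used in the query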
Examples
Sample chatbot architecture
The sample chatbot uses a custom tool flow. Our goal is to support all the main LLM providers and models. Most of them already have their own version of tool (function) calling, but the accuracy and reliability of native tool calling vary greatly across each provider's library of models.
Our sample chatbot architecture performs near-perfect tool calling across the whole spectrum of large language models, without the need to rely on the provider’s own functionality.
Our sample chatbot currently implements 3 tools (this list can be expanded):
- Database query tool. This tool calls the answerDataQuestion endpoint, with the parameters being set by the chatbot LLM.
- Metadata query tool. This tool calls the answerMetadataQuestion endpoint, with the parameters being set by the chatbot LLM, same as before.
- Knowledge base query tool. This tool performs a similarity search on the unstructured CSV uploaded to the chatbot.
The information regarding the tools the LLM can use to answer a question is appended at the end of each user question. Then,
- An initial response is generated by the LLM, calling the specific tool with the necessary parameters using XML tags.
- The tool is executed and the output is passed to the LLM, substituting the tool usage information with the tool output. This final response is then shown to the user.
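As a purely illustrative sketch of this two-pass flow: the tag names, prompt layout and helper functions below are assumptions, not the actual implementation of the sample chatbot.
import re

def run_tool_flow(llm, tools, question, tools_description):
    # 1) Append the available tools to the user question and ask the LLM.
    first_response = llm(question + "\n\nAvailable tools:\n" + tools_description)

    # 2) Look for a hypothetical <tool name="...">...</tool> tag in the answer.
    match = re.search(r'<tool name="(\w+)">(.*?)</tool>', first_response, re.DOTALL)
    if match is None:
        return first_response                      # no tool was needed

    tool_name, tool_params = match.group(1), match.group(2)
    tool_output = tools[tool_name](tool_params)    # e.g. call the database query tool

    # 3) Substitute the tool usage information with the tool output and generate the final answer.
    return llm(first_response.replace(match.group(0), tool_output))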
Integrating into your own chatbot solution
To integrate the Denodo AI SDK as an agent or tool as part of your own solution, you can call the API endpoints.
NOTE: Before calling any of the endpoints, you must have inserted the VDP database(s) information into a compatible vector store. To do this, call the getMetadata endpoint with insert = True.
Here is a simple example using the Python requests library. We are going to call the answerQuestion endpoint with the corresponding parameters. This endpoint takes a natural language question and returns a natural language response.
First, we load the vector store with the views from database ‘bank’.
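A sketch of that first step is shown below; the host, port, HTTP method and credentials are assumptions, so adjust them to your deployment.
import requests

AI_SDK_URL = "http://localhost:8008"                 # assumed host/port
AUTH = ("my_denodo_user", "my_denodo_password")      # Denodo credentials (placeholders)

requests.get(f"{AI_SDK_URL}/getMetadata",
             params={"vdp_database_names": "bank", "insert": True},
             auth=AUTH)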
Once this is done, we can now use answerQuestion (or any other endpoint, for that matter).
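Continuing the sketch, the answerQuestion call itself could look like this (the HTTP method is again an assumption; see api/example.py for the reference code):
response = requests.get(f"{AI_SDK_URL}/answerQuestion",
                        params={"question": "How many loans do we have?"},
                        auth=AUTH)
print(response.json())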
For the question ‘How many loans do we have?’ against our database, we get the following response:
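An abbreviated, illustrative version of the JSON response is shown below; the exact wording and the remaining fields will depend on your data and configuration.
{
    "answer": "We have 23 loans.",
    "sql_query": "SELECT COUNT(*) FROM bank.loans",
    "query_explanation": "...",
    "execution_result": "...",
    ...
}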
As we can see, the LLM answered that we have 23 loans. To reach this conclusion, it generated the query SELECT COUNT(*) FROM bank.loans and executed it.
This is a straightforward approach when integrating the AI SDK into your application. However, if integrating the AI SDK as a tool, where your application is controlled by an LLM, you should instead use verbose = False and return only the execution_result to your application’s LLM.
You can find more examples in the api/example.py file.
Frequently Asked Questions (FAQ)
How can I choose the reasoning level of OpenAI models?
OpenAI reasoning models have different levels of reasoning (https://platform.openai.com/docs/guides/reasoning) that can be triggered via the API:
- minimal (only on gpt-5 as of October 2025)
- low
- medium
- high
To set this parameter, append the desired level as a suffix (for example, “-low”) to the value of {LLM/THINKING}_MODEL. For example, to use o4-mini with medium reasoning (the default) you would set:
THINKING_MODEL = o4-mini-medium
How can I send custom HTTP headers with OpenAI-compatible LLMs?
You can define custom HTTP headers to be sent with API requests. This is useful for passing authentication tokens, subscription keys, or other metadata in custom environments.
IMPORTANT: This functionality is specifically designed for providers compatible with the OpenAI/Azure API structure. It applies to:
- OpenAI (standard)
- Azure
The syntax is: PROVIDER_NAME_HEADER_Header-Name=HeaderValue
- Replace PROVIDER_NAME with the provider's name in uppercase (e.g., MYAPI, AZURE).
- Replace Header-Name with the actual HTTP header name (e.g., Subscription-Key).
- Replace HeaderValue with the value for that header.
Example for a Custom OpenAI-compatible provider named "MYAPI"
MYAPI_HEADER_Subscription-Key=HeaderValue
AWS Bedrock assumeRole fails, why?
If you’re certain your AWS credentials are correct and the role you chose has the permissions to access Bedrock, but assumeRole is still failing with a permissions error, it might be that the AWS SDK is using the global STS endpoint to assume the role.
To use the regional STS endpoint, please uncomment the AWS_STS_REGIONAL_ENDPOINTS = regional line in both the sdk_config.env and the chatbot_config.env.
How does incremental loading work?
NOTE: As of version 0.7 of the AI SDK, incremental loading only works with Denodo 9.2.0 and above. With earlier Denodo versions, the AI SDK will remove all existing views and re-insert all the views associated with the database(s) and tag(s) that you want to vectorize. Please note that in 9.2.1 and onwards, incremental mode will be activated automatically in the Data Marketplace when using incremental = True. In 9.2.0, you will have to follow the steps listed at the end of this question.
When you first load a database(s) with getMetadata with insert = True and incremental = True, all of the views in the database(s) will be vectorized in the vector store. With this first load, a timestamp is saved in the vector store to keep track of the last time it was updated.
Next time you call the endpoint getMetadata with insert = True and incremental = True, only the views that have received changes will be modified in the vector store.
If you wish to perform a fresh start, you can safely delete your vector store and load again using getMetadata.
NOTE: For Denodo 9.2.0 (and only 9.2.0, not 9.2.1 and above) you will need to perform extra steps to activate incremental mode.
Go to the Denodo Platform installation folder and navigate to resources\apache-tomcat\webapps\denodo-data-catalog\WEB-INF\classes. Open the application.properties file and activate the askAQuestion.aiSdk.metadataChanges.enabled property.
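Assuming this is a standard boolean property, the resulting line in application.properties would presumably look like:
askAQuestion.aiSdk.metadataChanges.enabled=true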
How to deploy the AI SDK in production
Getting the AI SDK set up in production is simple: just use the --production flag with run.py. For example:
python run.py api --production
This will launch Gunicorn on UNIX or Uvicorn on Windows as the ASGI server that handles incoming requests. Some configuration parameters that might be useful in production:
- AI_SDK_WORKERS. Number of workers to handle incoming requests.
- AI_SDK_SSL_CERT. Path to the SSL certificate file. PEM format.
- AI_SDK_SSL_KEY. Path to the SSL key file. PEM format.
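For instance, a hypothetical production fragment of sdk_config.env could look like this; the worker count and certificate paths are placeholders:
AI_SDK_WORKERS = 4
AI_SDK_SSL_CERT = /etc/ssl/certs/ai-sdk-cert.pem
AI_SDK_SSL_KEY = /etc/ssl/private/ai-sdk-key.pem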
As of version 0.7, the sample chatbot that comes with the AI SDK is not suitable for a production environment.
Also, Chroma is not a suitable vector store for production, as it does not handle large queries and concurrency well. Please look at deploying PGVector or OpenSearch in production.
No matching distribution found for pysqlite3-binary
Sometimes when installing the requirements.txt, you may face an error saying that no distribution was found for pysqlite3-binary for your OS. You can safely comment out this line in requirements.txt and proceed with the installation.
Upgrading the AI SDK
When upgrading the AI SDK, please make sure to follow these steps:
- Back up your AI SDK configuration files (api/utils/sdk_config.env) and the sample chatbot configuration files (sample_chatbot/chatbot_config.env).
- Review the new AI SDK configuration files and sample chatbot configuration files to check whether new properties have been added/deleted/modified.
- Delete the previous AI SDK’s python virtual environment.
- Check if there have been any critical changes to the vector store structure with the new schema. If so, delete the previous AI SDK’s vector store.
- Install the new AI SDK’s dependencies in the new python virtual environment.
- If you had to delete the previous AI SDK’s vector store, run getMetadata to load the new AI SDK vector store.
You’re set!
Python and Ubuntu installation common problems
When installing Python in Ubuntu, you might encounter some of the following problems:
- Python installation failed due to SSL/SQLite dependencies missing from OS. In this case, executing the following command solved the issue:
apt install zlib1g zlib1g-dev libssl-dev libbz2-dev libsqlite3-dev
- Requirements install failed due to Failed to build chroma-hnswlib. This sometimes occurs in Ubuntu 22.04 and Python 3.12. Upgrading to Ubuntu 24.04 (or downgrading to Python 3.11) solved the issue.
Chroma requires a more recent SQLite version
SQLite is included in the Python installation. If your Python installation comes bundled with an old SQLite version, you might need to update it.
Please refer to the following guide: https://docs.trychroma.com/troubleshooting#sqlite.
Using the AI SDK with a Denodo version lower than 9.0.5
While it is possible to use the AI SDK with a Denodo version lower than 9.0.5, some functionality may be limited:
- User permissions. To be able to filter by user permissions, you will need a specific hotfix for any version below 9.0.5.
- Setting examples_per_table to 0 will cause a bug. If you don’t want to provide the LLMs with example data to better understand your dataset, you can set examples_per_table to 0, but this will cause a crash in versions under 9.0.5. You will need the same hotfix as for user permissions to avoid this.
Installing Python requirements.txt takes longer than 15 minutes
The requirements.txt provided comes with all dependencies set and has been tested on Python 3.10 and 3.11 on both Windows and Linux. However, sometimes Python might take longer than usual to install the dependencies.
If this is the case, please update pip (Python’s package installer).
When should I use streaming endpoints vs non-streaming endpoints?
Streaming endpoints are designed to be used with user interfaces (UIs). They allow for real-time updates as the response is generated, providing a more interactive experience for the user.
Non-streaming endpoints are better suited for programmatic access or when you’re fine with receiving the complete response at once.
How can I debug if something goes wrong?
If you encounter issues, you can check the following log files to help diagnose the problem:
1. AI SDK API log file: This log file (located in logs/api.log) contains information about the API server's operations.
2. Sample chatbot log: This log file (located in logs/sample_chatbot.log) provides details about the chatbot's activities.
3. Denodo Data Marketplace log: Located in {DENODO_HOME}/logs/vdp-data-catalog/ within the Denodo installation folder, this log can provide insights into Data Marketplace-related issues.
Reviewing these log files can help you identify error messages, warnings, or other relevant information to troubleshoot the problem.
How can I use two different custom providers for LLM and embeddings?
NOTE: This requires v0.4 of the AI SDK or above.
The SDK supports custom providers for the LLM and the embeddings model, as long as they offer an OpenAI-compatible API endpoint. You can configure the custom providers by following these steps:
- Set the provider for the LLM models as a custom name (not OpenAI). For example, provider1.
- Set the provider for the embeddings as a custom name (not OpenAI). For example, provider2.
- Set PROVIDER1_BASE_URL, PROVIDER1_API_KEY, PROVIDER1_PROXY (optional) and PROVIDER2_BASE_URL, PROVIDER2_API_KEY, PROVIDER2_PROXY (optional) in the configuration files.
PROVIDER1 must be the uppercase version of the custom provider name set in the corresponding _PROVIDER property.
The base URL should always end in v1, for example:
CEREBRAS_BASE_URL = https://api.cerebras.ai/v1
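Putting it together, a hypothetical configuration using two custom providers could look like this; the provider names, URLs and keys are placeholders:
LLM_PROVIDER = provider1
EMBEDDINGS_PROVIDER = provider2
PROVIDER1_BASE_URL = https://llm.example.com/v1
PROVIDER1_API_KEY = <provider1 API key>
PROVIDER2_BASE_URL = https://embeddings.example.com/v1
PROVIDER2_API_KEY = <provider2 API key>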
Can I use a different provider for each component (LLM/Embeddings)?
Yes, absolutely. You can configure the AI SDK to use different providers for your LLMs and for embeddings, allowing you to optimize performance and costs.
On Bedrock, LLM call fails with “on-demand throughput isn’t supported”, why is this?
Not all models are available in all regions on AWS Bedrock. In these cases, you may need to prepend “us.” to your modelID so that AWS routes your request to the appropriate region where the model is available. So, for example, if LLM_MODEL is:
meta.llama3-1-70b-instruct-v1:0
It should become:
us.meta.llama3-1-70b-instruct-v1:0
I’m having issues with the onnxruntime library on Windows, how can I fix it?
This usually happens because of a dependency conflict with Visual C++. Please install the latest version: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
Ollama is truncating my prompt, why is that?
By default, and unless the Modelfile specifies otherwise, Ollama defaults models to a context window size of 2048 tokens. This is done to enhance user experience, because bigger context windows require more memory and sacrifice performance.
If you are certain you want to increase your model’s context window size in Ollama, these are the steps to follow:
- We will be using phi4 (and expanding to a context window of 16k tokens) as an example for this tutorial; please replace it with the model you want to modify.
- ollama show --modelfile phi4 > Modelfile
- Open the file Modelfile and edit it.
- The FROM line inside the Modelfile will contain a path. Delete the path and replace it with your model name. In our case: FROM phi4.
- Find the line PARAMETER and append a new line below it: PARAMETER num_ctx 16000
- Create the new model: ollama create -f Modelfile phi4-16k
- In this case we named it phi4-16k, but feel free to call it however you like!
- Finally, run: ollama run phi4-16k
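After these edits, the relevant lines of the Modelfile would look similar to the following (the rest of the Modelfile contents depend on the original model):
FROM phi4
PARAMETER num_ctx 16000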
Unsupported compiler, at least C++11 support is needed
If you are getting this error while installing the AI SDK’s requirements.txt in Ubuntu 22, you need to run the following command:
apt-get install build-essential -y
Can I avoid writing the log to the filesystem?
To avoid generating log files, you can send the output of the AI SDK/chatbot to the console by adding the --no-logs flag to the python run.py command:
python run.py both --no-logs
How should I configure Azure OpenAI in the AI SDK?
When working with Azure OpenAI, you normally have a URL that looks like this (the exact deployment name and API version will vary):
https://endpoint_name.openai.azure.com/openai/deployments/GPT4o_mini_2024-07-18/chat/completions?api-version=2024-08-01-preview
To configure this LLM in the AI SDK, you have to decouple the previous URL into the following parameters:
- {LLM/THINKING}_PROVIDER: azure
- {LLM/THINKING}_MODEL. This is your deployment name. In the previous example, it would be: GPT4o_mini_2024-07-18
- AZURE_ENDPOINT: https://endpoint_name.openai.azure.com/
- AZURE_API_VERSION: 2024-08-01-preview
- AZURE_API_KEY. Your API key to access the Azure OpenAI service.
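Putting the previous values together, the configuration would look similar to this (using the example deployment above; replace the placeholders with your own values):
LLM_PROVIDER = azure
LLM_MODEL = GPT4o_mini_2024-07-18
AZURE_ENDPOINT = https://endpoint_name.openai.azure.com/
AZURE_API_VERSION = 2024-08-01-preview
AZURE_API_KEY = <your Azure OpenAI API key>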
If I don't want to use sample data, which privileges does my Metadata Sync User need?
For a user to have sufficient privileges to get metadata from Denodo, the user must have:
- Connection privilege in the admin database within Denodo.
- Connect privilege in your target database within Denodo.
- Metadata privilege in your target database within Denodo.
