Denodo Azure AI Search Custom Wrapper - User Manual
Introduction
denodo-azureaisearch-customwrapper is a Virtual DataPort Custom Wrapper for querying Azure AI Search indexes.
It bridges the gap between Virtual DataPort and AI-indexed content in Azure by establishing a predefined schema for search results, vectors, and metadata. This enables the integration of hybrid search and Retrieval-Augmented Generation (RAG) capabilities into Denodo.
What is Azure AI Search?
Azure AI Search is a fully managed, cloud-hosted search service that connects data to AI, enabling agents and LLMs to produce reliable, grounded answers using context and history.
Use cases span classic search and modern agentic retrieval (RAG), suitable for enterprise and consumer scenarios like adding search to websites, apps, agents, or chatbots.
Search service capabilities include:
- Two engines: classic search for single requests and agentic retrieval for parallel, iterative, LLM-assisted search.
- Query types: Full-text, vector, hybrid, and multimodal over indexed and remote content.
- AI enrichment for chunking, vectorizing, and making raw content searchable.
- Relevance tuning.
- Azure scale, security, monitoring, and compliance.
- Azure integrations with data platforms, Azure OpenAI, and Microsoft Foundry.
Architecture and Features
denodo-azureaisearch-customwrapper lets you create base views on Azure AI Search indexes and retrieve data from those indexes.
This wrapper leverages the Azure AI Search Java client. This integration grants Denodo users full access to the service's search capabilities, executed via base views. Supported query types include: full-text, vector, hybrid, and multimodal search.
Azure AI Search Custom Wrapper Architecture
This is a brief summary of the wrapper’s current features:
- API key and Azure Role-Based Access Control (RBAC) authentication.
- Base view schema discovery either through metadata retrieval or by inferring it via an introspection query.
- Supports full-text, vector, hybrid, and multimodal search queries.
- Field projections are delegated to the Azure AI Search service.
- Delegation of WHERE conditions: AND and OR combinations, and the operators =, <>, <, >, <=, >=.
- ORDER BY clause delegation.
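For illustration, a query of the following shape exercises the delegations listed above. This is only a sketch: the view name hotels_vector_quickstart is borrowed from the example later in this manual, and the columns hotelname, category, and rating are assumed to exist in the index and to be marked as filterable and sortable in Azure AI Search.

```sql
-- Sketch only: hotelname, category and rating are assumed index fields.
SELECT hotelname, category, rating            -- projection is delegated
FROM hotels_vector_quickstart
WHERE category = 'Suite' AND rating >= 4      -- AND condition using = and >=
ORDER BY rating DESC;                         -- ordering is delegated
```

If a condition uses an operator or function outside the list above, it is reasonable to expect Virtual DataPort to evaluate it after the rows are retrieved, rather than the service.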
Installation
The Denodo Azure AI Search Custom Wrapper distribution consists of:
- /dist:
- denodo-azureaisearch-customwrapper-{denodo-version}-{version}.jar. The custom wrapper.
- denodo-azureaisearch-customwrapper-{denodo-version}-{version}-jar-with-dependencies.jar. The custom wrapper plus its dependencies. We recommend using this package, as it is easier to install in VDP.
- denodo-azureaisearch-customwrapper-{denodo-version}-{version}-sources. The custom wrapper source code.
- /lib: All the dependencies required by this wrapper in case you need to use the denodo-azureaisearch-customwrapper-{denodo-version}-{version}.jar.
Usage
Importing the Custom Wrapper
To import the custom wrapper, follow these steps:
- In Design Studio, go to File → Extension management
- Click the “Create” button and select the “denodo-azureaisearch-customwrapper-{denodo-version}-{version}-jar-with-dependencies.jar” file, located in the dist folder of the Denodo Azure AI Search Custom Wrapper distribution downloaded from the Denodo Support Site.
Creating the Azure AI Search data source
To create a new Azure AI Search custom data source:
- In the Design Studio, go to: File → New… → Data source → Custom
- In the “Create a New Custom Data Source” window, do the following:
- Set a name for the new Azure AI Search data source in the “Name” field.
- Click on “Select Jars” and select the file imported in the previous section.
- Click the refresh button to reload the input parameters of the data source.
- Configure the data source parameters:
- Search Service Endpoint (mandatory): the unique URL of your Azure AI Search instance, which can be found in the Url field on the service's Overview page in the Azure Portal (e.g., https://<service-name>.search.windows.net).
In addition, Azure AI Search authentication must be configured in the service endpoint HTTP path, using either an API key (found under the Keys section) or RBAC credentials via Microsoft Entra ID, depending on your service security configuration.
- Click on the “Save” button.
Creating the base view
To create a new base view using the Azure AI Search data source:
- Double-click on the Azure AI Search data source and then click on “Create base view”.
- Set the parameters as follows:
- Index Name (mandatory): specific name of the Azure AI Search index to query. This name must exactly match an existing index within your service instance, as it determines the schema and searchable data available for the resulting base view.
- Use Introspection Query: when enabled, the wrapper executes a sample query (defined in the Introspection Query field) to automatically infer the schema and data types of the search index. This is particularly useful for generating the base view when the provided credentials lack administrative metadata privileges but have permission to execute search queries.
- Introspection Query: Specify the Azure AI Search REST API query to be used for schema discovery when Use Introspection Query is enabled. To minimize performance impact while ensuring accurate column detection, it is recommended to use a query that limits the result set, such as:
Introspection Query sample: { "top" : "5", "select" : "*" }
- Default Limit: This parameter specifies the total number of records to be retrieved when no explicit limit is defined within the VQL query. The default value is 50.
- Minimal View: When enabled, this option generates a simplified minimal view containing only the Azure AI Search document fields. This configuration is specifically optimized for RAG queries originating from the Denodo Assistant and AI-SDK. By default, this feature is disabled, meaning the resulting view will include standard Azure AI Search API input parameters (see Default (Full) View section).
- Vector Fields: a comma-separated list of vector field names. When the Minimal View option is active, the wrapper uses these names to generate the corresponding text search input fields in the base view schema. For queries to succeed, each specified field must already have an associated vector configuration and vectorizer defined in the Azure AI Search service.
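As a rough illustration of the Default Limit parameter described above, the two queries below differ only in whether they state a limit. This sketch assumes that the explicit limit is expressed with the standard VQL LIMIT clause and reuses the hotels_vector_quickstart view name from the example later in this manual.

```sql
-- No explicit limit: the Default Limit of the base view (50 unless changed) applies.
SELECT *
FROM hotels_vector_quickstart
WHERE (_input).search = 'hotel';

-- Explicit limit (assumed to be the VQL LIMIT clause): at most 10 rows are requested.
SELECT *
FROM hotels_vector_quickstart
WHERE (_input).search = 'hotel'
LIMIT 10;
```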
Default (Full) View
The schema includes all columns corresponding to the index document fields, plus two dedicated, structured fields: _input and _metadata. The _input field maps to all supported Azure AI Search service input parameters (such as search, search_mode, and vector_query), enabling full query customization via VQL WHERE clauses. The _metadata field contains service output metadata prefixed with @ in the API response.
Example
- To illustrate how the wrapper works, a test index was created in a sample service by following the Microsoft vector quickstart tutorial.
- Create a base view:
- Search Service Endpoint
- Base URL: https://<service-name>.search.windows.net
- HTTP headers: api-key -> *******
- Index Name: hotels-vector-quickstart
- The schema of the base view is shown and you can rename it:
- After clicking on “Ok”, you can execute SELECT queries, for example:
- SELECT * FROM azurewrapper_ds;
Input Fields
The Azure AI Search API includes numerous input parameters, such as search, search_mode, and vector_query, that cannot be directly represented using standard VQL clauses. To accommodate these parameters within VQL queries, we have mapped each one to a subfield within the _input struct.
Consequently, these parameters can be specified via conditions in the query's WHERE clause. Since these are not standard VQL conditions on entity fields, they must be combined exclusively using the AND operator; the NOT and OR operators are not permitted (except for the vector_query struct field, as described in the Handling Multi-Value Vector Queries section). An example of a valid query would be:
SELECT *
FROM hotels_vector_quickstart
WHERE
(_input).search_mode = 'all' AND
(_input).search = 'hotel' AND
(_input).debug = 'all' AND
(_input).count = true AND
(_input).facets = 'Category, Tags' AND
(category = 'Suite' OR category = 'Boutique');
Metadata Fields
The Azure AI Search output metadata fields, which are prefixed with @ in the API response, are placed within a dedicated field called _metadata.
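As a minimal sketch, the metadata can be retrieved like any other column. The query below assumes the hotels_vector_quickstart view from the earlier example and a hotelname column in the index. The subfields actually present in _metadata depend on the service response and are not listed here.

```sql
-- Sketch only: hotelname is an assumed index field.
SELECT hotelname, _metadata
FROM hotels_vector_quickstart
WHERE (_input).search = 'hotel';
```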
Minimal View
The Minimal View is designed to provide a more streamlined and user-friendly interface by restricting the schema to columns that correspond directly with index fields. To maintain this simplicity, the _input and _metadata fields are automatically omitted from the view.
Enabling this mode allows the wrapper to generate specialized text search input fields derived from your Vector Fields settings. These fields work by converting plain text entries into vector embeddings, which then trigger vector search queries using the nearest neighbor algorithms in Azure AI Search.
NOTE: This feature enhances compatibility with the Denodo Assistant and Denodo AI-SDK tools by offering a cleaner and more accessible view architecture.
Alternatively, if your use case involves advanced vector search control (such as providing raw embeddings, adjusting k-values, or assigning specific weights) or requires semantic/textual searches, the Default (Full) View should be used instead.
Example
- The index used for this example is hotels-vector-quickstart, the one created for the Full View example.
- Create a minimal base view:
- Search Service Endpoint
- Base URL: https://<service-name>.search.windows.net
- HTTP headers: api-key -> *******
- Index Name: hotels-vector-quickstart
- Minimal View: true
- Vector Fields: DescriptionVector
- The base view schema is shown. You can see that a special column with the _input_search prefix is created for each Vector Field introduced. In this case _input_search_descriptionvector is the field generated for executing vector search queries on the DescriptionVector field.
- Then you can execute vector search queries on the DescriptionVector field:
- SELECT * FROM hotels_vector_search WHERE _input_search_descriptionvector = 'hotels near the mountain' AND _input_view_search_limit = 10;
Limitations
Write operations
Insert/update/delete operations are not currently supported through this custom wrapper.
Index Management and Creation
This wrapper is designed exclusively for data retrieval. It does not support Data Definition Language (DDL) operations; therefore, you cannot create, modify, or delete Azure AI Search indexes directly through it.
Handling Multi-Value Vector Queries
Since conditions on array input fields are currently not delegated correctly to Custom Wrappers, the multi-valued vectorQueries parameter of the REST API (described in the Azure AI Search REST API documentation) is mapped within Denodo to the struct input field vector_query. Consequently, to supply multiple values for this field, the query must combine the distinct values using the OR operator:
SELECT *
FROM hotels_vector_quickstart
WHERE
(_input).search_mode = 'all' AND
(_input).search = 'hotel' AND
(_input).debug = 'all' AND
(_input).count = true AND
(
-- First vector input
(_input).vector_query = ROW('vector', 7, null, 'DescriptionVector', '-0.0455, 0.0286...', null, true, 1.0)
OR
-- Second vector input
(_input).vector_query = ROW('vector', 7, null, 'DescriptionVector', '-0.0142, 0.0202...', null, false, 0.5)
) AND
(_input).facets = 'Category, Tags';
References
Azure AI Search official pages:
- Documentation
