Azure Cognitive Search vs. Denodo Search

Hi Team, our data is currently stored in Azure Blob Storage, and we have a REST API that uses Azure Cognitive Search to search that data. Is there any functionality in Denodo that can replace the Azure Cognitive Search capability? I tried to index the sample data through Denodo, but it is not as efficient as Azure Cognitive Search.
user
29-04-2021 15:12:40 -0400

1 Answer

Hello,

In terms of indexing performance, there are two pieces to consider:

1. **Access to the data** in blob storage. There are different ways to accomplish this: use the [distributed file system wrapper](https://community.denodo.com/docs/html/document/denodoconnects/8.0/Denodo%20Distributed%20File%20System%20Custom%20Wrapper%20-%20User%20Manual), leverage a data lake engine like Databricks, or use a [Denodo Presto cluster in AKS](https://community.denodo.com/docs/html/document/denodoconnects/8.0/Denodo%20Presto%20Cluster%20on%20Kubernetes%20-%20User%20Manual). Performance will be better with a data lake engine like Databricks or Presto than with the wrapper.
2. **Indexing process**. Denodo supports two options: an embedded indexer distributed with Denodo Scheduler, based on Lucene, and [Elasticsearch](https://community.denodo.com/docs/html/browse/8.0/en/scheduler/administration/creating_and_scheduling_jobs/data_sources/elasticsearch_sources). For large volumes, Elasticsearch will yield better results, but you will need to provide that infrastructure, for example using https://azure.microsoft.com/en-us/overview/linux-on-azure/elastic/

Additionally, make sure to follow the best practices for indexing outlined in [this article](https://community.denodo.com/docs/html/browse/8.0/en/scheduler/administration/creating_and_scheduling_jobs/configuring_new_jobs/vdpindexer_extraction_section#recommendations-for-the-indexing-processes), and use incremental indexing whenever possible to avoid full refreshes.

Regarding the search process, Elasticsearch will also scale out better with large data volumes, as it can be clustered and take advantage of parallel processing.
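To illustrate why incremental indexing avoids the cost of a full refresh, here is a minimal, generic sketch in Python. It is not Denodo Scheduler's or Elasticsearch's implementation: the `Document` and `IncrementalIndexer` names, the in-memory dictionary standing in for the index, and the last-modified watermark strategy are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class Document:
    """Hypothetical record: an id, its text, and a last-modified timestamp."""
    doc_id: str
    text: str
    modified_at: datetime


class IncrementalIndexer:
    """Toy in-memory index (an assumption, not a real search engine).

    Each run only processes documents modified after the watermark
    left by the previous run, instead of re-indexing everything.
    """

    def __init__(self) -> None:
        self.index: Dict[str, str] = {}
        # Start with the earliest possible timestamp so the first run
        # behaves like a full load.
        self.watermark = datetime.min.replace(tzinfo=timezone.utc)

    def run(self, documents: List[Document]) -> int:
        """Index changed documents and return how many were processed."""
        changed = [d for d in documents if d.modified_at > self.watermark]
        for doc in changed:
            self.index[doc.doc_id] = doc.text
        if changed:
            # Advance the watermark to the newest change seen so far.
            self.watermark = max(d.modified_at for d in changed)
        return len(changed)
```

With this approach, the first run indexes every document, and later runs touch only what has changed since the stored watermark. The sketch does not handle deletions or documents that share the watermark timestamp exactly; a real job would need to account for both.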
Denodo Team
04-05-2021 12:32:13 -0400