Denodo Embedded MPP
Note
This feature is only available with the Enterprise Plus subscription bundle. To find out which bundle you have, open the About dialog of Design Studio. For more information, see the section Denodo Platform - Subscription Bundles.
Denodo includes embedded Massively Parallel Processing (MPP) capabilities to improve performance in environments that keep data in object storage. For this purpose, Denodo now embeds a customized version of Presto, an open-source parallel SQL query engine that excels at accessing data lake content.
To deploy the Denodo Embedded MPP cluster, download the “Denodo Embedded MPP” Denodo Connect from the Support Site and follow the instructions in the Embedded MPP Guide. The “Denodo Embedded MPP” Denodo Connect includes an MPP engine based on Presto that has been customized to interact with the Denodo Platform. In addition, the deployment process includes a final step that creates a new special data source in Denodo called “embedded_mpp”.
This final step also configures the Denodo query optimizer to consider the embedded MPP for query acceleration.
The data source “embedded_mpp” is located in a new database “admin_denodo_mpp”. It can be used for multiple purposes:
Explore object storage such as Amazon S3, Azure Data Lake Storage, or HDFS, and create base views over data stored in Parquet or Delta format (see Object Storage Data in Parquet, Delta and Iceberg format).
Read data in Parquet and Delta format from the object storage with the power of a massively parallel processing engine.
Load data into the object storage using Parquet or Iceberg format (see Create Iceberg Tables in the Denodo Embedded MPP).
Cache Denodo server view data. The Denodo server can also use the “embedded_mpp” data source as its cache (see Configure the Denodo Server to Use the Embedded MPP as Cache Using Iceberg Tables).
Accelerate queries. The Embedded MPP allows the query optimizer to apply new Embedded MPP Acceleration techniques that have been specially designed for queries accessing this kind of data.