Denodo Lakehouse Accelerator Resource Management

To correctly allocate the necessary resources for the Denodo Lakehouse Accelerator (formerly known as Denodo Embedded MPP) cluster and effectively manage the available resources, consider the following:

  • Sizing Recommendations: Follow the recommendations in the section Sizing Recommendations for the Denodo Lakehouse Accelerator to select the right cluster size for your environment.

  • Memory Limits Adjustment: Adjust the memory limits using global properties in the values.yaml file (under presto -> worker -> additionalConfig and presto -> coordinator -> additionalConfig). Keep in mind that the Lakehouse Accelerator kills any query that exceeds these limits. The main properties are:

    • query.max-total-memory-per-node: maximum amount of user and system memory a query can use on an MPP worker.

    • query.max-memory-per-node: maximum amount of user memory a query can use on an MPP worker.

    • query.max-memory: maximum amount of user memory a query can use across the entire cluster.

  • Autoscaling Configuration: Configure pod-based autoscaling (via presto.autoscaling in values.yaml) so that the number of workers follows CPU and memory demand. We strongly recommend enabling graceful shutdown to minimize disruptions during scale-down operations and preserve the integrity of ongoing queries.

  • Denodo Resource Manager: For further control, you can use the Denodo Resource Manager, because its rules governing concurrency and execution time also indirectly affect the CPU and memory used by the Lakehouse Accelerator:

    • Concurrency: Limiting the number of concurrent queries puts an immediate ceiling on the total memory and CPU demand. It can be controlled from the Denodo Resource Manager with options such as:

      • Set the maximum number of concurrent queries

      • Set the maximum number of concurrent queries per user

      • Set the maximum number of queued queries

      • Set the maximum number of queries per time unit

    • Execution time: Non-priority users may submit excessively complex or runaway queries that monopolize CPU cores and hold memory for a long time. To prevent this, it is a good idea to impose a limit on the execution time using options such as:

      • Stop queries when the maximum execution time has been reached

      • Stop query when the maximum number of returned rows has been reached
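
As an illustration of the Memory Limits Adjustment item above, the following values.yaml fragment is a minimal sketch of where the three memory properties could be set. It assumes that additionalConfig accepts a list of key=value strings, and the sizes shown are placeholders rather than recommendations. Check both the accepted format and suitable values against the chart's own values.yaml and the Sizing Recommendations section before using it.

```yaml
# Sketch only: the list-of-strings form of additionalConfig is an assumption,
# and every size below is an illustrative placeholder.
presto:
  coordinator:
    additionalConfig:
      # Cluster-wide ceiling on user memory for a single query.
      - query.max-memory=100GB
      # Per-node ceiling on user memory for a single query.
      - query.max-memory-per-node=20GB
      # Per-node ceiling on user plus system memory for a single query.
      - query.max-total-memory-per-node=24GB
  worker:
    additionalConfig:
      # The same values are repeated so coordinator and workers agree.
      - query.max-memory=100GB
      - query.max-memory-per-node=20GB
      - query.max-total-memory-per-node=24GB
```

In this sketch query.max-memory-per-node is kept below query.max-total-memory-per-node, since the total figure covers user and system memory together, and query.max-memory is set above the per-node value because it applies across the entire cluster.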
