

Resource Groups

In an environment where multiple user sessions run queries concurrently, not all sessions are equally important. You may want to give some types of queries priority over others. Denodo Embedded MPP uses Resource Groups to organize how different workloads are prioritized.

Resource Groups manage quotas for two main resources: CPU and memory. In addition, granular resource constraints such as concurrency, time, and cpuTime can be specified.

Resource Group limits are enforced only at admission time. Once a query starts executing, the Resource Group manager has no control over it. Instead, groups that exceed their resource specification are penalized: before a Resource Group is allowed to start a new query, the manager checks whether the group has exceeded its limits.
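
The admission-only behavior described above can be illustrated with a small Python sketch. It is a simplified model, not the Denodo Embedded MPP implementation: the GroupState class and the admit function are hypothetical names, memory usage is reduced to a single flag, and queue draining is ignored.

```python
from dataclasses import dataclass


@dataclass
class GroupState:
    """Hypothetical snapshot of one Resource Group at admission time."""
    hard_concurrency_limit: int
    max_queued: int
    running: int = 0
    queued: int = 0
    over_memory_limit: bool = False  # True while usage exceeds softMemoryLimit


def admit(group: GroupState) -> str:
    """Decide what happens to a newly submitted query.

    Returns "run", "queue", or "reject". Queries that are already
    running are never touched: limits only affect new submissions.
    """
    can_run = (
        group.running < group.hard_concurrency_limit
        and not group.over_memory_limit
    )
    if can_run:
        group.running += 1
        return "run"
    if group.queued < group.max_queued:
        group.queued += 1
        return "queue"
    return "reject"
```

In this model, a group with hardConcurrencyLimit 2 and maxQueued 1 runs the first two queries, queues the third, and rejects the fourth. A group that is over its memory limit queues new queries even when it has free concurrency slots.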

To enable Resource Management in the Denodo Embedded MPP, set presto.coordinator.resourceGroups: true in the values.yaml file.

Example

The example configuration shown below is the one that is distributed (presto/conf/resource_groups.json); modify it according to your needs. It defines several Resource Groups and five selectors that determine which Resource Group each query runs in:

  • The first selector matches queries from datascience and places them in the global.adhoc.datascience group.

  • The second selector matches queries from bi and places them in the global.adhoc.bi group.

  • The third selector matches queries from pipeline and places them in the global.pipeline group.

  • The fourth selector matches queries from admin and places them in the admin group.

  • The last selector is a catch-all for all queries that have not been matched into any group.

Together, these selectors implement the following policy:

  • The admin group can run up to 50 concurrent queries (hardConcurrencyLimit).

For all other groups:

  • No more than 100 total queries may run concurrently (hardConcurrencyLimit).

  • Ad-hoc queries: queries from BI tools can run up to 10 concurrently (hardConcurrencyLimit), and queries from Data Science tools can run up to 2 concurrently (hardConcurrencyLimit).

  • Non-ad-hoc queries run under the global.pipeline group, with a total concurrency of 45 (hardConcurrencyLimit). Queries are run in FIFO order.

  • All remaining queries are placed in the global group.
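
To make the concurrency figures in this policy concrete, the following Python sketch computes an upper bound on running queries for each group. It assumes that a query must be admitted by its own group and by every ancestor group, so the tightest hardConcurrencyLimit along the path applies; that hierarchical rule and the helper names are assumptions made for illustration. The limit values are the ones used in the example configuration.

```python
# hardConcurrencyLimit values copied from the example resource_groups.json.
HARD_CONCURRENCY = {
    "global": 100,
    "global.adhoc": 50,
    "global.adhoc.datascience": 2,
    "global.adhoc.bi": 10,
    "global.pipeline": 45,
    "admin": 50,
}


def ancestors_and_self(group: str) -> list[str]:
    """Return the group path from the root down, e.g. a, a.b, a.b.c."""
    parts = group.split(".")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


def effective_concurrency(group: str) -> int:
    """Upper bound on running queries: the tightest limit along the path."""
    return min(HARD_CONCURRENCY[name] for name in ancestors_and_self(group))


for name in ("global.adhoc.bi", "global.adhoc.datascience", "global.pipeline", "admin"):
    print(name, effective_concurrency(name))
```

Under this assumption the leaf limits (10, 2, 45, and 50) are the binding ones here, because every parent group allows more concurrent queries than its children.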

resource_groups.json
  {
  "rootGroups": [
      {
      "name": "global",
      "softMemoryLimit": "80%",
      "hardConcurrencyLimit": 100,
      "maxQueued": 1000,
      "schedulingPolicy": "weighted",
      "jmxExport": true,
      "subGroups": [
          {
          "name": "adhoc",
          "softMemoryLimit": "10%",
          "hardConcurrencyLimit": 50,
          "maxQueued": 1,
          "schedulingWeight": 10,
          "subGroups": [
              {
              "name": "datascience",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 2,
              "maxQueued": 1,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
              },
              {
              "name": "bi",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 10,
              "maxQueued": 100,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
              }
          ]
          },
          {
          "name": "pipeline",
          "softMemoryLimit": "80%",
          "hardConcurrencyLimit": 45,
          "maxQueued": 100,
          "schedulingWeight": 1,
          "jmxExport": true
          }
      ]
      },
      {
      "name": "admin",
      "softMemoryLimit": "100%",
      "hardConcurrencyLimit": 50,
      "maxQueued": 100,
      "schedulingPolicy": "query_priority",
      "jmxExport": true
      }
  ],
  "selectors": [
      {
      "source": "datascience",
      "group": "global.adhoc.datascience"
      },
      {
      "source": "bi",
      "group": "global.adhoc.bi"
      },
      {
      "source": "pipeline",
      "group": "global.pipeline"
      },
      {
      "source": "admin",
      "group": "admin"
      },
      {
      "group": "global"
      }
  ],
  "cpuQuotaPeriod": "1h"
  }
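
The way the selectors in this file route queries can be sketched in Python as follows. The sketch assumes that selectors are evaluated in file order, that the first match wins, and that each source value is a regular expression that must match the whole source string; the select_group function is a hypothetical helper, not part of any Denodo or Presto API.

```python
import re
from typing import Optional

# Selectors copied from the example resource_groups.json, in file order.
SELECTORS = [
    {"source": "datascience", "group": "global.adhoc.datascience"},
    {"source": "bi", "group": "global.adhoc.bi"},
    {"source": "pipeline", "group": "global.pipeline"},
    {"source": "admin", "group": "admin"},
    {"group": "global"},  # no "source": matches every query
]


def select_group(source: Optional[str]) -> str:
    """Return the group of the first selector that matches the source."""
    for selector in SELECTORS:
        pattern = selector.get("source")
        if pattern is None:
            return selector["group"]
        # Assumption: the pattern is a regex that must match the whole source.
        if source is not None and re.fullmatch(pattern, source):
            return selector["group"]
    raise LookupError("no selector matched")


print(select_group("bi"))
print(select_group("other-tool"))
```

Because the catch-all selector comes last, a query whose source matches none of the first four selectors still lands in the global group instead of being left unmatched.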

For Denodo to take advantage of the Denodo Embedded MPP’s Resource Group mechanism, create several data sources in Denodo that point to the same Denodo Embedded MPP and, in each one, select the corresponding Resource Group (source) with the applicationNamePrefix driver property.

embedded_mpp data source driver properties

