Resource Groups
In an environment where multiple concurrent user sessions run queries, not all sessions are equally important. You may want to give some types of queries priority over others. Denodo Embedded MPP uses Resource Groups to organize how different workloads are prioritized.
Resource Groups manage quotas for two main resources: CPU and memory. In addition, granular resource constraints such as concurrency, time, and cpuTime can be specified.
Resource Group limits are enforced only at admission time. Once a query starts executing, the Resource Group manager has no control over it. Instead, groups that exceed their resource specification are penalized: before a Resource Group is allowed to start a new query, the manager checks whether the group has exceeded its limits.
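To make the admission-only behavior concrete, here is a minimal toy model in Python. It is a sketch under simplifying assumptions, not the Denodo Embedded MPP's implementation: the class ToyGroup and its method names are hypothetical, and only the concurrency and queue limits are modeled (no CPU or memory quotas).

```python
from collections import deque


class ToyGroup:
    """Toy admission controller; names are illustrative, not MPP internals."""

    def __init__(self, hard_concurrency_limit, max_queued):
        self.hard_concurrency_limit = hard_concurrency_limit
        self.max_queued = max_queued
        self.running = set()
        self.queued = deque()

    def submit(self, query_id):
        # Limits are checked only here, when the query asks to be admitted.
        if len(self.running) < self.hard_concurrency_limit:
            self.running.add(query_id)
            return "RUNNING"
        if len(self.queued) < self.max_queued:
            self.queued.append(query_id)
            return "QUEUED"
        return "REJECTED"

    def finish(self, query_id):
        # A finished query frees a slot; the oldest queued query is admitted.
        self.running.discard(query_id)
        if self.queued and len(self.running) < self.hard_concurrency_limit:
            promoted = self.queued.popleft()
            self.running.add(promoted)
            return promoted
        return None


group = ToyGroup(hard_concurrency_limit=2, max_queued=1)
states = [group.submit(q) for q in ("q1", "q2", "q3", "q4")]
promoted = group.finish("q1")
print(states, promoted)
```

Note that nothing in this model can stop or throttle a query once submit has returned RUNNING, which mirrors the point above: the only lever is whether the next query is admitted, queued, or rejected.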
To enable Resource Management in the Denodo Embedded MPP, set presto.coordinator.resourceGroups: true in the values.yaml file.
Example
The example configuration shown below is the one that is distributed (presto/conf/resource_groups.json), to be modified according to your needs. It defines several Resource Groups and five selectors that determine which queries are executed in which Resource Group:
- The first selector matches queries from datascience and places them in the global.adhoc.datascience group.
- The second selector matches queries from bi and places them in the global.adhoc.bi group.
- The third selector matches queries from pipeline and places them in the global.pipeline group.
- The fourth selector matches queries from admin and places them in the admin group.
- The last selector is a catch-all for all queries that have not been matched into any group.
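The selector list above behaves like a first-match routing table. The following Python sketch illustrates that idea. It assumes that selectors are evaluated in order, that the first match wins, and that the source field is a regular expression that must match the whole source string (as in upstream Presto's file-based resource group configuration). The function route is hypothetical and only mimics that behavior.

```python
import re

# Selectors in the same order as in the distributed resource_groups.json.
SELECTORS = [
    {"source": "datascience", "group": "global.adhoc.datascience"},
    {"source": "bi", "group": "global.adhoc.bi"},
    {"source": "pipeline", "group": "global.pipeline"},
    {"source": "admin", "group": "admin"},
    {"group": "global"},  # no "source": catch-all
]


def route(source, selectors=SELECTORS):
    """Return the group of the first selector that matches the source."""
    for selector in selectors:
        pattern = selector.get("source")
        # A selector without a "source" pattern matches every query.
        if pattern is None or re.fullmatch(pattern, source or ""):
            return selector["group"]
    return None


routed = {s: route(s) for s in ("datascience", "bi", "pipeline", "admin", "other")}
for source, group in routed.items():
    print(f"{source} -> {group}")
```

Because evaluation stops at the first match, the catch-all selector has to stay last; placed earlier, it would capture every query before the more specific selectors are tried.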
Together, these selectors implement the following policy:
- The admin group can run up to 50 concurrent queries (hardConcurrencyLimit).
- For all other groups:
  - No more than 100 total queries may run concurrently (hardConcurrencyLimit).
  - Ad-hoc queries, such as queries from BI tools, can run up to 10 concurrent queries (hardConcurrencyLimit), and queries from Data Science tools can run up to 2 concurrent queries (hardConcurrencyLimit).
  - Non ad-hoc queries run under the global.pipeline group, with a total concurrency of 45 (hardConcurrencyLimit). Queries are run in FIFO order.
  - All remaining queries are placed in the global group.
{
  "rootGroups": [
    {
      "name": "global",
      "softMemoryLimit": "80%",
      "hardConcurrencyLimit": 100,
      "maxQueued": 1000,
      "schedulingPolicy": "weighted",
      "jmxExport": true,
      "subGroups": [
        {
          "name": "adhoc",
          "softMemoryLimit": "10%",
          "hardConcurrencyLimit": 50,
          "maxQueued": 1,
          "schedulingWeight": 10,
          "subGroups": [
            {
              "name": "datascience",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 2,
              "maxQueued": 1,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
            },
            {
              "name": "bi",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 10,
              "maxQueued": 100,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
            }
          ]
        },
        {
          "name": "pipeline",
          "softMemoryLimit": "80%",
          "hardConcurrencyLimit": 45,
          "maxQueued": 100,
          "schedulingWeight": 1,
          "jmxExport": true
        }
      ]
    },
    {
      "name": "admin",
      "softMemoryLimit": "100%",
      "hardConcurrencyLimit": 50,
      "maxQueued": 100,
      "schedulingPolicy": "query_priority",
      "jmxExport": true
    }
  ],
  "selectors": [
    {
      "source": "datascience",
      "group": "global.adhoc.datascience"
    },
    {
      "source": "bi",
      "group": "global.adhoc.bi"
    },
    {
      "source": "pipeline",
      "group": "global.pipeline"
    },
    {
      "source": "admin",
      "group": "admin"
    },
    {
      "group": "global"
    }
  ],
  "cpuQuotaPeriod": "1h"
}
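When adapting this file, it can help to check that every selector points at a group that exists and to see the concurrency ceiling each group ends up with. The Python sketch below does both on a trimmed copy of the configuration (only name, hardConcurrencyLimit, subGroups, and the selectors are kept). It assumes that a query running in a subgroup also counts against every ancestor group, so the effective ceiling of a group is the minimum limit along its path. The helper effective_limits is hypothetical and not part of the product.

```python
import json

# Trimmed copy of the example configuration; other fields are omitted.
CONFIG = json.loads("""
{
  "rootGroups": [
    {"name": "global", "hardConcurrencyLimit": 100, "subGroups": [
      {"name": "adhoc", "hardConcurrencyLimit": 50, "subGroups": [
        {"name": "datascience", "hardConcurrencyLimit": 2},
        {"name": "bi", "hardConcurrencyLimit": 10}
      ]},
      {"name": "pipeline", "hardConcurrencyLimit": 45}
    ]},
    {"name": "admin", "hardConcurrencyLimit": 50}
  ],
  "selectors": [
    {"source": "datascience", "group": "global.adhoc.datascience"},
    {"source": "bi", "group": "global.adhoc.bi"},
    {"source": "pipeline", "group": "global.pipeline"},
    {"source": "admin", "group": "admin"},
    {"group": "global"}
  ]
}
""")


def effective_limits(groups, prefix="", inherited=None):
    """Map each dotted group path to the minimum limit along its path."""
    limits = {}
    for group in groups:
        path = prefix + group["name"]
        limit = group["hardConcurrencyLimit"]
        if inherited is not None:
            limit = min(limit, inherited)
        limits[path] = limit
        limits.update(
            effective_limits(group.get("subGroups", []), path + ".", limit)
        )
    return limits


limits = effective_limits(CONFIG["rootGroups"])
# Selectors whose target group is not defined anywhere in rootGroups.
dangling = [s["group"] for s in CONFIG["selectors"] if s["group"] not in limits]

for path, limit in limits.items():
    print(f"{path}: at most {limit} concurrent queries")
print("selectors with unknown groups:", dangling)
```

To run the same check against a real file, CONFIG could be loaded with json.load from presto/conf/resource_groups.json instead of the inline string.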
For Denodo to take advantage of the Denodo Embedded MPP’s Resource Group mechanism, you need to create a separate data source in Denodo for each Resource Group, all pointing to the same Denodo Embedded MPP, and select the corresponding Resource Group (source) in each one using the applicationNamePrefix driver property.