Resource Groups
In an environment with multiple concurrent user sessions running queries, not all user sessions have the same importance. You may want to give some types of queries priority over others. Denodo Embedded MPP uses Resource Groups to organize how different workloads are prioritized.
Resource Groups manage quotas for two main resources: CPU and memory. In addition, granular resource constraints such as concurrency, time, and cpuTime can be specified.
Resource Group limits are enforced only at admission. Once a query starts executing, the Resource Group manager has no control over it. Instead, groups that exceed their resource specification are penalized: before a Resource Group is allowed to start a new query, the manager checks whether the group has exceeded its limits.
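The admission-only behaviour can be illustrated with a minimal sketch. It is not the actual Resource Group manager: the Group class and the admit function are hypothetical names, and the model deliberately ignores parent groups, memory limits and CPU penalties. It only shows that hardConcurrencyLimit and maxQueued are consulted when a query is submitted and never afterwards.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Hypothetical, simplified model of a single Resource Group."""
    name: str
    hard_concurrency_limit: int
    max_queued: int
    running: int = 0
    queued: int = 0


def admit(group: Group) -> str:
    """Decide the fate of a newly submitted query.

    Limits are checked only here, at admission time. Nothing in this
    sketch touches a query once it has started running.
    """
    if group.running < group.hard_concurrency_limit:
        group.running += 1
        return "run"
    if group.queued < group.max_queued:
        group.queued += 1
        return "queue"
    return "reject"


# Values taken from the datascience group of the example configuration.
datascience = Group("global.adhoc.datascience",
                    hard_concurrency_limit=2, max_queued=1)
decisions = [admit(datascience) for _ in range(4)]
print(decisions)  # ['run', 'run', 'queue', 'reject']
```

With a concurrency limit of 2 and a queue of 1, the first two queries start, the third waits and the fourth is rejected.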
To enable Resource Management in the Denodo Embedded MPP, set presto.coordinator.resourceGroups: true in the values.yaml file.
Example
The example configuration shown below is the one that is distributed (presto/conf/resource_groups.json, to be modified according to your needs). It contains several Resource Groups and five selectors that define which queries are executed in which Resource Group:
- The first selector matches queries from datascience and places them in the global.adhoc.datascience group.
- The second selector matches queries from bi and places them in the global.adhoc.bi group.
- The third selector matches queries from pipeline and places them in the global.pipeline group.
- The fourth selector matches queries from admin and places them in the admin group.
- The last selector is a catch-all for all queries that have not been matched into any group.
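The selector rules can be sketched in a few lines of Python. This is an illustration rather than the engine's implementation: select_group is a hypothetical helper, and the sketch assumes that selectors are tried in order, that the first match wins, and that the source value is treated as a regular expression that must match the whole source string.

```python
import re

# Selectors in the same order as in the example configuration.
SELECTORS = [
    {"source": "datascience", "group": "global.adhoc.datascience"},
    {"source": "bi", "group": "global.adhoc.bi"},
    {"source": "pipeline", "group": "global.pipeline"},
    {"source": "admin", "group": "admin"},
    {"group": "global"},  # no "source" key: catch-all
]


def select_group(source, selectors=SELECTORS):
    """Return the group of the first selector that matches the source."""
    for selector in selectors:
        pattern = selector.get("source")
        # Assumption: a selector without "source" matches every query,
        # and a pattern must match the complete source string.
        if pattern is None or re.fullmatch(pattern, source or ""):
            return selector["group"]
    return None  # no selector matched


print(select_group("bi"))        # global.adhoc.bi
print(select_group("reporting")) # global
```

Because the catch-all selector comes last, a query whose source matches none of the named selectors still ends up in the global group.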
Together, these selectors implement the following policy:
- The admin group can run up to 50 concurrent queries (hardConcurrencyLimit).
- For all other groups:
  - No more than 100 total queries may run concurrently (hardConcurrencyLimit).
  - Ad-hoc queries, such as queries from BI tools, can run up to 10 concurrent queries (hardConcurrencyLimit), and queries from Data Science tools can run up to 2 concurrent queries (hardConcurrencyLimit).
  - Non ad-hoc queries will run under the global.pipeline group, with a total concurrency of 45 (hardConcurrencyLimit). Queries are run in FIFO order.
  - All remaining queries are placed in the global group.
{
  "rootGroups": [
    {
      "name": "global",
      "softMemoryLimit": "80%",
      "hardConcurrencyLimit": 100,
      "maxQueued": 1000,
      "schedulingPolicy": "weighted",
      "jmxExport": true,
      "subGroups": [
        {
          "name": "adhoc",
          "softMemoryLimit": "10%",
          "hardConcurrencyLimit": 50,
          "maxQueued": 1,
          "schedulingWeight": 10,
          "subGroups": [
            {
              "name": "datascience",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 2,
              "maxQueued": 1,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
            },
            {
              "name": "bi",
              "softMemoryLimit": "10%",
              "hardConcurrencyLimit": 10,
              "maxQueued": 100,
              "schedulingWeight": 10,
              "schedulingPolicy": "weighted_fair"
            }
          ]
        },
        {
          "name": "pipeline",
          "softMemoryLimit": "80%",
          "hardConcurrencyLimit": 45,
          "maxQueued": 100,
          "schedulingWeight": 1,
          "jmxExport": true
        }
      ]
    },
    {
      "name": "admin",
      "softMemoryLimit": "100%",
      "hardConcurrencyLimit": 50,
      "maxQueued": 100,
      "schedulingPolicy": "query_priority",
      "jmxExport": true
    }
  ],
  "selectors": [
    {
      "source": "datascience",
      "group": "global.adhoc.datascience"
    },
    {
      "source": "bi",
      "group": "global.adhoc.bi"
    },
    {
      "source": "pipeline",
      "group": "global.pipeline"
    },
    {
      "source": "admin",
      "group": "admin"
    },
    {
      "group": "global"
    }
  ],
  "cpuQuotaPeriod": "1h"
}
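Before deploying a modified resource_groups.json, it can help to check that every selector points to a group that actually exists and to see which concurrency cap effectively applies to each group. The sketch below is one possible way to do that. The collect_limits helper is hypothetical, the embedded CONFIG is a trimmed copy of the example above (only names and hardConcurrencyLimit), and the sketch assumes that a running query counts against every ancestor group, so the tightest limit on the path is the effective one.

```python
# Trimmed copy of the example configuration: names and limits only.
CONFIG = {
    "rootGroups": [
        {"name": "global", "hardConcurrencyLimit": 100, "subGroups": [
            {"name": "adhoc", "hardConcurrencyLimit": 50, "subGroups": [
                {"name": "datascience", "hardConcurrencyLimit": 2},
                {"name": "bi", "hardConcurrencyLimit": 10},
            ]},
            {"name": "pipeline", "hardConcurrencyLimit": 45},
        ]},
        {"name": "admin", "hardConcurrencyLimit": 50},
    ],
    "selectors": [
        {"source": "datascience", "group": "global.adhoc.datascience"},
        {"source": "bi", "group": "global.adhoc.bi"},
        {"source": "pipeline", "group": "global.pipeline"},
        {"source": "admin", "group": "admin"},
        {"group": "global"},
    ],
}


def collect_limits(groups, parent_path="", parent_limits=()):
    """Map each fully qualified group name to its effective concurrency cap.

    Assumption: the effective cap is the smallest hardConcurrencyLimit
    found on the path from the root group down to the group itself.
    """
    limits = {}
    for group in groups:
        path = f"{parent_path}.{group['name']}" if parent_path else group["name"]
        chain = parent_limits + (group["hardConcurrencyLimit"],)
        limits[path] = min(chain)
        limits.update(collect_limits(group.get("subGroups", []), path, chain))
    return limits


limits = collect_limits(CONFIG["rootGroups"])

# Selectors whose target group is not defined anywhere in rootGroups.
missing = [s["group"] for s in CONFIG["selectors"] if s["group"] not in limits]

for path, cap in limits.items():
    print(f"{path}: at most {cap} concurrent queries")
print("selectors with undefined groups:", missing)
```

To check a real file, CONFIG could be replaced with the result of json.load on the file's contents.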
For Denodo to take advantage of the Denodo Embedded MPP’s Resource Group mechanism, you need to create different data sources in Denodo for the same Denodo Embedded MPP and, in each one, select the corresponding Resource Group (source) using the applicationNamePrefix driver property.
embedded_mpp data source driver properties
