Autoscaling
The presto.autoscaling section of the values.yaml configures the horizontal autoscaling of the Denodo Embedded MPP.
We support autoscaling/v2. Horizontal scaling means that the response to an increase in
load is to deploy more pod replicas. If the load decreases, and the number of pods is greater than the configured minimum,
HorizontalPodAutoscaler (HPA) instructs the workload resource to scale down.
Autoscaling is disabled by default, and the scaleDown and scaleUp behaviors are optional:
presto:
  autoscaling:
    enabled: false
    maxReplicas: 20
    targetCPUUtilizationPercentage: 80
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 300
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
      scaleUp:
        stabilizationWindowSeconds: 0
        policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
        selectPolicy: Max
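For reference, the HPA computes the desired number of replicas from the observed metric using the standard Kubernetes formula, desired = ceil(currentReplicas × currentMetric / targetMetric), then clamps the result to the configured minimum and maximum. A minimal sketch in Python (the function name is illustrative, not part of any API):

```python
import math

def desired_replicas(current_replicas: int,
                     current_cpu_utilization: float,
                     target_cpu_utilization: float) -> int:
    """Standard HorizontalPodAutoscaler formula:
    desired = ceil(currentReplicas * currentMetric / targetMetric).
    The real HPA additionally clamps the result to min/maxReplicas
    and applies the configured scaleUp/scaleDown behavior policies.
    """
    return math.ceil(current_replicas * current_cpu_utilization
                     / target_cpu_utilization)

# Example: 4 Workers averaging 120% CPU against an 80% target -> 6 Workers.
print(desired_replicas(4, 120, 80))  # 6
```

The behavior policies above then rate-limit how fast the HPA may move toward that desired count (e.g. at most 4 pods or 100% of the current replicas every 15 seconds on scale-up).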
There are two requirements to ensure the HorizontalPodAutoscaler (HPA) works:
- A monitoring tool is installed that provides metrics to Kubernetes: Metrics Server, Prometheus, etc.
- The Worker has CPU resource requests, presto.worker.resources, defined in the values.yaml file. If the resource requests are not configured, the autoscaler (HPA) will not take any action. Note that resources are not configured by default, as we leave this configuration as a choice for the Kubernetes cluster administrator.

Consider, for example, presto.worker.resources.requests.cpu. This setting represents the amount of CPU reserved for the pod. We recommend setting it to 80% of the total number of CPUs of a Worker node as a starting point. For 32-core nodes, you can set the value to 25.6 or 25600m, each of which represents 25.6 cores:

MPP Worker resources

presto:
  worker:
    resources:
      requests:
        cpu: 25600m

This can then be adjusted as needed. The autoscaler will use this value along with targetCPUUtilizationPercentage to determine if a new Worker is needed.
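The sizing rule above (80% of the node's CPUs, expressed in millicores, where 1 core = 1000m) can be checked with simple arithmetic; a hypothetical helper, not part of the chart:

```python
def cpu_request_millicores(node_cores: int, fraction: float = 0.8) -> str:
    """Reserve a fraction of the node's cores for the Worker pod,
    expressed in Kubernetes millicore notation (1 core = 1000m)."""
    return f"{int(node_cores * fraction * 1000)}m"

# 32-core node at the recommended 80% starting point.
print(cpu_request_millicores(32))  # 25600m, i.e. 25.6 cores
```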
You can check the current status of the HorizontalPodAutoscaler running:
kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
presto-worker Deployment/presto-worker 22%/80% 2 6 2
It is strongly recommended to scale down the cluster at least during off-peak hours; otherwise, unnecessary costs may be incurred.
Graceful Shutdown
When using autoscaling, we strongly recommend enabling graceful shutdown. This helps minimize disruptions during scale-down operations, preserves the integrity of ongoing queries, and maintains a high level of service reliability.
If graceful shutdown is enabled, workers will no longer receive new tasks when shutting down. Instead, they’ll wait for the configured grace period to allow time for the completion of tasks already in progress.
The grace period should be configured based on query execution times to ensure that even long-running queries have a chance to complete.
To enable graceful shutdown, set the presto.worker.gracefulShutdown.enabled property to true and configure presto.worker.gracefulShutdown.gracePeriodSeconds appropriately.
It is also necessary to set presto.worker.terminationGracePeriodSeconds to at least twice the configured gracePeriodSeconds.
presto:
  worker:
    gracefulShutdown:
      enabled: true
      gracePeriodSeconds: 120
    terminationGracePeriodSeconds: 240
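The relationship between the two settings can be expressed as a simple sanity check; a sketch whose parameter names mirror the values.yaml keys (they are not an actual API):

```python
def graceful_shutdown_is_valid(grace_period_seconds: int,
                               termination_grace_period_seconds: int) -> bool:
    """terminationGracePeriodSeconds should be at least twice
    gracefulShutdown.gracePeriodSeconds, so Kubernetes does not
    kill the pod while the Worker is still draining in-flight tasks."""
    return termination_grace_period_seconds >= 2 * grace_period_seconds

# The example configuration above: 240 >= 2 * 120.
print(graceful_shutdown_is_valid(120, 240))  # True
```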
Node-Based Autoscaling
It is strongly recommended to use pod-based autoscaling together with node-based autoscaling, to coordinate the scaling of the pods with the behavior of the nodes in the cluster, since our recommendation is to use a single node for each MPP Worker. This way, when you need to scale up, the cluster autoscaler can add new nodes, and when scaling down, it can shut down unneeded nodes.
For node-based autoscaling in Amazon EKS see Configuring an Autoscaling Denodo Embedded MPP Cluster in EKS.
For node-based autoscaling in Azure Kubernetes Service see Configuring an Autoscaling Denodo Embedded MPP Cluster in AKS.
