Sizing recommendations for the Embedded MPP
The size and resources of the MPP cluster have a big impact on overall performance, so it is important to choose the best configuration for your scenario. Optimal sizing varies widely depending on the characteristics of your workloads, so we recommend performing your own sizing tests using data and queries as close as possible to your real scenario. When doing so, take into account the following considerations:
If you are configuring N Presto workers, create your cluster with N+2 nodes so that each worker runs on its own node and there is extra room for the coordinator and the metastore.
In general, adding more workers will make your queries run faster, because the operations that Presto can run in parallel (for instance, reading the Parquet files of a table) will be distributed among more nodes.
Increasing the number of CPU cores will also speed up queries, especially under concurrency.
When applying the previous considerations, make sure the total number of CPU cores in your Presto cluster does not exceed the maximum allowed by your Denodo license.
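As an illustration, the node count and the license check above reduce to simple arithmetic. The license limit used here is a hypothetical value, not one taken from an actual Denodo license:

```python
def cluster_size(workers: int) -> int:
    """Nodes needed so each worker gets its own node, plus one node
    for the coordinator and one for the metastore (N+2)."""
    return workers + 2

def total_worker_cores(workers: int, cores_per_node: int) -> int:
    """Total CPU cores consumed by the Presto workers."""
    return workers * cores_per_node

# Hypothetical scenario: 8 workers on 32-core nodes,
# license capped at 512 cores (illustrative figure only).
workers, cores_per_node, license_max_cores = 8, 32, 512
assert total_worker_cores(workers, cores_per_node) <= license_max_cores
print(cluster_size(workers))                         # 10
print(total_worker_cores(workers, cores_per_node))   # 256
```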
Ensure that your cluster has plenty of memory: enough memory must be available to prevent Presto from spilling data to disk or crashing under concurrency.
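For reference, Presto's per-query memory limits are controlled through properties in the coordinator and worker configuration. The values below are purely illustrative assumptions, not Denodo recommendations; tune them for your node size and workload:

```properties
# Illustrative values only (assumed, not Denodo recommendations).
# Total distributed memory a single query may use across the cluster:
query.max-memory=400GB
# Memory a single query may use on a single worker node:
query.max-memory-per-node=40GB
```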
The recommendations below are based on our own sizing tests using the TPC-DS benchmark at scale factor 100. They should therefore be informative for workloads composed of analytical queries over hundreds of millions of rows or more:
Start with at least 8 workers running on nodes with at least 128GB of memory and 16-32 cores. For example, in Amazon Web Services you can start with m5.8xlarge or r5.4xlarge nodes.
If you experience crashes or poor performance, try doubling the size of your cluster. With queries managing hundreds of millions of rows, even isolated executions with no concurrency will typically run much faster than on smaller clusters, and under concurrency, smaller clusters will probably not be able to provide adequate performance. When you double the number of workers, you can expect to see:
No concurrency: better performance for most queries, but not dramatic improvements; in many cases you will see gains of 5-15%.
Moderate concurrency (e.g. a sustained load of 5 concurrent queries): average execution times can decrease by 30-40% each time you double the number of workers, although the effect weakens with every increase.
Medium concurrency (e.g. 10-15 sustained concurrent queries): the effect of doubling the number of workers is stronger and more sustained. With 8 workers (128GB and 32 cores each), a significant number of queries can start taking minutes, so you should probably use at least 16 workers.
Higher concurrency: start with 32 workers (128GB and 32 cores each) and scale your cluster according to the expected workload.
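The starting points above can be summarized in a small helper. The thresholds below simply restate the rules of thumb in this section for 128GB, 32-core worker nodes; they are not a substitute for your own sizing tests:

```python
def suggested_workers(concurrent_queries: int) -> int:
    """Suggested starting number of workers for a given sustained
    concurrency level, restating the guidance above. Assumes worker
    nodes with 128GB of memory and 32 cores; validate with your own
    sizing tests."""
    if concurrent_queries <= 5:       # no or moderate concurrency
        return 8
    if concurrent_queries <= 15:      # medium concurrency
        return 16
    return 32                         # higher concurrency

print(suggested_workers(1), suggested_workers(10), suggested_workers(20))
# 8 16 32
```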
When increasing the size of the cluster does not improve performance, try using nodes with more memory and CPU. Take into account that some operations cannot run in parallel (like an aggregation consolidating distributed results) and may require more resources, especially under concurrency. For instance, in AWS it is typically a good idea to use memory-optimized nodes, like those of the r5 family, instead of general-purpose nodes, like the m5 family. That said, memory-optimized nodes can be significantly more expensive and the effects vary widely across workloads, so you may want to perform specific tests for your scenario. In general, the benefit grows with concurrency, but it can range from small improvements to executions running 3 times faster under moderate-high concurrency.
Once you have achieved the desired performance, you can configure autoscaling so that the cluster only uses the resources it needs:
You can configure pod autoscaling as described in the Autoscale section of the Presto cluster on Kubernetes tool Guide.
You can configure autoscaling of the cluster nodes by following the instructions of your specific cloud provider. For instance, if you are using AWS, see the Autoscaling section of the EKS User Guide.
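As an orientation, pod autoscaling in Kubernetes is typically expressed as a HorizontalPodAutoscaler. The fragment below is a generic sketch: the `presto-worker` deployment name, the replica bounds, and the CPU target are assumptions for illustration, and the Autoscale section of the tool's guide remains the authoritative reference:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: presto-worker          # assumed name; match your actual deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: presto-worker        # assumed name of the worker deployment
  minReplicas: 8               # illustrative: the starting point suggested above
  maxReplicas: 32
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75   # illustrative CPU target
```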
If increasing the size of the cluster and the resources of its nodes does not improve performance, check the latency of the network between Denodo and Presto, and between Presto and the object storage you are accessing.
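A quick way to sample network latency is to time how long a TCP connection takes to open. The sketch below measures a locally opened socket so that it is self-contained; in practice you would point it at your Presto coordinator and object storage endpoints, whose hostnames and ports below are placeholders:

```python
import socket
import time

def tcp_connect_latency_ms(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the time in milliseconds taken to establish a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        return (time.perf_counter() - start) * 1000.0

# Self-contained demo against a local listening socket. In a real check,
# use placeholders like ("presto-coordinator.example.com", 8080) or your
# object storage endpoint on port 443 instead.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()
print(f"{tcp_connect_latency_ms(host, port):.2f} ms")
server.close()
```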