
Denodo Presto Cluster on Kubernetes - User Manual



Overview

Presto is a high-performance, distributed SQL query engine for big data. It was originally developed by Facebook and is now open source.

The main purpose of deploying Presto on Kubernetes is to access huge datasets in an efficient manner, using only SQL.

The steps to follow are:

  1. Store your dataset in S3.

  2. Deploy Presto on Kubernetes.

  3. Expose S3 data as Hive tables in Presto.

  4. Configure a Presto data source in the Denodo Platform.

  5. Run SQL queries in the Denodo Platform against Presto.

This document focuses on datasets stored in AWS S3, as an example, but the Presto Cluster can read data from many distributed storage systems, such as:

  • HDFS
  • Azure Blob Storage
  • Azure Data Lake Storage Gen1 and Gen2
  • Google Cloud Storage
  • S3-compatible storage

Since version 20230301, Denodo Presto Cluster on Kubernetes includes a Presto that has been customized to behave as the Denodo Platform's embedded Massively Parallel Processing (MPP) to accelerate queries.

This feature requires the Denodo subscription bundle Enterprise Plus and the Denodo 8.0 20230103 Update.

Presto on Kubernetes

Architecture

The architecture of the Presto cluster looks like this:

                                        Presto Cluster architecture

Presto uses its Hive Connector to access datasets stored in S3. The Hive Connector relies on the Hive Metastore to map S3 data files to tables. The Hive Metastore stores this metadata catalog in an RDBMS: by default, a PostgreSQL instance internal to the cluster, although an external database provided by the user can also be used.

When the internal PostgreSQL is used, the cluster uses a Kubernetes Persistent Volume to ensure the persistence of the catalog metadata. This way, all the services of the cluster (Presto, Hive Metastore and PostgreSQL) can go down temporarily without losing any of the table definitions that Presto relies upon.

Kubernetes Cluster Infrastructure

The Denodo Presto cluster is designed to run on Kubernetes, locally or in the cloud with:

  • Amazon Elastic Container Service for Kubernetes (Amazon EKS)

  • Azure Kubernetes Service (AKS)

  • Red Hat OpenShift

  • Google Kubernetes Engine (GKE)

It is deployed using Helm.

Docker Desktop

You can run Kubernetes in Docker Desktop for testing purposes, as it includes a standalone Kubernetes server (single-node cluster) and a client. To enable it, launch Docker Desktop, go to Settings > Kubernetes and choose Enable Kubernetes.

With Docker Desktop you can also push your images to a local registry, just as you would with the registries of cloud providers like Amazon, Azure, etc. To run this local registry, execute:

$ docker run -d -p 5000:5000 --name registry registry:2

The URI of this local registry is the default value of the repository variable in the values.yaml file of the helm chart in the distribution:

repository: "localhost:5000"

Amazon EKS

Amazon EKS (Elastic Container Service for Kubernetes) is a managed Kubernetes service that allows you to run Kubernetes on AWS.

These are the steps for setting up an EKS cluster:

  1. Create an Amazon EKS cluster in the EKS console or with the eksctl command-line tool.

  2. Add a Node Group to the EKS cluster. By doing so, you are creating the Kubernetes worker nodes: the EC2 instances where the containers will run.

  3. When the cluster is ready, configure kubectl to communicate with your cluster using the AWS CLI.

  4. Set up the Amazon Elastic Container Registry (ECR). This registry will contain the images used to create the pods.

✸ Before deploying the Presto Cluster you must create three repositories in Amazon ECR and push the images (distributed in the images folder) to them: prestocluster-postgresql, prestocluster-hive-metastore and prestocluster-presto.

The Amazon Elastic Container Registry URI for the repository variable in the values.yaml file of the helm chart has the following format:

repository: "<aws_account_id>.dkr.ecr.<region>.amazonaws.com"

  5. Deploy Denodo Presto Cluster on the Amazon EKS cluster using the cluster.sh script of the distribution.
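As a reference, a minimal sketch of steps 3 and 4 using the AWS CLI; the region, cluster name and account id placeholders are assumptions that must be replaced with your own values:

$ aws eks update-kubeconfig --region <region> --name <cluster_name>

$ aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com

$ aws ecr create-repository --repository-name prestocluster-postgresql

$ aws ecr create-repository --repository-name prestocluster-hive-metastore

$ aws ecr create-repository --repository-name prestocluster-presto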

For detailed information see https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html.

Azure Kubernetes Service

Azure Kubernetes Service (AKS) is a managed Kubernetes service that allows you to run Kubernetes on Microsoft Azure.

The steps for setting up an AKS cluster are the following:

  1. Create an AKS cluster in the Azure portal or with the Azure CLI.

  2. When the cluster is ready, configure kubectl to communicate with the cluster using the Azure CLI.

  3. Set up the Azure Container Registry (ACR). This registry should contain the images used to create the pods, which you can find in the images folder of the distribution.

The Azure Container Registry URI for the repository variable in the values.yaml file of the helm chart has the following format:

repository: "<registry>.azurecr.io"

  4. Deploy Denodo Presto Cluster on the AKS cluster using the cluster.sh script of the distribution.
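As a reference, a minimal sketch of steps 2 and 3 using the Azure CLI; the resource group, cluster and registry names are placeholders that must be replaced with your own values:

$ az aks get-credentials --resource-group <resource_group> --name <cluster_name>

$ az acr login --name <registry>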

For detailed information see https://docs.microsoft.com/en-us/azure/aks/tutorial-kubernetes-prepare-app

Red Hat OpenShift

Red Hat OpenShift is an enterprise-ready Kubernetes container platform.

The steps for setting up an OpenShift cluster are the following:

  1. Create an OpenShift cluster.

  2. When the cluster is ready, configure kubectl to communicate with the cluster. Installing the OpenShift command-line tool, oc, to interact with your Kubernetes cluster is also recommended.

  3. Create an OpenShift project, which creates a Kubernetes namespace in which to deploy the Kubernetes resources.

  4. Set up the OpenShift Container Registry (OCR). This registry should contain the images used to create the pods, which you can find in the images folder of the distribution.

The OpenShift Container Registry URI for the repository variable in the values.yaml file of the helm chart has the following format:

repository: "default-route-openshift-image-registry.<cluster_name>.<base_domain>/<openshift_project>"

  5. Deploy Denodo Presto Cluster on the OpenShift cluster using the cluster.sh script of the distribution.
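As a reference, a minimal sketch of steps 2 to 4 using the oc CLI and Docker; it assumes that the default image registry route is exposed, and the token, cluster and domain placeholders must be replaced with your own values:

$ oc login --token=<token> --server=https://api.<cluster_name>.<base_domain>:6443

$ oc new-project <openshift_project>

$ docker login -u $(oc whoami) -p $(oc whoami -t) default-route-openshift-image-registry.<cluster_name>.<base_domain>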

For detailed information see https://docs.openshift.com/

Configuration

Before deploying the Presto Cluster you should check the following configuration.

General: values.yaml

The following parameters can be tuned in the prestocluster/values.yaml file:

  • image.repository: the Container Registry where the Presto, Hive Metastore and PostgreSQL images reside.

  • presto.numWorkers: the number of Presto workers the cluster will have. It should be configured according to your environment.

  • presto.cpusPerNode: number of physical cores of each node of the cluster. It should be configured according to your environment.

  • presto.memoryPerNode: total memory in GB of each node of the cluster. It should be configured according to your environment.

  • presto.service.annotations: Presto service annotations. A typical use case is to configure the Presto service to use a static IP address instead of an ephemeral IP address assigned by the load balancer.

  • presto.service.loadBalancerIP: an alternative to presto.service.annotations for specifying a static IP address for the Presto service. Note that loadBalancerIP has been deprecated since Kubernetes v1.24; users are encouraged to use implementation-specific annotations (AKS, EKS, etc.) through presto.service.annotations instead.

  • presto.service.loadBalancerSourceRanges: List of IPs to restrict traffic through the load balancer.

  • presto.coordinator.passwordAuth.prestoPassword: pr3st% by default. You can customize it here or provide a new one during the deployment process with the cluster.sh deploy --presto-password option.

  • presto.coordinator.resourceGroups: true or false. Flag to enable resource management in Presto.

Presto makes use of Resource groups in order to organize how different workloads are prioritized.

See Resource Groups section for more information.

  • securityContext and containerSecurityContext:
  • presto.coordinator.securityContext/containerSecurityContext: Security context for the Presto Coordinator
  • presto.worker.securityContext/containerSecurityContext: Security context for the Presto Worker

The default configuration prevents Presto processes from running as root, ensures that no child process can gain more privileges than its parent, and drops all permissions to perform kernel-level calls.

  • resources:
  • presto.coordinator.resources: Resources definition for the Presto Coordinator
  • presto.worker.resources: Resources definition for the Presto Worker

Kubernetes schedules pods across nodes based on the resource requests and limits of each container. If a container requests certain amounts of CPU and/or memory, Kubernetes will only schedule its pod on a node that can guarantee those resources. Limits, on the other hand, make sure a container never goes above a certain value.

Notice that you cannot set requests that are larger than the resources provided by your nodes.

Notice also that resources are commented out by default, as we leave this setting as a conscious choice for the cluster administrator.

  • topologySpreadConstraints:
  • presto.coordinator.topologySpreadConstraints: Topology constraints for the Presto Coordinator
  • presto.worker.topologySpreadConstraints: Topology constraints for the Presto Worker

Kubernetes has mechanisms other than resources for scheduling pods across nodes. Since Kubernetes 1.19, you can use topology spread constraints to control how your pods are distributed across your cluster based on regions, zones, nodes, etc.

The default configuration distributes the Coordinator and Worker pods as evenly as possible across instances.
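As an illustration, a minimal sketch of these settings in prestocluster/values.yaml; the request/limit values are arbitrary examples and the app: presto-worker label is an assumption that must match the labels actually used by the chart:

presto:
  worker:
    resources:
      requests:
        cpu: "4"
        memory: 16Gi
      limits:
        memory: 16Gi
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: presto-worker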

  • server: Denodo server uri with the format: "//<ip_or_hostname>:<port>/admin_denodo_mpp". Make sure the Denodo server is accessible from the cluster.
  • i18n: i18n configuration of the connection with Denodo
  • user: Denodo user, denodo_mpp_user by default.
  • password: Denodo user password, password for denodo_mpp_user by default.

It is recommended to specify the password encrypted, to avoid entering it as plain text.

If the password is encrypted, prefix it with encrypted:, e.g. "encrypted:<encrypted_password>".

You should encrypt the password using the <DENODO_HOME>/bin/encrypt_password.sh script.

  • queryTimeout: Maximum time, in milliseconds, Presto will wait for a Denodo query to finish. If it is not set, the default value is 900000 ms. If 0, Presto will wait indefinitely until the Denodo query finishes.
  • ssl: Whether SSL is enabled in the Denodo server
  • trustStore: You need to configure this trustStore parameter only if the certificate used by the Denodo server is signed by a private authority.

In this case, copy the trust store file that contains the Denodo SSL certificate, to the prestocluster/presto directory and set the file name here in the trustStore parameter.

  • trustStorePassword: Password of the trust store, only if the trustStore parameter is configured.

  • presto.autoscaling: a HorizontalPodAutoscaler template is distributed, but it is disabled by default through the presto.autoscaling.enabled parameter.

Horizontal scaling means that the response to increased load is to deploy more Pods. If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource to scale back down.

If you want to autoscale based on a resource's utilization as a percentage, you must specify requests for that resource in the presto.worker.resources parameter.

See Autoscale section for more information.

  • postgresql.enabled: true or false. Flag to use an internal or an external database for the Hive Metastore.

By default, a PostgreSQL internal to the cluster will be used as the RDBMS of the Hive Metastore.

But if you want to use a database external to the cluster, you can configure it using the metastore.connectionUrl and metastore.connectionDriverName parameters.

In addition, there is an initialization script for each supported external RDBMS (MySQL, Oracle, PostgreSQL or SQL Server), included in the prestocluster/hive-metastore/scripts folder, that should be run on that database before deploying the cluster.

  • metastore.connectionUrl: JDBC connection string of the RDBMS of the Hive Metastore, which can be:
  • the PostgreSQL internal to the cluster, the default one:

                    jdbc:postgresql://postgresql:5432/metastore

  • an external MySQL
  • an external Oracle
  • an external PostgreSQL
  • an external SQL Server

  • metastore.connectionDriverName: JDBC Driver class name to connect with the RDBMS of the Hive Metastore, which can be:
  • org.postgresql.Driver for PostgreSQL, the default one
  • org.mariadb.jdbc.Driver for MySQL
  • com.microsoft.sqlserver.jdbc.SQLServerDriver for SQL Server
  • oracle.jdbc.OracleDriver for Oracle
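For example, a minimal values.yaml sketch for pointing the Hive Metastore to an external PostgreSQL; the host name is a placeholder and must be replaced with your own value:

postgresql:
  enabled: false

metastore:
  connectionUrl: "jdbc:postgresql://<external_host>:5432/metastore"
  connectionDriverName: "org.postgresql.Driver"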

  • metastore.metastorepvclaim.storage: storage size requested when using the internal PostgreSQL of the cluster. This PostgreSQL is in charge of the metadata persistence, that is, the Presto table definitions.

Default size is 2Gi, but it should be configured according to your environment.

  • metastore.affinity: Inter-pod affinity allows you to schedule pods based on their relationship to other pods, such as colocating pods that are part of codependent services.

Using the default configuration of metastore.affinity, the Hive Metastore deployment will run on the same machine as the internal PostgreSQL to improve performance and avoid network latency issues and connection failures.

  • metastore.securityContext and containerSecurityContext: Security context for Hive Metastore.

The default configuration prevents the Hive Metastore process from running as root, ensures that no child process can gain more privileges than its parent, and drops all permissions to perform kernel-level calls.

  • postgresql.affinity: Inter-pod affinity allows you to schedule pods based on their relationship to other pods, such as colocating pods that are part of codependent services.

Using the default configuration of postgresql.affinity, the internal PostgreSQL deployment will run on the same machine as the Hive Metastore to improve performance and avoid network latency issues and connection failures.

  • postgresql.securityContext and containerSecurityContext: Security context for PostgreSQL.

The default configuration prevents the PostgreSQL process from running as root, ensures that no child process can gain more privileges than its parent, and drops all permissions to perform kernel-level calls.

Persistent Storage

When the internal PostgreSQL is used, the cluster uses a Kubernetes Persistent Volume to ensure the persistence of the metadata.

However, the Presto Cluster does not include a Persistent Volume object, since the user instantiating it may not have permission to create Persistent Volumes.

Therefore, there are two options:

  1. Use the definition as distributed, which causes a Persistent Volume to be automatically provisioned for the Presto Cluster with the default StorageClass.

Many cluster environments have a default StorageClass installed, or the administrators can create their own default StorageClass.

  2. Provide a storage class name in the postgresql.pvClaim.storageClassName field of the values.yaml file.
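For example, a minimal values.yaml sketch for the second option; the storage class name gp2 is just an illustrative AWS EBS class and should be replaced with one available in your cluster:

postgresql:
  pvClaim:
    storageClassName: "gp2"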

SSL/TLS Configuration

The Java Keystore, prestocluster/presto/conf/presto.jks, contains a self-signed certificate, certs/certificate.crt, that is distributed ONLY for testing purposes. In production, you are encouraged to use a certificate issued by a CA and import it, together with its private key, into the Java Keystore of the Presto Cluster, replacing the existing one: prestocluster/presto/conf/presto.jks (password sslpassphrase).

The self-signed certificate, certs/certificate.crt, accepts localhost and presto-denodo as Presto host names:

  • localhost: for tests performed in a local Kubernetes cluster.

  • presto-denodo: for tests performed in a Kubernetes cluster on the cloud.

In the latter case, you have to add an entry to the hosts file of the machine where the Denodo server is running, mapping presto-denodo to the IP that appears as the EXTERNAL-IP of the Presto service.
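As a reference, a minimal sketch of how a CA-issued certificate could be imported into the keystore; the certificate.pem and private.key file names are assumptions, and sslpassphrase is the keystore password mentioned above:

$ openssl pkcs12 -export -in certificate.pem -inkey private.key -name presto -out presto.p12

$ keytool -importkeystore -srckeystore presto.p12 -srcstoretype PKCS12 -destkeystore prestocluster/presto/conf/presto.jks -deststorepass sslpassphrase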

Deployment

The cluster.sh script, in the root folder of the distribution, automates the whole deployment process.

✸ To run cluster.sh on Windows you need a Bash-compatible shell installed, like Cygwin or Git Bash, or you can use the Windows Subsystem for Linux (WSL).

Usage: cluster.sh delete | deploy OPTIONS | register

OPTIONS:

--presto-password        Custom password for the user 'presto'

--abfs-storage-account   Azure Data Lake Gen 2 Storage Account

--abfs-storage-key       Azure Data Lake Gen 2 Storage Key

--adl-client-id          Azure Data Lake Gen 1 Client Id

--adl-client-secret      Azure Data Lake Gen 1 Client Secret

--adl-tenant-id          Azure Data Lake Gen 1 Tenant Id

--s3-access-key          AWS Access Key ID

--s3-secret-access       AWS Secret Access Key

--wasb-storage-account   Azure Blob Storage Account

--wasb-storage-key       Azure Blob Storage Key

--credstore-password     Password to protect the encrypted credentials file

--metastore-password     Password of the external RDBMS for the Hive Metastore

--register               Create an embedded MPP data source at Denodo pointing to this Presto Cluster

--register-user          Denodo user that will create the embedded MPP database 'admin_denodo_mpp', data source 'embedded_mpp' and user 'denodo_mpp_user' at Denodo. User must have CONNECT privileges to 'admin' database

--register-password      Denodo password for the user that will create the embedded MPP database 'admin_denodo_mpp', data source 'embedded_mpp' and user 'denodo_mpp_user' at Denodo

The cluster.sh script has four prerequisites:

  1. Log in to the Container Registry (ECR, ACR, GCR, OCR, …).

  2. Push the Presto, Hive Metastore and, optionally, PostgreSQL images to the Container Registry. These images can be found in the images folder of the distribution.

$ docker load < prestocluster-postgresql-image-<version>.tar.gz

$ docker tag prestocluster-postgresql:<version> <repository>/prestocluster-postgresql:<version>

$ docker push <repository>/prestocluster-postgresql:<version>

$ docker load < prestocluster-hive-metastore-image-<version>.tar.gz

$ docker tag prestocluster-hive-metastore:<version> <repository>/prestocluster-hive-metastore:<version>

$ docker push <repository>/prestocluster-hive-metastore:<version>

$ docker load < prestocluster-presto-image-<version>.tar.gz

$ docker tag prestocluster-presto:<version> <repository>/prestocluster-presto:<version>

$ docker push <repository>/prestocluster-presto:<version>

  3. Configure the connection to the Kubernetes cluster in the kubeconfig file.

  4. Install Helm, the package manager for Kubernetes.

Running on Windows

If you are running cluster.sh on Windows you need to apply some extra configuration.

Check if the environment variable HADOOP_HOME is defined on this computer. To see the list of environment variables, open a command line and execute SET.

If HADOOP_HOME is undefined:

  1. Create a directory. For example, <DENODO_HOME>\hadoop_win_utils.

  2. Create a directory called bin within the new directory. For example, <DENODO_HOME>\hadoop_win_utils\bin.

  3. Define the environment variable HADOOP_HOME to point to <DENODO_HOME>\hadoop_win_utils.

  4. Copy the contents of the directory <DENODO_HOME>\dll\vdp\winutils to %HADOOP_HOME%\bin.
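As a reference, a minimal sketch of these steps from a Windows command prompt; it assumes a DENODO_HOME environment variable is defined, otherwise replace %DENODO_HOME% with the actual installation path:

> mkdir "%DENODO_HOME%\hadoop_win_utils\bin"

> setx HADOOP_HOME "%DENODO_HOME%\hadoop_win_utils"

> xcopy "%DENODO_HOME%\dll\vdp\winutils\*" "%DENODO_HOME%\hadoop_win_utils\bin\" /E /Y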

Let's see now how to deploy Presto on Kubernetes!

The Presto Cluster is distributed as a Helm chart. However, we still provide the cluster.sh script on top of Helm, which transparently manages the encryption of all the credentials supplied by the user. For this, the Hadoop credential provider framework is used to create a keystore file, creds.jceks, so that credentials (e.g. the S3 Access Key Id and the S3 Secret Access Key) are never stored as clear values.

cluster.sh deploy [OPTIONS]:  deploys a cluster that accesses datasets in S3, Azure Blob Storage, Azure Data Lake Gen 1 and/or Azure Data Lake Gen 2, depending on the credentials specified by the user using the OPTIONS available.

COMMON OPTIONS

--presto-password: This option configures the password of the presto user, instead of using the default one: pr3st%

If this password is not specified in the command line, cluster.sh deploy will prompt for it, keeping passwords out of the bash history.

--register: With this option, the deployment process includes a final step that creates a new special data source in Denodo called 'embedded_mpp' and configures the Denodo query optimizer to use Presto as its embedded Massively Parallel Processing (MPP) engine to accelerate queries.

This feature requires the Denodo subscription bundle Enterprise Plus and the Denodo 8.0 20230103 Update.

--credstore-password: Password to protect the encrypted credentials file, creds.jceks, that will store all the credentials supplied by the user.

If this password is not specified in the command line, cluster.sh deploy will prompt for it, keeping passwords out of the bash history.

--metastore-password: Password of the external RDBMS for the Hive Metastore. Optional: by default, a PostgreSQL internal to the cluster will be used as the RDBMS of the Hive Metastore.

S3 datasets

There are two ways of accessing the S3 datasets:

  • The recommended one: no AWS credentials are required by the cluster.sh deploy command. Use this option when the Presto Cluster runs in EC2, because it will use the instance profile credentials:

cluster.sh deploy --credstore-password xxx

  • Supply these options to the cluster.sh script:

  • --s3-access-key: the Access Key Id

  • --s3-secret-access: the Secret Access Key

If this secret is not specified in the command line, cluster.sh deploy will prompt for it, keeping secrets out of the bash history

cluster.sh deploy --s3-access-key xxx --s3-secret-access yyy --credstore-password zzz

You should also check the repository variable in the prestocluster/values.yaml file to point to a valid Container Registry.

The Amazon Elastic Container Registry URI has the following format:

repository: "<aws_account_id>.dkr.ecr.<region>.amazonaws.com"

The OpenShift Internal Registry URI has the following format:

repository: "default-route-openshift-image-registry.<cluster_name>.<base_domain>/<openshift_project>"

S3-compatible storage

S3-compatible storage allows access to and management of the data it stores over an S3-compliant interface, e.g. MinIO, IBM Cloud Object Storage, Dell EMC ECS, etc.

In order to access datasets stored in S3-compatible storage, you should configure the following parameter in the prestocluster/values.yaml file:

  • presto.hive.s3Endpoint: The S3-compatible storage endpoint server. It can be found in the documentation of the S3-compatible storage.
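For example, a minimal values.yaml sketch; the endpoint shown is just an illustrative MinIO-style URL and must be replaced with the endpoint of your storage:

presto:
  hive:
    s3Endpoint: "https://minio.example.com:9000"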

Then, supply these options to the cluster.sh script:

  • --s3-access-key: the Access Key Id

  • --s3-secret-access: the Secret Access Key

If this secret is not specified in the command line, cluster.sh deploy will prompt for it, keeping secrets out of the bash history

cluster.sh deploy --s3-access-key xxx --s3-secret-access yyy --credstore-password zzz

Glue Data Catalog

If you already have a Glue Data Catalog containing table definitions that you want to access from the Denodo Presto Cluster, you can use the Glue Data Catalog as the cluster metastore or as an additional metastore.

For this, you have to manually create a .properties file for the new catalog in prestocluster/presto/conf/catalog/ with the content below.

For Hive tables:

connector.name=hive-hadoop2

hive.metastore=glue

# AWS region of the Glue Catalog

hive.metastore.glue.region=

# The ID of the Glue Catalog in which the metadata database resides

hive.metastore.glue.catalogid=

# Access Key and Secret Key for Glue

# Credentials and core-site.xml are not required when the cluster runs in EC2

# because it will use the instance profile credentials

hive.metastore.glue.aws-access-key=

hive.metastore.glue.aws-secret-key=

hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml

#

hive.parquet.use-column-names=true

                                        glue_hive.properties

For Iceberg tables:

connector.name=iceberg

iceberg.catalog.type=HIVE

hive.metastore=glue

# AWS region of the Glue Catalog

hive.metastore.glue.region=

# The ID of the Glue Catalog in which the metadata database resides

hive.metastore.glue.catalogid=

# Access Key and Secret Key for Glue

# Credentials and core-site.xml are not required when the cluster runs in EC2

# because it will use the instance profile credentials

hive.metastore.glue.aws-access-key=

hive.metastore.glue.aws-secret-key=

hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml

#

hive.parquet.use-column-names=true

                                        glue_iceberg.properties

Azure Blob Storage datasets

Supply these options to the cluster.sh script:

  • --wasb-storage-account: the name of the Storage Account

  • --wasb-storage-key: the access key that protects access to your Storage Account

If this access key is not specified in the command line, cluster.sh deploy will prompt for it, keeping access keys out of the bash history

cluster.sh deploy --wasb-storage-account xxx --wasb-storage-key yyy --credstore-password zzz

You should also check the repository variable in the prestocluster/values.yaml file to point to a valid Container Registry.

The Azure Container Registry URI has the following format:

repository: "<registry>.azurecr.io"

The OpenShift Internal Registry URI has the following format:

repository: "default-route-openshift-image-registry.<cluster_name>.<base_domain>/<openshift_project>"

Azure Data Lake Gen 1 datasets 

Supply these options to the cluster.sh script:

  • --adl-client-id: Client id

  • --adl-client-secret: OAuth2 refresh token from the Azure Active Directory service associated with the client id

If this secret is not specified in the command line, cluster.sh deploy will prompt for it, keeping secrets out of the bash history

  • --adl-tenant-id: id of the application's Azure Active Directory tenant

cluster.sh deploy --adl-client-id www --adl-client-secret xxx --adl-tenant-id yyyy --credstore-password zzz

You should also check the repository variable in the prestocluster/values.yaml file to point to a valid Container Registry.

The Azure Container Registry URI has the following format:

repository: "<registry>.azurecr.io"

The OpenShift Internal Registry URI has the following format:

repository: "default-route-openshift-image-registry.<cluster_name>.<base_domain>/<openshift_project>"

Azure Data Lake Gen 2 datasets

Supply these options to the cluster.sh script:

  • --abfs-storage-account: the name of the Storage Account

  • --abfs-storage-key: the access key that protects access to your Storage Account

If this access key is not specified in the command line, cluster.sh deploy will prompt for it, keeping access keys out of the bash history

cluster.sh deploy --abfs-storage-account xxx --abfs-storage-key yyy --credstore-password zzz

You should also check the repository variable in the prestocluster/values.yaml file to point to a valid Container Registry.

The Azure Container Registry URI has the following format:

repository: "<registry>.azurecr.io"

The OpenShift Internal Registry URI has the following format:

repository: "default-route-openshift-image-registry.<cluster_name>.<base_domain>/<openshift_project>"

Register Presto as Denodo's MPP

Since version 20230103, Denodo Presto Cluster on Kubernetes includes a Presto that has been customized to behave as the Denodo Platform's embedded Massively Parallel Processing (MPP) to accelerate queries.

This feature requires the Denodo subscription bundle Enterprise Plus and the Denodo 8.0 20230103 Update.

Because of this new Denodo feature, since version 20230103, Denodo Presto Cluster on Kubernetes includes the option to register the Presto Cluster as Denodo's MPP.

The registration process can be executed:

  • within the deployment process: the --register option of the cluster.sh deploy command.

  • after the deployment process: the cluster.sh register --register-user --register-password [--presto-password] command, as shown below.
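A minimal usage sketch; the admin user name is just an example of a Denodo user with CONNECT privileges to the 'admin' database:

cluster.sh register --register-user admin --register-password xxx --presto-password yyy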

The registration process consists of:

  1. Creation of a new database admin_denodo_mpp, a new user denodo_mpp_user and a special data source in Denodo called embedded_mpp.

This data source allows connecting to an Object Storage (S3, HDFS) graphically, exploring its data and creating base views on top of Parquet files, along with the corresponding Hive tables in the embedded Presto data source.

This provides functionality similar to the Metadata Discovery Tool for Parquet files, but now integrated in the Denodo Platform.

  2. Configuration of the Denodo query optimizer to consider this embedded MPP for query acceleration.

This is useful in scenarios where a query combines large amounts of Parquet data stored in an Object Storage like HDFS, S3 or ADLS with data in a different data source. In these cases, the Denodo query optimizer can decide to send the query to the MPP. The embedded MPP can access the data in the Object Storage using its own engine and it can access the data outside the Object Storage in streaming through Denodo, without the need to create temporary tables or files.

This way the Denodo query engine can combine its powerful optimization techniques, federation and security capabilities along with parallel processing on big data.

Connection

You can check the pods status in the cluster with kubectl get pods:

Kubernetes pods status

Once deployed, Presto is accessible at External-IP:8443 of the presto service. Run kubectl get svc to find out the External-IP:

Kubernetes services status

Presto is accessible at:

Presto Web UI    https://<ip>:8443/ui/
                 https://presto-denodo:8443/ui/

JDBC URI         jdbc:presto://<ip>:8443/hive/default?SSL=true
                 jdbc:presto://presto-denodo:8443/hive/default?SSL=true   (host name in the TLS certificate)

User             presto

Password         pr3st%

/etc/hosts file

Cleanup

When you no longer need the cluster, do not forget to run cluster.sh delete to delete all the resources you have created.

If you are using the PostgreSQL internal to the cluster, the delete command destroys the metadata persistence, that is, the Presto table definitions.

If you want to save the cluster metadata for a later deployment, you need to execute a dump over the PostgreSQL database of the Hive Metastore, before deleting the cluster. This dump generates a text file with SQL commands that, when fed back to the Hive Metastore, will recreate the database in the same state as it was at the time of the dump.

Dump in the old cluster:

$ kubectl exec <PostgreSQL Pod> -- bash -c "PGPASSWORD=hive pg_dump -c -U hive -h localhost metastore" > database.sql

Restore in the new cluster:

$ cat database.sql | kubectl exec -i <PostgreSQL Pod> -- bash -c "PGPASSWORD=hive psql -U hive -h localhost -d metastore"

✸ Notice that restoring the database is the first thing you should do, before creating any new tables in the Presto Cluster.

Presto Tuning

Machines

When it comes to Presto, fewer, bigger machines are better than more, smaller machines. It is recommended that you start with big, balanced machines, like these in AWS:

  • r5.4xlarge

  • m5.8xlarge

  • r5.8xlarge

  • m5.16xlarge

  • c5.24xlarge

CPU

Presto requires a certain amount of CPU to process the data for a given query. Generally, if you double the CPU in the cluster while maintaining the same memory, the query will take half the time.

So, as a generalization, more CPUs mean shorter queries.

Memory

Memory is about the data structures required to run the query, and is needed for JOIN, GROUP BY, and ORDER BY operations.

As a generalization, allocate 80% of the machine memory to the Java heap (the -Xmx option in the file prestocluster/presto/conf/jvm.config); this is done automatically by our Helm chart.

You can configure other properties regarding memory, such as query.max-memory and query.max-memory-per-node. They can be tuned in files prestocluster/presto/conf/config.properties.coordinator and prestocluster/presto/conf/config.properties.worker.

  • The query.max-memory property is the maximum amount of user memory a query can utilize across all workers in the cluster: for example, memory used by the hash tables built during execution, memory used during sorting, etc. When the user memory allocation of a query across all workers hits this limit, the query is killed. The default value is 20GB.

If the cluster needs to handle bigger queries, you should increase query.max-memory. For this, the recommendation is to run EXPLAIN ANALYZE with these big queries and examine the query plan to see what the peak memory is.

  • The query.max-memory-per-node property is the maximum amount of user memory a query can use on a worker. The default value is JVM max memory * 0.1. It is a useful parameter when you have skewed distributions. When doing a distributed hash join, the rows are divided across machines using a hash function; if your data is not uniformly distributed, some machines will receive more data than others. If your data is perfectly uniformly distributed, you could set query.max-memory-per-node = query.max-memory / number of workers. If you have a small cluster, you could double this value.
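For example, an illustrative sketch of these properties in prestocluster/presto/conf/config.properties.worker and config.properties.coordinator, assuming a 4-worker cluster where a query may use up to 40GB of user memory in total (the values are arbitrary and must be tuned for your environment):

query.max-memory=40GB

query.max-memory-per-node=10GB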

Parquet files

If you are suffering from network or IO bandwidth limitations, it is recommended that you compress your files:

  • High compression algorithm: ZSTD is preferred over ZIP

  • Low compression algorithm: LZ4 is preferred over Snappy

Also, avoid small files; any file of less than 8 MB is considered small. Hundreds of MB is the recommended file size.

Do not use tiny row groups; the default row group size is 128 MB, but some tools generate Parquet files with row groups of around 4 KB, causing bad IO patterns in Presto.

Resource Groups

In an environment with multiple concurrent user sessions that run queries, not all user sessions have the same importance. You may want to give more priority to some type of queries over another. Presto makes use of Resource groups in order to organize how different workloads are prioritized.

The Presto Resource groups manage quotas for two main resources: CPU and memory. Additionally, there are granular resource constraints that can be specified such as concurrency, time, and cpuTime.

The Resource group limits only apply during admission. Once a query starts execution, the Resource group manager has no control over the query. Instead, Presto introduces the concept of penalty for groups that exceed their resource specification, checking whether a Resource group has exceeded its limit before letting it start a new query.

Example

In the example configuration below, which is the one distributed (prestocluster/presto/conf/resource_groups.json, and which you should tweak according to your needs), there are several Resource groups and five selectors that define which queries run in which Resource group:

  • The first selector matches queries from datascience and places them in the global.adhoc.datascience group.

  • The second selector matches queries from bi and places them in the global.adhoc.bi group.

  • The third selector matches queries from pipeline and places them in the global.pipeline group.

  • The fourth selector matches queries from admin and places them in the admin group.

  • The last selector is a catch-all for all queries that have not been matched into any group.

Together, these selectors implement the following policy:

  • The admin group can run up to 50 concurrent queries (hardConcurrencyLimit).

For the remaining groups:

  • No more than 100 total queries may run concurrently (hardConcurrencyLimit).

  • Ad-hoc queries, like queries coming from BI tools, can run up to 10 concurrent queries (hardConcurrencyLimit), and queries coming from Data Science tools can run up to 2 concurrent queries (hardConcurrencyLimit).

  • Non ad-hoc queries will run under the global.pipeline group, with a total concurrency of 45 (hardConcurrencyLimit). Queries are run in FIFO order.

  • All remaining queries are placed in the global group.

{

  "rootGroups": [

    {

      "name": "global",

      "softMemoryLimit": "80%",

      "hardConcurrencyLimit": 100,

      "maxQueued": 1000,

      "schedulingPolicy": "weighted",

      "jmxExport": true,

      "subGroups": [

        {

          "name": "adhoc",

          "softMemoryLimit": "10%",

          "hardConcurrencyLimit": 50,

          "maxQueued": 1,

          "schedulingWeight": 10,

          "subGroups": [

            {

              "name": "datascience",

              "softMemoryLimit": "10%",

              "hardConcurrencyLimit": 2,

              "maxQueued": 1,

              "schedulingWeight": 10,

              "schedulingPolicy": "weighted_fair"

            },

            {

              "name": "bi",

              "softMemoryLimit": "10%",

              "hardConcurrencyLimit": 10,

              "maxQueued": 100,

              "schedulingWeight": 10,

              "schedulingPolicy": "weighted_fair"

            }

          ]

        },

        {

          "name": "pipeline",

          "softMemoryLimit": "80%",

          "hardConcurrencyLimit": 45,

          "maxQueued": 100,

          "schedulingWeight": 1,

          "jmxExport": true

        }

      ]

    },

    {

      "name": "admin",

      "softMemoryLimit": "100%",

      "hardConcurrencyLimit": 50,

      "maxQueued": 100,

      "schedulingPolicy": "query_priority",

      "jmxExport": true

    }

  ],

  "selectors": [

    {

      "source": "datascience",

      "group": "global.adhoc.datascience"

    },

    {

      "source": "bi",

      "group": "global.adhoc.bi"

    },

    {

      "source": "pipeline",

      "group": "global.pipeline"

    },

    {

      "source": "admin",

      "group": "admin"

    },

    {

      "group": "global"

    }

  ],

  "cpuQuotaPeriod": "1h"

}

prestocluster/presto/conf/resource_groups.json

For Denodo to take advantage of Presto's Resource group mechanism, you could create different data sources in Denodo for the same Presto cluster and select the corresponding Resource group (source) using the applicationNamePrefix driver property.

Denodo dataSource driver properties

In the Presto web interface you can check that queries executed from the above Denodo data source, with applicationNamePrefix set to the bi source, are assigned to the global.adhoc.bi Resource group:

Presto UI

For more information see Presto documentation.

Cluster Scalability

When you deploy the Presto Cluster on Kubernetes, you define in the prestocluster/values.yaml how many Presto workers (presto.numWorkers) you would like to run. But you can scale the Presto Cluster afterwards, increasing or decreasing the number of Presto workers.

Manually

You can scale the cluster manually with the kubectl scale command, changing the number of Presto workers (replicas) you want to run:

$ kubectl scale deployment/presto-worker --replicas=6

Needless to say, the cluster should have sufficient resources available.

Autoscale

You can scale the cluster in an automatic manner enabling presto.autoscaling.enabled in the prestocluster/values.yaml, configuring a maximum number of replicas for your Presto workers, presto.autoscaling.maxReplicas, and a CPU utilization target, presto.autoscaling.targetCPUUtilizationPercentage. The autoscaler (HorizontalPodAutoscaler) will periodically adjust the number of replicas to match the average CPU utilization that you have specified.

You can check the current status of the HorizontalPodAutoscaler running:

$ kubectl get hpa

The output is similar to:

NAME            REFERENCE                  TARGETS   MINPODS   MAXPODS   REPLICAS

presto-worker   Deployment/presto-worker   22%/80%   2         6         2

There are two requirements to ensure that the HorizontalPodAutoscaler (HPA), and therefore the kubectl autoscale command, works:

  1. A monitoring tool that provides metrics to Kubernetes is installed: Metrics Server, Prometheus, etc.

See https://github.com/kubernetes-sigs/metrics-server#installation for instructions on how to deploy the Metrics Server add-on. Metrics Server provides CPU/memory based horizontal autoscaling.

  2. The Presto worker has CPU resource requests, presto.worker.resources, defined in the prestocluster/values.yaml file. If the CPU requests are not configured, the autoscaler (HPA) will not take any action.

Note that resources are not configured by default, as we leave this setting as a conscious choice for the cluster administrator.
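As an illustration, a minimal values.yaml sketch enabling autoscaling; all the numeric values are arbitrary examples that should be adjusted to your environment:

presto:
  autoscaling:
    enabled: true
    maxReplicas: 6
    targetCPUUtilizationPercentage: 80
  worker:
    resources:
      requests:
        cpu: "4"
        memory: 16Gi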

Parquet datasets as tables

Hive tables

To access Parquet datasets from Presto, you have to map the AWS S3 data files (or Azure Blob Storage, Azure Data Lake, etc.) to Hive tables in the Hive Metastore. The mapping is done through the CREATE TABLE statement, where you have to provide the schema of the data, the file format, and the data location.

CREATE TABLE orders (

    orderkey bigint,

    custkey bigint,

    orderstatus varchar,

    totalprice double,

    orderpriority varchar,

    clerk varchar,

    shippriority integer,

    comment varchar

) WITH (

     external_location = 's3a://my_bucket/path/to/folder/',

     format = 'PARQUET'

);

  • external_location: The data location must be a bucket name or a bucket name and one or more folders, not a specific file.

Don't forget to use a trailing slash, otherwise Presto will throw the following error: Can not create a Path from an empty string.

  • format: Supported file formats are:

  • ORC
  • PARQUET
  • AVRO
  • RCBINARY
  • RCTEXT
  • SEQUENCEFILE
  • JSON
  • TEXTFILE
  • CSV

The WITH clause of the CREATE TABLE can also be used to set other properties on the table. The available properties are described in the Appendix A: Hive Table Properties.

There are several methods to create Hive tables at Presto when the cluster is running:

  • Use the embedded Presto data source in Denodo to graphically explore the Parquet datasets, create tables at Presto and base views at Denodo.

This feature requires the Denodo subscription bundle Enterprise Plus and the Denodo 8.0 20230103 Update.

  • Execute the metadata discovery tool, explained later in this manual, to create tables at Presto and base views at Denodo.

  • Use your favorite SQL client to create tables at Presto.

Partition Tables

Hive tables can have partitions, which can improve query performance, especially when the columns being queried are the partitioning columns. This means that, for each value of the partitioning column, there will be a separate folder under the location of the dataset.

Presto supports tables from partitioned datasets that follow the Hive naming convention for the folders, that is: columnName=value, e.g. elb_name=abc.

.../weblogs/elb_name=abc/elb_response_code=200/…

.../weblogs/elb_name=abc/elb_response_code=404/…

.../weblogs/elb_name=xyz/elb_response_code=200/…

CREATE TABLE IF NOT EXISTS weblogs (

    request_timestamp varchar,

    request_ip varchar,

    request_port integer,

    backend_ip varchar,

    backend_port integer,

    request_processing_time double,

    backend_processing_time double,

    client_response_time double,

    received_bytes bigint,

    sent_bytes bigint,

    request_verb varchar,

    url varchar,

    protocol varchar,

    elb_name varchar,

    elb_response_code varchar)

WITH (

    format = 'parquet',

    partitioned_by = ARRAY['elb_name', 'elb_response_code'],

    external_location = 's3://my_bucket/path/to/folder/'

);

Notice that Hive requires the partition columns to be the last columns in the CREATE TABLE statement.

However, if we query a partitioned table right after creating it, we will find that it returns zero results. The Hive Metastore needs to discover which partitions exist by querying the underlying storage system. The Presto procedure system.sync_partition_metadata(schema_name, table_name, mode) is in charge of detecting the existence of partitions.

When data is inserted in new partitions we need to invoke the sync_partition_metadata procedure again, to discover the new records.

call system.sync_partition_metadata('default', 'weblogs', 'FULL');

The metadata discovery tool invokes sync_partition_metadata automatically, after each partition table creation.

Metadata Discovery Tool

The metadata discovery tool explores the distributed storage, starting from a folder given by the user, and for every Parquet dataset it finds, in a folder or subfolder, it creates the corresponding table in Presto. Then, it calls a Denodo stored procedure that introspects Presto and creates the Denodo base views for all the new tables created in Presto.

The metadata discovery tool can create Presto tables from partitioned datasets that follow the Hive naming convention for the folders, that is: columnName=value, e.g. elb_name=abc.

.../weblogs/elb_name=abc/elb_response_code=200/…

.../weblogs/elb_name=abc/elb_response_code=404/…

.../weblogs/elb_name=xyz/elb_response_code=200/…

The metadata discovery tool invokes sync_partition_metadata automatically, after each partition table creation.

The tool is distributed both as a Denodo stored procedure and as a script.

Stored Procedure

The DISCOVER_PRESTO_METADATA stored procedure is distributed as the file metadata-discovery/sp/denodo-presto-k8scluster-{version}-jar-with-dependencies.jar.

To be able to use this stored procedure in Denodo, first you need to import the denodo-presto-k8scluster-{version}-jar-with-dependencies.jar file using the File > Extension management menu option of the VDP Administration Tool.

Second, you have to create the stored procedure using the File > New > Stored procedure menu option of the VDP Administration Tool, and provide the following values:

Name: a name

Class name: com.denodo.connect.presto.MetadataDiscoveryStoredProcedure

Select Jars: denodo-discover-presto-metadata

Finally, you can execute the stored procedure:

  • Using the Execute button in the dialog that displays the schema of the stored procedure. Denodo will show a dialog where you have to enter the input values.

  • Using the VQL Shell:

SELECT * FROM discoverprestometadata()

WHERE dir_to_explore='s3a://bucket/dir/'

and credentials_path='/absolute/path/to/core-site.xml'

and datasource_name='the_presto_datasource'

and duplicate_strategy='NEVER_REPLACE'

and included_pattern=NULL

and baseview_folder='/01 - Connectivity/02 - Base Views'

and baseview_prefix='bv_'

and analyze_tables=true

The input parameters required by the stored procedure are:

  • dir_to_explore: directory of the distributed storage path (s3a, adl, wasb, abfs), to start the metadata discovery process.

  • datasource_name: the name of a Presto data source already created in Denodo (see Connect Denodo to Presto section for more information).

The base views created by this stored procedure will belong to this Presto data source.

The following input parameters are optional:

  • credentials_path: absolute path to core-site.xml with the distributed storage credentials.

  • duplicate_strategy: it chooses what to do with the metadata that already exists at Presto and at Denodo. Possible values are NEVER_REPLACE, REPLACE_IF_NEW_COLUMNS and ALWAYS_REPLACE. If no value is provided for this parameter, NEVER_REPLACE is selected.

  • NEVER_REPLACE: It will not replace tables at Presto and data sources and base views at Denodo if they already exist.

  • REPLACE_IF_NEW_COLUMNS: It will replace tables at Presto, and its corresponding base views at Denodo, if new columns are detected for existing Presto tables.

This can affect existing base views at Denodo that had modifications made by the user, like renamed fields, field types modifications, cache configuration... as base views will be replaced and these kinds of changes will be lost.

Our recommendation is to always leave Denodo base views as is and make changes only in views that derive from these base views.

  • ALWAYS_REPLACE: It will replace tables at Presto, and its corresponding base views at Denodo, if any kind of change is detected for existing Presto tables: new columns, deleted columns, different column types, etc.

This can affect existing base views at Denodo that had modifications made by the user, like renamed fields, field types modifications, cache configuration... as base views will be replaced and these kinds of changes will be lost.

Our recommendation is to always leave Denodo base views as is and make changes only in views that derive from these base views.

  • included_pattern: a regular expression that will select only certain files or folders in the exploring process. E.g., (.*)invoices(.*)

  • baseview_folder: all the base views created by the stored procedure will be located at this folder when this parameter is specified. The folder will be created by the stored procedure if needed.

  • baseview_prefix: all the base views created by the stored procedure will be prefixed with this value when this parameter is specified. When this parameter is not configured and a base view with an invalid name has to be created, the prefix "bv" will be added.

  • analyze_tables: whether to run the ANALYZE command after each table creation or not.

This command collects statistical information about the data that the Presto optimizer uses to plan the query, based on cost strategies.

The output of this stored procedure is EXECUTED_COMMAND, which shows the DDL statements executed successfully at Presto and at Denodo:

  • the SQL of each table created at Presto

  • the VQL of each Presto wrapper and base view created at Denodo

Script

The metadata discovery script is distributed in the folder metadata-discovery/bin. A configuration file is required as input:

$ denodo-presto-metadata-discovery.sh ../conf/configuration.properties

Configuration file

dirToExplore=s3a://example-bucket/dir/

credentials.xml.path=../conf/core-site.xml

#includedPattern=(.*)blahblah(.*)

# Denodo connection properties

denodo.jdbcUrl=jdbc:vdb://localhost:9999/presto_tests

denodo.username=admin

denodo.password=admin

# Presto connection properties

#     Requisites for this code to be able to connect to Presto:

#     If the certificate used by the Presto server is signed by a private authority:

#     - import certificate.crt in the TrustStore of the JVM

#       (the one executing this code and the Denodo's JVM)

#     OR

#     - use the JDBC URI parameters (SSLTrustStorePath and

#       SSLTrustStorePassword) to provide the path

#       to a TrustStore with the certificate.crt

presto.jdbcUrl=jdbc:presto://localhost:8443/hive/default?SSL=true\

&SSLTrustStorePath=truststore-path&SSLTrustStorePassword=truststore-pwd

presto.username=presto

presto.password=pr3st%

# Default values

denodo.datasource.name=ds_presto_k8s

denodo.datasource.folder=/01 - Connectivity/01 - Data Sources

denodo.baseview.folder=/01 - Connectivity/02 - Base Views

denodo.baseview.prefix=bv_

presto.analyze.tables=true

# Possible values: NEVER_REPLACE, REPLACE_IF_NEW_COLUMNS, ALWAYS_REPLACE

metadata.duplicated.strategy=NEVER_REPLACE

configuration.properties

  • dirToExplore: directory of the distributed storage path (s3a, adl, wasb, abfs), to start the metadata discovery process.

  • credentials.xml.path: core-site.xml with the distributed storage credentials.

  • includedPattern: a regular expression that will select only certain files or folders in the exploring process. E.g., (.*)invoices(.*).

  • Denodo connection properties: JDBC URI, user and password.

  • Presto Connection properties: JDBC URI, user and password.

Requisites to be able to connect to Presto

If the certificate used by the Presto server is signed by a private authority:

  1. Import certs/certificate.crt in the TrustStore of the JVM (the one executing this code and Denodo's JVM).

Note: Since Denodo 8.0 Update 7, this import is not needed when using the self-signed certificate with the Denodo Cloud Marketplace images, or with Denodo servers created through the Solution Manager Automated Mode using the images provided by Denodo, since the images generated with Denodo 8.0 Update 7 include the self-signed certificate, certs/certificate.crt, distributed with this Denodo Presto Cluster.

      OR

  2. Use the JDBC URI parameters SSLTrustStorePath and SSLTrustStorePassword to provide the path to a TrustStore containing certs/certificate.crt.

The certs/certificate.crt is distributed ONLY for testing purposes. This certificate accepts localhost and presto-denodo as Presto host names:

  • localhost: for tests performed in a local Kubernetes cluster, like the one provided by Docker Desktop.

  • presto-denodo: for tests performed in a Kubernetes cluster on the cloud, like the ones provided by AWS and Azure.

In the latter case, you have to add an entry to the hosts file of the machine where the Denodo server is running, mapping presto-denodo to the IP that appears as the EXTERNAL-IP of the presto service.

Kubernetes services status

/etc/hosts file

But you are encouraged to use a certificate issued by a CA in production. This certificate is the one you will have to import into the TrustStore of the client code (if it is signed by a private authority) and, together with its private key, into the Java Keystore of the Presto Cluster (prestocluster/presto/conf/presto.jks) before deploying it.

  • denodo.datasource.name: the name of the Presto data source that will be created at Denodo by this script.

  • denodo.datasource.folder: the Presto data source created at Denodo by this script will be located at this folder when this parameter is specified. The folder will be created by this script if needed.

  • denodo.baseview.folder: all the base views created by this script will be located at this folder when this parameter is specified. The folder will be created by this script if needed.

  • denodo.baseview.prefix: all the base views created by this script will be prefixed with this value when this parameter is specified. When this parameter is not configured and a base view with an invalid name has to be created, the prefix "bv" will be added.

  • presto.analyze.tables: whether to run the ANALYZE command after each table creation or not. This command collects statistical information about the data that the Presto optimizer uses to plan the query, based on cost strategies. Default is true.

  • metadata.duplicated.strategy: it chooses what to do with the metadata that already exists at Presto and at Denodo. Possible values are NEVER_REPLACE, REPLACE_IF_NEW_COLUMNS, ALWAYS_REPLACE. Default is NEVER_REPLACE.

  • NEVER_REPLACE: It will not replace tables at Presto and data sources and base views at Denodo if they already exist.

  • REPLACE_IF_NEW_COLUMNS: It will replace tables at Presto, and its corresponding base views at Denodo, if new columns are detected for existing Presto tables.

This can affect existing base views at Denodo that had modifications made by the user, like renamed fields, field types modifications, cache configuration... as base views will be replaced and these kinds of changes will be lost.

Our recommendation is to always leave Denodo base views as is and make changes only in views that derive from these base views.

  • ALWAYS_REPLACE: It will replace tables at Presto, and its corresponding base views at Denodo, if any kind of change is detected for existing Presto tables: new columns, deleted columns, different column types, etc.

This can affect existing base views at Denodo that had modifications made by the user, like renamed fields, field types modifications, cache configuration... as base views will be replaced and these kinds of changes will be lost.

Our recommendation is to always leave Denodo base views as is and make changes only in views that derive from these base views.

Output

The output of the process would be:

  • the list of tables created at Presto using the Hive connector

  • the Presto data source created at Denodo, with the configured presto.jdbcUrl as the Database URI of the data source in Denodo.

  • the Presto base views created at Denodo

METADATA DISCOVERY RESULTS

******************************************

Commands executed at Presto:

------------------------------------------

CREATE TABLE "customer" (

ANALYZE "customer"

CREATE TABLE "store" (

ANALYZE "store"

CREATE TABLE "store_sales" (

ANALYZE "store_sales"

CREATE TABLE "date_dim" (

ANALYZE "date_dim"

------------------------------------------

Commands executed at Denodo (data source):

--------------------------------------------------------------

CREATE FOLDER '/01 - Connectivity';

CREATE FOLDER '/01 - Connectivity/01 - Data Sources';

CREATE DATASOURCE JDBC "ds_presto_k8s"

------------------------------------------

Commands executed at Denodo (base views):

-------------------------------------------------------------

CREATE OR REPLACE FOLDER '/01 - Connectivity/02 - Base Views';

CREATE OR REPLACE WRAPPER JDBC bv_customer

CREATE OR REPLACE TABLE bv_customer I18N us_pst (

CREATE OR REPLACE WRAPPER JDBC bv_date_dim

CREATE OR REPLACE TABLE bv_date_dim I18N us_pst (

CREATE OR REPLACE WRAPPER JDBC bv_store

CREATE OR REPLACE TABLE bv_store I18N us_pst (

CREATE OR REPLACE WRAPPER JDBC bv_store_sales

CREATE OR REPLACE TABLE bv_store_sales I18N us_pst (

                        Output of the metadata discovery process

Cost Based Optimization

If your queries are complex and include joins of large data sets, running ANALYZE on tables/partitions may improve query performance. The metadata discovery tool runs ANALYZE automatically, by default, after each table creation.

This command collects statistical information about the data that the Presto optimizer uses to plan the query, based on cost strategies.

Once again, you can use your favorite SQL client to analyze the tables you want when the cluster is running.

ANALYZE store_sales10;

The image below shows an example of the statistics collected by Presto after running ANALYZE on a table, as returned by:

SHOW STATS FOR store_sales10;

Table statistics
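If a table is partitioned, ANALYZE can also be restricted to specific partitions, and SHOW STATS accepts a query as well. A minimal sketch, assuming store_sales10 is partitioned by ss_sold_date_sk (the partition value used is illustrative):

-- Collect statistics only for the listed partition values
ANALYZE store_sales10 WITH (partitions = ARRAY[ARRAY['2451813']]);

-- Inspect statistics for a filtered subset of the data
SHOW STATS FOR (SELECT * FROM store_sales10 WHERE ss_sold_date_sk = 2451813);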

Iceberg tables

Apache Iceberg is a high-performance table format for huge analytic datasets. Iceberg tables provide schema evolution, partition evolution, and table version rollback, without the need for table rewrites or table migration.

The Denodo Presto Cluster supports Iceberg tables and uses the Hive Metastore as the metadata catalog. This enables an additional use case for the Presto Cluster: reusing an existing Hive Metastore to access Iceberg tables that were already created, instead of starting from scratch.
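For example, once the cluster points at that Metastore, the Iceberg tables it already contains can be queried directly through the iceberg catalog (the schema and table names below are illustrative):

-- Hypothetical Iceberg table already registered in the reused Hive Metastore
SELECT * FROM iceberg.sales_schema.orders_iceberg LIMIT 10;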

Now, let's see how to create new Iceberg tables in Presto.

Before creating Iceberg tables you need to create a new schema that sets the location where the tables (their Parquet data files and metadata files) will be stored.

For this you can use your favorite SQL client:

CREATE SCHEMA iceberg.<schema_name> WITH (location = 's3a://my_bucket/path/to/folder/');

Tables are then created with the CREATE TABLE statement:

CREATE TABLE orders (
    orderkey bigint,
    custkey bigint,
    orderstatus varchar,
    totalprice double,
    orderpriority varchar,
    clerk varchar,
    shippriority integer,
    comment varchar
) WITH (
    location = 's3a://my_bucket/path/to/folder/',
    format = 'PARQUET'
);

The WITH clause of the CREATE TABLE can also be used to set other properties on the table. The available properties are described in the Appendix B: Iceberg Table Properties.

The CREATE TABLE statement creates an empty table even if there is data already present in the S3 folder location. Iceberg table creation does not work the same way as Hive table creation: in the latter case, existing data in the S3 bucket that is compatible with the table schema is considered part of the table content.

Therefore, to access Parquet datasets using Iceberg tables you should create Hive tables for those Parquet datasets and then use the CREATE TABLE AS SELECT (CTAS) statement to create the new Iceberg tables from those Hive tables.

CREATE TABLE IF NOT EXISTS iceberg.schema.ctas_orders
AS (SELECT * FROM hive.default.orders);

-- partitioned tables
CREATE TABLE IF NOT EXISTS iceberg.schema.ctas_weblog
WITH (
    partitioning = ARRAY['elb_name', 'elb_response_code']
)
AS (SELECT * FROM hive.default.weblog);

The drawback of this method is that you will double the storage for the dataset temporarily, as you will store both the data for the Hive table and the data for the new Iceberg table.

Since there is no Presto support for migrating Hive tables to Iceberg, the alternative is the migrate Spark procedure, which replaces the existing Hive table with an Iceberg table that reuses the same data files.

Presto also has no support for registering already existing Iceberg tables, but the register_table Spark procedure can be used as an alternative. Sketches of both procedures are shown below.
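Both procedures are run from Spark SQL with the Iceberg runtime and SQL extensions enabled. The catalog name, table names and metadata file path below are illustrative, and the exact arguments may vary with the Iceberg version:

-- In-place migration of a Hive table to Iceberg, reusing its data files
CALL spark_catalog.system.migrate('default.orders');

-- Register an Iceberg table that already exists in storage by pointing at its current metadata file
CALL spark_catalog.system.register_table(
    table => 'default.orders_iceberg',
    metadata_file => 's3a://my_bucket/path/to/folder/metadata/00000-example.metadata.json'
);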

Connect Denodo to Presto

To establish a connection to Presto you have two options:

  1. Register the Presto Cluster as Denodo's MPP. See Register Presto as Denodo's MPP section.

This feature requires the Denodo subscription bundle Enterprise Plus and the Denodo 8.0 20230103 Update.

  2. Create a JDBC data source in Denodo and fill in all the required fields:
  • Name
  • Database adapter: PrestoDB
  • Driver class path: presto-0.1x
  • Driver class: com.facebook.presto.jdbc.PrestoDriver
  • Database URI: jdbc:presto://presto-denodo:8443/hive/default?SSL=true
  • Login: presto
  • Password: pr3st% (or the custom one if it was specified in the deployment process of the Cluster: cluster.sh deploy)

Notice that, if the certificate used by the Presto server is signed by a private authority and you do not specify an alternative Java TrustStore in the Database URI (with the SSLTrustStorePath and SSLTrustStorePassword parameters; see all the accepted URI parameters), you have to import the Presto server certificate (certs/certificate.crt) into the Denodo server trust store.

Note: Since Denodo 8.0 Update 7, this import is not needed when using the self-signed certificate together with the Denodo Cloud Marketplace images, or with Denodo servers created through the Solution Manager Automated Mode using the images provided by Denodo, because the images generated with Denodo 8.0 Update 7 already include the self-signed certificate certs/certificate.crt distributed with this Denodo Presto Cluster.

The certs/certificate.crt is distributed ONLY for testing purposes. This certificate accepts localhost and presto-denodo as Presto host names:

  • localhost: for tests performed in a local Kubernetes cluster, like the one provided by Docker Desktop.

  • presto-denodo: for tests performed in a Kubernetes cluster on the cloud, like the ones provided by AWS and Azure.

In this case, you have to add an entry to the hosts file of the machine where the Denodo server is running, mapping presto-denodo to the IP address that appears as the EXTERNAL-IP of the presto service.

Kubernetes services status

/etc/hosts file

However, in production you are encouraged to use a certificate issued by a CA. That certificate is the one you have to import into the TrustStore of the Denodo server (if it is signed by a private authority) and, together with its private key, into the Java KeyStore of the Presto Cluster (prestocluster/presto/conf/presto.jks) before deploying it.

Presto datasource in Denodo

Also, it is recommended to configure these two properties in the Source Configuration tab of the data source:

  • Number of processing units: the number of Presto workers in the cluster
  • Number of CPUs per processing unit: number of CPUs (not vCPUs) of each node of the cluster

as these values are used by the Denodo optimizer for its estimations.

Source Configuration tab in Denodo

Once the data source is configured, click on "Create base View" to explore Presto schemas with their tables and their columns.

Presto tables in Denodo

Finally, you have reached your goal! You can run any query you want on your S3 datasets.

Query results in Denodo
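For instance, a query over the base views created earlier could aggregate store sales by year. The TPC-DS column names used here are illustrative:

-- Join the base views created from the Presto tables; Denodo can delegate this query to Presto
SELECT d.d_year, SUM(s.ss_net_paid) AS total_net_paid
FROM bv_store_sales s
JOIN bv_date_dim d ON s.ss_sold_date_sk = d.d_date_sk
GROUP BY d.d_year
ORDER BY d.d_year;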

Bulk Data Load

The Presto data source can be used to perform bulk data loads.

Before configuring the Bulk Data Load in Denodo, you have to create a new schema in Presto that sets the location where the Parquet files created by Denodo will be stored.

For this you can use your favorite SQL client:

CREATE SCHEMA hive.<schema_name> WITH (location = '<filesystem_schema>://<host>/<folders>');
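For example, assuming an S3 bucket and a hypothetical schema name:

CREATE SCHEMA hive.denodo_bulk WITH (location = 's3a://my_bucket/path/to/bulk/');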

To configure the Bulk Data Load, check Use Bulk Data Load APIs of the Presto data source in its Read & Write tab, and fill in the parameters: server timezone, storage path (adl://<adl_host>/<dirs> in the example), and the catalog and schema:

Then, depending on the file system chosen, you may have to add some Hadoop properties to configure authentication. In the example, these are the properties used to configure access to Azure Data Lake Storage:

Finally, press the Test bulk load button to check that all is working fine:

When using S3 storage, you should set the Virtual DataPort property 'com.denodo.vdb.util.tablemanagement.sql.PrestoTableManager.useExternalTables' to 'false' if you receive the error: External location must be a directory.
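As a sketch, the property can be changed from the VQL Shell with the standard SET syntax for Virtual DataPort configuration properties (typically requiring administrative privileges):

-- Disable the use of external tables for Presto bulk loads when using S3 storage
SET 'com.denodo.vdb.util.tablemanagement.sql.PrestoTableManager.useExternalTables' = 'false';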

Upgrade from previous versions

The process of upgrading to a new version of the Denodo Presto Cluster on Kubernetes involves transferring the settings and the metadata from the version installed to the new one.

Step 1. Transferring the settings.

The configuration of the cluster resides in prestocluster/values.yaml. Review the modifications you made in your current values.yaml file and apply those changes to the values.yaml file of the new version before deploying the new cluster.

Step 2. Transferring the metadata.

If you were using the PostgreSQL internal to the cluster, you need to save the metadata from the old cluster by executing a dump of the Hive Metastore's PostgreSQL database before deleting the cluster. This dump generates a text file with SQL commands that, when fed back to the Hive Metastore of the new cluster, recreates the database in the same state it was in at the time of the dump, keeping the whole Presto schema metadata, that is, your datasets exposed as tables.

Dump in the old cluster:

$ kubectl exec <old Hive Metastore Pod> -- bash -c "PGPASSWORD=hive pg_dump -c -U hive -h localhost metastore" > database.sql

Restore in the new cluster:

$ cat database.sql | kubectl exec -i <new Hive Metastore Pod> -- bash -c "PGPASSWORD=hive psql -U hive -h localhost -d metastore"

✸  Notice that restoring the database should be the first thing you do, before creating any new tables in the Presto Cluster.

Appendix A: Hive Table Properties

The optional WITH clause of the CREATE TABLE can be used to set properties on the newly created table or on single columns.

Available table properties are listed below; a short example using some of them follows the list:

  • avro_schema_url (varchar): URI pointing to Avro schema for the table.

  • bucket_count (integer): Number of buckets.

  • bucketed_by (array(varchar)): Bucketing columns.

  • csv_escape (varchar): CSV escape character.

  • csv_quote (varchar): CSV quote character.

  • csv_separator (varchar): CSV separator character.

  • dwrf_encryption_algorithm (varchar): Algorithm used for encrypting data in DWRF.

  • dwrf_encryption_provider (varchar): Provider for encryption keys.

  • encrypt_columns (array(varchar)): List of key references and columns being encrypted. Example: ARRAY['key1:col1,col2', 'key2:col3,col4'].

  • encrypt_table (varchar): Key reference for encrypting the whole table.

  • external_location (varchar): File system location URI for external table.

  • format (varchar): Hive storage format for the table. Possible values: [ORC, PARQUET, AVRO, RCBINARY, RCTEXT, SEQUENCEFILE, JSON, TEXTFILE, CSV].

  • orc_bloom_filter_columns (array(varchar)): ORC Bloom filter index columns.

  • orc_bloom_filter_fpp (double): ORC Bloom filter false positive probability.

  • partitioned_by (array(varchar)): Partition columns.

  • preferred_ordering_columns (array(varchar)): Preferred ordering columns for unbucketed table.

  • sorted_by (array(varchar)): Bucket sorting columns.
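As an illustration of some of the properties above, the following sketch declares an external Parquet table over a hypothetical dataset (the table name, columns and location are illustrative):

CREATE TABLE hive.default.weblog_ext (
    request_ip varchar,
    backend_processing_time double,
    elb_name varchar,
    elb_response_code varchar
) WITH (
    format = 'PARQUET',
    external_location = 's3a://my_bucket/path/to/weblog/',
    partitioned_by = ARRAY['elb_response_code']
);

Note that, for a partitioned external table, partitions that already exist in storage may still need to be registered in the Metastore before they become visible to queries.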

Appendix B: Iceberg Table Properties

The optional WITH clause of the CREATE TABLE can be used to set properties on the newly created table or on single columns.

Available table properties are listed below; a short example using them follows the list:

  • format (varchar): File format for the table.

  • format_version (varchar): Format version for the table.

  • location (varchar): File system location URI for the table.

  • partitioning (array(varchar)): Partition transforms.
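As an illustration, the following sketch creates a hypothetical Iceberg table using these properties (the table name, columns and location are illustrative):

CREATE TABLE iceberg.analytics.events (
    event_time timestamp,
    user_id bigint,
    action varchar
) WITH (
    format = 'PARQUET',
    format_version = '2',
    location = 's3a://my_bucket/path/to/events/',
    partitioning = ARRAY['action']
);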