Kubernetes Deployment
Denodo Lakehouse Accelerator (formerly known as Denodo Embedded MPP) is designed to run on Kubernetes 1.23+ on the following platforms:
Amazon Elastic Kubernetes Service (EKS).
See Denodo Lakehouse Accelerator AWS Checklist for planning your deployment on Amazon EKS.
For an automated deployment, see the Deploying Denodo Lakehouse Accelerator in AWS Using CloudFormation guide.
Azure Kubernetes Service (AKS).
See Denodo Lakehouse Accelerator Azure Checklist for planning your deployment on Azure Kubernetes Service.
For an automated deployment, see the Deploying the Denodo Lakehouse Accelerator in Azure Using ARM guide.
Red Hat OpenShift.
Google Kubernetes Engine (GKE).
The main steps for deploying it on a Kubernetes cluster are:
Create the Kubernetes cluster.
It is recommended to create a cluster with N + 2 nodes, with no other applications running on the nodes:
One single node for the Lakehouse Accelerator coordinator.
One single node for the Embedded Hive Metastore and its Embedded PostgreSQL.
N nodes for the Lakehouse Accelerator workers: one single node for each Lakehouse Accelerator worker.
Since the Denodo Lakehouse Accelerator requires a certain amount of CPU and memory to process queries, it is also recommended to start with nodes with at least 16-32 cores and 64-128 GB of memory.
For example, in Amazon Elastic Kubernetes Service you can start with m6a.8xlarge or r6a.4xlarge nodes.
See Sizing Recommendations for the Denodo Embedded MPP for more details.
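As a rough illustration of the N + 2 recommendation, the following sketch computes how many nodes to provision for a given number of workers. The helper name `total_nodes` is hypothetical and is not part of any Denodo tooling.

```shell
#!/bin/sh
# Hypothetical helper: total node count for a cluster with N workers,
# following the N + 2 layout (1 coordinator + 1 metastore node + N workers).
total_nodes() {
  workers="$1"
  # Reject empty input and anything containing a non-digit character.
  case "$workers" in
    ''|*[!0-9]*) echo "workers must be a positive integer" >&2; return 1 ;;
  esac
  # At least one worker is needed to process queries.
  [ "$workers" -ge 1 ] || { echo "workers must be at least 1" >&2; return 1; }
  echo $((workers + 2))
}

total_nodes 4   # 4 workers + coordinator node + metastore node
```

With 4 workers the helper reports 6 nodes, which is the figure to use when sizing the node group.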
Decide how the Denodo Lakehouse Accelerator will be exposed to the Denodo server, based on the Kubernetes service type. The options are:
LoadBalancer. It is the option configured by default.
ClusterIP and Ingress. For this, it is necessary to install an ingress controller, such as the NGINX Ingress Controller, beforehand.
ClusterIP and Route. This option is only available in OpenShift, since routes are specific to OpenShift.
When using the Embedded PostgreSQL, make sure the cluster has Kubernetes Storage Classes configured to provision Kubernetes Persistent Volumes, because the cluster will use a Persistent Volume to persist the metadata. See Persistent Volume for the Embedded PostgreSQL for details.
The Denodo Lakehouse Accelerator Helm chart and the container images are available in Denodo’s Harbor Registry and are also included in the Denodo Lakehouse Accelerator distribution.
Important
Denodo’s Harbor Registry credentials expire every 6 months. This means you must log in to Denodo Harbor every 6 months so that Kubernetes can continue to pull images from Denodo’s Harbor Registry.
While this option is suitable for testing and proof-of-concept (POC) purposes, please consider a Private Container Image Registry for production scenarios as a best practice.
Configure the Denodo Lakehouse Accelerator through the Helm chart values.yaml file. See Configuration section.
Deploy the Denodo Lakehouse Accelerator using helm install lakehouseaccelerator lakehouseaccelerator/. See Deployment section.
Autoscaling is optional, but recommended.
For autoscaling in Amazon EKS see Configuring an Autoscaling Denodo Lakehouse Accelerator Cluster in EKS.
For autoscaling in Azure Kubernetes Service see Configuring an Autoscaling Denodo Lakehouse Accelerator Cluster in AKS.
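The deployment step can be sketched as a small script. This is a minimal sketch, not the official procedure: the `run` and `deploy` helpers and the `DRY_RUN` variable are hypothetical, and the follow-up `kubectl get` checks are an assumption about how you might inspect the result. Only the `helm install` command itself comes from this guide. By default the script prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical wrapper: print each command unless DRY_RUN=0 is set,
# so the sketch can be reviewed without touching a cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy() {
  # Release name and chart directory as used in this guide; run this from
  # the directory that contains the lakehouseaccelerator/ chart folder.
  run helm install lakehouseaccelerator lakehouseaccelerator/
  # Assumption: check the pods and services in the current namespace afterwards.
  run kubectl get pods
  run kubectl get services
}

deploy
```

Running the script with `DRY_RUN=0` executes the commands against the cluster selected by the current kubectl context, after values.yaml has been edited as described in the Configuration section.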
