Applies to: Denodo 8.0
Last modified on: 23 Nov 2021
Tags: Administration, Docker, Kubernetes
Container technology is gaining popularity thanks to its efficiency, portability, and agility, which are transforming the IT departments of many companies. However, with great power comes great responsibility, so it is important to learn how containers work before including them in our architecture.
One of the most unexpected facts about containers is that they do not persist data by default. Their ephemeral nature means that once a container dies and the dead container is removed, the data inside it is also gone. In some cases this is not a problem, for instance, if we do not expect changes in our application, or if the Denodo metadata is stored in an external database.
But in general, we will be interested in persisting data such as Denodo metadata or custom configuration files, so the data remains available after the container dies.
By default, the data inside the Denodo containers is not persisted. So, if we start a Denodo container and create some views, the views will be lost once the container dies. If those views have to be saved then we will need to do something about them:
If your Denodo deployment only changes its metadata when new features are deployed, then you probably want to persist the changes as part of the image. That way you can benefit from the rollout and rollback functionality provided by your container management infrastructure, such as the Kubernetes deployment mechanisms.
You should also consider that data for some parts of the deployment should reside outside of the Denodo container to start with, like the user authentication database, or the database for the cache if in use.
With Docker you can create a new container image from a running container with the following commands:
docker stop <my-container>
docker commit <my-container> <my-image>:<my-tag>
docker start <my-container>
The docker commit command creates a new Docker image that includes the changes made in the container. This allows you to run new containers with the same state as the container used to create the image. Although it is preferable to generate images with Dockerfiles, this is still a valid option for generating images in a development environment that you can later deploy in production.
Using volumes is the typical solution to persist data outside the container. Volumes simplify management and are a good solution for development environments.
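The stop, commit, and start sequence above can be wrapped in a small helper so that the container is restarted even when the commit fails. This is only a sketch: the function name snapshot_container and the DOCKER override variable are hypothetical and not part of Docker or Denodo. Setting DOCKER="echo docker" previews the commands without running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or Denodo).
# DOCKER may be overridden, e.g. DOCKER="echo docker", to preview the commands.
DOCKER="${DOCKER:-docker}"

# snapshot_container CONTAINER IMAGE
# Stops the container, commits it to IMAGE, then starts it again.
snapshot_container() {
    container="$1"
    image="$2"
    # Stop first, as in the sequence shown in the article.
    $DOCKER stop "$container" || return 1
    status=0
    # Record a commit failure but still try to restart the container.
    $DOCKER commit "$container" "$image" || status=1
    $DOCKER start "$container" || status=1
    return "$status"
}

# Example call (commented out so that loading this file runs nothing):
# snapshot_container <my-container> <my-image>:<my-tag>
```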
In Docker, we can add volumes in the docker run command with the option -v. For instance, in order to persist the metadata folder we can execute the following command:
docker run -v "C:/Denodo/Metadata:/opt/denodo/metadata" denodo-platform:8.0-latest ./denodo-container-start.sh --vdpserver
So, now the question is to decide which folders from the Denodo installation folder should be persisted. Basically, there are two approaches:
Hence, if we take the second approach, we will need to select the folders to be persisted. In order to make the decision, we will go folder by folder and check what is being stored there. If the folder contents will not change in our environment, then that folder does not need to be persisted. Only folders whose content might change should be persisted.
For instance, the following folders are subject to change in some scenarios and may need to be persisted:
| Folder | Reason |
| --- | --- |
| /opt/denodo/bin | The Denodo scripts can be regenerated after changing the JVM configuration |
| /opt/denodo/conf | The configuration files of Denodo are stored in this folder |
| /opt/denodo/lib-external | Contains JDBC drivers not distributed with Denodo |
| /opt/denodo/lib/data-catalog-extensions | Contains Jar libraries used by Data Catalog |
| /opt/denodo/lib/scheduler-extensions | Contains Jar libraries used by Scheduler |
| /opt/denodo/lib/solution-manager-extensions | Contains Jar libraries used by Solution Manager (used in Solution Manager containers only) |
| /opt/denodo/logs | The logs are stored in this folder (check the section below for more information on this) |
| /opt/denodo/metadata | The metadata of our Denodo servers is saved in this folder |
| /opt/denodo/resources/apache-tomcat | The folder includes the configuration of the embedded Tomcat |
| /opt/denodo/work/arn/data/index | The indexes created with Scheduler Index are stored in this folder |
| /opt/denodo/extensions/thirdparty/sap-jco | Contains libraries needed by SAP BW and SAP BI data sources |
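Once the folders to persist are chosen, the corresponding -v options can be generated instead of typed by hand. The sketch below is illustrative only: the function build_volume_args and the host base directory /srv/denodo are hypothetical, and it prints the resulting docker run command rather than executing it. It assumes each selected folder has a distinct last path component and that the host directories already hold the expected contents, since a bind-mounted host directory hides whatever the image has at that path.

```shell
#!/bin/sh
# Hypothetical host directory under which each persisted folder gets a subdirectory.
HOST_BASE="${HOST_BASE:-/srv/denodo}"

# build_volume_args FOLDER...
# Prints one "-v <host-dir>:<container-folder>" option per container folder.
build_volume_args() {
    for folder in "$@"; do
        # Name the host subdirectory after the last path component, e.g. "metadata".
        name=$(basename "$folder")
        printf '%s %s/%s:%s ' '-v' "$HOST_BASE" "$name" "$folder"
    done
}

# Preview (not execute) a docker run command for two folders from the table.
echo "docker run $(build_volume_args /opt/denodo/conf /opt/denodo/metadata)denodo-platform:8.0-latest"
```

Printing the command first makes it easy to review the mounts before starting a container with them.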
Volumes in Kubernetes work differently than in Docker. When you create a volume in Kubernetes, it is created as an empty folder. In many cases, the folder where we want to mount that volume has some content in the container image, and we would expect this default content to be included in the volume. For instance, if you mount a volume in the /opt/denodo/metadata path of Denodo, you will notice that it is empty instead of including the metadata structure distributed with the Denodo Platform.
With Docker, new volumes are initialized with the contents of the folder where they are mounted. To achieve the same behavior with Kubernetes volumes, you must initialize them explicitly the first time you launch the pods that use them, so that the volumes include the files distributed with the Denodo container image.
The YAML below shows a deployment of a Denodo container that mounts the folder /opt/denodo/metadata in a volume. Notice that without the initContainers section of the deployment, the pod would not be able to start, since the Denodo application won’t work with an empty metadata folder:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: denodo-deployment
spec:
  selector:
    matchLabels:
      app: denodo-app
  replicas: 1
  template:
    metadata:
      labels:
        app: denodo-app
    spec:
      hostname: denodo-hostname
      initContainers:
      - name: init-volume
        image: denodo-platform:8.0-latest
        command: ["/bin/sh", "-c", 'if [ -z "$(ls -A /tmp)" ]; then cp -R /opt/denodo/metadata/* /tmp/; fi']
        volumeMounts:
        - name: metadata-volume
          mountPath: /tmp
      containers:
      - name: denodo-container
        image: denodo-platform:8.0-latest
        command: ["./denodo-container-start.sh"]
        args: ["--vqlserver"]
        ports:
        - name: denodo-port
          containerPort: 9999
        - name: web-container
          containerPort: 9090
        volumeMounts:
        - name: config-volume
          mountPath: /opt/denodo/conf/denodo.lic
          subPath: denodo.lic
        - name: metadata-volume
          mountPath: /opt/denodo/metadata
      volumes:
      - name: config-volume
        configMap:
          name: denodo-license
      - name: metadata-volume
Basically, to initialize the volume we first mount it in a different container on a different path, and copy into it all the contents of the folder we want to mount, but only if the volume is empty. That way, the next time the volume is mounted, it will be ready to be used in our main container, including all the default configuration and files coming from the original image.
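The copy-only-if-empty check used by the init container can be tried locally without Kubernetes. The following sketch uses two temporary directories as stand-ins for the image folder and the volume; the function name init_volume and the file db.txt are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# init_volume SOURCE VOLUME
# Copies the contents of SOURCE into VOLUME only when VOLUME is empty,
# using the same test as the init container command.
init_volume() {
    src="$1"
    vol="$2"
    if [ -z "$(ls -A "$vol")" ]; then
        cp -R "$src"/* "$vol"/
        echo "initialized"
    else
        echo "skipped"
    fi
}

# Temporary stand-ins: image_dir plays /opt/denodo/metadata inside the image,
# volume_dir plays the freshly created, empty volume.
image_dir=$(mktemp -d)
volume_dir=$(mktemp -d)
echo "default" > "$image_dir/db.txt"

init_volume "$image_dir" "$volume_dir"   # first start: volume is empty, files are copied
echo "changed" > "$volume_dir/db.txt"    # simulate a change made by the application
init_volume "$image_dir" "$volume_dir"   # later start: volume is not empty, nothing is copied
cat "$volume_dir/db.txt"                 # the change made by the application is still there
```

Because the copy is skipped whenever the volume already has content, restarting the pod does not overwrite changes made after the first initialization.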
If a containerized application ends unexpectedly, you will probably need to check the logs, but if the logs disappear with the container you will probably not be able to check them.
To solve these issues you need to choose a solution to manage your application logs, and Denodo allows you to do that in different ways. Some of the alternatives for managing logs in Denodo:
For instance, in order to redirect the vdp.log file content to the standard output so docker logs can show you the lines, you can just replace the Root logger content in the Log4j configuration file /opt/denodo/conf/vdp/log4j2.xml:
...
<Root level="error">
    <AppenderRef ref="FILEOUT" />
</Root>
...
With:
...
<Root level="error">
    <AppenderRef ref="STDOUT" />
</Root>
...
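The replacement can also be scripted, for example while building a custom image. The sketch below restricts the substitution to the Root logger so that other loggers keep their appenders. It runs against a small temporary sample file rather than the real /opt/denodo/conf/vdp/log4j2.xml; the function name switch_root_appender and the logger name com.example.hypothetical are made up for the illustration, and it assumes an appender named STDOUT is already defined in the configuration.

```shell
#!/bin/sh
# switch_root_appender FILE
# Inside the <Root ...> ... </Root> element only, replace the FILEOUT
# appender reference with STDOUT. Writes to a temporary file and then
# replaces the original, which avoids relying on sed -i.
switch_root_appender() {
    conf="$1"
    sed '/<Root /,/<\/Root>/ s/ref="FILEOUT"/ref="STDOUT"/' "$conf" > "$conf.tmp" \
        && mv "$conf.tmp" "$conf"
}

# Sample fragment standing in for the Loggers section of a Log4j 2 configuration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<Loggers>
    <Logger name="com.example.hypothetical" level="info">
        <AppenderRef ref="FILEOUT" />
    </Logger>
    <Root level="error">
        <AppenderRef ref="FILEOUT" />
    </Root>
</Loggers>
EOF

switch_root_appender "$sample"
cat "$sample"   # the Root logger now references STDOUT; the other logger is unchanged
```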