Introduction
In this document we will see an example of a CI/CD Pipeline for Denodo implemented with:
- Jenkins
- Git (GitHub)
- Denodo Solution Manager (SM)
- Denodo Testing Tool (Optional).
In the following sections, we will break down the process of implementing the steps illustrated in this diagram.
Prerequisites
This document assumes the following:
- Denodo VCS integration is configured. In this example we use GitHub. You can find more information about the integration with your VCS in the section Version Control Systems Integration of the Virtual DataPort Administration Guide.
- An accessible Jenkins instance.
Finally, it is also highly recommended to enable backups so that they are saved once deployments finish.
As an additional note, while this article focuses on a pipeline implemented specifically with Jenkins, a similar approach can be adapted to other tools such as Azure DevOps. For further information on implementing this approach with other tools, please refer to the CI/CD articles in the Knowledge Base.
Step by step overview of the reference procedure
- Developers work on a private Virtual Database (i.e. project1_dev).
- The developer pushes changes to the VCS repository (i.e. project1).
- The push to the VCS notifies Jenkins.
- Jenkins calls the Virtual DataPort server to pull from the VCS repository project1 to the main database in Virtual DataPort (project1_main).
The developer database (project1_dev) and the main database (project1_main) are both synchronized against the same VCS repository project1. This allows the main database to pull changes made by the developer(s) on their own private VDBs.
- (Optional): Integrate the Denodo Testing Tool, which allows you to execute a set of automated tests before deploying.
- (Optional) Policy logic to trigger the deployment.
- Generate an incremental revision.
- Deploy the revision in the target environment.
The next sections will showcase how you could implement the final steps in the Jenkins job using bash commands and a Python script.
Step 1: Developers work on a private Virtual Database.
In this initial step, each VQL developer will engage with a local copy of the database.
This copy enables developers to implement changes and conduct their work without impacting the progress or contributions of other VQL developers, ensuring seamless collaboration while maintaining autonomy over individual tasks.
In order to facilitate working with local copies of databases, developers can refer to the documentation on how to import a Virtual Database (VDB) from a repository.
This leverages the Centralized Workflow with Private Databases model.
Step 2: Developer pushes changes to the VCS repository.
When Denodo developers are modeling, they will sync these changes to the Version Control System (VCS) repository. This involves systematically creating commits to track the progress of their work. Once a certain activity is completed, developers push these changes to the designated repository.
Git Operations details the operations that Denodo offers for working with the repository. In that documentation, typical operations when working with a repository are explained, such as commit, pull, push, or revert. These operations are fundamental in version control systems and are essential for efficient collaboration and project management.
Step 3: The push to the VCS notifies Jenkins.
In order to implement a CI/CD pipeline, once a Denodo developer pushes changes to the VCS, our Continuous Integration/Continuous Deployment (CI/CD) tool, in this case Jenkins, is notified of these changes so it can trigger the next set of actions.
Configure a Webhook
For Jenkins to detect when a change is pushed to the repository, you can configure a Webhook for your repository. A Webhook automatically notifies the specified systems when certain events happen; in our case, a push to the repository.
Note that there could be other alternatives to a Webhook such as Jenkins periodically polling the repository for any changes. This however can be very resource intensive and therefore is not recommended.
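For completeness, if you do opt for polling, Jenkins accepts a cron-style schedule under Build Triggers > Poll SCM. A minimal example, polling roughly every five minutes, would be:

H/5 * * * *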
The example below demonstrates how to set up a webhook in GitHub. The process might look different for other Version Control Systems:
- Navigate to the Settings section of your GitHub Repository.
- Open the Webhooks section.
- Enter the URL of your Jenkins instance and make sure that your Jenkins instance is reachable, e.g. https://myjenkinshost/github-webhook/ (depending on your repository server the endpoint can change; for GitLab, for instance, the endpoint is “/jenkins/project/webhook”, see the section “Set up your Jenkins job” for more information).
- Choose the trigger events, e.g. Push events.
- Click on Add webhook to save the webhook.
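Before relying on the webhook, you can verify that the endpoint is reachable from the network where your VCS runs. A minimal smoke test (using the hypothetical host name from the example above; exact response codes vary by Jenkins configuration) could be:

curl -i -X POST https://myjenkinshost/github-webhook/

Any response other than a connection error indicates that the endpoint is at least reachable; GitHub’s webhook page also shows the delivery status of each event.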
Step 4: Jenkins calls the Virtual DataPort server to pull from the common VCS repository to the main database.
Set up your Jenkins job
In the Jenkins Dashboard navigate to New Item and create a Freestyle project.
- Under Source Code Management select Git and specify the Repository URL with the necessary Credentials to access the repository.
- Make sure your Jenkins server is able to reach your Git Repository.
- You may need to change the Branch Specifier from */master to */main depending on the name of your primary branch.
- Next navigate to the Build Triggers section and choose GitHub hook trigger for GITScm polling.
With the Webhook configured, the Jenkins Job will be started automatically once a change is pushed to the repository.
To implement the reference procedure we need to execute the final steps in the build section.
In the Build section, click on Add build step.
Choose either Execute shell or Execute Windows batch command, depending on your operating system.
Keep in mind that you can configure the shell used by navigating to Manage Jenkins > Configure System > Shell > Shell executable.
In this document we will use bash shell commands.
Finally, you can detail the necessary commands and scripts for the final steps in the ‘Command’ section. Additionally, you have the option to consolidate some or all of these commands along with their parameters into a Python script, assuming Python is installed on the Jenkins Host. The same approach applies to other programming and scripting languages. In this document, we will illustrate an example using Python.
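As a minimal sketch of what such a build step can look like (the script path and arguments here are hypothetical placeholders; the actual script used in this pipeline is shown in Steps 7 and 8):

# Hypothetical build step: delegate the pipeline logic to a Python script,
# passing the Jenkins-provided git variables as arguments
python3 /opt/scripts/pipeline.py "$GIT_PREVIOUS_SUCCESSFUL_COMMIT" "$GIT_COMMIT"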
Step 5: (Optional): Automated testing with the Denodo Testing Tool
The Denodo Testing Tool allows Denodo users to easily automate the testing of their data virtualization scenarios, acting as a safety net before applying any significant changes to these environments (unit and regression testing).
Configuration
Continuing with the Jenkins Job, in the Build section, we can include this optional part, where upon arrival of a new commit, tests will be executed to verify that these new changes do not alter the previously expected behavior.
To achieve this, we will launch a shell script (assuming our Jenkins server is on a Linux machine), and we can use the following syntax, as outlined in the Denodo Testing Tool documentation.
If the tests pass successfully, this script will return a 0 and proceed with the next step of the build process. If the tests fail, it will return a 1 and halt the Jenkins job.
You can add several test steps if you wish. For example, you can first execute some quick tests to fail fast, followed by more exhaustive tests that take longer.
Another interesting option is to have suite files per database and use a Jenkins environment variable to interpolate the file name.
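A minimal sketch of both ideas, assuming one quick suite and one exhaustive suite whose file names are derived from the Jenkins job name ($JOB_NAME is a standard Jenkins variable; the suite naming convention is an assumption):

cd <TESTING_TOOL_HOME>/bin
# Fail fast: run the quick suite first...
bash ./denodo-test.sh <TESTING_TOOL_HOME>/conf/configuration.properties <TESTING_TOOL_HOME>/conf/${JOB_NAME}_quick.denodotest || exit 1
# ...and only then the more exhaustive (slower) suite
bash ./denodo-test.sh <TESTING_TOOL_HOME>/conf/configuration.properties <TESTING_TOOL_HOME>/conf/${JOB_NAME}_full.denodotest || exit 1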
Extensive documentation is available in the Denodo Testing Tool User Manual. Note that the Denodo Testing Tool is not a requirement for the implementation of a CI/CD pipeline, but it can relieve you of common testing scenarios by automating them.
You can run the Denodo Testing Tool from the Jenkins Job:
Example:
cd <TESTING_TOOL_HOME>/bin
bash ./denodo-test.sh <TESTING_TOOL_HOME>/conf/configuration.properties <TESTING_TOOL_HOME>/conf/test1.denodotest
The configuration.properties file includes the data source connectivity settings as well as the reporter settings. With the reporter setting you can configure how the results of the testing tool will be published.
configuration.properties example:
encoding=UTF-8
#--------------------
maxRowsInMemoryForMatching=10000
#---------------------
# configure a CSV file in the workspace of the Jenkins job as test result output
reporter.csv=com.denodo.connect.testing.reporter.CSVTestReporter
reporter.csv.pathToOutputFile=/var/jenkins_home/workspace/my_first_job/CSVReporter.csv
reporter.csv.overwriteOutputFileIfExists=true
#---------------------
# configure the development VDP server as data source
vdpdev.driverClassName=com.denodo.vdp.jdbc.Driver
vdpdev.jdbcUrl=jdbc:vdb://192.168.56.1:9999/project1_main
vdpdev.username=admin
vdpdev.password=admin
vdpdev.dbAdapter=denodo-8.0.0
Example of a simple test file test1.denodotest
%DESCRIPTION
This test is pretty simple. It will fail when SELECT COUNT(*) FROM viewx does not return 20.
#-----------------------------
%EXECUTION[query] {ds:vdpdev}
SELECT COUNT(*) FROM viewx
#----------------------------
%RESULTS[data]
count
20
In this example we configured the Testing Tool to export the results to a CSV file, which we can then parse to evaluate the outcome of the tests.
Depending on the result we can go ahead with the promotion or, if we do not want to promote after a failed test, abort the Jenkins job.
This decision can be implemented as part of a script, as shown in the next section.
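As a minimal sketch of such a check (assuming the CSV reporter output path configured earlier, and that failed tests are marked with a value such as FAIL; verify the actual column layout of your CSVReporter output first):

# Abort the Jenkins job if the CSV report contains any failed test
if grep -q "FAIL" /var/jenkins_home/workspace/my_first_job/CSVReporter.csv; then
    echo "At least one test failed - aborting promotion"
    exit 1
fi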
Step 6: (Optional) Applicable policies
In order to have better control over the workflow, certain behaviors such as generating a revision or deploying a revision can be restricted or triggered through policies.
Using Jenkins Variables
In the given example, we utilize specific variables provided by Jenkins to identify crucial aspects related to commits and job execution. These variables let us track commit information across Jenkins job executions effectively.
- $GIT_PREVIOUS_SUCCESSFUL_COMMIT: This variable indicates the last commit that successfully triggered the Jenkins job. It helps in identifying the commit associated with the previous successful job execution.
- $GIT_PREVIOUS_COMMIT: This variable provides information about the commit preceding the current one. It aids in tracking changes between consecutive commits.
- $GIT_COMMIT: This variable signifies the commit that invoked the webhook, hence representing the current commit being processed by the Jenkins job.
Making use of these variables, we can configure revision generation processes based on various criteria. For instance:
- Revision Generation from Last Successful Commit to Latest: By using the information from $GIT_PREVIOUS_SUCCESSFUL_COMMIT and $GIT_COMMIT, we can generate revisions covering changes between the last successful execution and the current commit.
- Custom Logic Implementation: Jenkins Jobs can define custom logic using variables defined at the job level. These variables can be programmatically managed to cater to specific revision generation requirements. For example:
- Including commits with specific tags or references in commit messages.
- Specifying a minimum number of commits required before generating a revision.
- Implementing any custom logic deemed necessary for revision generation.
In addition to these, the Jenkins documentation linked from the Jenkins build step lists all the variables Jenkins provides. Beyond the predefined variables, we can also create our own variables or use our own systems for persisting the information needed to enforce these policies.
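As a minimal sketch of such policy logic (assumptions: git is available in the Jenkins workspace, $GIT_PREVIOUS_SUCCESSFUL_COMMIT is set, and both the minimum commit count of 3 and the [deploy] marker in the commit message are hypothetical conventions):

# Skip revision generation unless enough commits accumulated since the last successful build
commit_count=$(git rev-list --count "$GIT_PREVIOUS_SUCCESSFUL_COMMIT..$GIT_COMMIT")
if [ "$commit_count" -lt 3 ]; then
    echo "Only $commit_count commit(s) since the last successful build - skipping revision"
    exit 0
fi
# Only proceed when the latest commit message carries the [deploy] marker
if ! git log -1 --format=%B "$GIT_COMMIT" | grep -q "\[deploy\]"; then
    echo "Latest commit is not marked [deploy] - skipping revision"
    exit 0
fi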
Step 7: Generate incremental revision
Denodo 9 introduces the capability to obtain delta changes in VQL straight from git.
This functionality is provided via a new Export command that allows generating the VQL with the differences between 2 commits or tags from a git repository.
This feature is particularly valuable in CI/CD pipelines, and the command was explicitly tailored for these workflows.
In this section we will walk through the steps to create incremental revisions using this new VQL command.
Jenkins build step
Below is a new build step for the Jenkins job, where we call a Shell Script named export.sh, located in <DENODO_HOME>/bin.
This Jenkins build step includes both the generation of the revision and its deployment in the same step. In this example the revision names are derived from the current date; since these are incremental revisions, keeping both actions together makes it easy to sort the revision files chronologically and pass their names between commands.
If the export is generated successfully, the script returns 0 and the build step continues. As discussed in the previous section, the commit variables and any custom policy logic can be adjusted to tailor the revision generation process to your organization's requirements.
vdp_ip="vdp.local" sm_ip="sm1.local" vdp_uri="${vdp_ip}:19999/admin" sm_uri="${sm_ip}:10090" user_vdp=vdp_user user_pass_vdp=vdp_password user_sm=sm_user user_pass_sm=sm_password env2_id=4 #check specific env id of target environment from_commitc=$GIT_PREVIOUS_SUCCESSFUL_COMMIT to_commit=$GIT_COMMIT date=$(date +"%Y%m%d-%H%M%S") output_file=/incremental_revisions/${date}_incr_revision denodo_home="/denodo9/" export_sentence="--revision -l $user_vdp -p $user_pass_vdp -h $vdp_uri -f ${output_file}.vql -fc $from_commitc -tc $to_commit -P includeprivileges=yes -P includeproperties=yes" if [ -n "$GIT_COMMIT" ]; then bash ${denodo_home}/bin/export.sh $export_sentence
else if [ "$GIT_PREVIOUS_SUCCESSFUL_COMMIT" != "$GIT_PREVIOUS_COMMIT" ]; then from_commitc=$GIT_PREVIOUS_SUCCESSFUL_COMMIT to_commit=$GIT_PREVIOUS_COMMIT if [ -z $from_commitc ]; then from_commitc=$GIT_PREVIOUS_COMMIT to_commit="HEAD" fi bash ${denodo_home}/bin/export.sh $export_sentence
else echo "No diffs between commits" exit 0; fi fi python3 /tmp/incremental_revisions/create_and_deploy_revision.py $date $from_commitc $to_commit $sm_uri $user_sm $user_pass_sm $env2_id $output_file |
If we isolate the export statement, it would be as follows:
export_sentence="--revision -l $user_vdp -p $user_pass_vdp -h $vdp_uri -f ${output_file}.vql -fc $from_commitc -tc $to_commit -P includeprivileges=yes -P includeproperties=yes" bash ${denodo_home}/bin/export.sh $export_sentence |
For reference, find below a detailed description of the options available for this command. You can find more details in the documentation.
export --revision
  -l <vdp_user>                  # Mandatory, login
  -p <vdp_pass>                  # Mandatory, password
  -h localhost:9999/admin        # Mandatory, VDP connection details
  -f revision.output             # Optional, output file for the revision
  -fc ab12cd34                   # Mandatory, fromCommit, specifies from which commit to generate the revision
  -tc HEAD~2                     # Optional, toCommit, specifies how far you want to see changes from the fromCommit. If not specified, it's HEAD
  -fd db1,"db 2"                 # Optional, specify if you want to limit the revision generation to these databases
  -P includeUserPrivileges=yes   # Optional, include statements for generating users, roles, and permissions
  -P includeServerProperties=yes # Optional, include VDP parameters
  -P includeProperties=yes       # Optional, in addition to the output file, a file with the .properties extension is generated containing the values of these properties. Sensitive values are encrypted
  -mp customPassword             # Optional, used in conjunction with includeProperties. If specified, encrypted sensitive values are generated with a custom key so they can be imported into other environments where the keystore is not shared
  -P exclude_jdbc_wrapper_properties=yes # Optional, if enabled, properties such as CATALOGNAME and SCHEMANAME are interpolated instead of using parameterized values
Step 8: Deploy the revision in the target environment
Steps to deploy a revision
Once the incremental revision is generated, the steps related to the SM are as follows:
- Creating a revision from VQL
- Deploying the revision
- (Optional) Checking if the revision failed due to missing properties, and in that case, creating them
- (Optional) If the deployment failed and the properties were created, retrying the deployment
- Checking the deployment status after a certain period of time to account for deployment delays.
For this purpose, we have created a Python script that can easily be modified and adapted to the needs of each environment.
This would be the script launched from the Jenkins job:
python3 /tmp/incremental_revisions/create_and_deploy_revision.py $date $from_commitc $to_commit $sm_uri $user_sm $user_pass_sm $env2_id $output_file
In order to run this Python script, the requests library must be installed. You can install it with the following command. Please note that this command installs the library for the current user rather than into an isolated Python environment:
python3 -m pip install --user requests
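Alternatively, to keep the dependency isolated from the system Python, you can create a dedicated virtual environment on the Jenkins host and call its interpreter from the job (the paths below are hypothetical):

python3 -m venv /opt/jenkins-venv
/opt/jenkins-venv/bin/pip install requests
# In the Jenkins build step, invoke the script with the venv interpreter:
# /opt/jenkins-venv/bin/python3 /tmp/incremental_revisions/create_and_deploy_revision.py ...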
Python Script
Here is the code for the Python script:
import sys
import requests
import base64
import json
import time


def convert_to_base64(file_path):
    with open(file_path, 'rb') as file:
        file_content = file.read()
    # Remove the UTF-8 BOM, if present
    if file_content.startswith(b'\xef\xbb\xbf'):
        file_content = file_content[3:]
    base64_content = base64.b64encode(file_content).decode('utf-8')
    return base64_content


def generate_revision_api():
    revision_file = f"{FILE_PATH}.vql"
    base64_content = convert_to_base64(revision_file)
    # Data needed for the POST request
    url = f"http://{IP_ADDRESS}/revisions/loadFromVQL"
    json_data = {
        "name": DATE,
        "description": f"Incremental revision between commits {FROM_COMMIT} and {TO_COMMIT}",
        "content": base64_content
    }
    # Make the POST request
    response = requests.post(url, json=json_data, auth=(USER_SM, USER_PASS_SM))
    # Check the response status
    if response.status_code == 200:
        # Successful request
        json_response = response.json()
        return json_response['id']
    else:
        # Request failed
        print("Error during generate_revision_api request. Status code:", response.status_code, response.json())
        exit(1)


def create_properties_api(json_data):
    url = f"http://{IP_ADDRESS}/environments/{ENV2_ID}/vdpProperties"
    response = requests.post(url, json=json_data, auth=(USER_SM, USER_PASS_SM))
    if response.status_code == 200:
        # Successful request
        json_response = response.json()
        return json_response
    else:
        # Request failed
        print("Error in create_properties_api request. Status code:", response.status_code, response.json())
        exit(1)


def deploy_revision_api(revision_id):
    # Data needed for the POST request
    url = f"http://{IP_ADDRESS}/deployments"
    deploy_json_data = {
        "revisionIds": [revision_id],
        "environmentId": ENV2_ID,
        "description": f"Deployment of incremental revision between commits {FROM_COMMIT} and {TO_COMMIT}"
    }
    # Make the POST request
    response = requests.post(url, json=deploy_json_data, auth=(USER_SM, USER_PASS_SM))
    # Check the response status
    if response.status_code == 200:
        json_response = response.json()
        # If validation failed because of missing properties, try to create them and redeploy
        if json_response.get('validationResponse', {}).get('validation', {}).get('resultVqlValidation') == 'ERROR':
            json_list = []
            missing_properties = json_response.get('validationResponse', {}).get('validation', {}).get('missingVqlPropertiesWithServerValues').keys()
            print("Some properties are missing: ", missing_properties)
            for key in missing_properties:
                value = get_value_from_file(key, f"{FILE_PATH}.properties")
                if value is not None:
                    json_data = generate_properties_json(key, value)
                    json_list.append(json_data)
            if not json_list:
                exit(1)
            print("Creating following properties: ", json_list)
            create_properties_response = create_properties_api(json_list)
            # Retry the deployment once the missing properties exist
            response = requests.post(url, json=deploy_json_data, auth=(USER_SM, USER_PASS_SM))
            json_response = response.json()
        return json_response
    else:
        # Request failed
        print("Error during deploy_revision_api request. Status code:", response.status_code, response.json())
        exit(1)


def get_value_from_file(key, file_path):
    with open(file_path, 'r') as file:
        for line in file:
            if line.startswith(key + '='):
                value = line.strip().split('=', 1)[1]
                value = value.replace('\\', '')
                return value
    return None


def generate_properties_json(key, value):
    json_data = {
        "name": key,
        "value": value
    }
    return json_data


def check_deploy_status(deployId):
    # Data needed for the GET request
    url = f"http://{IP_ADDRESS}/deployments/{deployId}/progress"
    # Make the GET request
    response = requests.get(url, auth=(USER_SM, USER_PASS_SM))
    # Check the response status
    if response.status_code == 200:
        # Successful request
        json_response = response.json()
        return json_response
    else:
        # Request failed
        print("Error during check_deploy_status request. Status code:", response.status_code, response.json())
        exit(1)


if __name__ == "__main__":
    # Check if the expected arguments are provided
    if len(sys.argv) != 9:
        print("Usage: python3 create_and_deploy_revision.py date from_commit to_commit ip_address user_sm user_pass_sm env2_id file_path")
        sys.exit(1)

    # Store the arguments in module-level variables so the functions above can read them
    DATE = sys.argv[1]
    FROM_COMMIT = sys.argv[2]
    TO_COMMIT = sys.argv[3]
    IP_ADDRESS = sys.argv[4]
    USER_SM = sys.argv[5]
    USER_PASS_SM = sys.argv[6]
    ENV2_ID = sys.argv[7]
    FILE_PATH = sys.argv[8]

    # Call the functions with the provided arguments
    revision_id = generate_revision_api()
    deploy_response = deploy_revision_api(revision_id)
    deploy_id = deploy_response.get('deploymentId')
    deploy_status = None
    error = True
    if deploy_id:
        # Wait 20 seconds before checking the status of the deployment
        time.sleep(20)
        deploy_status = check_deploy_status(deploy_id)
        status = deploy_status['state']
        if status != "ERROR" and status != "CANCELLED":
            error = False
        print(f"Revision {revision_id} has been deployed. Deploy Id: {deploy_id} has status: {status}.")
    if error:
        print(f"More info about deploy_status: {deploy_status}")
        sys.exit(1)
    sys.exit(0)
Properties Management and Secure Value Handling
When generating a revision between two commits/tags, the revision includes environment-dependent parameterized values. The export script provides an option to generate a properties file containing these parameter values, encrypting sensitive values.
During deployments to environment 2, since the revision contains parameterized values, these parameters must be added to that environment through the Solution Manager (SM) so that the SM can replace them with the values corresponding to that environment.
Hence, for new elements, the provided Python script includes the option to create a parameter through the SM API if it does not exist. This is achieved by generating a properties file, searching it for the missing property keys and values, and creating them in the deployment environment.
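For illustration, given a line like the following in the generated .properties file (the key and value shown here are hypothetical):

databases.project1_main.ds_jdbc.USERNAME=svc_account

the script's generate_properties_json function produces the entry below, which create_properties_api then POSTs (as part of a list) to /environments/<env_id>/vdpProperties:

{"name": "databases.project1_main.ds_jdbc.USERNAME", "value": "svc_account"}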
However, a challenge arises in Denodo 9 regarding the keystore. Starting with Denodo 9, each installation has its own keystore, meaning encrypted values on one machine cannot be imported into another unless both machines share the same keystore/key or are exported with a custom key and imported with the same custom key.
Currently, importing these values using a custom key through the SM API is not possible. Therefore, the Virtual DataPort servers of both Environment 1 and Environment 2 must share the same key/keystore, as must the SM itself.
Looking ahead, when the SM allows importing values via API with a custom password, the export command will include the option to generate the properties file with sensitive values encrypted using a custom key.
Summary
This document presents a refined workflow for efficient database development, leveraging Denodo Virtual Database in conjunction with Jenkins integration.
Developers start their work within their private Virtual Database, subsequently pushing modifications to a Version Control System (VCS) repository. Upon VCS notification, Jenkins orchestrates actions such as pulling updates from the VCS repository into the main database via the Virtual DataPort server. Optionally, integration with the Denodo Testing Tool facilitates automated testing. Incremental revisions are then systematically generated and seamlessly deployed within the target environment, ensuring a streamlined and dependable deployment process.
References
Denodo Testing Tool - User Manual
The information provided in the Denodo Knowledge Base is intended to assist our users in advanced uses of Denodo. Please note that the results from the application of processes and configurations detailed in these documents may vary depending on your specific environment. Use them at your own discretion.
For an official guide of supported features, please refer to the User Manuals. For questions about critical systems or complex environments, we recommend contacting your Denodo Customer Success Manager.