This document is a compilation of best practices and guidelines about promoting elements between environments.
Note: the Solution Manager can work in Standard mode (when the cluster is created manually) and Automated mode (when all the resources are managed by the Solution Manager). Although there are specific sections for the environment configuration, most of the recommendations in this article apply to both working modes: Standard and Automated.
The Denodo metadata evolves over time and has its own lifecycle. Changes made in the Development environment need to be consolidated on Testing and/or Staging before they are finally promoted to the Production environment.
In addition, to offer high availability and low latency, companies tend to build large infrastructures clustering the Denodo Servers. Such infrastructure may consist of several geographically distributed clusters, made up of several Denodo Servers sharing the same metadata.
Therefore, administrators need to control the promotion of changes while keeping the Denodo Servers synchronized in each environment. This complex responsibility includes the following tasks:
- Denodo VDP Servers metadata properties adjustment.
- Denodo VDP Servers metadata synchronization.
- Denodo Scheduler metadata synchronization.
- Denodo Data Catalog metadata synchronization.
- Denodo Cache and Summaries synchronization.
- Backup configuration.
- Rollback configuration.
- User authorization management.
- Deployment progress monitoring.
This document describes the best practices for each of those tasks organized throughout the main metadata lifecycle phases.
First, the document describes some best practices for the implementation phase that will make the promotion easier.
Then, the document describes the best practices regarding the Solution Manager Administration Tool's main features: environment configuration, revision creation, and deployment execution.
Finally, the article explains some limitations and possible workarounds.
Implementation Best Practices
Ensuring a strong naming convention for all elements is necessary so that the administrators (or developers) creating a Revision have no ambiguity about whether what they are doing is appropriate to promote.
In addition, it supports auditing by making it clearer who has promoted what.
As part of the promotion tasks, it is often necessary to manually (or automatically) execute some Scheduler Jobs to update the cache, statistics, and/or summaries. To easily identify the load processes needed for each deployment, it is recommended to give the Scheduler Jobs meaningful names that follow the naming conventions.
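A naming convention is easiest to keep when it can be checked mechanically. The following is a minimal sketch, assuming a hypothetical convention of the form `<job type>_<database>_<view>`; the pattern, the job types, and the function names are illustrative assumptions, not Denodo rules, and should be adapted to the convention actually in use.

```python
import re

# Hypothetical convention (an assumption, not a Denodo rule):
#   <job type>_<database>_<view>, all lowercase, e.g. "cache_sales_customer_orders".
# The database part may not contain underscores in this simplified pattern.
JOB_NAME_PATTERN = re.compile(r"^(cache|stats|summary)_([a-z0-9]+)_([a-z0-9_]+)$")


def parse_job_name(name):
    """Return (job_type, database, view), or None if the name breaks the convention."""
    match = JOB_NAME_PATTERN.match(name)
    return match.groups() if match else None


def jobs_for_database(job_names, database):
    """List the conforming jobs that load elements of the given database."""
    selected = []
    for name in job_names:
        parsed = parse_job_name(name)
        if parsed is not None and parsed[1] == database:
            selected.append(name)
    return selected
```

With names like these, the jobs to review for a deployment of a given database can be listed directly from the job names, and any name that returns `None` can be flagged for renaming.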
After promoting the metadata it is necessary to update the cache tables. The easiest way to do it is by automating the execution of Scheduler Cache Jobs.
One option is to have a single Scheduler Job for each individual cached view, because this makes it clearer which Job has to be executed after the promotion (depending on whether the corresponding view has been added to the Revision).
However, when working with a very large and complex set of Scheduler Jobs, it is easy to lose track of what must be executed. In that case, it is recommended to add a single Scheduler Job containing all the views that need to be updated on a given schedule. This approach brings better visibility and control over concurrency and a better view of the overall enterprise schedule, but it can be very time-consuming to execute with every Revision.
Having a single job for each view makes it easy to include the corresponding cache loads in the Revision, whereas a single job covering many cache loads makes the overall workload easier to manage. Therefore, a hybrid approach is recommended:
- Scheduler Jobs containing many cached views. These jobs will be executed to update the cached views on the workload schedule and also when a Revision including the full metadata model is executed.
- Scheduler Jobs for each view. These jobs will be (automatically) executed when the corresponding view is included in the Revision (and they are marked for execution). They often do not need their own CRON schedule, as it would be largely redundant with the multi-view job(s).
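The hybrid approach above can be sketched as a small selection rule. This is an illustrative sketch only: the function, its parameters, and the job names are hypothetical, and it does not call any Solution Manager or Scheduler API.

```python
def cache_jobs_to_run(revision_views, per_view_jobs, bulk_jobs, full_model=False):
    """Pick the cache jobs to mark for execution for one Revision.

    revision_views: names of the cached views included in the Revision.
    per_view_jobs:  dict mapping view name -> name of its single-view job.
    bulk_jobs:      names of the jobs that load many cached views at once.
    full_model:     True when the Revision contains the full metadata model.
    """
    if full_model:
        # A full-model Revision reloads everything through the multi-view jobs.
        return sorted(bulk_jobs)
    # Otherwise run only the single-view jobs of the views being promoted.
    return sorted(per_view_jobs[view] for view in revision_views if view in per_view_jobs)
```

Views without a single-view job (for example, views that are not cached) are simply ignored, so the same call works for Revisions that mix cached and non-cached elements.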
Users and Roles
Companies implementing strong security policies may have different users (and roles) in each environment. In those cases it is recommended to use a naming convention, using prefixes (or suffixes) to distinguish them.
It is also useful to name the users common to different environments with a common prefix (or suffix) like “common_username”.
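As an illustration of how such prefixes can be used, the sketch below classifies user names by environment. The prefixes `dev_`, `test_`, and `prod_` are assumptions made for this example; only the `common_` prefix comes from the text above.

```python
# Hypothetical environment prefixes (assumptions for illustration only).
ENVIRONMENT_PREFIXES = {"dev_": "development", "test_": "testing", "prod_": "production"}
COMMON_PREFIX = "common_"


def classify_user(username):
    """Return the environment a user belongs to, 'common', or None if unprefixed."""
    if username.startswith(COMMON_PREFIX):
        return "common"
    for prefix, environment in ENVIRONMENT_PREFIXES.items():
        if username.startswith(prefix):
            return environment
    return None


def users_allowed_in(usernames, environment):
    """Keep only the users that may be promoted to the given environment."""
    return [u for u in usernames if classify_user(u) in ("common", environment)]
```

A check like this can be run over the users selected for a Revision to catch, for instance, a development-only user about to be promoted to Production.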
Environment Configuration Best Practices
Deployment Backup and Rollback
Configuring the automatic backup and rollback is a good practice when dealing with small to medium-sized metadata volumes.
When dealing with significant metadata volume, the backup and rollback could be slow. The backup process will not directly impact the business (or QA) users in the target environment, but additional deployments cannot start until it is done, and deployments Without service interruption will not restore full cluster load until the backup is complete. If an error occurs during the deployment, however, a long-running rollback could significantly impact user activity in the target environment. In those cases, consider delegating the responsibility to an external process, such as a database-level backup handled by the DBA of the external metadata database.
A failed deployment can be manually rolled back by creating a new Revision (from the most recent backup) using the Revision from VQL option and deploying that revision in the target server.
In offline promotion scenarios (where a service interruption during the deployment is acceptable) it is recommended to configure the deployment as Simple type. This is recommended, for instance, in Development and Testing environments where a service interruption is not critical.
In Production environments, where an offline update is not possible, it is recommended to configure the deployment Without service interruption type with a half by half strategy (to avoid data inconsistency during the deployment execution).
Note: to deal with the load balancer, the Solution Manager uses the scripts defined in the Deployment Scripts section. These scripts are used to disable the Denodo Servers in the load balancer. Based on the previous recommendation, these scripts will be used mostly in Production environments. However, it is recommended to test the scripts first in a non-production environment, which requires configuring the Without service interruption type there (review Configuring Deployments).
Secondary Shared Cache
The Secondary shared cache (cache swap) guarantees that all the Denodo Servers in a cluster will have a valid cache (data and schema) during the deployment (the Denodo Servers that have not yet been updated are not affected by the changes in the cache).
Warning: the cache swap process could be quite slow since all the Scheduler cache jobs need to be executed.
Note: when configuring Scheduler in Cluster mode, each node of the cluster shares the same database (global database). The Secondary Shared Cache (in cluster mode) only works with the global cache (it does not consider caches defined individually).
The “VDP Servers use a shared metadata database” option indicates (when enabled) to the Solution Manager that the Denodo Servers in the target environment share an external database for the metadata.
Warning: Automatic backups and rollbacks are not currently available with this option. However, a failed deployment can still be manually rolled back by creating a new Revision from VQL as explained in the Deployment Backup and Rollback section.
Automated Cloud Mode
It is recommended to enable the Minimize Downtime option on the target Environment to maximize the service availability. Solution Manager recreates the cluster, waits for load balancer health checks to complete for the new servers, and then removes the old servers. This option minimizes possible Denodo Server downtime.
Warning: this option does not apply if the environment is configured with an external database to store the metadata.
When the deployment time is not critical, it is recommended to use the option “Update the image and recreate the cluster”. The cluster availability will not be affected during the deployment (and rollback would be automatically performed if needed) but deployments can take several minutes longer due to image creation and cluster recreation.
The option “Perform changes directly in the servers” is usually the fastest but several points must be taken into account:
- The availability of the cluster may be affected during the deployment.
- Because the image is not updated:
- Changes may need to be rolled back manually if something goes wrong.
- The cluster recreation must be done with the option “recreate cluster from server”, otherwise changes will be lost.
This deployment option is intended mainly for development or testing environments, where changes are frequent and the time saved may matter more than the lack of automatic rollback.
Warning: if the Virtual DataPort cluster has auto-scaling enabled, the option “Perform changes directly in the servers” is only available when the Virtual DataPort cluster uses a shared metadata database.
Revision Creation Best Practices
Who selects the elements and what elements will be selected depends on the development workflow:
- Developers create their own Revisions after completing a task. Such a methodology should usually standardize the naming convention or description of Revisions to reference an ID or similar descriptor from the task management system.
- Developers check their code in via VCS and the development lead creates Revisions after code review is complete (possibly grouping multiple developers’ work into a single Revision). As before, it is recommended to reference the task ID (or similar).
- Automated workflow where a developer marks code as complete in the task management and an external process uses this to generate a Revision via API should follow a similar methodology: DESC only the elements indicated as new or modified to include in the Revision, and name it according to the task ID.
- Automated CI/CD workflow that performs a daily “full build” of all changes should leverage the time range options on GET_ELEMENTS() to DESC only the elements created or modified since the last build, following a naming convention referencing the interval of such builds.
In all four workflows, multiple Revisions can still be Promoted in one process, as discussed below.
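Whatever the workflow, the Revision names can be generated rather than typed by hand. The sketch below assumes a hypothetical task ID format such as `DV-1234` and a hypothetical `build_<start>_to_<end>` pattern for interval builds; both are assumptions to be replaced by the conventions of the task management system in use.

```python
import re
from datetime import date

# Hypothetical task ID format, e.g. "DV-1234" (an assumption; use your tracker's format).
TASK_ID_PATTERN = re.compile(r"^[A-Z]+-\d+$")


def task_revision_name(task_id, summary):
    """Name a Revision after the task that drove it."""
    if not TASK_ID_PATTERN.match(task_id):
        raise ValueError(f"unexpected task ID: {task_id!r}")
    # Reduce the free-text summary to lowercase words joined by underscores.
    slug = re.sub(r"[^a-z0-9]+", "_", summary.lower()).strip("_")
    return f"{task_id}_{slug}"


def build_revision_name(start, end):
    """Name a daily 'full build' Revision after the interval it covers."""
    return f"build_{start.isoformat()}_to_{end.isoformat()}"
```

For example, `task_revision_name("DV-1234", "Add customer orders view")` yields `DV-1234_add_customer_orders_view`, which can be traced back to the task without opening the Revision.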
The list of candidate elements will depend on the permissions of the user used to connect to the server when the revision is created. In the server definition dialog for the selected source Environment server, there is the option “Authenticate with current user credentials for creating revisions” and the behavior is the following:
- Enabled: when connecting to the Denodo Server to request the candidate elements, Solution Manager will connect using the current user (logged in the Solution Manager).
- Disabled: when connecting to the Denodo Server to request the candidate elements, Solution Manager will connect using the user from the server definition dialog (User field).
This option further controls a user’s access beyond being able to create revisions. Because the same user is authenticated against the source server, the candidate elements for the Revision are limited to those the user has access to.
The option “Authenticate with current user credentials for creating revisions” is a good approach when the revisions are created in the Solution Manager by the developers, as it ensures they only have access to their own elements. In this way, a user cannot accidentally include elements that may not be ready for promotion.
The revision granularity will depend on the use cases and the metadata lifecycle. There are three broad strategies for creating Revisions:
- Promote everything.
- Promote a database.
- Promote selected elements.
It is recommended to create revisions including everything (or whole databases) for the first deployment of an implementation, for a large refactoring of the whole model, or when the metadata model does not (yet) contain many elements. In most cases, after the initial deployment, the recommended option is to create a revision with just the specific elements that have changed (or have been created).
Warning: take into account that the more elements a given Revision includes, the harder it will be to identify when a distinct change was introduced, obscuring the auditing and traceability of the change history. Thus, the more granular a Revision is, and the more distinct the change task that drove it, the clearer the history of any given object will be.
Cache Data Sources
When the target environment contains multiple clusters that do not share the same cache (for instance in geographically distributed implementations with a local cache for each cluster), the cache data sources should NOT be included in CREATE revisions. (Otherwise, during the deployment process, the cache configuration will be overwritten with the single value from the Environment properties.) Any changes to the cache configuration must be manually configured for each distinct cluster.
In such scenarios, to protect from accidental inclusion, the cache data source should be isolated with minimal, if any, non-admin access granted, and the source (Development) servers used for Revisions should have the “Authenticate with current user credentials for creating revisions” option enabled. In this way, the majority of users responsible for creating Revisions cannot accidentally include the cache in any Revision.
Solution Manager offers the option “Execute job when revision is deployed”. Although there are some situations where the use of this functionality is justified (after updating a summary to refresh the content, for instance), it is generally not recommended to enable this option because it could considerably increase the deployment execution times, depending on the type of job. Instead, coordinate deployments with the next planned execution of such jobs so that their existing schedule will follow shortly after the deployment. The tolerance for how soon “shortly” needs to be will vary based on the type of job and business expectations.
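The coordination described above amounts to comparing the end of the deployment window with each job's next scheduled execution. The following is a minimal sketch under that assumption; the schedule and tolerance dictionaries are hypothetical inputs that would have to be collected from Scheduler and from the business owners, and nothing here queries Denodo.

```python
from datetime import datetime, timedelta


def runs_soon_enough(deployment_end, next_scheduled_run, tolerance):
    """True if the next scheduled run starts after the deployment ends
    and within the agreed tolerance."""
    delay = next_scheduled_run - deployment_end
    return timedelta(0) <= delay <= tolerance


def jobs_needing_manual_run(deployment_end, schedule, tolerances):
    """Return the jobs whose next run is too far away (or starts before the
    deployment ends).

    schedule:   dict job name -> datetime of its next scheduled execution.
    tolerances: dict job name -> timedelta the business accepts for stale data.
    """
    return sorted(
        job for job, next_run in schedule.items()
        if not runs_soon_enough(deployment_end, next_run, tolerances[job])
    )
```

Jobs returned by `jobs_needing_manual_run` are the candidates for either moving the deployment window or, exceptionally, marking the job for execution with the Revision.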
VDP Cache Jobs
In cluster environments, the Solution Manager assumes that the cache is shared. Therefore, when a revision includes Scheduler Cache Jobs that have to be executed after the deployment, they will be executed only on the first Denodo Server.
The Solution Manager does not enforce the execution of the cache jobs. It is the responsibility of the person creating the revision to mark the necessary jobs for execution as part of the revision configuration. Then, when the revision is deployed the Solution Manager will launch the jobs marked.
The Solution Manager also launches the Scheduler Cache Jobs (included in the revision) when the swap cache is configured (even if they are not marked for execution).
Warning: it is recommended to share the cache in cluster environments (and the Solution Manager assumes it). Therefore, when the cache is not shared, it is necessary to create Scheduler Cache Jobs to load every single cache in the cluster and to execute them after a deployment (either marking them for execution or manually starting them).
Warning: the swap cache option only applies to those deployments with at least one CREATE revision that contains a Scheduler VDP Cache job.
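The condition in the last warning can be expressed as a simple predicate. This is a sketch over a hypothetical in-memory description of a deployment (a list of dictionaries with `type` and `has_vdp_cache_job` keys); it is not the Solution Manager data model.

```python
def swap_cache_applies(revisions):
    """True if at least one CREATE revision contains a Scheduler VDP Cache job.

    revisions: list of dicts with the (hypothetical) keys
               "type" ("CREATE" or "DROP") and "has_vdp_cache_job" (bool).
    """
    return any(
        revision["type"] == "CREATE" and revision["has_vdp_cache_job"]
        for revision in revisions
    )
```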
Summary Refresh Jobs
When the revision includes a summary update, the Scheduler Job to update the summary must be executed after the revision deployment.
Users and Roles
Users, roles, and privileges could be added as part of a revision. When a user or role is selected, the revision will include only the VQL to create (or update) these users or roles (not their privileges). The privileges are automatically added when the element (over which the user or role has the privilege) is included in the revision.
Therefore, if any privilege changes, the user (or role) must be added to the revision together with the element over which the user or role has the privilege (even if the element has not changed). To successfully promote the desired privileges, follow the recommendation below that matches how the users (and roles) in the source and target environments relate:
- Users and roles are consistent through environments.
When a revision may contain changes in the privileges, it is recommended to always include all the users and roles in a revision (option “Include users and privileges” in the revision wizard).
- Users and roles in the target exist on the source.
When a revision may contain changes in the privileges, it is recommended to always include all the target users and roles in a revision (for instance, production users).
- Users and roles are different between environments.
In this case, the users and roles can not be selected via the Revision Wizard, as they do not exist in the source. The appropriate GRANT statements can be issued via a Revision from VQL, which can then be Deployed via standard methods. This scenario does not require the elements to be included in the same VQL, but elements must exist in the target environment before the privilege Revision is Deployed. As such, any new elements must be Deployed prior to the Revision that GRANTs privileges on them. Note that creating a Revision from VQL requires Solution Manager Administrator privileges, as it would otherwise violate the security around Revision creation. Do not GRANT access directly in higher (non-Dev) environments without using the Promotions process, as this would subvert the backup/rollback functionality, negate the ability to effectively track the changes, and require redundant tasks among multiple nodes, with a high chance for human error.
Warning: the “Include users and privileges” approach makes privilege management easier but has the drawback that some unnecessary users or roles may be promoted. The alternative is to promote just the users or roles affected by the selected elements.
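The rule stated at the start of this section, that a privilege is only promoted when both the user (or role) and the element it applies to are in the revision, can be sketched as follows. The function and its inputs are hypothetical; the sketch only computes what to select in the revision wizard and does not generate VQL.

```python
def revision_contents(changed_elements, privilege_changes):
    """Work out what a Revision must contain so privilege changes are promoted.

    changed_elements:  set of element names modified by the task.
    privilege_changes: iterable of (user_or_role, element) pairs whose
                       privileges changed.
    Returns (elements, users_and_roles) to select in the Revision.
    """
    elements = set(changed_elements)
    users_and_roles = set()
    for principal, element in privilege_changes:
        # The privilege only travels with the element, so include the element
        # even if its own definition has not changed.
        elements.add(element)
        users_and_roles.add(principal)
    return sorted(elements), sorted(users_and_roles)
```

This corresponds to the more selective alternative mentioned in the warning: only the users or roles actually affected are included, at the cost of having to track the privilege changes explicitly.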
Promotion Workflow Best Practices
The promotion workflow is tied to the use case and company practices. Therefore, this section presents a series of good practices that should be taken into account but adjusted to the company's needs.
The development environment experiences frequent change. Those changes must be tracked using a Version Control System. However, once the change has been consolidated and the scope of the changes is clear, this must be reflected in the Solution Manager by creating a revision including all the changed or updated elements.
Each company must define the promotion responsibilities (who has to do it and what has to be done). This is an example of development iteration:
- Developer: performs a change in the implementation.
- Developer: pushes the changes to the VCS.
- Development Lead: pulls the changes (from the VCS) to the Development “build” Server.
- Developer or Development Lead: creates a revision (or revisions) on the Solution Manager reflecting the changes. Each Revision should be validated against the target environments to identify any Properties that are not yet defined in Solution Manager.
- Promotions Administrator or Target Environment Metadata Manager: adds (or updates) the Virtual DataPort Properties (environment level) and/or Scheduler Properties (cluster level) needed to deploy the revision in the target environments.
- Promotions Administrator or Developer with Promotion rights: promotes one or more revisions to the next highest environment.
- Promotions Administrator or Developer with Promotion rights: reviews the results of the promotion to confirm success or failure, taking appropriate action. When automatic backups are configured, this should include validating the backup success, especially if the target environment is configured for automatic Rollbacks.
Note: step number four may include one or several revisions depending on the promotion granularity. In any case, meaningful names and descriptions must be used to reflect the content of the revision(s).
Note: this example is deliberately simplified; in practice, promotion between environments must include testing steps before promoting to the next highest environment. Also, the process shows only a few actors, but different Global Privileges could be used to define and limit the actor roles.
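The validation mentioned in step four, identifying Properties that are not yet defined for the target environment, is essentially a comparison between the properties a Revision needs and those already configured. A minimal sketch, assuming both lists have been obtained by some other means (the property names used in the usage example are made up):

```python
def missing_properties(required_keys, environment_properties):
    """Return the property keys a Revision needs that the target does not define.

    required_keys:          property names referenced by the Revision.
    environment_properties: dict of property name -> value already configured
                            for the target environment (or cluster).
    """
    return sorted(
        key for key in required_keys
        # A key that is absent, or present with an empty value, counts as missing.
        if key not in environment_properties or environment_properties[key] in (None, "")
    )
```

For instance, given a target that defines only a hypothetical `ds_orders.DATABASEURI`, a Revision that also needs `ds_orders.USERNAME` would report that key, telling the Promotions Administrator what to add before deploying.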
When an element is renamed, it is not possible to directly promote the action to other environments using the Solution Manager, as it does not support name change tracking (the renamed element will be seen as a new element instead).
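To see why, consider a comparison that only knows element names. The sketch below is a simplified model for illustration, not a description of Solution Manager internals: a renamed element shows up as one new element in the source and one leftover element in the target, with nothing linking the two.

```python
def compare_by_name(source_elements, target_elements):
    """Name-based comparison: returns (new_in_source, only_in_target)."""
    source, target = set(source_elements), set(target_elements)
    return sorted(source - target), sorted(target - source)
```

If `orderlines` is renamed to `order_lines` in Development, the comparison reports `order_lines` as new and `orderlines` as existing only in the target, so promoting the new element alone would leave both in the target environment.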
There are three possible workarounds to correctly reflect the changes in higher environments: rename the element in the target environment, drop and create the element again in two distinct Revisions, or Promote a RENAME command via a Revision from VQL.
The first option is strongly discouraged, as it evades the change tracking of a strong Promotions process, and introduces the possibility of human errors or inconsistent states in multi-node environments.
The second alternative is to use the Solution Manager to create a DROP revision (before renaming the element) and then a CREATE revision. By Promoting both Revisions, the environments will be synchronized. However, this approach can be quite cumbersome when dealing with dependencies (a data source for instance).
The recommended approach would be the third one: create a revision from VQL containing the RENAME statements. This approach will bring more control and traceability.
Note: in general, it is not recommended to rename elements (they should be deprecated instead).
Graphically created Revisions will never issue an ALTER ROLE revoking privileges. Privilege changes through Solution Manager are additive only. Therefore, revokes need to be handled via Revision from VQL.
Web Services Deletion
Solution Manager currently allows deleting web service operations, but not the Web Service itself. Therefore, as with the RENAME statements, the recommendation is to create a revision from VQL that drops the Web Service, so that the source and target environments are correctly synchronized after the revision is promoted.
Note: the VQL must include the database CONNECT and CLOSE statements.
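As an illustration of the note above, the following sketch assembles the text of such a VQL script by wrapping the statements between a CONNECT and a CLOSE. The helper is hypothetical, and the `DROP WEBSERVICE` statement in the usage example is an assumption that should be checked against the VQL Guide for the Denodo version in use.

```python
def revision_vql(database, statements):
    """Wrap VQL statements with the CONNECT and CLOSE the Revision needs."""
    lines = [f"CONNECT DATABASE {database};"]
    for statement in statements:
        # Make sure every statement ends with exactly one semicolon.
        lines.append(statement.strip().rstrip(";") + ";")
    lines.append("CLOSE;")
    return "\n".join(lines)
```

Calling `revision_vql("sales", ["DROP WEBSERVICE IF EXISTS ws_orders"])` produces a three-line script (connect, drop, close) that can be pasted into the Revision from VQL dialog. The same wrapper can be used for the RENAME and privilege revisions discussed earlier.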