Goal
This document describes the considerations that you should keep in mind before starting to use branching in the Virtual DataPort integration with Version Control Systems.
Introduction
Virtual DataPort allows integration with a Version Control System. You can review this functionality in the Version Control Systems Integration section of the Virtual DataPort Administration Guide. Denodo supports integration with Subversion (SVN), Git and Microsoft TFS VCS servers.
Starting in Denodo 9, however, Denodo supports Git, including Git repositories provided by Microsoft TFS. Integration with Subversion (SVN) and Microsoft TFS is now deprecated.
When developing in Denodo using VCS integration, some users need several teams to work on the same project, each in a different area of it. Those teams want to work in a collaborative environment (pushing to the repository so that their team members can use the changes) without affecting the other teams.
Denodo 9.2 introduced a new feature, development workspaces, to simplify collaborative development by multiple teams in the Denodo Platform. Development workspaces are isolated areas based on Git branches, where developers can create, edit, remove and test elements without affecting others.
To set this up, users with experience in code development and VCS usually think first of defining the workspaces and branches as follows:
- Integrate the main workspace with the master branch in the Git repository.
- Define one workspace for each team, which automatically creates a branch in the Git repository.
- Have the developers on each team work in their own workspace and synchronize their work with its associated branch.
- Merge the branch into the master branch once the feature is developed.
This approach, traditionally followed in code development, can be applied to Denodo development, but some considerations need to be taken into account regarding the use of branches when working with VCS in Denodo.
Starting with Denodo 9.2, the Virtual DataPort integration with Version Control Systems provides a merge operation for workspaces other than the main workspace, which lets developers merge the changes from another branch into their current branch. Users can 1) merge from the main workspace (master branch) and 2) merge from a remote branch in Git into their current workspace. This allows developers to bring the latest changes from the main branch, or from any other remote branch, into their own branch and so stay in sync with other people's work.
However, the following actions must be performed with an external tool, such as a Git client:
- merging branches and resolving conflicts of the same file by combining the code changes
- gated operations with approval flow like merge request or pull request that merge from the developer’s branch to the protected master branch
The Virtual DataPort integration with Version Control Systems does not provide functionality for editing code inline to resolve merge conflicts, or for approving merge requests or pull requests.
We therefore recommend reviewing this document to decide whether using branches fits your scenario, given the limitations described below.
Branching Limitations
The most common problem when working with multiple branches is that the VQL produced by merging several branches can be incorrect, and resolving the resulting conflicts is complex.
The reason is that VQL is not code: you cannot compile it to validate it. The problems come from changes to elements that have implications for other elements. The same problem occurs with any conventional database, not just with VQL and Denodo: the DDL statements must be correct in order to load them into the RDBMS.
When you import VQL or perform a merge with no conflicts in a Virtual DataPort server, the server manages the change propagation and detects the problems. Doing the merge and resolving conflicts in an external tool, looking only at the VQL, is not that easy.
Let's explain this with a simple example. Imagine that we have 2 views (department and employee) and that 2 branches are created for 2 different features:
- project1_branch_feature1: renames one of the employee fields and creates a new view on top of the department and employee views.
- project1_branch_feature2: creates a new view on top of the department and employee views.
In the next step, an integrator merges both branches into the integration branch.
This merge can complete without any conflict, but the absence of conflicts does not mean that everything is fine. In this scenario, the view created in project1_branch_feature2 on top of the employee view will fail because it uses a field that no longer exists (it was renamed by project1_branch_feature1). As a result, the pull of the integration branch will fail because some elements cannot be loaded (missing field), and the VQL will need to be fixed in an external tool.
This is the simplest situation that can happen. The more complicated your merge gets, the more difficult it is to review the impact of the changes in an external tool. It becomes even more difficult if external jars or global elements like i18n maps are used.
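The scenario above can be reproduced with a small Python sketch. It is only an illustration, not how Virtual DataPort or Git work internally: views are modelled as plain dictionaries, the field names (id, name, full_name, dept_id) and the view names v_feature1 and v_feature2 are hypothetical, and three_way_merge is a simplified element-level merge standing in for a real Git merge.

```python
from copy import deepcopy

# Hypothetical metadata model (an assumption for illustration only):
# view name -> {"fields": set of field names, "uses": {source view: fields read}}
BASE = {
    "department": {"fields": {"id", "name"}, "uses": {}},
    "employee": {"fields": {"id", "name", "dept_id"}, "uses": {}},
}


def feature1(base):
    """project1_branch_feature1: rename employee.name and add a view on top."""
    branch = deepcopy(base)
    branch["employee"]["fields"] = {"id", "full_name", "dept_id"}
    branch["v_feature1"] = {
        "fields": {"full_name", "department_name"},
        "uses": {"employee": {"full_name", "dept_id"},
                 "department": {"id", "name"}},
    }
    return branch


def feature2(base):
    """project1_branch_feature2: add a view that still reads employee.name."""
    branch = deepcopy(base)
    branch["v_feature2"] = {
        "fields": {"name", "department_name"},
        "uses": {"employee": {"name", "dept_id"},
                 "department": {"id", "name"}},
    }
    return branch


def three_way_merge(base, ours, theirs):
    """Simplified element-level merge: an element conflicts only when both
    branches changed it in different ways."""
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t or t == b:
            result = o          # same on both sides, or only "ours" changed it
        elif o == b:
            result = t          # only "theirs" changed it
        else:
            conflicts.append(name)
            result = o          # placeholder; a person would have to decide
        if result is not None:
            merged[name] = result
    return merged, conflicts


def broken_references(catalog):
    """List (view, source view, field) for every field a view reads that its
    source view does not expose."""
    problems = []
    for view, definition in sorted(catalog.items()):
        for source, needed in sorted(definition["uses"].items()):
            available = catalog.get(source, {}).get("fields", set())
            for field in sorted(needed - available):
                problems.append((view, source, field))
    return problems


if __name__ == "__main__":
    merged, conflicts = three_way_merge(BASE, feature1(BASE), feature2(BASE))
    print("merge conflicts:", conflicts)
    print("broken references:", broken_references(merged))
```

Each branch is valid on its own and the merge reports no conflicts, yet the merged catalog contains a view that reads a field that no longer exists. A textual merge cannot see this kind of dependency; it only surfaces when the elements are loaded into the server.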
Branching Best Practices
The recommendations to minimize these problems are:
- Enable workspaces for each developer or each feature/change to easily manage and isolate development work on Git branches.
- If workspaces are used, manage all external jars or global elements like i18n maps in the Main workspace. This ensures a single source of external jars and global elements and reduces conflict management across other workspaces.
- In most cases, branches can add new elements or modify elements provided that there is no intersection between the elements added or modified by the different branches. This can be achieved by following a good set of naming conventions and development collaboration.
- If projects depend on common projects or elements that must be added or modified in different branches, developers must agree on who owns each common element and coordinate the order in which they push changes to the master branch, in order to reduce merge conflicts. Each developer should then work in their own workspace and branch, merging in the common elements changed by other team members.
- Merge frequently to minimize the number of conflicts you run into.
- When conflicts are found, the integrator should coordinate with the person responsible for each branch in order to resolve them.
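The recommendation that branches should not intersect in the elements they add or modify can be checked before merging. The following Python sketch assumes you already have, for each branch, the set of elements it changed (for example, derived from the changed files listed by a Git client; how elements map to files in your repository is not covered here). The function name overlapping_elements and the element names are hypothetical.

```python
from collections import defaultdict


def overlapping_elements(changes_by_branch):
    """Return {element: [branches]} for every element that more than one
    branch adds or modifies.

    changes_by_branch maps a branch name to the set of element names that
    the branch changed (assumed to be collected beforehand).
    """
    touched_by = defaultdict(set)
    for branch, elements in changes_by_branch.items():
        for element in elements:
            touched_by[element].add(branch)
    return {
        element: sorted(branches)
        for element, branches in sorted(touched_by.items())
        if len(branches) > 1
    }


if __name__ == "__main__":
    changes = {
        "project1_branch_feature1": {"employee", "v_feature1"},
        "project1_branch_feature2": {"employee", "v_feature2"},
    }
    # Any element listed here needs coordination between the branch owners.
    print(overlapping_elements(changes))
```

An empty result only means that the branches do not edit the same elements. It does not rule out the dependency problem described in Branching Limitations, where one branch changes an element that another branch merely uses, so frequent merges and loading the result into a server remain necessary.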
Alternatives to the usage of branches
One reason for deciding to use workspaces and branches is to let the developers work on some functionality in a collaborative environment without affecting other users. There is one alternative for achieving that objective without using branches for those scenarios where the limitations described above prevent the usage of branches.
Use the same branch for the whole project (it can be the master branch) so the VCS repository will keep track of the changes for the different project sprints or phases.
When a team wants to start developing a new feature without affecting the main project, you can create a new database for that development using the Import Database feature of the VDP Administration Tool. All the developers working on that functionality can then work together in the same database, directly in the server.
Once the development is done, they can push to the VCS repository so the administrator can start testing the new functionality before promoting it.
The process in the VDP server in the Development environment looks something like this:
- db_project_1 is a database synchronized with the remote database in the master branch.
- There are several features being developed and each one has its own database: db_project_1_featureN (using db_project_1 as the remote database in VCS)
- Multiple developers connect to db_project_1_featureN database to work in the new feature as described in the scenario "Centralized workflow with shared databases" in the Scenarios and Recommended Uses section.
- During the development of featureN:
  - Changes in the db_project_1_featureN database are not pushed to the remote repository.
  - Periodically, the db_project_1_featureN database is updated to integrate the new changes in db_project_1.
- When the development of featureN is completed:
  - The db_project_1_featureN database needs to be updated.
  - Conflicts/problems need to be resolved.
  - After that, commit/push the changes to the remote database in VCS.
- The integrator should prevent other changes from being pushed to the remote repository.
- The integrator should update the db_project_1 database in the Development environment and run the tests for the new feature.
- When testing is completed, other teams can start to commit/push changes to the db_project_1 database.
This approach allows developers to access the changes of other developers since they are developing directly on the database. However, you need to keep in mind that, in order to avoid conflicts, they must coordinate among themselves to prevent modifying the same elements simultaneously.
References
Scenarios and Recommended Uses

