General Structure of a Job
The following are the different types of jobs supported by Denodo Scheduler:
VDP: allows data to be extracted by querying views or stored procedures from Denodo Virtual DataPort.
VDPCache: allows querying views from Denodo Virtual DataPort to preload their cache. To preload the cache of a view, the view must have its cache enabled and properly configured. A single job of this type can preload multiple views.
VDPIndexer: allows querying views from Denodo Virtual DataPort to generate an index that is then used by the Data Catalog. This index contains a document for each field of each tuple retrieved from VDP.
VDP jobs can also execute DDL (CREATE, DROP, ALTER, TRUNCATE, …) and DML (INSERT, UPDATE, DELETE, CALL, …) statements.
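The indexing model described for VDPIndexer jobs (one document per field of each retrieved tuple) can be sketched as follows. This is an illustrative model only, not Denodo's actual implementation; the function and field names are assumptions.

```python
# Illustrative sketch (not Denodo's implementation): build one index
# "document" per field of each tuple retrieved from VDP.
def build_index_documents(view_name, tuples):
    """tuples: list of dicts mapping field name -> value."""
    documents = []
    for row_id, row in enumerate(tuples):
        for field, value in row.items():
            documents.append({
                "view": view_name,   # view the tuple came from
                "row": row_id,       # position of the tuple in the result
                "field": field,      # field name
                "value": str(value), # field value, as indexed text
            })
    return documents

# One tuple with two fields yields two documents:
docs = build_index_documents("customer", [{"id": 1, "name": "Ann"}])
```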
All jobs have a name and a description, and share the job retries section (Retry Section), the result handlers section (Handlers Section), and the time-based scheduling section (Triggers Section). In addition, VDP and VDPIndexer jobs share the exporters section (Exporters Section). The extraction section (Extraction Section) is different for each type of job and is discussed in detail for each type in the following sections.
In the extraction section (Example of extraction section), the source of the data is specified by selecting a previously created data source (see section Data Sources). The configuration to supply depends on the type of job.
In the exporters section (Example of exporters section), a list of exporters can be specified to dump the results into one or more external repositories. Denodo Scheduler provides built-in exporters for CSV files (this exporter can also generate files compatible with Microsoft Excel), relational databases with a JDBC adapter, Elasticsearch, and Scheduler Index. New exporters developed for special ad hoc needs can also be used.
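Conceptually, a CSV exporter writes each result tuple as one delimited row with a header line. The following sketch shows that idea with Python's standard `csv` module; the function name and sample data are illustrative, and Denodo's built-in exporter offers many more options (separators, encodings, Excel compatibility).

```python
import csv
import io

# Hypothetical sketch of what a CSV exporter does with a job's result
# tuples: header first, then one row per tuple.
def export_to_csv(tuples, field_names):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names)
    writer.writeheader()
    writer.writerows(tuples)
    return buffer.getvalue()

csv_text = export_to_csv(
    [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
    ["id", "name"],
)
```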
In the retries section (Example of the retries section), the user can enable retries for the job and configure how and when they are executed. A job is only retried if there were errors during its execution or the number of exported documents was lower than expected.
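The retry condition above (errors during execution, or fewer exported documents than expected) can be sketched as a simple loop. This is a hedged model of the policy, not Scheduler code; `run_job`, `expected_docs`, `max_retries` and `delay_seconds` are illustrative names, not actual Scheduler parameters.

```python
import time

# Sketch of the retry policy: retry only when the run raised an error
# or exported fewer documents than expected.
def run_with_retries(run_job, expected_docs, max_retries=3, delay_seconds=0):
    exported = None
    for attempt in range(max_retries + 1):
        try:
            exported = run_job()
        except Exception:
            exported = None  # execution error: candidate for retry
        if exported is not None and exported >= expected_docs:
            return exported  # enough documents exported: stop retrying
        if attempt < max_retries:
            time.sleep(delay_seconds)  # wait before the next retry
    return exported

# A job that fails twice, then exports the expected 10 documents:
calls = {"count": 0}
def flaky_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return 10

result = run_with_retries(flaky_job, expected_docs=10)
```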
In the handlers section (Example of handlers section), the user specifies actions to perform once the extraction and export of all the tuples of a job have finished. Among other actions, it allows sending an e-mail with the execution summary of a job to a list of e-mail addresses. New handlers developed for custom needs can also be used.
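As an illustration of the e-mail handler idea, the sketch below assembles an execution-summary message with Python's standard `email` module. The summary keys, addresses and function name are made-up examples, not Scheduler's actual report format.

```python
from email.message import EmailMessage

# Illustrative sketch of what a mail handler assembles once extraction
# and export have finished.
def build_summary_email(job_name, summary, recipients, sender):
    msg = EmailMessage()
    msg["Subject"] = f"Scheduler job '{job_name}' finished"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    # One "key: value" line per summary entry.
    msg.set_content("\n".join(f"{k}: {v}" for k, v in summary.items()))
    return msg  # sending it would use smtplib.SMTP(...).send_message(msg)

mail = build_summary_email(
    "load_sales_cache",
    {"state": "FINISHED", "exported tuples": 1200, "errors": 0},
    ["ops@example.com"],
    "scheduler@example.com",
)
```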
Lastly, each job defines scheduling data that specifies when it will be executed, as shown in Example of triggers section. The configuration offers features similar to those of the classic cron utility on UNIX systems.
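To make the cron analogy concrete, the sketch below evaluates a minimal cron-style expression (minute, hour, day, month, weekday) against the current time. It is a simplification under stated assumptions: real cron expressions also support ranges, steps and names, which this sketch omits.

```python
# Minimal cron-style field matching: "*" matches anything, otherwise the
# field is a comma-separated list of allowed integer values.
def field_matches(field, value):
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}

def cron_matches(minute, hour, day, month, weekday, now):
    # `now` is a (minute, hour, day, month, weekday) tuple.
    fields = (minute, hour, day, month, weekday)
    return all(field_matches(f, v) for f, v in zip(fields, now))

# Fire at minute 0 of hours 6 and 18, on any day:
fires = cron_matches("0", "6,18", "*", "*", "*", now=(0, 18, 12, 5, 3))
```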
While creating or editing an existing job, there are several options to save its configuration, all of them accessible in the “Save” drop-down menu:
“Save” (default action). With this option, the job’s configuration is validated to ensure that all required parameters have a defined value.
“Save draft”. An incomplete job can be created by using this option. In this case, the only mandatory field is the name of the job. A draft job is a potentially incomplete job; thus, it cannot be started in the Scheduler perspective. A draft job can be edited like the rest of the jobs. After completely filling in all the mandatory fields of a job, pressing the Save button will create an executable job.
“Save as…”. This option allows creating a new job with the current configuration under a new name. It is only enabled while editing a previously existing job.
You can create a new job with the same configuration as an existing one. To do this, open the job you want to clone and press the “Clone” button. Then provide a name for the new job, and it will be created with that name and the same configuration as the original one. This is similar to the “Save as…” option explained before, but without needing to edit the selected job.