Advanced Configuration of the Sequence and Next Interval Iterator Components¶
The ‘Advanced’ tab of the Sequence component configuration wizard allows providing an alternative navigation sequence in order to load the optional page value received as input by the component while the ‘Denodo Browser’ tab allows configuring advanced options if Denodo Browser has been chosen to execute the sequence. The following subsections describe each of these aspects. These options are also available in the rest of components that execute navigation sequences: ExtractorSequence (see section Extractor Sequence), Next Interval Iterator (see section Next Interval Iterator) and Form Iterator (see section Form Iterator).
Loading a Page Received as Input Parameter¶
The Sequence component has a page-type element as optional input parameter. In these cases, the page will be loaded before executing the configured navigation sequence. This is useful when the sequence must continue from a previously browsed page.
Even though it is not required in the example provided in this guide (nor in most applications), it is possible to configure how ITPilot performs the loading of this input page by using the ‘Advanced’ tab. The following graphical elements are used:
- Check box “Use Custom Sequence for Restoring Input Page”. It will be checked when the user wants to load the input page by him/herself. In the default option ITPilot generates a GET/POST http request to load the page.
- If the previous check box has been checked, the user may load a specific navigation sequence by using the buttons “Load from File” or “Import from Browser”, as it was previously explained for the Sequence component (see section Web Browsing Automation). This can be useful if the input page load process performed by default by ITPilot does not worked adequately or can be optimized by means of an ad hoc sequence.
The sequence can be left blank if you know that a restore sequence is not needed (in that case, leaving it blank improves performance).
Preferences of Denodo Browser¶
In order to increase the efficiency of the Denodo Browser, components that execute sequences using this browser can be configured to optimize the sequence execution in two exclusive ways:
- Automatic Optimization: A test execution of the sequence is performed to detect what elements of the page are needed to execute the sequence. During that execution some information is collected to identify those elements, and in subsequent executions only the necessary elements are loaded.
- Manual Optimization: some optimization parameters can be manually set.
To enable the automatic optimization, click on the “Automatic Optimization” check box. Then, press the “Collect Optimization Information Now” button. This action will perform a test execution of the sequence to detect what elements of the accessed pages are needed and Denodo Browser will collect information to identify them. At execution time, Denodo Browser will use that information to load only the relevant elements of the pages and discard those that are not needed for the correct execution of the sequence. The button “Clear Optimization Information” allows removing the information collected by the automatic optimization process. Note that, the information collection process needs to be manually rerun every time the sequence is changed.
Once the wrapper is deployed on the server, the Denodo Browser automatic optimization (i.e. the use of the optimization information collected during the generation of the wrapper) can be enabled or disabled from the ITPilot web administration tool.
To enable the manual optimization, click on the “Manual Optimization” check box. When it is checked, then the following parameters can be configured:
- Use simplified DOM: if this option is checked, the browsers will not
build the complete DOM tree of the page, but a simplified one that
considers only the link-type HTML elements and the form ones. If the
required navigation in that source can be performed with only those
types of elements, the use of this option can be advisable in order
to generate a more efficient execution. The HTML tags considered are
SCRIPT. Therefore, NSEQL commands will only work with these tags. Appendix D: Constraints of the Simplified DOM describes further constraints with respect to these tags.
- Disable Iframes: if selected, the browsers will not download iframes content. If this content is not needed, this will improve performance.
- Exclude domains: list of domains from which no content is going to be downloaded. For instance, this is useful to avoid the downloading of ads or other content types that can slow down the navigation, and are not necessary.