Path Types in Virtual DataPort¶
When you create a data source of the type XML, JSON or delimited text file (DF), you have to specify a path to the data files. Virtual DataPort can retrieve the data files from these locations:
Local: retrieves the file from the local file system. See section Local Path.
From Variable: the data is not obtained from any file, instead the content of the file is provided by the client in the request. See section Path From Variable.
HTTP: retrieves the file sending an HTTP request. See section HTTP Path.
Denodo Browser: retrieves the file executing an ITPilot sequence. See section Denodo Browser Path.
FTP / SFTP / FTPS: retrieves the file from an FTP server. See section FTP / SFTP / FTPS Path.
HDFS: retrieves the file from a HDFS file system. See section HDFS Path.
S3: retrieves the file from a S3 bucket. See section S3 Path.
For all these path types except for “From Variable”, you can click the button Test Connection (next to “Ok”) to check that the path is accessible. If you entered interpolation variables in the path to the file, you will have to provide their values before creating the data source, to be able to test the connection.
The DF (delimited files) and JSON data sources have an option to specify the encoding of the data. If the check box Autodetect encoding is selected, the Server will try to detect the encoding of the file when the base views of the data source are queried. If you clear this check box, select the encoding of the file in the Charset Encoding drop-down list.
Note
It is recommended to indicate the encoding if known.
The autodetecting feature has the following limitation: The detection is made on the first characters from the response, and these characters may not be representative enough of the entire content, so an incorrect charset could be selected.
Compressed or Encrypted Data Sources¶
Virtual DataPort can access data files that are:
Compressed with ZIP
Compressed with GZIP.
Encrypted with the algorithm “Password-Based Encryption with MD5 and DES” (PBE with MD5 and DES). This encryption method is described in the Java Cryptography Architecture (JCA) Reference Guide.
This applies to all the types of paths except from “from variable”. In their configuration, the available options are:
Decompress (zip format). Select if the data file is a ZIP compressed file.
Decompress (gzip format). Select if the data file is a GZIP compressed file.
Decrypt. This filter decrypts data that was encrypted with the algorithm PBE with HMAC-SHA256-512 and AES-256 or with PBE with MD5 and DES. If you select this filter, provide the password that was used to encrypt the file.
PBE with HMAC-SHA256-512 and AES-256 stands for Password-based encryption using the algorithms HMAC-SHA256-512 and AES-256.
Note
The filter Decrypt - PBE with MD5 and DES is deprecated. If you can choose how the input file will be encrypted, we recommend using PBE with HMAC-SHA256-512 and AES-256 because it is securer.
See more about this in the section Features Deprecated in Denodo Platform 8.0.
Custom. You can develop your own custom filters to preprocess the data before is processed by Virtual DataPort. See more about custom filters in the section Custom Input Filters.
If an external application generates a DF, JSON or XML file to be read
by Virtual DataPort, you can encrypt it for security reasons. Then, when
creating the data source to access these data, select the “Decrypt”
filter. Virtual DataPort provides a tool to encrypt files with the algorithm “PBE with HMAC-SHA256-512 and AES-256” and with “PBE with MD5 and DES”. To do this, decompress the file <DENODO_HOME>/tools/db/denodo-crypt-tools/denodo-crypt-tools.zip
. This file contains a README file that explains how to:
Use the script
<DENODO_HOME>/tools/db/denodo-crypt-tools/bin/encrypt
to encrypt files and store the output in another file. You can also encrypt data provided by the “standard input”.Integrate the library “denodo-crypt-tools.jar” into your Java application to encrypt and decrypt data.
Custom Input Filters¶
Besides the input filters listed above, Virtual DataPort provides a Java API that allows you to develop filters that preprocess the data from the source in any way you need. This allows you to transform the input data in any way you like before it is processed by Virtual DataPort.
To develop a custom filter, see the section Developing Custom Input Filters of the Developer Guide. Once you have developed and imported the filter, you can assign it to a DF, JSON or XML data source.
To assign a filter to a data source, open the dialog to configure the path to the data file (delimited file, XML or JSON data sources). In this dialog, click the Custom option button (for “HTTP client” routes, this button is located in the Filters tab) and then, Edit. The Tool will display a dialog like the Assigning a custom filter to a DF data source.
Select the Select jars check box and then, select the class name of the custom filter from the Class name box.
If the custom filter has input parameters, click Add parameter and then, enter a name and a value for each parameter. If you select the Encrypted check box for a parameter, the value of the parameter will be hidden and sent encrypted to the Server. To remove a parameter, click .
Once you have finished, click Ok to close the dialog and continue configuring the data source as usual.
Paths and Other Values with Interpolation Variables¶
Paths and other parameters of data sources can be parameterized by adding interpolation variables to their value. By using these variables, you do not have to provide the value of a field at design time (when creating the data source). Instead, you will provide the values of these variables at runtime, when you query a view that uses the data source.
For instance, if you create an XML data source that retrieves the XML file from a Web server and you do not know the full path of the file when creating the view, you can enter an URL like “http://acme.com/reports/@{DEPT_NAME}.xml”
In this URL, DEPT_NAME
is an interpolation variable, which means
that the string @{DEPT_NAME}
will be replaced at runtime by the
value of the variable. The clients that query the views over this data
source will have to provide this value in the WHERE
clause of the
query.
When you create a data source with interpolation variables, the base views created over this source will have a field for each interpolation variable. The name of the field will be the same as the name of the variable. If the data obtained from the source has a field that conflicts with the name of the variable, the field from the source will be renamed.
In our example, the base views created over this XML data source will
have an extra field called DEPT_NAME
. When a client executes a query
that access this source:
SELECT *
FROM ...
WHERE DEPT_NAME = 'it'
the string @{DEPT_NAME}
will be replaced by it
. So, the Server
will retrieve the XML file from the URL http://acme.com/reports/it.xml
When you begin creating a base view over a data source that has interpolation variables, the Tool will display a dialog to enter the value of these variables (see Providing input values to interpolation variables). The Server will use these values to query the source and calculate the schema of the new base view.
Note
If you enter a literal that contains one of the special
characters used to indicate interpolation variables (@
, {
or
}
), in a parameter that accepts interpolation variables, you have to
escape these characters with \
.
For example, if the path to a file is like this C:/dir}/file.xml
,
you have to escape the brace like this: C:/dir\}/file.xml
In Windows systems, the character \
is used to specify file
paths, causing some ambiguous situations. For example, the Server could
interpret that the path C:\tmp\@{var}
references a file named
tmp@{var}
in the C:\
directory because the second
\
, due to its appearance just before @
, is
considered an escape character. Therefore, you have to escape the
\
character.
For example, the C:\\tmp\\@{var}
path references a file which
name is the value of the var
variable and it is located at
C:\tmp
.
The section Execution Context of a Query and Interpolation Strings of the VQL Guide explain this in more detail.