Specifying Paths in Virtual DataPort
To create a DF, Excel, JSON or XML data source, you have to specify a path to the data. The syntax of this path is common to the DF, Excel, JSON, XML and Custom data source types.
<route> ::=
    LOCAL { 'LocalConnection' | 'VariableConnection' } <path:literal>
        [ FILENAMEPATTERN = <literal> ]
        [ CHARSET = <encoding:literal> ]
  | HTTP
        { 'http.CommonsHttpClientConnection [ <option> {, <option> }* ]' }
        { GET | POST | PUT | PATCH | DELETE } <uri:literal>
        [ POSTBODY <body:literal> [ MIME <mimetype:literal> ] ]
        [ HEADERS (
            <header name:literal> = <header value:literal>
            [, <header name:literal> = <header value:literal> ]*
        ) ]
        [ <pagination settings> ]
        [ CHECKCERTIFICATES ]
        [ <http_authentication> ]
        [ <proxy> ]
        [ HTTP_ERROR_CODES_TO_IGNORE (<integer> [, <integer> ]*) ]
        [ CHARSET = <encoding:literal> ]
  | FTP 'ftp.FtpClientConnectionAdapter' <uri:literal> <login:literal>
        {
            <password:literal> [ ENCRYPTED ]
          | SSH_KEY = <SSH key encoded in Base64:literal> [ ENCRYPTED ]
                [ SSH_KEY_PASSWORD = <literal> [ ENCRYPTED ] ]
        }
        [ FILENAMEPATTERN = <literal> ]
        [ CHARSET = <encoding:literal> ]
        [ PASSIVE = <boolean> ]
        [ EXPLICIT = <boolean> ]
        [ EXECUTEPROTCOMMAND = <boolean> ]
        [ PROTECTIONLEVEL = { C | P } ]
        [ PBSZ = <integer> ]
  | HDFS 'hdfs.HdfsConnection' <uri:literal>
        [ FILENAMEPATTERN = <literal> ]
        [ <hdfs_authentication> ]
        [ HADOOP_CUSTOM_PROPERTIES = ( [ <literal> = <literal> [, <literal> = <literal> ]* ] ) ]
        [ CHARSET = <encoding:literal> ]
  | S3 'hdfs.S3Connection' <uri:literal>
        [ FILENAMEPATTERN = <literal> ]
        [ <s3_authentication> ]
        [ CUSTOM_PROPERTIES = ( [ <literal> = <literal> [, <literal> = <literal> ]* ] ) ]
        [ CHARSET = <encoding:literal> ]
  | ABFS 'hdfs.AbfsConnection' <uri:literal>
        [ FILENAMEPATTERN = <literal> ]
        [ <azure_authentication> ]
        [ CUSTOM_PROPERTIES = ( [ <literal> = <literal> [, <literal> = <literal> ]* ] ) ]
        [ CHARSET = <encoding:literal> ]
<pagination settings> ::=
    PAGINATION_SETTINGS (
        PAGE_SIZE_PARAMETER = <literal>
        PAGE_SIZE = <integer>
        PAGE_NUMBER_PARAMETER = <literal>
        FIRST_PAGE_INDEX = <integer>
        OFFSET_FOR_NEXT_REQUESTS = <integer>
        MAX_NUMBER_OF_REQUESTS = <integer>
    )
  | PAGINATION_SETTINGS (
        PAGE_SIZE_PARAMETER = <literal>
        PAGE_SIZE = <integer>
        NEXT_TOKEN_PARAMETER = <literal>
        NEXT_TOKEN_PATH = <literal>
        [ MAX_NUMBER_OF_REQUESTS = <integer> ]
    )
  | PAGINATION_SETTINGS (
        PAGE_SIZE_PARAMETER = <literal>
        PAGE_SIZE = <integer>
        NEXT_TOKEN_PATH = <literal>
        [ MAX_NUMBER_OF_REQUESTS = <integer> ]
    )
<http_authentication> ::=
    AUTHENTICATION {
        OFF
      | BASIC ( <credentials> )
      | BASIC ( WITH PASS-THROUGH SESSION CREDENTIALS ( <credentials> ) )
      | DIGEST ( <credentials> )
      | DIGEST ( WITH PASS-THROUGH SESSION CREDENTIALS ( <credentials> ) )
      | NTLM ( <ntlm_credentials> )
      | NTLM ( WITH PASS-THROUGH SESSION CREDENTIALS ( <ntlm_credentials> ) )
      | OAUTH10A (
            CLIENTIDENTIFIER = <literal>
            CLIENTSHAREDSECRET = <literal> [ ENCRYPTED ]
            [
                ACCESSTOKEN = <literal> [ ENCRYPTED ]
                ACCESSTOKENSECRET = <literal> [ ENCRYPTED ]
            ]
            SIGNATUREMETHOD = { HMAC_SHA1 | PLAINTEXT }
            [
                TEMPORARYCREDENTIALREQUESTURL = <literal> { GET | POST }
                RESOURCEOWNERAUTHORIZATIONURL = <literal>
                TOKENREQUESTURL = <literal> { GET | POST }
                REDIRECTURL = { OOB | DEFAULT | <literal> }
            ]
        )
      | OAUTH20 (
            ACCESSTOKEN = {
                <token:literal> [ ENCRYPTED ]
              | VARIABLE <name of the variable:literal>
            }
            REQUESTSIGNINGMETHOD = {
                HEADER
              | FORM_ENCODED
              | URL [ <query parameter name:literal> ]
            }
            {
                AUTHENTICATION_GRANT = PASSWORD_GRANT (
                    USER_IDENTIFIER = <literal>
                    USER_PASSWORD = <literal> [ ENCRYPTED ]
                )
                [ TOKENENDPOINTURL = <literal> ]
                [ EXTRA_PARAMETERS_OF_REFRESH_TOKEN_REQUEST (
                    <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ]
                    [ , <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ] ]+
                ) ]
                CLIENTIDENTIFIER = <literal>
                CLIENTSECRET = <literal> [ ENCRYPTED ]
                [ AUTHENTICATION_METHOD_OF_AUTHORIZATION_SERVERS = { HTTP_BASIC | REQUEST_BODY } ]
                [ REFRESHTOKEN = {
                    <token:literal> [ ENCRYPTED ]
                  | VARIABLE <name of the variable:literal>
                } ]
                [ ACCESSTOKENEXPIRESIN = <access token expires in # seconds:long> ]
              |
                AUTHENTICATION_GRANT = CLIENT_CREDENTIALS_GRANT
                [ TOKENENDPOINTURL = <literal> ]
                [ EXTRA_PARAMETERS_OF_REFRESH_TOKEN_REQUEST (
                    <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ]
                    [ , <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ] ]+
                ) ]
                CLIENTIDENTIFIER = <literal>
                CLIENTSECRET = <literal> [ ENCRYPTED ]
                [ AUTHENTICATION_METHOD_OF_AUTHORIZATION_SERVERS = { HTTP_BASIC | REQUEST_BODY } ]
                [ ACCESSTOKENEXPIRESIN = <access token expires in # seconds:long> ]
              |
                AUTHENTICATION_GRANT = CODE_GRANT
                [ TOKENENDPOINTURL = <literal> ]
                [ EXTRA_PARAMETERS_OF_REFRESH_TOKEN_REQUEST (
                    <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ]
                    [ , <parameter name:literal> = <parameter value:literal> [ ENCRYPTED | HIDDEN ] ]+
                ) ]
                CLIENTIDENTIFIER = <literal>
                CLIENTSECRET = <literal> [ ENCRYPTED ]
                [ AUTHENTICATION_METHOD_OF_AUTHORIZATION_SERVERS = { HTTP_BASIC | REQUEST_BODY } ]
                [ REFRESHTOKEN = {
                    <token:literal> [ ENCRYPTED ]
                  | VARIABLE <name of the variable:literal>
                } ]
                [ ACCESSTOKENEXPIRESIN = <access token expires in # seconds:long> ]
                [
                    AUTHORIZATIONSERVERURL = <literal>
                    [ REDIRECTIONENDPOINTURL { DEFAULT | <literal> } ]
                    [ SCOPES = <scope 1:literal> [, <scope n:literal> ]* ]
                    SETSTATEPARAMETER = { TRUE | FALSE }
                ]
            }
        )
      | TWO_WAY_SSL (
            CERTIFICATE = <literal> [ ENCRYPTED ]
            [ CERTIFICATE_PASSWORD = <literal> [ ENCRYPTED ] ]
        )
      | ON (
            WITH OAUTH PASSTHROUGH SESSION CREDENTIALS
            TOKEN_PASSTROUGH_STRATEGY = { USING_TOKEN_PASS_THROUGH | USING_TOKEN_EXCHANGE_FLOW | USING_OBO_FLOW }
            (
                TOKENENDPOINTURL = <literal>
                CLIENTIDENTIFIER = <literal>
                CLIENTSECRET = <literal> [ ENCRYPTED ]
                OAUTH_USER = <literal>
                OAUTH_PASSWORD = <literal> [ ENCRYPTED ]
                SCOPES = <scope 1:literal> [, <scope n:literal> ]+
                [ EXTRA_PARAMETERS_OF_REFRESH_TOKEN_REQUEST (
                    <parameter name:literal> = <parameter value:literal>
                    [ , <parameter name:literal> = <parameter value:literal> ]+
                ) ]
            )
        )
    }
<proxy> ::= PROXY {
      OFF
    | DEFAULT
    | ON ( HOST <literal> PORT <integer> [ <credentials> ] )
    | AUTOMATIC ( PACURI <literal> )
}
<credentials> ::= USER <literal> PASSWORD <literal> [ ENCRYPTED ]
<ntlm_credentials> ::= <credentials> [ DOMAIN <literal> ]
<hdfs_authentication> ::= AUTHENTICATION {
      SIMPLE ( USERNAME = <literal> )
    | KERBEROS ( KRBUSERNAME = <literal> { KRBUSERPASSWORD | KRB_KEYTAB_FILE } = <literal> [ ENCRYPTED ] )
}
<s3_authentication> ::= AUTHENTICATION (
    [ AWS_ACCESS_KEY_ID = <literal> ]
    [ AWS_SECRET_ACCESS_KEY = <literal> [ ENCRYPTED ] ]
    [ AWS_IAM_ROLE_ARN = <literal> ]
)
<azure_authentication> ::= AUTHENTICATION (
      <azure_authentication_shared_key>
    | <azure_authentication_oauth_client>
    | <azure_authentication_oauth_managed_identity>
)
<azure_authentication_shared_key> ::=
    AZURE_AUTH_TYPE = SHARED_KEY
    [ SHARED_KEY = <literal> [ ENCRYPTED ] ]
<azure_authentication_oauth_client> ::=
    AZURE_AUTH_TYPE = OAUTH_CLIENT_CREDENTIALS
    OAUTH_CLIENT_ID = <literal>
    [ OAUTH_CLIENT_SECRET = <literal> [ ENCRYPTED ] ]
    OAUTH_TOKEN_ENDPOINT = <literal>
<azure_authentication_oauth_managed_identity> ::=
    AZURE_AUTH_TYPE = OAUTH_MANAGED_IDENTITY
    [ OAUTH_CLIENT_ID = <literal> ]
    [ OAUTH_TENANT_ID = <literal> ]
    [ OAUTH_TOKEN_ENDPOINT = <literal> ]
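
As a reference for how a <route> plugs into a data source definition, the following is a minimal sketch of an XML data source created over an HTTP path (the data source name and URL are placeholders; the full CREATE DATASOURCE XML syntax is described in its own section of this guide):

    CREATE DATASOURCE XML ds_xml_catalog
        ROUTE HTTP 'http.CommonsHttpClientConnection' GET 'https://www.example.com/catalog.xml';
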
There are seven types of paths in Virtual DataPort:

Local ( LOCAL 'LocalConnection' ): path to a single file or to a directory. It can be in the local file system of the host where the Virtual DataPort server runs, or in a Windows shared drive. When the path is a directory and FILENAMEPATTERN is present, the data source will only process the files whose name matches the regular expression FILENAMEPATTERN. This clause is only valid for custom wrappers and delimited-file data sources. The parameter CHARSET is available for DF and JSON data sources.

From Variable ( LOCAL 'VariableConnection' ): use this path if the data is not obtained from any source, but is provided by clients at runtime, in the WHERE clause of the queries that involve the base views of the data source.
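
For example, the following is a minimal sketch of a delimited-file data source over a local directory (the folder, pattern and delimiter values are illustrative, and clauses such as COLUMNDELIMITER belong to the CREATE DATASOURCE DF statement rather than to the path itself):

    CREATE DATASOURCE DF df_local_sales
        ROUTE LOCAL 'LocalConnection' '/opt/data/sales'
        FILENAMEPATTERN = 'sales_.*\.csv'
        CHARSET = 'UTF-8'
        COLUMNDELIMITER = ',';

A From Variable path is declared the same way, replacing 'LocalConnection' with 'VariableConnection' and the directory with the name of the variable (here, 'filecontent' is illustrative) that clients will set in the WHERE clause of their queries:

    CREATE DATASOURCE JSON ds_json_from_variable
        ROUTE LOCAL 'VariableConnection' 'filecontent';
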
HTTP Client ( HTTP 'http.CommonsHttpClientConnection' ): path to retrieve a file by sending an HTTP request. The parameters of this path are the following:

- Timeout: besides http.CommonsHttpClientConnection, you can specify the timeout of the request in milliseconds. For example, http.CommonsHttpClientConnection,120000 indicates that the timeout of the requests will be 2 minutes.
- HTTP method ( GET, PUT, POST, PATCH or DELETE ) of the request. This parameter only has to be set with http.CommonsHttpClientConnection.
- URI of the file of the data source. This URI can have interpolation variables whose value will be provided at runtime. See section Execution Context of a Query and Interpolation Strings for information about interpolation variables.
- POSTBODY and MIME (optional): use the parameter POSTBODY if the HTTP method is POST and you want to set the body of the request. MIME represents the MIME type of the body of this request, e.g. application/json or application/xml.
- HEADERS (optional): use this parameter to set the headers of the HTTP request.
- CHECKCERTIFICATES (optional): adding this clause is equivalent to selecting the check box Check certificates of the Configuration tab, in the HTTP client configuration of a path to a file. The section HTTP Path (subsection of Path Types in Virtual DataPort) of the Administration Guide explains when you should enable this option.
- HTTP_ERROR_CODES_TO_IGNORE (optional): list of HTTP error codes to ignore when accessing the source. The section HTTP Path of the Administration Guide explains how to configure this option.
- PAGINATION_SETTINGS (optional): add this clause if you choose to or are required to use pagination (see the combined example after this list). It is equivalent to filling in the Pagination tab of the Create Data Source dialog. The section Pagination of the Administration Guide explains when you should enable this option.
- AUTHENTICATION (optional): if the HTTP server requires authentication, add this parameter to indicate the credentials for the server. The supported authentication methods are BASIC, DIGEST, Mutual (two-way SSL), NTLM, OAuth 1.0a, OAuth 2.0 and SPNEGO (Kerberos). The section HTTP Path of the Administration Guide explains how to configure these authentication mechanisms using the Administration Tool. A combined example is shown after this list.

  In the authentication methods BASIC, DIGEST and NTLM, if you add the clause WITH PASS-THROUGH SESSION CREDENTIALS, when a user queries a view that uses this data source, Virtual DataPort uses the credentials of this user to authenticate against the HTTP server. In this case, the values of the parameters USER and PASSWORD are used only by the Administration Tool to connect to the source and show its schemas and their tables/views, but not for querying tables or views of the source.

  About EXTRA_PARAMETERS_OF_REFRESH_TOKEN_REQUEST: you can indicate the value of a parameter in plain text or encrypted. To provide the value encrypted, add the token ENCRYPTED. If you want to store the value of a parameter encrypted but provide it in plain text, add the token HIDDEN: Virtual DataPort will store this value encrypted, and the VQL of the data source will contain this value encrypted as well.

  For the values that accept the ENCRYPTED modifier next to them, you can provide those values in clear text or encrypted; for the values you provide encrypted, add the ENCRYPTED modifier next to the value. To encrypt a value, execute the statement ENCRYPT_PASSWORD followed by the password. For example: ENCRYPT_PASSWORD 'password';. The statement ENCRYPT_PASSWORD can also be used to encrypt the Mutual Authentication certificate, using the Base64 representation of the file as parameter.

  Warning: users should be careful when enabling the cache for views that involve data sources with pass-through credentials enabled. The section Considerations When Configuring Data Sources with Pass-Through Credentials explains the issues that may arise.

  The section OAuth Authentication of the Administration Guide provides more details about these authentication methods.
- PROXY (optional): if the HTTP request is sent through a proxy, you have three options:
  - DEFAULT: the data source will use the default HTTP proxy configuration of the Server. See the section Default Configuration of HTTP Proxy of the Administration Guide to learn how to configure these default values.
  - ON: the Server will connect to the proxy indicated in the parameters HOST and PORT. If the proxy requires authentication, you also have to provide the credentials of the proxy.
  - AUTOMATIC: provide the URL of a proxy.pac file that contains the configuration parameters of the proxy.

The parameter CHARSET is available for DF and JSON data sources.
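
As an illustration, the following sketch combines several of the clauses above in a JSON data source over an HTTP path; every URL, credential and parameter name (for example, the 'limit' and 'page' query parameters of the paginated service) is a placeholder:

    CREATE DATASOURCE JSON ds_http_products
        ROUTE HTTP 'http.CommonsHttpClientConnection,120000' GET 'https://api.example.com/products'
        HEADERS ( 'Accept' = 'application/json' )
        PAGINATION_SETTINGS (
            PAGE_SIZE_PARAMETER = 'limit'
            PAGE_SIZE = 100
            PAGE_NUMBER_PARAMETER = 'page'
            FIRST_PAGE_INDEX = 1
            OFFSET_FOR_NEXT_REQUESTS = 1
            MAX_NUMBER_OF_REQUESTS = 50
        )
        CHECKCERTIFICATES
        AUTHENTICATION OAUTH20 (
            ACCESSTOKEN = 'my_initial_access_token'
            REQUESTSIGNINGMETHOD = HEADER
            AUTHENTICATION_GRANT = CLIENT_CREDENTIALS_GRANT
            TOKENENDPOINTURL = 'https://login.example.com/oauth2/token'
            CLIENTIDENTIFIER = 'my_client_id'
            CLIENTSECRET = 'my_client_secret'
        )
        PROXY ON ( HOST 'proxy.example.com' PORT 8080 );
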
FTP / FTPS / SFTP ( FTP ): path that accesses a file via FTP. The parameters of this path are the following:

- The URL of the FTP server, with the following format: host:port/path/file
- Login of the user to connect to the FTP server.
- Password of the user to connect to the FTP server.
- SSH_KEY: SSH key of the user to connect to the SFTP server, encoded in Base64. Set this parameter if you are connecting to an FTP server that requires an SSH private key instead of a password.
- SSH_KEY_PASSWORD: password of the SSH key.
- The clause FILENAMEPATTERN is only valid for custom wrappers and delimited-file data sources.
- The parameter CHARSET is available for DF and JSON data sources.
- The parameter PASSIVE enables or disables passive mode for the connection.
- EXPLICIT: parameter to enable or disable explicit FTPS connections.
- The parameter EXECUTEPROTCOMMAND enables or disables the execution of the PROT command. When it is enabled, you can specify the protection level with the parameter PROTECTIONLEVEL: P or C. Additionally, when the PROT command is enabled but requires the negotiation of the protection buffer, set PBSZ = <buffer_size>.
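
For example, a sketch of a delimited-file data source read over SFTP with key-based authentication (the host, login and passphrase are placeholders, and the SSH_KEY literal would be the actual private key encoded in Base64):

    CREATE DATASOURCE DF df_sftp_export
        ROUTE FTP 'ftp.FtpClientConnectionAdapter' 'sftp.example.com:22/exports/daily.csv' 'etl_user'
        SSH_KEY = '<private key encoded in Base64>'
        SSH_KEY_PASSWORD = 'key_passphrase'
        CHARSET = 'UTF-8';
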
HDFS ( HDFS 'hdfs.HdfsConnection' ): path to obtain the data from a file or a set of files located in an HDFS file system. The parameters of this path are the following:

- URI of the file or directory in the HDFS file system. When the path is a directory and FILENAMEPATTERN is present, the data source will only process the files whose name matches the regular expression FILENAMEPATTERN. This clause is only valid for custom wrappers and delimited-file data sources.
- HADOOP_CUSTOM_PROPERTIES (optional): parameter to set the same Hadoop properties that you would put in Hadoop configuration files like core-site.xml.
- AUTHENTICATION (optional): if HDFS requires authentication, add this parameter. The supported authentication methods are:
  - SIMPLE: you have to configure the user name. This authentication mode is equivalent to setting the HADOOP_USER_NAME variable when you execute Hadoop commands in a terminal.
  - KERBEROS: you have to configure the user name and the password or the keytab to connect to HDFS using Kerberos authentication.

The parameter CHARSET is available for DF and JSON data sources.
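
For example, a sketch of a delimited-file data source over a Kerberos-secured HDFS directory (the namenode URI, principal, password and custom property are illustrative):

    CREATE DATASOURCE DF df_hdfs_events
        ROUTE HDFS 'hdfs.HdfsConnection' 'hdfs://namenode.example.com:8020/data/events'
        FILENAMEPATTERN = 'events-.*\.log'
        AUTHENTICATION KERBEROS ( KRBUSERNAME = 'denodo@EXAMPLE.COM' KRBUSERPASSWORD = 'krb_password' )
        HADOOP_CUSTOM_PROPERTIES = ( 'dfs.client.use.datanode.hostname' = 'true' );
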
S3 ( S3 'hdfs.S3Connection' ): path to obtain the data from a file or a set of files located in an S3 bucket. The parameters of this path are the following:

- URI of the file or directory in the S3 bucket. When the path is a directory and FILENAMEPATTERN is present, the data source will only process the files whose name matches the regular expression FILENAMEPATTERN. This clause is only valid for custom wrappers and delimited-file data sources.
- CUSTOM_PROPERTIES (optional): parameter to set the same properties that you would put in the Hadoop configuration files like core-site.xml to configure the S3A Hadoop connector.
- AUTHENTICATION:
  - AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (optional): the credentials used to connect to the S3 bucket. If you leave them empty, Virtual DataPort will automatically obtain the credentials from the AWS instance where this Virtual DataPort server is running.
  - AWS_IAM_ROLE_ARN (optional): the AWS user will assume the role to get the necessary privileges to connect to the bucket. The role ARN is the Amazon Resource Name of the IAM role you want to assume. Specify the role ARN like arn:aws:iam::<awsAccountId>:role/<roleName>.

The parameter CHARSET is available for DF and JSON data sources.
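
For example, a sketch of a delimited-file data source over an S3 prefix, assuming explicit credentials plus an assumed role (the bucket, keys and account/role names are placeholders; omit the keys to use the credentials of the AWS instance instead):

    CREATE DATASOURCE DF df_s3_orders
        ROUTE S3 'hdfs.S3Connection' 's3a://example-bucket/landing/orders'
        FILENAMEPATTERN = 'orders-.*\.csv'
        AUTHENTICATION (
            AWS_ACCESS_KEY_ID = 'my_access_key_id'
            AWS_SECRET_ACCESS_KEY = 'my_secret_access_key'
            AWS_IAM_ROLE_ARN = 'arn:aws:iam::123456789012:role/denodo-s3-reader'
        );
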
ADLS ( ABFS 'hdfs.AbfsConnection' ): path to obtain the data from a file or a set of files located in an ADLS container. The parameters of this path are the following:

- URI of the file or directory in the ADLS container. When the path is a directory and FILENAMEPATTERN is present, the data source will only process the files whose name matches the regular expression FILENAMEPATTERN. This clause is only valid for custom wrappers and delimited-file data sources.
- CUSTOM_PROPERTIES (optional): parameter to set the same properties that you would put in the Hadoop configuration files like core-site.xml to configure the ABFS Hadoop connector.
- AUTHENTICATION:
  - AZURE_AUTH_TYPE: authentication type to use. The supported values are SHARED_KEY, OAUTH_CLIENT_CREDENTIALS and OAUTH_MANAGED_IDENTITY.
  - SHARED_KEY: shared key to authenticate and access the ADLS container.
  - OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET: the client ID and secret used to obtain the OAuth tokens to authenticate and access the ADLS container.
  - OAUTH_TENANT_ID: the tenant ID used to obtain the OAuth tokens.
  - OAUTH_TOKEN_ENDPOINT: the endpoint URL used to obtain the OAuth tokens.

The parameter CHARSET is available for DF and JSON data sources.
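
For example, a sketch of a JSON data source over an ADLS path using the OAuth client credentials flow (the container, storage account, client and tenant values are placeholders):

    CREATE DATASOURCE JSON ds_adls_customers
        ROUTE ABFS 'hdfs.AbfsConnection' 'abfss://landing@examplestorage.dfs.core.windows.net/customers/customers.json'
        AUTHENTICATION (
            AZURE_AUTH_TYPE = OAUTH_CLIENT_CREDENTIALS
            OAUTH_CLIENT_ID = 'my_client_id'
            OAUTH_CLIENT_SECRET = 'my_client_secret'
            OAUTH_TOKEN_ENDPOINT = 'https://login.microsoftonline.com/my_tenant_id/oauth2/token'
        );
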
For Local, FTP/FTPS/SFTP, HDFS, S3 and ADLS routes, if the URI points to a directory instead of a single file, when you query a base view created over this data source, the Server will retrieve the data from all the files in the directory, not just one file. In this case, the Server assumes that all the files of the directory have the same format as the first one.