HTTP Path¶
Use an HTTP path when the data has to be obtained by sending an HTTP request to a server. This page explains everything related to HTTP paths.
Find information about the Filters tab in Compressed or Encrypted Data Sources; filters work the same way for any type of path (local, HTTP, FTP…).
Configuration¶
With HTTP paths, you can send a request with the HTTP method GET, PUT, POST, DELETE or PATCH.
There are two ways of adding parameters to the body of a POST, PUT or PATCH request:
URL parameters: if the URL contains query parameters, they will be removed from the URL and sent in the body of the request as the values of an HTML form. The “Content-type” header of the HTTP request will be “application/x-www-form-urlencoded”.
For example, if the URL is
http://acme/customer?first_name=John&last_name=Smith
at runtime, the Server will send the HTTP request to http://acme/customer (without the query parameters) and the body of the request will be first_name=John&last_name=Smith.
Post body: the contents of the text area below this option will be sent in the body of the HTTP request. Enter the content type of the body in the Content type box.
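The "URL parameters" option described above can be illustrated with a minimal Python sketch. It is not the Server's implementation; the function name split_url_parameters is hypothetical and only shows how the query string of the example URL would move into a form-encoded body.

```python
from urllib.parse import urlsplit, urlunsplit


def split_url_parameters(url):
    """Return (url_without_query, body, headers) for a form-encoded request."""
    parts = urlsplit(url)
    # Drop the query string from the URL; it becomes the body of the request.
    bare_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return bare_url, parts.query, headers


bare_url, body, headers = split_url_parameters(
    "http://acme/customer?first_name=John&last_name=Smith"
)
print(bare_url)  # http://acme/customer
print(body)      # first_name=John&last_name=Smith
```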
To add a header to the HTTP requests sent to retrieve the data, click HTTP headers. In the “HTTP Headers” dialog, click New and enter the name and the value of the header.
The following elements of an HTTP path can contain interpolation variables (the section Paths and Other Values with Interpolation Variables explains what interpolation variables are):
The URL. It can contain one or more interpolation variables. For example,
https://acme.com/department?id=@{department_id}
The HTTP headers. Both the names and their values can be interpolation variables.
The “Post body” can contain one or more interpolation variables. This is useful if you want to send a POST request and one or more values of the body need to be set at runtime, when the data source is queried. For example, let us say that the body of the POST request is the following:
<employee_info>
<last_name>@last_name</last_name>
<first_name>@first_name</first_name>
</employee_info>
This body references @first_name and @last_name. The character “@” indicates that first_name and last_name are interpolation variables, which means that at runtime, the strings @first_name and @last_name will be substituted by a value when this data source is queried.
When you create a base view over this data source, the view will have three extra fields: department_id, last_name and first_name. The queries to this view will always have to provide a value for these fields in the WHERE clause.
For example, if the query is
SELECT *
FROM view
WHERE department_id = 2 AND first_name = 'John' AND last_name = 'Smith'
the Server will send a request to
https://acme.com/department?id=2
with this body:
<employee_info>
<last_name>Smith</last_name>
<first_name>John</first_name>
</employee_info>
Any query involving a base view created over this data source that does not provide a value for these three fields will fail.
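The substitution described above can be sketched in Python as follows. This is an illustration under assumptions, not the product's parser: the function interpolate and the regular expression are hypothetical, and the sketch neither URL-encodes values nor removes escape characters.

```python
import re

# Matches @name or @{name}; an "@" preceded by a backslash is left untouched.
VARIABLE = re.compile(r"(?<!\\)@(?:\{(\w+)\}|(\w+))")


def interpolate(template, values):
    """Replace each interpolation variable with the value given in the query."""
    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in values:
            # Mirrors the rule that a query must provide every variable.
            raise KeyError(f"no value provided for @{name}")
        return str(values[name])
    return VARIABLE.sub(replace, template)


values = {"department_id": 2, "first_name": "John", "last_name": "Smith"}
print(interpolate("https://acme.com/department?id=@{department_id}", values))
print(interpolate("<last_name>@last_name</last_name>", values))
```

A missing value raises an error in this sketch, which corresponds to the failing query described above.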
Note
You have to escape the characters @, { and } when they are not part of the name of an interpolation variable. They are escaped with the character \, and the character \ is escaped with itself (\\). For example:
<employee_info_request>
<department_name>department \@1 \{ACCT\}</department_name>
</employee_info_request>
If the body of the request is loaded with Load file, the Tool escapes all these characters automatically.
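As a rough illustration of the escaping rule in the note above, the following Python sketch escapes a literal string before it is pasted into a post body. The helper escape_literal is hypothetical and is not part of the Tool.

```python
def escape_literal(text):
    """Escape \\, @, { and } so they are not read as interpolation syntax."""
    # The backslash goes first so the backslashes added later are not doubled.
    for char in ("\\", "@", "{", "}"):
        text = text.replace(char, "\\" + char)
    return text


print(escape_literal("department @1 {ACCT}"))  # department \@1 \{ACCT\}
```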
The benefit of using interpolation variables is that the HTTP request to the Web server is not static and can be different for every query.
If the definition of the HTTP path has interpolation variables and you click “Test connection”, you will have to provide the value of the interpolation variables. You also have to do this when creating a base view over this data source. The Administration Tool will display a dialog like the following to provide the value of the variable.
In this dialog, select the URI parameter check box if the value of the variable is the value of a query parameter of the URL. If selected, the value of the variable will be escaped accordingly. Otherwise, if the variable is part of the URL, the value will be escaped as any other part of the URL. For example, if the URL is http://acme/recipe?name=@dish, when you provide the value of the variable “dish”, select the check box “URI parameter”. That way, if the value of dish is “Mac&Cheese”, the Server will send a request to http://acme/recipe?name=Mac%26Cheese. Note that “&” has been properly encoded as “%26”. If dish is not marked as a “URI parameter”, the URL will be http://acme/recipe?name=Mac&Cheese, which in this case is not correct: because the URI has “&” after “Mac”, the Web server will treat “Cheese” as another query parameter.
When the interpolation variable does not belong to the URL of the path, leave the “URI parameter” check box cleared.
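The effect of the “URI parameter” check box can be approximated with Python's standard library. This is only a sketch: build_recipe_url is a hypothetical helper, and for simplicity the non-parameter branch leaves the value untouched instead of applying the general URL escaping the Server performs.

```python
from urllib.parse import quote


def build_recipe_url(dish, uri_parameter):
    """Build the example URL, encoding the value only when it is a URI parameter."""
    # safe="" makes quote() percent-encode reserved characters such as "&".
    value = quote(dish, safe="") if uri_parameter else dish
    return "http://acme/recipe?name=" + value


print(build_recipe_url("Mac&Cheese", uri_parameter=True))   # ...?name=Mac%26Cheese
print(build_recipe_url("Mac&Cheese", uri_parameter=False))  # ...?name=Mac&Cheese
```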
The Connection timeout field sets a timeout, in milliseconds, for the connection to be established. If the connection cannot be established within this time, the data source will cancel the request, and the query will fail.
Select the check box Check certificates if you are in one of these scenarios:
The service uses SSL/TLS (i.e. the URL starts with https) and you want Virtual DataPort to validate that the certificate presented by this service was issued by a Certificate Authority (CA) trusted by the Java Virtual Machine (JVM) included with the Denodo Platform. This validation will be performed for every connection established with the service.
If the certificate presented by this service was not issued by a trusted CA or it was self-signed, but you still want Virtual DataPort to validate it, import the certificate into the list of trusted certificates of the JVM. The section Importing the Certificates of Data Sources (SSL Connections) of the Installation Guide explains how to do this.
Also, select this check box if the service requires SSL client authentication.
Clearing the check box has two implications:
Virtual DataPort will accept any certificate presented by the service without checking who issued it.
And, all the requests will fail if the service requires SSL client authentication.
Select the check box Generate output on empty content to allow the creation of a base view with this data source even if the HTTP endpoint does not return data. The resulting base view will include a column named result, but this column will not contain any data when a query is executed.
Select the check box Ignore HTTP route errors to configure which HTTP errors the data source has to ignore. When the data source retrieves data from a source and one of the selected HTTP errors occurs, the data source will fail silently and will not return rows. If any other error occurs, the data source will return an error, which will make the query fail.
The difference between this option and Ignore route errors is that with Ignore route errors, the data source ignores any error that occurs when accessing the source.
Pagination¶
It is common for a REST web service to return the data “paginated”. That is, the service limits the number of records it returns per request, and the client application (in this case, Virtual DataPort) has to send several requests with a certain parameter to obtain all the data it requires.
You can configure DF, JSON and XML data sources to retrieve paginated data from a REST API. When configured to do this, they return all the rows of the requests transparently.
To enable this option, go to the tab Pagination, select The service returns paginated data and select the pagination method of this service:
Use method #4 only if the REST API you want to access does not support the other methods. That is because with method #4 you need to set a custom “tuple root” when creating the base view over this data source; with the other pagination methods, this is not necessary.
Pagination is also available with POST requests. The post body can include a JSON document. There are two pagination methods for POST requests.
POST API request data example
Here is an example of a POST API that will be used in the following sections.
To use pagination with POST requests, go to the tab Configuration and locate the Post body section.
In this section, use the interpolation variable @next_page_token to build the post body according to the requirements of the server hosting the POST API.
Here’s an example:
\{
"filter": \{
^ExecuteIfIsNotNull("\"offset\":\"",@next_page_token,"\"")
\}
\}
The generated request will be composed, among other things, of the body and the parameters that we indicate in the pagination section. In this example, the requested item is 1003, and the number of items is 2.
{
"filter": {
"offset":"1003",
"count":2
}
}
And the response obtained from the POST API is a JSON document like this:
{
"contacts": [
{
"id": 1003,
"name": "peter",
"phone": 33333
},
{
"id": 1004,
"name": "johny",
"phone": 44444
}
],
"has-more": true,
"id-offset": 1005
}
Where id is the ID of each contact, has-more indicates if there are more results, and id-offset indicates the ID of the next result.
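To make the request and response cycle above concrete, here is a small Python sketch of how a client could derive the body of the next request from this response. It assumes the field names of this example API (has-more, id-offset, filter, offset, count); the function next_request_body is hypothetical and does not describe how the data source is implemented.

```python
import json


def next_request_body(response_text, page_size):
    """Return the body for the next page, or None when there are no more pages."""
    response = json.loads(response_text)
    if not response.get("has-more"):
        return None
    # The offset is sent as a string, as in the generated request shown above.
    body = {"filter": {"offset": str(response["id-offset"]), "count": page_size}}
    return json.dumps(body)


response_text = '{"contacts": [], "has-more": true, "id-offset": 1005}'
print(next_request_body(response_text, page_size=2))
```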
If the Content-Type is XML, the body should be an XML structure, for example:
<root>
<filter>
^ExecuteIfIsNotNull("<vidOffset>",@next_page_token,"</vidOffset>")
</filter>
</root>
If the Content-Type is text/plain, the body should be plain text, for example:
^ExecuteIfIsNotNull("\"vidOffset\":\"",@next_page_token,"\"")
1. Obtain the “Next Page URL” from an HTTP Header of the Response¶
Use this pagination method to connect to REST or POST services that include the URL of the “next page” in an HTTP header of the responses.
In the tab Pagination, select Obtain the “next page URL” from an HTTP header of the response. Then, enter the name of the HTTP header that will contain the URL to the next page.
When a query involves this data source, the behavior of the data source is the following:
The data source sends the first request to the URL of the Configuration tab.
If the response contains the header of the field HTTP header, the data source will inspect its value:
If the value of the header starts with “http”, the data source will send a request to this URL.
If the value starts with the symbol “<”, the data source will assume the service uses the pagination method defined by the standard RFC 8288 - Web Linking. That is, it will look for the URL with the attribute rel="next".
This is to work with services that return a header like this one:
Link: <https://rest-api.acme.com/api/v1/customer.json?opaqueA>; rel="current", <https://rest-api.acme.com/api/v1/customer.json?opaqueB>; rel="next", <https://rest-api.acme.com/api/v1/customer.json?opaqueC>; rel="first", <https://rest-api.acme.com/api/v1/customer.json?opaqueD>; rel="last"
In this example, the data source will send the next request to https://rest-api.acme.com/api/v1/customer.json?opaqueB
If there is no URL marked with rel="next", the data source will consider that there are no more pages.
The data source keeps sending requests until the response does not contain this header or it reaches the value of the field Maximum number of requests.
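The header inspection described above can be sketched in Python as follows. This is a simplified illustration, not the data source's code: next_page_url and both regular expressions are hypothetical, and the sketch only handles a single relation type per link, whereas RFC 8288 also allows several space-separated types in one rel attribute.

```python
import re

# One link entry: <URL> followed by its parameters, up to the next comma.
LINK = re.compile(r"<([^>]*)>\s*;([^,]*)")
REL_NEXT = re.compile(r'\brel\s*=\s*"?next"?\s*(?:;|$)')


def next_page_url(header_value):
    """Return the URL of the next page, or None if the header has none."""
    value = header_value.strip()
    if value.startswith("http"):
        # The header contains the URL itself.
        return value
    if value.startswith("<"):
        # RFC 8288 style: pick the link whose parameters include rel="next".
        for url, params in LINK.findall(value):
            if REL_NEXT.search(params):
                return url
    return None


header = (
    '<https://rest-api.acme.com/api/v1/customer.json?opaqueA>; rel="current", '
    '<https://rest-api.acme.com/api/v1/customer.json?opaqueB>; rel="next"'
)
print(next_page_url(header))
```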
1.1 POST Considerations¶
Using the data shown in POST API request data example.
In the tab Pagination, select Obtain the “next page” token from an HTTP header of the response.
Next, enter the name of the token in the HTTP header that will store the value used to generate the URL for the next page.
For instance, the server can include the header header-token with the value indicating the next item to be retrieved, such as "header-token: 1003".
2. Obtain the “Next Page URL” from the Body of the Response¶
Use this pagination method to connect to REST services that include the URL of the “next page” in the response (not in a header).
2.1 REST¶
In the tab Pagination, select Obtain the “next page URL” from the response. Then, in Path to “next” URL in response, enter the path to the element of the response that contains the URL of the next page.
For example, if the API you are accessing returns a JSON document like this:
{
"name": "p_echo",
"elements": [{
"customer_id": "1"
}, {
"customer_id": "2"
}, {
"customer_id": "3"
}
],
"links": [{
"rel": "next",
"title": "Next interval",
"href": "http://rest-api.acme.com/my_service?$start_index=10&$count=10"
}
]
}
In Path to “next” URL in response, enter /links/href. The syntax is the same for XML data sources. XML data sources do not take into account the namespaces to locate the element with the URL.
When a query involves this data source, the behavior of the data source is the following:
The data source sends a request to the URL of the Configuration tab.
If the response contains the element indicated in Path to “next” URL in response, the data source reads the value and sends a request to this URL.
If the value obtained from the response does not start with “http”, the data source will complete the URL of the next request by copying the protocol (HTTP or HTTPS) and the host name from the URL of the data source (that is, the field URL of the tab Configuration).
The data source keeps sending requests until the response does not contain the Path to “next” URL in response or the data source reaches the value of the field Maximum number of requests.
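The rule for completing a “next” value that does not start with “http” can be sketched like this in Python. The helper complete_next_url is hypothetical; it only illustrates copying the protocol and the host name from the URL of the data source.

```python
from urllib.parse import urlsplit


def complete_next_url(next_value, data_source_url):
    """Return an absolute URL for the next request."""
    if next_value.startswith("http"):
        return next_value
    base = urlsplit(data_source_url)
    # Copy only the protocol and the host name from the data source URL.
    return f"{base.scheme}://{base.netloc}/{next_value.lstrip('/')}"


print(complete_next_url(
    "/my_service?$start_index=10&$count=10",
    "https://rest-api.acme.com/my_service",
))
```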
See also Common Behavior
2.2 POST¶
Using the data shown in POST API request data example.
Here, we will send a request with that body to the POST API, and the response received will be similar to the JSON document shown before. Note that the names of the fields depend on the API implementation.
In the tab Pagination, select Obtain the “next page” token from the response. Then, in Path to “next” page in response, enter the path to the element of the response that will be used to obtain the next page. In this example, enter /id-offset. The syntax is the same for XML data sources. XML data sources do not take into account the namespaces to locate the element with the URL.
When a query involves this data source, the behavior of the data source is the following:
The data source sends a request to the URL of the Configuration tab.
If the response contains the element indicated in Path to “next” page in response, the data source reads the value and sends a request to this URL.
If the value obtained from the response does not start with “http”, the data source will complete the URL of the next request by copying the protocol (HTTP or HTTPS) and the host name from the URL of the data source (that is, the field URL of the tab Configuration).
The data source keeps sending requests until the response does not contain the has-more field or the data source reaches the value of the field Maximum number of requests.
See also Common Behavior
2.3 Common Behavior¶
If you provided a value for Parameter in URL for page size and Page size, the data source will add the query parameter <Parameter in URL for page size>=<Page size> to the URL of all the requests it sends.
If Path to “next” URL in response is a field of an array, the data source will read the first element of the array. The data source cannot read the URL from an array if the URL is in the second, third… element of the array.
If in the responses of this API, the field name with the URL of the next page contains the character “/”, you have to escape this character with another “/”. For example, if the name of the field is “next/link”, in Path to “next” URL in response you have to enter this:
/pagination/next//link
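The path rules above (escaping “/” with “//” and reading only the first element of an array) can be illustrated with a short Python sketch for JSON responses. The functions split_path and read_path are hypothetical and are not the parser used by the data source.

```python
def split_path(path):
    """Split on "/" while treating "//" as a literal "/" inside a field name."""
    placeholder = "\0"
    parts = path.replace("//", placeholder).strip("/").split("/")
    return [part.replace(placeholder, "/") for part in parts]


def read_path(document, path):
    """Return the value at the given path, or None if it does not exist."""
    node = document
    for name in split_path(path):
        if isinstance(node, list):
            # Only the first element of an array is inspected.
            node = node[0] if node else None
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node


print(split_path("/pagination/next//link"))  # ['pagination', 'next/link']
print(read_path({"links": [{"href": "http://acme/a"}, {"href": "http://acme/b"}]},
                "/links/href"))              # http://acme/a
```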
3. Obtain a Token from the Response¶
Use this pagination method to connect to REST services that include a token in the response and this token has to be added to the URL to obtain the “next page”.
In the tab Pagination, select Token continuation and provide this information:
Parameter in URL for “next” token: parameter in the URL that indicates the page token.
Path to “next” token in response: path to the “next” token, in the JSON or XML of the response.
Parameter in URL for page size (optional): parameter in the URL that indicates the size of the page.
Page size (optional): number (greater than zero) of records you want each request to return. Some APIs limit the number of records that they return per request.
When a query involves this data source, the behavior is the following:
The data source sends a request to the URL of the Configuration tab.
If the response contains the element indicated in Path to “next” token in response, the data source reads this value. Then, it sends a request to the URL of the Configuration tab, adding the parameter Parameter in URL for “next” token = <token read from the response>.
For example:
If URL is “http://rest-api.acme.com/my_service”
The value Parameter in URL for “next” token is “nextPageToken”
In the response to the first request, the value of Path to “next” token in response is “jkldz82719aa”.
The data source will send the second request to “http://rest-api.acme.com/my_service?nextPageToken=jkldz82719aa”
The data source will keep sending requests until it reaches the Maximum number of requests or the response does not include any token (that is, the element indicated in Path to “next” token in response does not exist in the response).
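The request loop described above can be sketched in Python as follows. It is an illustration under assumptions: paginate_with_token and fake_fetch are hypothetical, the fetch callable stands in for the HTTP request plus the extraction of the token, and the token is appended without URL-encoding.

```python
def paginate_with_token(fetch, url, token_param, max_requests):
    """Yield (request_url, rows); fetch(url) returns (rows, next_token or None)."""
    token = None
    for _ in range(max_requests):
        # The first request has no token; later ones add it as a query parameter.
        request_url = url if token is None else f"{url}?{token_param}={token}"
        rows, token = fetch(request_url)
        yield request_url, rows
        if token is None:
            # No token in the response: there are no more pages.
            break


# Canned responses standing in for the service of the example.
pages = {
    "http://rest-api.acme.com/my_service": (["row 1"], "jkldz82719aa"),
    "http://rest-api.acme.com/my_service?nextPageToken=jkldz82719aa": (["row 2"], None),
}


def fake_fetch(request_url):
    return pages[request_url]


for request_url, rows in paginate_with_token(
        fake_fetch, "http://rest-api.acme.com/my_service", "nextPageToken", 10):
    print(request_url, rows)
```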
Considerations
If you provided a value for Parameter in URL for page size and Page size, the data source will add the query parameter <Parameter in URL for page size>=<Page size> to the URL of all the requests it sends.
If in the responses of this API, the field name with the token contains the character “/”, you have to escape this character with another “/”. For example, if the name of the field is “next/link”, in Path to “next” token in response, enter this:
/pagination/next//link
Although it is not common, some APIs return the same token as the previous page when there is no more data to return. For example, in the first page, the token is “token_XYZ”, in the second page is “token_ABC” and in the third and last page, “token_ABC” again, to indicate that the third page is the last one. By default, the data source will keep obtaining the last page indefinitely because it finds a token in the response. If you want the data source to stop making requests when the token is the same as in the previous page, execute this:
-- With this command, all the HTTP requests to "acme.com" with pagination will
-- stop when the token of a page is the same as the token of the previous page.
SET 'com.denodo.parser.connection.http.pagination.stopWhenPaginationTokenDoesNotChange.acme.com' = 'true';
In the command above, replace “acme.com” with the hostname of the data source. If the hostname has a subdomain, you have to include it as well. For example, if the URL of the data source is like “https://roadrunner.acme.com/api/…”, the property has to be “com.denodo.parser.connection.http.pagination.stopWhenPaginationTokenDoesNotChange.roadrunner.acme.com”.
If you want all data sources to have this behavior, execute this instead:
-- The difference with the previous command is that this one does not
-- include the hostname
SET 'com.denodo.parser.connection.http.pagination.stopWhenPaginationTokenDoesNotChange' = 'true';
Although it is uncommon, some APIs expect the client application (in this case, Virtual DataPort) to URL-encode the continuation tokens returned by the API. By default, the data sources do not encode these tokens but you can configure them to do so. To do this, execute this:
-- With this command, all the HTTP requests to "roadrunner.acme.com" with pagination will
-- encode the continuation token.
SET 'com.denodo.parser.connection.http.connection.pagination.encodeNextToken.roadrunner.acme.com' = 'true';
In the command above, replace “roadrunner.acme.com” with the hostname of your data source.
If you want all the data sources to behave this way, execute this instead:
-- The difference with the previous command is that this one does not
-- include the hostname
SET 'com.denodo.parser.connection.http.connection.pagination.encodeNextToken' = 'true';
To go back to the default behavior (i.e. not URL-encoding these tokens), execute this to remove these configuration properties:
-- Setting a configuration property to NULL, removes the property.
SET 'com.denodo.parser.connection.http.connection.pagination.encodeNextToken.roadrunner.acme.com' = NULL;
SET 'com.denodo.parser.connection.http.connection.pagination.encodeNextToken' = NULL;
4. Pagination with Indices¶
Use this pagination method to connect to REST services that do not support the previous pagination methods but support one of these:
Indicating the page number as a parameter on the URL.
Indicating the range of rows as parameters on the URL (for example, obtain the records 50 to 100).
In the tab Pagination, select Paging indices and provide this information:
Parameter in URL for page size: parameter in the URL that will contain the number of records per page.
Page size: value in the URL of the parameter indicated in Parameter in URL for page size.
Parameter in URL for next records: parameter in the URL that indicates the index of the page.
Index of the first record: index of the first page (usually it will be 0 or 1).
Offset for the next requests: the number of records that will be omitted in the next request.
Maximum number of requests: maximum number of requests.
The request is built by setting the index of the next record to be retrieved in the URL.
Important
During the process of creating a base view over a data source with Pagination with Indices, always enter a tuple root. The goal of setting a tuple root instead of using the default one is that with this pagination method, the number of rows returned by the base view determines if the data source has to keep sending requests or stop. When the second request returns fewer rows than the first request, the data source stops sending requests.
This warning does not apply to DF data sources, only to JSON and XML.
Note
For requests 0 to N the URL is built like this:
URL?<Parameter in URL for page size>=<Page size>&<Parameter in URL for next records>=<Index of the first record + (Offset for the next requests * N)>
When a query involves this data source, the behavior is the following:
To send the first request, the data source adds the following parameters to the URL of the Configuration tab:
Parameter in URL for page size=Page size
Parameter in URL for next record=Index of first record
To send the following requests, the data source adds the following parameters to the URL of the Configuration tab:
Parameter in URL for page size=Page size
Parameter in URL for next records=Index of the first record + (Offset for the next requests * N), where N is the number of requests already sent.
The data source will keep sending requests until it reaches the Maximum number of requests or the number of rows returned by the base view is zero.
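The formula given in the note above can be written out as a short Python sketch. It applies that formula literally and is not taken from the product; index_request_urls is a hypothetical helper, and the parameter name page is an assumption made for an API that paginates by page number.

```python
def index_request_urls(url, size_param, page_size, index_param,
                       first_index, offset, count):
    """Return the URLs of requests 0..count-1 following the formula in the note."""
    return [
        f"{url}?{size_param}={page_size}&{index_param}={first_index + offset * n}"
        for n in range(count)
    ]


# Hypothetical API that takes a page number starting at 1.
for request_url in index_request_urls(
        "http://rest-api.acme.com/my_service", "count", 100, "page", 1, 1, 3):
    print(request_url)
```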
Example
Let us say that you want to invoke an endpoint of an API that has these parameters:
“start_index”: index of the first record of the entire result set you want to obtain.
“count”: number of records per response. In this example, you want to obtain 100 records per request.
In this scenario, you will have to enter these values:
Parameter in URL for page size = count
Page size = 100
Parameter in URL for next records = start_index
Offset for the next requests = 5
Index of the first record = 0 (considering that the first page in this API is 0)
When you query a base view of this data source, the data source will send these requests:
Request #1: http://rest-api.acme.com/my_service?count=100&start_index=0
Request #2: http://rest-api.acme.com/my_service?count=100&start_index=105
Request #3: http://rest-api.acme.com/my_service?count=100&start_index=210
and it continues until a request returns fewer rows than the previous request or the data source reaches the Maximum number of requests.
Proxy Settings¶
In the Proxy tab, you can set a proxy configuration for this data source or use the Default configuration of the Server (see section Default Configuration of HTTP Proxy).
Authentication in HTTP Paths¶
The supported authentication methods for HTTP connections are:
Basic. The credentials are sent in plain text (RFC 2617 - HTTP Authentication: Basic and Digest Access Authentication).
Digest. The credentials are sent encrypted.
Mutual (two-way SSL). See section Mutual Authentication below.
NTLM. Uses the Microsoft NTLM Authentication (NT LAN Manager Authentication Protocol Specification) to connect to the service. Virtual DataPort supports NTLM v1 and NTLM v2.
OAuth 1.0a and OAuth 2.0. See section OAuth Authentication.
SPNEGO (Kerberos). See section SPNEGO (Kerberos) below.
If you select the check box Pass-through session credentials (available for the authentication methods “Basic”, “Digest”, “NTLM” and “SPNEGO (Kerberos)”), when a client executes a query that involves this data source, the credentials used to send a request to the service are the credentials of the user that executes the query; not the credentials of the fields “Login” and “Password”. When this option is selected, the credentials of the fields “Login” and “Password” are used only when creating base views over this data source, to send a request to the service and analyze the output of the URL.
The section SPNEGO (Kerberos) explains in detail the behavior of Virtual DataPort when the authentication method is “SPNEGO (Kerberos)” and “Pass-through session credentials” is selected.
Warning
Be careful when enabling the cache on views that involve data sources with pass-through credentials enabled. The appendix Considerations When Configuring Data Sources with Pass-Through Credentials explains the issues that may arise.
Mutual Authentication¶
When establishing an SSL/TLS communication (e.g. with “https”), the client (in this case, Denodo) verifies the identity of the service by checking if the certificate used by this service is signed by a certification authority (CA). With “mutual authentication” (also known as two-way SSL/TLS), the client (in this case, Denodo server) also uses a certificate for authentication instead of user and password or an OAuth token.
To use this feature, you need a key store file that contains the private key to access the service. This file has to be in the formats PKCS#12 or Java Key Store (JKS).
To enable this authentication method on an HTTP route, follow these steps:
In the “Edit HTTP connection” dialog, click the Authentication tab.
In the Authentication list, select Mutual (two-way SSL).
In Certificate password, enter the password of the file with the private key.
Click Load certificate and select the file with the private key.
If the certificate is valid, the tool will display the issuer of the certificate, the expiration date of the certificate, etc.
Note
If you want Virtual DataPort to validate the certificate sent by the service, select Check certificates in the Configuration tab. In order for this validation to succeed, the certificate used by the service has to be signed by a Certification Authority (CA). Otherwise, you have to import the certificate into the TrustStore of the Denodo server or the communication will fail.
OAuth Authentication¶
OAuth is an authorization framework that allows third-party applications (in this case, Virtual DataPort), to access resources on a server on behalf of a resource owner.
The main benefit is that you do not need to share your username and password with third-party applications in order to authorize them to access your data.
The following subsections explain how to use the wizards that help you obtain the credentials needed to connect to a service with OAuth 1.0a or OAuth 2.0 authentication.
Note
Before creating the data sources in Virtual DataPort, you have to register Virtual DataPort as an application in the service that you want to access.
Note
We recommend creating a single data source for all the views that retrieve data from the same OAuth-authenticated service. The reason is that, if at any point, the OAuth credentials change, you will only have to change them in one data source. To do this, you can create the data source with an interpolation variable in the URL (http://service.com/@OBJECT_TYPE/).
OAuth 1.0a¶
Note
If you are not sure if your service uses OAuth 1.0a or 2.0, most likely it uses OAuth 2.0, not this one.
This section explains how to configure an “HTTP Client” route to retrieve data from a service with OAuth 1.0a authentication. The Tool provides the OAuth 1.0a credentials wizard to help you obtain these credentials.
Follow these steps:
In the “Edit HTTP connection” dialog, click the Authentication tab.
Select OAuth 1.0a in the Authentication list.
Enter the Client identifier and the Client shared secret provided by the service.
Select the Signature method. The HMAC-SHA1 signature is the most used, so it is usually the right option.
If you already have the Access token and the Access token secret, enter them in the boxes below and click Ok.
If you do not have these tokens, click launch the OAuth 1.0a credentials… to open the wizard that will help you obtain them.
Enter the Temporary credential request URL, the Resource owner authorization URL and the Token request URL.
The documentation of the service you are accessing must provide these details.
Select the Callback URL. When you get to the step 2 of the wizard, you will have to open a URL in your browser. In this URL, the service displays a page where you have to authorize Virtual DataPort to access your data. If you proceed, you will obtain the Verification code, which Virtual DataPort will use to send an HTTP request to the service. The response will contain the Access token and the Access token secret.
The Callback URL determines how the service will return the Verification code.
Note
Depending on the service, you may not be able to choose freely among these options. Some services force you to use a specific redirect URL, others only allow oob, etc.
oob: with this option, the wizard will request the service to display the Verification code in your browser after the authentication process.
If you select the second or the third option, the service will redirect your browser to this URL and it will add the parameter code to it. The value of this parameter is the Verification code.
The default URL (http://localhost:9090/oauth/1.0a/callbackURL.jsp) points to a JSP located in the Apache Tomcat embedded with the Denodo Platform, which will display the value of the code parameter in a box that makes it easier to copy it.
If you have to indicate another callback URL, you will have to extract manually the value of the code parameter from the URL.
Click Generate the authorization URL. Virtual DataPort will request a Temporary token and with it, it will generate the Authorization URL.
Click Open URL. If the browser is not launched, copy the URL and open it manually.
In this URL, you have to authorize the Virtual DataPort server to retrieve data from the service.
After authorizing Virtual DataPort to access your data, the service returns the Verification code. Enter this code in the Paste the verification code text field.
If the Callback URL is oob, you have to type the value. If you have selected the default URL, you can copy it and paste it into this box.
Click Obtain the OAuth 1.0a credentials. The Server will request the OAuth tokens using all the details you have provided and the Verification code.
Click Ok to close the wizard.
The wizard will fill the text areas “OAuth access token” and “OAuth access token secret”.
Click Ok to close the “Edit HTTP Connection” dialog and then, Save to create the data source.
To use this wizard independently, click OAuth 1.0a wizard on the menu Tools > OAuth credentials wizards of the Administration Tool.
You may need to use this wizard when using a custom wrapper whose input parameters are OAuth credentials.
OAuth 2.0¶
This section explains how to configure an “HTTP Client” route to retrieve data from a service with OAuth 2.0 authentication. The Tool provides the OAuth 2.0 credentials wizard to help you obtain these credentials.
Note
If you are not sure if your service uses OAuth 1.0a or 2.0, most likely it uses OAuth 2.0.
Follow these steps:
In the “Edit HTTP connection” dialog, click the Authentication tab.
Select OAuth 2.0 in the Authentication list.
Select the appropriate Authentication grant:
Authorization code grant. This is the safest option because you do not have to enter your user name and password for the service. You only have to obtain an “access token” and “refresh token”, which you can do with the help of the wizard of this dialog. An additional benefit of this grant is that generally - it depends on the service - you can limit the operations this data source will be able to do (e.g. only allow read access to the data). In addition, if the access token or the refresh token is ever compromised, it can be revoked without having to change the password of your user account in the service.
Resource owner password credentials
Client credentials grant
The second and third options are easier to configure because you do not have to obtain the access token or the refresh token. However, they do not have the benefits of the first option.
Check the documentation of this service to see what options are available.
These options are described in detail in the standard (RFC 6749 - The OAuth 2.0 Authorization Framework).
Enter the Client identifier and the Client secret provided by the service.
Enter the User identifier and the User password (only if you selected Resource owner password credentials).
Select one of the options of Authentication method used by the authorization server. This controls how Virtual DataPort will send the credentials to the service when requesting a new OAuth access token. The options are:
Include the client credentials in the body of the request: Virtual DataPort will add the credentials to the body of the request, in the parameters client_id and client_secret.
Send credentials using the HTTP Basic authentication scheme: Virtual DataPort will send the credentials in the Authorization header of the HTTP request.
These two options are described in the section “2.3.1. Client Password” of the OAuth 2.0 specification (RFC 6749 - The OAuth 2.0 Authorization Framework).
Although the first option is more common, some services require the second one.
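The two methods differ only in where the client credentials travel. The sketch below follows section 2.3.1 of RFC 6749 and builds both variants of a token request without sending them. The endpoint URL, client identifier and secret are placeholders, the function names are hypothetical, and the code does not claim to reproduce what Virtual DataPort does internally.

```python
import base64
from urllib.parse import quote_plus, urlencode
from urllib.request import Request

TOKEN_URL = "https://auth.example.com/oauth/token"  # placeholder endpoint


def request_with_body_credentials(client_id, client_secret, params):
    """Option 1: credentials go in the body as client_id / client_secret."""
    body = dict(params, client_id=client_id, client_secret=client_secret)
    return Request(TOKEN_URL, data=urlencode(body).encode("ascii"), method="POST")


def request_with_basic_auth(client_id, client_secret, params):
    """Option 2: credentials go in the Authorization header (HTTP Basic)."""
    # RFC 6749 form-encodes both values before joining them with ":".
    pair = f"{quote_plus(client_id)}:{quote_plus(client_secret)}"
    header = "Basic " + base64.b64encode(pair.encode("ascii")).decode("ascii")
    return Request(
        TOKEN_URL,
        data=urlencode(params).encode("ascii"),
        headers={"Authorization": header},
        method="POST",
    )


params = {"grant_type": "client_credentials"}
for req in (request_with_body_credentials("my-client", "s3cret", params),
            request_with_basic_auth("my-client", "s3cret", params)):
    # Show the header (None for option 1) and the body of each request.
    print(req.get_header("Authorization"), req.data.decode("ascii"))
```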
If you already have the OAuth access token, enter it in the Access token box and select the appropriate Request signing method. If you also have the Refresh token, enter it in the Refresh token box, enter the value of the Token endpoint URL and, if you know it, the number of seconds until the access token expires.
If you do not have the access token and it will be provided at runtime instead of being stored in the data source, select Access token value is an interpolation variable and, in the box below, enter a name for the variable. At runtime, the queries to the base views of this data source will have to provide a value for this variable. This value will be the access token used to connect to the source. This option is useful if the source requires OAuth 2.0 authentication but does not fully implement the standard. In this case, you can develop a stored procedure that obtains this token and pass it to the base view.
If you do not have the access token and want to obtain it from the source, click launch the OAuth 2.0 credentials… to open the wizard that will help you obtain it.
Enter the Token endpoint URL.
Only if you selected Authorization code grant, enter the Authorization server URL.
Only if you selected Authorization code grant, select the Redirect URI. When you get to step 2 of the wizard, you will have to open a URL in your browser. In this URL, the service displays a page where you have to authorize Virtual DataPort to access your data.
If you proceed, the service will redirect your browser to the Redirect URI and it will add several parameters to it. Virtual DataPort will use the values of these parameters to send an HTTP request to the service. The response will contain the Access token and maybe, the Refresh token.
Click the button for each scope you want to add and enter its name.
Scopes are “privileges” defined by the service, which control the data that the application can request.
For example, Twitter defines several scopes and depending on the scopes requested in this wizard, Virtual DataPort will be able to retrieve your tweets, but may not post new ones on your behalf.
Only do steps e. to h. if you selected the Authorization code grant in the previous dialog.
Usually, you can leave the Set the “state” request parameter selected. However, if the process of obtaining the OAuth credentials fails, check that the service allows setting this parameter.
Click on Generate the authorization URL.
Virtual DataPort will generate a URL with all the parameters you have provided.
Click on Open URL.
If the browser is not launched, copy the URL and open it manually.
In this URL, you have to authorize the Virtual DataPort server to retrieve data from the service.
After authorizing the application, the service will redirect you to a URL. Paste this URL in the text field of step 3.
Click on Obtain the OAuth 2.0 credentials.
The Server will request the OAuth credentials using all the details you have provided and the parameters of the URL you have pasted in the previous step.
Click Ok to close the wizard.
The wizard will fill the text areas and text fields with the information returned by the service.
Not all the services provide a Refresh token, so this text area may be empty.
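The wizard steps for the Authorization code grant correspond to the flow described in section 4.1 of RFC 6749. As a rough illustration, the Python sketch below builds an authorization URL with a “state” parameter and then converts the redirect URL pasted in step 3 into the parameters of the token request. All URLs, the client identifier and the function names are hypothetical, and the actual requests sent by the wizard may differ.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(authorization_url, client_id, redirect_uri, scopes, state):
    """Build the URL the user opens in the browser to authorize access."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    }
    return authorization_url + "?" + urlencode(params)


def token_request_params(pasted_url, redirect_uri, expected_state):
    """Turn the pasted redirect URL into the parameters of the token request."""
    query = parse_qs(urlparse(pasted_url).query)
    # The service must echo back the same "state" value that was sent.
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: the redirect does not belong to this request")
    return {
        "grant_type": "authorization_code",
        "code": query["code"][0],
        "redirect_uri": redirect_uri,
    }


state = secrets.token_urlsafe(16)
url = build_authorization_url(
    "https://auth.example.com/authorize", "my-client",
    "https://localhost/callback", ["read"], state)
print(url)
```

If a service rejects the “state” parameter, the equivalent of clearing Set the “state” request parameter would be to omit it from the URL and skip the comparison.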
Select the Request signing method. Virtual DataPort has to sign each request with the Access token. Usually, all OAuth services allow the “Authorization” request header method, which consists of adding a special HTTP header to the request. If the service does not support this method, you can select the other methods defined by the standard:
Form-encoded body parameter: send the token in the body of the request (only available with HTTP POST requests)
URL query parameter (“access_token”): the token is sent in the parameter access_token of the URL.
Or, add the token as a query parameter with a name different from “access_token” (URL custom query parameter).
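The signing methods can be compared with a short Python sketch. It assumes bearer tokens as defined in RFC 6750 (the header value “Bearer” followed by the token), uses a placeholder API URL, and builds the requests without sending them. The function and the method names “header”, “form_body” and “query” are hypothetical labels, not Virtual DataPort options.

```python
from urllib.parse import urlencode, urlparse, urlunparse, parse_qsl
from urllib.request import Request


def _with_query_param(url, name, value):
    """Append one query parameter to a URL, keeping the existing ones."""
    parts = urlparse(url)
    query = parse_qsl(parts.query, keep_blank_values=True) + [(name, value)]
    return urlunparse(parts._replace(query=urlencode(query)))


def sign_request(url, token, method="header", body=None, param_name="access_token"):
    """Attach the access token to a request using one of the three methods."""
    if method == "header":
        return Request(url, headers={"Authorization": f"Bearer {token}"})
    if method == "form_body":
        # Only valid for requests with a form-encoded body, such as POST.
        fields = dict(body or {}, access_token=token)
        return Request(url, data=urlencode(fields).encode("ascii"), method="POST")
    if method == "query":
        # param_name other than "access_token" gives the custom variant.
        return Request(_with_query_param(url, param_name, token))
    raise ValueError(f"Unknown signing method: {method}")


req = sign_request("https://api.example.com/items?page=2", "tok", "query")
print(req.full_url)
```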
If you do not have the refresh token and it will be provided at runtime instead of being stored in the data source, select Refresh token value is an interpolation variable and, in the box below, enter a name for the variable. At runtime, the queries to the base views of this data source will have to provide a value for this variable. This value will be used to refresh the access token if necessary. This option is useful if the source requires OAuth 2.0 authentication but does not fully implement the standard. In this case, you can develop a stored procedure that obtains this token and pass it to the base view.
Some REST APIs with OAuth 2.0 authentication require clients to send additional parameters. Click extra parameters of the refresh token requests to add these parameters. Denodo will add them to these requests:
In the requests to obtain an access token for the first time. That is, it will add this parameter to the URL generated in the OAuth Credentials Wizard, when you click Obtain the OAuth 2.0 credentials.
In the requests the data source sends to refresh the access token. That is, when the current access token expires and the data source has to obtain a new one, using the refresh token.
For instance, when connecting to Microsoft services on the cloud with OAuth 2.0 authentication, you may need to add the parameter resource. Its value is the identifier of the application you want to connect to. This is a requirement of this API.
In this dialog, if you select the check box Encrypted next to the value of the parameter, the data source stores this value encrypted.
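To show where such an extra parameter ends up, the Python sketch below builds the body of a refresh token request as defined in section 6 of RFC 6749 and merges the extra parameters into it. The token, client values and the resource identifier are placeholders, the function name is hypothetical, and sending the client credentials in the body is just one of the two authentication methods described earlier.

```python
from urllib.parse import urlencode


def refresh_request_body(refresh_token, client_id, client_secret, extra_params=None):
    """Build the form-encoded body of a refresh token request."""
    body = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    # Extra parameters required by the service, e.g. "resource".
    body.update(extra_params or {})
    return urlencode(body)


print(refresh_request_body("r3fresh", "my-client", "s3cret",
                           {"resource": "https://example-app-id"}))
```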
If you select Pass-through session credentials, select one of the following pass-through strategies:
Important
The selected strategy is only relevant with pass-through session credentials when the user logged in to Virtual DataPort with OAuth authentication.
Token exchange flow (RFC 8693). The data source will use the Token exchange flow (defined in the RFC 8693) to retrieve the access token to establish the connection.
On-behalf-of flow (Azure AD). The data source will use the Microsoft proprietary On-behalf-of flow to retrieve the access token to establish the connection.
OAuth Token pass-through. The data source will connect to the source with the same access token the client application used to log in to Virtual DataPort.
In addition, some fields will be set to default values and new fields will be displayed:
Authentication grant will be selected as Resource owner password credentials.
Authentication method used by the authorization server will be selected as Include the client credentials in the body of the request.
Request signing method will be selected as “Authorization” request header.
Scope will appear at the end. Enter the scopes to request when obtaining a new access token in case the access token of the session expires.
Note
If the user authenticates through Denodo SSO, it is important to configure the Denodo Security Token to include the original OAuth token (see Denodo Security Token Configuration File) to make sure that the Virtual DataPort server retrieves the token that will be used in the pass-through strategy.
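For reference, the first two pass-through strategies exchange the access token of the session for a new one by posting a form to the token endpoint. The Python sketch below lists the parameters defined by RFC 8693 and by the Microsoft identity platform documentation for the On-behalf-of flow. The token and scope values are placeholders, the function names are hypothetical, client authentication is omitted, and the exact requests Virtual DataPort sends are not shown in this page.

```python
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def token_exchange_body(session_token, scopes):
    """Parameters of a Token exchange request (RFC 8693)."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": session_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "scope": " ".join(scopes),
    }


def on_behalf_of_body(session_token, scopes):
    """Parameters of a Microsoft On-behalf-of request."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": session_token,
        "requested_token_use": "on_behalf_of",
        "scope": " ".join(scopes),
    }


# Placeholder token and scope, for illustration only.
print(token_exchange_body("session-token", ["read"]))
print(on_behalf_of_body("session-token", ["read"]))
```

With OAuth Token pass-through there is no exchange: the token of the session is sent to the source unchanged.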
Click Ok to close Edit HTTP Connection and then, Save to create the data source.
Usually, you only need to launch the OAuth 2.0 wizard from the dialogs “Create JSON data source” or “Create XML data source”. However, if you need to use this wizard independently, you can do so by clicking on OAuth 2.0 wizard on the menu Tools > OAuth credentials wizards of the Administration Tool.
You may need to use this wizard when using a custom wrapper whose input parameters are OAuth credentials.
SPNEGO (Kerberos)¶
When the authentication method of the data source is “SPNEGO (Kerberos)”, Virtual DataPort will use a Kerberos ticket to add an authentication header to the HTTP requests sent to the service.
If you clear the check box Pass-through session credentials, the Server will use the values of the “Login” and “Password” boxes to connect to the Key Distribution Center (KDC) and request a Kerberos service ticket.
If you select the check box Pass-through session credentials, Virtual DataPort will use the credentials of the client to obtain a Kerberos service ticket, on behalf of the client that is executing the query that involves this data source. The exact behavior of Virtual DataPort depends on the authentication method used by the client:
The client connects to the Virtual DataPort server using Kerberos authentication: the Server will request a service ticket from the Key Distribution Center (KDC) on behalf of the client that executes the query, using the ticket-granting ticket (TGT) obtained when this client opened the connection to the Server. Then, it will use this service ticket to add an authentication header to the HTTP requests sent to the service.
The client connects to the Virtual DataPort server using standard authentication: the Server will request a service ticket from the KDC using the user name and password of the client that executes the query. Take into account the following:
If the Virtual DataPort server is running on Windows but the host does not belong to a Windows domain, define the system properties “java.security.krb5.realm” and “java.security.krb5.kdc” as explained in the section Enabling Kerberos Authentication Without Joining a Kerberos Realm of the Installation Guide.
If the Virtual DataPort server is running on Linux, you need the system to have a krb5.ini file. See the section Providing a Krb5 File for Kerberos Authentication of the Installation Guide for more information about how to check if there is already one in your system.
Connection Pool¶
By default, data sources with HTTP paths maintain a pool of connections to the source so that requests are executed faster. This connection pool can be disabled by setting the following properties:
-- This property indicates if connections are checked when they are retrieved from the pool.
-- (Default value: false)
SET 'com.denodo.parser.connection.http.testOnBorrow' = 'true';
-- This property indicates the time, in milliseconds, that a connection can remain idle in the pool, available to execute a new request.
-- When this timeout is reached, the connection is removed from the pool in the next check. This property only applies when "testOnBorrow" is true.
-- (Default value: 10000)
SET 'com.denodo.parser.connection.http.minEvictableIdleTimeMillis' = 0;