Goal

This document describes how to connect to Google Cloud Storage from Denodo Virtual DataPort.

Content

Google Cloud Storage is Google's object storage solution for the cloud. It is optimized for storing massive amounts of unstructured data, such as text or binary data.

Virtual DataPort can connect to Google Cloud Storage to use it as a data source and import information from it.

Prerequisites

To access file types in Google Cloud Storage that are not supported by the Denodo Distributed File System Custom Wrapper, complete the following prerequisites in the Google Cloud Platform Portal:

  1. Create OAuth Client ID credentials in the “APIs & Services” section of the Google Cloud Platform.
  2. Create a bucket in Google Cloud Storage and upload files to it.

Create OAuth Client ID credentials

  • To create an OAuth client credential, navigate to the “APIs & Services” section of the Google Cloud Platform Portal and select the “Credentials” tab. Click “+ Create Credentials -> OAuth Client ID” to open the “Create OAuth Client ID” wizard, provide the required details, and click the “CREATE” button to define the credentials in the Google Cloud Platform.

  • Once the application credentials are created, a pop-up displays “Your Client ID” and “Your Client Secret”.

  • Once you close this pop-up, the newly created credentials are listed under “OAuth 2.0 Client IDs”.

  • On selecting the application, you will see the Client ID, Client Secret and Creation Date of the credentials. You can also download the credentials in JSON format by clicking the “Download JSON” option at the top of the screen.

  • The downloaded JSON file contains the following fields, which are required to configure OAuth:
      • client_id
      • auth_uri
      • token_uri
      • client_secret
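
As an illustration, the four values can be read from the downloaded file with a few lines of Python. This is a minimal sketch and not part of the Denodo configuration: the function name extract_oauth_settings and all sample values are hypothetical, and it assumes the fields are nested under a single top-level key such as "web" or "installed", which is how Google typically structures this file.

```python
import json

REQUIRED_FIELDS = ("client_id", "auth_uri", "token_uri", "client_secret")


def extract_oauth_settings(credentials_json: str) -> dict:
    """Return the four OAuth fields from the text of a downloaded credentials file."""
    document = json.loads(credentials_json)
    # Assumption: the fields sit under one top-level key ("web" or "installed").
    # Fall back to the top level of the document if neither key is present.
    section = document.get("web") or document.get("installed") or document
    missing = [name for name in REQUIRED_FIELDS if name not in section]
    if missing:
        raise KeyError("missing fields: " + ", ".join(missing))
    return {name: section[name] for name in REQUIRED_FIELDS}


# Hypothetical sample content; real values come from the "Download JSON" option.
SAMPLE = json.dumps({
    "web": {
        "client_id": "example-client-id.apps.googleusercontent.com",
        "project_id": "example-project",
        "auth_uri": "https://accounts.google.com/o/oauth2/auth",
        "token_uri": "https://oauth2.googleapis.com/token",
        "client_secret": "example-secret",
    }
})

settings = extract_oauth_settings(SAMPLE)
print(settings["auth_uri"])
```

The client_id and client_secret values are the ones later entered in the “Authentication” tab of the data source, and auth_uri and token_uri are the ones entered in the OAuth 2.0 Credentials Wizard.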

Create a Bucket and Files

  • After configuring the credentials, click the “Storage” section in the navigation pane. To create a new bucket, click “Create Bucket” and follow the wizard.
  • Once the bucket is created, open it and click “UPLOAD FILES” to upload a file into the bucket.
  • Once the file is uploaded, it is listed in the bucket.

  • Selecting the file displays its details. To access this file from the Denodo Platform, we will use the REST API URL for the file. The REST API of the Google Cloud services is described in the Google Cloud documentation.
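
For reference, the sketch below builds the REST API URL of an uploaded file in Python. It assumes the media download form of the Cloud Storage JSON API, https://storage.googleapis.com/storage/v1/b/<bucket>/o/<object>?alt=media, in which the object name must be URL-encoded; check the Google Cloud documentation for the form that applies to your case. The function name and the bucket and file names are hypothetical.

```python
from urllib.parse import quote


def build_object_media_url(bucket: str, object_name: str) -> str:
    """Build a JSON API URL that returns the content of one object."""
    # The object name is a single path segment, so "/" inside it must be
    # percent-encoded as well (safe="" disables the default "/" exemption).
    encoded_bucket = quote(bucket, safe="")
    encoded_object = quote(object_name, safe="")
    return (
        "https://storage.googleapis.com/storage/v1/b/"
        f"{encoded_bucket}/o/{encoded_object}?alt=media"
    )


# Hypothetical bucket and file names.
url = build_object_media_url("my-bucket", "folder/data.json")
print(url)
```

A URL of this kind is what goes into the URL field of the “HTTP Client” data route.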

Connecting to Google Cloud Storage files from Denodo

After completing the above steps, follow the steps in this section to connect to Google Cloud Storage from the Denodo Platform.

 

  • Launch the Web Design Studio and navigate to New > Data Source in the contextual menu. Select the type of data source that matches the type of file you want to access in Google Cloud Storage. In this example, a “JSON” data source is used.

Note: For versions prior to Denodo 8, you can use the Virtual DataPort Administration Tool to follow these steps.

  • Select the “HTTP Client” option as the Data Route parameter.

  • Expand “Configuration” and configure the “HTTP Client” data route to access Google Cloud Storage:
  1. HTTP Method: GET.
  2. URL: Specify the URL used to retrieve the file from Google Cloud Storage. This must be the REST API URL of the file stored in the bucket. The REST API URL format is described in the Google Cloud documentation.

  • In the “Authentication” tab, choose “OAuth 2.0” in the authentication drop-down list.
  1. Specify the Client identifier and Client secret generated when creating the OAuth Client ID credentials in the Prerequisites section.
  2. Launch the OAuth 2.0 Credentials Wizard by clicking the link.

  • In the “OAuth 2.0 Credentials Wizard”, enter the “Authorization server URL” and the “Token endpoint URL”. Both values are available in the downloaded JSON file described in the Prerequisites section.
  • Enter a redirect URI. The recommendation is to use the default (http://localhost:9090/oauth/2.0/redirectURL.jsp).
  • Enter the scopes the application needs to access the resource. The list of available scopes for the Cloud Storage API can be found in the Google Cloud Storage documentation (see References).
  • Click “Generate the authorization URL”. Virtual DataPort generates an encoded URL from the parameters provided and displays it next to the “Open URL” link.

  • Opening the URL in any browser displays the consent screen, where you authorize the application to access Google Cloud Storage with the scopes entered in the wizard.

  • Once you approve the consent form, a new response page is displayed in the browser.

  • Copy the generated URL from the response and paste it in the field “Paste the authorization response URL”. Then, click the “OAuth 2.0 Credentials” link in the “Obtain the OAuth 2.0 credentials” section.

 

  • Once the OAuth 2.0 credentials have been obtained, click “Ok” to store them.

  • Then, click on “Test Connection” and if the connection is successful, click on “Save”.

  • Once the data source is created, click “Create base view” to introspect the source metadata available through the data source.

  • Click “Save” to create the base view.
  • Now, the base view created on top of the JSON file stored in Google Cloud Storage is ready to be executed and combined with the rest of the sources.
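
The OAuth 2.0 Credentials Wizard used in the steps above handles the authorization code flow for you. To make the two URLs involved less opaque, the Python sketch below shows what an authorization URL and a pasted response URL typically contain under the standard OAuth 2.0 flow. It is not Denodo's implementation, and the exact parameters Virtual DataPort sends may differ. The function names, the client ID, and the response URL are hypothetical; the scope shown is the read-only Cloud Storage scope.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(auth_uri, client_id, redirect_uri, scopes):
    """Compose a standard OAuth 2.0 authorization code request URL."""
    params = {
        "response_type": "code",      # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # scopes are space-separated
    }
    return f"{auth_uri}?{urlencode(params)}"


def extract_authorization_code(response_url):
    """Read the authorization code from the URL of the response page."""
    query = parse_qs(urlsplit(response_url).query)
    if "error" in query:
        raise ValueError("authorization failed: " + query["error"][0])
    return query["code"][0]


# Hypothetical values; the redirect URI is the wizard default from this article.
REDIRECT_URI = "http://localhost:9090/oauth/2.0/redirectURL.jsp"
authorization_url = build_authorization_url(
    "https://accounts.google.com/o/oauth2/auth",
    "example-client-id.apps.googleusercontent.com",
    REDIRECT_URI,
    ["https://www.googleapis.com/auth/devstorage.read_only"],
)
print(authorization_url)

# Hypothetical response URL, as it would be pasted into the wizard.
code = extract_authorization_code(REDIRECT_URI + "?code=example-code")
print(code)
```

In the standard flow, the authorization code is then exchanged at the token endpoint URL for the credentials that Virtual DataPort stores with the data source.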

Connecting to Google Cloud Storage through Denodo Distributed File System Custom Wrapper

Denodo also provides a Distributed File System Custom Wrapper through which files in various formats stored in Google Cloud Storage can be accessed.

What is the Denodo Distributed File System Custom Wrapper?

The Distributed File System Custom Wrapper distribution contains five Virtual DataPort custom wrappers capable of reading several file formats stored in Azure Blob Storage, HDFS, Amazon S3, Azure Data Lake Storage, Azure Data Lake Storage Gen 2 and Google Cloud Storage.

Supported formats are:

  • Delimited text files
  • Sequence files
  • Map files
  • Avro files
  • Parquet files

The Denodo Distributed File System Custom Wrapper component is available for download to Denodo support users from the Denodo Connects section of the Denodo Support Site.

As a first step, download the Denodo Distributed File System Custom Wrapper distribution and unzip it. Import the custom wrapper into the Virtual DataPort Server by selecting the denodo-dfs-customwrapper-${version}-jar-with-dependencies.jar file from the denodo-dfs-customwrapper-${version}\dist folder.

  • Launch the Web Design Studio, navigate to “File > Extension management” and create a new item by selecting the jar file.

  • Next, create a new Custom data source by clicking “New > Data source > Custom”.

  • As an example, we will consider accessing a delimited file stored in Google Cloud Storage. For this, choose the class name “com.denodo.connect.dfs.wrapper.DFSDelimitedTextFileWrapper” while configuring the Custom data source.

  • Click the “Refresh Input Parameters” option to load the input parameters for the selected class name.
  • File System URL: Provide the Google Cloud Storage bucket path, for example:
  • gs://<bucket name>
  • Expand the “Custom core-site.xml” section and provide the path to the custom core-site.xml file. You can use the core-site.xml located in the denodo-dfs-customwrapper-${version}\conf folder of the distribution as a starting point.
  • Click “Ok” to save the data source connection.

Configuring authentication properties

  • To access Cloud Storage using this custom wrapper, the path to the service account key file must be specified in the custom core-site.xml file. To generate the key file, which is in JSON format, refer to the Creating and Managing Service Account section of the Google Cloud Storage documentation.
  • Once the service account key file is generated, copy it to a secure location and modify the properties below in the custom core-site.xml file. You can use the core-site.xml located in the conf folder of the distribution as a guide.

<!-- Google Cloud Storage credentials -->
<property>
  <name>google.cloud.auth.service.account.enable</name>
  <value>true</value>
</property>
<property>
  <name>google.cloud.auth.service.account.json.keyfile</name>
  <value>C:/Users/denodo-287406-462cbdb59b2d.json</value>
</property>
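
Before pointing the data source at the file, it can be useful to confirm that both properties are present. The Python sketch below is one way to do that and is not part of the custom wrapper: the function names and the sample key file path are hypothetical, and it assumes the usual Hadoop layout in which the property elements are wrapped in a configuration root element.

```python
import xml.etree.ElementTree as ET

ENABLE_KEY = "google.cloud.auth.service.account.enable"
KEYFILE_KEY = "google.cloud.auth.service.account.json.keyfile"


def read_properties(core_site_xml: str) -> dict:
    """Collect the name/value pairs of all property elements."""
    root = ET.fromstring(core_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_gcs_credentials(props: dict) -> list:
    """Return a list of problems; an empty list means both properties are set."""
    problems = []
    if props.get(ENABLE_KEY, "").lower() != "true":
        problems.append(f"{ENABLE_KEY} must be true")
    if not props.get(KEYFILE_KEY):
        problems.append(f"{KEYFILE_KEY} must point to the key file")
    return problems


# Hypothetical file content; assumes a <configuration> root element.
SAMPLE_CORE_SITE = """
<configuration>
  <!-- Google Cloud Storage credentials -->
  <property>
    <name>google.cloud.auth.service.account.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>google.cloud.auth.service.account.json.keyfile</name>
    <value>/path/to/service-account-key.json</value>
  </property>
</configuration>
"""

problems = check_gcs_credentials(read_properties(SAMPLE_CORE_SITE))
print(problems or "core-site.xml contains both credential properties")
```

The check only inspects the XML. It does not verify that the key file exists or that the service account has access to the bucket.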

  • Now, click “Create base view” and provide the necessary parameters as follows:

  • Path: /<folderName_ifany>/<fileName.csv>
  • File Name Pattern: <filename_*.csv>. This field is optional.
  • Separator: <provide the column separator>. The default value is “,”.

  • Click “Create Base View” to create the base view.

  • Now, the base view created on top of the delimited text file stored in Google Cloud Storage is ready to be executed and can be combined with the rest of the sources.

  • Similarly, you can access other types of files in Google Cloud Storage by using the corresponding class name in the data source configuration.

Note:

The user who is accessing Google Cloud Storage must be an authenticated user.

References

Google Cloud Storage Documentation: Creating Client Credentials ID 

Google Cloud Storage Documentation: Creating and Managing Service Account

Google Cloud Storage Documentation: Create a Bucket

Google Cloud Storage Documentation: Scopes for Cloud Storage APIs

Virtual DataPort Administration Guide: OAuth Authentication

Virtual DataPort Administration Guide: JSON Data Sources

Denodo Distributed File System Custom Wrapper: User Manual
