Integrating Google BigQuery with Denodo

Applies to: Denodo 8.0 , Denodo 7.0 , Denodo 6.0
Last modified on: 22 May 2020
Tags: JDBC data sources JSON data sources

Download document

You can translate the document:

Introduction

Google BigQuery is a cloud-based enterprise data warehouse that allows its users to store and query massive datasets. BigQuery provides a web UI and a command line tool, as well as different access methods such as a REST API and multiple client libraries (Java, .NET or Python).

In this article, we are going to configure the Denodo Platform to access BigQuery using a JDBC driver. We have created some sample data for this article from the public BigQuery catalog. The sample data contains only the table outpatient_charges_2014 from the medicare dataset.

Outpatient_charges_2014 table in Google BigQuery

We can execute queries from the web interface to get the data we need and compare the results to those retrieved by Denodo Virtual DataPort.

For instance, the execution of:

SELECT a.*

FROM [bigquery-156210:test.outpatient_charges_2014] as a

LIMIT 1000 

will look like:

Results of outpatient_charges_2014 in Google BigQuery

Connecting to BigQuery using the JDBC driver

First of all, we need to download the JDBC driver to connect to BigQuery. It can be downloaded from here: https://cloud.google.com/bigquery/providers/simba-drivers

Once the JDBC driver is downloaded, please follow these steps to configure a data source in Denodo:

  1. Unzip the contexts of the downloaded file in

 <DENODO_HOME>/lib/extensions/jdbc-drivers-external/bigquery

  1. Restart the VDP server to apply the changes.
  2. Now, we also need to generate the credentials to access our BigQuery project.

        

        Go to the Google Cloud Platform Dashboard and click on “Credentials”:

Creating the service account.

Once in “Credentials”, click on “Create Credentials”-> “Service Account”.

Creating the service account.

Fill the form providing the name of the service account to create your credentials.

Creating the service account.

Next, select the role or roles for the new account according to your needs.

Creating the service account.

The following step is to generate a key to access the data

Creating the service account.

For this particular case, we will generate a JSON key that will download a file with the key and the proper credentials.

Generation of the key file.

Copy that file to your desired location. Now we can go back to Denodo to finish the configuration.

  1. Open Denodo Design Studio

Connecting to Denodo Design Studio.

  1. Create a JDBC Data Source (File > New > Data Source > JDBC)

Creation of a data source.

  • Database Adapter : Google Big Query
  • Database URI: jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<PROJECT_ID>;OAuthType=0;OAuthServiceAcctEmail=<EMAIL>;OAuthPvtKeyPath=<PATH_TO_CREDENTIALS_FILE>;Timeout=45

                

The values for PROJECT_ID and EMAIL can be obtained from the credentials JSON file created above.

  1. Click on “Save” to create the data source.

  1. Create a new base view

Creation of a  base view.

  1. Denodo will automatically search for the available schemas and tables in BigQuery.

Schemas and tables available from the source.

  1. In this case, the base view could be our “outpatient_charges_2014”.

Creation of the base view.

Note: if you want to access Google BigQuery public Datasets, you will need to use the option “Create from Query” and provide the desired sentence.

  1. Click “Ok” and “Create selected” to save the view and the base view that we created looks like this:

The schema of the example base view.

Finally, the results of our execution will look like...

The results of the base view we created in our example.

Conclusion

Using a JDBC data source is the best option to integrate Google BigQuery data into the Denodo Platform.

It is also possible to use the Google REST API and create a JSON data source.

The main inconvenience of using the REST API is that an additional flatten view be needed to flatten the array returned by the API, and then the fields in the base view will have to be projected manually as the field output of a derived view. However, once this flatten view is created, it can be queried normally and it will access Google BigQuery directly without any third party software in the middle.

The JDBC driver is a third-party driver that may not support all the features included in the REST API. When working with this driver in the JDBC data source the tables in BigQuery cannot be introspected and the “Create from query” option must be used to develop base views.

References

Google BigQuery

BigQuery REST API

CData JDBC Driver for Google BigQuery

Questions

Ask a question
You must sign in to ask a question. If you do not have an account, you can register here

Featured content

DENODO TRAINING

Ready for more? Great! We offer a comprehensive set of training courses, taught by our technical instructors in small, private groups for getting a full, in-depth guided training in the usage of the Denodo Platform. Check out our training courses.

Training