This document describes how to access Cloudera from the Denodo Platform.
Cloudera Impala is a SQL engine included in the Cloudera Hadoop distribution. It runs fast, interactive SQL queries directly on Hadoop data stored in HDFS or HBase. Impala provides a JDBC driver, which Denodo can use to connect.
Connecting to Cloudera from Denodo
- From the Virtual DataPort Administration tool, create a new JDBC data source by selecting “File > New > Data source > JDBC”. This will open a connection wizard to create a data source with a JDBC driver.
- To create a connection, fill in all the required fields:
- Name: ds_impalacloudera
- Database adapter: Impala 2.3
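- Database URI: Use a connection string of the form: jdbc:impala://<server>:<port>/<schema>
For example: jdbc:impala://localhost:21050/database (21050 is the default port on which Impala accepts JDBC connections; port 21000 is used by impala-shell)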
- Username: Enter the username to connect to Impala.
- Password: Enter the appropriate password.
- Once the details are filled in, click “Test Connection”. If the connection is successful, click “Save”.
- Once the data source is created, create base views over it by clicking the "Create Base View" option.
- The tool will then display a tree with the schemas of the database. Click on any schema to inspect its tables and their fields. To search for a view or a schema, type its name in the “search” box located at the top. The list will then show only the elements whose names contain the text you entered.
- To incorporate tables into the Denodo virtual schema, select the check box next to the tables or views you want to import and then click “Create selected”.
- When the importing process is finished, the new views are displayed.
- The base views are now ready to be executed and combined with views from other sources.
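The "Database URI" field is the step most prone to typing mistakes, so it can help to build and check the string before pasting it into the wizard. The following is a minimal sketch in Python, not part of the Denodo Platform: the helper names `build_impala_uri` and `parse_impala_uri` are hypothetical, and the pattern only covers the plain jdbc:impala://<server>:<port>/<schema> form described above, without any additional driver properties.

```python
import re

# Matches only the plain form jdbc:impala://<server>:<port>/<schema>.
# Extra driver properties appended to the URI are deliberately not handled.
_URI_PATTERN = re.compile(
    r"^jdbc:impala://(?P<server>[^:/\s]+):(?P<port>\d+)/(?P<schema>[^/;\s]+)$"
)


def build_impala_uri(server, port, schema):
    """Return a URI of the form jdbc:impala://<server>:<port>/<schema>."""
    if not server or not schema:
        raise ValueError("server and schema must be non-empty")
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"jdbc:impala://{server}:{port}/{schema}"


def parse_impala_uri(uri):
    """Split a plain Impala JDBC URI into server, port, and schema."""
    match = _URI_PATTERN.match(uri.strip())
    if match is None:
        raise ValueError(f"not a plain Impala JDBC URI: {uri!r}")
    return {
        "server": match.group("server"),
        "port": int(match.group("port")),
        "schema": match.group("schema"),
    }


if __name__ == "__main__":
    # 21050 is the default Impala port for JDBC clients; adjust to your cluster.
    uri = build_impala_uri("localhost", 21050, "database")
    print(uri)
    print(parse_impala_uri(uri))
```

The resulting string can be pasted into the "Database URI" field. Parsing it back is a quick way to confirm that the server, port, and schema are the ones intended before clicking “Test Connection”.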
Virtual DataPort Administration Guide: JDBC Sources
Virtual DataPort Administration Guide: Impala