How to Connect to a Kerberized HDFS From the Denodo Lakehouse Accelerator Data Source
The connection to a Kerberized HDFS is configured slightly differently for the Denodo Lakehouse Accelerator (formerly known as Denodo Embedded MPP) data source than for regular data sources. The following configuration applies to both the Object Storage and the Bulk Load configuration. To configure the connection, start by setting the target file system and the target URIs, as shown in the picture:
Then, configure the authentication. Three authentication mechanisms are supported:
Simple: connects to the HDFS with the specified user.
Kerberos user & password: connects to the HDFS with the specified client principal and password.
Kerberos with keytab: connects to the HDFS with the specified client principal and keytab.
Finally, a krb5.conf configuration file must exist on the host machine to be able to connect to Kerberos (see Providing a Krb5 File for Kerberos Authentication).
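For orientation, a minimal krb5.conf might look like the sketch below. It uses the standard Kerberos configuration file layout. The realm EXAMPLE.COM and the host kdc.example.com are placeholders, not values from this documentation, and must be replaced with the ones used by the Kerberos environment that secures the HDFS.

```
# Minimal krb5.conf sketch. EXAMPLE.COM and kdc.example.com are
# placeholder values (assumptions); use your own realm and KDC host.
[libdefaults]
    # Realm assumed when a principal is given without one
    default_realm = EXAMPLE.COM

[realms]
    EXAMPLE.COM = {
        # Host running the Key Distribution Center for this realm
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }

[domain_realm]
    # Map hosts in the example.com DNS domain to the realm
    .example.com = EXAMPLE.COM
    example.com = EXAMPLE.COM
```

The realm in the client principal used for the Kerberos authentication mechanisms should normally be one that is defined in the [realms] section of this file.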
The Denodo Lakehouse Accelerator has some limitations when dealing with Kerberos-secured catalogs:
It is not possible to configure two catalogs that access different HDFS instances when one is secured with Kerberos and the other is not. In this scenario, the Denodo Lakehouse Accelerator is not able to access the HDFS that does not use Kerberos.
It is not possible to create a base view from data in Delta Lake format inside an HDFS.
Kerberos Constrained Delegation is not supported.
