Kerberos

The Embedded MPP supports Kerberos authentication for the following components:

  • Kerberos for HDFS: Allows the MPP Hive connector to securely access data residing in a Hadoop cluster that utilizes Kerberos authentication.

  • Kerberos in Embedded Hive Metastore: Allows the Embedded Hive Metastore to securely access data residing in a Hadoop cluster that utilizes Kerberos authentication.

  • Kerberos in External Hive Metastore: Enables secure communication when connecting to an externally managed Hive Metastore instance.

Kerberos for HDFS

If the Parquet files are located in an HDFS cluster secured with Kerberos, you need to configure Kerberos-related properties in the Embedded MPP hive catalog in values.yaml.

  1. Configure Kerberos properties in values.yaml. Add the following properties to the presto.hive.additionalConfig section:

    presto:
      hive:
        additionalConfig: [
          hive.hdfs.authentication.type=KERBEROS,
          hive.hdfs.presto.principal=xxxx@REALM, # Replace xxxx@REALM with your principal
          hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab # Replace xxx.keytab with the name of your keytab file
        ]
    

    This configuration ensures that the Embedded MPP connects to HDFS using the specified Kerberos principal, hive.hdfs.presto.principal, and its corresponding keytab file, hive.hdfs.presto.keytab.

  2. Kerberos Files:

    • Place the keytab file in the presto/secrets folder within your Denodo Embedded MPP Helm chart.

    • Place the krb5.conf file in the presto/conf/catalog/ folder.

    • Add the following property to the values.yaml to inform Java about the krb5.conf location:

    presto:
      jvm:
        additionalJVMConfig: [
          -Djava.security.krb5.conf=/opt/presto-server/etc/catalog/krb5.conf
        ]
    
  3. Enable HDFS Wire Encryption (if required):

    If the HDFS cluster has HDFS wire encryption enabled, you must add one more property to the presto.hive.additionalConfig section in values.yaml:

       hive.hdfs.wire-encryption.enabled=true
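
Taken together, steps 1 to 3 might produce a values.yaml fragment like the one below. This is only a sketch that merges the snippets shown in those steps: the principal and keytab names are the same placeholders, and the wire-encryption line applies only if the HDFS cluster has wire encryption enabled.

```yaml
presto:
  hive:
    additionalConfig: [
      hive.hdfs.authentication.type=KERBEROS,
      hive.hdfs.presto.principal=xxxx@REALM, # placeholder: replace with your principal
      hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab, # placeholder: replace with your keytab file name
      hive.hdfs.wire-encryption.enabled=true # include only if HDFS wire encryption is enabled
    ]
  jvm:
    additionalJVMConfig: [
      -Djava.security.krb5.conf=/opt/presto-server/etc/catalog/krb5.conf
    ]
```

If either list in your values.yaml already contains other entries, append these lines to the existing list instead of replacing it.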
    

For more detailed information on Hive Security Configuration and Kerberos support, you can refer to Hive Security Configuration — Kerberos Support.

Kerberos in Embedded Hive Metastore

If the dataset is located in an HDFS cluster secured with Kerberos, you need to configure Kerberos-related properties in the Embedded Hive Metastore:

  • Place the keytab file in the hive-metastore/secrets folder.

  • Add the Hadoop properties related to Kerberos in hive-metastore/conf/core-site.xml. The following is only an example; additional properties may be necessary.

    Kerberos configuration in core-site.xml
     <property>
       <name>hadoop.security.authorization</name>
       <value>true</value>
     </property>
    
     <property>
       <name>hadoop.security.authentication</name>
       <value>kerberos</value>
     </property>
    
     <property>
       <name>hadoop.http.authentication.kerberos.keytab</name>
       <value>/opt/secrets/xxxx.keytab</value>
     </property>
    
     <property>
       <name>dfs.datanode.kerberos.principal</name>
       <value>hdfs/xxxxx@YYYYYYY</value>
     </property>
    

    This way, the Embedded Hive Metastore connects to HDFS as the Kerberos principal set in dfs.datanode.kerberos.principal, using the keytab set in hadoop.http.authentication.kerberos.keytab.

  • Place the krb5.conf file in the hive-metastore/conf folder.

  • Add the following volumeMount to the additionalVolumeMounts property for metastore in values.yaml.

    Kerberos volumeMount in the additionalVolumeMounts property for metastore in the values.yaml
    additionalVolumeMounts:
      - name: hive-metastore-vol
        mountPath: /etc/krb5.conf
        subPath: krb5.conf
    
  • Replace the command in templates/hive-metastore-template.yaml with:

    Kerberos command in hive-metastore-template.yaml
      command: ['sh', '-c', "kinit -k -t /opt/secrets/xxxx.keytab xxxx@YYYY; /opt/run-hive-metastore.sh"]
    

Important

The Kerberos ticket of the Embedded Hive Metastore needs to be renewed periodically. To automate this, set up a cron job that runs the kinit -k command at regular intervals.
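
As one possible alternative to a cron job, the renewal can be sketched as a background loop added to the container command in templates/hive-metastore-template.yaml. This is an illustration under stated assumptions, not the documented approach: the 8-hour interval (28800 seconds) is hypothetical and should be shorter than the ticket lifetime configured in your KDC, and the keytab and principal are the same placeholders used in the original command.

```yaml
# Sketch only: obtains a ticket at startup, then re-runs kinit in the background
# every 8 hours (hypothetical interval) while the metastore runs in the foreground.
command: ['sh', '-c', "kinit -k -t /opt/secrets/xxxx.keytab xxxx@YYYY; (while true; do sleep 28800; kinit -k -t /opt/secrets/xxxx.keytab xxxx@YYYY; done) & /opt/run-hive-metastore.sh"]
```

Running the loop in the same container keeps the renewed ticket in the credential cache that the metastore process reads, which a job running outside the pod would not do.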

Kerberos in External Hive Metastore

If the external Hive Metastore or the underlying HDFS uses Kerberos authentication, you have to configure additional properties to enable secure connections from the Embedded MPP.

  1. Configure Kerberos properties for the external Hive Metastore in values.yaml. Add the following properties to the presto.catalog section:

    catalog:
      external_hivems: |-
        connector.name=hive-hadoop2
        hive.metastore.uri=thrift://<external Hive Metastore host>:<external Hive Metastore port>
        ...
        hive.metastore.authentication.type=KERBEROS
        hive.metastore.service.principal=hive/_HOST@REALM
        hive.metastore.client.principal=primary@REALM
        hive.metastore.client.keytab=/opt/secrets/xxx.keytab
    

    This configuration ensures that the Embedded MPP authenticates to the external Hive Metastore using the Kerberos principal specified by hive.metastore.client.principal and its corresponding keytab file. It also verifies the identity of the Hive Metastore service against hive.metastore.service.principal.

  2. Configure Kerberos properties for HDFS access in values.yaml (if required).

    If, in addition to the Hive Metastore, the Embedded MPP must also use Kerberos to authenticate to HDFS, where your data files reside, configure these properties in the presto.catalog section:

    catalog:
      external_hivems: |-
        connector.name=hive-hadoop2
        hive.metastore.uri=thrift://<external Hive Metastore host>:<external Hive Metastore port>
        ...
        hive.hdfs.authentication.type=KERBEROS
        hive.hdfs.presto.principal=primary@REALM
        hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab
    

    This ensures the Embedded MPP connects to HDFS as the Kerberos principal specified by hive.hdfs.presto.principal, using the keytab specified by hive.hdfs.presto.keytab.

    Note: If you use the same principal for both hive.metastore.client.principal and hive.hdfs.presto.principal, ensure that this principal has the necessary permissions to access both the external Hive Metastore and the HDFS filesystem. Otherwise, you may get Permission denied errors.

  3. Kerberos Files:

    • Place the keytab file in the presto/secrets folder within your Denodo Embedded MPP Helm chart.

    • Place the krb5.conf file in the presto/conf/catalog/ folder.

    • Add the following property to the values.yaml to inform Java about the krb5.conf location:

    presto:
      jvm:
        additionalJVMConfig: [
          -Djava.security.krb5.conf=/opt/presto-server/etc/catalog/krb5.conf
        ]
    
  4. Enable HDFS Wire Encryption (if required):

    If the HDFS cluster has HDFS wire encryption enabled, you must add one more property to the presto.catalog section in values.yaml:

    catalog:
      external_hivems: |-
        connector.name=hive-hadoop2
        hive.metastore.uri=thrift://<external Hive Metastore host>:<external Hive Metastore port>
        ...
        hive.hdfs.wire-encryption.enabled=true
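
Combining steps 1 to 4, a catalog that authenticates to both the external Hive Metastore and HDFS might look like the sketch below. It only merges the snippets from those steps: the principals, keytab path, and host and port values are the same placeholders, the `...` stands for whatever other catalog properties you already have, and the `#` comment lines are added here for orientation only.

```yaml
presto:
  catalog:
    external_hivems: |-
      connector.name=hive-hadoop2
      hive.metastore.uri=thrift://<external Hive Metastore host>:<external Hive Metastore port>
      ...
      # Step 1: Kerberos authentication to the external Hive Metastore
      hive.metastore.authentication.type=KERBEROS
      hive.metastore.service.principal=hive/_HOST@REALM
      hive.metastore.client.principal=primary@REALM
      hive.metastore.client.keytab=/opt/secrets/xxx.keytab
      # Step 2: Kerberos authentication to HDFS (only if required)
      hive.hdfs.authentication.type=KERBEROS
      hive.hdfs.presto.principal=primary@REALM
      hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab
      # Step 4: only if HDFS wire encryption is enabled
      hive.hdfs.wire-encryption.enabled=true
  jvm:
    additionalJVMConfig: [
      -Djava.security.krb5.conf=/opt/presto-server/etc/catalog/krb5.conf
    ]
```

This sketch reuses one principal and keytab for both the metastore client and HDFS, so that principal needs permissions on both services; use separate principals and keytabs if your environment requires it.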
    

For more detailed information on Hive Security Configuration and Kerberos support, you can refer to Hive Security Configuration — Kerberos Support.
