External Hive Metastore
If you already have a Hive Metastore containing table definitions that you want to access from the Denodo Embedded MPP, you can use that Hive Metastore as an external Metastore.
To do this, you can manually define a new catalog by creating a properties file in presto/conf/catalog/, e.g., presto/conf/catalog/external_hivems.properties. The file name, external_hivems, becomes the catalog name.
To create the new catalog properties file, you can copy one of the catalogs shipped by default, depending on the files or tables you want to read:
Copy hive.properties for accessing Hive tables over Parquet files.
Copy delta.properties for accessing Delta Lake tables.
Copy iceberg.properties for accessing Iceberg tables.
Then, fill in the hive.metastore.uri property with the URI of the external Hive Metastore.
Typically, you will also have to add the hdfs-site.xml and core-site.xml files to presto/conf/catalog and reference them in the hive.config.resources property.
connector.name=hive-hadoop2
hive.metastore.uri=thrift://acme:9083
hive.config.resources=/opt/presto-server/etc/catalog/core-site-external.xml,/opt/presto-server/etc/catalog/hdfs-site-external.xml
# Bulk Data Load
hive.allow-drop-table=true
hive.non-managed-table-writes-enabled=true
# Avoids exceptions in partitioned tables
hive.parquet.use-column-names=true
Kerberos
If the external Hive Metastore uses Kerberos authentication, you will need to configure additional properties in the catalog properties file:
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/_HOST@REALM
hive.metastore.client.principal=primary@REALM
hive.metastore.client.keytab=/opt/secrets/xxx.keytab
This way the Embedded MPP connects to the Hive Metastore as the Kerberos principal set in hive.metastore.client.principal, using the keytab set in hive.metastore.client.keytab, and verifies that the identity of the Hive Metastore matches hive.metastore.service.principal.
You need to place the keytab file in the presto/secrets folder.
You also need to place the krb5.conf file in the presto/conf/catalog/ folder and add the following configuration property to values.yaml:
presto:
  jvm:
    additionalJVMConfig: [
      -Djava.security.krb5.conf=/opt/presto-server/etc/catalog/krb5.conf
    ]
If the Embedded MPP must also authenticate to HDFS using Kerberos, you will need to configure these additional properties in the catalog properties file:
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.presto.principal=primary@REALM
hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab
This way the Embedded MPP connects to HDFS as the Kerberos principal set in hive.hdfs.presto.principal, using the keytab set in hive.hdfs.presto.keytab.
You need to place the keytab file in the presto/secrets folder.
If you use the same principal for both hive.metastore.client.principal and hive.hdfs.presto.principal, make sure the principal also has access to HDFS, not just to the Hive Metastore. Otherwise you will get a Permission denied error.
If the Hadoop cluster requires hadoop.rpc.protection=privacy, one more property must be added to the catalog properties file:
hive.hdfs.wire-encryption.enabled=true
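Putting the pieces of this section together, a catalog file for an external Hive Metastore with Kerberos enabled for both the Metastore and HDFS might look like the sketch below. It only combines properties already shown in this section. The host acme, the REALM and primary placeholders, the xxx.keytab file name and the external_hivems file name are illustrative values carried over from the earlier examples and must be replaced with your own. The /opt/secrets and /opt/presto-server/etc/catalog paths assume, as the earlier examples suggest, that the contents of presto/secrets and presto/conf/catalog are available at those locations inside the container.

```properties
# presto/conf/catalog/external_hivems.properties (illustrative file name)
connector.name=hive-hadoop2
hive.metastore.uri=thrift://acme:9083
hive.config.resources=/opt/presto-server/etc/catalog/core-site-external.xml,/opt/presto-server/etc/catalog/hdfs-site-external.xml

# Kerberos authentication to the Hive Metastore
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/_HOST@REALM
hive.metastore.client.principal=primary@REALM
hive.metastore.client.keytab=/opt/secrets/xxx.keytab

# Kerberos authentication to HDFS
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.presto.principal=primary@REALM
hive.hdfs.presto.keytab=/opt/secrets/xxx.keytab

# Only when the Hadoop cluster requires hadoop.rpc.protection=privacy
hive.hdfs.wire-encryption.enabled=true
```

Omit the HDFS block if only the Metastore is kerberized, and omit the last property if the cluster does not require wire encryption.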
You can find more information in Hive Security Configuration — Kerberos Support.