Azure Data Lake Gen 2
Before deploying the Denodo Embedded MPP on Azure Kubernetes Service, check the Denodo Embedded MPP Azure Checklist to make sure you have everything you need.
There are three options to deploy a Denodo Embedded MPP that will access Data Lake Storage Gen2 datasets:
The recommended option: provide no credentials to the cluster.sh deploy command.

    cluster.sh deploy --credstore-password xxx

Use this option when the Denodo Embedded MPP will run in Azure Kubernetes Service and will access Data Lake Storage Gen2 using Azure Managed Identities.
For this you need to add the following properties to presto/conf/catalog/core-site.xml and hive-metastore/conf/core-site.xml, before the Embedded MPP is deployed:

core-site.xml using Azure Managed Identities

    <property>
      <name>fs.azure.account.auth.type</name>
      <value>OAuth</value>
    </property>
    <property>
      <name>fs.azure.account.oauth.provider.type</name>
      <value>org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.msi.tenant</name>
      <value>MSI Tenant ID</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.msi.endpoint</name>
      <value>http://169.254.169.254/metadata/identity/oauth2/token</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.client.id</name>
      <value>Client ID</value>
    </property>
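Because the properties must be present in both core-site.xml files before deployment, a quick pre-deployment check can help catch a file that was missed. The following is a minimal sketch, not part of the Denodo tooling: check_core_site is a hypothetical helper, and it assumes each property name is written as <name>PROPERTY</name> with no surrounding whitespace.

```shell
# Hypothetical helper: succeeds only if every listed property name appears
# in the given core-site.xml as <name>PROPERTY</name>.
check_core_site() {
    file="$1"
    shift
    for prop in "$@"; do
        if ! grep -qF "<name>${prop}</name>" "$file"; then
            echo "missing ${prop} in ${file}" >&2
            return 1
        fi
    done
    return 0
}

# Property names taken from the Managed Identities configuration above.
MSI_PROPS="fs.azure.account.auth.type
fs.azure.account.oauth.provider.type
fs.azure.account.oauth2.msi.tenant
fs.azure.account.oauth2.msi.endpoint
fs.azure.account.oauth2.client.id"

# Check both files, skipping any that is not found relative to the current directory.
for f in presto/conf/catalog/core-site.xml hive-metastore/conf/core-site.xml; do
    if [ -f "$f" ]; then
        # $MSI_PROPS is intentionally unquoted so each name becomes one argument.
        check_core_site "$f" $MSI_PROPS && echo "ok: $f"
    fi
done
```

The check only confirms that the property names exist; it does not validate the tenant ID, the client ID, or whether the managed identity can actually reach the storage account.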
Provide no credentials to the cluster.sh deploy command.

    cluster.sh deploy --credstore-password xxx

But, you have to provide the OAuth2 client credentials in the core-site.xml files: presto/conf/catalog/core-site.xml and hive-metastore/conf/core-site.xml, before the Embedded MPP is deployed:

core-site.xml using OAuth2 client credentials

    <property>
      <name>fs.azure.account.auth.type</name>
      <value>OAuth</value>
    </property>
    <property>
      <name>fs.azure.account.oauth.provider.type</name>
      <value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.client.endpoint</name>
      <value>https://login.microsoftonline.com/<directory_id>/oauth2/token</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.client.id</name>
      <value>Client ID</value>
    </property>
    <property>
      <name>fs.azure.account.oauth2.client.secret</name>
      <value>Secret</value>
    </property>
Provide the Azure credentials for the Shared Key authentication method to the cluster.sh deploy command:

    --abfs-storage-account: the name of the Storage Account
    --abfs-storage-key: the access key that protects access to your Storage Account. If this access key is not specified on the command line, cluster.sh deploy will prompt for it, keeping access keys out of the bash history.
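As an illustration of the Shared Key option, the sketch below prints a deploy command that passes only the Storage Account name, so that cluster.sh deploy prompts for the access key. The helper shared_key_deploy_command and the account name mystorageaccount are hypothetical, and combining --credstore-password with the --abfs-storage-account option is an assumption based on the commands shown for the other two options.

```shell
# Hypothetical helper: prints the deploy command for a given Storage Account
# so it can be reviewed before it is run.
# --abfs-storage-key is deliberately left out, so cluster.sh deploy prompts
# for the access key and the key never reaches the shell history.
# Assumption: --credstore-password is passed as in the other two options.
shared_key_deploy_command() {
    printf 'cluster.sh deploy --abfs-storage-account %s --credstore-password xxx\n' "$1"
}

# "mystorageaccount" is a placeholder, not a real account name.
shared_key_deploy_command "mystorageaccount"
```

Printing the command first, rather than running it directly, keeps the sketch safe to try on a machine where the Embedded MPP distribution is not installed.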