AWS Glue Data Catalog¶
If you already manage your tables within AWS Glue Data Catalog, the Denodo Embedded MPP can connect to it and use it as an external Metastore. This allows you to leverage your existing tables and access your Hive, Iceberg, and Delta Lake tables directly.
To connect to AWS Glue Data Catalog, you need to define a new catalog within your Denodo Embedded MPP configuration.
The recommended method is to use the presto.catalog property in your values.yaml file.
This approach simplifies management and version upgrades.
Once configured, this new catalog will be accessible from the From MPP Catalogs tab in the Denodo Embedded MPP data source,
enabling you to graphically explore and create base views over your AWS Glue Data Catalog tables.
Figure: Create AWS Glue Views From MPP Catalogs
Here are examples of how to define glue-hive, glue-iceberg, and glue-delta catalogs that connect to AWS Glue Data Catalog,
placed within the catalog section of your values.yaml:

    # -- Additional catalogs. Uncomment and configure as needed
    catalog:
      # Example: Hive Catalog for AWS Glue
      #glue-hive: |-
      #  connector.name=hive-hadoop2
      #  hive.metastore=glue
      #  # AWS region, e.g., us-east-1
      #  hive.metastore.glue.region=your_aws_region
      #  # Your AWS Account ID
      #  hive.metastore.glue.catalogid=your_aws_account_id
      #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
      #  hive.metastore.glue.aws-secret-key=${ENV:AWS_SECRET}
      #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
      #  hive.parquet.use-column-names=true
      #  hive.parquet-batch-read-optimization-enabled=true
      #  hive.pushdown-filter-enabled=true
      #  hive.quick-stats.enabled=true
      #  hive.skip-empty-files=true
      # Example: Iceberg Catalog for AWS Glue
      #glue-iceberg: |-
      #  connector.name=iceberg
      #  iceberg.catalog.type=HIVE
      #  hive.metastore=glue
      #  # AWS region, e.g., us-east-1
      #  hive.metastore.glue.region=your_aws_region
      #  # Your AWS Account ID
      #  hive.metastore.glue.catalogid=your_aws_account_id
      #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
      #  hive.metastore.glue.aws-secret-key=${ENV:AWS_SECRET}
      #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
      #  hive.parquet-batch-read-optimization-enabled=true
      #  hive.pushdown-filter-enabled=true
      # Example: Delta Lake Catalog for AWS Glue
      #glue-delta: |-
      #  connector.name=delta
      #  hive.metastore=glue
      #  # AWS region, e.g., us-east-1
      #  hive.metastore.glue.region=your_aws_region
      #  # Your AWS Account ID
      #  hive.metastore.glue.catalogid=your_aws_account_id
      #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
      #  hive.metastore.glue.aws-secret-key=${ENV:AWS_SECRET}
      #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
      #  hive.parquet-batch-read-optimization-enabled=true
      #  hive.pushdown-filter-enabled=true

As an alternative to using values.yaml, you can define a new catalog by creating a separate properties file directly in the presto/conf/catalog/ folder of the Embedded MPP Helm chart (e.g., presto/conf/catalog/glue_iceberg.properties). The file name without the .properties extension (glue_iceberg in this example) becomes the catalog name in the Embedded MPP.
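The properties-file alternative can be sketched as a small shell script. This is a minimal sketch, not the chart's official procedure: it assumes it is run from the root of the Embedded MPP Helm chart, the region and account ID are placeholder values, and the property set simply mirrors the glue-iceberg example above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Catalog folder inside the Helm chart, as named in the text above.
# Override CATALOG_DIR if you run this from another location (assumption).
CATALOG_DIR="${CATALOG_DIR:-presto/conf/catalog}"
CATALOG_FILE="$CATALOG_DIR/glue_iceberg.properties"

mkdir -p "$CATALOG_DIR"

# The quoted 'EOF' stops the shell from expanding ${ENV:AWS_SECRET};
# the placeholder has to reach the file literally.
cat > "$CATALOG_FILE" <<'EOF'
connector.name=iceberg
iceberg.catalog.type=HIVE
hive.metastore=glue
# Placeholder region and account ID: replace with your own values
hive.metastore.glue.region=us-east-1
hive.metastore.glue.catalogid=123456789012
hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
hive.metastore.glue.aws-secret-key=${ENV:AWS_SECRET}
hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
EOF

# The catalog name is the file name without its extension.
echo "catalog name: $(basename "$CATALOG_FILE" .properties)"
```

Comments sit on their own lines because a # that follows a value on the same line of a properties file is read as part of the value.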
Supported Operations by Format¶
The following table summarizes the AWS Glue Data Catalog operations supported by the Denodo Embedded MPP for various table formats:
| Operation | Hive | Iceberg | Delta |
|---|---|---|---|
| Read | Yes | Yes | Yes |
| Create/Insert | Yes (*) | Yes (**) | No |
| Update | No | Yes (***) | No |
| Delete | No | Yes (****) | No |
(*) To support write operations into Hive tables, ensure that your Hive catalog configuration includes the following property:
hive.non-managed-table-writes-enabled=true
(**) To support write operations in Iceberg catalogs other than the predefined iceberg catalog, you must also configure Denodo to recognize these additional catalog names by executing:
SET 'com.denodo.util.jdbc.inspector.impl.PrestoJDBCInspector.iceberg.catalogNames' = 'iceberg, other_iceberg, another_iceberg';
(***) Update operations are supported from Denodo 9.3.0. Iceberg table updates require format version 2 or later, and the update mode must be merge-on-read.
(****) Delete operations are supported from Denodo 9.2.0. For Iceberg V1 tables, Denodo can only delete data in one or more entire partitions: every column used in the WHERE clause must be an identity-transformed partition column of the target table.
Authentication and AWS Credentials¶
The recommended way to connect to AWS Glue Data Catalog from Denodo Embedded MPP is without explicitly providing AWS access and secret keys. This is achieved by leveraging AWS IAM authentication methods:
- EKS Pod Identities
- IAM Roles for Service Accounts (IRSA)
- IAM EC2 Instance Profile
In scenarios where the above IAM role-based authentication methods are not applicable (e.g., outside EKS or in specific custom setups), you can explicitly provide AWS credentials. In this case, you will need to provide both Glue and S3 credentials because the Embedded MPP needs access to the S3 files where the actual data resides, in addition to the Glue metadata.
You can provide these credentials in either of the following ways:

- Access and secret key:
  - hive.metastore.glue.aws-access-key and hive.metastore.glue.aws-secret-key, for Glue access.
  - hive.s3.aws-access-key and hive.s3.aws-secret-key, for S3 file access.

  You can configure these credentials by adding new environment variables to the definition of the Kubernetes Secret mpp-credentials. This secret is used to manage credentials for the Denodo Embedded MPP deployment. See the MPP Deployment page for more about the mpp-credentials configuration. The following command demonstrates how to define a new environment variable named AWS_SECRET within the secret:

      kubectl create secret generic mpp-credentials --from-literal=AWS_SECRET=<credential>

  Once you have defined the environment variable in the secret, you can reference it within your values.yaml file or catalog properties file using the following syntax:

      glue-iceberg: |-
        .
        .
        hive.metastore.glue.aws-secret-key=${ENV:AWS_SECRET}
        .
        .

- IAM Role (Assumed Role):
  - hive.metastore.glue.iam-role, for Glue access.
  - hive.s3.iam-role, for S3 file access.

These properties need to be added to your catalog definition, in values.yaml or a .properties file.
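As an illustration of the assumed-role option, the following sketch writes a catalog properties file that sets both role properties. The role ARN, the region, and the output file name are hypothetical placeholders, and using the same role for Glue and S3 is only an assumption made to keep the example short.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical role ARN; replace it with a role that can access Glue and S3.
ROLE_ARN="${ROLE_ARN:-arn:aws:iam::123456789012:role/example-mpp-glue-role}"
# Hypothetical output file; its base name would become the catalog name.
OUT_FILE="${OUT_FILE:-glue_hive_role.properties}"

# Unquoted EOF so that the shell substitutes $ROLE_ARN into the file.
cat > "$OUT_FILE" <<EOF
connector.name=hive-hadoop2
hive.metastore=glue
hive.metastore.glue.region=us-east-1
hive.metastore.glue.iam-role=$ROLE_ARN
hive.s3.iam-role=$ROLE_ARN
EOF

cat "$OUT_FILE"
```

The same two iam-role lines could instead be placed inside a catalog entry in values.yaml.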
AWS Privileges for AWS Glue Data Catalog¶
The IAM role or AWS credentials used by the Denodo Embedded MPP must have the appropriate AWS privileges to access the AWS Glue Data Catalog.
Reading from AWS Glue. The IAM role/user needs the following minimum permissions:
- glue:GetDatabases
- glue:GetDatabase
- glue:GetTables
- glue:GetTable
- glue:GetPartitions
- glue:GetPartition
- glue:BatchGetPartition

Writing to AWS Glue. In addition to the read permissions, the IAM role/user will also require these permissions:
- glue:CreateTable
- glue:DeleteTable
- glue:UpdateTable
- glue:BatchCreatePartition
- glue:UpdatePartition
- glue:DeletePartition
Note: Ensure that the associated IAM role/user also has the necessary S3 permissions (s3:GetObject, s3:PutObject, etc.)
for the specific S3 buckets where your data files are stored.
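The permissions listed in this section could be collected into one IAM policy document, sketched below. This is an illustrative starting point rather than a vetted policy: the file name, the statement IDs, and the bucket name your-data-bucket are hypothetical, the Glue statements use "Resource": "*" only for brevity and should be narrowed to your catalog, databases, and tables, and the S3 statement contains just the two actions named above, so your setup may need more.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output file name.
POLICY_FILE="${POLICY_FILE:-glue-mpp-policy.json}"

# Quoted 'EOF': the JSON is written exactly as shown, with no shell expansion.
cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GlueCatalogRead",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabases",
        "glue:GetDatabase",
        "glue:GetTables",
        "glue:GetTable",
        "glue:GetPartitions",
        "glue:GetPartition",
        "glue:BatchGetPartition"
      ],
      "Resource": "*"
    },
    {
      "Sid": "GlueCatalogWrite",
      "Effect": "Allow",
      "Action": [
        "glue:CreateTable",
        "glue:DeleteTable",
        "glue:UpdateTable",
        "glue:BatchCreatePartition",
        "glue:UpdatePartition",
        "glue:DeletePartition"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DataFilesAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::your-data-bucket/*"
    }
  ]
}
EOF

echo "wrote $POLICY_FILE"
```

For a read-only deployment, the GlueCatalogWrite statement and the s3:PutObject action can be dropped. The resulting file can then be registered with the AWS CLI, for example with aws iam create-policy and its --policy-document file://glue-mpp-policy.json option, under a policy name of your choice.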

