AWS Glue Data Catalog
If you already manage your tables within AWS Glue Data Catalog, the Denodo Embedded MPP can connect to it and use it as an external Metastore. This allows you to leverage your existing tables and access your Hive, Iceberg, and Delta Lake tables directly.
To connect to AWS Glue Data Catalog, you need to define a new catalog within your Denodo Embedded MPP configuration.
The recommended method is to use the presto.catalog property in your values.yaml file.
This approach simplifies management and version upgrades.
Once configured, this new catalog will be accessible from the From MPP Catalogs tab in the Denodo Embedded MPP data source,
enabling you to graphically explore and create base views over your AWS Glue Data Catalog tables.
Figure: Create AWS Glue Views From MPP Catalogs
Here are examples of how to define glue-hive, glue-iceberg, and glue-delta catalogs that connect to AWS Glue Data Catalog,
placed within the catalog section of your values.yaml:
# -- Additional catalogs. Uncomment and configure as needed
catalog:
  # Example: Hive Catalog for AWS Glue
  #glue-hive: |-
  #  connector.name=hive-hadoop2
  #  hive.metastore=glue
  #  hive.metastore.glue.region=your_aws_region # e.g., us-east-1
  #  hive.metastore.glue.catalogid=your_aws_account_id # Your AWS Account ID
  #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
  #  hive.metastore.glue.aws-secret-key=YOUR_SECRET_KEY
  #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
  #  hive.parquet.use-column-names=true
  #  hive.parquet-batch-read-optimization-enabled=true
  #  hive.pushdown-filter-enabled=true
  #  hive.quick-stats.enabled=true
  #  hive.skip-empty-files=true

  # Example: Iceberg Catalog for AWS Glue
  #glue-iceberg: |-
  #  connector.name=iceberg
  #  iceberg.catalog.type=HIVE
  #  hive.metastore=glue
  #  hive.metastore.glue.region=your_aws_region # e.g., us-east-1
  #  hive.metastore.glue.catalogid=your_aws_account_id # Your AWS Account ID
  #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
  #  hive.metastore.glue.aws-secret-key=YOUR_SECRET_KEY
  #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
  #  hive.parquet-batch-read-optimization-enabled=true
  #  hive.pushdown-filter-enabled=true

  # Example: Delta Lake Catalog for AWS Glue
  #glue-delta: |-
  #  connector.name=delta
  #  hive.metastore=glue
  #  hive.metastore.glue.region=your_aws_region # e.g., us-east-1
  #  hive.metastore.glue.catalogid=your_aws_account_id # Your AWS Account ID
  #  hive.metastore.glue.aws-access-key=YOUR_ACCESS_KEY
  #  hive.metastore.glue.aws-secret-key=YOUR_SECRET_KEY
  #  hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
  #  hive.parquet-batch-read-optimization-enabled=true
  #  hive.pushdown-filter-enabled=true
As an alternative to using values.yaml, you can define a new catalog by creating a separate properties file directly in the presto/conf/catalog/
folder of the Embedded MPP Helm chart (e.g., presto/conf/catalog/glue_iceberg.properties). The file name without the .properties extension (glue_iceberg in this example) becomes the catalog name in the Embedded MPP.
Supported Operations by Format
The following table summarizes the AWS Glue Data Catalog operations supported by the Denodo Embedded MPP for various table formats:
| Operation | Hive | Iceberg | Delta |
|---|---|---|---|
| Read | Yes | Yes | Yes |
| Create/Insert | Yes (*) | No | No |
| Update | No | No | No |
| Delete | No | No | No |

(*) To support write operations into Hive tables, ensure that your Hive catalog configuration includes the following property:
hive.non-managed-table-writes-enabled=true
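
As a rough illustration, an uncommented Hive catalog with writes enabled might look like the sketch below. It assumes the catalog section sits under the presto key (as the presto.catalog property name suggests), that AWS credentials are supplied by the environment so no access keys are listed, and that us-east-1 stands in for your own region.

```yaml
presto:
  catalog:
    glue-hive: |-
      connector.name=hive-hadoop2
      hive.metastore=glue
      # Placeholder region (assumption); replace with your own
      hive.metastore.glue.region=us-east-1
      hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
      # Needed for Create/Insert on Hive tables, per the footnote above
      hive.non-managed-table-writes-enabled=true
```

The lines inside the `|-` block are written as the body of the catalog properties file, so the `#` lines there are properties comments rather than YAML comments.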
Authentication and AWS Credentials
The recommended way to connect to AWS Glue Data Catalog from Denodo Embedded MPP is without explicitly providing AWS access and secret keys. This is achieved by leveraging AWS IAM authentication methods:
- EKS Pod Identities
- IAM Roles for Service Accounts (IRSA)
- IAM EC2 Instance Profile

In scenarios where the above IAM role-based authentication methods are not applicable (e.g., outside EKS or in specific custom setups), you can explicitly provide AWS credentials. In this case, you will need to provide both Glue and S3 credentials because the Embedded MPP needs access to the S3 files where the actual data resides, in addition to the Glue metadata.
You can provide these credentials either as:
- Access and secret key:
  - hive.metastore.glue.aws-access-key and hive.metastore.glue.aws-secret-key, for Glue access.
  - hive.s3.aws-access-key and hive.s3.aws-secret-key, for S3 file access.
- IAM Role (Assumed Role):
  - hive.metastore.glue.iam-role, for Glue access.
  - hive.s3.iam-role, for S3 file access.

These properties need to be added to your catalog definition, in values.yaml or a .properties file.
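
For instance, a catalog that assumes an IAM role for both Glue and S3 could be sketched as follows. The role ARN, account ID, and region are hypothetical placeholders, and the placement under the presto key is an assumption based on the presto.catalog property name; the same role is reused for both services here only to keep the example short.

```yaml
presto:
  catalog:
    glue-iceberg: |-
      connector.name=iceberg
      iceberg.catalog.type=HIVE
      hive.metastore=glue
      # Placeholder region (assumption)
      hive.metastore.glue.region=us-east-1
      # Hypothetical role ARN, used for Glue metadata access
      hive.metastore.glue.iam-role=arn:aws:iam::123456789012:role/example-mpp-role
      # Hypothetical role ARN, used for reading the data files in S3
      hive.s3.iam-role=arn:aws:iam::123456789012:role/example-mpp-role
      hive.config.resources=/opt/presto-server/etc/catalog/core-site.xml
```

If you use access keys instead, replace the two iam-role lines with the four aws-access-key and aws-secret-key properties, keeping one pair for Glue and one for S3.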
AWS Privileges for AWS Glue Data Catalog
The IAM role or AWS credentials used by the Denodo Embedded MPP must have the appropriate AWS privileges to access the AWS Glue Data Catalog.
Reading from AWS Glue. The IAM role/user needs the following minimum permissions:
- glue:GetDatabases
- glue:GetDatabase
- glue:GetTables
- glue:GetTable
- glue:GetPartitions
- glue:GetPartition
- glue:BatchGetPartition

Writing to AWS Glue. In addition to the read permissions, the IAM role/user will also require these permissions:
- glue:CreateTable
- glue:DeleteTable
- glue:UpdateTable
- glue:BatchCreatePartition
- glue:UpdatePartition
- glue:DeletePartition

Note: Ensure that the associated IAM role/user also has the necessary S3 permissions (s3:GetObject, s3:PutObject, etc.)
for the specific S3 buckets where your data files are stored.
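
The minimum read permissions can be gathered into a single policy statement. IAM itself expects policy documents in JSON; the sketch below shows the same structure in YAML, as it might be embedded in a YAML-based provisioning template (an assumption about your tooling). The wildcard resource is a placeholder rather than a recommendation.

```yaml
# Read-only Glue policy document (YAML rendering of the IAM JSON structure)
Version: "2012-10-17"
Statement:
  - Effect: Allow
    Action:
      - "glue:GetDatabases"
      - "glue:GetDatabase"
      - "glue:GetTables"
      - "glue:GetTable"
      - "glue:GetPartitions"
      - "glue:GetPartition"
      - "glue:BatchGetPartition"
    # Placeholder (assumption): narrow this to your catalog, database, and table ARNs
    Resource: "*"
```

For write access, the additional Glue actions listed for writing would be appended to the Action list, and a separate statement would cover the S3 permissions on the data buckets.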

