Delta Lake¶
Delta Lake is a table format that extends Parquet data files with a file-based transaction log for ACID transactions and scalable metadata handling.
The Denodo Embedded MPP is distributed with a predefined catalog named delta,
connected to the Embedded Hive Metastore, for accessing Delta Lake tables:
connector.name=delta
# Embedded Hive Metastore
hive.metastore.uri=thrift://hive-metastore:9083
To query Delta Lake tables, you must first register them manually in the Embedded MPP's Metastore with a CREATE TABLE statement.
Because the schema and the list of data files are stored in the Delta Log at the table's location,
you only need to declare a dummy column as the schema of the Delta Lake table, which avoids the no columns error in the Metastore:
CREATE TABLE delta.default.orders (
dummy bigint
) WITH (
external_location = 'abfs://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file_name>',
format = 'PARQUET'
);
The WITH clause of the CREATE TABLE statement can also be used to set other properties on the table. See Delta Lake Tables Properties.
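After registration, it can be worth checking that the Embedded MPP resolves the real schema from the Delta Log rather than the dummy column. The following is a minimal sketch, assuming the delta.default.orders table from the example above and a SQL client connected to the Embedded MPP. The actual output depends on your table:

```sql
-- List the columns the MPP resolves for the registered table.
-- If registration worked, this should show the columns stored in the
-- Delta Log, not the placeholder column declared in CREATE TABLE.
DESCRIBE delta.default.orders;

-- Read-only sanity check that the data files are reachable.
SELECT count(*) AS row_count
FROM delta.default.orders;
```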
Once the Delta Lake table is registered, you can use the embedded data source in Denodo to create a base view on top of
the table from the From MPP Catalogs tab.

Explore Delta Lake tables¶
Query Delta Lake tables directly¶
Another option is to query the table directly, using the table location as the table name, without registering it in the Metastore:
SELECT * FROM
delta."$path$"."abfs://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file_name>";
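When exploring an unregistered table this way, adding a LIMIT returns only a small sample of rows. A minimal sketch that reuses the placeholder location from the query above (replace it with your own table location):

```sql
-- Preview a few rows of an unregistered Delta Lake table by its location.
SELECT *
FROM delta."$path$"."abfs://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file_name>"
LIMIT 10;
```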
Features¶
The Denodo Embedded MPP provides the following features when working with Delta Lake tables:
Create base views over existing Delta Lake tables in the Embedded or External Metastore (From MPP Catalogs tab of the Embedded MPP data source)
Querying
Note
Delta protocol version (3, 7), that is, reader version 3 and writer version 7, is supported from Denodo Embedded MPP 20241007 onward.
Limitations¶
The following features are not supported for Delta Lake tables:
Graphically explore Delta Lake datasets, create tables in the MPP and base views in Denodo (From object storage tab of the Embedded MPP data source)
Bulk data load
Caching: full cache mode
Remote tables
Supported Operations by Metastore Type¶
Operation | Embedded Hive Metastore | External Hive Metastore | AWS Glue Data Catalog |
---|---|---|---|
Read | Yes | Yes | Yes |
Create/Insert | No | No | No |
Update | No | No | No |
Delete | No | No | No |