Bulk Data Load with the Denodo MPP
The standard JDBC data source connected to the Embedded MPP can be used to perform bulk data loads from Denodo.
In this case, Denodo:
1. First generates temporary files containing the data to insert, in Parquet file format or Iceberg table format (the Delta Lake table format is not supported).
2. Then uploads those files to the specific path configured in this section.
3. Finally performs the operations required to ensure the database table takes its data from the provided path.
For more information, see Bulk Data Load on Databases using Hadoop-compatible storage like the MPP at Denodo.
Before setting up the Bulk Data Load in Denodo, you have to create a new schema in the Denodo Embedded MPP that sets the location where the files generated by Denodo will be stored.
To do this, you can use the Denodo stored procedure CREATE_SCHEMA_ON_SOURCE:
CALL CREATE_SCHEMA_ON_SOURCE(
    'admin_denodo_mpp',                         -- Denodo database containing the data source
    'embedded_mpp',                             -- JDBC data source connected to the Embedded MPP
    'hive',                                     -- MPP catalog where the schema will be created
    'test',                                     -- name of the new schema
    '<filesystem_schema>://<host>/<folders>');  -- location for the files generated by Denodo
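For illustration only, assuming the files should land in a hypothetical ADLS Gen2 container named denodo-bulk inside a storage account named mystorageaccount (for ADLS Gen2 the filesystem schema is abfss), the call could look like this:

CALL CREATE_SCHEMA_ON_SOURCE(
    'admin_denodo_mpp',
    'embedded_mpp',
    'hive',
    'test',
    'abfss://denodo-bulk@mystorageaccount.dfs.core.windows.net/denodo');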
To configure the Bulk Data Load, check Use Bulk Data Load APIs in the Read & Write tab of the embedded_mpp data source, and fill in at least the following parameters:

- HDFS URI
- Server time zone
- Catalog
- Schema
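For instance, with the schema created above and the hypothetical ADLS Gen2 account from the previous example, the values could be (the time zone must match that of the MPP server; UTC is just a placeholder):

HDFS URI:         abfss://denodo-bulk@mystorageaccount.dfs.core.windows.net/denodo
Server time zone: UTC
Catalog:          hive
Schema:           test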
Then, depending on the chosen file system, you may need to add some Hadoop properties to configure authentication; see the Support for Hadoop-compatible routes section.
For example, access to Azure Data Lake Storage can be configured with properties like the following.
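A minimal sketch using the standard hadoop-azure (ABFS) authentication properties; mystorageaccount and all bracketed values are placeholders:

# Option 1: storage account access key
fs.azure.account.key.mystorageaccount.dfs.core.windows.net = <access-key>

# Option 2: OAuth 2.0 client credentials (service principal)
fs.azure.account.auth.type.mystorageaccount.dfs.core.windows.net = OAuth
fs.azure.account.oauth.provider.type.mystorageaccount.dfs.core.windows.net = org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.mystorageaccount.dfs.core.windows.net = <client-id>
fs.azure.account.oauth2.client.secret.mystorageaccount.dfs.core.windows.net = <client-secret>
fs.azure.account.oauth2.client.endpoint.mystorageaccount.dfs.core.windows.net = https://login.microsoftonline.com/<tenant-id>/oauth2/token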
Finally, click the Test bulk load button to check that everything is working correctly.