Troubleshooting Usage

This section provides information on how to resolve the most common problems while using the Denodo Embedded MPP.

To identify and troubleshoot common errors during the deployment of the Denodo Embedded MPP and its use from Denodo Virtual DataPort, see Denodo Embedded MPP Troubleshooting.


PKIX path building failed: unable to find valid certification path to requested target

Cause

The certificate of the server you are trying to connect to is missing from the truststore of the client’s JVM. This happens when the server certificate is self-signed or is signed by a private certificate authority that is not present in the client’s truststore.

Solution

Make sure you have imported the certificate of the Denodo Embedded MPP into the Denodo server’s truststore. See the instructions in the SSL/TLS section.
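
As a minimal sketch of the import, assuming the Denodo server uses the JRE truststore shipped at $DENODO_HOME/jre/lib/security/cacerts with the default password (the certificate file name and alias are illustrative):

keytool -importcert \
  -alias denodo-mpp \
  -file mpp-certificate.crt \
  -keystore "$DENODO_HOME/jre/lib/security/cacerts" \
  -storepass changeit

Restart the Denodo server afterwards so the JVM picks up the new truststore entry.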

KeyStores with multiple certificates are not supported (Solved in Denodo Embedded MPP 20241007)

Cause

The error occurs when a certificate with multiple SubjectAlternativeName (SAN) elements is used:

 SubjectAlternativeName [
     DNSName: *.acme.com
     DNSName: acme.com
     DNSName: b.acme.com
 ]
Solution

Use a certificate with a single element in the SubjectAlternativeName (SAN) field, as shown below:

 SubjectAlternativeName [
     DNSName: *.acme.com
 ]

Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden;)

Cause

The most common cause is that the IAM Role used by the Embedded MPP does not have the required permissions to access the S3 data.

Solution

To troubleshoot 403 issues from the MPP to S3 data, check the AWS documentation on troubleshooting Access Denied (403 Forbidden) errors in Amazon S3.
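
As a reference, a minimal IAM policy sketch granting read access to the S3 data (the bucket name is illustrative; add write actions such as s3:PutObject and s3:DeleteObject if the Embedded MPP also writes to the bucket):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::my-data-bucket",
        "arn:aws:s3:::my-data-bucket/*"
      ]
    }
  ]
}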

Query exceeded per-node user memory limit of xGB, Query exceeded per-node total memory limit of xGB and Query exceeded distributed user memory limit of xGB

Cause
  • Query exceeded per-node user memory limit of xGB: query.max-memory-per-node, the maximum amount of user memory a query can use on a worker, defaults to JVM max memory * 0.1. The JVM max memory that the Embedded MPP allocates is 80% of the machine memory (the memoryPerNode configured in the values.yaml). This error means that your queries need more user memory per node than that default.

  • Query exceeded per-node total memory limit of xGB: query.max-total-memory-per-node, the maximum amount of user and system memory a query can use on a worker, defaults to JVM max memory * 0.2. This error means that your queries need more total memory per node than that default.

  • Query exceeded distributed user memory limit of xGB: query.max-memory, the maximum amount of user memory a query can use across all MPP Workers in the cluster, defaults to 20GB. This error means that your queries need more than 20GB of user memory across the cluster. A worked example of these defaults follows this list.
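
For example, with a hypothetical memoryPerNode of 16Gi, the defaults work out as follows:

JVM max memory                            = 16GB * 0.8 = 12.8GB
query.max-memory-per-node (default)       = 12.8GB * 0.1 = 1.28GB
query.max-total-memory-per-node (default) = 12.8GB * 0.2 = 2.56GB
query.max-memory (default)                = 20GB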

Solution

You can run EXPLAIN (TYPE DISTRIBUTED) on these big queries, from an external JDBC client like DBeaver, to examine the query plan and try to optimize it: gather statistics for each view involved in the queries, add filter conditions to reduce the amount of data to be processed, etc.
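
For example, a sketch against a hypothetical table:

EXPLAIN (TYPE DISTRIBUTED)
SELECT customer_id, sum(amount)
FROM hive.default.sales
WHERE sale_date >= DATE '2024-01-01'
GROUP BY customer_id;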

If this recommendation does not work well in your scenario, you will probably need to increase the available memory by configuring the memory settings query.max-memory-per-node, query.max-total-memory-per-node and query.max-memory, and then apply the configuration change to the cluster by executing helm upgrade prestocluster prestocluster/.
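
A sketch of these settings in the values.yaml (the values are illustrative; size them to your workload and machine memory):

Additional properties in values.yaml
presto:
  coordinator:
    additionalConfig: [
      query.max-memory-per-node=4GB,
      query.max-total-memory-per-node=8GB,
      query.max-memory=40GB
    ]

  worker:
    additionalConfig: [
      query.max-memory-per-node=4GB,
      query.max-total-memory-per-node=8GB,
      query.max-memory=40GB
    ]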

For more information on how to configure the memory settings, see the Memory Settings for Queries section.

In addition to adjusting the memory settings, sometimes the only solution for handling large queries is to use instances with more memory or to add more nodes to the cluster.

Delta protocol version (2,5) is too new for this version of Delta Standalone Reader/Writer (1,2). (Solved in Denodo Embedded MPP 20241007)

Cause

There is a limitation in Presto: it does not support reading Delta Lake tables with a protocol version greater than 1.

Solution

Until this limitation is addressed, the workaround is to downgrade the protocol version of the Delta Lake table to reader version 1 and writer version 2:

ALTER TABLE <table_identifier>
SET TBLPROPERTIES('delta.minReaderVersion' = '1', 'delta.minWriterVersion' = '2')

Abandoned queries: Query … has not been accessed since…

Cause

This means that the client of the Embedded MPP, that is, Denodo, is not processing the query results or is processing them slowly, so the Embedded MPP assumes that the client has gone away.

Solution

You can increase query.client.timeout for the MPP coordinator (the default value is 5 minutes, 5.00m) in the values.yaml, and apply the configuration change to the cluster by executing helm upgrade prestocluster prestocluster/.

Additional properties in values.yaml
presto:
  coordinator:
    additionalConfig: [
      query.client.timeout=10.00m
    ]

But in most cases, this is an indication that you need to review your query to identify where the bottleneck is and take action to improve its performance, as explained in Detecting Bottlenecks in a Query.

hive-metastore:9083: java.net.SocketTimeoutException: Read timed out

Cause

A read timeout occurs when querying the Metastore, probably because the query involves very large tables or tables with too many partitions.

Solution

If this only happens with some queries, you can increase the Metastore request timeout in the values.yaml file:

metastore request timeout in values.yaml
presto:
  hive:
    hiveMetastoreTimeout: 60s

  delta:
    hiveMetastoreTimeout: 60s

  iceberg:
    hiveMetastoreTimeout: 60s

Otherwise, if the timeout error occurs on every query, check the connection from the Presto pod to the Hive-Metastore pod.
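
As a quick connectivity check, a sketch using kubectl (the coordinator pod name is illustrative; adjust it to your deployment):

kubectl get pods
kubectl exec -it presto-coordinator-0 -- \
  bash -c 'timeout 5 bash -c "</dev/tcp/hive-metastore/9083" && echo reachable'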

Error fetching next … returned an invalid response: JsonResponse{statusCode=500, statusMessage=Server Error, headers={connection=[close]}, hasValue=false} [Error: ]

Cause

This means that the HTTP header size exceeds its limit. The default value is 4kB.

Solution

You can increase the HTTP header limits for the MPP coordinator and the MPP workers to 64kB, or bigger if needed, in the values.yaml, and apply the configuration change to the cluster by executing helm upgrade prestocluster prestocluster/.

Additional properties in values.yaml
presto:
  coordinator:
    additionalConfig: [
      http-server.max-request-header-size=64kB,
      http-server.max-response-header-size=64kB
    ]

  worker:
    additionalConfig: [
      http-server.max-request-header-size=64kB,
      http-server.max-response-header-size=64kB
    ]

org.apache.parquet.io.PrimitiveColumnIO cannot be cast to class org.apache.parquet.io.GroupColumnIO

Cause

The Embedded MPP is reading a Hive table with complex/compound structures and the Hive table schema is not compatible with the Parquet schema.

Solution

Check the schema in the Parquet files and compare it with the schema declared in the Hive table in the Embedded MPP. There are multiple tools available to inspect the schema of a Parquet file; one of the most common is parquet-tools.
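
For example, a sketch assuming parquet-tools is installed (the file path is illustrative):

parquet-tools schema /path/to/part-00000.parquet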

Unable to establish connection: Unrecognized connection property 'protocols'

Cause

Denodo is loading an older version of the Presto driver.

Solution

Remove Presto driver backups from $DENODO_HOME/lib/extensions/jdbc-drivers/presto-0.1x, leaving only the presto-jdbc.jar.
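
A sketch of the cleanup (review the directory contents before deleting anything):

cd "$DENODO_HOME/lib/extensions/jdbc-drivers/presto-0.1x"
ls
find . -maxdepth 1 -name "*.jar" ! -name "presto-jdbc.jar" -delete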

Registering Iceberg tables: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.facebook.presto.hive.s3.PrestoS3FileSystem not found

Cause

The Iceberg table was registered using the s3a:// URI scheme, but the location field of the Iceberg table metadata uses the s3:// scheme, or vice versa.

Solution

Change the register_table procedure call for the Iceberg table so that it uses the same URI scheme as the one in the location field of the Iceberg table metadata.

Register Iceberg table sentence using s3a
CALL iceberg.system.register_table(
  'default',
  'denodo_iceberg_table',
  's3a://bucket/path/to/iceberg/table/')

Solve problems with an OIDC provider and IRSA in Amazon EKS

Cause

Problems can arise for many different reasons, some of which are detailed in the AWS documentation.

Solution

For troubleshooting issues related to an OIDC provider and IRSA in Amazon EKS, refer to the AWS documentation on IAM roles for service accounts (IRSA).
