Ping Connection

During long-running operations, some proxies and load balancers might close the connection when no data has been sent or received for a certain amount of time.

The Data Catalog provides a mechanism to keep the connection alive in these scenarios.

When this feature is enabled and the connection has been idle for a configurable amount of time, the Data Catalog periodically writes noop data to the connection.

The ping connection feature applies to the following long-running operations:

  1. Metadata synchronization.

  2. Query execution.

  3. Query export.

Note

Exporting data with ping connection enabled can be slow when a large amount of data is transferred.

Ping Connection Configuration

This feature is disabled by default and the following parameters are supported in the Data Catalog configuration:

  • connection-ping.enabled=false

  • connection-ping.core-pool-size=20

  • connection-ping.max-pool-size=40

  • connection-ping.queue-capacity=50

  • connection-ping.polling-interval=15000

Enable the ping connection feature by setting connection-ping.enabled=true.

The connection-ping.polling-interval parameter sets the number of milliseconds that the connection can remain inactive.

When this time elapses, the backend writes noop data to the response to prevent the connection from timing out.
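
The idle check described above can be pictured with a small Java sketch. It is illustrative only: the IdlePinger class, its method names, and the choice of a single space character as the noop data are assumptions made for this example, not the Data Catalog's actual implementation. Timestamps are passed in explicitly so that the behavior is easy to follow.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: writes filler data when a response has been idle too long. */
public class IdlePinger {
    private final OutputStream out;
    // Plays the role of connection-ping.polling-interval (milliseconds).
    private final long pollingIntervalMillis;
    private long lastWriteMillis;

    public IdlePinger(OutputStream out, long pollingIntervalMillis, long nowMillis) {
        this.out = out;
        this.pollingIntervalMillis = pollingIntervalMillis;
        this.lastWriteMillis = nowMillis;
    }

    /** Writes real data and records the time of the write. */
    public synchronized void write(byte[] data, long nowMillis) {
        try {
            out.write(data);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        lastWriteMillis = nowMillis;
    }

    /**
     * Intended to be called periodically. Writes one noop byte if nothing has
     * been written for at least the polling interval.
     * Returns true if a noop byte was written.
     */
    public synchronized boolean pingIfIdle(long nowMillis) {
        if (nowMillis - lastWriteMillis < pollingIntervalMillis) {
            return false; // the connection was active recently enough
        }
        try {
            out.write(' '); // assumed noop payload for this sketch
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        lastWriteMillis = nowMillis;
        return true;
    }
}
```

In this sketch, a periodic task would call pingIfIdle with the current time for each open long-running response. Any real write resets the idle timer, so noop data is only sent while the operation produces no output.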

In addition, this feature uses a pool of threads:

  1. If the number of threads is less than connection-ping.core-pool-size, the backend creates a new thread to run the new task.

  2. If the number of threads is greater than or equal to connection-ping.core-pool-size and the queue is not full, the backend puts the task into the queue.

  3. If the queue is full and the number of threads is less than connection-ping.max-pool-size, the backend creates a new thread to run the task.

  4. If the queue is full and the number of threads is greater than or equal to connection-ping.max-pool-size, the backend rejects the task.
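
These four rules match the behavior of the standard java.util.concurrent.ThreadPoolExecutor when it is given a bounded queue. The following sketch reproduces them with that class; whether the Data Catalog uses this exact class internally is an assumption, and the PingPoolDemo class name is hypothetical. Each task blocks until the end of the demonstration so that the pool state can be observed after every step.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of the core size / queue / max size / reject sequence. */
public class PingPoolDemo {

    /**
     * Returns {threads after rule 1, threads during rule 2, queued tasks,
     * threads after rule 3, 1 if the final task was rejected else 0}.
     */
    public static int[] run(int core, int max, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        CountDownLatch release = new CountDownLatch(1);
        // Every task waits on the latch, so no thread becomes free early.
        Runnable blocked = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            // Rule 1: below the core size, each task gets a new thread.
            for (int i = 0; i < core; i++) pool.execute(blocked);
            int afterCore = pool.getPoolSize();

            // Rule 2: at the core size, tasks are queued instead.
            for (int i = 0; i < queueCapacity; i++) pool.execute(blocked);
            int whileQueueing = pool.getPoolSize();
            int queued = pool.getQueue().size();

            // Rule 3: queue full, so new threads are created up to the max size.
            for (int i = 0; i < max - core; i++) pool.execute(blocked);
            int afterMax = pool.getPoolSize();

            // Rule 4: queue full and max size reached, so the task is rejected.
            int rejected = 0;
            try {
                pool.execute(blocked);
            } catch (RejectedExecutionException e) {
                rejected = 1;
            }
            return new int[] {afterCore, whileQueueing, queued, afterMax, rejected};
        } finally {
            release.countDown(); // let all blocked tasks finish
            pool.shutdown();
        }
    }
}
```

With the values listed above (core size 20, max size 40, queue capacity 50), this sequence means that up to 90 tasks can be accepted at once before a new one is rejected: 40 running and 50 waiting in the queue.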