Connection Pools Overview
Connecting to a database server consists of several steps each of which takes time to complete. A physical channel such as a socket or a named pipe must be established, the initial handshake with the server must occur, the connection string information must be parsed, the connection must be authenticated by the server, checks must be run for enlisting in the current transaction, and so on.
To minimize the cost of opening connections, Denodo utilizes an optimization technique called connection pooling which is a standard practice widely used by the industry.
A connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. In connection pooling, after a connection is created, it is placed in the pool and it is used again so that the work to create a new connection does not have to be repeated.
For JDBC connections, Denodo uses the Apache Commons DBCP library. For distributed transactions, the XAPool library is used.
A detailed explanation on how the connection pool works for JDBC sources, including the meaning of the different parameters, can be found in the The Pool of Connections of the JDBC Data Sources section of the Virtual DataPort VQL Guide.
General Stale Connection Strategies
A connection pool is a commonly used optimization method which reduces the time taken to create new connections, as already open connections can be reused. Nevertheless, there could be situations where the connection can be broken or in a stale status. In these types of situations, the client application has no control over the network.
For this reason, Connection Pools have some strategies to overcome and defend against any issue that may happen in the network in order to ensure no broken or stale connections are used.
- Attempt to use a connection and allow the underlying transport protocols to throw an exception if it is no longer valid. The application layer, in this case, the Denodo Virtual DataPort Server, catches the exception and handles it in a logical manner.
- Prior to using a connection from the pool, a test is executed and based on the outcome it uses the connection or gets a new one.
- Have the pool system test idle connections periodically to confirm if they are still valid.
Denodo Solutions to Handle Stale Connections on JDBC data sources
- Allow the underlying transport protocols to throw an exception if a connection is no longer valid: The connection pool mechanism of the Denodo Virtual DataPort server does this by default. When obtaining a connection from the pool, it executes a validation query. If the underlying transport protocol alerts the application that the message cannot reach the destination or some other error, Denodo will discard that connection and retrieve another one from the pool.
However, there may be various network components that can close a connection. In some cases, neither the server nor client is aware that the connection has been closed. For example, a firewall can close connections without sending a notification to either end of the connection. In addition, when the client sends a message through that connection, it is silently discarded so it does not reach the destination. This prevents the application from knowing if the query failed.
- Add a timeout for the validation query: With this option, if a ping query is not responding in a specified amount of time, the connection is discarded. This option is not enabled by default.
It can be configured at a global server level for all the JDBC data sources, but can also be configured specifically for every data source.
The following two parameters will affect all JDBC data sources:
com.denodo.vdb.misc.datasource.JDBCDataSource.pool.pingQueryTimeout
com.denodo.vdb.misc.datasource.JDBCDataSource.pool.destroyObjectTimeout
To specify the timeouts for a specific data source from a database:
com.denodo.vdb.misc.datasource.JDBCDataSource.pool.pingQueryTimeout.<databaseName>.<dataSourceName>
com.denodo.vdb.misc.datasource.JDBCDataSource.pool.destroyObjectTimeout.<databaseName>.<dataSourceName>
The timeouts for all these parameters are specified in milliseconds.
These settings are changed through a SET command in a Denodo VQL Shell. For example:
SET ‘com.denodo.vdb.misc.datasource.JDBCDataSource.pool.pingQueryTimeout’ = ‘9000000’
- Test the idle connections periodically with a validation or ping query: If the connection is not responding to the validation query it is discarded. This is a way of keeping the connections busy and avoiding having them unused for a long period of time. This can be configured through VQL using the following properties of a JDBC data source (TESTWHILEIDLE, TIMEBETWEENEVICTION, NUMTESTPEREVICTION). If we know when a network element is closing the connection, we should adjust the time to minimize the number of tests over the idle connections. For example, a connection test every 15 minutes would look like this:
CREATE OR REPLACE DATASOURCE JDBC ds_name
[....]
TESTWHILEIDLE = true
TIMEBETWEENEVICTION = 900000
NUMTESTPEREVICTION = 20
MINEVICTABLETIME = 1800000
[...]
You can get more information on the different properties in the CREATE DATA SOURCE JDBC statement in the The Pool of Connections of the JDBC Data Sources section of the Virtual DataPort VQL Guide. It also includes recommendations for the configuration in environments with a firewall.
Recommended Solution
The best strategy will depend on the specific case. The second strategy of testing the connection before using it with a ping query corresponds to the first Denodo solution, it is enabled by default in Denodo JDBC data sources and is the recommended solution in most scenarios.
Validating idle connections could be a good fit if we know when the connection is going to be dropped by a network component. In that case, we can adjust the validation of idle connections to that number so we avoid unnecessary ping query executions. In any case, note that the ping or validation query should be an immediate query causing practically no impact in terms of performance. Nevertheless, this ping query introduces a certain amount of overhead and especially if the connection dropping cannot be predicted or it happens too often the recommended solution would be relying on a ping query timeout.
References
The Pool of Connections of the JDBC Data Sources
Recommended Settings of the Connection Pool in Environments with a Firewall
The information provided in the Denodo Knowledge Base is intended to assist our users in advanced uses of Denodo. Please note that the results from the application of processes and configurations detailed in these documents may vary depending on your specific environment. Use them at your own discretion.
For an official guide of supported features, please refer to the User Manuals. For questions on critical systems or complex environments we recommend you to contact your Denodo Customer Success Manager.