
JDBC Data Sources

To create a JDBC data source, use the statement CREATE DATASOURCE JDBC.

Syntax of the CREATE DATASOURCE JDBC statement
CREATE [ OR REPLACE ] DATASOURCE JDBC <name:identifier> [ EMBEDDED_MPP ]
    [ ID = <literal> ]
    [ FOLDER = <literal> ]
    [ VAULT_SECRET = <literal> ]
    DRIVERCLASSNAME = <literal>
    DATABASEURI = <literal>
    [
        <credentials>
      | USE_KERBEROS ( <kerberos_credentials> )
      | <credentials> USE_KERBEROS_AT_RUNTIME ( <kerberos_credentials> )
      | <credentials> USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> )
      | USE_OAUTH ( <oauth_configuration> )
      | <credentials> USE_OAUTH_AT_RUNTIME ( <oauth_configuration> )
      | <credentials> USE_OAUTH_AT_INTROSPECTION ( <oauth_configuration> )
      | WITH PASS-THROUGH SESSION CREDENTIALS ( <pass_through_options> )
      | WITH PASS-THROUGH SESSION CREDENTIALS (
         [
            PASSTHROUGH_KERBEROS_TARGET_PRINCIPAL = <literal>
            | CONSTRAINED_DELEGATION_PROPERTY_NAME = <literal>
         ]
         ) [
            USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> )
            | USE_AWS_IAM_CREDENTIALS_AT_INTROSPECTION ( <aws_iam_credentials> )
         ]
      | WITH PASS-THROUGH SESSION CREDENTIALS (
         [ REQUIRE_OAUTH ]
         [ USE_OAUTH_AT_INTROSPECTION ]
         [
            USING_TOKEN_PASS_THROUGH
            | USING_TOKEN_EXCHANGE_FLOW
            | USING_OBO_FLOW
         ] ( <oauth_configuration> )
      )
      | USE_AWS_IAM_CREDENTIALS ( <aws_iam_credentials> )
      | <credentials> USE_AWS_IAM_CREDENTIALS_AT_RUNTIME ( <aws_iam_credentials> )
      | <credentials> USE_AWS_IAM_CREDENTIALS_AT_INTROSPECTION ( <aws_iam_credentials> )
      | <gcp_credentials>
    ]
    [ WITH PROXY_CONNECTIONS ]
    [ CLASSPATH = <literal> ]
    [ DATABASENAME = <literal> DATABASEVERSION = <literal> ]
    [ ISOLATIONLEVEL =
        {   TRANSACTION_NONE
          | TRANSACTION_READ_COMMITTED
          | TRANSACTION_READ_UNCOMMITTED
          | TRANSACTION_REPEATABLE_READ
          | TRANSACTION_SERIALIZABLE
        }
    ]
    [ IGNORETRAILINGSPACES = { TRUE | FALSE } ]
    [ FETCHSIZE = <integer> ]
    [ <data load configuration> ]
    [ CREDENTIALS_VAULT (
        STATUS { ON | DEFAULT }
        [ PROVIDER
          { CYBERARK (
              APPLICATION_ID = <literal>
              { AGENT
              | AGENT_LESS (
                  CLIENT_KEY = <literal> [ ENCRYPTED ]
                  [ CLIENT_KEY_PASSWORD = <literal> [ ENCRYPTED ] ]
                )
              }
            )
          }
        ]
      )
    ]
    [
        <pool configuration 1>
      | <pool configuration 2>
      | <pool configuration 3>
    ]
    [ <external data configuration> ]
    [ PROPERTIES ( <literal> = <literal> [ VAR ] [, <literal> = <literal> [ VAR ] ]* ) ]
    [ KERBEROSPROPERTIES ( <literal> = <literal> [, <literal> = <literal> ]* ) ]
    [ OAUTHPROPERTIES ( <literal> = <literal> [, <literal> = <literal> ]* ) ]
    [ TRANSFER_RATE_FACTOR = <double> ]
    [ PROCESSING_UNITS = <integer> ]
    [ CPUS_PER_PROCESSING_UNIT = <integer> ]
    [ INTERNAL_TRANSFER_RATE = <double> ]
    [ <data infrastructure> ]
    [ DESCRIPTION = <literal> ]
    [ SOURCECONFIGURATION ( [ <source configuration property>
                            [, <source configuration property> ]* ] ) ]

<pass_through_options> ::=
    [ <credentials> ]
    [ USE_KERBEROS ]
    [ PASSTHROUGH_KERBEROS_TARGET_PRINCIPAL = <literal> ]
    [ CONSTRAINED_DELEGATION_PROPERTY_NAME = <literal> ]

<credentials> ::=
    USERNAME = <literal> USERPASSWORD = <literal> [ ENCRYPTED ]
  | <vault_credentials>

<kerberos_credentials> ::=
    KRB_USERNAME = <literal> KRB_USERPASSWORD = <literal> [ ENCRYPTED ]
  | KRB_USERNAME = <literal> KRB_KEYTAB = <literal>

<aws_iam_credentials> ::=
    AWS_ACCESS_KEY_ID = <literal> AWS_SECRET_ACCESS_KEY = <literal> [ ENCRYPTED ] [ AWS_DATABASE_USER = <literal> ]
  | AWS_ACCESS_KEY_ID = <literal> AWS_SECRET_ACCESS_KEY = <literal> [ ENCRYPTED ] AWS_IAM_ROLE_ARN = <literal> [ AWS_DATABASE_USER = <literal> ]
  | AWS_IAM_ROLE_ARN = <literal> [ AWS_DATABASE_USER = <literal> ]
  | [ AWS_DATABASE_USER = <literal> ]

<gcp_credentials> ::=
  GCP (
    OAUTH_TYPE = <literal>
    PRIVATE_KEY = <literal> [ ENCRYPTED ]
    PROJECT_ID = <literal>
    SERVICE_ACCOUNT_EMAIL = <literal>
  )

<vault_credentials> ::=
    FROM_VAULT
  | FROM_VAULT (VAULT_SECRET = <literal>, FIELD_AT_SECRET = DEFAULT)

<oauth_configuration> ::=
   TOKEN_ENDPOINTURL = <literal>
   CLIENT_IDENTIFIER = <literal>
   CLIENT_SECRET = <literal> [ ENCRYPTED ]
   OAUTH_USER = { <literal> | <vault_credentials> }
   OAUTH_PASSWORD = { <literal> [ ENCRYPTED ] | <vault_credentials> }
   SCOPE = <literal>
   [ OAUTH_TOKEN_PROPERTY_NAME = <literal> ]

<pool configuration 1> ::=
  VALIDATIONQUERY = <literal>
  INITIALSIZE = <integer>
  MAXACTIVE = <integer>

<pool configuration 2> ::=
  VALIDATIONQUERY = <literal>
  INITIALSIZE = <integer>
  MAXACTIVE = <integer>
  EXHAUSTEDACTION = <integer>

<pool configuration 3> ::=
  VALIDATIONQUERY = <literal>
  INITIALSIZE = <integer>
  MAXIDLE = <integer>
  MINIDLE = <integer>
  MAXACTIVE = <integer>
  EXHAUSTEDACTION = <integer>
  TESTONBORROW = <boolean>
  TESTONRETURN = <boolean>
  TESTWHILEIDLE = <boolean>
  [ <pool sub-configuration 1> ]

<pool sub-configuration 1> ::=
  TIMEBETWEENEVICTION = <integer>
  NUMTESTPEREVICTION = <integer>
  MINEVICTABLETIME = <integer>
  [ <pool sub-configuration 2> ]

<pool sub-configuration 2> ::=
  POOLPREPAREDSTATEMENTS = <boolean>
  MAXOPENPREPAREDSTATEMENTS = <integer>

<data infrastructure> ::=
  DATA_INFRASTRUCTURE (
     [ PROVIDER = <literal> ]
     [ REGION  = <literal>  ]
  )

<source configuration property> ::=
    ALLOWLITERALASPARAMETER = <property value>
  | DELEGATE_BINARY_ORDERBY_COLLATION = <property value>
  | DELEGATE_ORDERBY_COLLATION_MODIFIER = <property value>
  | DELEGATEAGGREGATEFUNCTIONS = {
        DEFAULT
      | ( <function:identifier> [, <function:identifier> ]* )
      }
  | DELEGATEALLOPERATORS = <property value>
  | DELEGATEANALYTICFUNCTIONSLIST = {
        DEFAULT
      | ( <function:identifier> [, <function:identifier> ]* )
      }
  | DELEGATEARRAYLITERAL = <property value>
  | DELEGATECOMPOUNDFIELDPROJECTION = <property value>
  | DELEGATEGROUPBY = <property value>
  | DELEGATEHAVING = <property value>
  | DELEGATEINNERJOIN = <property value>
  | DELEGATEINTERSECTION = <property value>
  | DELEGATEINVALIDNUMBERLITERALSASNULL = <property value>
  | DELEGATEJOIN = <property value>
  | DELEGATELEFTFUNCTION = <property value>
  | DELEGATELEFTLITERAL = <property value>
  | DELEGATELITERALEXPRESSION = <property value>
  | DELEGATEMIXEDAGGREGATEEXPRESSION = <property value>
  | DELEGATENATURALOUTERJOIN = <property value>
  | DELEGATENOTCONDITION = <property value>
  | DELEGATE_OFFSET_RESTRICTION = <delegate offset restriction value>
  | DELEGATEOPERATORSLIST = {
        DEFAULT
      | ( <operator:identifier> [, <operator:identifier> ]* )
      }
  | DELEGATEORCONDITION = <property value>
  | DELEGATEORDERBY = <property value>
  | DELEGATEPROJECTION = <property value>
  | DELEGATEREGISTERLITERAL = <property value>
  | DELEGATERIGHTFIELD = <property value>
  | DELEGATERIGHTFUNCTION = <property value>
  | DELEGATERIGHTLITERAL = <property value>
  | DELEGATESCALARFUNCTIONS = {
        DEFAULT
      | ( <function:identifier> [, <function:identifier> ]* )
      }
  | DELEGATESELECTDISTINCT = <property value>
  | DELEGATESELECTION = <property value>
  | DELEGATEUNION = <property value>
  | SUPPORTSAGGREGATEFUNCTIONSOPTIONS = <property value>
  | SUPPORTSBRANCHOUTERJOIN = <property value>
  | SUPPORTSEQOUTERJOINOPERATOR = <property value>
  | SUPPORTSEXPLICITCROSSJOIN = <property value>
  | SUPPORTSFULLEQOUTERJOIN = <property value>
  | SUPPORTSFULLNOTEQOUTERJOIN = <property value>
  | SUPPORTSFUSINGINUSINGANDNATURALJOIN = <property value>
  | SUPPORTSJOINONCONDITION = <property value>
  | SUPPORTSNATURALJOIN = <property value>
  | SUPPORTSPREPAREDSTATEMENT = <property value>
  | SUPPORTSUSINGJOIN = <property value>

<property value> ::=
    true
  | false
  | DEFAULT

<delegate offset restriction value> ::=
    DEFAULT
  | 'NONE'
  | 'FETCH'
  | 'ORDER_BY'
  | 'FETCH_ORDER_BY'
  | 'NO_ORDER_BY'
  | 'FETCH_NO_ORDER_BY'

<data load configuration> ::=
    DATA_LOAD_CONFIGURATION (
      [ USE_FOR_QUERY_OPTIMIZATION = { false | DATA_MOVEMENT | PARALLEL_PROCESSING } ]
      [ BATCHINSERTSIZE = <integer> ]
      [ BULK_LOAD_CONFIGURATION (
           <workdir configuration>
         | <athena configuration>
         | <redshift configuration 1>
         | <redshift configuration 2>
         | <hdfs configuration>
         | <databricks configuration>
         | <oracle configuration>
         | <SQL Server configuration>
         | <db2 configuration>
         | <gcp bulk credentials>
        )
      ]
      [ TARGET_CATALOG = <literal> [ ESCAPE ] ]
      [ TARGET_SCHEMA = <literal> [ ESCAPE ] ]
      [ USE_EXTERNAL_TABLES (
         ON_MOVE_READ = <boolean>,
         ON_MOVE_WRITE = <boolean>
        )
      ]
      [ CREATE_TABLE_TEMPLATES ( <create_table_template> [, <create_table_template> ]* )
        <default_create_table_templates> ]
    )

<external data configuration> ::=
    EXTERNAL_DATA_CONFIGURATION (
        FILE_SYSTEM { HDFS | S3 | ADLS }
        AUTHENTICATION ( <aws_authentication> | <azure_authentication> )
        [ HADOOP_CUSTOM_PROPERTIES = ( <literal> = <literal> [, <literal> = <literal> ]* ) ]
        EXTERNAL_DATA_ROUTES (
            <external_data_routes_list>
        )
    )

<aws_authentication> ::=
    [ AWS_ACCESS_KEY_ID = { <literal> | <vault_credentials> }]
    [ AWS_SECRET_ACCESS_KEY = { <literal> [ ENCRYPTED ] | <vault_credentials> } ]
    [ AWS_IAM_ROLE_ARN = <literal> ]
    [ AWS_VAULT_SECRET = <literal> ]

<azure_authentication> ::= <azure_authentication_shared_key>   |
                           <azure_authentication_oauth_client> |
                           <azure_authentication_oauth_managed_identity>

<azure_authentication_shared_key> ::=
    AZURE_AUTH_TYPE = SHARED_KEY
    SHARED_KEY = <literal> [ ENCRYPTED ]

<azure_authentication_oauth_client> ::=
    AZURE_AUTH_TYPE = OAUTH_CLIENT_CREDENTIALS
    OAUTH_CLIENT_ID = <literal> | <vault_credentials>
    [ OAUTH_CLIENT_SECRET = <literal> [ ENCRYPTED ] | <vault_credentials> ]
    OAUTH_TOKEN_ENDPOINT = <literal>
    OAUTH_VAULT_SECRET = <literal>

<azure_authentication_oauth_managed_identity> ::=
    AZURE_AUTH_TYPE = OAUTH_MANAGED_IDENTITY
    [ OAUTH_CLIENT_ID = <literal> | <vault_credentials> ]
    [ OAUTH_TENANT_ID = <literal> ]
    [ OAUTH_TOKEN_ENDPOINT = <literal> ]

<external_data_routes_list> ::=
    <external_data_routes_template> |
    <external_data_routes_list> COMMA <external_data_routes_template>

<external_data_routes_template> ::= <identifier_name> EQ <literal>

<create_table_template> ::=
 <template_name:identifier> = <template_definition:literal>
                              [ DEFAULT( <parameter_default_value> [, <parameter_default_value> ]* ) ]

Note: You can use parameters within the template definition. Syntax: @{parameter_name}

<parameter_default_value> ::= <parameter_name:literal> = <parameter_value:literal>

<default_create_table_templates> ::=
      DEFAULT CREATE_TABLE_TEMPLATE <template_name:identifier>
    | DEFAULT CREATE_TABLE_TEMPLATES (
          [ DATA_MOVEMENT =    { DATA_SOURCE_DEFAULT | <data_movement_template_name:identifier> }, ]
          [ CACHE =            { DATA_SOURCE_DEFAULT | <cache_template_name:identifier> }, ]
          [ CACHE_MANAGEMENT = { DATA_SOURCE_DEFAULT | <cache_management_template_name:identifier> }, ]
          [ REMOTE_TABLE =     { DATA_SOURCE_DEFAULT | <remote_table_template_name:identifier> }, ]
          [ SUMMARY =          { DATA_SOURCE_DEFAULT | <summary_template_name:identifier> } ]
      )
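
For example, the following statement creates a JDBC data source over an Oracle database. It is a minimal sketch based on the syntax above: the driver class, URI, credentials, adapter name and version are illustrative values that you must adapt to your environment.

CREATE OR REPLACE DATASOURCE JDBC ds_oracle_sales
    FOLDER = '/data sources'
    DRIVERCLASSNAME = 'oracle.jdbc.OracleDriver'
    DATABASEURI = 'jdbc:oracle:thin:@oracle-host:1521:ORCL'
    USERNAME = 'sales_app'
    USERPASSWORD = 'sales_password'
    DATABASENAME = 'oracle'
    DATABASEVERSION = '19c'
    ISOLATIONLEVEL = TRANSACTION_READ_COMMITTED
    FETCHSIZE = 1000
    VALIDATIONQUERY = 'SELECT 1 FROM dual'
    INITIALSIZE = 4
    MAXACTIVE = 20
    DESCRIPTION = 'Sales database';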

To modify a JDBC data source, use ALTER DATASOURCE JDBC.

Syntax of the ALTER DATASOURCE JDBC statement
ALTER DATASOURCE JDBC <name:identifier>
    [ ID = <literal> ]
    [ FOLDER = <literal> ]
    [ VAULT_SECRET = <literal> ]
    [ DRIVERCLASSNAME = <literal> ]
    [ DATABASEURI = <literal> ]
    [
        <credentials>
      | USE_KERBEROS ( <kerberos_credentials> )
      | <credentials> USE_KERBEROS_AT_RUNTIME ( <kerberos_credentials> )
      | <credentials> USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> )
      | WITH PASS-THROUGH SESSION CREDENTIALS ( <pass_through_options> )
      | WITH PASS-THROUGH SESSION CREDENTIALS ( )
          [   USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> )
            | USE_AWS_IAM_CREDENTIALS_AT_INTROSPECTION ( <aws_iam_credentials> ) ]
      | USE_AWS_IAM_CREDENTIALS ( <aws_iam_credentials> )
      | <credentials> USE_AWS_IAM_CREDENTIALS_AT_RUNTIME ( <aws_iam_credentials> )
      | <credentials> USE_AWS_IAM_CREDENTIALS_AT_INTROSPECTION ( <aws_iam_credentials> )
      | <gcp_credentials>
      [ WITH PROXY_CONNECTIONS ]
    ]
    [ CLASSPATH = <literal> ]
    [
      DATABASENAME = <literal>
      DATABASEVERSION = <literal>
    ]
    [ ISOLATIONLEVEL =
        TRANSACTION_NONE
      | TRANSACTION_READ_COMMITTED
      | TRANSACTION_READ_UNCOMMITTED
      | TRANSACTION_REPEATABLE_READ
      | TRANSACTION_SERIALIZABLE
    ]
    [ IGNORETRAILINGSPACES = { true | false } ]
    [ FETCHSIZE = <integer> ]
    [ BATCHINSERTSIZE = <integer> ]
    [ CREDENTIALS_VAULT (
        STATUS { ON | DEFAULT }
        [ PROVIDER
          { CYBERARK (
              APPLICATION_ID = <literal>
              { AGENT
              | AGENT_LESS (
                  CLIENT_KEY = <literal> [ ENCRYPTED ]
                  [ CLIENT_KEY_PASSWORD = <literal> [ ENCRYPTED ] ]
                )
              }
            )
          }
        ]
      )
    ]
    [
        <pool configuration 1>
      | <pool configuration 2>
      | <pool configuration 3>
    ]
    [ <external data configuration> ]
    [ PROPERTIES ( <literal> = <literal> [ VAR ] [, <literal> = <literal> [ VAR ] ]* ) ]
    [ KERBEROSPROPERTIES ( <literal> = <literal> [, <literal> = <literal> ]* ) ]
    [ USEEXTERNALTABLES (
          ONMOVEREAD = <boolean>,
          ONMOVEWRITE = <boolean>
      )
    ]
    [ DATAMOVEMENT_TARGET = { true | false } ]
    [ TARGET_CATALOG = <literal> [ ESCAPE ] ]
    [ TARGET_SCHEMA = <literal> [ ESCAPE ] ]
    [ TRANSFER_RATE_FACTOR = <double> ]
    [ PROCESSING_UNITS = <integer> ]
    [ CPUS_PER_PROCESSING_UNIT = <integer> ]
    [ INTERNAL_TRANSFER_RATE = <double> ]
    [ CREATE_TABLE_TEMPLATES ( <create_table_template> [, <create_table_template> ]* ) ]
    [ <default_create_table_templates> ]
    [ <data infrastructure> ]
    [ DESCRIPTION = <literal> ]
    [ SOURCECONFIGURATION ( [ <source configuration property>
                            [, <source configuration property> ]* ] ) ]

<pool configuration 1> ::= (see CREATE DATASOURCE JDBC)

<pool configuration 2> ::= (see CREATE DATASOURCE JDBC)

<pool configuration 3> ::= (see CREATE DATASOURCE JDBC)

<pool sub-configuration 1> ::= (see CREATE DATASOURCE JDBC)

<pool sub-configuration 2> ::= (see CREATE DATASOURCE JDBC)

<source configuration property> ::= (see CREATE DATASOURCE JDBC)

<credentials> ::= (see CREATE DATASOURCE JDBC)

<create_table_template> ::= (see CREATE DATASOURCE JDBC)

<default_create_table_templates> ::= (see CREATE DATASOURCE JDBC)
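
For example, the following statement modifies the data source sketched above, changing only the fetch size and the description (the name and values are illustrative):

ALTER DATASOURCE JDBC ds_oracle_sales
    FETCHSIZE = 2000
    DESCRIPTION = 'Sales database (tuned fetch size)';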

Explanation of some of the parameters of these statements:

  • OR REPLACE: if present and a data source with the same name already exists, its definition is replaced with the new one.

  • FOLDER: name of the folder where the data source will be stored.

  • DRIVERCLASSNAME: The driver class to be used for connection to the data source.

  • DATABASEURI: The connection URL to the database.

  • The authentication methods available to connect to a database are the following:

    1. <credentials>: provide the USERNAME and USERPASSWORD to connect to the database to execute queries and for the introspection process (i.e. to display the tables/views of the database in the “Create base view” dialog of the data source).

    2. USE_KERBEROS ( <kerberos_credentials> ): use Kerberos to connect to the database to execute queries and for the introspection process (i.e. to display the tables/views of the database in the “Create base view” dialog of the data source).

    3. <credentials> USE_KERBEROS_AT_RUNTIME ( <kerberos_credentials> ): use Kerberos to connect to the database to execute queries, but use login and password for the introspection process.

    4. <credentials> USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> ): use login and password to connect to the database to execute queries, but use Kerberos for the introspection process.
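
      For example, option 4 could be written as follows; this is a sketch and the user names, password, principal and keytab path are illustrative:

      USERNAME = 'app_user' USERPASSWORD = 'app_password'
      USE_KERBEROS_AT_INTROSPECTION (
          KRB_USERNAME = 'introspect@EXAMPLE.COM'
          KRB_KEYTAB = '/opt/denodo/keytabs/introspect.keytab'
      )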

    5. WITH PASS-THROUGH SESSION CREDENTIALS ( <pass_through_options> ): use login and password or AWS IAM credentials for the introspection process and the credentials of the client that connected to the Virtual DataPort server to execute queries. The credentials used to run queries can be Kerberos or login/password depending on the authentication the client used to connect to the Virtual DataPort server. If <pass_through_options> specifies USE_KERBEROS, the Server will use the login/password to create the Kerberos ticket.

    6. USE_AWS_IAM_CREDENTIALS ( <aws_iam_credentials> ): use AWS IAM credentials to connect to the database to execute queries and for the introspection process (i.e. to display the tables/views of the database in the “Create base view” dialog of the data source).

    7. <credentials> USE_AWS_IAM_CREDENTIALS_AT_RUNTIME ( <aws_iam_credentials> ): use AWS IAM credentials to connect to the database to execute queries, but use login and password for the introspection process.

    8. <credentials> USE_AWS_IAM_CREDENTIALS_AT_INTROSPECTION ( <aws_iam_credentials> ): use login and password to connect to the database to execute queries, but use AWS IAM credentials for the introspection process.

      If you create a data source with the option WITH PASS-THROUGH SESSION CREDENTIALS but want to query a view of this data source with credentials other than the ones used to connect to the Server, add the parameters USERNAME and PASSWORD to the CONTEXT clause of the query. These two parameters are only taken into account when the data source has been created with that option.

      For example, if the data source of view1 has been created with the option WITH PASS-THROUGH SESSION CREDENTIALS and you execute this:

      SELECT *
      FROM view1
      CONTEXT(
          USERNAME = 'admin'
        , PASSWORD = 'd4GvpKA5BiwoGUFrnH92DNq5TTNKWw58I86PVH2tQIs/q1RH9CkCoJj57NnQUlmvgvvVnBvlaH8NFSDM0x5fWCJiAvyia70oxiUWbToKkHl3ztgH1hZLcQiqkpXT/oYd' ENCRYPTED
        , DOMAIN = 'ACME_DOMAIN')
      

      the Server will connect to the database with the user name 'admin', the password provided in the CONTEXT clause and the domain 'ACME_DOMAIN', ignoring the credentials the user used to connect to the Server.

      It is mandatory to add the token ENCRYPTED and enter the password encrypted. To encrypt the password, execute the statement ENCRYPT_PASSWORD. For example:

      ENCRYPT_PASSWORD 'my_secret_password';
      
    9. WITH PASS-THROUGH SESSION CREDENTIALS ( ) USE_KERBEROS_AT_INTROSPECTION ( <kerberos_credentials> ): use Kerberos authentication for the introspection process and the credentials of the client that connected to the Virtual DataPort server to execute queries. The credentials used to run queries can be Kerberos or login/password depending on the authentication the client used to connect to the Virtual DataPort server.
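
      For example, a minimal sketch of this option (the principal and keytab path are illustrative):

      WITH PASS-THROUGH SESSION CREDENTIALS ( )
      USE_KERBEROS_AT_INTROSPECTION (
          KRB_USERNAME = 'vdp-introspect@EXAMPLE.COM'
          KRB_KEYTAB = '/opt/denodo/keytabs/vdp-introspect.keytab'
      )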

    Important

    There are important implications of using “pass-through session credentials”. To read about them, search for “pass-through credentials” in the section Importing JDBC Sources of the Administration Guide.

  • WITH PROXY_CONNECTIONS: if present, the data source will use the feature “proxy authentication” of Oracle. The section How Oracle Proxy Authentication Works of the Administration Guide explains how this feature works.

  • CLASSPATH: Path to the JAR file containing the JDBC driver for the specified source (optional).

  • DATABASENAME and DATABASEVERSION: Name and version of the database to be accessed.

  • ISOLATIONLEVEL: sets the desired isolation level for the queries and transactions executed in the database. If not present, the data source uses the default isolation level of the database.

  • IGNORETRAILINGSPACES: If true, the Server removes the space characters at the end of text type values of the results returned by these data source’s views.

  • FETCHSIZE: gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed.

  • With the clause CREDENTIALS_VAULT, you can customize the Credentials Vault configuration at the Data Source level, which is particularly useful when you need to manage credentials for multiple databases with varying configurations. By allowing different configurations for different data sources, you can achieve a more secure and efficient credential management process.
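
    For example, a sketch of this clause using the agentless CyberArk provider; the application ID, key path and key password are illustrative:

    CREDENTIALS_VAULT (
        STATUS ON
        PROVIDER CYBERARK (
            APPLICATION_ID = 'denodo_vdp'
            AGENT_LESS (
                CLIENT_KEY = '/opt/denodo/cyberark/client.key'
                CLIENT_KEY_PASSWORD = 'client_key_password'
            )
        )
    )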

  • BATCHINSERTSIZE: when the data source has to insert several rows into its database, it can insert them in batches; this value sets the number of INSERT statements per batch.

    This value is used only when inserting rows into the database of this data source as a result of moving data from another data source into this one. See more about this in the section Data Movement of the Administration Guide.

  • The section The Pool of Connections of the JDBC Data Sources below explains the parameters of “<pool configuration 1>”, “<pool configuration 2>” and “<pool configuration 3>”.

  • Parameters of the pool of prepared statements:

    • POOLPREPAREDSTATEMENTS: if true, the pool of prepared statements is enabled.

    • MAXOPENPREPAREDSTATEMENTS: maximum number of opened prepared statements. Only taken into account if POOLPREPAREDSTATEMENTS is true.

  • USEEXTERNALTABLES: options regarding the use of the database’s proprietary APIs to read and write data from/to this database.

    Writing data into a database using its proprietary API is called “Bulk data load”.

    There are two reasons that the Server may write data using the bulk data load APIs of the database:

    1. When performing a data movement. The section Data Movement of the Administration Guide explains what data movements are.

    2. When loading the cache of a table.

    The section Bulk Data Load of the Administration Guide explains in detail how this process works.

    • ONMOVEREAD (only taken into account when the database is Netezza): if true, when the Execution Engine reads data from the Netezza database to perform a data movement, it will do so using its “External tables” feature. Setting this to true is equivalent to selecting the check box “Use external tables for data movement” of the “Read settings”, on the “Read & Write” tab of the data source.

    • ONMOVEWRITE: if true, when the Execution Engine writes data to this database to perform a data movement, it does so using its proprietary API. Setting this to true is equivalent to selecting the check box “Use Bulk Data Load APIs” of the “Write settings”, on the “Read & Write” tab of the data source.
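
    For example, a sketch that enables both options on an existing data source (the data source name is illustrative):

    ALTER DATASOURCE JDBC ds_netezza_dwh
        USEEXTERNALTABLES (
            ONMOVEREAD = true,
            ONMOVEWRITE = true
        );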

  • DATAMOVEMENT_TARGET: if true, the data source can be the target of a data movement. Setting this to true is equivalent to selecting the check box “Can be data movement target” of the “Write settings”, on the “Read & Write” tab of the data source.

  • BULK_LOAD_CONFIGURATION: settings of the bulk data load API of the database. The settings you can indicate depend on the database adapter, so it is better to change them from the administration tool.

  • CREATE_TABLE_TEMPLATES: property to configure the SQL commands that VDP executes to create the cache tables, data movement tables, remote tables, summaries, temporary views, and materialized views. DATA_SOURCE_DEFAULT refers to the default table creation command, which Denodo uses when the user has not set a custom one. See section Cache Table Creation Templates for more information.

Conventions to modify the table creation templates using an ALTER DATASOURCE command:

  • If the ALTER DATASOURCE command does not include the CREATE_TABLE_TEMPLATES token, the VDP server will keep the current values for the templates.

  • If the ALTER DATASOURCE command includes an empty section: CREATE_TABLE_TEMPLATES (), the VDP server will remove all the data source templates, and the CREATE_TABLE_TEMPLATES section will not appear in the data source VQL.

  • If the ALTER DATASOURCE command includes the CREATE_TABLE_TEMPLATES section specifying one or more templates, those will replace the existing ones in the data source.
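
For example, a sketch that replaces the templates of a data source and makes one of them the default for cache tables; the data source name, template name, DDL and the parameters @{name}, @{fields} and @{level} are all illustrative:

ALTER DATASOURCE JDBC ds_oracle_sales
    CREATE_TABLE_TEMPLATES (
        compressed_cache = 'CREATE TABLE @{name} ( @{fields} ) COMPRESS @{level}'
                           DEFAULT ( 'level' = 'BASIC' )
    )
    DEFAULT CREATE_TABLE_TEMPLATES (
        CACHE = compressed_cache
    );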

  • Data source configuration parameters (SOURCECONFIGURATION): Virtual DataPort allows you to indicate specific characteristics of the underlying data source so that they are taken into account when executing statements on it. See section Data Source Configuration Properties for further details.

  • PROPERTIES: list of name/value pairs that will be passed to the JDBC driver when creating connections to this database. If the token VAR is present, the property is treated as an expression instead of a literal.
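
    For example (the property shown is specific to the Oracle JDBC driver and is used here only for illustration):

    PROPERTIES ('oracle.jdbc.ReadTimeout' = '60000')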

  • KERBEROSPROPERTIES: list of name/value pairs that will be passed to the JDBC driver when creating connections to this database. The properties on this list configure the Kerberos authentication mechanism between the Virtual DataPort server and the database. See the section Connecting to a JDBC Source with Kerberos Authentication of the Administration Guide for more details about Kerberos authentication in JDBC data sources.

  • The following optional parameters specify information about the data source. The cost optimizer uses these values when evaluating the cost of an execution plan. The default values are usually correct so you should not specify these parameters unless you have a deep knowledge of the cost optimizer.

    • TRANSFER_RATE_FACTOR: relative measure of the speed of the network connection between the Denodo server and the data source. Use the default value (1) if the data source is accessible through a conventional 100 Mbps LAN. Use higher values for faster networks and lower values for data sources accessible through a WAN.

    • PROCESSING_UNITS: In parallel databases, the number of SPUs.

    • CPUS_PER_PROCESSING_UNIT: In parallel databases, the number of CPUs per SPU.

    • INTERNAL_TRANSFER_RATE: transfer rate in kilobytes per millisecond.

The Pool of Connections of the JDBC Data Sources

The JDBC data sources of Denodo are created with a connection pool. A connection pool is a cache of connections to a database. At run time, when a query sent by an application involves sending a query to this data source, the execution engine requests a connection from the pool of the data source. If there are idle connections in the pool, the pool returns one; if there are none, it creates one. Then, the execution engine executes the query on this connection and, once the query finishes, returns the connection to the pool. This connection is now available for the next query.

Using a connection pool significantly improves the performance of the queries that involve a database, because creating a new connection is costly in terms of time and resources: for each new connection, Denodo has to establish a network connection with the database, and the database has to authenticate the client, allocate memory for the new connection, etc. If the connection is already created, this process is avoided.

Connection pools provide the following benefits:

  • Reduce the number of times a new connection has to be created.

  • Reduce the time of getting a connection to execute a query.

  • Provide a way of limiting the number of connections opened to a database.

For these reasons, new JDBC data sources are created with a connection pool and, in general, it should not be disabled.

The exact behavior of the connection pool depends on its settings.

Parameters of the Connection Pool of a JDBC Data Source

The connection pool of a JDBC data source is configured with the following parameters. Most of them cannot be changed from the administration tool, only with the statement ALTER DATASOURCE JDBC.
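
For example, a sketch that sets the full pool configuration of an existing data source with ALTER DATASOURCE JDBC, following “pool configuration 3” and “pool sub-configuration 1” of the syntax above (the data source name and the values are illustrative):

ALTER DATASOURCE JDBC ds_oracle_sales
    VALIDATIONQUERY = 'SELECT 1 FROM dual'
    INITIALSIZE = 4
    MAXIDLE = 20
    MINIDLE = 2
    MAXACTIVE = 20
    EXHAUSTEDACTION = 1
    TESTONBORROW = true
    TESTONRETURN = false
    TESTWHILEIDLE = true
    TIMEBETWEENEVICTION = 300000
    NUMTESTPEREVICTION = 3
    MINEVICTABLETIME = 1800000;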

  • MAXACTIVE (default value: 20): maximum number of connections that can be opened at a given time. These are the connections currently used to run queries plus the connections sitting idle in the pool. Note that the MAXACTIVE limit is applied to the XA and non-XA connection pools independently, so a default value of 20 results in a total limit of 40 connections (20 XA connections and 20 non-XA connections).

    You can change this parameter graphically, in the Connection Pool Configuration dialog, field Maximum number of active connections.

    • If negative (e.g. -1), there is no limit to the number of connections in the pool at one time.

    • Enter 0 to disable the connection pool of the data source. In this case, the other parameters of the connection pool are ignored.

      In this case, for each query, the data source opens a new connection to the database and, once the query finishes, it closes the connection. If the query is part of a transaction (the client executed BEGIN earlier but has not executed COMMIT or ROLLBACK yet), the execution engine keeps the connection open until the transaction finishes. That is because transactions in Denodo are distributed, so the transaction needs to be committed or rolled back in all the databases involved in the query. See more about transactions in the section Transactions in Virtual DataPort.

    Denodo uses a different connection pool for transactional queries (the XA pool) and offers additional properties to control and limit it specifically. When both the MAXACTIVE and the XA.MAXACTIVE parameters are set, the lower limit applies. To change the XA limit for a specific data source, set the following property:

    -- This command changes the maximum number of XA connections to 10 for the data source 'dataSourceName' in the database 'databaseName'.
    
    SET 'com.denodo.vdb.misc.datasource.JDBCDataSource.pool.databaseName.dataSourceName.xa.maxActive' = '10';
    

    MAXACTIVE is ignored if EXHAUSTEDACTION is set to 2 (see the explanation below).

  • MAXIDLE (default value: -1): maximum number of connections that can sit idle in the pool at any time.

    • If -1 or lower, there is no limit to the number of connections that may sit idle at one time.

    • If 0 or higher and the parameters TIMEBETWEENEVICTION and MINEVICTABLETIME are greater than 0, the connections that have been idle for more than a certain time will be closed.

  • MINIDLE (default value: 0): minimum number of idle connections in the pool. Useful to guarantee that there are always idle connections in the pool, so a query never has to wait for the pool to open a new connection. To satisfy MINIDLE, the pool will never create more than MAXACTIVE connections. In the XA pool, the value of MINIDLE is the same as INITIALSIZE.

  • EXHAUSTEDACTION (default value: 1): specifies the behavior of the pool when the pool is empty (all the connections are running queries). The possible values are:

    • 0: the data source returns an error and the query will fail.

    • 1: the pool waits until a connection is available, or the maximum waiting time is reached. If the maximum wait time is reached, the query fails.

      The default maximum wait time is 30 seconds but it can be changed. If the maximum wait time is -1, the data source will wait indefinitely until a connection is available. To change this property, execute the following command from the VQL Shell and restart the Virtual DataPort server.

      -- This represents 20 seconds because the value of this property is in milliseconds.
      
      SET 'com.denodo.vdb.misc.datasource.JDBCDataSource.pool.maxWait' = '20000';
      

      After changing the maximum wait time and restarting, the existing JDBC data sources keep their previous maximum wait time; if they are modified, their maximum wait time changes to the new value.

    • 2: the data source will create a new connection if there are no idle connections. This makes MAXACTIVE meaningless.

    If the XA connection pool is enabled, it only supports the value 1 (EXHAUSTED_BLOCK).

  • INITIALSIZE (default value: 4): number of connections with which the pool is initialized.

    This value is ignored when the data source is created with the authentication option “Pass-through session credentials”. With this option, the Server will create one pool of connections for each user account that connects to this database and initially, these pools will only have one connection regardless of the value of INITIALSIZE. This is to prevent creating too many unnecessary connections.

    This property can be changed to specify a different value for XA connections.

    -- This command changes the number of XA connections with which the pool is initialized to 1 for the data source 'dataSourceName' in the database 'databaseName'.
    
    SET 'com.denodo.vdb.misc.datasource.JDBCDataSource.pool.databaseName.dataSourceName.xa.initialIdle' = '1';
    
  • VALIDATIONQUERY (default value: depends on the adapter): SQL query executed by the connection pool to check if a connection is still valid; also known as “ping query”. It is only used when at least one of TESTONBORROW, TESTONRETURN or TESTWHILEIDLE is true.

    You can change this parameter graphically, in the Connection Pool Configuration dialog, field Ping query.

    When you create a JDBC data source using the administration tool and select a database adapter other than Generic, the data source is automatically created with the appropriate validation query for that database. However, if you select Generic, you have to provide a valid ping query. This query is executed often, so its footprint has to be as low as possible. In addition, the table queried by the ping query has to exist and the user needs privileges to execute it.

    Examples of ping queries:

    SELECT 1
    SELECT 1 FROM dual
    SELECT 1 FROM view LIMIT 0
    

    The ping query can contain the interpolation variable USER_NAME. At runtime, this variable will be replaced with the user name of the Virtual DataPort user that runs the query.

    For example, let us say that the ping query of a JDBC data source is CALL DBMS_SESSION.SET_IDENTIFIER('@{USER_NAME}').

    If the user scott executes a query that involves this data source, Virtual DataPort will execute the query CALL DBMS_SESSION.SET_IDENTIFIER('scott') over the connection returned by the pool, to check that this connection is still valid and not stale.

    If TESTONBORROW is true, the ping query is executed every time before the actual query is sent to the database. Being able to include in the ping query the name of the user that is currently running the query can be useful to execute a statement on the database or for auditing purposes.

  • TESTONBORROW (default value: true): if true and the parameter VALIDATIONQUERY is not empty, the pool will execute the VALIDATIONQUERY on the selected connection before returning it to the execution engine. If the VALIDATIONQUERY fails, the pool will discard the connection and select another one. If there are no more idle connections, it will create one.

    This property can be changed to specify a different value for XA connections.

    -- This command disables the ping query for XA connections for the data source 'dataSourceName' in the database 'databaseName'.
    
    SET 'com.denodo.vdb.misc.datasource.JDBCDataSource.pool.databaseName.dataSourceName.xa.testOnBorrow' = 'false';
    
  • TESTONRETURN (default value: false): if true and the parameter VALIDATIONQUERY is not empty, the pool will execute the VALIDATIONQUERY on the connections returned to the pool. If the VALIDATIONQUERY fails, the pool will discard the connection. This parameter is not supported in XA connection pools.


The pool can be configured to periodically check that the connections in the pool are still valid and/or to close the connections that have sat idle in the pool for more than a certain period of time. To check if a connection is still valid, the pool executes the VALIDATIONQUERY. If it fails, the pool discards the connection. The main reasons for the validation query to fail are:

  1. The database has a mechanism that closes connections after a certain period of inactivity.

  2. A firewall placed in between the Denodo server and the database automatically closes any connection after a certain period of inactivity.


The main goals of enabling this option are:

  1. Make sure that the connections of the pool are valid at all times, so that when a connection is needed it can be used right away.

  2. Close connections that have not been used in a while thus freeing resources in the database.

The task of verifying connections and closing the idle ones is performed by the “connection eviction thread” (there is one per connection pool). Periodically, this thread wakes up and performs these checks.

The parameters that control the connection eviction thread are:

  • TIMEBETWEENEVICTION (default value: -1): how long, in milliseconds, the eviction thread sleeps between “runs” in which it examines idle connections. If negative, the eviction thread is not launched.

    Eviction runs contend with the execution engine for access to the pool, so if this thread runs too frequently, performance issues may result.

    With XA connections this parameter has a fixed value of 300000 milliseconds.

  • MINEVICTABLETIME (default value: 1800000 - 30 minutes): minimum amount of time in milliseconds that a connection may sit idle in the pool before it is eligible for eviction.

    When this value is less than 1, no connection is evicted from the pool due to idle time. In this case, the eviction thread only discards connections that return an error when executing the validation query.

    Even if a connection has been idle for more than MINEVICTABLETIME, it will not be closed if closing it would leave fewer than MINIDLE idle connections in the pool.

    This parameter has no effect if TIMEBETWEENEVICTION is less than 1.

    With XA connections this parameter has a fixed value of 600000 milliseconds.

  • TESTWHILEIDLE (default value: false): if true and the parameter VALIDATIONQUERY is not empty, the connections will be validated by the connection eviction thread. To validate a connection, the thread runs the validation query on the connection. If it fails, the pool drops the connection from the pool.

    This parameter has no effect if TIMEBETWEENEVICTION is less than 1.

    This parameter is not supported in XA connection pools.

  • NUMTESTPEREVICTION (default value: 3): number of connections examined in each run of the connection eviction thread. The pool does not examine all its connections in each run because, while a connection is being examined, it cannot be used by a query.

    This parameter has no effect if TIMEBETWEENEVICTION is less than 1.

    With the XA configuration, if this parameter is not set, all the connections in the pool are eligible for eviction.
