Access using Flight SQL¶
Virtual DataPort implements the Arrow Flight SQL Protocol, allowing connections from clients interacting with this protocol independently of its underlying programming language.
Access using Flight SQL JDBC driver¶
JDBC (Java Database Connectivity) is a Java API that allows executing statements on relational databases, regardless of the DBMS. The Flight SQL JDBC driver is a JDBC implementation that uses the Flight SQL protocol and can interact with any database supporting Flight SQL.
Get the Flight SQL JDBC driver from:
From your installation of the Denodo Platform (
<DENODO_HOME>/tools/client-drivers/jdbc/
).
Java Version
The driver is compatible with JDK 8+. On JDK 9+, the following JVM parameter is required:
--add-opens=java.base/java.nio=ALL-UNNAMED
Driver Class Name
The class name of this driver is this:
org.apache.arrow.driver.jdbc.ArrowFlightJdbcDriver
Connection URL
The syntax of the database URL is this:
jdbc:arrow-flight-sql://<hostname>[:<port>][?<paramName>=<paramValue>[&<paramName>=<paramValue>]*]
The URI scheme must be
jdbc:arrow-flight-sql://
.<hostname>
: It is the hostname of the Flight SQL service.<port>
: Default port is 9994. It can be modified setting this property:
com.denodo.vdb.vdbinterface.server.VDBFlightImpl.port=port-value
For example:
jdbc:arrow-flight-sql://denodo-server.acme.com:9994?useEncryption=false&catalog=acme_db
Note
Note that URI values must be URI-encoded if they contain characters such as !, @, $, etc. Any
URI parameters that are not handled by the driver are passed to the Flight SQL service as gRPC headers. For example,
the previous URI will connect without encryption to a Flight SQL service running on denodo-server.acme.com
on port
9994
.
Connection Parameters¶
Additional options can be passed as query parameters. Parameter names are case-sensitive. The supported parameters are:
Parameter of the URL |
Description |
---|---|
autoCommit |
If This is useful to make sure that an application does not start transactions inadvertently. Even with this parameter set to
true, an application can start and
finish transactions by executing the
statements Take into account the limits on the duration of a transaction:
See more about the limits on the duration of transaction in the section Transactions in Virtual DataPort of the VQL Guide. Default value: |
catalog |
Database against which the user is logged in. Default value: |
disableCertificateVerification |
When TLS is enabled, whether to verify the server certificate. Default value: |
password |
The password for user/password authentication. Default value: |
queryTimeout |
Maximum time (in milliseconds) the execution of a query can take. After this period, it will throw an error. This parameter is optional. If it is not set, the query timeout has the default value (900000 milliseconds). If 0, the query does not have a timeout. This parameter sets the default timeout for all the queries. In addition, you can change the timeout for a single query by adding the parameter ‘QUERYTIMEOUT’ = ‘<value>’ to the CONTEXT clause of the query. See more about this in the section CONTEXT Clause of the VQL Guide. Default value: |
timePrecision |
Precision used for columns with
time type. Valid values are:
Default value: |
timeStampPrecision |
Precision used for columns with
timestamp and timestamptz types.
Valid values are:
Default value: |
trustStore |
When TLS is enabled, the path to the
certificate store. If not provided,
the default Java truststore is used
(usually Default value: |
trustStorePassword |
Password to the certificate truststore. Default value: |
useEncryption |
Wheter to use TLS (the default is an encrypted connection). Default value: |
user |
The username for user/password authentication. Default value: |
useSystemTrustStore |
When TLS is enabled, whether to use the system certificate store. Default value: |
Access using Flight SQL Python driver¶
Python Flight SQL driver allows to connect with Virtual DataPort from native Python applications.
This driver implements the specifications PEP 249 – Python Database API Specification v2.0, which is an API for accessing to databases, and the ADBC: Arrow Database Connectivity, a set of APIs for accessing to Arrow-native databases.
Connection parameters¶
The supported parameters are:
Parameter |
Description |
---|---|
adbc.flight.sql.client_option.tls_root_certs |
Override the root certificates used to validate the server’s TLS certificate. Default value: |
adbc.flight.sql.client_option.tls_skip_verify |
Disable verification of the server’s TLS certificate. Default value: |
adbc.flight.sql.rpc.call_header.autoCommit |
If This is useful to make sure that an application does not start transactions inadvertently. Even with this parameter set to
true, an application can start and
finish transactions by executing the
statements Take into account the limits on the duration of a transaction:
See more about the limits on the duration of transaction in the section Transactions in Virtual DataPort of the VQL Guide. Default value: |
adbc.flight.sql.rpc.call_header.database |
Database against which the user is logged in. Default value: |
adbc.flight.sql.rpc.call_header.queryTimeout |
Maximum time (in milliseconds) the execution of a query can take. After this period, it will throw an error. This parameter is optional. If it is not set, the query timeout has the default value (900000 milliseconds). If 0, the query does not have a timeout. This parameter sets the default timeout for all the queries. In addition, you can change the timeout for a single query by adding the parameter ‘QUERYTIMEOUT’ = ‘<value>’ to the CONTEXT clause of the query. See more about this in the section CONTEXT Clause of the VQL Guide. Default value: |
adbc.flight.sql.rpc.call_header.timePrecision |
Precision used for columns with
timestamp and timestamptz types.
Valid values are:
Default value: |
adbc.flight.sql.rpc.call_header.timeStampPrecision |
Precision used for columns with
time type. Valid values are:
Default value: |
password |
The password for user/password authentication. |
user |
The username for user/password authentication. |
Additional information can be found at ADBC Flight SQL Driver documentation
Notes¶
Better performance is obtained using ADBC methods from the driver such
fetchallarrow
orfetch_arrow_table
. Note that these methods returns results which allow conversions to pandas data frames.We recommend to specify the parameter
adbc.flight.sql.rpc.call_header.timePrecision
withmilliseconds
value for avoiding problems withtime
types. The typetime
withmicroseconds
precision is not supported, returning an error when introspecting a view or executing a query involving columns with that type.
Examples¶
Opening a connection.
from adbc_driver_flightsql.dbapi import connect
def get_connection():
return connect(
"grpc://denodo-server.acme.com:9994",
db_kwargs={
"username": 'user',
"password": 'password',
"adbc.flight.sql.rpc.call_header.database": 'acme_db',
"adbc.flight.sql.rpc.call_header.timePrecision": 'milliseconds',
}
, autocommit=True)
Opening a connection with TLS skipping certificate validation.
from adbc_driver_flightsql.dbapi import connect
def get_connection():
return connect(
"grpc+tls://denodo-server.acme.com:9994",
db_kwargs={
"username": 'user',
"password": 'password',
"adbc.flight.sql.client_option.tls_skip_verify": 'true',
"adbc.flight.sql.rpc.call_header.database": 'acme_db',
"adbc.flight.sql.rpc.call_header.timePrecision": 'milliseconds',
}
, autocommit=True)
Opening a connection with TLS specifying a custom certificate.
from adbc_driver_flightsql.dbapi import connect
def get_connection():
cert_file = open("/<certificate-route>/my-certificate.crt", "r")
crt = cert_file.read()
cert_file.close()
return connect(
"grpc+tls://denodo-server.acme.com:9994",
db_kwargs={
"username": 'user',
"password": 'password',
"adbc.flight.sql.client_option.tls_root_certs": crt,
"adbc.flight.sql.rpc.call_header.database": 'acme_db',
"adbc.flight.sql.rpc.call_header.timePrecision": 'milliseconds',
}
, autocommit=True)
Example executing a query and converting results to a pandas data frame.
with conn.cursor() as cur:
cur.execute("SELECT * FROM VIEW")
data = cur.fetchallarrow()
dataframe = data.to_pandas()
print(dataframe.head)