ClickHouse

To configure a ClickHouse data source to perform bulk data loads, follow these steps:

  1. In Design Studio, edit the JDBC data source, click the Read & Write tab, and select Use bulk data load APIs.

  2. In the File System drop-down, select the file system that will be used to upload the data files.

Note

The available options are S3 and HDFS/GCS/Azure Blob Storage.

S3

  1. In the S3 URI field, enter the URI of the S3 bucket to which Virtual DataPort will upload the data file. For example: s3://my-bucket/

  2. Configure the authentication. There are two ways:

    1. Using AWS access keys:

      • Enter the AWS access key ID and the AWS secret access key.

      • Or, to obtain the credentials from the password vault, select Obtain credentials from password vault.

      Optionally, enter the AWS IAM role ARN, that is, the Amazon Resource Name of the IAM role you want to assume. The AWS user will assume this role to get the privileges needed to interact with the bucket. Specify the role ARN in the format arn:aws:iam::<awsAccountId>:role/<roleName>.

    2. Or, obtain the AWS S3 credentials automatically from the AWS instance where this Virtual DataPort server is running. To do this, select Use Denodo AWS instance credentials. Optionally, you can also specify an AWS IAM role ARN to get the necessary privileges.
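
The URI and role ARN formats described above can be checked before entering them in Design Studio. The following is a minimal sketch in Python and is not part of Virtual DataPort: the helper names build_role_arn and is_s3_uri, the sample account ID, and the sample role name are hypothetical. It assumes the usual 12-digit AWS account ID.

```python
import re
from urllib.parse import urlparse


def build_role_arn(account_id: str, role_name: str) -> str:
    """Format an IAM role ARN as arn:aws:iam::<awsAccountId>:role/<roleName>."""
    # Assumption: AWS account IDs are 12-digit numbers.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("account_id must be exactly 12 digits")
    if not role_name:
        raise ValueError("role_name must not be empty")
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def is_s3_uri(uri: str) -> bool:
    """Return True if uri has the s3 scheme and names a bucket."""
    parsed = urlparse(uri)
    return parsed.scheme == "s3" and bool(parsed.netloc)


# Hypothetical sample values, for illustration only.
print(build_role_arn("123456789012", "bulk-load-role"))
print(is_s3_uri("s3://my-bucket/"))
```

This only checks the shape of the values. It does not verify that the bucket or the role exists, or that the role grants access to the bucket.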

HDFS/GCS/Azure Blob Storage

  • HDFS

    1. In URI, enter the URI to which Virtual DataPort will upload the data file. For example: hdfs://acme-node1.denodo.com/user/admin/data/

    2. If access to HDFS is restricted by an Apache Knox gateway, set the following in the Apache Knox configuration section:

      • Authentication: select Basic authentication (this is the only authentication method currently supported).

      • Base URI: enter the base URI of Apache Knox. For example: https://acme.com:8443/gateway/default.

      • User: specify the user name.

      • Password: enter the password.

  • GCS

    1. In URI, enter the URI to which Virtual DataPort will upload the data file. For example: gs://my-bucket/

    2. In the Hadoop properties section, set the properties hmac.access.key.id and hmac.access.key.secret to the HMAC key and HMAC secret provided by GCS.

  • Azure Blob Storage

    1. In URI, enter the URI to which Virtual DataPort will upload the data file. For example: https://<account>.blob.core.windows.net/<container>/<path>

    2. In the Hadoop properties section, set the property azure.access.key to the access key provided by Azure Blob Storage.
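
The three options above differ in the URI scheme and in the Hadoop properties they need. The following Python sketch summarizes that mapping as a pre-flight check. It is an illustration only and not part of Virtual DataPort: the function names storage_type and missing_properties are hypothetical, and the property names are the ones listed in this section.

```python
from urllib.parse import urlparse

# Hadoop property names listed in this section, keyed by storage type.
REQUIRED_HADOOP_PROPERTIES = {
    "HDFS": [],
    "GCS": ["hmac.access.key.id", "hmac.access.key.secret"],
    "Azure Blob Storage": ["azure.access.key"],
}


def storage_type(uri: str) -> str:
    """Infer the storage type from the upload URI."""
    parsed = urlparse(uri)
    host = parsed.hostname or ""
    if parsed.scheme == "hdfs":
        return "HDFS"
    if parsed.scheme == "gs":
        return "GCS"
    if parsed.scheme == "https" and host.endswith(".blob.core.windows.net"):
        return "Azure Blob Storage"
    raise ValueError(f"unrecognized upload URI: {uri}")


def missing_properties(uri: str, hadoop_properties: dict) -> list:
    """Return the required property names that are absent or empty."""
    required = REQUIRED_HADOOP_PROPERTIES[storage_type(uri)]
    return [name for name in required if not hadoop_properties.get(name)]


# Example: a GCS URI with only one of the two HMAC properties set.
print(missing_properties("gs://my-bucket/", {"hmac.access.key.id": "<id>"}))
```

An empty list means every property this section asks for is present. The check does not validate the keys themselves.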

Note

Bulk data loads using Azure Blob Storage are only supported for ClickHouse 23.5 and higher.
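
To check this requirement, compare the server version against 23.5. The Python sketch below assumes a dotted version string such as the one ClickHouse returns for SELECT version(). The function name supports_azure_bulk_load is hypothetical.

```python
def supports_azure_bulk_load(version: str) -> bool:
    """Return True if the ClickHouse version is 23.5 or higher."""
    # Compare only the major and minor components, as integers.
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (23, 5)


# Hypothetical version strings, for illustration only.
print(supports_azure_bulk_load("23.4.2.11"))
print(supports_azure_bulk_load("23.10.1.1"))
```

Comparing integer tuples rather than strings keeps 23.10 correctly ordered after 23.5.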
