S3 Path
Use this type of path to obtain the data from a file or a set of files located in an S3 bucket.
For information about the Filters tab, see Compressed or Encrypted Data Sources; filters work the same way for any type of path (local, HTTP, FTP…).
Configuration
In URI, enter the path you want to obtain the data from. It can point to a file or a directory and you can use interpolation variables (see section Paths and Other Values with Interpolation Variables).
In Custom properties, you can set the same properties that you would put in the Hadoop configuration files, such as core-site.xml, to configure the S3A Hadoop connector.
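To illustrate how a set of custom properties corresponds to entries in core-site.xml, the following minimal sketch renders name/value pairs as Hadoop `<property>` elements. The property names `fs.s3a.endpoint` and `fs.s3a.path.style.access` belong to the Hadoop S3A connector; the values and the helper `to_core_site_xml` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical values; the property names come from the Hadoop S3A connector.
custom_properties = {
    "fs.s3a.endpoint": "s3.eu-west-1.amazonaws.com",
    "fs.s3a.path.style.access": "true",
}


def to_core_site_xml(properties):
    """Render name/value pairs the way they would appear in core-site.xml."""
    configuration = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(configuration)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(configuration, encoding="unicode")


core_site_xml = to_core_site_xml(custom_properties)
print(core_site_xml)
```

Each `name`/`value` pair in the generated XML is what you would enter as one custom property of the data source.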
Paths Pointing to a Directory
When you create a base view over a data source that points to a directory, Virtual DataPort infers the schema of the new view from the first file in the directory and it assumes that all the other files have the same schema.
Only for delimited-file data sources: if the URI points to a directory and you enter values in URI Pattern and File name pattern, the data source only processes the files whose full path (directories and file name) matches the regular expressions entered in the respective boxes.
For example, if you only want to process files with the extension log, enter .*\.log in the File name pattern box. If you also want to filter for logs placed in folders named logs_directory, enter .*/logs_directory in the URI Pattern box.
URI Pattern filters the full directory path. Do not include the file name; instead, use the File name pattern for that purpose.
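The effect of combining both patterns can be approximated with the short sketch below. It is not Virtual DataPort's implementation: it assumes that each pattern must match its whole string (the directory path for URI Pattern, the file name for File name pattern), and the bucket name, the `s3a://` paths, and the helper `is_processed` are hypothetical.

```python
import posixpath
import re

# Patterns taken from the example above.
URI_PATTERN = re.compile(r".*/logs_directory")
FILE_NAME_PATTERN = re.compile(r".*\.log")


def is_processed(full_path):
    """Return True if both the directory part and the file name match."""
    # Split "dir/sub/file.ext" into ("dir/sub", "file.ext").
    directory, file_name = posixpath.split(full_path)
    # Assumption: a pattern has to match the entire string, not a substring.
    return (
        URI_PATTERN.fullmatch(directory) is not None
        and FILE_NAME_PATTERN.fullmatch(file_name) is not None
    )


# Hypothetical object paths in a bucket.
paths = [
    "s3a://my-bucket/app/logs_directory/server.log",
    "s3a://my-bucket/app/logs_directory/server.txt",
    "s3a://my-bucket/app/other/server.log",
]
selected = [p for p in paths if is_processed(p)]
print(selected)
```

Under these assumptions, only the first path is selected: the second fails the File name pattern and the third fails the URI Pattern.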
Note
For XML data sources, if a Validation file has been provided, all files in the directory have to match that Schema or DTD.
Authentication
There are two ways to configure the credentials:
AWS IAM credentials: specify the AWS access key ID and the AWS secret access key. Optionally, you can enter the AWS IAM role ARN, which is the Amazon Resource Name of the IAM role you want to assume. The AWS user will assume the role to get the necessary privileges to connect to the bucket. Specify the role ARN in the form arn:aws:iam::<awsAccountId>:role/<roleName>.
Denodo AWS instance credentials: the data source automatically obtains the credentials from the AWS instance where this Virtual DataPort server is running. Optionally, you can also specify an AWS IAM role ARN to get the necessary privileges to connect to the bucket.
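As a quick sanity check of the role ARN format before entering it, a sketch along the following lines may help. It assumes the standard `aws` partition shown above and a 12-digit AWS account ID; the helper `parse_role_arn`, the account ID, and the role name are hypothetical, and the check does not verify that the role actually exists.

```python
import re

# arn:aws:iam::<awsAccountId>:role/<roleName>
# Assumption: 12-digit account ID; role name limited to the characters
# IAM accepts in role names, plus "/" to allow an optional role path.
ROLE_ARN = re.compile(
    r"arn:aws:iam::(?P<account_id>\d{12}):role/(?P<role_name>[\w+=,.@/-]+)",
    re.ASCII,
)


def parse_role_arn(arn):
    """Split a role ARN into (account_id, role_name) or raise ValueError."""
    match = ROLE_ARN.fullmatch(arn)
    if match is None:
        raise ValueError(f"Not a valid IAM role ARN: {arn!r}")
    return match["account_id"], match["role_name"]


# Hypothetical account ID and role name.
account_id, role_name = parse_role_arn("arn:aws:iam::123456789012:role/S3ReadOnly")
print(account_id, role_name)
```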
