
Denodo to S3 connection at the bucket level (not at the object level)

Hello Denodo Team, I need to connect from Denodo to an S3 bucket. The S3 bucket receives daily incremental files for a specific object (the same structure for all files). For example, let us say the S3 bucket contains Campaign data: I will continue to receive incremental Campaign data on a daily basis, as files like campaign20200501, campaign20200502, campaign20200503, etc. I was able to establish a connection from Denodo to a single S3 object. However, I need to connect at the bucket level and then create a base Campaign view over all the Campaign data, so that whenever I call the base Campaign view, it returns all the Campaign data up to that point in time. My question is whether Denodo has such a capability. One alternative is to use Athena to create such a view and have Denodo connect to the Athena view, but I would hate to add an extra layer if Denodo can already do this. Please let me know whether Denodo has this capability.
user
26-04-2020 17:37:53 -0400

3 Answers

Hi,

I believe you are using the [Distributed File System Custom Wrapper](https://community.denodo.com/docs/html/document/denodoconnects/7.0/Denodo%20Distributed%20File%20System%20Custom%20Wrapper%20-%20User%20Manual) to access the delimited files in S3 through Denodo Virtual DataPort. If the schema of all the files is the same and the file names follow a pattern, as in your case (campaign<date>), you can access multiple files at once using the metacharacter "*". The pattern matching works even if your files are organized into folders and files, say one folder per month with each day's file inside. For demonstration purposes, assume the following structure in my S3 file storage:

```
My Bucket
  Campaigns
    Jan
      Campaign20200101.csv
      Campaign20200102.csv
      Campaign20200103.csv
      ...
    Feb
      Campaign20200201.csv
      ...
```

For the case above, I used the following when creating the base view; the S3 connection configuration (URL, bucket name, access keys) is all set up during data source creation:

```
Path = "/Campaigns/"
File name pattern: (.*)/Campaign(.*).csv
```

With this configuration, my base view was able to cycle through all the folders under Campaigns and read the files matching the pattern Campaign<date>.csv, producing the combined results for all the files.

Note: The Distributed File System Custom Wrapper is a freely distributed component, but downloading it requires access to the Denodo Support Site.

Hope this helps!
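As a side note, the "File name pattern" above is a regular expression, so you can sanity-check it against your own object keys before creating the base view. The sketch below does this with Python's `re` module; the list of keys is hypothetical and only mirrors the folder layout from the answer, not anything from a real bucket:

```python
import re

# Hypothetical object keys under the "Campaigns" path, following the
# month-folder / daily-file layout shown in the answer.
keys = [
    "Jan/Campaign20200101.csv",
    "Jan/Campaign20200102.csv",
    "Feb/Campaign20200201.csv",
    "Feb/notes.txt",  # does not match the pattern and would be skipped
]

# The same expression used as the wrapper's "File name pattern".
pattern = re.compile(r"(.*)/Campaign(.*)\.csv")

# Keep only the keys the pattern matches in full, as the wrapper would.
matched = [k for k in keys if pattern.fullmatch(k)]
```

Here `matched` keeps the three daily Campaign files and drops `notes.txt`, which is the behavior you want from the base view: new daily files that follow the naming convention are picked up automatically.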
Denodo Team
27-04-2020 05:57:58 -0400
Thank you, I appreciate the quick response. I have a follow-up question: in this example, if we decide in the future to add a few columns at the end of the Campaign file, will this still work? Or should I create a new base view for the new structure (potentially with a new file name format) and then create a derived view to join the two formats?
user
27-04-2020 09:50:05 -0400
Hi,

I tried adding new columns at the end of the file. I needed to create a new base view to accommodate the new schema and union it with the previous view. Since Virtual DataPort unions are [extended](https://community.denodo.com/docs/html/browse/7.0/vdp/administration/creating_derived_views/creating_union_views/creating_union_views#creating-union-views), the schema difference was handled by filling the fields missing from each schema with NULL. I also observed that if I add the extra fields to all the files, all I need to do is a "[Source Refresh](https://community.denodo.com/docs/html/browse/7.0/vdp/administration/creating_data_sources_and_base_views/source_refresh/source_refresh)" and the schema is adjusted accordingly; it even let me choose a different folder containing all the new extended files.

Hope this helps!
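To illustrate what an extended union does with mismatched schemas, here is a minimal Python sketch (an analogy only, not Denodo's implementation): the combined schema is the union of both views' columns, and rows missing a column get None, standing in for NULL:

```python
def extended_union(rows_a, rows_b):
    """Sketch of an 'extended' union: the output schema is the union of
    both input schemas, and any field a row lacks is filled with None
    (NULL), mirroring the behavior described in the answer."""
    schema = []
    for row in rows_a + rows_b:
        for col in row:
            if col not in schema:
                schema.append(col)
    return [{col: row.get(col) for col in schema} for row in rows_a + rows_b]

# Hypothetical rows: the old file format lacks the new "budget" column.
old_rows = [{"campaign_id": 1, "name": "Spring"}]
new_rows = [{"campaign_id": 2, "name": "Summer", "budget": 500}]

combined = extended_union(old_rows, new_rows)
```

After the union, the old-format row carries `budget = None`, while the new-format row keeps its value, so a single derived view can serve both file generations.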
Denodo Team
11-05-2020 08:48:34 -0400