
JSON Parsing in Denodo

Hi support, quick question about JSON parsing. I know there are currently no native functions (please let me know if that is incorrect) for parsing JSON on the read, like you would have natively with Hive or Presto, using something like get_json_object (see the sketch below this post). I could write my own function, but before doing that, I am curious whether there are any best practices you would suggest using in the tool. Ideally Denodo would parse it on the fly, without the user having to write any code to parse it. I know there are 'flatten' options and 'JSON' type connections, but I have not gotten those to work. Here is what the data footprint looks like from my end:

1. We get data each day from a REST API. That data is just one big JSON dictionary file. The JSON string gets stored as one text-type column in Hive.
2. We have processes that fetch the new data and parse all the columns into fields; that newly parsed data is then added to a table with all the rest of the parsed data.
3. There are a few hundred million rows of parsed data in the cleansed, 'final' table.

Ideally, I would like Denodo to just look at that JSON field from Hive and auto-parse all the events into its own table. That would save me from having to ETL the data around. Is anything like this a possibility? Thanks!
user
22-08-2016 12:13:27 -0400
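
For reference, this is the kind of on-read parsing the question is referring to in Hive; the table name (raw_events), column name (payload), and JSON keys are all hypothetical:

```sql
-- Hive: pull individual keys out of a JSON string column at query time.
-- get_json_object(json_string, json_path) returns the matching value as a string.
SELECT
    get_json_object(payload, '$.event_type') AS event_type,  -- hypothetical key
    get_json_object(payload, '$.user_id')    AS user_id      -- hypothetical key
FROM raw_events;                                             -- hypothetical table
```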

3 Answers

Hi, Yes, VDP is able to parse the JSON field from Hive. I suggest creating a Hive data source using the Denodo Hive JDBC driver and creating the base view from a SQL query. When using a Hive data source, you can use get_json_object, json_tuple, and LATERAL VIEW in your query to parse the string (sketched below this post). I recommend setting the value of the “Delegate SQL Sentence as Sub Query” parameter to false in the base view (Options > Search Methods > Wrapper Source Configuration). You can take a look at the following article on connecting to Hive data sources, https://community.denodo.com/tutorials/browse/bigdata/1hive, and read the “Creating Base Views from SQL Queries” section in the Administration Guide. Hope this helps!
Denodo Team
23-08-2016 07:27:32 -0400
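
A minimal sketch of the base-view SQL the answer describes, assuming a hypothetical Hive table raw_events with a JSON string column payload and the extra columns shown; this is the kind of statement you would supply when creating the base view from a SQL query:

```sql
-- json_tuple extracts several keys in one pass; LATERAL VIEW exposes
-- them as columns alongside the original ones. All names are hypothetical.
SELECT r.load_date,
       j.event_type,
       j.user_id
FROM raw_events r
LATERAL VIEW json_tuple(r.payload, 'event_type', 'user_id') j
    AS event_type, user_id;
```

Setting “Delegate SQL Sentence as Sub Query” to false presumably keeps VDP from wrapping this statement in a subquery when it delegates to Hive, so the LATERAL VIEW clause is sent as written.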
Thanks for the info; we have used this method before. What I am more interested in is the pre-parsing of JSON objects directly from a field in a database, on read, from Denodo. I can get it to work when I give it a JSON-type data connection, but it wants a file. Here would be the best possible functionality:

1. There is one field in a database called "properties". That field holds a JSON string with about 100-200 key/value pairs.
2. When I introspect the element from Hive and bring in the table, Denodo would see that this field is JSON.
3. Denodo would unnest all the key/value pairs into columns, and I could save the table.
4. The table could then be queried by the user like any other table with X number of columns.

Is something like this possible? Thanks!
user
25-08-2016 18:15:17 -0400
Hi, If you import the JSON field as a JSON data source, you can parse it into the desired key/value columns. Then you can create a new join view combining the key/value columns with the rest of the columns from the Hive table (sketched below this post). You can then query the join view like any other table. Hope this helps!
Denodo Team
29-08-2016 11:09:31 -0400
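
A rough sketch of the join view the answer describes, assuming the JSON data source yields a parsed view (events_parsed here) that shares a key with the Hive base view (hive_events); all names are hypothetical, and in Denodo the view would typically be built graphically, but the logical shape is:

```sql
-- Combine the parsed key/value columns with the original Hive columns.
SELECT h.id,          -- shared key, hypothetical
       h.load_date,   -- column kept from the Hive table
       p.event_type,  -- columns parsed out of the JSON field
       p.user_id
FROM hive_events h
JOIN events_parsed p
  ON h.id = p.id;
```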