I am running a simple query in Hive that produces the following output (with a few other additional columns.
|------|-----------------------------------------------------------|
| col1 | col2 |
|------|-----------------------------------------------------------|
| A | {"variable1":123,"variable2":456,"variable3":789} |
|------|-----------------------------------------------------------|
| B | {"variable1":222,"variable2":333,"variable3":444} |
--------------------------------------------------------------------
I need to be able to parse the json string and pull out the values for each token during the SELECT statement itself so that I can perhaps incorporate a WHERE statement to return only the parts of the string that are valuable to me.
So my ultimate output might look like this:
|------------------------------------------|
| col1 |variable1 | variable2 | variable3 |
|------------------------------------------|
| A | 123 | 456 | 789 |
|------------------------------------------|
| B | 222 | 333 | 444 |
--------------------------------------------
I have tried using various functions in including SPLIT and GET_JSON_OBJECT using the argument structure specified in the examples yet all return errors such as:
No matching method for class org.apache.hadoop.hive.ql.udf.UDFJson
with (struct<...>, string). Possible choices: _FUNC_(string, string)
Could someone please tell if what I am trying to do is feasible, or explain where I am going wrong?
Thanks in advance