1
votes

So I have a hive external table with schema looks like this :

{
.
.
`x` string,
`y` ARRAY<struct<age:string,cId:string,dmt:string>>,
`z` string
}

So basically I need to query a column(column "y") which is array of nested json, I can see data of column "y" from hive, but data in that column seems invisible to presto, even though presto knows schema of this field, like this:

array(row(age varchar,cid varchar,dmt varchar))

As you can see presto already knows this field is array of row.

Notes:

1.The table is a hive external table.

2.I get schema of field "y" by using ODBC driver, but data is just all empty, however I can see something like this in hive : [{"age":"12","cId":"bx21hdg","dmt":"120"}]

3.Presto queries hivemetastore for schema.

4.Table was stored as parquet format.

So how can I see my data in field "y" please?

1
Are you trying to query through an ODBC driver or using the Presto CLI?Dave Cameron
Presto allows you to join on Arrays using the unnest function, but I'm not clear if that is the result you are intending?Dave Cameron

1 Answers

0
votes

Please try the below. This should work in Presto.

"If the array element is a row data type, the result is a table with one column for each row field in the element data type. The result table column data types match the corresponding array element row field data types"

select
    y,age,cid,dmt
from
    table
    cross join UNNEST(y)  AS  nested_data(age,cid,dmt)