I have developed a Hive query that uses lateral views and get_json_object to unpack some JSON. The query works well enough from a JDBC client (DbVisualizer), but when run as Spark SQL from a Java application against the same data, it returns nothing.
If I create an array literal to mimic the expected unpacked JSON, the query works:
array('Plan A','Plan B','Plan C','Plan D')
but if I do the unpacking with this expression, it doesn't:
split(regexp_replace(get_json_object('{"product_offer":[{"productName":"Plan A"},{"productName":"Plan B"},{"productName":"Plan C"},{"productName":"Plan D"}]}', '$.product_offer.productName'), '\\[|\\]|"', ''), ',' )
Run individually in DbVisualizer, the two expressions produce the same output and compare as equal.
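For reference, here is a plain-Java sketch of what the regexp_replace/split pipeline is meant to do, assuming get_json_object returns the JSON-array-style string shown below (which is what Hive produces for this input):

```java
import java.util.Arrays;

public class UnpackJson {
    public static void main(String[] args) {
        // Assumed output of get_json_object(json, '$.product_offer.productName')
        // in Hive for the JSON in the question:
        String extracted = "[\"Plan A\",\"Plan B\",\"Plan C\",\"Plan D\"]";

        // Equivalent of regexp_replace(..., '\\[|\\]|"', ''):
        // strip the brackets and double quotes.
        String cleaned = extracted.replaceAll("\\[|\\]|\"", "");

        // Equivalent of split(..., ','): the array the lateral view explodes.
        String[] products = cleaned.split(",");

        System.out.println(Arrays.toString(products));
        // → [Plan A, Plan B, Plan C, Plan D]
    }
}
```

If the Spark run returns nothing, it may be worth checking whether get_json_object itself returns null there, since everything downstream of it would then be empty.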
This is a cut-down version of the query, escaped for Java, which works against a test table with one row:
"SELECT pageid , exp.* , exp2.* , exp3.* from stephen \n" +
"lateral view outer posexplode( split(regexp_replace(get_json_object('{\"product_offer\":[{\"productName\":\"Plan A\"},{\"productName\":\"Plan B\"},{\"productName\":\"Plan C\"},{\"productName\":\"Plan D\"}]}', '$.product_offer.productName'), '\\\\[|\\\\]|\"', ''), ',' ) ) exp as seqn, product_name\n" +
"lateral view outer posexplode( array('aaaa','bbb','ccc','ddd')) exp2 as seqo, product_offer\n" +
"lateral view outer posexplode( array('Delete','Add','Add','Delete')) exp3 as seqa, product_action\n" +
"where (seqn=seqo)\n" +
" and (seqn=seqa)\n"
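As a sanity check on the escaping, the regex literal in the Java source goes through two rounds of unescaping before the regex engine sees it. A minimal sketch of the layers:

```java
public class EscapeLayers {
    public static void main(String[] args) {
        // The Java compiler turns the source literal "'\\\\[|\\\\]|\"'"
        // into the SQL text  '\\[|\\]|"'
        String sqlText = "'\\\\[|\\\\]|\"'";
        System.out.println(sqlText);

        // Hive (and Spark SQL with default settings) then unescapes the SQL
        // string literal once more, so the regex engine finally sees
        //   \[|\]|"
        // which matches '[', ']' and '"' -- the characters being stripped.
    }
}
```

One Spark-side setting worth checking is spark.sql.parser.escapedStringLiterals: when set to true, Spark does not unescape string literals a second time, which changes what the regex engine receives compared with Hive.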
I am using Java 8 and Hive version '2.2.0 rda840b0f8fa99cab9f004810cd22abc207493cae'.
No errors are logged.