1
votes

I am new to hive and was trying to execute below simple sub-query example on 2 external tables I created on top of my HDFS file.

Both item_id and k_item_id are of same time bigint. Can anyone help me understand exact cause of error?

hive> SELECT item_id FROM hivedev.ext_c_oi_item2 WHERE 
             item_id NOT IN (SELECT k_item_id FROM hivedev.ext_c_ksn2);

Error:

FAILED: ParseException line 1:71 cannot recognize input near 'SELECT' 'k_item_id' 'from' in expression specification

1

1 Answers

2
votes

Hive does not support subqueries in an IN clause. Instead, you can do a a left outer join and filter on the right side being null to implement the same logic:

SELECT ext_c_oi_item2.item_id
FROM hivedev.ext_c_oi_item2
LEFT OUTER JOIN hivedev.ext_c_ksn2
ON ext_c_oi_item2.item_id = ext_c_ksn2.k_item_id
WHERE ext_c_ksn2.k_item_id IS NULL;