0
votes

My query looks like:

SELECT * FROM a WHERE a.col NOT IN (SELECT col FROM B)

When I execute the query I get the following error:

FAILED: SemanticException [Error 10052]: In strict mode, cartesian product is not allowed. If you really want to perform the operation, set hive.mapred.mode=nonstrict

Where is the cartesian product in my query and how can I avoid this error?

2
Your question looks contradictory. Here NOT IN = cartesian product, so I think you should rather ask how to avoid NOT IN. - Al Kepp
Try joining instead - select a.* from a left join b on a.col = b.col where.col is null. - Andrew
...or a NOT EXISTS clause with correlated sub-query (which will be rewritten as a "join" against an in-memory hashmap) - Samson Scharfrichter
@stevenl, add the execution plan to your post - David דודו Markovitz

2 Answers

0
votes

you can bypass not in completely and use 'except'

SELECT * FROM a WHERE a.col except (SELECT * FROM a WHERE a.col IN (SELECT col FROM B))
0
votes

you can achieve your task using left join

select a.* from a left outer join b on a.col = b.col where.col is NULL