3
votes

I'm trying run this a query, with a JOIN without ON property.

I'm running the query like:

hive -v -f  my_file.hql

I got this message:

In strict mode, cartesian product is not allowed. If you really want to perform the operation, set hive.mapred.mode=nonstrict

I update the hql file with:
set hive.mapred.mode=nonstrict on top of it.

But then I got this message:

SET hive.mapred.mode=nonstrict Query returned non-zero code: 1, cause: Cannot modify hive.mapred.mode at runtime. It is in the listof parameters that can't be modified at runtime

How I can solve this issue?

ps: I want to make this cartesian product.

How I make it happen? Where I can set this variable hive.mapred.mode works ?

1
have you tried setting it in hive-site.xml? - spijs
I don't have access to the hive-site.xml (using a shared cluster) @spijs - Alvaro Joao
Do you have a good reason why you don't want to use ON? Apparently Hive cannot optimize converting a where clause to a ON. - spijs
I'm just using another table that does not necessary match perfectly with each other that's why does not make sense for me in this case use the ON on it. @spijs - Alvaro Joao
Have you tried to set it in Hive interactive shell? If that also is not possible, it probably means that this has to be set before turning on hive-server meaning you probably can't figure this out without contacting your cluster admin. - spijs

1 Answers

12
votes

As you already know a cartesian product is not allowed in strict mode (and for good reasons). In your use case it seems like you don't have permissions to make changes to these type of hive settings.

To hack around this problem what you could do is the following. First create two new tables

create table new_1 as SELECT *,1 as join_key from table1;
create table new_2 as SELECT *,1 as join_key2 from table2;

Then join these tables on this join_key. The result will be the cartesian product since it will match each row of table1 with each row of table2.

select * from new_1 join new_2 on join_key=join_key2

Just found out that using --hiveconf solves the problem:

hive -v -f  my_file.hql --hiveconf hive.mapred.mode=nonstrict

will allow the nonstrict mode specifically for this query.