I have some questions about skew join in Hive.
1. When does Hive use a common join to process the data? I only see a map join after I set the properties below:
- set hive.optimize.skewjoin=true;
- set hive.mapjoin.smalltable.filesize=2;
2. Why doesn't skew join work with left join?
Below are the tables and the SQL:
tmp.skew_large_table columns: imei, imsi, mac, phone, data_date
total rows: 2,900,808
skew key : 868407035454956 670081
-----------
tmp.test_skew_small_table columns: imei, package, data_date
total rows: 8,576,164
skew key : 868407035454956 10461
-----------
sql:
select a.*,b.*
from tmp.skew_large_table a
join
tmp.test_skew_small_table b
on a.imei=b.imei;
set hive.auto.convert.join=false;
is a more obvious way to switch off map join. – leftjoin
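
For reference, a minimal sketch of how that suggestion could be combined with the skew join setting from the question and checked with EXPLAIN. The property names are standard Hive configuration keys. The hive.skewjoin.key line is an addition that is not in the original post; it is shown at Hive's default value of 100000 and is only an assumption about what this session uses. Whether the resulting plan actually shows a common join depends on the Hive version and execution engine, so treat this as something to try rather than a confirmed fix.

```sql
-- Do not let Hive auto-convert the join into a map join.
set hive.auto.convert.join=false;

-- Enable runtime skew join handling (same setting as in the question).
set hive.optimize.skewjoin=true;

-- Assumption: rows-per-key threshold above which a key is treated as skewed.
-- 100000 is the Hive default; it is not stated in the original post.
set hive.skewjoin.key=100000;

-- Print the plan instead of running the query, to see which join operator is chosen.
explain
select a.*, b.*
from tmp.skew_large_table a
join
tmp.test_skew_small_table b
on a.imei = b.imei;
```

In the EXPLAIN output, a reduce-side "Join Operator" usually indicates a common join, while "Map Join Operator" indicates a map join.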