12
votes

I have a table pos.pos_inv in hdfs which is partitioned by yyyymm. Below is the query:

select DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),5), 
       to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),yyyymm 
   from pos.pos_inv inv 
      INNER JOIN pos.POSActvyBrdg Brdg ON Brdg.EIS_POSActvyBrdgId = Inv.EIS_POSActvyBrdgId 
      where to_date(from_unixtime(unix_timestamp(Inv.nrmlzdwkenddt, 'MM/dd/yyyy'))) 
       BETWEEN DATE_SUB(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6) 
        and DATE_ADD(to_date(from_unixtime(unix_timestamp(Inv.actvydt, 'MM/dd/yyyy'))),6) 
        and inv.yyyymm=201501

I have provided the partition value for the query as 201501, but still i get the error"

 Error while compiling statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "inv" Table "pos_inv"

(schema)The partition, yyyymm is int type and actvydt is date stored as string type.

3
jeff, which version of hive you are using , I think there are some open jira issues (issues.apache.org/jira/browse/HIVE-4905, issues.apache.org/jira/browse/HIVE-10454 ) related to this issue when partition predicate column is used in the combination of Join, if you are using older version of hive try setting mapred mode to nonstrict and run the query set hive.mapred.mode=nonstrict - Aditya
If i use set hive.mapred.mode=nonstrict , then the whole data set is scanned rather than the particular partition value. - jeff
jeff, I thought its a workaround just for now , may be this patches are updated in latest version , which version of hive is yours , it it possible to try in the higher version? - Aditya
Workaround: Wrap the offending table in a subquery with the partition predicate, (select * from table where partition > value) this avoids the error and avoids scanning the entire table. - justin cress
If you you use the nonstrict workaround for queries sent with beeline from the command-line (like beeline -u "jdbc:..." -e "select ...") you'll need to add it on the command-line: --hivevar hive.mapred.mode=unstrict - DouglasDD

3 Answers

19
votes

This happens because hive is set to strict mode. this allow the partition table to access the respective partition /folder in hdfs .

  set hive.mapred.mode=unstrict;  it will work 
1
votes

In your query error it is said: No partition predicate found for Alias "inv" Table "pos_inv".

So you must put the where clause for the fields of the partitioned table (for pos_inv), and not for the other one (inv), as you've done.

1
votes

set hive.mapred.mode=unstrict allows you access the whole data rather than the particular partitons. In some case read whole dataset is necessary, such as: rank() over