1
votes

I am trying to add partition to a hive table (partitioned by date)

My problem is that the date needs to be fetched from another table.

My query looks like :

ALTER TABLE my_table ADD IF NOT EXISTS PARTITION(server_date = (SELECT max(server_date) FROM processed_table));

When i run the query hive throws the following error:

Error: Error while compiling statement: FAILED: ParseException line 1:84 cannot recognize input near '(' 'SELECT' 'max' in constant (state=42000,code=40000)

1

1 Answers

1
votes

Hive does not allow to use functions/UDF's for the partition column.

Approach 1:

To achieve this you can run the first query and store the result in one variable and then execute the query.

server_date=$(hive -e "set hive.cli.print.header=false; select max(server_date) from processed_table;") hive -hiveconf "server_date"="$server_date" -f your_hive_script.hql

Inside your script you can use the following statement:

ALTER TABLE my_table ADD IF NOT EXISTS PARTITION(server_date =${hiveconf:server_date});

For more information on the hive variable substitution, you can refer link

Approach 2:

In this approach, you will need to create a temporary table if the partition data you are expecting is already not loaded in any other partitioned table.

Considering your data doesn't have the server_date column.

  1. Load the data into temporary table
  2. set hive.exec.dynamic.partition=true;
  3. Execute the below query:

INSERT OVERWRITE TABLE my_table PARTITION (server_date) SELECT b.column1, b.column2,........,a.server_date as server_date FROM (select max(server_date) as server_date from ) a, my_table b;