I created a non-partitioned Hive table and, using a SELECT query, inserted its data into a partitioned Hive table.
After following the above link, my partitioned table contains duplicate values. Below are the steps.
This is my sample employee dataset: link1
I tried the following queries: link2
But after updating a value in the Hive table (setting the salary of Steven, EmployeeID 19, to 50000) with this query:
INSERT OVERWRITE TABLE Unm_Parti_Trail PARTITION (Department = 'A') SELECT employeeid, firstname, designation, CASE WHEN employeeid = 19 THEN 50000 ELSE salary END AS salary FROM Unm_Parti_Trail;
the values are getting duplicated:
7 Nirmal Tech 12000 A
7 Nirmal Tech 12000 B
Nirmal belongs to Department A only, but the row is duplicated in Department B.
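One possible cause, assuming Department is the partition column of Unm_Parti_Trail: the SELECT in the query above has no filter on Department, so rows read from every partition would be written into the target partition. A minimal sketch of the same update restricted to partition A (not verified against the actual table definition, which is only given in the linked queries):

```sql
-- Sketch only: assumes Unm_Parti_Trail is partitioned by Department
-- and has the columns employeeid, firstname, designation, salary.
INSERT OVERWRITE TABLE Unm_Parti_Trail PARTITION (Department = 'A')
SELECT
  employeeid,
  firstname,
  designation,
  -- Change the salary only for EmployeeID 19; keep all other salaries as they are
  CASE WHEN employeeid = 19 THEN 50000 ELSE salary END AS salary
FROM Unm_Parti_Trail
-- Read back only the rows that already belong to the partition being overwritten
WHERE Department = 'A';
```

With this form, the other partitions are left untouched; each would need its own INSERT OVERWRITE with a matching filter if its rows had to change too.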
Am I doing anything wrong?
Please suggest.