//Hive-1.2.1000.2.6.1.0-129 We are trying to run INSERT OVERWRITE on the test5 table, which has multiple partitions. According to the documentation (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), INSERT OVERWRITE will overwrite any existing data in the table or partition. However, we still see some of the old data after the INSERT OVERWRITE query runs. Below is a sample execution with the table contents after each statement.
//Spark-2.1.1 We get the same output when running the statements through HiveContext in Spark-2.1.1.
CREATE TABLE dbtest.test5 (emp_id INT) PARTITIONED BY (depart_id INT,depart_name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION 'externalpath';
INSERT INTO TABLE dbtest.test5 PARTITION (depart_id,depart_name) SELECT emp_id,depart_id,depart_name from dbtest.tempTableHive1;
4 123 Dev
5 123 Dev
6 123 Test
7 567 Test
INSERT INTO TABLE dbtest.test5 PARTITION (depart_id,depart_name) SELECT emp_id,depart_id,depart_name from dbtest.tempTableHive2;
4 123 Dev
5 123 Dev
1 123 Dev
2 123 Dev
6 123 Test
3 123 Test
7 567 Test
INSERT OVERWRITE TABLE dbtest.test5 PARTITION (depart_id,depart_name) SELECT emp_id,depart_id,depart_name from dbtest.tempTableHive3;
8 123 Dev
9 123 Dev
10 123 Dev
6 123 Test
3 123 Test
7 567 Test
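One check that might narrow this down is sketched below. It assumes that, with dynamic partitions, INSERT OVERWRITE replaces only the partitions that actually appear in the SELECT result. That would match the output above if tempTableHive3 contains only rows for depart_id=123, depart_name='Dev', but its contents are not shown here, so this is an assumption rather than a confirmed cause.

```sql
-- Which (depart_id, depart_name) combinations does the source of the
-- INSERT OVERWRITE produce? Under the assumption above, only these
-- partitions of dbtest.test5 would be rewritten.
SELECT DISTINCT depart_id, depart_name
FROM dbtest.tempTableHive3;

-- Which partitions exist in the target table? Any partition listed here
-- that is missing from the result above would keep its old rows.
SHOW PARTITIONS dbtest.test5;
```

If the first query returns only (123, Dev), the rows left in the (123, Test) and (567, Test) partitions would be consistent with those partitions never being touched by the overwrite.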
Is there something wrong with the code, or is this an Apache Hive problem?