I am trying to load data into Hive from an RDBMS using Sqoop.
Once I populate the Hive table with data and run a count(*), the query runs indefinitely. Also, if I drop the (external) Hive table, delete everything from its HDFS directory, clear the trash, and then create a similar table, the new table comes pre-populated with the old data (the same data as in the dropped table).
The old data still shows up, and a count(*) on it still runs indefinitely.
UPDATE 1
It's a standalone Hortonworks (2.4) sandbox environment. I dropped the table from Hive and also removed the related files from HDFS. I have a script to create the table and load the data.
drop table employee;
and then I run the following commands:
hadoop fs -rm -r /user/hive/warehouse/intermidiateTable/*
and
hadoop fs -rm -r .Trash/Current/user/hive/warehouse/intermidiateTable/*
and then I create the table again using this same query:
create external table employee (id int, name string, account_no bigint, balance bigint, date_field timestamp, created_by string, created_date string, batch_id int, updated_by string, updated_date string)
row format delimited
fields terminated by ','
lines terminated by '\n'
location '/user/hive/warehouse/intermidiateTable';
and when I run a select query, the table is already populated with the older data. Also, a select count(*) runs indefinitely.
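One way to confirm whether the location is really empty before the table is recreated is sketched below. This is only a diagnostic sketch, not part of the script above: the helper name check_table_dir and the TABLE_DIR variable are made up for illustration, and it assumes the hadoop and hive command-line clients are on the PATH of the sandbox.

```shell
# Hypothetical helper (not part of the script above): shows what HDFS and Hive
# currently see for the table location used in the create statement.
TABLE_DIR=/user/hive/warehouse/intermidiateTable

check_table_dir() {
  # Recursively list anything still stored under the table location.
  hadoop fs -ls -R "$TABLE_DIR"
  # Print the directory count, file count and total bytes for the location.
  hadoop fs -count "$TABLE_DIR"
  # Show the location Hive has recorded for the table (only works while the table exists).
  hive -e "describe formatted employee;"
}
```

Calling check_table_dir right after the rm commands, and again after the create external table statement, would show whether files are reappearing under the directory or whether the table points at a different location than expected.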
Can somebody recommend a solution?