
I am trying to load data into Hive from an RDBMS using Sqoop.

Once I populate the Hive table with data and try to run a count(*), the query runs forever. Also, if I drop the (external) Hive table, delete everything from the HDFS directory, and then create a similar table, the new table gets pre-populated with the old data (the same data as in the dropped table), even after I delete everything from my HDFS directory and clear the trash as well.

Still, the data gets populated and a count(*) runs indefinitely on it.

UPDATE 1

It's a standalone Hortonworks (2.4) sandbox environment. I dropped the table from Hive and also removed the related files from HDFS. I have a script to create and load the data.

drop table employee;

and then I run the following commands:

 hadoop fs -rm -r /user/hive/warehouse/intermidiateTable/*
 hadoop fs -rm -r .Trash/Current/user/hive/warehouse/intermidiateTable/*
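
To verify that the deletes actually took effect, both directories can be listed with the standard HDFS shell (same paths as above):

 hadoop fs -ls /user/hive/warehouse/intermidiateTable
 hadoop fs -ls .Trash/Current/user/hive/warehouse/intermidiateTable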

and then I create the table using the same query as before:

create external table employee (id int, name string, account_no bigint, balance bigint, date_field timestamp, created_by string, created_date string,batch_id int, updated_by string, updated_date string)
            row format delimited
            fields terminated by ','
            lines terminated by '\n'
            location '/user/hive/warehouse/intermidiateTable';
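
As a sanity check, the location the new table actually points to can be confirmed with a standard Hive command (nothing here is specific to my setup):

 describe formatted employee;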

When I run a select query, the table gets populated with the older data. Also, a select count(*) runs indefinitely.

Can somebody recommend a solution?

Can you share the exact shell and HQL (Hive Query Language) commands you used? Did you use a DROP statement or only delete the files? Please also give the characteristics of the data - format, number of rows, number of files, total size - plus basic cluster details: Hadoop mode, cluster size, node characteristics. You can add this info to the question. - Ivan Georgiev
Kindly check UPDATE 1 in the question @IvanGeorgiev - Ishant Sharma
Run `describe formatted employee` in Hive to get the file path and make sure you are deleting the right files, then run a `dfs -ls` command to see if any files are there. - Ravinder Karra
Tried all this before even posting the question. I don't know how the Hive table gets populated magically. I cleared the trash, tried different locations, everything. - Ishant Sharma

1 Answer


If you are creating the external table inside the warehouse directory itself, then what is the purpose of declaring the table as 'external'?

Aren't external tables supposed to be outside the warehouse directory, so that you have control over the data files rather than Hive itself?
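
A minimal sketch of the difference, using a hypothetical location /data/employee_ext outside the warehouse (the dfs command below runs from the Hive CLI):

 -- managed table: DROP TABLE deletes both the metadata and the data files
 create table employee_managed (id int, name string);
 drop table employee_managed;

 -- external table: DROP TABLE deletes only the metadata; the files under
 -- the table's location stay on HDFS and must be removed manually
 create external table employee_ext (id int, name string)
     row format delimited fields terminated by ','
     location '/data/employee_ext';
 drop table employee_ext;
 dfs -rm -r /data/employee_ext;   -- manual cleanup, from the Hive CLI

This is also why the old data seems to "reappear": as long as files are still sitting under /user/hive/warehouse/intermidiateTable, any new external table created over that location will read them again.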