0
votes

I am creating an external table using the Hive shell and loading some data into it. When I run the SHOW TABLES command, it shows the table name, but when I run a SELECT query to display the data from that table, it does not give any output.

I also tried to find the table under /user/hive/warehouse in HDFS, but it does not show up there.

I am using the default Derby database and have not made any changes to the hive-site.xml file.

Update

I was using the incorrect file to load the data; that file was a .json file. Now I am trying to create an external table using the Hive shell and load some data into it. It gives an error saying 'Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:hdfs://localhost:9000/out_sa/part-r-00000 is not a directory or unable to create one)'

Below is the query and the data that I am trying to load in a String column.

Query

CREATE EXTERNAL TABLE twitter_Data (Comments STRING) Location 'out_sa/part-r-00000';

Sample Data

RT @arjenvanberkum: The impacts of #BigData that you may not have heard of |
Descarga los PDFs de los Cursos de Google AdWords, Analytics, Community y SEO. Infórmate! 
RT @cookovernewz: The Secret Ingredient In The Text Analytics ROI Recipe - Forbes 
RT @cookovernewz: The Secret Ingredient In The Text Analytics ROI Recipe - Forbes 
The Secret Ingredient In The Text Analytics ROI Recipe - Forbes 
1
Please show your example Dataset and the query you ran. Also, if you're using the default derby database, it's not clear what filesystem you're looking for data on (how did Hive get linked to HDFS without a config change) - OneCricketeer
I have added the query and the error and have updated my post with the error I am getting now. - Rich

1 Answer

0
votes

First, the error seems straightforward: the LOCATION must be a directory containing files that adhere to the provided schema. It cannot be a single file.

Second, the file does not appear to be JSON; it is plain lines of text.

Third, it's not clear how Hive is linked to HDFS, so I suggest giving the full path to the NameNode.
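For example, you can confirm that the path is a directory (and see which files Hive would read) with the hdfs CLI. This assumes the NameNode address from your error message (localhost:9000); adjust the host and path to match your cluster:

```
# List the directory that LOCATION should point at;
# it should show part-r-00000 as a file inside out_sa/
hdfs dfs -ls hdfs://localhost:9000/out_sa/
```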

Try this

CREATE EXTERNAL TABLE IF NOT EXISTS twitter_Data (
    Comments STRING
)
ROW FORMAT DELIMITED
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://namenode.example.com:9000/out_sa/';
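Once the table exists over the directory, a quick sanity check (assuming the directory contains the part-r-00000 file from your question) is to query a few rows and the total count from the Hive shell:

```
-- Each line of the text file becomes one row in the Comments column
SELECT * FROM twitter_Data LIMIT 5;
SELECT COUNT(*) FROM twitter_Data;
```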

If you want to do tweet/text analysis, I might suggest Spark rather than just Hive, though.