1
votes

I have raw files in HDFS in the format

name=ABC age=10 Location=QWERTY
name=DEF age=15 Location=IWIORS

How do I import data from these flat files into a Hive table with columns 'name' and 'location' only?

What have you done so far? - Ashish Ratan
I have a constraint where data is being published (in real time via Flume) in the above format. I want to do batch analysis on the data in Hive, and that is why I need to import it. - user1771840
So basically, what do you want to ask? Are you facing a problem making the key-value pairs, or a problem with the Hive insertion? - Ashish Ratan
OK, I don't have any idea about Hive :( - Ashish Ratan

1 Answer

1
votes

You can do the following.

In table declaration, use:

ROW FORMAT DELIMITED
        FIELDS TERMINATED BY '\t'          -- any delimiter that never occurs in the data
        COLLECTION ITEMS TERMINATED BY ' ' -- space separates the key=value pairs
        MAP KEYS TERMINATED BY '='

Your table will then have a single column whose data type is Map.

You can then retrieve data from that single column by key.
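Putting that together, a minimal sketch (the table name `raw_events`, the column name `attributes`, and the HDFS path are assumptions, not from the question; note that map keys are case-sensitive, so `'Location'` must match the capitalization in the data):

```sql
-- Hypothetical table and column names; adjust to your setup.
CREATE TABLE raw_events (
  attributes MAP<STRING, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'           -- any delimiter absent from the data
  COLLECTION ITEMS TERMINATED BY ' '  -- space separates the key=value pairs
  MAP KEYS TERMINATED BY '='
STORED AS TEXTFILE;

-- Hypothetical path to the raw files.
LOAD DATA INPATH '/path/to/raw/files' INTO TABLE raw_events;

-- Pull out just the two keys you care about:
SELECT attributes['name']     AS name,
       attributes['Location'] AS location
FROM raw_events;
```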

Other option: write your own SerDe. The link below explains the process for JSON data. I am sure you can customize it for your requirements: http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
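If you only ever need `name` and `Location`, Hive's built-in `RegexSerDe` may be simpler than writing a custom SerDe. A sketch, assuming the lines always look exactly like the samples above (the table name is hypothetical, and the regex would need adjusting if the field order or set of keys varies):

```sql
CREATE TABLE parsed_events (
  name     STRING,
  location STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- one capture group per column: grab the values after name= and
  -- Location=, skipping age= entirely
  "input.regex" = "name=(\\S+) age=\\S+ Location=(\\S+)"
)
STORED AS TEXTFILE;
```

This gives you real `name` and `location` columns directly, at the cost of a brittle pattern: rows that do not match the regex come back as NULLs.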