partition column in hive

Question

I have to partition a Hive table by a column that is also part of the table.

For example:

Table: employee

Columns: employeeId, employeeName, employeeSalary

I have to partition the table using employeeSalary. So I write the following query:

 CREATE TABLE employee (employeeId INT, employeeName STRING, employeeSalary INT) PARTITIONED BY (ds INT);

I used the name "ds" here because Hive didn't allow me to reuse the name employeeSalary.

Is what I am doing right? Also, to insert values into the table I have to use a comma-separated file, where each row looks like: 2019,John,2000

If I partition by salary, my first partition would be all people with salary 2000. So the query would be

LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee PARTITION (ds=2000);

Again, after 100 entries with salary 2000, I have the next 500 entries with salary 4000. So I would run the query again:

LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee PARTITION (ds=4000);

Please let me know if I am right.

QuinnG · Accepted Answer · 2011-03-15T21:10:04

Here's how to create a Hive table partitioned on the column you specified:

CREATE TABLE employee (employeeId INT, employeeName STRING) PARTITIONED BY (employeeSalary INT);

The partition column is specified in the PARTITIONED BY section and must not be repeated in the regular column list.
In the Hive shell you can run describe employee; and it will show all the columns in the table, including the partition column. With your CREATE TABLE you should see 4 columns, not the 3 you are trying to get.
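As a rough sketch of that check, the statements below create the table and then inspect it. The ROW FORMAT clause is an assumption added here to match the comma-separated file mentioned in the question; everything else uses the names from the question.

```sql
-- Sketch only: table and column names come from the question.
-- The ROW FORMAT clause is an assumption, added for the comma-separated input file.
CREATE TABLE employee (employeeId INT, employeeName STRING)
PARTITIONED BY (employeeSalary INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Lists employeeId, employeeName and the partition column employeeSalary.
DESCRIBE employee;

-- Lists the partitions that exist so far (none until data has been loaded).
SHOW PARTITIONS employee;
```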

For your load command, you will want to specify a value for every partition column you write into. (I'm not very familiar with these; I'm mostly going by http://wiki.apache.org/hadoop/Hive/LanguageManual/DML#Syntax.)

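So something like this, with one statement per partition value, since a PARTITION clause takes a single value for each partition column:

LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee PARTITION (employeeSalary=2000);

Repeat the statement with employeeSalary=4000 for the file that holds the salary-4000 rows.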
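Note that LOAD DATA only copies the file into the target partition's directory and does not route rows by value, so a file that mixes salaries would have to be split beforehand. If your Hive version supports dynamic partition inserts, one alternative is to load the raw file into an unpartitioned table and let Hive pick the partition per row. This is a sketch under assumptions, not something tested against your setup: the employee_staging table is a hypothetical name, and the employee table is assumed to be partitioned by employeeSalary as above.

```sql
-- Hypothetical staging table matching the 3-field rows from the question (e.g. 2019,John,2000).
CREATE TABLE employee_staging (employeeId INT, employeeName STRING, employeeSalary INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Load the whole comma-separated file without naming a partition.
LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee_staging;

-- Allow Hive to derive partition values from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE employee PARTITION (employeeSalary)
SELECT employeeId, employeeName, employeeSalary
FROM employee_staging;
```

With this approach each distinct salary ends up in its own partition, which can mean a very large number of small partitions if salaries vary widely.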
