I need to store data as a SequenceFile with block compression. Below is the table that will be stored as a SequenceFile.
create table lip_data_quality
( buyer_id bigint,
total_chkout bigint,
total_errpds bigint
)
partitioned by (dt string)
row format delimited fields terminated by '\t'
stored as sequencefile
location '/apps/hdmi-technology/b_apdpds/lip-data-quality'
;
I am getting the data in the above table in compressed form by setting these properties:
set mapred.output.compress=true;
set mapred.output.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.LzoCodec;
So my question is: is that all I need to enable BLOCK compression with SequenceFile, or is there anything else I need to do? I was following this article: Hadoop
Any suggestions will be appreciated.
Update:
I load the data into the above table by putting everything in a .hql file and running that file from the shell command prompt, changing the partition date each time I run it.
set mapred.output.compress=true;
set mapred.output.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.LzoCodec;
insert overwrite table lip_data_quality partition (dt='20120712')
SELECT query here which will give the output for the above table.
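
Since only the partition date changes between runs, the script could take the date as a variable instead of being edited each time. Below is a minimal sketch, assuming the Hive version in use supports `hiveconf` variable substitution. The file name `load_lip_data_quality.hql`, the variable name `dt`, and the source table `lip_source` are hypothetical and stand in for the real script and SELECT query.

```sql
-- Hypothetical invocation from the shell (file name is an assumption):
--   hive --hiveconf dt=20120712 -f load_lip_data_quality.hql

-- Same compression settings as in the script above
set mapred.output.compress=true;
set mapred.output.compression.type=BLOCK;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.LzoCodec;

-- ${hiveconf:dt} is replaced with the value passed on the command line
insert overwrite table lip_data_quality partition (dt='${hiveconf:dt}')
-- lip_source is a hypothetical table; substitute the real SELECT query here
SELECT buyer_id, total_chkout, total_errpds
FROM lip_source;
```

With this layout, loading another day only needs a different `dt` value on the command line, and the compression settings stay in one place.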