3
votes

I'm trying to change the existing Hive external table delimiter from comma , to ctrl+A character by using Hive ALTER TABLE statement

ALTER TABLE table_name SET SERDEPROPERTIES ('field.delim' = '\u0001');

After the DDL I could see changes

show create table table_name

But when I select from Hive, the values are all NULL (underlying files in HDFS are changed to have ctrl+A delimiter).

I have repaired the table also by using msck.

Only way to see the data is dropping and re-creating the external table, can anyone please help me to understand the reason.

Table Syntax :

CREATE EXTERNAL TABLE `table_name`(
  col1,
  col2,
  col3)
PARTITIONED BY ( 
  `ing_year` int, 
  `ing_month` int, 
  `ing_day` int)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
  LINES TERMINATED BY '\n' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://location/'
3
What do you mean by "But when I select from HDFS the values are all NULL" ? Don't you mean Hive or Impala ? - cheseaux
In HIVE , Alter table is changing the delimiter but not able to select values properly. it returns null. - William R
It would also help to see the statement you used to create the table - cheseaux
table is created long back , now I am trying to change the delimiter from comma to ctrl+A. - William R

3 Answers

7
votes

An ALTER TABLE command on a partitioned table changes the default settings for future partitions.

But it will not apply to existing partitions, unless that specific command supports the CASCADE option -- but that's not the case for SET SERDEPROPERTIES; compare with column management for instance

So you must ALTER each and every existing partition with this kind of command

ALTER TABLE table_name PARTITION (ing_year=2016,ing_month=8,ing_day=31)
SET SERDEPROPERTIES ('field.delim' = '\u0001');

So now it's time for you to run a SHOW PARTITIONS, apply a couple of RegEx on the output to generate the list of commands, run these commands, and be happy ever after.

Side note: I can tell you it was REALLY painful to rename a column before the CASCADE stuff was finally implemented...

1
votes

You can not ALTER SERDER properties for an external table. What you could do is to remove link between your table and the external source. Example if is an Hbase table, you can do: 1) ALTER TABLE MY_HIVE_TABLE SET TBLPROPERTIES('hbase.table.name'='MY_HBASE_NOT_EXISTING_TABLE') MY_HBASE_NOT_EXISTING_TABLE must be a nott existing table. 2) DROP TABLE MY_HIVE_TABLE; 3) Recreate your hive table by specifing your new SERDE Properties CREATE EXTERNAL TABLE MY_HIVE_TABLE( MY_colums ) ROW FORMAT SERDE '...' WITH SERDEPROPERTIES ( ...) TBLPROPERTIES ( 'hbase.table.name'='z_app_qos_hbase_temp:MY_HBASE_GOOD_TABLE');

-1
votes

Put this command for change SERDEPROPERTIES.

alter table airlinetestcsv 
SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';