We recently upgraded our server from CDH 5 to CDH 6. When inserting data into TIMESTAMP columns of Parquet tables using Spark, there is a difference in how the data is inserted.
CDH 5:
HIVE:
If we insert 2019-01-30 into a TIMESTAMP column of a Parquet table and select the data from Hive, the value is '2019-01-30 00:00:00 0'
CDH 6:
HIVE:
If we insert 2019-01-30 into a TIMESTAMP column of a Parquet table and select the data from Hive, the value is '2019-01-30 04:00:00'
IMPALA:
If we insert 2019-01-30 into a TIMESTAMP column of a Parquet table and select the data from Impala, the value is '2019-01-30 04:00:00'
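The 4-hour difference is consistent with a local midnight value being converted to UTC somewhere between the write and the read. Here is a minimal sketch of that arithmetic in plain Python (not Spark). It assumes the writer's local zone is a fixed UTC-4 offset, which is only inferred from the numbers above and is not a confirmed cluster setting. The names ASSUMED_LOCAL_ZONE and as_utc_wall_clock are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the writer's local zone is a fixed UTC-4 offset. This is inferred
# from the observed 4-hour shift; the real cluster time zone is not stated.
ASSUMED_LOCAL_ZONE = timezone(timedelta(hours=-4))


def as_utc_wall_clock(local_text: str) -> str:
    """Interpret local_text as local wall-clock time and render it in UTC."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=ASSUMED_LOCAL_ZONE)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


inserted = "2019-01-30 00:00:00"        # what 2019-01-30 means as a timestamp
shifted = as_utc_wall_clock(inserted)   # the same instant shown as UTC
print(shifted)  # 2019-01-30 04:00:00
```

If this is what happens, a reader that displays the stored UTC value without converting it back to local time would show 04:00:00 instead of 00:00:00.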
Please let me know if there are any Spark properties we can use. My primary goal is for the Hive value in CDH 6 to match the Hive value in CDH 5 and, if possible, for a select from Impala to return '2019-01-30 00:00:00' as well.