We have an external Hive table backed by Parquet files in an S3 location.

Our EMR cluster runs in the PDT/PST time zone. We copy this data from Hive to Vertica using the Vertica COPY command. Our Vertica cluster is also in PDT/PST.

On 3rd Nov (when the time changed to PST), EMR ran the COPY command for the above data, and the timestamps in Vertica lag the Hive values by one hour.

The data type used for the date field in both Hive and Vertica is TIMESTAMP.

Can anyone please explain why this issue is happening and how to fix it?

1 Answer

I suspect that Hive, when you declare a column as TIMESTAMP, actually behaves as if it were TIMESTAMP WITH TIMEZONE. I know for a fact that Vertica treats TIMESTAMP and TIMESTAMP WITH TIMEZONE as two distinct data types.

It is worth investigating along those lines ...
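To illustrate the kind of mismatch I mean, here is a minimal Python sketch. It is not Hive or Vertica code. It assumes that the writer stores the value as a UTC instant while PDT (UTC-7) is in effect, and that the reader turns it back into a wall-clock value using the PST offset (UTC-8). The zone name `America/Los_Angeles`, the sample date, and both helper functions are hypothetical choices made only for illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library, Python 3.9+

# Assumed zone for a cluster that "runs in PDT/PST".
LOCAL = ZoneInfo("America/Los_Angeles")


def to_utc_instant(wall_clock: datetime, tz: ZoneInfo = LOCAL) -> datetime:
    """Interpret a naive wall-clock value in tz and return the UTC instant."""
    return wall_clock.replace(tzinfo=tz).astimezone(timezone.utc)


def render_with_fixed_offset(instant: datetime, offset_hours: int) -> datetime:
    """Render a UTC instant as a naive wall-clock value at a fixed UTC offset."""
    return (instant + timedelta(hours=offset_hours)).replace(tzinfo=None)


# Hypothetical sample value, written while PDT (UTC-7) is still in effect.
original = datetime(2019, 11, 2, 10, 0, 0)
instant = to_utc_instant(original)

# Reader that applies the offset in force when the value was written.
read_as_pdt = render_with_fixed_offset(instant, -7)
# Reader that applies the offset in force after the change to PST.
read_as_pst = render_with_fixed_offset(instant, -8)

print("written    :", original)
print("UTC instant:", instant)
print("read (PDT) :", read_as_pdt)
print("read (PST) :", read_as_pst)
print("lag        :", original - read_as_pst)
```

If something like this is going on, a row that reads correctly before the changeover will come out one hour behind once one side starts applying the PST offset. Comparing the raw values on both sides for the same row should confirm or rule it out.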