I have tried almost every solution I could find on SO, but I still get the same error. I have a CSV file in S3 and a table in Redshift.

Table structure is as below:

like_id => integer
p_id => integer
c_id => integer
date_added => timestamp (without time zone)

CSV file contains data as:

1|1234|9876|2012-08-12 03:30:00
...
...
1500|4545|87545|2019-01-15 08:10:00

When I run the COPY command in Redshift, it fails with the error Invalid timestamp format or value [YYYY-MM-DD HH:MI:SS]. I have tried almost every time and date format Redshift offers, but I am still stuck. The COPY command is below:

copy likes from 's3://<myBucketPath>/like.csv'
credentials 'aws_iam_role=<IAM-Role-Here>'
delimiter '|' region 'us-west-2'
timeformat 'YYYY-MM-DD HH:MI:SS';
-- also tried, in place of the timeformat line above:
-- timeformat 'MM/DD/YYYY HH:MI:SS'
-- dateformat as 'auto'
-- timeformat as 'auto'

NOTE: I exported the table from MySQL in CSV format; the date_added column's data type there is datetime.

Can you show us your COPY command? - John Rotenstein
@JohnRotenstein sir, the COPY command is the same as yours, but I also added the region and tried changing the time and date formats. - Muhammad Hashir Anwaar
What do you mean by "check by changing time and date formats"? Please edit your question and include your COPY command. - John Rotenstein
@JohnRotenstein sir, I have updated the question with the COPY command. Please check. - Muhammad Hashir Anwaar
@JohnRotenstein, even when I use an AWS Glue ETL job, it recognizes my date_added column as a String. In the target table I chose the timestamp data type; as a result, the job skips the date_added column and writes NULL to date_added in the database table. - Muhammad Hashir Anwaar

1 Answer

To reproduce your situation, I did the following:

  • Created a CSV file containing:
1|1234|9876|2012-08-12 03:30:00
1500|4545|87545|2019-01-15 08:10:00
  • Launched an Amazon Redshift cluster and created a table:
CREATE TABLE foo (like_id INT, p_id INT, c_id INT, date_added TIMESTAMP WITHOUT TIME ZONE)
  • Loaded the data from Amazon S3 into Redshift with:
COPY foo
FROM 's3://my-bucket/bar/'
IAM_ROLE 'arn:aws:iam::111111111111:role/my-role'
  • Selected the data:
SELECT * FROM foo

The result was:

1     1234   9876  2012-08-12 03:30:00
1500  4545  87545  2019-01-15 08:10:00

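So, it seems to have loaded with the defaults, without any need for particular settings: the pipe character is COPY's default delimiter, and YYYY-MM-DD HH:MI:SS is the default TIMEFORMAT for timestamp columns.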

I know the column holds a real timestamp (TIMESTAMP WITHOUT TIME ZONE) rather than a string, because interval arithmetic works on it:

SELECT date_added + INTERVAL '15 minutes' FROM foo

2012-08-12 03:45:00
2019-01-15 08:25:00
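
Since the two sample rows load with the defaults, the value that COPY rejects may be in one of the rows not shown in the question. A minimal sketch for finding such rows locally before uploading, assuming the file is pipe-delimited with the timestamp in the fourth field. The function name find_bad_timestamps and the bad row in the sample are hypothetical, made up for illustration. Python's strptime is not Redshift's parser, so this is only a rough pre-check.

```python
import io
from datetime import datetime

# Python equivalent of Redshift's 'YYYY-MM-DD HH:MI:SS' (24-hour clock).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def find_bad_timestamps(lines, delimiter="|", column=3):
    """Return (line_number, raw_value) for rows whose timestamp field does not parse.

    Fields are split on the raw delimiter, with no quote handling,
    so stray quotes or padding around a value show up as failures.
    """
    bad = []
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        # A row with too few fields is reported with an empty value.
        value = fields[column] if len(fields) > column else ""
        try:
            datetime.strptime(value, TIMESTAMP_FORMAT)
        except ValueError:
            bad.append((line_number, value))
    return bad


if __name__ == "__main__":
    # Two rows from the question plus one hypothetical bad row (line 2).
    sample = io.StringIO(
        "1|1234|9876|2012-08-12 03:30:00\n"
        "2|1111|2222|0000-00-00 00:00:00\n"
        "1500|4545|87545|2019-01-15 08:10:00\n"
    )
    for line_number, value in find_bad_timestamps(sample):
        print(f"line {line_number}: {value!r}")
```

To check a real export, pass an open file instead of the sample, for example find_bad_timestamps(open("like.csv")). On the Redshift side, rows rejected by COPY are also recorded in the STL_LOAD_ERRORS system table, which shows the raw field value and the reason for each failure.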