0
votes

I am trying to load a data file which has a multi-character delimeter('|~|') to Amazon Redshift DB using the COPY command. Redshift COPY command does not allow for multi-character delimiters.

My data looks like this -

John|~|23|~|Los Angeles|~|USA
Jade|~|27|~|New York|~|USA

When I try to use multi-characters in the COPY command I get "COPY delimiter must be a single character;" error.

My COPY command looks like this -

copy test_data from 's3://abcd/testFile' 
credentials 'aws_access_key_id=<redacted>;aws_secret_access_key=<redacted>' 
delimiter '|~|' 
null as '\0'
acceptinvchars
ignoreheader as 1
MAXERROR 1;

I cannot replace or edit the source files since they are very large(>100GB), so I need a solution within the AWS Redshift paradigm.

1

1 Answers

2
votes

If you can't edit the source files, and you can't use a multi-character delimiter, then use | as the delimiter and add additional (fake) columns that will be loaded with ~.

You can then either ignore these columns, or use CREATE TABLE AS to copy the data to a new table but without those columns.

Or, use CREATE VIEW to make a version of that table without the fake columns.