0 votes

We have fifteen embedded newline characters in a field of a source S3 file. The field size in the target Redshift table is VARCHAR(5096), and the field length in the source file is 5089 bytes. We are escaping each of the fifteen newline characters with a backslash \, as required by the ESCAPE option of the COPY command. Our expectation with the ESCAPE option is that the backslash \ we inserted before each newline character will be stripped before the data is loaded into the target table in Redshift. However, when we use the COPY command with the ESCAPE option, we get the following error:

err_code:1204 - String length exceeds DDL length.

Is there a way to keep the added backslash \ characters from being counted toward the target column length when loading into Redshift?

Note: When we truncated the above source field in the file to 4000 bytes and inserted the backslash \ before the newline characters, the COPY command with the ESCAPE option loaded the field into Redshift successfully. Also, the backslash \ characters were not loaded into Redshift, as expected.
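For reference, a minimal sketch of the COPY invocation described above; the table name, S3 path, IAM role, and pipe delimiter are placeholders, not details from the actual job:

    -- Hypothetical names and paths: substitute the real table, bucket, and credentials.
    COPY target_table
    FROM 's3://example-bucket/path/source-file.txt'
    IAM_ROLE 'arn:aws:iam::111111111111:role/ExampleRedshiftRole'
    DELIMITER '|'
    ESCAPE;   -- backslash-escaped newlines are loaded as data; the backslash itself is removed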


2 Answers

0 votes

You could extend your VARCHAR length to allow for more characters.
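A sketch of widening the column in place, with hypothetical table and column names; Redshift supports increasing the length of an existing VARCHAR column with ALTER COLUMN ... TYPE, so the table does not need to be recreated:

    -- Hypothetical table/column names; pick a length that fits the longest value in bytes.
    ALTER TABLE target_table
        ALTER COLUMN long_text_col TYPE VARCHAR(7096);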

Or, you could use the TRUNCATECOLUMNS option to load as much as possible without generating an error.
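A sketch of the same COPY with TRUNCATECOLUMNS added (placeholder names again); values longer than the column definition are silently cut to fit rather than failing the load:

    -- Hypothetical names/paths; TRUNCATECOLUMNS trims oversized values instead of raising err_code 1204.
    COPY target_table
    FROM 's3://example-bucket/path/source-file.txt'
    IAM_ROLE 'arn:aws:iam::111111111111:role/ExampleRedshiftRole'
    DELIMITER '|'
    ESCAPE
    TRUNCATECOLUMNS;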

0 votes

Our understanding of the above issue was incorrect. The backslashes \ that we had inserted were not causing the error "err_code:1204 - String length exceeds DDL length". The ESCAPE option of the COPY command was in fact not counting the inserted backslash characters toward the target limit, and it was removing them from the loaded value correctly.

The actual issue we were facing was that some of the characters we were trying to load were multibyte UTF-8 characters. Since we were incorrectly assuming them to be 1 byte each, the size of the target field turned out to be insufficient (Redshift VARCHAR lengths are measured in bytes, not characters). We increased the length of the target field from VARCHAR(5096) to VARCHAR(7096), after which all data was loaded successfully.
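One way to confirm this kind of mismatch (a sketch using a string literal and a hypothetical staging table): Redshift's LEN counts characters while OCTET_LENGTH counts bytes, and it is the byte count that must fit within the VARCHAR definition.

    -- Multibyte characters make the byte count larger than the character count.
    SELECT LEN('São Paulo')          AS char_count,   -- 9 characters
           OCTET_LENGTH('São Paulo') AS byte_count;   -- 10 bytes: 'ã' is 2 bytes in UTF-8

    -- Hypothetical staging table/column: find rows whose byte length would overflow VARCHAR(5096).
    SELECT long_text_col
    FROM   staging_table
    WHERE  OCTET_LENGTH(long_text_col) > 5096;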