2
votes

I have an AWS Glue crawler that creates a Data Catalog with all the tables from an S3 directory containing Parquet files.

I need to copy the contents of these files/tables into Redshift tables. For a few tables, the Parquet data is too large for Redshift to hold: even VARCHAR(65535), the maximum, is not sufficient.

Ideally, I would like to truncate the over-long values in these tables.

How do I use the COPY command to load this data into Redshift? If I use Spectrum, I can only use INSERT INTO from the external table into the Redshift table, which I understand is slower than a bulk COPY.
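For context, this is the kind of COPY I have in mind; a minimal sketch where the schema, table, bucket path, and IAM role are placeholders:

```sql
-- Bulk-load Parquet files from S3 into an existing Redshift table.
-- Schema, table, bucket, and role names are placeholders.
COPY my_schema.my_table
FROM 's3://my-bucket/path/to/parquet/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS PARQUET;
```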


1 Answer

0
votes

You can use string instead of varchar(65535) (this can be edited in the Glue Data Catalog as well); if not, can you elaborate more on this? If the files are in Parquet, then most of the data conversion parameters that COPY provides cannot be used, such as ESCAPE, NULL AS, etc.

https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
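If truncating the over-long values is acceptable, one workaround is to read the tables through Spectrum and cut the values in the SELECT, since conversion parameters like TRUNCATECOLUMNS are not available for Parquet COPY. A sketch, assuming the crawler's Glue database is exposed as an external schema; every name here is a placeholder:

```sql
-- One-time setup: expose the Glue Data Catalog database as an external schema.
CREATE EXTERNAL SCHEMA spectrum_glue
FROM DATA CATALOG
DATABASE 'my_glue_database'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole';

-- Load into the local table, cutting the oversized column down to
-- Redshift's VARCHAR maximum with LEFT().
INSERT INTO my_schema.my_table (id, big_text)
SELECT id, LEFT(big_text, 65535)
FROM spectrum_glue.my_external_table;
```

Spectrum also has a surplus_char_handling table property that can be set to truncate over-length values at scan time; it may be worth checking the data handling section of the Redshift docs.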