
I am trying to repeatedly unload data from Redshift into an S3 bucket, with a different folder for each run. Is there a way to do this from the Redshift UNLOAD command?

unload ('select * from venue') to 's3://mybucket/tickit/venue_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

I want the S3 bucket to have these subfolders:

s3://mybucket/tickit/venue_12/venue_0000_part_00
s3://mybucket/tickit/venue_21/venue_0000_part_00
s3://mybucket/tickit/venue_34/venue_0000_part_00

Here the * in venue_* would come from the output of the SELECT query in the UNLOAD command.


2 Answers


You have to design a metadata table with the list of tables, and add conditions to that metadata table if required. Then:

1. With the help of Python (or another scripting language), read the metadata and build an UNLOAD command for each table, pointing at its own bucket or prefix (generally the table name as the bucket name, or any other format you prefer).
2. Execute those commands to unload the data to the respective buckets.
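A minimal sketch of that approach, assuming a hypothetical metadata table unload_metadata(table_name, s3_prefix) and a psycopg2 connection to the cluster (the endpoint and credentials below are placeholders):

# Read a metadata table and UNLOAD each listed table to its own S3 prefix.
# unload_metadata(table_name, s3_prefix) is a hypothetical table you maintain yourself.
import psycopg2

IAM_ROLE = 'arn:aws:iam::0123456789012:role/MyRedshiftRole'  # role from the question

conn = psycopg2.connect(
    host='my-cluster.abc123.us-east-1.redshift.amazonaws.com',  # placeholder endpoint
    port=5439, dbname='dev', user='awsuser', password='...'     # placeholder credentials
)
conn.autocommit = True
cur = conn.cursor()

# Read the list of tables and their target prefixes from the metadata table.
cur.execute("SELECT table_name, s3_prefix FROM unload_metadata")
for table_name, s3_prefix in cur.fetchall():
    # Build and run one UNLOAD per table, each writing to its own prefix.
    cur.execute(f"""
        UNLOAD ('SELECT * FROM {table_name}')
        TO '{s3_prefix}'
        IAM_ROLE '{IAM_ROLE}'
        MANIFEST
    """)

cur.close()
conn.close()

Scheduling this script (cron, Airflow, etc.) then unloads every table listed in the metadata to its respective location on each run.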


Using a Python script running on an EC2 instance, or anywhere else, you can achieve that.

For example, if you want to export every 10,000 new rows of data, you can add that condition to your UNLOAD query and keep track of the current offset in the Python program.

unload ('select * from venue WHERE ID > offset and ID <= offset+10000') to 's3://mybucket/tickit/venue_' iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole' manifest;

Then execute the above query at the desired interval, storing the data under a different file name or path each time.
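A minimal sketch of that loop, assuming the venue table has an integer id column and again using psycopg2 with placeholder connection details; the batch size and prefix pattern are illustrative:

# Unload the venue table in batches of 10,000 rows, each batch to its own S3 prefix.
import psycopg2

IAM_ROLE = 'arn:aws:iam::0123456789012:role/MyRedshiftRole'  # role from the question
BATCH_SIZE = 10000

conn = psycopg2.connect(
    host='my-cluster.abc123.us-east-1.redshift.amazonaws.com',  # placeholder endpoint
    port=5439, dbname='dev', user='awsuser', password='...'     # placeholder credentials
)
conn.autocommit = True
cur = conn.cursor()

# Find the highest id so we know when to stop.
cur.execute("SELECT MAX(id) FROM venue")
max_id = cur.fetchone()[0] or 0

offset = 0
while offset < max_id:
    # Each batch gets its own prefix, e.g. s3://mybucket/tickit/venue_10000/venue_...
    prefix = f"s3://mybucket/tickit/venue_{offset + BATCH_SIZE}/venue_"
    cur.execute(f"""
        UNLOAD ('SELECT * FROM venue WHERE id > {offset} AND id <= {offset + BATCH_SIZE}')
        TO '{prefix}'
        IAM_ROLE '{IAM_ROLE}'
        MANIFEST
    """)
    offset += BATCH_SIZE

cur.close()
conn.close()

If you need incremental runs rather than a full re-export, persist the last offset somewhere (a small control table, a file on S3, etc.) and start the loop from there on the next run.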