0 votes

I am creating a Glue job (Python shell) to export data from Redshift and store it in S3. But how would I automate/trigger the download of the file from S3 to the local network drive so the 3rd party vendor can pick it up?

Without using Glue, I could create a Python utility that runs on a local server to extract data from Redshift as a file and save it on the local network drive, but I wanted to implement this framework in the cloud to avoid a dependency on the local server.

The AWS CLI sync command won't help, because once the vendor picks up the file, I should not put it in the local folder again.

Please suggest good alternatives.

Do you have an existing SFTP server that you need to use, or could you use a new server to provide file transfer to the 3rd party vendor? – jscott
Can the 3rd party simply grab it from S3? Security can be handled either by using their own AWS credentials or a pre-signed URL. All systems these days know how to download from a URL. – John Rotenstein
@JohnRotenstein I have to check with the interface team as they handle the SFTP transfer to the vendor. Sounds like I can use a pre-signed URL if they are able to download it. Thanks – Phani
@jscott The interface team maintains copying the file from the local folder to the SFTP server to send files to the 3rd party, so I have little control over it. – Phani

1 Answer

1 vote

If the interface team can use S3 API or CLI to get objects from S3 to put on the SFTP server, granting them S3 access through an IAM user or role would probably be the simplest solution. The interface team could write a script that periodically gets the list of S3 objects created after a specified date and copies them to the SFTP server.
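A minimal sketch of such a polling script, assuming the interface team has boto3 available and an IAM user or role with read access to the bucket. The bucket, prefix, and destination paths below are placeholders, not names from the question:

```python
from datetime import datetime, timezone


def select_new(objects, since):
    """Return keys of objects modified after `since`.

    `objects` has the same shape as the 'Contents' entries
    returned by boto3's list_objects_v2."""
    return [o["Key"] for o in objects if o["LastModified"] > since]


def download_new(bucket, prefix, since, dest_dir):
    """List the bucket and download anything newer than `since`.

    boto3 is imported lazily so the pure helper above can be
    used and tested without AWS credentials configured."""
    import os
    import boto3  # assumes credentials via IAM user/role

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for key in select_new(page.get("Contents", []), since):
            local_path = os.path.join(dest_dir, os.path.basename(key))
            s3.download_file(bucket, key, local_path)
            print(f"downloaded {key} -> {local_path}")


# Example usage (hypothetical names):
#   cutoff = datetime(2020, 1, 1, tzinfo=timezone.utc)
#   download_new("export-bucket", "redshift-exports/", cutoff, "/mnt/vendor-drop")
```

After each run, the script could record the newest `LastModified` it saw and use that as the next run's cutoff, so each file is copied to the SFTP server exactly once.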

If they can't use S3 API or CLI, you could use signed URLs. You'd still need to communicate the S3 object URLs to the interface team. A queue would be a good solution for that. But if they can use an AWS SQS client, I think it's likely they could just use the S3 API to find new objects and retrieve them.
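A sketch of that approach: generate a pre-signed URL for a new object and publish it to an SQS queue the interface team polls. The queue URL and message field names are assumptions, not anything from the question:

```python
import json


def build_message(bucket, key, url):
    """Build the queue message body announcing a new export file.

    The field names here are just a suggested convention for the
    interface team's consumer."""
    return json.dumps({"bucket": bucket, "key": key, "download_url": url})


def announce_object(bucket, key, queue_url, expires_in=3600):
    """Generate a pre-signed GET URL for the object and push it to SQS.

    boto3 is imported lazily; assumes AWS credentials are configured
    and the caller may sign URLs for the bucket."""
    import boto3

    s3 = boto3.client("s3")
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # URL validity in seconds
    )
    boto3.client("sqs").send_message(
        QueueUrl=queue_url,
        MessageBody=build_message(bucket, key, url),
    )
```

Note that the pre-signed URL is only valid while `expires_in` has not elapsed, so the expiry should comfortably exceed the interface team's polling interval.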

It's not clear to me who controls the SFTP server, whether it's your interface team or the 3rd party vendor. If you can push files to the SFTP server yourself, you could create a S3 event notification that runs a Lambda function to copy the object to the SFTP server every time a new object is created in the S3 bucket.
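A sketch of that Lambda function, using paramiko for the SFTP push. The hostname, credentials, and remote path are placeholders; in practice the credentials would come from Secrets Manager, and paramiko would need to be bundled in the deployment package or a Lambda layer:

```python
import urllib.parse


def records_to_objects(event):
    """Extract (bucket, key) pairs from an S3 event notification.

    Object keys arrive URL-encoded in the event payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        pairs.append((s3["bucket"]["name"], key))
    return pairs


def lambda_handler(event, context):
    """Copy each newly created object to the SFTP server.

    Host and credentials below are hypothetical examples."""
    import boto3
    import paramiko

    s3 = boto3.client("s3")
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="vendor", password="secret")  # use Secrets Manager in practice
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        for bucket, key in records_to_objects(event):
            filename = key.rsplit("/", 1)[-1]
            local = f"/tmp/{filename}"  # Lambda's writable scratch space
            s3.download_file(bucket, key, local)
            sftp.put(local, f"/outgoing/{filename}")
    finally:
        sftp.close()
        transport.close()
```

This keeps the whole pipeline in AWS: the Glue job writes to S3, the event notification fires, and the Lambda delivers the file without any local server in the loop.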