1
votes

I want to transfer data from a database such as MySQL (on RDS) to S3 using AWS Glue ETL. I am having difficulty doing this, and the documentation has not been much help. I found this related question here on Stack Overflow:

Could we use AWS Glue just copy a file from one S3 folder to another S3 folder?

Based on that question, it seems that Glue cannot use an S3 bucket as a data destination, although it may be able to use one as a data source. I hope I am wrong about this. If you build an ETL tool for AWS, one of the first basics is for it to transfer data to and from an S3 bucket, the major form of storage on AWS.

I hope someone can help with this.

2 Answers

1
votes

You can add a Glue connection to your RDS instance and then use the Spark ETL script to write the data to S3.

You'll have to first crawl the database table using Glue Crawler. This will create a table in the Data Catalog which can be used in the job to transfer the data to S3. If you do not wish to perform any transformation, you may directly use the UI steps for autogenerated ETL scripts.
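As a rough illustration of the job described above, here is a minimal sketch of a Glue PySpark script that reads the crawled table from the Data Catalog and writes it to S3 as Parquet. The catalog database `rds_catalog_db`, the table `mydb_orders`, the bucket `my-example-bucket`, and the helper `s3_output_path` are all hypothetical names chosen for the example; substitute whatever your crawler and account actually use. It also assumes the script runs as a Glue job, which passes `--JOB_NAME` on the command line.

```python
import sys


def s3_output_path(bucket: str, prefix: str) -> str:
    """Build the s3:// URI the job writes to, normalizing stray slashes.

    Hypothetical helper for this sketch, not part of the Glue API.
    """
    return f"s3://{bucket.strip('/')}/{prefix.strip('/')}/"


def run_job(argv):
    # Glue-only imports live inside the function so the helper above
    # can still be imported outside a Glue environment.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the table that the crawler registered in the Data Catalog.
    # Database and table names are assumptions for this example.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="rds_catalog_db",
        table_name="mydb_orders",
    )

    # Write the data to S3 as Parquet. Bucket and prefix are assumptions.
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options={
            "path": s3_output_path("my-example-bucket", "exports/orders")
        },
        format="parquet",
    )
    job.commit()


# Only run the job when invoked the way Glue invokes it (with --JOB_NAME).
if __name__ == "__main__" and "--JOB_NAME" in sys.argv:
    run_job(sys.argv)
```

The job's IAM role needs write access to the target bucket, and the job must be configured with the Glue connection to the RDS instance so it can reach the database.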

I have also written a blog post on how to migrate relational databases to Amazon S3 using AWS Glue. Let me know if it addresses your query.

https://ujjwalbhardwaj.me/post/migrate-relational-databases-to-amazon-s3-using-aws-glue

0
votes

Have you tried https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-copyrdstos3.html?

You can use AWS Data Pipeline. It has standard templates for both full and incremental copies from RDS to S3.