
I'm looking for a solution that can ingest 6 to 8 tables into RDS daily. The tables in question have specific foreign-key relationships, so these should be preserved in the database.

Currently I'm having a hard time finding a good way to load the data for those 6-8 tables into RDS programmatically. Which services are best suited for doing that?

Lambda

The data is slightly too big for Lambda's memory limits.

Data Pipeline

It's not clear how this would work with serverless Aurora, and it also requires scheduled EC2 instances (which breaks the serverless pattern).

Load S3 Data into Amazon RDS MySQL Table - AWS Data Pipeline

Glue?

Glue seems to be tailored more towards Redshift.

So I'm a bit lost on what the best solution design would be for this. Help would be appreciated.


2 Answers


You should try AWS Data Pipeline. Briefly, these are the steps:

  • Create Role and Attach S3 Bucket Policy
  • Setup Cluster Parameter Group
  • Edit Parameter Groups to use Role
  • Reboot Aurora instance

This guide, Loading Data into an Amazon Aurora MySQL, is for MySQL.

Loading data with PostgreSQL should be very similar.
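Once the role and parameter-group setup above is in place, loading a file comes down to running Aurora MySQL's `LOAD DATA FROM S3` statement. A minimal sketch using pymysql; the table, bucket, column names, and connection details are placeholder assumptions:

```python
# Sketch: running Aurora MySQL's LOAD DATA FROM S3 from Python via pymysql.
# All identifiers (table, bucket, host, credentials) are placeholders.

def build_load_statement(table, s3_uri, columns):
    """Build a LOAD DATA FROM S3 statement for a CSV file with a header row."""
    cols = ", ".join(columns)
    return (
        f"LOAD DATA FROM S3 '{s3_uri}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' "
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES "
        f"({cols})"
    )

def load_table(conn, table, s3_uri, columns):
    # Requires the cluster parameter aurora_load_from_s3_role (or
    # aws_default_s3_role) to point at a role with read access to the bucket.
    with conn.cursor() as cur:
        cur.execute(build_load_statement(table, s3_uri, columns))
    conn.commit()

if __name__ == "__main__":
    import pymysql  # third-party MySQL driver, installed separately
    conn = pymysql.connect(
        host="my-cluster.cluster-xyz.eu-west-1.rds.amazonaws.com",  # placeholder
        user="admin", password="...", database="mydb")
    load_table(conn, "orders", "s3://my-bucket/daily/orders.csv",
               ["order_id", "customer_id", "amount"])
```

Run one `load_table` call per table, in dependency order, so foreign-key constraints are satisfied.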


You can certainly use AWS Glue. It's true that Glue has some bias towards Redshift, but it offers both PySpark and Python shell jobs, which can be used to do almost anything. Think of it as a Lambda without the 15-minute time constraint: write whatever data-moving logic you want in Python.

Since Aurora Serverless is hosted in a VPC, you might have to create some VPC gateway endpoints to reach certain services once your Glue job runs in the same VPC, but that's a one-time setup.

I actually had quite a similar use case and used Glue to load data programmatically from S3 to Aurora Serverless (MySQL):

  • Created a JDBC connection to Aurora Serverless from the Glue console
  • Wrote a Python shell Glue job to read data from S3 and load it into the Aurora DB using pymysql. (This used the connection from step 1, which meant the job ran in the same VPC as the database; so to reach S3, I had to create a gateway endpoint.)
  • Created a workflow with this job by adding a start trigger and failure handling
  • Wrote an S3→SQS notification and a Lambda that fires the Glue workflow whenever there are new messages in the queue.
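The Glue job in step 2 can be sketched roughly as below. The bucket, key, table, and connection details are placeholders; in a real job you would pull them from job arguments or the Glue connection:

```python
# Sketch of a Glue Python shell job: copy a CSV from S3 into Aurora via pymysql.
# Bucket, key, table, and connection details are placeholder assumptions.
import csv
import io

def rows_from_csv(text):
    """Parse CSV text (with a header row) into a header list and row tuples."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, [tuple(row) for row in reader]

def load_into_aurora(conn, table, header, rows):
    placeholders = ", ".join(["%s"] * len(header))
    cols = ", ".join(header)
    sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"
    with conn.cursor() as cur:
        cur.executemany(sql, rows)
    conn.commit()

if __name__ == "__main__":
    import boto3    # available in Glue jobs
    import pymysql  # third-party driver, bundled with the job
    body = boto3.client("s3").get_object(
        Bucket="my-bucket", Key="daily/orders.csv")["Body"].read().decode("utf-8")
    header, rows = rows_from_csv(body)
    conn = pymysql.connect(host="my-cluster.cluster-xyz.eu-west-1.rds.amazonaws.com",
                           user="admin", password="...", database="mydb")
    load_into_aurora(conn, "orders", header, rows)
```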

You can refer to this post for more details.
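The Lambda from the last step can be as small as this sketch; the workflow name is a placeholder, and the optional `glue_client` parameter is only there to make the function testable:

```python
# Sketch of the SQS-triggered Lambda that kicks off the Glue workflow.
# "s3-to-aurora-workflow" is a placeholder workflow name.

def handler(event, context, glue_client=None):
    if glue_client is None:
        import boto3  # provided in the Lambda runtime
        glue_client = boto3.client("glue")
    # One workflow run is enough even if several S3 events arrive in one batch.
    if event.get("Records"):
        resp = glue_client.start_workflow_run(Name="s3-to-aurora-workflow")
        return resp["RunId"]
    return None
```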