Currently I’m listening events from AWS Kinesis and writing them to S3. Then I query them using AWS Glue and Athena.
Is there a way to import that data, possibly with some transformation, to an RDS instance?
There are several general approaches to take with regards to that task.
When connecting GLUE to RDS a couple of things to keep in mind (mostly on the networking side:
For some examples of code targetting a relational database, see the following tutorials
One approach for Postgres:
Install the S3 extension in Postgres:
psql=> CREATE EXTENSION aws_s3 CASCADE;
Run the query in Athena and find the CSV result file location in S3 (S3 output location is in Athena settings)
Create your table in Postgres
Import from S3:
SELECT aws_s3.table_import_from_s3(
'newtable', '', '(format csv, header true)',
aws_commons.create_s3_uri('bucketname', 'reports/Unsaved/2021/05/10/aa9f04b0-d082-328g-5c9d-27982d345484.csv', 'us-east-1')
);
If you want to convert empty values to null, you can use this: (format csv, FORCE_NULL (columnname), header true)
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html