4
votes

I am trying to load data from AWS S3 to Aurora Postgres. I have followed this link https://aws.amazon.com/blogs/database/stream-data-into-an-aurora-postgresql-database-using-aws-dms-and-amazon-kinesis-data-firehose/ but without any success. Has anybody loaded it or know about some link where I can find the steps to achieve it.

1

1 Answers

7
votes

Though this is an old post, I am hoping it will help someone who comes looking. Most of it is from the official AWS doc. I am adding some notes to help along.

Here is the drill:

Connect to the RDS Postgres instance.

Run this command on the psql prompt to create the necessary extension to import data from s3:

CREATE EXTENSION aws_s3 CASCADE;

Create the database:

CREATE DATABASE students;

Create the table (assuming three columns as below):

CREATE TABLE t1 (col1 varchar(80), col2 varchar(80), col3 varchar(80));

Note that your csv file in the S3 bucket should have the same number of columns unless you want to skip some (which is beyond the scope of this post).

To add an IAM role for a PostgreSQL DB cluster using the console:

  1. Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/.
  2. Choose the PostgreSQL DB cluster name to display its details.
  3. On the Connectivity & security tab, in the Manage IAM roles section, choose the role to add under Add IAM roles to this instance.
  4. Under Feature, choose s3Import.
  5. Choose Add role.

    SELECT aws_commons.create_s3_uri('your_bucket_name', 'full/path/to/file.csv','aws_region' ) AS s3_uri \gset
    SELECT aws_s3.table_import_from_s3('t1', '', '(format csv)', :'s3_uri' );

Don't prepend the bucket name to the full/path/to/file.csv. Use the correct aws_region (like us-east-1) and don't add any Availability Zone here.

You should get something like:

1000 rows imported into relation "t1" from file ...

Test with a sample query:

SELECT * FROM t1 LIMIT 5;

HTH.