
We have a Redshift cluster that needs one table from one of our RDS PostgreSQL databases. I'm not sure of the best way to export that data and bring it in, or what the exact steps should be.

In piecing together various blogs and articles, the consensus appears to be to use pg_dump to copy the table to a CSV file, copy that to an S3 bucket, and from there use the Redshift COPY command to bring it into a new table. That's my high-level understanding, but I'm not sure what the command-line switches or the actual details should be. Is anyone doing this currently, and if so, is what I have above the 'recommended' way to do a one-off import into Redshift?


1 Answer


It appears that you want to:

  • Export from Amazon RDS PostgreSQL
  • Import into Amazon Redshift

From Exporting data from an RDS for PostgreSQL DB instance to Amazon S3 - Amazon Relational Database Service:

You can query data from an RDS for PostgreSQL DB instance and export it directly into files stored in an Amazon S3 bucket. To do this, you use the aws_s3 PostgreSQL extension that Amazon RDS provides.

This will save a CSV file into Amazon S3.
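As a minimal sketch of the export step, run from a Python script via psycopg2 against the RDS database — note that every name here (my_table, my-bucket, the key, the region, the connection details) is a placeholder to substitute with your own, and the RDS instance must have an IAM role attached that allows it to write to the bucket:

    import psycopg2

    # Connect to the RDS PostgreSQL database (placeholder credentials).
    conn = psycopg2.connect(host="my-rds-host", dbname="mydb",
                            user="myuser", password="mypassword")
    with conn, conn.cursor() as cur:
        # One-time setup: install the extension that Amazon RDS provides.
        cur.execute("CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE")
        # Export the table to a CSV file in S3.
        cur.execute("""
            SELECT * FROM aws_s3.query_export_to_s3(
                'SELECT * FROM my_table',
                aws_commons.create_s3_uri('my-bucket', 'exports/my_table.csv', 'us-east-1'),
                options := 'format csv')
        """)
    conn.close()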

You can then use the Amazon Redshift COPY command to load this CSV file into an existing Redshift table.
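A sketch of that load step, again with placeholder names (the IAM role ARN must be a role attached to your Redshift cluster with permission to read the bucket, and the target table must already exist):

    import psycopg2  # Redshift speaks the PostgreSQL wire protocol

    # Connect to the Redshift cluster (placeholder endpoint and credentials).
    conn = psycopg2.connect(host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
                            port=5439, dbname="mydb",
                            user="myuser", password="mypassword")
    with conn, conn.cursor() as cur:
        # Load the CSV file into the existing Redshift table.
        cur.execute("""
            COPY my_table
            FROM 's3://my-bucket/exports/my_table.csv'
            IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
            FORMAT AS CSV
        """)
    conn.close()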

You will need some way to orchestrate these operations: run a command against the RDS database, wait for it to finish, then run a command in the Redshift database. This could be done via a Python script that connects to each database in turn (e.g. via psycopg2) and runs the appropriate command.
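Putting the two sketches above together, such a script could look like the following. It is sequential by construction, since psycopg2's execute() blocks until the statement completes, so the CSV file is in S3 before the COPY starts. Every host, credential, and object name below is a placeholder:

    import psycopg2

    # Placeholder connection strings -- substitute your own.
    RDS_DSN = "host=my-rds-host dbname=mydb user=myuser password=mypassword"
    REDSHIFT_DSN = ("host=my-cluster.abc123.us-east-1.redshift.amazonaws.com "
                    "port=5439 dbname=mydb user=myuser password=mypassword")

    def run(dsn, sql):
        """Connect, run one statement, commit, and close the connection."""
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur:  # the with-block commits on success
                cur.execute(sql)
        finally:
            conn.close()

    # Step 1: export from RDS. execute() does not return until the export
    # has finished.
    run(RDS_DSN, """
        SELECT * FROM aws_s3.query_export_to_s3(
            'SELECT * FROM my_table',
            aws_commons.create_s3_uri('my-bucket', 'exports/my_table.csv', 'us-east-1'),
            options := 'format csv')
    """)

    # Step 2: load the CSV file into the existing Redshift table.
    run(REDSHIFT_DSN, """
        COPY my_table
        FROM 's3://my-bucket/exports/my_table.csv'
        IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
        FORMAT AS CSV
    """)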