32
votes

I am very new to AWS, and Postgresql.

  1. I have created a Postgresql db (using rds on aws)
  2. I have uploaded several documents to multiple s3 buckets
  3. I have a EC2 (Amazon Linux 64 bit) running

I tried to use a data pipeline, but nothing seems to be available (template) for Postgres. I can't figure out how to connect to my RDS instance and import/export data from postgres.

I assumed that I could use EC2 to grab from my S3 bucket and import into Postgres in lieu of no data pipeline template being available. If it is possible I have no idea how.. Please advise if possible..

4
Ideally Amazon would want to extend the COPY command to support S3 URLs for file paths, but I won't be surprised if for now you have to fetch the data from S3 via a client that then feeds it to PostgreSQL via COPY.Craig Ringer
Yes, with RDS, you will generally need to load from an EC2 instance. Only the custom database systems like Redshift tend to allow direct loading from s3.datasage
Thank you both. Can either of you provide any examples on how to accomplish this? I am able to connect EC2 to S3 and EC2 to RDS. I have an idea on how to pull the file from S3 down to EC2, but at that point I have no idea how to load the file from EC2 into the Postgresql DB (running on res). Please help or point me to instructions. I have been searching online for several days now..user3044239

4 Answers

25
votes

S3 -> RDS direct load is now possible for PostgreSQL Aurora and RDS PostgreSQL >= 11.1 as aws_s3 extension.

Parameters are similar to those of PostgreSQL COPY command

psql=> SELECT aws_s3.table_import_from_s3(
 'table_name', '', '(format csv)',
 'BUCKET_NAME', 'path/to/object', 'us-east-2'
);

Be warned that this feature does not work for older versions.

24
votes

I wish AWS extends COPY command in RDS Postgresql as they did in Redshift. But for now they haven't and we have to do it by ourselves.

  1. Install awscli on your EC2 box (it might have been installed by default)
  2. Configure your awscli with credentials
  3. Use aws s3 sync or aws s3 cp commmands to download from s3 to your local directory
  4. Use psql command to \COPY the files into your RDS (requires \ to copy from client directory)

Example:

aws s3 cp s3://bucket/file.csv /mydirectory/file.csv
psql -h your_rds.amazonaws.com -U username -d dbname -c '\COPY table FROM ''file.csv'' CSV HEADER'
13
votes

The prior answers have been superseded by more recent events at AWS.

There is now excellent support for S3-to-RDS-database loading via the Data Pipeline service (which can be used for many other data conversion tasks too, this is just one example).

This AWS article is for S3-to-RDS-MySQL. Should be very similar for RDS-Postgres.

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-copys3tords.html

0
votes

if you can launch the psql client and connect to RDS on EC2 instance, you should be able to use the following command:

\copy customer_orders from 'myfile.csv' with DELIMITER ','