
We are having trouble copying files from S3 to Redshift. The S3 bucket in question allows access only from the VPC that contains our Redshift cluster. We have no problems copying from public S3 buckets. We tried both the key-based and the IAM-role-based approach, but the result is the same: we keep getting 403 Access Denied from S3. Any idea what we are missing? Thanks.

EDIT: Queries we use:

1. (using IAM role):

    copy redshift_table from 's3://bucket/file.csv.gz' credentials 'aws_iam_role=arn:aws:iam::123456789:role/redshift-copyunload' delimiter '|' gzip;

2. (using access keys):

    copy redshift_table from 's3://bucket/file.csv.gz' credentials 'aws_access_key_id=xxx;aws_secret_access_key=yyy' delimiter '|' gzip;

The S3 policy attached to the IAM role (first query) and to the IAM user (second query) is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt123456789",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::bucket/*"
            ]
        }
    ]
}

The bucket has a policy that denies access from anywhere other than our VPC (the Redshift cluster is in this VPC):

{
    "Version": "2012-10-17",
    "Id": "VPCOnlyPolicy",
    "Statement": [
        {
            "Sid": "Access-to-specific-VPC-only",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::bucket/*",
                "arn:aws:s3:::bucket"
            ],
            "Condition": {
                "StringNotEquals": {
                    "aws:sourceVpc": "vpc-123456"
                }
            }
        }
    ]
}

We have no problem loading from publicly accessible buckets and if we remove this bucket policy we can copy the data with no problems.

The bucket is in the same region as the Redshift cluster.

When we run the IAM role (redshift-copyunload) through the policy simulator, it returns "permission allowed".

2
You need to provide more info. Provide the SQL query you use, the bucket name, region & the policies. One question though: is the bucket you try to fetch things from in the same region as your Redshift cluster? - Nicolas
this question has an answer here: stackoverflow.com/a/40540465/1517410 - ketan vijayvargiya

2 Answers

0
votes

Enable "Enhanced VPC Routing" on your Redshift cluster. Without Enhanced VPC Routing, your Redshift traffic to S3 goes over the Internet rather than through your VPC, so your S3 bucket policy denies access. See here: https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-enabling-cluster.html
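
To make the reasoning concrete, here is a rough local model of how the bucket policy from the question gets evaluated. It is not an AWS API: the function names `string_not_equals` and `is_allowed` are made up for illustration, and the model covers only this single Deny statement. It relies on two documented IAM rules: a negated operator such as StringNotEquals evaluates to true when the condition key is missing from the request, and an explicit Deny overrides any Allow. As far as I know, aws:SourceVpc is only present when the request reaches S3 through a VPC endpoint, so traffic that leaves over the Internet carries no such key. That may also be why the policy simulator reports "allowed" when it only looks at the role's own policy.

```python
# Simplified, local model of the bucket policy from the question.
# Hypothetical helper names; this only mimics the documented evaluation rules.

def string_not_equals(context, key, expected):
    """Model StringNotEquals: True when the key is absent or its value differs."""
    # Condition key names are case-insensitive, so compare lowercased names.
    lowered = {k.lower(): v for k, v in context.items()}
    actual = lowered.get(key.lower())
    return actual is None or actual != expected


def is_allowed(context, identity_allows=True, vpc_id="vpc-123456"):
    """Return True if the request passes both the IAM policy and the bucket policy."""
    # The bucket policy's Deny applies whenever the condition evaluates to True.
    explicit_deny = string_not_equals(context, "aws:sourceVpc", vpc_id)
    # An explicit Deny always wins over any Allow.
    return identity_allows and not explicit_deny


# COPY without Enhanced VPC Routing: the request carries no aws:SourceVpc key.
print(is_allowed({}))                               # False
# COPY routed through an S3 VPC endpoint in the expected VPC.
print(is_allowed({"aws:SourceVpc": "vpc-123456"}))  # True
# Request arriving through an endpoint in some other VPC.
print(is_allowed({"aws:SourceVpc": "vpc-999999"}))  # False
```

If this model is right, enabling Enhanced VPC Routing alone is only half of the fix: the VPC would also need an S3 VPC endpoint so that the request actually carries the VPC ID the policy checks for.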

0
votes

1. Check the encryption of the bucket. According to the docs (https://docs.aws.amazon.com/en_us/redshift/latest/dg/c_loading-encrypted-files.html), the COPY command automatically recognizes and loads files encrypted using SSE-S3 and SSE-KMS.

2. Check the kms: rules on your key and role.

3. If the files come from EMR, check the EMR security configurations for S3.