I am evaluating AWS database services to pick the most effective one; the objective is to load data from a JSON file in an S3 bucket into Redshift every 5 minutes.
I am currently trying to use AWS Data Pipeline to automate the ETL. I have been following the AWS tutorial "Copy Data to Amazon Redshift Using the AWS Data Pipeline Console", which is all simple and clear:
https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-copydata-redshift-create.html
I set up a cluster on Redshift and a bucket on S3, and created all the required roles and policies with all the permissions needed.
Now, after creating the pipeline and pressing Activate, a warning comes up saying:
Errors/Warnings:
Object:Ec2Instance
WARNING: Could not validate S3 Access for role. Please ensure role ('DataPipelineDefaultRole') has s3:Get*, s3:List*, s3:Put* and sts:AssumeRole permissions for DataPipeline.
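For reference, a minimal policy statement covering exactly what the warning lists would, as far as I understand, look something like this (just a sketch; my actual policies below are much broader):
{
    "Effect": "Allow",
    "Action": [
        "s3:Get*",
        "s3:List*",
        "s3:Put*",
        "sts:AssumeRole"
    ],
    "Resource": "*"
}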
Now I am sure that my role and resource role have s3:Get*, s3:List*, s3:Put* and sts:AssumeRole. In fact, they both have FullAccess to basically all the services I need.
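In case the trust relationships matter here, both roles have one along the lines the tutorial describes; a sketch of what I set on DataPipelineDefaultRole (the resource role trusts ec2.amazonaws.com instead):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "datapipeline.amazonaws.com",
                    "elasticmapreduce.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}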
DataPipelineDefaultRole policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:CreateServiceLinkedRole",
            "Resource": "*",
            "Condition": {
                "StringLike": {
                    "iam:AWSServiceName": [
                        "elasticmapreduce.amazonaws.com",
                        "spot.amazonaws.com"
                    ]
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupIngress",
                "sdb:Select*",
                "sqs:ReceiveMessage",
                "s3:Get*",
                "sqs:GetQueue*",
                "s3:CreateBucket",
                "sns:Unsubscribe",
                "s3:List*",
                "datapipeline:EvaluateExpression",
                "ec2:StartInstances",
                "dynamodb:DescribeTable",
                "sqs:Delete*",
                "iam:ListAttachedRolePolicies",
                "ec2:RevokeSecurityGroupEgress",
                "dynamodb:GetItem",
                "sns:Subscribe",
                "iam:ListRolePolicies",
                "s3:DeleteObject",
                "sdb:BatchPutAttributes",
                "iam:GetRole",
                "dynamodb:BatchGetItem",
                "redshift:DescribeClusterSecurityGroups",
                "ec2:CreateTags",
                "ec2:DeleteNetworkInterface",
                "ec2:RunInstances",
                "dynamodb:Scan",
                "rds:DescribeDBSecurityGroups",
                "ec2:StopInstances",
                "ec2:CreateNetworkInterface",
                "ec2:CancelSpotInstanceRequests",
                "cloudwatch:*",
                "sqs:PurgeQueue",
                "iam:GetRolePolicy",
                "dynamodb:UpdateTable",
                "ec2:RequestSpotInstances",
                "ec2:DeleteTags",
                "sns:ListTopics",
                "ec2:ModifyImageAttribute",
                "iam:PassRole",
                "sns:Publish",
                "ec2:DescribeNetworkInterfaces",
                "ec2:CreateSecurityGroup",
                "rds:DescribeDBInstances",
                "ec2:ModifyInstanceAttribute",
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:DetachNetworkInterface",
                "ec2:TerminateInstances",
                "iam:GetInstanceProfile",
                "sns:GetTopicAttributes",
                "datapipeline:DescribeObjects",
                "dynamodb:Query",
                "iam:ListInstanceProfiles",
                "ec2:Describe*",
                "ec2:DeleteSecurityGroup",
                "redshift:DescribeClusters",
                "sqs:CreateQueue",
                "elasticmapreduce:*",
                "s3:Put*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole",
                "s3:Get*",
                "s3:List*",
                "s3:Put*",
                "sts:AssumeRole"
            ],
            "Resource": [
                "arn:aws:iam::*:role/DataPipelineDefaultResourceRole",
                "arn:aws:iam::*:role/DataPipelineDefaultRole",
                "arn:aws:s3:::*/*"
            ]
        },
        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Sid": "VisualEditor4",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*",
                "s3:Put*"
            ],
            "Resource": "*"
        }
    ]
}
DataPipelineDefaultResourceRole policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:*",
                "datapipeline:*",
                "dynamodb:*",
                "ec2:Describe*",
                "elasticmapreduce:AddJobFlowSteps",
                "elasticmapreduce:Describe*",
                "elasticmapreduce:ListInstance*",
                "rds:Describe*",
                "redshift:DescribeClusters",
                "redshift:DescribeClusterSecurityGroups",
                "s3:*",
                "sdb:*",
                "sns:*",
                "sqs:*"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
I have been looking into the issue for over a week and have tried all the existing solutions: I updated the Trust Relationship, recreated the roles, kept the default roles, let Data Pipeline create new ones, and checked the Security Group, but I am still having the same issue.
After activating the pipeline and checking the log URI, I do find 2 folders, Ec2Instance and RedshiftLoadActivity. The Redshift log file has only 2 lines; the other has more [INFO] lines describing the download of the jar and the S3 file for the TaskRunner. Among the [INFO] entries there are these [WARN] messages:
Ec2Instance:
private.com.amazonaws.services.s3.internal.S3V4AuthErrorRetryStrategy: Attempting to re-send the request to mylogbucket.s3.eu-central-1.amazonaws.com with AWS V4 authentication. To avoid this warning in the future, please use region-specific endpoint to access buckets located in regions that require V4 signing.
RedshiftLoadActivity:
private.com.amazonaws.services.s3.internal.S3V4AuthErrorRetryStrategy: Attempting to re-send the request to mylogbucket.s3.eu-central-1.amazonaws.com with AWS V4 authentication. To avoid this warning in the future, please use region-specific endpoint to access buckets located in regions that require V4 signing.
The issue should be the roles and policies, yet I made sure Redshift and the S3 bucket were not the problem, since I tried a COPY command in the Query Editor and it loaded the data as expected.
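For reference, the manual COPY that worked looked roughly like this (table name, bucket and role ARN here are placeholders):
COPY my_table
FROM 's3://my-bucket/data.json'
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
FORMAT AS JSON 'auto'
REGION 'eu-central-1';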
I am currently still stuck on that error and would love some suggestions on how I could possibly fix this problem.
The logs also contain this error:
548 [ERROR] (TaskRunnerService-resource:df-0539055_@Ec2Instance_2019-05-30T13:38:35-0) amazonaws.datapipeline.database.ConnectionFactory: Unable to establish connection to jdbc:postgresql://redshift-cluster-1.coykb9.eu-central-1.redshift.amazonaws.com:5439/db Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections