I have been reading about running AWS EMR in a VPC, but the material seems to focus mostly on design considerations for how the AWS EMR service itself reaches the EMR cluster to make its calls.
What I am trying to do is set up a VPC with an ALB and an EC2 instance that runs an application as a service, and have that application access the EMR cluster.
VPC -> Internet Gateway -> Load Balancer -> EC2 (Application endpoints) -> EMR Cluster
I don't want the cluster to be accessible from outside except through the public IP of the Internet Gateway (IG). That public IP should only be able to reach the EC2 instance hosting the application, which in turn calls the EMR cluster in the same VPC.
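To make the intent concrete, here is a minimal Python sketch of the access rules I have in mind. It is only a model of which tier may open connections to which tier. The tier names are placeholders I made up, and it does not call any AWS API or use real security group syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One allowed inbound flow: traffic from `source` may reach `target`."""
    source: str
    target: str


# Hypothetical tier names for illustration, not real AWS identifiers.
INTERNET = "internet"
ALB = "alb"
APP_EC2 = "app-ec2"
EMR = "emr-cluster"

# The only flows I want to allow, matching the chain described above.
ALLOWED = {
    Rule(INTERNET, ALB),   # public traffic enters through the load balancer
    Rule(ALB, APP_EC2),    # load balancer forwards to the application
    Rule(APP_EC2, EMR),    # application calls the cluster inside the VPC
}


def can_connect(source, target, rules=ALLOWED):
    """Return True only if a rule directly allows source -> target."""
    return Rule(source, target) in rules


if __name__ == "__main__":
    checks = [(INTERNET, ALB), (INTERNET, APP_EC2), (INTERNET, EMR), (APP_EC2, EMR)]
    for src, dst in checks:
        print(f"{src} -> {dst}: {can_connect(src, dst)}")
```

In this model the internet can reach only the load balancer, and only the application instance can reach the cluster, which is the behaviour I am trying to get.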
Is this a recommended approach?
The design looks something like below.
Some of the questions I am trying to answer are: how can EMR access S3 when it runs in a VPC,
can the application running on EC2 access the EMR cluster, and would the EMR cluster be publicly accessible?
Any guidance, links, or recommendations would be welcome.
EDIT:
Or, if I create EMR in a VPC, do I need to wrap it inside another VPC, something like below?
