I am trying to use Apache Zeppelin with an EMR (Spark) cluster. To use Zeppelin with EMR from my workplace I need a firewall rule opened, and the firewall there can only be opened for a static IP. However, an EMR cluster gets a new IP address and DNS name every time it is created with the AWS CLI. Does anyone know how to connect an Apache Zeppelin server (EC2 instance) to the EMR cluster through a fixed IP? Thanks in advance.
2 Answers
I don't fully understand your question. Let me try to answer this part of it: "So do you know how to connect apache zeppelin server(EC2 instance) with the EMR cluster using the fixed IP?"
This should be possible by attaching an Elastic IP to the EMR cluster's master node: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-cli-commands.html#elastic-ip.
You can also try Qubole's managed clusters, which support Spark and Zeppelin. Qubole takes care of this by giving you a fixed endpoint for accessing your Zeppelin notebooks.
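As a rough illustration of the Elastic IP approach, here is a minimal sketch using the current AWS CLI rather than the older EMR CLI the link above documents. It assumes the Elastic IP is already allocated and the cluster runs in a VPC. The function names and the cluster and allocation IDs in the usage comment are placeholders, not values from the question.

```shell
#!/usr/bin/env bash
# Sketch only: attach a pre-allocated Elastic IP to an EMR cluster's master node.
# Assumes the AWS CLI is installed and configured with sufficient permissions.

# Print the EC2 instance ID of the cluster's master node.
master_instance_id() {
  local cluster_id="$1"
  aws emr list-instances \
    --cluster-id "$cluster_id" \
    --instance-group-types MASTER \
    --query 'Instances[0].Ec2InstanceId' \
    --output text
}

# Associate the Elastic IP (given by its allocation ID) with the master node.
attach_eip() {
  local cluster_id="$1" allocation_id="$2"
  local instance_id
  instance_id="$(master_instance_id "$cluster_id")" || return 1
  aws ec2 associate-address \
    --instance-id "$instance_id" \
    --allocation-id "$allocation_id"
}

# Usage (placeholder IDs):
#   attach_eip j-XXXXXXXXXXXXX eipalloc-xxxxxxxx
```

Since a new cluster gets a new master instance, the association would have to be repeated each time the cluster is recreated.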
Disclaimer: I work for Qubole
I finally solved this requirement using socat:
socat TCP-LISTEN:8080,fork TCP:$EMR_CLUSTER_NAME:8080
socat TCP-LISTEN:8081,fork TCP:$EMR_CLUSTER_NAME:8081
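The two commands above can be wrapped in a small script so that every forwarded port is started the same way. This is only a sketch: the function names are hypothetical, the `reuseaddr` option is my addition (it lets the listener restart quickly), and `$EMR_CLUSTER_NAME` is taken to hold the master node's DNS name, as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only: forward local ports on the fixed-IP host to the EMR master.

# Print the two socat addresses for forwarding one port to the target host.
socat_args() {
  local port="$1" target="$2"
  printf 'TCP-LISTEN:%s,fork,reuseaddr TCP:%s:%s\n' "$port" "$target" "$port"
}

# Start one background socat forwarder per port.
start_forwarders() {
  local target="$1"; shift
  local port
  for port in "$@"; do
    # Word splitting is intentional: socat expects two separate addresses.
    # shellcheck disable=SC2046
    socat $(socat_args "$port" "$target") &
  done
}

# Usage:
#   start_forwarders "$EMR_CLUSTER_NAME" 8080 8081
```

With this setup the firewall only needs a rule for the fixed IP of the host running socat; the EMR address can change with every new cluster as long as `$EMR_CLUSTER_NAME` is updated.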
Also see the script install-apache-zeppelin-on-amazon-emr.sh, which I revised to use socat instead of SSH tunneling.