I just set up a Spark cluster on Google Cloud using Dataproc, and I am trying to submit a simple PySpark hello-world.py job from my local machine using the gcloud CLI, as specified in the documentation: https://cloud.google.com/dataproc/submit-job
gcloud beta dataproc jobs submit pyspark --cluster cluster-1 hello-world.py
However, I am getting the following error:
15/12/28 08:54:53 WARN org.spark-project.jetty.util.component.AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.spark-project.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
...
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
I have only submitted this job once, so I'm puzzled as to why I'm getting this error. Any help would be appreciated.