0
votes

I have a Spark streaming application (Spark 2.4.4) running on AWS EMR 5.28.0. In the driver application on master node, besides setting up the spark streaming job, I am also running a http server (Akka-http 10.1.6) which can query the driver application for data, I bind to port 6161 like the following:

val bindingFuture: Future[ServerBinding] = Http().bindAndHandle(myapiroutes, "127.0.0.1", 6161)

try {
      bindingFuture.map { serverBinding =>
        log.info(s"AlertRestApi bound to ${serverBinding.localAddress}")
      }
    } catch {
      case ex: Exception  => {
        log.error(s"Failed to bind to 127.0.0:6161")
        system.terminate()
      }
    }

then I start spark streaming:

ssc.start()

When I test this on local spark, I am able to access http://localhost:6161/myapp/v1/data and get data from spark streaming, everything is good so far.

However, when I run this application in AWS EMR, I could not access port 6161. I ssh into the driver node and try to curl my url, it gives me error message:

[hadoop@ip-xxx-xx-xx-x ~]$ curl http://xxx.xx.xx.x:6161/myapp/v1/data

curl: (7) Failed to connect to xxx.xx.xx.x port 6161: Connection refused

when I look into the log in the driver node, I do see the port is bound (why the host shows 0:0:0:0:0:0:0:0? I don't know, that is the way in my dev testing, and it works, I see the same log and able to access the url):

20/04/13 16:53:26 INFO MyApp: MyRestApi bound to /0:0:0:0:0:0:0:0:6161

So my question is, what should I do so that I can access the api at port 6161 on the driver node? I realize Yarn resource manager may be involved but I know nothing about Yarn resource manager to point myself where to investigate.

Please help. Thanks

1

1 Answers

0
votes

You are mentioning 127.0.0.1 as the host name or 0.0.0.0??

127.0.0.1 will work in your local system but not in AWS as it is loopback address. In such case you need to use 0.0.0.0 as the host name

Also make sure that ports are open and access is provided from your IP. To do that, go to Inbound rules for your instance and add 6161 under custom TCP rule if not done already.

Let me know if this makes any difference