1
votes

I have a single-node cluster with 2 CPUs, where I want to run 2 Spark Streaming jobs.

I also want to use the "cluster" submit mode, and I am using the Standalone cluster manager. When I submit one application, I see that the driver consumes 1 core and the worker 1 core.

Does that mean there are no cores available for the other streaming job? Can 2 streaming jobs reuse executors?

This is totally confusing me, and I don't find it clearly explained in the documentation.

Srdjan

1
What cluster? I believe this will depend on how your cluster manager reacts -- whether there are container restrictions, etc. I use YARN, but I'm no expert. I do know that within a given Spark application there's a direct correlation between how many executor cores you have and how many DStreams you can have. Read here: spark.apache.org/docs/latest/… -- the way around that limitation is to create just a single DStream that listens to more than one source (in my case, Kafka topics). – David Griffin

1 Answer

1
votes

Does that mean there are no cores available for the other streaming job?

If you have a single worker with 2 CPUs and you deploy in cluster mode, then you'll have no cores left for a second application: the worker has to dedicate one core to the driver process, which runs on the worker machine in cluster mode, and the remaining core goes to the executor.
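For illustration, here is roughly what a single cluster-mode submit asks for on a box like this (the master URL, class name and jar path are hypothetical); the accounting in the comments shows why one job uses up both CPUs:

    # 1 core is reserved for the driver (in cluster mode it runs on the worker),
    # 1 core goes to the executor -- so one application occupies the whole 2-CPU worker.
    spark-submit \
      --master spark://localhost:7077 \
      --deploy-mode cluster \
      --driver-cores 1 \
      --total-executor-cores 1 \
      --class com.example.StreamingJobOne \
      /path/to/streaming-job-one.jar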

Can 2 streaming jobs reuse executors?

No. Each job has to allocate its own dedicated resources from the cluster manager. If one job is running with all available resources, the next submitted job will stay in the WAITING state until the first one completes. You can see this in the Spark UI.
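Sketching the same thing for a second job (again with hypothetical names): the submit is accepted, but nothing can run until the first application gives its cores back.

    # With both cores already held by the first application, the Standalone master
    # keeps this one queued -- the WAITING state mentioned above, visible in the
    # master's web UI (port 8080 by default). The two applications never share executors.
    spark-submit \
      --master spark://localhost:7077 \
      --deploy-mode cluster \
      --driver-cores 1 \
      --total-executor-cores 1 \
      --class com.example.StreamingJobTwo \
      /path/to/streaming-job-two.jar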