How can I run two spark-submit jobs at the same time? I have a plain Spark standalone setup on my PC (no extra configuration) with 4 cores allocated.
If I submit the app a second time while the first one is still running, the second one hangs with: "WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources"
Code:

    from __future__ import print_function

    import sys
    from operator import add

    from pyspark.sql import SparkSession

    if __name__ == "__main__":
        spark = SparkSession\
            .builder\
            .appName("test")\
            .getOrCreate()

        # 100 partitions; xrange is Python 2 (use range on Python 3)
        rdd = spark.sparkContext.parallelize(xrange(1000000000), 100)
        print(rdd.sample(False, 0.1, 81).count())

        spark.stop()
How I try to start them:

    ./spark-submit --master spark://myaddresshere:7077 --name "app1" \
        --conf spark.shuffle.service.enabled=true \
        /path_to_py_file.py
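From the docs, my understanding is that dynamic allocation also needs spark.dynamicAllocation.enabled=true plus the external shuffle service running on the worker, so I also tried a variant like this (the minExecutors/idleTimeout values below are just ones I picked while experimenting, not settings the docs prescribe):

    # start the external shuffle service on each worker host first
    $SPARK_HOME/sbin/start-shuffle-service.sh

    ./spark-submit --master spark://myaddresshere:7077 --name "app1" \
        --conf spark.dynamicAllocation.enabled=true \
        --conf spark.shuffle.service.enabled=true \
        --conf spark.dynamicAllocation.minExecutors=1 \
        --conf spark.dynamicAllocation.executorIdleTimeout=30s \
        /path_to_py_file.py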
I know that I can pre-set the number of cores each app uses, but my goal is to allocate resources dynamically: if only one app is running, it should consume 100% of the cores; if there are four apps, each should get 25% (the static cap I want to avoid is shown below).
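For reference, the static cap does let both apps run, assuming I understand spark.cores.max correctly for standalone mode, but it leaves cores idle when only one app is running:

    # each app is limited to 2 of the 4 cores, even when it runs alone
    ./spark-submit --master spark://myaddresshere:7077 --name "app1" \
        --conf spark.cores.max=2 \
        /path_to_py_file.py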
I've tried multiple options, but with no luck.
Any hint would be appreciated.