I'm using Spark 2.4.5 on AWS EMR 5.30.0 with r5.4xlarge instances (16 vCores, 128 GiB memory, EBS-only storage, 256 GiB EBS): 1 master, 1 core, and 30 task nodes.
I launched the Spark Thrift Server on the master node, and it's the only application running on the cluster:
```
sudo /usr/lib/spark/sbin/start-thriftserver.sh \
  --conf spark.blacklist.enabled=true \
  --conf spark.blacklist.stage.maxFailedExecutorsPerNode=4 \
  --conf spark.blacklist.task.maxTaskAttemptsPerNode=3 \
  --conf spark.driver.cores=12 \
  --conf spark.driver.maxResultSize=10g \
  --conf spark.driver.memory=86000M \
  --conf spark.driver.memoryOverhead=10240 \
  --conf spark.kryoserializer.buffer.max=768m \
  --conf spark.rpc.askTimeout=700 \
  --conf spark.sql.broadcastTimeout=800 \
  --conf spark.sql.sources.partitionOverwriteMode=dynamic \
  --conf spark.task.maxFailures=20
```
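To double-check that the driver actually picks up those values, something like this on the master should show the Thrift Server JVM and its `-Xmx`/GC options (a quick sanity check, assuming the JDK's `jps` is on the PATH):

```
# Sanity check: find the HiveThriftServer2 JVM on the master node and
# inspect the JVM arguments it was actually launched with
sudo jps -lvm | grep HiveThriftServer2
```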
Then I run SQL queries against it over JDBC, but while heavy queries are executing, the Spark UI becomes very slow. I thought it would be enough to set spark.driver.cores=12 (the master node has 16) and spark.driver.memory=86000M (it has 128 GiB of memory): 86000M is about 84 GiB, which should leave roughly 40 GiB of headroom on the master for other processes such as the history server. The UI is still slow, though.
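In case it matters, the queries are submitted over JDBC roughly like this (a minimal sketch: 10001 is the usual EMR port for the Spark Thrift Server, and the host and query here are just placeholders):

```
# Connect to the Spark Thrift Server with beeline over JDBC
# (<master-node-dns> and the query are placeholders)
beeline -u "jdbc:hive2://<master-node-dns>:10001/default" \
        -e "SELECT COUNT(*) FROM some_table"
```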
So I suspect there are other settings I can tune to keep the UI responsive, but I'm not sure which ones.
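For example, I'm wondering whether capping how much UI state the driver retains would help; something like the following could be appended to the launch command (the properties are standard Spark settings, but the values below are just guesses on my part, not something I've verified fixes the problem):

```
# Hypothetical extra flags: limit how many jobs/stages/tasks/SQL executions
# the driver keeps in memory for the UI (values are arbitrary guesses)
  --conf spark.ui.retainedJobs=250 \
  --conf spark.ui.retainedStages=250 \
  --conf spark.ui.retainedTasks=25000 \
  --conf spark.sql.ui.retainedExecutions=250
```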
For reference, these are the settings from spark-defaults.conf on the cluster:
```
spark.driver.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar
spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.executor.extraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar
spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.eventLog.enabled true
spark.eventLog.dir hdfs:///var/log/spark/apps
spark.history.fs.logDirectory hdfs:///var/log/spark/apps
spark.sql.warehouse.dir hdfs:///user/spark/warehouse
spark.sql.hive.metastore.sharedPrefixes com.amazonaws.services.dynamodbv2
spark.yarn.historyServer.address <xxxxx>:18080
spark.history.ui.port 18080
spark.shuffle.service.enabled true
spark.yarn.dist.files /etc/spark/conf/hive-site.xml
spark.driver.extraJavaOptions -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
spark.dynamicAllocation.enabled true
spark.blacklist.decommissioning.enabled true
spark.blacklist.decommissioning.timeout 1h
spark.resourceManager.cleanupExpiredHost true
spark.stage.attempt.ignoreOnDecommissionFetchFailure true
spark.decommissioning.timeout.threshold 20
spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
spark.hadoop.yarn.timeline-service.enabled false
spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS $(hostname -f)
spark.files.fetchFailure.unRegisterOutputOnHost true
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem 2
spark.hadoop.mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem true
spark.hadoop.fs.s3.getObject.initialSocketTimeoutMilliseconds 2000
spark.sql.parquet.output.committer.class com.amazon.emr.committer.EmrOptimizedSparkSqlParquetOutputCommitter
spark.sql.parquet.fs.optimized.committer.optimization-enabled true
spark.sql.emr.internal.extensions com.amazonaws.emr.spark.EmrSparkSessionExtensions
spark.sql.sources.partitionOverwriteMode dynamic
spark.executor.instances 1
spark.executor.cores 16
spark.driver.memory 2048M
spark.executor.memory 109498M
spark.default.parallelism 32
spark.emr.maximizeResourceAllocation true
```