I have a small Hadoop/Yarn cluster running on a system where the firewall must stay enabled. We are trying to submit Spark jobs, but they fail because of the port allocations.
I've opened all the standard Hadoop/Yarn/Spark ports in the firewall and set what I thought were all the configuration properties needed to restrict the port ranges. But the application master still creates containers on random ports that get blocked.
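For context, this is roughly what I have on the Spark side in spark-defaults.conf; the port numbers below are just illustrative, not my actual range:

```
# spark-defaults.conf -- base ports are illustrative, not my real range
spark.driver.port               40000
spark.blockManager.port         40100
spark.driver.blockManager.port  40200
# each setting above retries upward from its base port this many times
spark.port.maxRetries           32
```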
The one setting I thought would do the trick was yarn.app.mapreduce.am.job.client.port-range, set in mapred-site.xml, but it doesn't seem to be respected or to make any difference.
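This is how I currently have it in mapred-site.xml (the range here is an example; my real one matches the firewall rules):

```
<!-- mapred-site.xml: example range, substitute whatever the firewall allows -->
<property>
  <name>yarn.app.mapreduce.am.job.client.port-range</name>
  <value>50000-50100</value>
</property>
```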
Any thoughts/help would be greatly appreciated. Banged my head on the wall way too long on this one.
Edit: Forgot the versions - Hadoop/Yarn 2.8.0, Spark 2.1.0, CentOS 7