
I know that Hadoop has the Fair Scheduler, which lets us assign a job to a priority group so that cluster resources are allocated to the job based on that priority. What I am not sure about is how the Hadoop cluster prioritizes a non-MapReduce program. Specifically, how does Hadoop prioritize writes from external clients (say, a standalone program that opens an HDFS file directly and streams data to it) when the cluster is busy running MapReduce jobs?


1 Answer


The ResourceManager can only prioritize jobs that are submitted to it (MapReduce applications, Spark jobs, and so on).

Other than distcp (which runs as a MapReduce job), HDFS operations interact only with the NameNode and DataNodes, not the ResourceManager, so the NameNode handles them in the order they are received.
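To make the distinction concrete, here is a toy model in Python. It is not Hadoop code and does not use any Hadoop API: `ToyResourceManager`, `ToyNameNode`, the job names, and the paths are all made up for illustration, and "lower number means higher priority" is an assumption of the sketch. It only shows the idea above: submitted jobs are reordered by priority, while file-system requests are served in arrival order no matter who sent them.

```python
import heapq
from collections import deque


class ToyResourceManager:
    """Toy stand-in for a scheduler: runs submitted jobs by priority."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker: keeps submission order within a priority

    def submit(self, name, priority):
        # Assumption of this sketch: lower number = higher priority.
        heapq.heappush(self._heap, (priority, self._seq, name))
        self._seq += 1

    def drain(self):
        """Return job names in the order they would be run."""
        order = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            order.append(name)
        return order


class ToyNameNode:
    """Toy stand-in for the NameNode: serves requests in arrival order."""

    def __init__(self):
        self._queue = deque()

    def request(self, client, op):
        # No priority field at all: the sender's identity does not matter.
        self._queue.append((client, op))

    def drain(self):
        """Return (client, op) pairs in the order they would be handled."""
        order = []
        while self._queue:
            order.append(self._queue.popleft())
        return order


def demo():
    rm = ToyResourceManager()
    rm.submit("nightly-report", priority=5)
    rm.submit("urgent-etl", priority=1)  # submitted later, runs first

    nn = ToyNameNode()
    nn.request("mapreduce-task", "create /out/part-0")
    nn.request("standalone-writer", "create /ingest/stream.log")

    return rm.drain(), nn.drain()


if __name__ == "__main__":
    jobs, requests = demo()
    print("job order:", jobs)
    print("request order:", requests)
```

In this sketch the standalone writer is neither favored nor penalized relative to the MapReduce task; it simply waits its turn behind whatever request arrived before it.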