0 votes

As I understand it, files stored in HDFS are divided into blocks, and each block is replicated to multiple nodes (3 by default). How does the Hadoop framework choose which node runs a map task, out of all the nodes holding a replica of a particular block?


2 Answers

0 votes

As far as I know, there will be as many map tasks as there are input blocks (more precisely, input splits, which by default correspond to blocks).

See manual here.

Usually, the framework chooses a node close to the input block, ideally a node that stores a replica, otherwise one on the same rack, to reduce the network bandwidth the map task consumes.
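That preference order (node-local, then rack-local, then off-rack) can be sketched roughly as below. The host and rack names are made up for illustration, and a real cluster resolves racks via a configured topology script; schedulers such as YARN's implement this logic internally.

```java
import java.util.List;
import java.util.Map;

public class LocalityChooser {
    // Illustrative rack mapping; a real cluster gets this from rack awareness config.
    static final Map<String, String> RACK = Map.of(
            "node1", "/rack1", "node2", "/rack1",
            "node3", "/rack2", "node4", "/rack2");

    /** Classify how "local" a candidate worker node is to a block's replicas. */
    static String locality(String worker, List<String> replicaHosts) {
        if (replicaHosts.contains(worker)) {
            return "NODE_LOCAL";        // worker itself holds a replica
        }
        for (String host : replicaHosts) {
            if (RACK.get(host).equals(RACK.get(worker))) {
                return "RACK_LOCAL";    // a replica lives on the same rack
            }
        }
        return "OFF_RACK";              // data must cross racks over the network
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("node1", "node2", "node3"); // 3 replicas
        System.out.println(locality("node1", replicas)); // NODE_LOCAL
        System.out.println(locality("node4", replicas)); // RACK_LOCAL: shares /rack2 with node3
    }
}
```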

That's all I know.

0 votes

In MapReduce 1 it depends on how many map tasks are already running on the datanode that hosts a replica, because each tasktracker has a fixed number of map slots in MR1. In MR2 (YARN) there are no fixed slots, so it depends on the resources already in use by the tasks running on that node.
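The difference can be sketched with made-up numbers: MR1 admits a task only when a fixed map slot is free, while MR2/YARN instead checks whether the node has enough spare memory and vcores for the requested container.

```java
public class SchedulingCheck {
    // MR1-style: each tasktracker is configured with a fixed number of map slots.
    static boolean mr1CanRun(int mapSlots, int runningMaps) {
        return runningMaps < mapSlots;
    }

    // MR2/YARN-style: no slots; compare the container request to free resources.
    static boolean mr2CanRun(int freeMemMb, int freeVcores, int reqMemMb, int reqVcores) {
        return reqMemMb <= freeMemMb && reqVcores <= freeVcores;
    }

    public static void main(String[] args) {
        System.out.println(mr1CanRun(2, 2));             // false: both slots busy
        System.out.println(mr2CanRun(3072, 2, 1024, 1)); // true: enough memory and cores free
    }
}
```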