0
votes

According to http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Reducer.html, the reducer copies the sorted output from each Mapper using HTTP across the network.

What port on a node is used for this data transfer? Is it 50060 by default?


1 Answer

1
votes

It's the HTTP port of the TaskTracker running on each slave node, which is usually 50060. You can confirm it in the TaskTracker log file:

2012-05-29 20:24:23,925 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2012-05-29 20:24:23,925 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2012-05-29 20:24:23,925 INFO org.mortbay.log: jetty-6.1.26
2012-05-29 20:24:24,283 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
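
If you would rather not read the log by eye, the check can be scripted. Below is a minimal sketch in Python that scans log lines for the "Jetty bound to port" message shown above and returns the port number. The function name `find_http_port` and the `sample` list are made up for illustration; the sample just reuses lines from the log excerpt.

```python
import re

# Matches the line HttpServer logs once Jetty has bound its port.
JETTY_PORT_RE = re.compile(r"Jetty bound to port (\d+)")


def find_http_port(log_lines):
    """Return the last port Jetty reported binding to, or None if not found."""
    port = None
    for line in log_lines:
        match = JETTY_PORT_RE.search(line)
        if match:
            # Keep the most recent match in case the daemon was restarted.
            port = int(match.group(1))
    return port


# Sample lines copied from the log excerpt above (illustrative input only).
sample = [
    "2012-05-29 20:24:23,925 INFO org.apache.hadoop.http.HttpServer: "
    "Jetty bound to port 50060",
    "2012-05-29 20:24:23,925 INFO org.mortbay.log: jetty-6.1.26",
]

print(find_http_port(sample))
```

To run it against a real log, pass an open file object instead of `sample`; the log location depends on your installation. In Hadoop 1.x the bind address and port should come from the `mapred.task.tracker.http.address` property in mapred-site.xml, whose default is 0.0.0.0:50060, so a different value there would explain a different port in the log.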