Hadoop Mapper running slow

Question

I am trying to run a job with both mappers and reducers, but the mappers are running slowly.

If I disable the reducers for the same input, the mappers finish in 3 minutes. With reducers enabled, the mappers have still not finished after 30 minutes.

I am using Hadoop 1.0.3. I tried both with and without compression of the map output. I removed the older version, Hadoop 0.20.203, and reinstalled everything from scratch for 1.0.3.

Also, the JobTracker logs are filled with:

2012-10-03 10:26:20,138 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54311: readAndProcess threw exception java.lang.RuntimeException: readObject can't find class . Count of bytes read: 0
java.lang.RuntimeException: readObject can't find class
        at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:185)
        at org.apache.hadoop.ipc.RPC$Invocation.readFields(RPC.java:102)
        at org.apache.hadoop.ipc.Server$Connection.processData(Server.java:1303)
        at org.apache.hadoop.ipc.Server$Connection.processOneRpc(Server.java:1282)
        at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1182)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:537)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:344)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.ClassNotFoundException:
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
        at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)

Can anyone tell me what may be wrong?

Do you have a combiner configured? That is one reason I can think of for your mappers failing to complete when run with reducers. - Chris White
Can you share your job configuration / generated job.xml (maybe via pastebin)? - Chris White
Can you paste your job configuration here? If possible, the reducer code as well. - Manish Verma

Sam Sam · Accepted Answer · 2015-01-30T07:28:54

If your mappers complete in 3 minutes, they are not slow given the batch-processing nature of MapReduce. With the version of MapReduce you are using, you need to make sure you are using the correct number of reducers. If your cluster size is X, try setting the number of reducers to X-1. See whether this helps.
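As a rough illustration of the X-1 suggestion, here is a minimal sketch in plain Java with no Hadoop dependency. The class name `ReducerCount` and the method `reducersFor` are hypothetical names made up for this example, and the floor of 1 is my own assumption (a count of 0 would turn the job into a map-only job, which is what disabling the reducers does). In a real driver the result would typically be passed to `Job.setNumReduceTasks(int)`, or set through the `mapred.reduce.tasks` property in Hadoop 1.x.

```java
// Hypothetical helper: derives a reducer count from the cluster size
// using the "X - 1" rule of thumb from the answer above.
class ReducerCount {

    // clusterSize is "X" in the answer. Never return less than 1,
    // because 0 reducers would make the job map-only (an assumption
    // added for this sketch, not part of the original advice).
    static int reducersFor(int clusterSize) {
        return Math.max(1, clusterSize - 1);
    }

    public static void main(String[] args) {
        int clusterSize = 4; // example value, not taken from the question
        int reducers = reducersFor(clusterSize);
        // In a Hadoop driver this value would be applied with something like:
        //   job.setNumReduceTasks(reducers);
        System.out.println("cluster size " + clusterSize + " -> " + reducers + " reducers");
    }
}
```

Whether X-1 is actually the right count depends on the cluster and the job, so treat the number as a starting point to experiment with rather than a fixed rule.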
