Hadoop JobHistory shows only the failed jobs

Question

I'm trying to monitor the jobs for the sample MapReduce application called Find Maximum Temperature in the The Definitive Hadoop book. On the default installation and configuration of Hadoop-2.6, that application works perfectly, i.e. it calculates the yearly maximum temperatures. But after I extended the configurations of mapred-site.xml and yarn-site.xml like these: (taken from How do I view my Hadoop job history and logs using CDH4 and Yarn? and YARN job history not coming)

mapred-site.xml:

<property>
  <name> mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property> 
  <name>mapreduce.jobhistory.address</name>
  <value>localhost:10020</value> 
</property>
<property> 
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>localhost:19888</value> 
</property>

yarn-site.xml:

  <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
  </property>
  <property>
     <name>yarn.nodemanager.remote-app-log-dir</name>
     <value>/app-logs</value>
  </property>
  <property>
      <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
      <value>logs</value>
  </property>

and when I run the same MaxTemperature application, the application finishes fine and it outputs the file called part-r-00000 but it can't be seen on the JobHistory page at localhost:19888. (Meanwhile the other pages at localhost:8042, localhpst:8088 and localhost:50070 work fine)

Is there a way I can see all the jobs, as they are running on any Hadoop page?

Sometimes when I run the same application, it gives this error:

17/09/19 11:07:49 INFO mapreduce.Job: Task Id : attempt_1505767853223_0003_m_000005_1, Status : FAILED Container launch failed for container_1505767853223_0003_01_000013 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:168) at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)

and if it gives this error, it shows up at the JobHistory page. I have no idea why it sometimes fails, but it occurs after a new start of Hadoop: start-dfs.sh and start-yarn.sh and the /usr/local/hadoop-2.6.0/sbin/mr-jobhistory-daemon.sh start historyserver Here is a SS after 3 jobs failed as such:

Terry Terry · Accepted Answer · 2017-09-19T16:20:24

A google search for the org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist returned this SO post org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist

Adding those lines to configuration in yarn-site.xml:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

solved the problem. Now all jobs, whether failed or succeeded come up at the JobHistory page. Here's an SS:

Hadoop JobHistory shows only the failed jobs

1 Answers