I am very new to Hadoop and HBase.
My use case is simple: I want to get the reduce input groups counter for a job at run time (i.e., read the counter as it is updated from the initiation to the termination of the job).
What I have searched so far: all job-related logs are written under the directory /var/log/hadoop/userlogs, as shown below:
[root@dev1-slave1 userlogs]# pwd
/var/log/hadoop/userlogs
[root@dev1-slave1 userlogs]# ll
total 24
drwx--x--- 2 mapred mapred 4096 Jan 13 19:59 job_201501121917_0008
drwx--x--- 2 mapred mapred 4096 Jan 13 11:31 job_201501121917_0009
drwx--x--- 2 mapred mapred 4096 Jan 13 12:01 job_201501121917_0010
drwx--x--- 2 mapred mapred 4096 Jan 13 12:13 job_201501121917_0011
drwx--x--- 2 mapred mapred 4096 Jan 13 12:23 job_201501121917_0012
drwx--x--- 2 mapred mapred 4096 Jan 13 19:59 job_201501121917_0013
Under each job directory, there are attempt directories such as attempt_201501121917_0013_m_000000_0 (mapper log) and attempt_201501121917_0013_r_000000_0 (reducer log).
The reducer log directory attempt_201501121917_0013_r_000000_0 contains a syslog file with information about the job run, but it doesn't show anything about the counter.
From the Hadoop JobTracker UI, I can see the counter reduce input groups being updated until the job finishes, but I could not find the same information anywhere else.
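One thing I came across in the hadoop job usage text (untested on my 1.0.3-Intel cluster, and I am not sure the counter group name is right for this version) is that a single counter can apparently be printed from the command line:

```shell
# Untested sketch: assumes the reduce input groups counter lives in the
# default Hadoop 1.x task counter group "org.apache.hadoop.mapred.Task$Counter"
hadoop job -counter job_201501121917_0013 \
    'org.apache.hadoop.mapred.Task$Counter' REDUCE_INPUT_GROUPS
```

But this only gives me a snapshot each time I run it, not a stream of updates.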
How can I achieve this? Is there any Java API to get per-job counters from another application (NOT the application that is running the MapReduce job)?
Are there any other logs or files I should look into?
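From browsing the old mapred API docs, I am guessing something like the following might work from a separate application (untested; the JobTracker address and job ID below are placeholders from my cluster, and I am not sure JobClient behaves this way on 1.0.3-Intel):

```java
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class CounterPoller {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Assumed JobTracker address for my cluster; adjust as needed.
        conf.set("mapred.job.tracker", "dev1-master:9001");
        JobClient client = new JobClient(conf);

        // Look up the running job by its ID and poll its counters
        // until the job completes.
        RunningJob job = client.getJob(JobID.forName("job_201501121917_0013"));
        while (job != null && !job.isComplete()) {
            Counters counters = job.getCounters();
            long groups = counters.findCounter(
                    "org.apache.hadoop.mapred.Task$Counter",
                    "REDUCE_INPUT_GROUPS").getValue();
            System.out.println("Reduce input groups so far: " + groups);
            Thread.sleep(5000);
        }
    }
}
```

Is this the right approach, or is there a better-supported way?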
I hope my requirement is clear.
UPDATE:
Hadoop version: Hadoop 1.0.3-Intel