I have a Hadoop job that creates counters with very long names, for example: stats.counters.server-name.job.job-name.mapper.site.site-name.qualifier.qualifier-name.super-long-string-which-is-not-within-standard-limits. This counter name is truncated both in the web interface and in the result of a getName() method call. I found out that Hadoop limits the maximum length of counter names and that the setting mapreduce.job.counters.counter.name.max configures this limit. I increased it to 500, and the web interface now shows the full counter name, but getName() on the counter still returns the truncated name.
Could somebody please explain this or point out my mistake? Thank you.
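To make the symptom concrete, here is a minimal plain-Java sketch of what a name-length limit does to the counter name above. It is not Hadoop code: the class CounterNameLimitSketch, the method limitName, and the limit of 64 are assumptions made up for illustration, and Hadoop's actual truncation logic and default value may differ in detail.

```java
public class CounterNameLimitSketch {
    // Assumed limit for illustration only; not read from any Hadoop configuration.
    static final int ASSUMED_SMALL_LIMIT = 64;

    // The value I configured for mapreduce.job.counters.counter.name.max.
    static final int CONFIGURED_LIMIT = 500;

    // Hypothetical helper: cut the name down to at most maxLen characters.
    static String limitName(String name, int maxLen) {
        return name.length() > maxLen ? name.substring(0, maxLen) : name;
    }

    public static void main(String[] args) {
        String full = "stats.counters.server-name.job.job-name.mapper.site.site-name."
                + "qualifier.qualifier-name."
                + "super-long-string-which-is-not-within-standard-limits";

        // With a small limit the tail of the name is lost.
        System.out.println(limitName(full, ASSUMED_SMALL_LIMIT));

        // With the limit raised to 500 the name fits and comes back unchanged.
        System.out.println(limitName(full, CONFIGURED_LIMIT));
    }
}
```

The second call is what I expected from getName() after raising the limit, while the first is closer to what I actually observe.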
EDIT 1
My Hadoop setup is a single server running HDFS, YARN, and MapReduce. The job increments some counters while it runs, and after it completes I fetch the counters in ToolRunner using org.apache.hadoop.mapreduce.Job#getCounters.
EDIT 2
Hadoop version is the following:
Hadoop 2.6.0-cdh5.8.0
Subversion http://github.com/cloudera/hadoop -r 042da8b868a212c843bcbf3594519dd26e816e79
Compiled by jenkins on 2016-07-12T22:55Z
Compiled with protoc 2.5.0
From source with checksum 2b6c319ecc19f118d6e1c823175717b5
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.8.0.jar
I investigated further, and it seems that this issue describes a situation similar to mine. It is confusing, though, because I am able to increase the number of counters but not the length of a counter's name...
EDIT 3
Today I spent quite some time debugging Hadoop internals. Some interesting findings:
- The org.apache.hadoop.mapred.ClientServiceDelegate#getJobCounters method returns a bunch of counters from YARN with TRUNCATED names and FULL display names.
- I was unable to debug the mappers and reducers themselves, but logging suggests that the org.apache.hadoop.mapreduce.Counter#getName method works correctly during reducer execution.
getName() call that still returns the truncated name? Is this iterating over the counters returned from Job#getCounters() in the submitting client after waiting for job completion, or is it a separate application querying counters from the job history server, or is it something else entirely? I would expect your configuration to be sufficient. The web UI uses the same getName() call. (It would not retroactively fix truncated counter names from jobs submitted before the configuration change though.) – Chris Nauroth
stats.counters.server-name.job.job-name.mapper.site.site-name.qualifier.qualifier-name.super-long-string-which-is-not-within-standard-limits – maxmithun