
When I run a Hive statement and the corresponding MR jobs are launched it usually has a line like:

Stage-Stage-1: Map: 33 Reduce: 131 Cumulative CPU: 8006.47 sec HDFS Read: 1280804751 HDFS Write: 279261996966 SUCCESS

Total MapReduce CPU Time Spent: 0 days 2 hours 13 minutes 26 seconds 470 msec

I had some questions about interpreting that line.

  1. What units are the numbers 1280804751, 279261996966 in? Bytes? Blocks? Any way to convert them to human-readable format?
  2. What does the "Total MapReduce CPU Time Spent" mean? What does "Cumulative CPU" mean?

1 Answer

  1. The HDFS Read and HDFS Write values are in bytes. There is no built-in option to print them in human-readable form, but they are easy to convert.

  2. Cumulative CPU is the total CPU time consumed by all tasks (map and reduce) of the MapReduce job for that stage. Total MapReduce CPU Time Spent is the total CPU time across all stages of the query. Your query has only one stage, so the two values are equal: 8006.47 sec is exactly 2 hours 13 minutes 26 seconds 470 msec.
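To illustrate both points, here is a small Python sketch (not part of Hive; the helper names are my own) that converts the byte counts from your output line into human-readable binary units and breaks the cumulative CPU seconds into hours, minutes, and seconds:

```python
def human_bytes(n):
    """Convert a raw byte count to a human-readable string (binary units)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024:
            return f"{n:.2f} {unit}"
        n /= 1024
    return f"{n:.2f} PiB"

def hms(seconds):
    """Break a seconds total into (hours, minutes, seconds)."""
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return int(h), int(m), s

print(human_bytes(1280804751))    # HDFS Read  -> about 1.19 GiB
print(human_bytes(279261996966))  # HDFS Write -> about 260 GiB
print(hms(8006.47))               # Cumulative CPU -> roughly (2, 13, 26.47)
```

This confirms that the reported Cumulative CPU (8006.47 sec) and the Total MapReduce CPU Time Spent (2 hours 13 minutes 26 seconds 470 msec) describe the same quantity in your single-stage query.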