MapReduce counters give us a window into the runtime details of a MapReduce job. I spent this March on MapReduce performance tuning, and most of my optimizations were driven by the values of these counters. MapReduce ships with many default counters, and some readers may have questions about them, so here I analyze what each default counter means, making it easier to interpret job results.
My analysis is based on Hadoop 0.21. I have looked at the counter output of other Hadoop versions, and the details are essentially the same; where differences exist, defer to your actual version.
Counters have the concept of a group: a group logically contains all counters of the same scope. The default counters provided by a MapReduce job fall into five groups, described below. I also include my own test data for comparison; it appears as a table in the description of each group.
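Before diving into the groups, note that all of these counters can also be read programmatically once a job completes. The following is a minimal sketch, assuming the new org.apache.hadoop.mapreduce API; the class and method names (CounterDump, dump) are mine for illustration, not from any particular Hadoop example. It iterates over every counter group and prints each counter's name and value.

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class CounterDump {
    // Print every counter group and counter of a finished job.
    // Assumes 'job' was already configured elsewhere (hypothetical caller).
    public static void dump(Job job) throws Exception {
        job.waitForCompletion(true);            // block until the job finishes
        Counters counters = job.getCounters();
        for (CounterGroup group : counters) {   // e.g. "FileSystemCounters"
            System.out.println(group.getDisplayName());
            for (Counter counter : group) {     // e.g. "HDFS_BYTES_READ"
                System.out.println("  " + counter.getDisplayName()
                        + " = " + counter.getValue());
            }
        }
    }
}
```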
FileInputFormatCounters
This group reports statistics on the content (total input data) that the map tasks read from their input files.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileInputFormatCounters | BYTES_READ | 1,109,990,596 | 0 | 1,109,990,596 |
BYTES_READ
The total input data (in bytes) of all map tasks, i.e. the sum of the bytes of all values passed into the map method of every map task.
FileSystemCounters
The data a MapReduce job operates on may live on different file systems; this group reports the job's read and write statistics against each file system.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileSystemCounters | FILE_BYTES_READ | 0 | 1,544,520,838 | 1,544,520,838 |
| | FILE_BYTES_WRITTEN | 1,544,537,310 | 1,544,520,838 | 3,089,058,148 |
| | HDFS_BYTES_READ | 1,110,269,508 | 0 | 1,110,269,508 |
| | HDFS_BYTES_WRITTEN | 0 | 827,982,518 | 827,982,518 |
FILE_BYTES_READ
The number of bytes the job reads from the local file system. Assuming the current map input comes from HDFS, this value should be 0 during the map phase. On the reduce side, however, the shuffled data is merged onto local disk before reduce executes, so this value is the total number of input bytes across all reduces.
FILE_BYTES_WRITTEN
The intermediate map results are spilled to local disk, and the final spill files are formed once the map finishes, so the map-side value indicates how many bytes the map tasks wrote to local disk. Correspondingly, during shuffle the reduce side keeps pulling intermediate results from the map side, merging them and spilling to its own local disk; in the end a single file is produced, which becomes the reduce's input file.
HDFS_BYTES_READ
Over the entire job, data is read from HDFS only while the map side runs, and this covers not just the source file contents but also the split metadata of every map. So this value should be slightly larger than FileInputFormatCounters.BYTES_READ.
HDFS_BYTES_WRITTEN
The final reduce results are written to HDFS; this is the total size of the job's output.
Shuffle Errors
This group records the number of errors encountered during the shuffle process; they basically arise in the shuffle stage, when the copy threads fetch intermediate data from the map side.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Shuffle Errors | BAD_ID | 0 | 0 | 0 |
| | CONNECTION | 0 | 0 | 0 |
| | IO_ERROR | 0 | 0 | 0 |
| | WRONG_LENGTH | 0 | 0 | 0 |
| | WRONG_MAP | 0 | 0 | 0 |
| | WRONG_REDUCE | 0 | 0 | 0 |
BAD_ID
Every map output has an ID such as attempt_201109020150_0254_m_000000_0; if the ID in the metadata fetched by a reduce's copy thread is not in the standard format, this counter is incremented.
CONNECTION
Incremented when a copy thread fails to establish its connection to the map side.
IO_ERROR
Incremented whenever a reduce's copy thread hits an IOException while fetching map-side data.
WRONG_LENGTH
The map-side intermediate result is compressed, formatted data carrying two pieces of length information: the size of the metadata and the size of the compressed data. If either length is transmitted incorrectly, this counter is incremented.
WRONG_MAP
Each copy thread has a definite purpose: to fetch the intermediate output of particular maps for its reduce. If the map data currently fetched is not from a map assigned to that copy thread, the wrong data has been pulled, and this counter is incremented.
WRONG_REDUCE
Consistent with the above: if the fetched data turns out not to be intended for this reduce, the wrong data was pulled as well, and this counter is incremented.
Job Counters
This group describes the statistics associated with job scheduling.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Job Counters | Data-local map tasks | 0 | 0 | 67 |
| | FALLOW_SLOTS_MILLIS_MAPS | 0 | 0 | 0 |
| | FALLOW_SLOTS_MILLIS_REDUCES | 0 | 0 | 0 |
| | SLOTS_MILLIS_MAPS | 0 | 0 | 1,210,936 |
| | SLOTS_MILLIS_REDUCES | 0 | 0 | 1,628,224 |
| | Launched map tasks | 0 | 0 | 67 |
| | Launched reduce tasks | 0 | 0 | 8 |
Data-local map tasks
The number of map tasks scheduled data-locally, i.e. launched on a TaskTracker that holds a local copy of the task's input data.
FALLOW_SLOTS_MILLIS_MAPS
The total time for which the job held slots reserved for map tasks while they sat idle (reserved but not yet executing).
FALLOW_SLOTS_MILLIS_REDUCES
The same as above, but for reduce slots.
SLOTS_MILLIS_MAPS
The total time all map tasks occupied their slots, including execution time and the time spent creating and destroying the child JVMs.
SLOTS_MILLIS_REDUCES
The same as above, but for reduce tasks.
Launched map tasks
The number of map tasks launched by this job.
Launched reduce tasks
The number of reduce tasks launched by this job.
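For reference, a hedged sketch of pulling one of these scheduling counters by its enum constant. In recent Hadoop versions the constants live in the org.apache.hadoop.mapreduce.JobCounter enum; the exact constant names may differ by release, so check your version.

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;

public class SchedulingStats {
    // Fetch the number of launched map tasks from a finished job.
    // JobCounter.TOTAL_LAUNCHED_MAPS backs the "Launched map tasks" row above;
    // verify the enum constants against your Hadoop version.
    public static long launchedMaps(Job job) throws Exception {
        return job.getCounters()
                  .findCounter(JobCounter.TOTAL_LAUNCHED_MAPS)
                  .getValue();
    }
}
```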
Map-Reduce Framework
This counter group contains a large amount of job execution detail. A few concepts first: a record generally denotes one row of data, and the corresponding bytes measure how large that row is; a group denotes the reduce-side input after the merge, of the form {"AAA", [5, 2, 8, ...]}, i.e. one key together with all of its values.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Map-Reduce Framework | Combine input records | 200,000,000 | 0 | 200,000,000 |
| | Combine output records | 117,838,546 | 0 | 117,838,546 |
| | Failed Shuffles | 0 | 0 | 0 |
| | GC time elapsed (ms) | 23,472 | 46,588 | 70,060 |
| | Map input records | 10,000,000 | 0 | 10,000,000 |
| | Map output bytes | 1,899,990,596 | 0 | 1,899,990,596 |
| | Map output records | 200,000,000 | 0 | 200,000,000 |
| | Merged Map outputs | 0 | 536 | 536 |
| | Reduce input groups | 0 | 84,879,137 | 84,879,137 |
| | Reduce input records | 0 | 117,838,546 | 117,838,546 |
| | Reduce output records | 0 | 84,879,137 | 84,879,137 |
| | Reduce shuffle bytes | 0 | 1,544,523,910 | 1,544,523,910 |
| | Shuffled Maps | 0 | 536 | 536 |
| | Spilled Records | 117,838,546 | 117,838,546 | 235,677,092 |
| | SPLIT_RAW_BYTES | 8,576 | 0 | 8,576 |
Combine input records
The combiner exists to minimize the amount of data that must be pulled across the network, so the number of combine input records equals the number of map output records.
Combine output records
After the combiner runs, data sharing the same key has been collapsed and many duplicates were resolved on the map side; this value is the final number of records in the map-side intermediate files.
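As an illustration, here is a minimal word-count-style combiner sketch (the class name SumCombiner is mine, not from this post): every value iterated in reduce() counts toward Combine input records, and every context.write() counts toward Combine output records. In word count, the same class typically doubles as the reducer.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Collapses duplicate keys on the map side before the shuffle.
public class SumCombiner
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable sum = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
                          Context context)
            throws IOException, InterruptedException {
        int total = 0;
        for (IntWritable v : values) {  // each value: +1 Combine input records
            total += v.get();
        }
        sum.set(total);
        context.write(key, sum);        // each write: +1 Combine output records
    }
}
// Wired up with: job.setCombinerClass(SumCombiner.class);
```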
Failed Shuffles
The number of failed shuffle fetches: copy threads that hit a network or I/O exception while fetching intermediate data from the map side.
GC time elapsed (ms)
The total garbage-collection time of the child JVMs that executed the maps and reduces, obtained via JMX.
Map input records
The total number of input records (rows) all map tasks read from HDFS.
Map output records
The number of records output directly by the map tasks, i.e. the number of times context.write is called in the map method; in other words, the raw output count before any combine.
Map output bytes
The map output keys and values are serialized into an in-memory buffer, so the bytes here are the total size after serialization.
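To make the three map-side counters above concrete, a matching word-count-style mapper sketch (TokenMapper is a hypothetical name): the framework bumps Map input records once per map() call, Map output records once per context.write(), and Map output bytes by the serialized size of each emitted key/value pair.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // One call to map() per input row: +1 Map input records.
        for (String token : line.toString().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            word.set(token);
            context.write(word, ONE);  // +1 Map output records, plus the
                                       // serialized bytes to Map output bytes
        }
    }
}
```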
Merged Map outputs
The total number of map outputs merged during the shuffle process.
Reduce input groups
The total number of groups (distinct keys) read by all reduces; the framework invokes reduce once per group.
Reduce input records
With a combiner, this value equals the final combine output record count on the map side; without one, it equals the map output record count.
Reduce output records
The total number of records output by all reduces after execution.
Reduce shuffle bytes
The total number of bytes of intermediate map output fetched by the reduce-side copy threads, representing the sum of the final intermediate file sizes of every map task.
Shuffled Maps
Nearly every reduce has to pull data from every map; each time a copy thread successfully fetches one map's output, this counter increases by 1, so the total is essentially the number of reduces times the number of maps. In this job, 67 maps × 8 reduces = 536, matching the table above.
Spilled Records
Spilling happens on both the map and the reduce side; this counts the total number of records spilled from memory to disk.
SPLIT_RAW_BYTES
The data for each map task's split is stored in HDFS, and the stored metadata describes how the data is laid out (for example, whether it is compressed) and what its concrete type is. This is extra information added by the MapReduce framework, unrelated to the job itself; the value recorded here is the size in bytes of that extra split information.