PS:
In map and reduce you can set the task status at any time by calling context.setStatus(); under the hood this also goes through the Reporter.
1. Using counters in version 0.20.x is simple: you can use them directly. If a counter does not exist yet, Hadoop creates it automatically.
Counter ct = context.getCounter("input_words", "count");
ct.increment(1);
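Putting the two pieces together, here is a minimal sketch of a 0.20.x-style mapper that uses both setStatus() and a dynamically created counter; the class name, tokenization, and status message are my own illustrations, not from the original post:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical word-counting mapper that bumps a custom counter per token.
public class WordMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.setStatus("processing offset " + key.get()); // free-form status, as in the PS
        for (String token : value.toString().split("\\s+")) {
            word.set(token);
            context.write(word, ONE);
            // Group "input_words", counter "count"; created on first use.
            context.getCounter("input_words", "count").increment(1);
        }
    }
}
```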
2. In version 0.19.x, you need to define an enum:
enum MyCounter { INPUT_WORDS };
reporter.incrCounter(MyCounter.INPUT_WORDS, 1);
RunningJob job = JobClient.runJob(conf);
Counters c = job.getCounters();
long cnt = c.getCounter(MyCounter.INPUT_WORDS);
3. The default counters explained
MapReduce counters give us a window onto the detailed data of a running MapReduce job. Around March this year I focused on MapReduce performance tuning, and most judgments about whether an optimization worked were based on counter values. MapReduce ships with many default counters, and some of them may raise questions; below I analyze what these default counters mean, to help you interpret your job's results.
My analysis is based on Hadoop 0.21. I have also looked at the counters of other Hadoop versions; the details are similar, and wherever they differ, your actual version is authoritative.
Counters have the concept of a group: a group represents all counters in the same logical scope. The default counters a MapReduce job provides fall into five groups. I also use my own test data here for a concrete comparison; it appears as a table in the description of each group.
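To see these groups for yourself, you can iterate over a finished job's counters. A minimal sketch against the new (org.apache.hadoop.mapreduce) API, assuming `job` is a completed Job; in this API a Counters object iterates over its CounterGroups, and each group iterates over its Counters:

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

// Dump every counter group of a completed job, mirroring the tables below.
public static void dumpCounters(Job job) throws Exception {
    Counters counters = job.getCounters();
    for (CounterGroup group : counters) {      // one group per table below
        System.out.println(group.getDisplayName());
        for (Counter counter : group) {        // the counters within that group
            System.out.printf("  %s = %d%n", counter.getDisplayName(), counter.getValue());
        }
    }
}
```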
FileInputFormatCounters
This group reports statistics on the content read by the map tasks, i.e. the total input data.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileInputFormatCounters | BYTES_READ | 1,109,990,596 | 0 | 1,109,990,596 |
BYTES_READ
The total input bytes of all map tasks, equal to the sum of the bytes of every value passed into each map task's map method.
FileSystemCounters
A MapReduce job relies on different file systems while it executes. This group reports the job's read and write statistics against each of them.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileSystemCounters | FILE_BYTES_READ | 0 | 1,544,520,838 | 1,544,520,838 |
| | FILE_BYTES_WRITTEN | 1,544,537,310 | 1,544,520,838 | 3,089,058,148 |
| | HDFS_BYTES_READ | 1,110,269,508 | 0 | 1,110,269,508 |
| | HDFS_BYTES_WRITTEN | 0 | 827,982,518 | 827,982,518 |
FILE_BYTES_READ
The number of bytes the job reads from the local file system. Assuming the current map's input comes from HDFS, this value should be 0 during the map phase. Before a reduce executes, however, its input data has already been merged by the shuffle onto the reduce side's local disk, so this value is the total input bytes of all reduces.
FILE_BYTES_WRITTEN
A map's intermediate results are all spilled to local disk, and after the map finishes they form its final spill file, so the map-side value is the total bytes the map tasks write to local disk. Correspondingly, while shuffling, the reduce side keeps pulling the maps' intermediate results, merging them and spilling to its own local disk until a single file is formed: the reduce's input file.
HDFS_BYTES_READ
While a job executes, only the running map side reads from HDFS, and it reads not just the source file content but also the split metadata of every map. This value should therefore be slightly larger than FileInputFormatCounters.BYTES_READ.
HDFS_BYTES_WRITTEN
The final reduce results are written to HDFS; this is the total size of the job's output.
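As a concrete illustration, the four values above can be fetched by group and counter name. A hedged sketch: the string names below match the 0.20/0.21-era display names used in this table and may differ in other versions, and `counters` is assumed to come from a completed job as in the earlier snippet:

```java
// Fetch the FileSystemCounters shown above from a completed job's counters.
long hdfsRead     = counters.findCounter("FileSystemCounters", "HDFS_BYTES_READ").getValue();
long hdfsWritten  = counters.findCounter("FileSystemCounters", "HDFS_BYTES_WRITTEN").getValue();
long localRead    = counters.findCounter("FileSystemCounters", "FILE_BYTES_READ").getValue();
long localWritten = counters.findCounter("FileSystemCounters", "FILE_BYTES_WRITTEN").getValue();
System.out.printf("HDFS read %d / written %d, local read %d / written %d%n",
        hdfsRead, hdfsWritten, localRead, localWritten);
```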
Shuffle Errors
This group counts how many times each kind of error occurred during the shuffle, essentially in the phase where the copy threads fetch the maps' intermediate data.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Shuffle Errors | BAD_ID | 0 | 0 | 0 |
| | CONNECTION | 0 | 0 | 0 |
| | IO_ERROR | 0 | 0 | 0 |
| | WRONG_LENGTH | 0 | 0 | 0 |
| | WRONG_MAP | 0 | 0 | 0 |
| | WRONG_REDUCE | 0 | 0 | 0 |
BAD_ID
Every map attempt has an ID, such as attempt_201109020150_0254_m_000000_0. If the ID in the metadata fetched by a reduce's copy thread is not in the standard format, this counter is incremented.
CONNECTION
Indicates that a copy thread's connection to the map side went wrong.
IO_ERROR
Incremented whenever a reduce's copy thread hits an IOException while fetching data from the map side.
WRONG_LENGTH
A map's intermediate results are compressed data carrying two length fields: the source data size and the compressed size. If either length is transmitted incorrectly (e.g. negative), this counter is incremented.
WRONG_MAP
Every copy thread has a definite purpose: to fetch the intermediate results of certain maps for one reduce. If the map data currently fetched does not belong to a map assigned to that copy thread, the data was pulled incorrectly.
WRONG_REDUCE
Same idea as above: if the fetched data indicates it was not prepared for this reduce, the data was pulled incorrectly.
Job Counters
This group describes statistics related to job scheduling.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Job Counters | Data-local map tasks | 0 | 0 | 67 |
| | FALLOW_SLOTS_MILLIS_MAPS | 0 | 0 | 0 |
| | FALLOW_SLOTS_MILLIS_REDUCES | 0 | 0 | 0 |
| | SLOTS_MILLIS_MAPS | 0 | 0 | 1,210,936 |
| | SLOTS_MILLIS_REDUCES | 0 | 0 | 1,628,224 |
| | Launched map tasks | 0 | 0 | 67 |
| | Launched reduce tasks | 0 | 0 | 8 |
Data-local map tasks
Counts the map tasks the scheduler launched data-locally, i.e. whose input split resides on the same TaskTracker node that executes the map task.
FALLOW_SLOTS_MILLIS_MAPS
The total time that slots reserved for this job's map tasks were held but sat idle before the tasks ran.
FALLOW_SLOTS_MILLIS_REDUCES
The same, for reduce tasks.
SLOTS_MILLIS_MAPS
The total slot time occupied by all map tasks, including the execution time itself plus the time spent creating and destroying the child JVMs.
SLOTS_MILLIS_REDUCES
The same, for reduce tasks.
Launched map tasks
Number of map tasks started by this job
Launched reduce tasks
Number of reduce tasks started by this job
Map-Reduce Framework
This counter group contains a great deal of detailed data about job execution. Keep these terms in mind: a record usually means one row of data, bytes means the size of that row, and a group means a key with its merged list of values on the reduce side, e.g. {"aaa", [5, 8, 2, ...]}.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Map-Reduce Framework | Combine input records | 200,000,000 | 0 | 200,000,000 |
| | Combine output records | 117,838,546 | 0 | 117,838,546 |
| | Failed Shuffles | 0 | 0 | 0 |
| | GC time elapsed (ms) | 23,472 | 46,588 | 70,060 |
| | Map input records | 10,000,000 | 0 | 10,000,000 |
| | Map output bytes | 1,899,990,596 | 0 | 1,899,990,596 |
| | Map output records | 200,000,000 | 0 | 200,000,000 |
| | Merged Map outputs | 0 | 536 | 536 |
| | Reduce input groups | 0 | 84,879,137 | 84,879,137 |
| | Reduce input records | 0 | 117,838,546 | 117,838,546 |
| | Reduce output records | 0 | 84,879,137 | 84,879,137 |
| | Reduce shuffle bytes | 0 | 1,544,523,910 | 1,544,523,910 |
| | Shuffled Maps | 0 | 536 | 536 |
| | Spilled Records | 117,838,546 | 117,838,546 | 235,677,092 |
| | SPLIT_RAW_BYTES | 8,576 | 0 | 8,576 |
Combine input records
The combiner exists to minimize the amount of data that needs to be pulled across the network, so the number of combine input records equals the number of map output records.
Combine output records
After the combiner, records with the same key have been collapsed, resolving much of the duplicate data on the map side; this is the number of records in the maps' final intermediate files.
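Wiring a combiner in is a one-line job setting; a minimal sketch, assuming `job` is the org.apache.hadoop.mapreduce.Job being configured and using Hadoop's stock LongSumReducer (reusing the reducer as the combiner is only safe because summing is commutative and associative):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.reduce.LongSumReducer;

// Run the reduce function on the map side first: records with the same key
// collapse locally, which is what shrinks Combine output records vs. input.
job.setReducerClass(LongSumReducer.class);
job.setCombinerClass(LongSumReducer.class);
```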
Failed Shuffles
The number of shuffle failures caused by network connection problems or IO exceptions while the copy threads fetch the maps' intermediate data.
GC time elapsed (ms)
The total GC time of the child JVMs that execute the map and reduce tasks, obtained through JMX.
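For reference, the same cumulative figure can be read inside any JVM through the standard management beans; a small sketch of the JMX lookup (my own illustration of the mechanism, not the framework's actual code):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sum the collection time reported by every garbage collector in this JVM.
long gcMillis = 0;
for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
    long t = gc.getCollectionTime();   // -1 if this collector does not report time
    if (t > 0) {
        gcMillis += t;
    }
}
System.out.println("GC time elapsed (ms): " + gcMillis);
```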
Map input records
The total number of records all map tasks read from HDFS.
Map output records
The records a map task emits directly, i.e. the number of times context.write is called inside the map method: the raw output count, before any combine.
Map output bytes
The keys and values a map outputs are serialized into a memory buffer, so the bytes here are the total bytes after serialization.
Merged Map outputs
Records the total number of map outputs merged during the shuffle.
Reduce input groups
The total number of groups (in the sense described above) read by the reduces.
Reduce input records
If there is a combiner, this value equals the final map-side count after the combiner ran; if there is none, it equals the number of map output records.
Reduce output records
The total number of records output by all reduces.
Reduce shuffle bytes
The total intermediate data that the reduce-side copy threads fetched from the map side: the sum of every map task's final intermediate file.
Shuffled Maps
Almost every reduce needs to pull data from every map, and each time a copy thread successfully fetches one map's output this counter is incremented by 1, so the total is basically reduce count × map count; here, 8 reduces × 67 maps = 536, matching the table.
Spilled Records
Spills happen on both the map and the reduce side; this counts the total number of records spilled from memory to disk.
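When Spilled Records is much larger than Map output records, the map-side sort buffer is overflowing repeatedly. A hedged tuning sketch using the 0.20/0.21-era property names (later versions renamed them under mapreduce.task.io.sort.*); the values shown are examples, not recommendations:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Enlarge the map-side sort buffer to reduce intermediate spills.
Configuration conf = new Configuration();
conf.setInt("io.sort.mb", 200);                // buffer size in MB (default 100)
conf.setFloat("io.sort.spill.percent", 0.90f); // spill threshold (default 0.80)
Job job = new Job(conf, "spill-tuning-example");
```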
SPLIT_RAW_BYTES
The split-related data of the map tasks is stored in HDFS, and that metadata also records the data's compression codec and its concrete type. This extra data is added by the MapReduce framework and has nothing to do with the job itself; the value recorded here is the size in bytes of that extra information.
4. An analysis of Counter and Reporter
http://blog.sina.com.cn/s/blog_61ef49250100uxwh.html
5. Others
Hadoop: The Definitive Guide, Chapter 8, "Hadoop Features"
http://lintool.github.com/Cloud9/docs/content/counters.html
http://langyu.iteye.com/blog/1171091