MapReduce counters give us a window into the runtime details of a MapReduce job. I spent this March on MapReduce performance tuning, and most of my optimizations were driven by the values of these counters. MapReduce ships with many default counters, and some readers may have questions about them, so here I analyze what each default counter means, making it easier to interpret job results.
My analysis is based on Hadoop 0.21. I have looked at the counter output of other Hadoop versions, and the details are essentially the same; where differences exist, defer to your actual version.
Counters have the concept of a group: a group logically contains all counters of the same scope. The default counters provided by a MapReduce job fall into five groups, described below. I also include my own test data for comparison; it appears as a table in the description of each group.
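Before diving into the groups, note that all of these counters can also be read programmatically once a job completes. The following is a minimal sketch, assuming the new org.apache.hadoop.mapreduce API; the class and method names (CounterDump, dump) are mine for illustration, not from any particular Hadoop example. It iterates over every counter group and prints each counter's name and value.

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class CounterDump {
    // Print every counter group and counter of a finished job.
    // Assumes 'job' was already configured elsewhere (hypothetical caller).
    public static void dump(Job job) throws Exception {
        job.waitForCompletion(true);            // block until the job finishes
        Counters counters = job.getCounters();
        for (CounterGroup group : counters) {   // e.g. "FileSystemCounters"
            System.out.println(group.getDisplayName());
            for (Counter counter : group) {     // e.g. "HDFS_BYTES_READ"
                System.out.println("  " + counter.getDisplayName()
                        + " = " + counter.getValue());
            }
        }
    }
}
```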
FileInputFormatCounters
This group reports statistics on the content (total input data) that the map tasks read from their input files.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileInputFormatCounters | BYTES_READ | 1,109,990,596 | 0 | 1,109,990,596 |
BYTES_READ
The total input data (in bytes) of all map tasks, i.e. the sum of the bytes of all values passed into the map method of every map task.
FileSystemCounters
The data a MapReduce job operates on may live on different file systems; this group reports the job's read and write statistics against each file system.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| FileSystemCounters | FILE_BYTES_READ | 0 | 1,544,520,838 | 1,544,520,838 |
| | FILE_BYTES_WRITTEN | 1,544,537,310 | 1,544,520,838 | 3,089,058,148 |
| | HDFS_BYTES_READ | 1,110,269,508 | 0 | 1,110,269,508 |
| | HDFS_BYTES_WRITTEN | 0 | 827,982,518 | 827,982,518 |
FILE_BYTES_READ
The number of bytes the job reads from the local file system. Assuming the current map input comes from HDFS, this value should be 0 during the map phase. On the reduce side, however, the shuffled data is merged onto local disk before reduce executes, so this value is the total number of input bytes across all reduces.
FILE_BYTES_WRITTEN
The intermediate map results are spilled to local disk, and the final spill files are formed once the map finishes, so the map-side value indicates how many bytes the map tasks wrote to local disk. Correspondingly, during shuffle the reduce side keeps pulling intermediate results from the map side, merging them and spilling to its own local disk; in the end a single file is produced, which becomes the reduce's input file.
HDFS_BYTES_READ
Over the entire job, data is read from HDFS only while the map side runs, and this covers not just the source file contents but also the split metadata of every map. So this value should be slightly larger than FileInputFormatCounters.BYTES_READ.
HDFS_BYTES_WRITTEN
The final reduce results are written to HDFS; this is the total size of the job's output.
Shuffle Errors
This group records the number of errors encountered during the shuffle process; they basically arise in the shuffle stage, when the copy threads fetch intermediate data from the map side.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Shuffle Errors | BAD_ID | 0 | 0 | 0 |
| | CONNECTION | 0 | 0 | 0 |
| | IO_ERROR | 0 | 0 | 0 |
| | WRONG_LENGTH | 0 | 0 | 0 |
| | WRONG_MAP | 0 | 0 | 0 |
| | WRONG_REDUCE | 0 | 0 | 0 |
BAD_ID
Every map output has an ID such as attempt_201109020150_0254_m_000000_0; if the ID in the metadata fetched by a reduce's copy thread is not in the standard format, this counter is incremented.
CONNECTION
Incremented when a copy thread fails to establish its connection to the map side.
IO_ERROR
Incremented whenever a reduce's copy thread hits an IOException while fetching map-side data.
WRONG_LENGTH
The map-side intermediate result is compressed, formatted data carrying two pieces of length information: the size of the metadata and the size of the compressed data. If either length is transmitted incorrectly, this counter is incremented.
WRONG_MAP
Each copy thread has a definite purpose: to fetch the intermediate output of particular maps for its reduce. If the map data currently fetched is not from a map assigned to that copy thread, the wrong data has been pulled, and this counter is incremented.
WRONG_REDUCE
Consistent with the above: if the fetched data turns out not to be intended for this reduce, the wrong data was pulled as well, and this counter is incremented.
Job Counters
This group describes the statistics associated with job scheduling.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Job Counters | Data-local map tasks | 0 | 0 | 67 |
| | FALLOW_SLOTS_MILLIS_MAPS | 0 | 0 | 0 |
| | FALLOW_SLOTS_MILLIS_REDUCES | 0 | 0 | 0 |
| | SLOTS_MILLIS_MAPS | 0 | 0 | 1,210,936 |
| | SLOTS_MILLIS_REDUCES | 0 | 0 | 1,628,224 |
| | Launched map tasks | 0 | 0 | 67 |
| | Launched reduce tasks | 0 | 0 | 8 |
Data-local map tasks
The number of map tasks scheduled data-locally, i.e. launched on a TaskTracker that holds a local copy of the task's input data.
FALLOW_SLOTS_MILLIS_MAPS
The total time for which the job held slots reserved for map tasks while they sat idle (reserved but not yet executing).
FALLOW_SLOTS_MILLIS_REDUCES
The same as above, but for reduce slots.
SLOTS_MILLIS_MAPS
The total time all map tasks occupied their slots, including execution time and the time spent creating and destroying the child JVMs.
SLOTS_MILLIS_REDUCES
The same as above, but for reduce tasks.
Launched map tasks
The number of map tasks launched by this job.
Launched reduce tasks
The number of reduce tasks launched by this job.
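For reference, a hedged sketch of pulling one of these scheduling counters by its enum constant. In recent Hadoop versions the constants live in the org.apache.hadoop.mapreduce.JobCounter enum; the exact constant names may differ by release, so check your version.

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobCounter;

public class SchedulingStats {
    // Fetch the number of launched map tasks from a finished job.
    // JobCounter.TOTAL_LAUNCHED_MAPS backs the "Launched map tasks" row above;
    // verify the enum constants against your Hadoop version.
    public static long launchedMaps(Job job) throws Exception {
        return job.getCounters()
                  .findCounter(JobCounter.TOTAL_LAUNCHED_MAPS)
                  .getValue();
    }
}
```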
Map-Reduce Framework
This counter group contains a large amount of job execution detail. A few concepts first: a record generally denotes one row of data, and the corresponding bytes measure how large that row is; a group denotes the reduce-side input after the merge, of the form {"AAA", [5, 2, 8, ...]}, i.e. one key together with all of its values.
| Group | Counter | Map | Reduce | Total |
| --- | --- | --- | --- | --- |
| Map-Reduce Framework | Combine input records | 200,000,000 | 0 | 200,000,000 |
| | Combine output records | 117,838,546 | 0 | 117,838,546 |
| | Failed Shuffles | 0 | 0 | 0 |
| | GC time elapsed (ms) | 23,472 | 46,588 | 70,060 |
| | Map input records | 10,000,000 | 0 | 10,000,000 |
| | Map output bytes | 1,899,990,596 | 0 | 1,899,990,596 |
| | Map output records | 200,000,000 | 0 | 200,000,000 |
| | Merged Map outputs | 0 | 536 | 536 |
| | Reduce input groups | 0 | 84,879,137 | 84,879,137 |
| | Reduce input records | 0 | 117,838,546 | 117,838,546 |
| | Reduce output records | 0 | 84,879,137 | 84,879,137 |
| | Reduce shuffle bytes | 0 | 1,544,523,910 | 1,544,523,910 |
| | Shuffled Maps | 0 | 536 | 536 |
| | Spilled Records | 117,838,546 | 117,838,546 | 235,677,092 |
| | SPLIT_RAW_BYTES | 8,576 | 0 | 8,576 |
Combine input records
The combiner exists to minimize the amount of data that must be pulled across the network, so the number of combine input records equals the number of map output records.
Combine output records
After the combiner runs, data sharing the same key has been collapsed and many duplicates were resolved on the map side; this value is the final number of records in the map-side intermediate files.
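As an illustration, here is a minimal word-count-style combiner sketch (the class name SumCombiner is mine, not from this post): every value iterated in reduce() counts toward Combine input records, and every context.write() counts toward Combine output records. In word count, the same class typically doubles as the reducer.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Collapses duplicate keys on the map side before the shuffle.
public class SumCombiner
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable sum = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
                          Context context)
            throws IOException, InterruptedException {
        int total = 0;
        for (IntWritable v : values) {  // each value: +1 Combine input records
            total += v.get();
        }
        sum.set(total);
        context.write(key, sum);        // each write: +1 Combine output records
    }
}
// Wired up with: job.setCombinerClass(SumCombiner.class);
```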
Failed Shuffles
The number of failed shuffle fetches: copy threads that hit a network or I/O exception while fetching intermediate data from the map side.
GC time elapsed (ms)
The total garbage-collection time of the child JVMs that executed the maps and reduces, obtained via JMX.
Map input records
The total number of input records (rows) all map tasks read from HDFS.
Map output records
The number of records output directly by the map tasks, i.e. the number of times context.write is called in the map method; in other words, the raw output count before any combine.
Map output bytes
The map output keys and values are serialized into an in-memory buffer, so the bytes here are the total size after serialization.
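To make the three map-side counters above concrete, a matching word-count-style mapper sketch (TokenMapper is a hypothetical name): the framework bumps Map input records once per map() call, Map output records once per context.write(), and Map output bytes by the serialized size of each emitted key/value pair.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // One call to map() per input row: +1 Map input records.
        for (String token : line.toString().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            word.set(token);
            context.write(word, ONE);  // +1 Map output records, plus the
                                       // serialized bytes to Map output bytes
        }
    }
}
```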
Merged Map outputs
The total number of map outputs merged during the shuffle process.
Reduce input groups
The total number of groups (distinct keys) read by all reduces; the framework invokes reduce once per group.
Reduce input records
With a combiner, this value equals the final combine output record count on the map side; without one, it equals the map output record count.
Reduce output records
The total number of records output by all reduces after execution.
Reduce shuffle bytes
The total number of bytes of intermediate map output fetched by the reduce-side copy threads, representing the sum of the final intermediate file sizes of every map task.
Shuffled Maps
Nearly every reduce has to pull data from every map; each time a copy thread successfully fetches one map's output, this counter increases by 1, so the total is essentially the number of reduces times the number of maps. In this job, 67 maps × 8 reduces = 536, matching the table above.
Spilled Records
Spilling happens on both the map and the reduce side; this counts the total number of records spilled from memory to disk.
SPLIT_RAW_BYTES
The data for each map task's split is stored in HDFS, and the stored metadata describes how the data is laid out (for example, whether it is compressed) and what its concrete type is. This is extra information added by the MapReduce framework, unrelated to the job itself; the value recorded here is the size in bytes of that extra split information.