Hadoop Task Optimization Recommendations - "Dr.Elephant Series, Article 6"


By using Dr.Elephant to analyze our tasks, we can see where to optimize.

Accelerate your task flow

For a specific task, it is advisable to use a task-specific parameter configuration. In many scenarios, the default configuration does not give every task its best performance. Tuning these tasks takes some time, but the performance gains it brings are impressive.

Several task parameters deserve special attention: the number of mappers, the number of reducers, the io.* settings, the memory settings, and the number of generated files. Adapting these parameters to the current task can greatly improve its execution performance.

The Hadoop Map/Reduce Tutorial on the official Apache website provides a lot of detailed and useful tuning advice and is worth reading carefully.

Some general recommendations: gradual tuning is important

For Pig tasks, leaving the number of reducers at the default can be fatal for performance. It is generally worth spending some time tuning the PARALLEL parameter for each Pig job. For example:

memberFeaturesGrouped = GROUP memberFeatures BY memberId PARALLEL 90;

Number of files vs. blocks

Too many small files put heavy pressure on the NameNode of the HDFS platform, which keeps an in-memory object, on the order of 150 bytes, for every file and for every block. In general, it is more efficient for a task to read one large file than ten small files.
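As an illustrative sketch of why small files hurt: figures vary by Hadoop version, but a commonly cited rule of thumb is roughly 150 bytes of NameNode heap per file or block object. Under that assumption:

```python
BYTES_PER_OBJECT = 150  # rough rule-of-thumb NameNode heap per file/block object

def namenode_bytes(num_files, num_blocks):
    """Approximate NameNode heap consumed by file and block objects."""
    return (num_files + num_blocks) * BYTES_PER_OBJECT

# One 1,280 MB file with a 128 MB block size -> 1 file object + 10 block objects
one_big = namenode_bytes(1, 10)
# The same data as ten 128 MB files -> 10 file objects + 10 block objects
ten_small = namenode_bytes(10, 10)

print(one_big, ten_small)  # 1650 3000
```

The data volume is identical, but the small-file layout nearly doubles the NameNode's bookkeeping cost, and the gap widens as file counts grow into the millions.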

Java Task Memory Management

By default, each map/reduce task is allocated at most 2 GB of memory. For Java tasks, this 2 GB includes a 1 GB heap and 0.5-1 GB of non-heap space. For some tasks, the default allocation is not sufficient. Here are some tips to reduce memory usage:

UseCompressedOops

A 32-bit JVM uses 32-bit unsigned integers to address memory, so the largest heap it can represent is 2^32 bytes = 4 GB. A 64-bit JVM uses 64-bit addresses, raising the theoretical maximum heap to 2^64 bytes (16 EB). That is a huge increase in addressable memory, but storing 64-bit pointers instead of 32-bit ones costs space: heap usage typically grows to about 1.5 times the original. At the same time, a 64-bit JVM no longer caps the usable heap at 4 GB, which is a great convenience for program design. Can we keep the large heap of a 64-bit JVM while spending less memory on pointer information? The answer is yes. Recent JVMs support the UseCompressedOops option, which in some cases stores object pointers ("oops") in 32 bits of space instead of 64. This reduces memory use to a certain extent, and we can enable the option in an application as follows:

hadoop-inject.mapreduce.(map|reduce).java.opts=-Xmx1g -XX:+UseCompressedOops

Note that Azkaban overrides the default configuration properties with custom ones rather than appending the custom section to the mapred-site.xml defaults. After setting the UseCompressedOops option, we therefore need to confirm that it and the other default settings are all still in effect: make sure that everything in mapred-site.xml, such as "-Xmx1g", also appears in our custom configuration.

UseCompressedStrings

This option stores variables of type String as byte[] internally. If a task uses a large number of String variables, it can save a great deal of memory. It is activated by adding -XX:+UseCompressedStrings to the mapreduce.(map|reduce).java.opts configuration. Separately, note that the virtual memory allowed for each job is 2.1 times the physical memory it requests. If our program throws the following error:

Container [pid=pid,containerID=container_id] is running beyond virtual memory limits. Current usage: 365.1 MB of 1 GB physical memory used; 3.2 GB of 2.1 GB virtual memory used. Killing container.

We can use this option to optimize the program.
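The 2.1 GB figure in the error above comes from YARN's virtual-to-physical memory ratio (the stock Hadoop parameter `yarn.nodemanager.vmem-pmem-ratio`, whose default is 2.1; the parameter name is background knowledge, not from this article). A minimal sketch of the check the NodeManager applies:

```python
def vmem_limit_mb(container_pmem_mb, vmem_pmem_ratio=2.1):
    """Virtual memory a container may use, as quoted in the kill message."""
    return container_pmem_mb * vmem_pmem_ratio

def would_be_killed(vmem_used_mb, container_pmem_mb, ratio=2.1):
    """True if the container exceeds its virtual memory limit."""
    return vmem_used_mb > vmem_limit_mb(container_pmem_mb, ratio)

# A 1 GB container may use up to ~2.1 GB of virtual memory;
# the task in the error used 3.2 GB and was killed.
print(vmem_limit_mb(1024))            # 2150.4
print(would_be_killed(3276.8, 1024))  # True
```

Shrinking the JVM's footprint (for example with the options above) or raising the container size both move the task back under this limit.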

Important Tuning Parameters

Mapper

mapreduce.input.fileinputformat.split.minsize

This parameter sets the minimum size of each input split handed to a map. Increasing it increases the amount of data each map reads and thus reduces the number of maps. For example, setting mapreduce.input.fileinputformat.split.minsize to four times the HDFS block size (dfs.blocksize) makes each map's input four times dfs.blocksize, cutting the number of maps to roughly a quarter. If you set this value to 256 MB, the split size is 268435456 bytes.

mapreduce.input.fileinputformat.split.maxsize

This parameter sets the maximum size of each input split when CombineFileInputFormat or MultiFileInputFormat is used. Setting it below the HDFS block size (dfs.blocksize) increases the number of mappers. For example, setting mapreduce.input.fileinputformat.split.maxsize to 1/4 of dfs.blocksize limits each map's input to 1/4 of dfs.blocksize, increasing the number of maps. Note that when CombineFileInputFormat is used and this parameter is not set, the whole input may be processed by a single mapper.
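Both parameters feed the split-size formula that Hadoop's FileInputFormat uses, max(minSize, min(maxSize, blockSize)); a sketch of how each knob moves the split size:

```python
def split_size(min_size, max_size, block_size):
    """Split size computed the way FileInputFormat does:
    max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

MB = 1024 * 1024
block = 128 * MB  # illustrative dfs.blocksize

# Raising minsize to 4x the block size -> 512 MB splits, ~4x fewer mappers
print(split_size(4 * block, 2**63 - 1, block) // MB)  # 512

# Capping maxsize at 1/4 of the block size -> 32 MB splits, ~4x more mappers
print(split_size(1, block // 4, block) // MB)  # 32
```

With neither parameter set, the formula degenerates to the block size itself, which is why split counts track block counts by default.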

Reducer

mapreduce.job.reduces

The number of reducers is one of the most important factors affecting task performance. Using too few reducers makes each task take too long to execute, while using too many causes its own performance problems. Choosing the right number of reducers for a specific task is something of an art. Here are some guidelines:

    • More reducers mean more files on the NameNode, and too many files put the NameNode under pressure. If the reduce output is not large (less than 512 MB), use fewer reducers.
    • More reducers mean each reducer processes less data and finishes faster; with too few reducers, each reducer's share of the data, and thus its run time, grows significantly.
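A hedged sketch of a heuristic built from these guidelines (the 512 MB threshold comes from the list above; the 2 GB target per reducer and the cap are assumed tuning knobs, not Hadoop constants):

```python
def suggest_reducers(shuffle_bytes, target_bytes_per_reducer=2 * 1024**3,
                     max_reducers=1000):
    """Pick a reducer count so each reducer handles roughly the target volume."""
    if shuffle_bytes <= 512 * 1024**2:  # small output: one reducer, fewer files
        return 1
    n = -(-shuffle_bytes // target_bytes_per_reducer)  # ceiling division
    return min(n, max_reducers)  # cap to avoid flooding the NameNode with files

print(suggest_reducers(100 * 1024**2))  # 1  (100 MB of output)
print(suggest_reducers(4 * 1024**3))    # 2  (4 GB of output)
print(suggest_reducers(5 * 1024**4))    # 1000 (5 TB of output, capped)
```

A starting point like this still needs to be validated experimentally, as the shuffle-time measurements below illustrate.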

The shuffle operation is an expensive part of the task. Through the job's FileSystem counters we can see how much data is transferred between nodes. In an experiment with 20 reducers, the counters were as follows:

FileSystemCounters:

FILE_BYTES_READ | 2950482442768

HDFS_BYTES_READ | 1223524334581

FILE_BYTES_WRITTEN | 5697256875163

As you can see, the map output is roughly 5 TB of data, and the shuffle and sort times were as follows:

Shuffle finished:

17-Aug-2010 | 13:32:05 | (1 hr, 29 min, 56 sec)

Sort finished:

17-Aug-2010 | 14:18:35 | (46 min, 29 sec)

These figures show that shuffling about 5 TB of data took 1.5 hours, followed by 46 minutes of sorting. That cost is huge; we want such jobs to finish in roughly 5-15 minutes. A simple linear estimate: if 20 reducers take 360 minutes, then 200 reducers should take only about 36 minutes, and 400 reducers about 18 minutes. Back-of-the-envelope arithmetic like this points to quantifiable improvements. Increasing the number of reducers to 500 gives the following result:

Shuffle finished:

17-Aug-2010 | 16:32:32 | (12 min, 46 sec)

Sort finished:

17-Aug-2010 | 16:32:37 | (4 sec)

The effect is obvious: increasing the number of reducers sharply reduced the time spent. But extremes meet: if shuffle time becomes very short while CPU utilization stays very low, the number of reducers is too high. Determining a reasonable reducer count by experiment is well worth the effort.
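The back-of-the-envelope arithmetic above assumes shuffle-plus-sort time scales inversely with the number of reducers (a simplification that breaks down once per-reducer overhead dominates, as the experiment shows). The estimate itself can be sketched as:

```python
def estimated_minutes(baseline_reducers, baseline_minutes, new_reducers):
    """Naive linear-scaling estimate: total work is fixed and reducers
    split it evenly, so time scales as baseline_reducers / new_reducers."""
    return baseline_minutes * baseline_reducers / new_reducers

# Starting from the article's 20-reducer / 360-minute figure:
print(estimated_minutes(20, 360, 200))  # 36.0
print(estimated_minutes(20, 360, 400))  # 18.0
```

The model is only a first guess: the measured 500-reducer run beat it on shuffle time but, past some point, extra reducers add scheduling and file overhead the formula ignores.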

mapreduce.job.reduce.slowstart.completedmaps

This parameter determines what fraction of the mappers must finish before the reducers start. The default value is 80%. For many specific tasks, adjusting this number yields performance gains. The factors that determine the right figure are:

    • How much data each reducer will receive
    • How long the remaining map tasks will take

If the maps produce a large amount of output data, it is generally better to start the reducers early so they can begin pulling data. If there are not many map tasks, it is generally better to start the reducers late. A rough estimate for when the reducers have all their input is the finish time of the last map plus the shuffle time; reducers are generally considered able to start real work at that point.

Compression

mapreduce.map.output.compress

Set this parameter to true to compress the map output. This reduces the amount of data transferred between nodes, but we must make sure the time spent compressing and decompressing is less than the transfer time saved, otherwise the effect is the opposite. If the map output is large, or is a type that compresses well, setting this parameter to true reduces shuffle time. If the map output is small, setting it to false saves the CPU cost of compression and decompression. Note that this parameter differs from mapreduce.output.fileoutputformat.compress, which determines whether the job's final output is compressed when written back to HDFS.
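The compress-or-not trade-off above can be sketched as a comparison of transfer time saved versus compression time spent. The throughput numbers below are illustrative assumptions, not measurements:

```python
def compression_helps(data_mb, net_mb_s, comp_ratio, comp_mb_s, decomp_mb_s):
    """True if compressing the map output reduces total shuffle time.

    comp_ratio: compressed size / original size (e.g. 0.25 = 4:1 compression).
    """
    uncompressed = data_mb / net_mb_s
    compressed = (data_mb / comp_mb_s                  # compress on the map side
                  + data_mb * comp_ratio / net_mb_s    # ship fewer bytes
                  + data_mb * comp_ratio / decomp_mb_s)  # decompress on reduce
    return compressed < uncompressed

# Large, compressible output over a slow link: compression wins
print(compression_helps(10000, 50, 0.25, 400, 600))   # True
# Small, barely compressible output over a fast link: not worth the CPU
print(compression_helps(100, 1000, 0.9, 10, 10))      # False
```

In practice the map-side compression also overlaps with other work, so this model is pessimistic for compression; measuring shuffle time with the flag on and off remains the real test.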

Memory

mapreduce.(map|reduce).memory.mb

Newer versions of Hadoop add memory-limit parameters, which lets the system manage resource allocation better on a busy cluster. By default, a Java task uses a 1 GB heap plus 0.5-1 GB of non-heap space, so mapreduce.(map|reduce).memory.mb defaults to 2 GB. In some cases that is not enough. If you only raise the JVM -Xmx setting, the task will be killed by the container limit; you must raise mapreduce.(map|reduce).memory.mb at the same time to effectively increase (or cap) memory usage.
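A quick sanity check capturing the rule in this section: the -Xmx heap plus non-heap overhead must fit inside mapreduce.(map|reduce).memory.mb (the 1 GB overhead below is the upper end of the 0.5-1 GB range quoted above, used here as an assumption):

```python
def heap_fits(xmx_mb, container_mb, nonheap_overhead_mb=1024):
    """True if the JVM heap plus assumed non-heap space fits the container."""
    return xmx_mb + nonheap_overhead_mb <= container_mb

print(heap_fits(1024, 2048))  # True: default -Xmx1g inside a 2 GB container
print(heap_fits(2048, 2048))  # False: -Xmx raised without raising memory.mb
```

The second case is exactly the misconfiguration described above: the JVM is allowed to grow past what the container permits, and the task gets killed.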

Advanced

Control the value of io.sort.record.percent

The parameter io.sort.record.percent determines what fraction of the map-side sort buffer (io.sort.mb) is used to hold per-record metadata rather than the records themselves. Several symptoms can indicate that this parameter is set unreasonably.

Suppose the values logged for a map task are as follows:

Property | Value

bufstart | 45633950

bufend | 68450908

kvstart | 503315

kvend | 838860

length | 838860

io.sort.mb | 256

io.sort.record.percent | .05


Using the numbers above, we can derive:

Property | Value

io.sort.spill.percent ((length - kvstart + kvend) / length) | .8

Size of metadata (bufend - bufstart) | 22816958 bytes

Records in memory (length - (kvstart + kvend)) | 671087

Average record size (size / records) | 34 bytes

Record + metadata | 50 bytes

Records per io.sort.mb (io.sort.mb / (record + metadata)) | ~5.37 million

Metadata % in io.sort.mb ((records per io.sort.mb) * metadata / io.sort.mb) | .32


In 256 MB of buffer space we can hold a great many records, and the metadata needs about 32% of the buffer, so we should set io.sort.record.percent to .32 instead of .05. At .05, the metadata portion of the buffer fills up while the record portion still has plenty of room, triggering premature spills.

Changing this parameter makes the map run faster and spill to disk less often. In one case it reduced spills to disk by 55%, cut CPU usage by about 30%, and shaved 30 minutes off the running time.
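The derived values in the table can be recomputed from the raw counters. The 16-byte metadata-per-record figure is Hadoop's accounting for entries in the map-side sort buffer (an assumption carried in from Hadoop's implementation, not stated in the table):

```python
METADATA_BYTES = 16  # per-record metadata in the map-side sort buffer

# Raw values from the log table above
bufstart, bufend = 45633950, 68450908
records_in_memory = 671087
io_sort_bytes = 256 * 1024 * 1024  # io.sort.mb

data_bytes = bufend - bufstart                 # record data held in the buffer
avg_record = data_bytes // records_in_memory   # average record size in bytes
per_record = avg_record + METADATA_BYTES       # footprint per record
records_per_buffer = io_sort_bytes // per_record
metadata_fraction = records_per_buffer * METADATA_BYTES / io_sort_bytes

print(data_bytes, avg_record, per_record)      # 22816958 34 50
print(round(metadata_fraction, 2))             # 0.32
```

The last number is the recommended io.sort.record.percent: the fraction of the buffer that metadata actually needs for records of this size.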

mapreduce.(map|reduce).speculative

This parameter controls speculative execution: whether duplicate attempts of the same map or reduce task may run concurrently. When data skew occurs, some mappers or reducers take markedly longer than the rest. Speculative attempts cannot fix skew itself, so in that situation it may be worth disabling speculation for the affected map or reduce phase.


Author profile: Cussaur is interested in high-concurrency system design and development and now focuses on big data. He previously worked as a backend development engineer at Xiaomi and is now a senior development engineer in the EverString data platform group.




