Hadoop Operating Principles: the Shuffle Phase

Source: Internet
Author: User

The core of Hadoop is MapReduce, and Shuffle is the core of MapReduce. Shuffle covers the process from the end of the map phase to the start of the reduce phase. The figure shows where shuffle sits in the pipeline: the partition, copy, and sort phases all belong to shuffle.


The shuffle stage can be divided into shuffle on the map end and shuffle on the reduce end.

1. Shuffle on the map end

The map end processes the input data and produces intermediate results, which are written to the local disk rather than to HDFS. The output of each map task is first written to a memory buffer; when the amount of buffered data reaches a set threshold, a background thread starts writing the buffer contents to disk. This process is called a spill.
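The buffer-and-spill behavior can be sketched as follows. This is a simplified simulation, not Hadoop's actual implementation: the `SpillBuffer` class and its record-count threshold are hypothetical stand-ins for Hadoop's byte-sized sort buffer and its spill threshold.

```python
# Simplified sketch (not Hadoop's actual code): a map-side buffer that
# spills to local disk once a threshold is reached. The record-count
# threshold here is a hypothetical stand-in for Hadoop's byte-based one.
import os
import tempfile

class SpillBuffer:
    def __init__(self, threshold=4, spill_dir=None):
        self.threshold = threshold
        self.spill_dir = spill_dir or tempfile.mkdtemp()
        self.buffer = []       # in-memory (key, value) records
        self.spill_files = []  # paths of spill files on local disk

    def write(self, key, value):
        self.buffer.append((key, value))
        if len(self.buffer) >= self.threshold:
            self._spill()

    def _spill(self):
        # Records are sorted by key before being written out.
        path = os.path.join(self.spill_dir, f"spill{len(self.spill_files)}.txt")
        with open(path, "w") as f:
            for k, v in sorted(self.buffer):
                f.write(f"{k}\t{v}\n")
        self.spill_files.append(path)
        self.buffer = []       # buffer is reused after the spill

buf = SpillBuffer(threshold=3)
for word in ["a", "b", "c", "d", "e", "f", "g"]:
    buf.write(word, 1)
# two spills of 3 records each; "g" remains in the buffer
```

Note that map output keeps streaming into the buffer while the spill thread drains it, which is why the threshold is set below 100% of the buffer size in practice.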

Before a spill is written, the data is sorted: records are first divided by the partition they belong to, and within each partition they are sorted by key. The goal of partitioning is to distribute records across the reducers for load balancing; later, each reducer reads only the data in its own partitions. Then the combiner runs (if one is set). A combiner is essentially a local reducer: it pre-processes the data before it is written to disk, reducing the amount of data written. Finally, the data is written to the local disk as a spill file (spill files are saved in the directory specified by mapred.local.dir and are deleted after the map task completes).
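The partition, sort, and combine steps can be sketched like this. It is a minimal simulation: the `partition` function mirrors the idea of Hadoop's default hash partitioner (hash of the key modulo the number of reducers), with Python's `hash()` standing in for Java's `hashCode()`, and `prepare_spill` is a hypothetical helper name.

```python
# Sketch of what happens before each spill: records are divided by
# partition, sorted by key within each partition, and optionally
# pre-aggregated by a combiner. Simplified simulation, not Hadoop code.
from collections import defaultdict

def partition(key, num_reducers):
    # Same idea as Hadoop's default HashPartitioner:
    # hash of the key modulo the number of reduce tasks.
    return hash(key) % num_reducers

def prepare_spill(records, num_reducers, combiner=None):
    parts = defaultdict(list)
    for key, value in records:
        parts[partition(key, num_reducers)].append((key, value))
    result = {}
    for p, recs in parts.items():
        recs.sort()  # sort by key within the partition
        if combiner:
            # The combiner acts as a local reducer over equal keys,
            # shrinking the data before it hits the disk.
            grouped = defaultdict(list)
            for k, v in recs:
                grouped[k].append(v)
            recs = [(k, combiner(vs)) for k, vs in sorted(grouped.items())]
        result[p] = recs
    return result

records = [("apple", 1), ("banana", 1), ("apple", 1)]
out = prepare_spill(records, num_reducers=2, combiner=sum)
# the two "apple" records collapse into ("apple", 2) in their partition
```

With `sum` as the combiner, duplicate keys are merged map-side, which is exactly the disk- and network-saving effect the text describes.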

A map task may therefore generate multiple spill files. Before the map task completes, these spill files are merged into a single sorted file using a multi-way merge algorithm. At this point, the map side of the shuffle is finished.
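The multi-way merge can be illustrated with Python's standard-library `heapq.merge`, which implements exactly this k-way merge of sorted runs. Here each spill "file" is modeled as an in-memory sorted list of (key, value) pairs.

```python
# Sketch of the final map-side merge: multiple sorted spill files are
# combined into one sorted output with a multi-way (k-way) merge.
import heapq

spill1 = [("apple", 2), ("cherry", 1)]
spill2 = [("banana", 1), ("cherry", 3)]
spill3 = [("apple", 1)]

# heapq.merge streams its inputs, holding only one record per run in
# memory at a time -- the same idea lets Hadoop merge spill files that
# are far larger than RAM.
merged = list(heapq.merge(spill1, spill2, spill3))
# merged is fully sorted:
# [('apple', 1), ('apple', 2), ('banana', 1), ('cherry', 1), ('cherry', 3)]
```

Because each input run is already sorted, the merge never needs to re-sort anything; it only compares the current head of each run.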

2. Shuffle on the reduce end

Shuffle on the reduce end consists of three stages: copy, sort (merge), and reduce. First, the output files produced on the map end must be copied to the reduce end. How does each reducer know which data it should process? When the map side partitions its output, it effectively assigns data to each reducer (each partition corresponds to a reducer), so during the copy stage a reducer only needs to fetch the data in its own partition. In other words, each reducer handles one or more partitions, and it must first copy the data for those partitions from the output of every map task.
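The three reduce-side stages can be sketched together in one small simulation. This is a hypothetical model, not Hadoop's API: map outputs are represented as dicts mapping a partition id to a sorted list of (key, value) pairs, and `run_reducer` is an invented helper that performs copy, merge, and reduce in sequence.

```python
# Sketch of the reduce-side shuffle: each reducer copies only its own
# partition from every map task's output, k-way merges the sorted runs,
# then groups values by key and applies the reduce function.
import heapq
from itertools import groupby

map_outputs = [
    {0: [("apple", 2)], 1: [("banana", 1)]},                 # map task 1
    {0: [("apple", 1)], 1: [("banana", 2), ("cherry", 1)]},  # map task 2
]

def run_reducer(partition_id, map_outputs, reduce_fn):
    # Copy: pull this reducer's partition from each map task's output.
    runs = [out.get(partition_id, []) for out in map_outputs]
    # Sort (merge): k-way merge of the already-sorted runs.
    merged = heapq.merge(*runs)
    # Reduce: group values by key and apply the reduce function.
    return {k: reduce_fn(v for _, v in grp)
            for k, grp in groupby(merged, key=lambda kv: kv[0])}

print(run_reducer(0, map_outputs, sum))  # {'apple': 3}
print(run_reducer(1, map_outputs, sum))  # {'banana': 3, 'cherry': 1}
```

Note how the reducer for partition 0 never sees "banana" or "cherry": the map-side partitioning already decided which reducer owns which keys, so the copy stage only moves the data each reducer actually needs.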
