Two stages of partitioner and combiner

Source: Internet
Author: User

Partitioner Programming

A partitioner decides which reducer each map output record is sent to, so data that shares some common characteristic is written to the same output file.
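A minimal sketch of the idea in plain Java, with no Hadoop dependency. The class and method names (PartitionSketch, partitionFor, assign) are hypothetical and exist only for illustration; the formula in partitionFor mirrors what Hadoop's default HashPartitioner computes, and a custom partitioner would replace it with its own rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class, not part of Hadoop.
class PartitionSketch {

    // Mask off the sign bit so the result is never negative,
    // then take the remainder to pick one of numPartitions buckets.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Groups keys by partition number; each bucket stands for one reducer's input.
    static Map<Integer, List<String>> assign(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numPartitions);
            buckets.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "banana", "apple", "cherry");
        // Both "apple" records always end up in the same bucket.
        System.out.println(assign(keys, 3));
    }
}
```

Because the partition depends only on the key, every record with the same key reaches the same reducer and therefore the same output file.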

Sorting and grouping

When sorting in the map and reduce phases, only K2 is compared; V2 takes no part in the sort comparison. If you want V2 to be sorted as well, combine K2 and V2 into a new class and use that class as K2, so that both fields take part in the comparison. To customize the sort order, have the class implement the WritableComparable interface and put the ordering rule in its compareTo method. The object is then treated as K2, and sorting and grouping are done by K2.
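The composite-key idea can be sketched in plain Java as follows. PairKey and SortSketch are hypothetical names, and the key types (a String and an int) are assumptions chosen for the example. In a real Hadoop job the class would implement WritableComparable rather than Comparable and would also provide the write and readFields serialization methods, which are omitted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical composite key that wraps the original K2 and V2.
class PairKey implements Comparable<PairKey> {
    final String first;  // the original K2
    final int second;    // the original V2, now part of the key

    PairKey(String first, int second) {
        this.first = first;
        this.second = second;
    }

    // Sort rule: order by the original key first, then by the original value.
    @Override
    public int compareTo(PairKey other) {
        int byFirst = first.compareTo(other.first);
        if (byFirst != 0) {
            return byFirst;
        }
        return Integer.compare(second, other.second);
    }

    @Override
    public String toString() {
        return first + ":" + second;
    }
}

// Hypothetical helper that stands in for the framework's sort step.
class SortSketch {
    static List<PairKey> sorted(List<PairKey> keys) {
        List<PairKey> copy = new ArrayList<>(keys);
        Collections.sort(copy); // uses PairKey.compareTo
        return copy;
    }

    public static void main(String[] args) {
        List<PairKey> keys = Arrays.asList(
            new PairKey("b", 2), new PairKey("a", 3), new PairKey("a", 1));
        System.out.println(sorted(keys));
    }
}
```

Note that once the value is part of the key, two records with the same original K2 but different V2 are different keys, so they are also grouped separately unless the job is configured to group on the original K2 only.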

Combiner Programming

1. Each map task generates a lot of output. A combiner merges that output on the map side first, to reduce the amount of data transferred to the reducers.

2. A combiner is basically a local merge of the values that share a key, similar to a local reduce function. Without a combiner, all map output has to be processed by the reducers, which is relatively inefficient.

3. With a combiner, each map's output is aggregated locally first, which speeds up the job.

PS: The combiner's output becomes the reducer's input, so the combiner must never change the final result. From a personal point of view, a combiner is therefore only suitable when the reducer's input key/value types are exactly the same as its output key/value types and applying it does not affect the final result, for example sums and maximums.
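The effect of a combiner can be simulated in plain Java without Hadoop. This is only a sketch under stated assumptions: CombinerSketch and its methods are hypothetical names, the job is a word count, and the same summing function plays the role of both combiner and reducer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation class, not part of Hadoop.
class CombinerSketch {

    // Simulated map task: emits (word, 1) for every input word.
    static List<Map.Entry<String, Integer>> mapOutput(String... words) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : words) {
            out.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
        return out;
    }

    // Sums the values per key. Input and output have the same key/value types,
    // which is what lets this one function serve as combiner and reducer.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> record : records) {
            totals.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return totals;
    }

    // Collects the records that would be transferred to the reducer.
    // With the combiner on, each map's output is summed locally first.
    static List<Map.Entry<String, Integer>> shuffle(
            List<List<Map.Entry<String, Integer>>> mapOutputs, boolean useCombiner) {
        List<Map.Entry<String, Integer>> transferred = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> output : mapOutputs) {
            if (useCombiner) {
                transferred.addAll(sumByKey(output).entrySet());
            } else {
                transferred.addAll(output);
            }
        }
        return transferred;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> outputs = Arrays.asList(
            mapOutput("a", "b", "a"), mapOutput("b", "b", "c"));
        List<Map.Entry<String, Integer>> plain = shuffle(outputs, false);
        List<Map.Entry<String, Integer>> combined = shuffle(outputs, true);
        System.out.println("records without combiner: " + plain.size());
        System.out.println("records with combiner:    " + combined.size());
        System.out.println("same final result: "
            + sumByKey(plain).equals(sumByKey(combined)));
    }
}
```

The final totals are identical either way; only the number of records sent to the reducer shrinks. An average would not be safe to combine this way, because the average of per-map averages is generally not the overall average.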


This article is from the "in order to finger that direction" blog; reprinting is not permitted.
