Eight Grouping Strategies in Storm

Source: Internet
Author: User

This post walks through Storm's eight grouping strategies and implements them in code one by one.

First, you need a cluster (I want to imitate a real environment as closely as possible, so I am not using local mode). For detailed installation steps, see another of my blog posts: "The deployment process for Storm clusters and ZooKeeper clusters".

OK. There are now three nodes: one runs Nimbus and two run supervisors. A quick introduction first. Storm logic has two kinds of components: spouts and bolts. A stream is emitted by a spout and processed by a chain of bolts, and the basic unit of data passed along it is the tuple. A spout emits tuples, and bolts receive them and process them in various ways. The whole flow forms a DAG, which Storm calls a topology. Once a topology is submitted to the cluster in remote mode, it keeps running until you kill it; it has no natural end.

Well, let's see a simple topology. This topology will be used to implement those grouping strategies.

The spout sends a sentence to the first bolt, which splits the sentence into words; the next bolt counts the words, and the last bolt summarizes and displays the results. Each spout or bolt can run as multiple parallel instances, and how many is controlled by its degree of parallelism.
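To make the data flow above concrete, here is a minimal plain-Java sketch of the three stages (emit sentences, split, count). It uses no Storm classes at all; the class name WordCountFlowSketch, the method names, and the sample sentences are hypothetical and only stand in for the spout and bolts of the real topology.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the word-count data flow; not Storm code.
class WordCountFlowSketch {

    // "Spout" stage: the source of sentences (hypothetical sample data).
    static List<String> emitSentences() {
        return Arrays.asList("storm is fast", "storm is simple");
    }

    // "Split bolt" stage: one sentence in, one word per output tuple.
    static List<String> split(String sentence) {
        List<String> words = new ArrayList<>();
        for (String w : sentence.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // "Count bolt" stage: keeps a running count per word.
    static void count(Map<String, Integer> counts, String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Runs the stages in order and returns the summary a report bolt would show.
    static Map<String, Integer> run() {
        Map<String, Integer> counts = new TreeMap<>();
        for (String sentence : emitSentences()) {
            for (String word : split(sentence)) {
                count(counts, word);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In the real topology each stage runs as one or more tasks on the cluster, and the grouping strategy decides which task of the next stage receives each tuple.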

The so-called grouping strategy is the way tuples are passed from a spout to a bolt, or from one bolt to another. There are eight in total:

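1) shuffleGrouping (random grouping: tuples are spread randomly and evenly across the bolt's tasks)

2) fieldsGrouping (grouped by field: tuples with the same field value, for example the same word, always go to the same bolt task)

3) allGrouping (broadcast: every tuple is sent to every task of the bolt)

4) globalGrouping (global grouping: every tuple goes to the task with the lowest task ID)

5) noneGrouping (no preference: currently behaves the same as shuffleGrouping)

6) directGrouping (direct grouping: the emitter of a tuple decides which task of the receiving bolt gets it)

7) localOrShuffleGrouping (local or shuffle grouping: prefers tasks of the target bolt in the same worker process, otherwise behaves like shuffleGrouping)

8) customGrouping (custom grouping)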

OK, try it one by one!

The first is the shuffleGrouping strategy.

The results obtained after starting the topology:

As you can see, the word "storm" is randomly assigned to both counters, which run on the two nodes H2 and H3. If you submit the topology again, you will see a different distribution. Compare this with the fieldsGrouping results below.
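The behavior described above can be imitated in plain Java. The sketch below is not Storm's implementation; it assumes one simple way to get a random but even spread (shuffle the task list, hand tasks out in that order, reshuffle when the list is used up). The class name ShuffleGroupingSketch and the seed are hypothetical. In a real topology the strategy is selected by calling shuffleGrouping on the bolt declaration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Simulates shuffle grouping: the target task does not depend on tuple content.
class ShuffleGroupingSketch {
    private final List<Integer> order = new ArrayList<>();
    private final Random random;
    private int next = 0;

    ShuffleGroupingSketch(int numTasks, long seed) {
        random = new Random(seed);
        for (int i = 0; i < numTasks; i++) {
            order.add(i);
        }
        Collections.shuffle(order, random);
    }

    // Picks the task index for the next tuple; reshuffles after each full round
    // so that every task receives the same number of tuples per round.
    int chooseTask() {
        if (next == order.size()) {
            Collections.shuffle(order, random);
            next = 0;
        }
        return order.get(next++);
    }

    public static void main(String[] args) {
        ShuffleGroupingSketch grouping = new ShuffleGroupingSketch(2, 42L);
        for (int i = 0; i < 4; i++) {
            System.out.println("storm -> counter task " + grouping.chooseTask());
        }
    }
}
```

Because the choice ignores the word itself, the same word ends up on both counter tasks, which matches the observation that "storm" shows up on both H2 and H3.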

Then switch to fieldsGrouping.

The results after startup are shown below. Wordcount2 and Wordcount3 in the picture are the names of the topology I submitted the two times.

Here are the results of the two submissions. As you can see, with the fieldsGrouping policy the words assigned to each WordCounterBolt do not change between runs.

The following is the result of the second submission.
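The stable assignment seen in the two submissions follows from routing on the field value rather than at random. A minimal sketch, assuming the target task is chosen as the hash of the grouping field values modulo the number of tasks (Storm's actual hash function may differ); the class name FieldsGroupingSketch is hypothetical. In a real topology the strategy is selected with fieldsGrouping and a Fields object naming the field, such as "word".

```java
import java.util.Arrays;
import java.util.List;

// Simulates fields grouping: equal field values always map to the same task.
class FieldsGroupingSketch {

    // Maps the grouping field values to a task index in [0, numTasks).
    // floorMod keeps the result non-negative even for negative hash codes.
    static int chooseTask(List<?> fieldValues, int numTasks) {
        return Math.floorMod(fieldValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 2;
        for (String word : new String[] {"storm", "is", "storm", "fast", "storm"}) {
            int task = chooseTask(Arrays.asList(word), numTasks);
            System.out.println(word + " -> counter task " + task);
        }
    }
}
```

Since the mapping depends only on the word and the task count, resubmitting the topology with the same parallelism sends each word to the same counter again.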

Then switch to the noneGrouping strategy.

Submit it to the cluster and run it.

The results: noneGrouping and shuffleGrouping behave basically the same. Both distribute tuples randomly.

Next, switch to the allGrouping policy.

Submit it to the cluster and run it.

The results: as you can see, the two bolts receive the same words, namely all of them.
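The broadcast behavior can be sketched in a few lines of plain Java. This is only a simulation with a hypothetical class name AllGroupingSketch, not Storm code; in a real topology the strategy is selected by calling allGrouping on the bolt declaration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simulates all grouping: every tuple is copied to every task of the bolt.
class AllGroupingSketch {

    // Returns one inbox per task; each inbox ends up holding every word.
    static List<List<String>> deliver(List<String> words, int numTasks) {
        List<List<String>> inboxes = new ArrayList<>();
        for (int t = 0; t < numTasks; t++) {
            inboxes.add(new ArrayList<>());
        }
        for (String word : words) {
            for (List<String> inbox : inboxes) {
                inbox.add(word);
            }
        }
        return inboxes;
    }

    public static void main(String[] args) {
        List<List<String>> inboxes = deliver(Arrays.asList("storm", "is", "fast"), 2);
        for (int t = 0; t < inboxes.size(); t++) {
            System.out.println("counter task " + t + " received " + inboxes.get(t));
        }
    }
}
```

Note that with this strategy each counter task counts every word, so the per-task totals are duplicates rather than partial counts.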

Finally, switch to the globalGrouping strategy.

Submit it to the cluster and run it.

As you can see, the tuples are mainly assigned to the H3 node, which the following results confirm.
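Global grouping is the simplest rule to sketch: every tuple goes to the task with the lowest ID among the bolt's tasks. The snippet below is a plain-Java illustration with a hypothetical class name GlobalGroupingSketch and made-up task IDs, not Storm code; in a real topology the strategy is selected by calling globalGrouping on the bolt declaration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simulates global grouping: one task (the lowest ID) receives everything.
class GlobalGroupingSketch {

    // Picks the lowest task ID, regardless of the tuple's content.
    static int chooseTask(List<Integer> targetTaskIds) {
        return Collections.min(targetTaskIds);
    }

    public static void main(String[] args) {
        List<Integer> counterTaskIds = Arrays.asList(7, 5, 6); // hypothetical task IDs
        for (String word : new String[] {"storm", "is", "fast"}) {
            System.out.println(word + " -> task " + chooseTask(counterTaskIds));
        }
    }
}
```

Whichever node hosts that lowest-ID task does all the counting, which is consistent with the results concentrating on a single node.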

OK, that's it for now! The remaining grouping strategies need some changes to the code, so I will cover them next time. I hadn't planned to post so many screenshots, but as the saying goes, a picture is proof. Haha~
