Storm (VI): Splitting and merging data streams

Source: Internet
Author: User
Tags: config, emit, sleep

When Storm processes data, different kinds of data often need to go to different bolts for processing (splitting the stream), and the processed results then need to reach the same bolt, for example to be stored in a database (merging the streams). The following example walks through both splitting and merging.

A spout reads the text and sends it to the first layer of bolts, which cut it into words: sentences whose words are separated by spaces go to bolt 1, and sentences whose words are separated by commas go to bolt 2 — that is the split. After cutting, identical words are sent to the same task of the second-layer bolt to be counted — that is the merge. Every one of these steps can be spread across multiple servers.
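Before looking at the Storm code, the split/merge idea can be sketched in plain Java. This is only an illustration of the logic in a single process (the class and method names here are made up for the sketch, not part of the Storm API):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the flow described above: each sentence is routed to
// one of two "bolts" by its delimiter, each bolt cuts the sentence into
// words, and all words are merged into a single count map.
public class SplitMergeSketch {

    // Bolt 1: cuts space-separated sentences
    static String[] splitBySpace(String sentence) {
        return sentence.split(" ");
    }

    // Bolt 2: cuts comma-separated sentences
    static String[] splitByComma(String sentence) {
        return sentence.split(",");
    }

    // Merge step: identical words land in the same counter entry
    static Map<String, Long> count(String[] sentences) {
        Map<String, Long> counts = new HashMap<>();
        for (String sentence : sentences) {
            // Split step: pick the "bolt" by delimiter
            String[] words = sentence.contains(",")
                    ? splitByComma(sentence)
                    : splitBySpace(sentence);
            for (String word : words) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

In the real topology the routing and counting happen across parallel tasks, possibly on different machines; Storm's stream IDs and groupings decide which task each tuple reaches.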

1. Splitting the stream

1) First, declare the output streams in declareOutputFields of the component that emits them:

	@Override
	public void declareOutputFields(OutputFieldsDeclarer declarer) {
		declarer.declareStream("streamId1", new Fields("field"));
		declarer.declareStream("streamId2", new Fields("field"));
	}
2) Then specify the stream ID when emitting:

	collector.emit("streamId1", new Values(sentence));

3) Finally, subscribe each bolt to its stream ID when building the topology:

       builder.setBolt("split1", new SplitSentence1Bolt(), 2).shuffleGrouping("spout", "streamId1");
       builder.setBolt("split2", new SplitSentence2Bolt(), 2).shuffleGrouping("spout", "streamId2");

2. Merging the streams

When building the topology, declare which upstream bolts the merging bolt subscribes to:

       builder.setBolt("count", new WordCountBolt(), 2).fieldsGrouping("split1", new Fields("word"))
               .fieldsGrouping("split2", new Fields("word"));
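Why does fieldsGrouping guarantee that the same word always reaches the same counting task? Conceptually, the target task is chosen from a hash of the grouping field's value, as in this illustrative sketch (the class and method here are invented for illustration and only mirror the idea, not Storm's exact internals):

```java
// Sketch of fields-grouping routing: tuples with equal "word" values always
// map to the same task index, so their counts accumulate in one place.
public class FieldsGroupingSketch {

    static int taskFor(String word, int numTasks) {
        // Math.floorMod keeps the index non-negative even when hashCode() is negative
        return Math.floorMod(word.hashCode(), numTasks);
    }
}
```

By contrast, shuffleGrouping (used above between the spout and the splitting bolts) distributes tuples randomly, which is fine there because any splitting task can cut any sentence.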

Now let's look at the whole example:

Step one: Create the spout data source

import java.util.Map;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

/**
 * Data source
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class SentenceSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;
    private String[] sentences = {
            "Apache Storm is a free and open source distributed realtime computation system",
            "Storm,makes,it,easy,to,reliably,process,unbounded,streams,of,data",
            "doing for realtime processing what Hadoop did for batch processing",
            "Can,be,used,with,any,programming,language",
            "and is a lot of fun to use" };
    private int index = 0;

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Declare the two output streams used for the split
        declarer.declareStream("streamId1", new Fields("sentence"));
        declarer.declareStream("streamId2", new Fields("sentence"));
    }

    @SuppressWarnings("rawtypes")
    public void open(Map config, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    public void nextTuple() {
        if (index >= sentences.length) {
            return;
        }
        // Even indexes hold space-separated sentences, odd indexes comma-separated ones
        if (index % 2 == 0) {
            collector.emit("streamId1", new Values(sentences[index]));
        } else {
            collector.emit("streamId2", new Values(sentences[index]));
        }
        index++;
        Utils.sleep(1);
    }
}


Step two: Implement word-splitting bolt 1

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

/**
 * Cuts space-separated sentences into words
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class SplitSentence1Bolt extends BaseBasicBolt {

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String sentence = input.getStringByField("sentence");
        String[] words = sentence.split(" ");
        for (String word : words) {
            collector.emit(new Values(word));
        }
    }
}


Step three: Implement word-splitting bolt 2

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

/**
 * Cuts comma-separated sentences into words
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class SplitSentence2Bolt extends BaseBasicBolt {

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String sentence = input.getStringByField("sentence");
        String[] words = sentence.split(",");
        for (String word : words) {
            collector.emit(new Values(word));
        }
    }
}

Step four: Implement the word-counting bolt

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

/**
 * Counts words
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class WordCountBolt extends BaseBasicBolt {

    private Map<String, Long> counts = null;

    @SuppressWarnings("rawtypes")
    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        this.counts = new HashMap<String, Long>();
    }

    @Override
    public void cleanup() {
        // Print the final counts when the topology shuts down
        for (String key : counts.keySet()) {
            System.out.println(key + " : " + this.counts.get(key));
        }
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("word");
        Long count = this.counts.get(word);
        if (count == null) {
            count = 0L;
        }
        count++;
        this.counts.put(word, count);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
    }
}


Step five: Create the topology

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

/**
 * Word-count topology
 * @author zhengcy
 */
public class WordCountTopology {

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new SentenceSpout(), 1);
        // Split: each splitting bolt subscribes to its own stream from the spout
        builder.setBolt("split1", new SplitSentence1Bolt(), 2).shuffleGrouping("spout", "streamId1");
        builder.setBolt("split2", new SplitSentence2Bolt(), 2).shuffleGrouping("spout", "streamId2");
        // Merge: the counting bolt subscribes to both splitting bolts
        builder.setBolt("count", new WordCountBolt(), 2).fieldsGrouping("split1", new Fields("word"))
                .fieldsGrouping("split2", new Fields("word"));

        Config conf = new Config();
        conf.setDebug(false);
        if (args != null && args.length > 0) {
            // Cluster mode
            conf.setNumWorkers(2);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Local mode
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("word-count", conf, builder.createTopology());
            Thread.sleep(10000);
            cluster.shutdown();
        }
    }
}


