The officially provided storm-starter project contains many example applications that are helpful for understanding Storm's usage scenarios. This article breaks the source code down by function, as a record and a memory index.
Let's start with a simple example: WordCountTopology. The original code demonstrates multi-language integration: it randomly generates sentences, splits them into words, and counts how often each word appears. I have modified it to remove the Python call and split sentences in pure Java instead.
Let's look at the topology definition first:
public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new RandomSentenceSpout(), 5); // data source
    builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout"); // sentence splitting
    builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word")); // count word occurrences
    Config conf = new Config();
    conf.setDebug(true);
    if (args != null && args.length > 0) {
        conf.setNumWorkers(3);
        StormSubmitter.submitTopologyWithProgressBar("wordcount", conf, builder.createTopology());
    } else {
        conf.setMaxTaskParallelism(3);
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("word-count", conf, builder.createTopology());
        Thread.sleep(10000);
        cluster.shutdown();
    }
}
RandomSentenceSpout is simple: it picks a random sentence, emits it as a tuple, and declares the output field name "word".
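The core of that spout is just random selection from a fixed sentence pool. Here is a self-contained sketch of that logic (a plain Java class, not the real Storm spout; the class name and sentence list are illustrative assumptions):

```java
import java.util.Random;

public class RandomSentencePicker {
    // A sentence pool in the spirit of storm-starter's RandomSentenceSpout
    static final String[] SENTENCES = {
        "the cow jumped over the moon",
        "an apple a day keeps the doctor away",
        "four score and seven years ago",
        "snow white and the seven dwarfs"
    };

    private final Random rand = new Random();

    // What the spout's nextTuple() does before emitting: pick a random sentence
    public String next() {
        return SENTENCES[rand.nextInt(SENTENCES.length)];
    }

    public static void main(String[] args) {
        RandomSentencePicker picker = new RandomSentencePicker();
        System.out.println(picker.next());
    }
}
```

In the real spout, the chosen sentence is passed to `collector.emit(new Values(sentence))` inside `nextTuple()`.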
Now look at the bolts. This example uses both shuffleGrouping (random grouping) and fieldsGrouping (grouping by field). shuffleGrouping distributes sentences randomly across the splitting bolt's tasks. To ensure that each word is counted by a single bolt task, fieldsGrouping is used for the counting bolt: at runtime, tuples with the same value of the grouping field ("word") are always routed to the same task.
public static class SplitSentence implements IBasicBolt {
    public void prepare(Map conf, TopologyContext context) {
    }

    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String sentence = tuple.getString(0);
        for (String word : sentence.split(" ")) { // split the sentence received from the spout on spaces to produce a stream of words
            collector.emit(new Values(word));
        }
    }

    public void cleanup() {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word")); // declare the output field name
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}
The word-counting bolt:
public static class WordCount extends BaseBasicBolt {
    Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        Integer count = counts.get(word);
        if (count == null) {
            count = 0;
        }
        count++;
        counts.put(word, count);
        collector.emit(new Values(word, count)); // emit the word together with its running count
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}
Summary:
In general, to guarantee data reliability and completeness, implement the IRichBolt interface or extend BaseRichBolt and handle acknowledgments yourself. If you do not want to deal with ack handling, implement IBasicBolt or extend BaseBasicBolt instead: the framework then acks the input tuple automatically after each execute, which is equivalent to calling collector.ack(inputTuple) for you.
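To make that difference concrete, here is a simplified, self-contained sketch. The Tuple and Bolt types below are toy stand-ins, not Storm's real classes; the point is only to show that the "rich" style must ack explicitly, while the "basic" style is wrapped by the framework, which acks after execute returns:

```java
public class AckSketch {
    // Toy stand-in for Storm's Tuple (simplified for illustration)
    static class Tuple {
        final String value;
        boolean acked = false;
        Tuple(String value) { this.value = value; }
    }

    interface Bolt {
        void execute(Tuple input);
    }

    // "Rich" style: the bolt must ack the input tuple itself
    static class RichStyleBolt implements Bolt {
        public void execute(Tuple input) {
            // ... process input.value ...
            input.acked = true; // explicit ack; forgetting this causes replays/timeouts
        }
    }

    // "Basic" style: the bolt only processes; no ack call in user code
    static class BasicStyleBolt implements Bolt {
        public void execute(Tuple input) {
            // ... process input.value ...
        }
    }

    // Mimics what the framework's wrapper does around a basic bolt
    static void runBasic(Bolt bolt, Tuple input) {
        bolt.execute(input);
        input.acked = true; // framework acks automatically when execute succeeds
    }

    public static void main(String[] args) {
        Tuple t1 = new Tuple("hello");
        new RichStyleBolt().execute(t1);

        Tuple t2 = new Tuple("world");
        runBasic(new BasicStyleBolt(), t2);

        System.out.println(t1.acked + " " + t2.acked);
    }
}
```

Both tuples end up acked, but only the rich-style bolt had to do it in its own code; the basic style delegates that bookkeeping to its wrapper.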