I. Introduction
In the previous article we covered the basics of Apache Storm, an open source distributed, real-time, scalable, fault-tolerant computation system, and tied that background together with a simple Storm example.
A Storm topology is a distributed, real-time computation application: spouts and bolts are connected in series by stream groupings to form a stream-processing structure. A topology runs in the cluster until it is explicitly killed (storm kill topology-name [-w wait-time-secs]); it never finishes on its own.
A topology can run in two modes: local mode and distributed (cluster) mode.
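For reference, assuming the topology was submitted under the name word-count (as in the local-mode example later in this article), killing it from the command line would look something like the following; the 30-second wait is just an illustrative value:
> ./storm kill word-count -w 30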
II. Word Count Example
A spout reads the text and sends it to a first bolt, which splits each sentence into words; the words are then routed so that identical words always reach the same task of a second bolt, which counts them. These steps can be spread across multiple servers.
The components involved are a spout, bolts, stream groupings (shuffleGrouping and fieldsGrouping), and the topology itself.
Step One: Create the spout data source
import java.util.Map;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

/**
 * Data source
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class SentenceSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;
    private String[] sentences = {
        "Apache Storm is a free and open source distributed realtime computation system",
        "Storm makes it easy to reliably process unbounded streams of data",
        "doing for realtime processing what Hadoop did for batch processing",
        "Storm is simple",
        "can be used with any programming language",
        "and is a lot of fun to use"
    };
    private int index = 0;

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Declare the output field this spout emits
        declarer.declare(new Fields("sentence"));
    }

    @SuppressWarnings("rawtypes")
    @Override
    public void open(Map config, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        if (index >= sentences.length) {
            return;
        }
        // Emit the next sentence
        this.collector.emit(new Values(sentences[index]));
        index++;
        Utils.sleep(1);
    }
}
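As written, nextTuple emits each sample sentence once and then goes idle. A small variation (not part of the original example, just a sketch) keeps the demo stream alive by cycling through the sentences indefinitely:

    @Override
    public void nextTuple() {
        // Cycle through the sample sentences forever instead of stopping at the end
        this.collector.emit(new Values(sentences[index]));
        index = (index + 1) % sentences.length;
        Utils.sleep(1);
    }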
Step Two: Implement the sentence-splitting bolt
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

/**
 * Splits sentences into words
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class SplitSentenceBolt extends BaseBasicBolt {

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Declare the field passed on to the next bolt
        declarer.declare(new Fields("word"));
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String sentence = input.getStringByField("sentence");
        String[] words = sentence.split(" ");
        for (String word : words) {
            // Emit each word
            collector.emit(new Values(word));
        }
    }
}
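BaseBasicBolt is used here because it acknowledges each input tuple automatically after execute returns. For comparison, here is a minimal sketch of the same splitting logic written against BaseRichBolt, where the collector is kept in prepare and tuples are anchored and acked by hand; the class name SplitSentenceRichBolt is only illustrative and not part of the original example:

import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

@SuppressWarnings("serial")
public class SplitSentenceRichBolt extends BaseRichBolt {

    private OutputCollector collector;

    @SuppressWarnings("rawtypes")
    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        for (String word : input.getStringByField("sentence").split(" ")) {
            // Anchor each emitted word to the input tuple for reliability tracking
            collector.emit(input, new Values(word));
        }
        // With BaseRichBolt the input tuple must be acked explicitly
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}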
Step Three: Implement the word-count bolt
import java.util.HashMap;
import java.util.Map;

import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Tuple;

/**
 * Counts words
 * @author zhengcy
 */
@SuppressWarnings("serial")
public class WordCountBolt extends BaseBasicBolt {

    private Map<String, Long> counts = null;

    @SuppressWarnings("rawtypes")
    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        this.counts = new HashMap<String, Long>();
    }

    @Override
    public void cleanup() {
        // Called when the topology is shut down; print the final counts
        for (String key : counts.keySet()) {
            System.out.println(key + ":" + this.counts.get(key));
        }
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("word");
        Long count = this.counts.get(word);
        if (count == null) {
            count = 0L;
        }
        count++;
        this.counts.put(word, count);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // This bolt is terminal and emits nothing downstream
    }
}
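One design note: cleanup() is only reliably invoked when the topology is shut down in local mode; on a real cluster there is no such guarantee, so counts are usually emitted downstream or written to external storage instead of printed. A sketch of what emitting the running count would look like (this is a variation, not part of the original example, and would additionally require the Fields and Values imports):

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("word");
        Long count = this.counts.get(word);
        if (count == null) {
            count = 0L;
        }
        count++;
        this.counts.put(word, count);
        // Emit the updated count so a downstream bolt or sink can consume it
        collector.emit(new Values(word, count));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }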
Step Four: Create the topology
The stream groupings wire the spout and the bolts together into the stream-processing flow, and the parallelism hint for the spout and each bolt is set here as well.
The topology can be submitted in local mode or in distributed (cluster) mode.
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

/**
 * Word count topology
 * @author zhengcy
 */
public class WordCountTopology {

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new SentenceSpout(), 1);
        builder.setBolt("split", new SplitSentenceBolt(), 2).shuffleGrouping("spout");
        builder.setBolt("count", new WordCountBolt(), 2).fieldsGrouping("split", new Fields("word"));

        Config conf = new Config();
        conf.setDebug(false);

        if (args != null && args.length > 0) {
            // Cluster mode
            conf.setNumWorkers(2);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // Local mode
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("word-count", conf, builder.createTopology());
            Thread.sleep(10000);
            cluster.shutdown();
        }
    }
}
III. Running the Topology
1. Local mode
Local mode is used for local development and debugging; there is no need to deploy to a Storm cluster, you simply run the Java main function.
// Local mode
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("word-count", conf, builder.createTopology());
2. Cluster mode
Package the code into a jar and copy it to a directory on the server, for example /usr/local/storm, then run the storm command from /usr/local/storm/bin to submit the topology:
> ./storm jar ../stormtest.jar cn.storm.WordCountTopology WordCountTopology
Check the Storm UI to confirm that the topology was submitted successfully.
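Alternatively, assuming the storm client is on the path, the running topologies and their status can also be listed from the command line (shown here only as a convenience, not part of the original walkthrough):
> ./storm list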
IV. Viewing Topology Logs in Cluster Mode
We can check the worker logs to see whether the running topology is reporting any errors.
Step One: Access the Storm UI
Example: http://192.168.2.200:8081/index.html
Open the Storm UI, click the corresponding topology, and check which servers its workers are distributed across.
Step Two: View the log
tail -f logs/workers-artifacts/<topology-id>/<port>/worker.log
For example:
> tail -f logs/workers-artifacts/WordCountTopology-1-1497095813/6700/worker.log