A bolt is the data-processing unit of a topology and the unit in which Storm's processing logic is programmed. All processing in a topology is done in bolts, where programmers implement custom logic such as filtering, functions, aggregations, joins, and so on. Complex computations often require several steps and therefore several bolts.
A bolt can emit data items to more than one stream. To do so, first declare the streams with the declareStream() method of the OutputFieldsDeclarer class, then name the target stream when emitting data with the emit method of OutputCollector.
When you declare the input streams of a bolt, you always subscribe to specific streams of another component. To receive all of a component's streams, you must subscribe to each of them individually in your program. As a shorthand, the InputDeclarer object subscribes to a component's default stream when no stream is named.
// Subscribes to the default stream of the component named "1".
declarer.shuffleGrouping("1");
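The stream mechanics above can be sketched without a Storm cluster. What follows is a minimal plain-Java model, not Storm code: SketchCollector and its methods are hypothetical stand-ins that only mirror the shape of declaring a stream, emitting to a named stream, and subscribing to one stream at a time. The stream names "evens" and "odds" are made up for illustration, and "default" is assumed as the default stream ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a collector; NOT Storm's OutputCollector.
class SketchCollector {
    static final String DEFAULT_STREAM_ID = "default"; // assumed default stream name

    // stream id -> data items emitted on that stream
    private final Map<String, List<List<Object>>> streams = new HashMap<>();

    // Mirrors declaring a named output stream before emitting to it.
    void declareStream(String streamId) {
        streams.putIfAbsent(streamId, new ArrayList<>());
    }

    // Mirrors emitting to a named stream: the stream must have been declared.
    void emit(String streamId, List<Object> values) {
        List<List<Object>> stream = streams.get(streamId);
        if (stream == null) {
            throw new IllegalArgumentException("undeclared stream: " + streamId);
        }
        stream.add(values);
    }

    // Mirrors subscribing to one specific stream of a component.
    List<List<Object>> subscribe(String streamId) {
        return streams.getOrDefault(streamId, new ArrayList<>());
    }

    // Mirrors the shorthand that subscribes to the default stream.
    List<List<Object>> subscribe() {
        return subscribe(DEFAULT_STREAM_ID);
    }
}

class MultiStreamSketch {
    // Routes each number to the "evens" or "odds" stream, as a bolt might.
    static SketchCollector route(int... numbers) {
        SketchCollector collector = new SketchCollector();
        collector.declareStream(SketchCollector.DEFAULT_STREAM_ID);
        collector.declareStream("evens");
        collector.declareStream("odds");
        for (int n : numbers) {
            collector.emit(n % 2 == 0 ? "evens" : "odds", List.<Object>of(n));
        }
        return collector;
    }

    public static void main(String[] args) {
        SketchCollector collector = route(1, 2, 3, 4, 5);
        System.out.println("evens: " + collector.subscribe("evens"));
        System.out.println("odds: " + collector.subscribe("odds"));
        System.out.println("default: " + collector.subscribe());
    }
}
```

In this model a consumer that wants both "evens" and "odds" has to call subscribe once per stream, which mirrors the rule that every stream of a component is subscribed to individually.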
IBolt and IComponent interfaces
IBolt interface:
// Called when a task for this component is initialized within a worker process (worker) on the cluster.
prepare() sets up the bolt's run-time task and provides the environment in which the bolt executes.
The stormConf object holds the Storm configuration for this bolt (taken from the topology). The context object gives access to information about the component's run-time task, for example the placement of all tasks in the topology, the task ID, the component ID, and input and output information.
The collector object is used to emit data items from this bolt. Data items can be emitted at any time, including from within the prepare() and cleanup() methods.
void prepare(java.util.Map stormConf, TopologyContext context, OutputCollector collector)
// Receives a data item and processes it.
This method receives a data item (Tuple) and can emit the result of processing as new data items. It is the most important method a bolt has to implement.
The parameter input is a Tuple object that carries metadata about the data item, including the component, stream, and task it came from. The values in the data item can be read with the getValue() method of the Tuple class.
void execute(Tuple input)
// Called when the IBolt is about to be shut down.
void cleanup()
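The call order of the three methods can be illustrated with a small plain-Java sketch. It does not use Storm: SketchBolt is a hypothetical interface that only mirrors the prepare/execute/cleanup lifecycle, a plain list stands in for the output collector, strings stand in for tuples, and "min.length" is a made-up configuration key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in that only mirrors the three lifecycle methods; NOT Storm's IBolt.
interface SketchBolt {
    void prepare(Map<String, Object> conf, List<String> collector);
    void execute(String input);
    void cleanup();
}

// A filtering bolt: forwards only words at least as long as a configured minimum.
class MinLengthFilterBolt implements SketchBolt {
    private List<String> collector; // saved in prepare(), used in execute()
    private int minLength;
    boolean closed = false;

    @Override
    public void prepare(Map<String, Object> conf, List<String> collector) {
        this.collector = collector;
        // "min.length" is a made-up configuration key for this sketch.
        this.minLength = (Integer) conf.getOrDefault("min.length", 1);
    }

    @Override
    public void execute(String input) {
        if (input.length() >= minLength) {
            collector.add(input); // stands in for emitting a new data item
        }
    }

    @Override
    public void cleanup() {
        closed = true;
    }
}

class BoltLifecycleSketch {
    // Drives the bolt in lifecycle order: prepare once, execute per item, cleanup once.
    static List<String> run(SketchBolt bolt, Map<String, Object> conf, List<String> inputs) {
        List<String> emitted = new ArrayList<>();
        bolt.prepare(conf, emitted);
        for (String item : inputs) {
            bolt.execute(item);
        }
        bolt.cleanup();
        return emitted;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("min.length", 4);
        List<String> inputs = List.of("a", "storm", "bolt", "is");
        System.out.println(run(new MinLengthFilterBolt(), conf, inputs));
    }
}
```

Saving the collector as a field in prepare() is the design choice worth noting: execute() receives only the input item, so anything it needs for emitting has to be stored beforehand.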
Methods of the Tuple class, an object of which is the input to the execute() method. (Example methods: int size(); int fieldIndex(java.lang.String field); ......)
These methods fall into the following five categories:
1. Methods that return properties of the data item (size(), fieldIndex(), and contains()).
2. Methods that return metadata (getMessageId(), getSourceComponent(), getSourceTask(), getSourceStreamId(), and getSourceGlobalStreamId()).
The message ID is assigned according to certain rules when the data item is created.
3. Methods that return a value by field position (getValue() and the typed get methods for specific data types).
4. Methods that return a value by field name (getValueByField() and the typed variants for specific data types, such as getStringByField()).
5. Methods that return the field list or the values of a data item (getFields(), getValues(), and select()).
These return, respectively, the list of all fields, the list of all values, and a subset of the values of the data item.
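The accessor categories above can be made concrete with a small stand-in class. This is a sketch under stated assumptions, not Storm's Tuple: SketchTuple is hypothetical, it leaves out the metadata methods of category 2, and its select() takes a plain list of field names where Storm uses its own Fields type. The field names and values are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a data item; NOT Storm's Tuple class.
class SketchTuple {
    private final List<String> fields;
    private final List<Object> values;

    SketchTuple(List<String> fields, List<Object> values) {
        if (fields.size() != values.size()) {
            throw new IllegalArgumentException("fields and values differ in length");
        }
        this.fields = fields;
        this.values = values;
    }

    // Category 1: properties of the data item.
    int size() { return values.size(); }

    int fieldIndex(String field) {
        int i = fields.indexOf(field);
        if (i < 0) {
            throw new IllegalArgumentException("no such field: " + field);
        }
        return i;
    }

    boolean contains(String field) { return fields.contains(field); }

    // Category 3: value by field position.
    Object getValue(int i) { return values.get(i); }

    // Category 4: value by field name.
    Object getValueByField(String field) { return values.get(fieldIndex(field)); }

    // Category 5: field list, value list, and a subset of the values.
    List<String> getFields() { return fields; }

    List<Object> getValues() { return values; }

    List<Object> select(List<String> selector) {
        List<Object> subset = new ArrayList<>();
        for (String field : selector) {
            subset.add(getValueByField(field));
        }
        return subset;
    }
}

class TupleAccessSketch {
    // Builds a sample data item with three invented fields.
    static SketchTuple sample() {
        return new SketchTuple(
                List.of("word", "count", "source"),
                List.<Object>of("storm", 3, "spout-1"));
    }

    public static void main(String[] args) {
        SketchTuple t = sample();
        System.out.println(t.size() + " fields, count at index " + t.fieldIndex("count"));
        System.out.println(t.getValue(0) + " / " + t.getValueByField("count"));
        System.out.println(t.select(List.of("source", "word")));
    }
}
```

Access by position avoids a name lookup, while access by name keeps the bolt independent of field order. The sketch implements the second in terms of the first to show that both read the same underlying value list.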