First, let's look at the basic concepts in Storm by comparing Storm with Hadoop.
| | Hadoop | Storm |
|---|---|---|
| System roles | JobTracker | Nimbus |
| | TaskTracker | Supervisor |
| | Child | Worker |
| Application name | Job | Topology |
| Component interface | Mapper/Reducer | Spout/Bolt |
Next, let's look at these concepts in more detail.
A. Nimbus: responsible for resource allocation and task scheduling.
B. Supervisor: accepts the tasks assigned by Nimbus, and starts and stops the worker processes that it manages.
C. Worker: a process that runs the logic of a specific processing component.
D. Task: each spout/bolt thread in a worker is called a task. Since Storm 0.8, a task no longer corresponds to a physical thread; several tasks of the same spout/bolt may share one physical thread, which is called an executor.
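The task/executor relationship in item D can be pictured with a toy model. This is plain Python, not Storm code: the helper name `assign_tasks` and the round-robin placement are assumptions made for illustration, and Storm's own scheduler may place tasks differently.

```python
from collections import defaultdict


def assign_tasks(component, num_tasks, num_executors):
    """Toy model: spread the tasks of one component over executor threads.

    Hypothetical helper, not part of Storm. Round-robin placement is an
    assumption chosen only to make the sharing visible.
    """
    if num_executors < 1:
        raise ValueError("at least one executor is required")
    executors = defaultdict(list)
    for task_id in range(num_tasks):
        # Tasks of the same component may end up on the same executor.
        executor_id = task_id % num_executors
        executors[executor_id].append(f"{component}-task-{task_id}")
    return dict(executors)


if __name__ == "__main__":
    # Four tasks of the same bolt share two executor threads.
    for executor_id, tasks in assign_tasks("bolt", 4, 2).items():
        print(f"executor {executor_id}: {tasks}")
```

With four tasks and two executors, each executor in this model runs two tasks of the same component, which is the kind of sharing described above.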
The following diagram depicts the relationships between the above roles.
[Figure: http://www.searchtb.com/wp-content/uploads/2012/08/deploy0.jpg]
Topology: a real-time application running in Storm is called a topology, because the flow of messages between its components forms a logical topology.
Spout: the component that produces the source data stream in a topology. Typically, a spout reads data from an external data source and converts it into source data inside the topology. A spout is an active role: its interface has a nextTuple() function, which the Storm framework calls continuously, and the user generates the source data inside it.
Bolt: a component that receives data in a topology and processes it. A bolt can perform any operation, such as filtering, applying functions, merging, or writing to a database. A bolt is a passive role: its interface has an execute(Tuple input) function, which is called when a message is received, and the user performs the desired processing inside it.
Tuple: the basic unit of message passing. A tuple would naturally be a key-value map, but because the field names of the tuples passed between components are defined in advance, a tuple only needs to carry the values in order, so it is a value list.
Stream: a continuous flow of tuples forms a stream.
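The spout, bolt, tuple, and stream roles above can be sketched in a few lines. This is a simplified plain-Python model, not the Storm API: the class names `SentenceSpout` and `WordCountBolt`, the `run_topology` loop, and the snake_case method names are all assumptions for illustration. Unlike this sketch, a real spout is polled by the framework indefinitely rather than until the source runs dry.

```python
class SentenceSpout:
    """Active role: the framework keeps calling next_tuple() to pull data."""

    # Field names are declared once, up front, so tuples carry values only.
    output_fields = ["sentence"]

    def __init__(self, sentences):
        self._pending = list(sentences)  # stands in for an external source

    def next_tuple(self):
        # Return the next tuple as a plain value list, or None when idle.
        if self._pending:
            return [self._pending.pop(0)]
        return None


class WordCountBolt:
    """Passive role: execute() runs only when a tuple arrives."""

    input_fields = SentenceSpout.output_fields

    def __init__(self):
        self.counts = {}

    def execute(self, values):
        # Pair the ordered values with the predeclared field names.
        record = dict(zip(self.input_fields, values))
        for word in record["sentence"].split():
            self.counts[word] = self.counts.get(word, 0) + 1


def run_topology(spout, bolt):
    """Stand-in for the framework loop: feed the spout's stream to the bolt."""
    while True:
        values = spout.next_tuple()
        if values is None:  # this sketch stops; a real topology keeps running
            break
        bolt.execute(values)
    return bolt.counts


if __name__ == "__main__":
    spout = SentenceSpout(["storm is fast", "storm is simple"])
    print(run_topology(spout, WordCountBolt()))
```

The sequence of value lists returned by `next_tuple()` plays the part of the stream, and the bolt recovers field names only by position, which is why a value list is enough.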