Twitter Storm source code analysis: How Tuple is sent

Source: Internet
Author: User

In this article, let's look at how a tuple in Storm gets from one task to another.

When a bolt emits a tuple, it first calls the emit or emitDirect method of OutputCollector.
Both methods ultimately reach the transfer function built by mk-transfer-fn in the Clojure code:

 
    ; worker.clj
    (defn mk-transfer-fn [transfer-queue]
      (fn [task ^Tuple tuple]
        (.put ^LinkedBlockingQueue transfer-queue [task tuple])))

In fact, this function only adds a new record (task-id, tuple) to a LinkedBlockingQueue.
The contents of this queue are processed by the following code:

 
    ; worker.clj
    (async-loop
      (fn [^ArrayList drainer ^KryoTupleSerializer serializer]
        ; Take one element from transfer-queue; each element is a (task, tuple) pair
        (let [felem (.take transfer-queue)]
          (.add drainer felem)
          (.drainTo transfer-queue drainer))
        (read-locked endpoint-socket-lock
          ; Get the mapping from node+port to socket
          (let [node+port->socket @node+port->socket
                ; Get the mapping from task-id to node+port
                task->node+port @task->node+port]
            (doseq [[task ^Tuple tuple] drainer]
              ; Obtain the socket corresponding to the task
              (let [socket (node+port->socket (task->node+port task))
                    ; Serialize this tuple
                    ser-tuple (.serialize serializer tuple)]
                ; Send this tuple
                (msg/send socket task ser-tuple)))))
        ; The excerpt ends here; the rest of the async-loop call is not shown.

As the code above shows, the tuple is serialized and then sent to the specified task by msg/send through a socket. Note that async-loop creates a separate thread to run this code, so Storm dedicates an independent thread to sending outgoing messages.
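The enqueue-then-drain pattern described above can be sketched in Java. This is only an illustration under stated assumptions, not Storm's code: the class TransferLoopSketch, the Outgoing pair type, and the use of a list of byte arrays as a fake socket are all hypothetical, UTF-8 encoding stands in for Kryo serialization, and the read lock from the excerpt is omitted.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.LinkedBlockingQueue;

public class TransferLoopSketch {
    /** Hypothetical stand-in for the (task, tuple) pair placed on the queue. */
    static final class Outgoing {
        final int task;
        final String tuple;
        Outgoing(int task, String tuple) { this.task = task; this.tuple = tuple; }
    }

    final LinkedBlockingQueue<Outgoing> transferQueue = new LinkedBlockingQueue<>();
    // Routing tables: task-id -> "host:port", and "host:port" -> fake socket.
    final Map<Integer, String> taskToEndpoint = new HashMap<>();
    final Map<String, List<byte[]>> endpointToSocket = new HashMap<>();

    /** Emit side: only enqueues the pair, never touches the network. */
    void transfer(int task, String tuple) {
        try {
            transferQueue.put(new Outgoing(task, tuple));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** One iteration of the sender loop; returns how many tuples were sent. */
    int sendPending() {
        List<Outgoing> drainer = new ArrayList<>();
        try {
            drainer.add(transferQueue.take());   // block until one element arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        transferQueue.drainTo(drainer);          // then take whatever else is waiting
        for (Outgoing out : drainer) {
            // Assumes every task has a known endpoint and socket.
            List<byte[]> socket = endpointToSocket.get(taskToEndpoint.get(out.task));
            socket.add(out.tuple.getBytes(StandardCharsets.UTF_8)); // "serialize" and "send"
        }
        return drainer.size();
    }

    public static void main(String[] args) throws InterruptedException {
        TransferLoopSketch worker = new TransferLoopSketch();
        worker.taskToEndpoint.put(7, "node-a:6700");
        worker.endpointToSocket.put("node-a:6700", new ArrayList<>());
        worker.transfer(7, "hello");
        worker.transfer(7, "world");
        // The sending runs on its own thread, as async-loop does in the excerpt.
        Thread sender = new Thread(
            () -> System.out.println("sent " + worker.sendPending() + " tuples"));
        sender.start();
        sender.join();
    }
}
```

Blocking on take and then calling drainTo lets the sender thread sleep while the queue is empty and still handle a burst of tuples in one batch.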

Let's look at what this socket is. It is initialized in worker.clj, in the following code:

 
    ; socket (worker.clj)
    (swap! node+port->socket
           merge
           (into {}
                 (dofor [[node port :as endpoint] new-connections]
                   [endpoint
                    (msg/connect
                      mq-context
                      ((:node->host assignment) node)
                      port)])))

The code above shows that the socket is created by msg/connect. So what does msg/connect do? It is declared in protocol.clj:

 
    (defprotocol Context
      (bind [context virtual-port])
      (connect [context host port])
      (send-local-task-empty [context virtual-port])
      (term [context]))

This definition is only an interface (a Clojure protocol). The concrete implementation is in zmq.clj, where zmq is short for ZeroMQ. In other words, Storm workers use ZeroMQ to transmit tuples.
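To make the interface-versus-implementation split concrete, here is a rough Java sketch of the four operations in the Context protocol, with a toy in-memory implementation. Everything in it is an assumption made for illustration: the Connection type, its send and poll methods, and InMemoryContext do not come from Storm. It also treats a virtual port and a real port as the same number, which the ZeroMQ implementation does not do.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class ContextSketch {
    /** Hypothetical Java rendering of the four Context operations. */
    interface Context {
        Connection bind(int virtualPort);
        Connection connect(String host, int port);
        void sendLocalTaskEmpty(int virtualPort);
        void term();
    }

    /** Hypothetical connection type; not taken from the excerpt. */
    interface Connection {
        void send(byte[] message);
        byte[] poll();   // returns null when nothing is waiting
    }

    /** Toy implementation: each port is a queue inside this JVM; host is ignored. */
    static final class InMemoryContext implements Context {
        private final Map<Integer, LinkedBlockingQueue<byte[]>> ports =
            new ConcurrentHashMap<>();

        private Connection connectionFor(int port) {
            LinkedBlockingQueue<byte[]> queue =
                ports.computeIfAbsent(port, p -> new LinkedBlockingQueue<>());
            return new Connection() {
                public void send(byte[] message) { queue.add(message); }
                public byte[] poll() { return queue.poll(); }
            };
        }

        public Connection bind(int virtualPort) { return connectionFor(virtualPort); }
        public Connection connect(String host, int port) { return connectionFor(port); }
        public void sendLocalTaskEmpty(int virtualPort) {
            connectionFor(virtualPort).send(new byte[0]);
        }
        public void term() { ports.clear(); }
    }

    public static void main(String[] args) {
        // Callers depend only on Context, so the transport can be swapped.
        Context ctx = new InMemoryContext();
        Connection receiver = ctx.bind(6700);
        Connection sender = ctx.connect("localhost", 6700);
        sender.send(new byte[] {1, 2, 3});
        System.out.println(receiver.poll().length);   // prints 3
        ctx.term();
    }
}
```

Because the calling code only sees Context, replacing the in-memory transport with a ZeroMQ-backed one would not change the caller.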

ZMQContext in zmq.clj implements the Context interface:

 
    (deftype ZMQContext [context linger-ms ipc?]
      ; Implement the Context interface
      Context
      ; Pull messages from the given virtual-port
      (bind [this virtual-port]
        (-> context
            (mq/socket mq/pull)
            (mqvp/virtual-bind virtual-port)
            (ZMQConnection.)))
      ; Push messages to the specified host and port
      (connect [this host port]
        (let [url (if ipc?
                    (str "ipc://" port "ipc")
                    (str "tcp://" host ":" port))]
          (-> context
              (mq/socket mq/push)
              (mq/set-linger linger-ms)
              (mq/connect url)
              (ZMQConnection.))))
      ; Send an empty message to the local virtual-port
      (send-local-task-empty [this virtual-port]
        (let [pusher
              (-> context
                  (mq/socket mq/push)
                  (mqvp/virtual-connect virtual-port))]
          (mq/send pusher (mq/barr))
          (.close pusher)))
      (term [this]
        (.term context))
      ; Implement the ZMQContextQuery interface
      ZMQContextQuery
      (zmq-context [this]
        context))
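The connect method above picks its endpoint URL from the ipc? flag. A minimal Java sketch of that branch follows. The class and method names are hypothetical, and the "ipc" suffix is copied from the excerpt exactly as printed.

```java
public class EndpointUrlSketch {
    /** Mirrors the (if ipc? ...) branch in connect; host is unused for ipc. */
    static String endpointUrl(boolean ipc, String host, int port) {
        return ipc
            ? "ipc://" + port + "ipc"          // local inter-process endpoint
            : "tcp://" + host + ":" + port;    // remote TCP endpoint
    }

    public static void main(String[] args) {
        System.out.println(endpointUrl(false, "10.0.0.5", 6700)); // tcp://10.0.0.5:6700
        System.out.println(endpointUrl(true, "10.0.0.5", 6700));  // ipc://6700ipc
    }
}
```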

To summarize, this is how Twitter Storm creates and sends a tuple:

  1. A bolt creates a tuple.
  2. The worker puts the tuple, together with the task-id it is to be sent to, into a queue (a LinkedBlockingQueue).
  3. A separate thread (the one created by async-loop) takes each tuple from the sending queue and processes it.
      • The worker obtains the ZeroMQ connection (socket) for the target task.
      • It serializes the tuple and sends it through that ZeroMQ connection.

