The spout executor implements the mk-threads multimethod, which creates the main message-loop function for that type of executor.
(defmulti mk-threads executor-selector)
The main message loop returned by mk-threads is run by async-loop. If the function passed in is a factory, it is called once, the first time the loop runs, and returns the function that is actually used for the message loop.
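As a rough illustration of this dispatch-plus-factory arrangement, the sketch below defines a multimethod keyed on an executor type. The names executor-selector-sketch and mk-threads-sketch, and the :type key, are hypothetical stand-ins chosen for this example; they are not Storm's actual definitions.

```clojure
;; Hypothetical selector: picks the implementation from the executor's type.
(defn executor-selector-sketch [executor-data & _]
  (:type executor-data))

(defmulti mk-threads-sketch executor-selector-sketch)

;; The spout implementation returns a factory. Calling the factory once
;; yields the function that is run on every pass of the message loop.
(defmethod mk-threads-sketch :spout [executor-data]
  (fn factory []
    (let [passes (atom 0)]            ; state created once, at initialization
      (fn loop-body []
        (swap! passes inc)))))        ; returns the number of passes so far

(defmethod mk-threads-sketch :bolt [executor-data]
  (fn factory []
    (fn loop-body [] :bolt-pass)))
```

Calling ((mk-threads-sketch {:type :spout})) runs the factory and returns the loop body; each later call of that body reuses the state the factory set up.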
Spout input handler function
The spout's input handler takes messages from the receive queue in a non-blocking manner:
(disruptor/consume-batch receive-queue event-handler)
Handler function prototype:
tuple-action-fn (fn [task-id ^TupleImpl tuple])
Function description:
- Determine the source of the message from its stream ID. If the message is from SYSTEM_TICK_STREAM_ID, call the rotate method of the pending object, which times out messages that were sent earlier and are still unacknowledged. For message tracking, the spout keeps every sent message in the pending object and uses system tick messages as the signal to clean up this cache.
- If the message is from METRICS_TICK_STREAM_ID, call the metrics-tick function, which collects the current statistics and sends them to the metrics bolt.
- Any other source can only be the ack or fail stream. Get the ID of the message, remove the entry for that ID from the pending object, and use the removed entry, which has the format:
[stored-task-id spout-id tuple-finished-info start-time-ms]
stored-task-id is the ID of the task that sent the message; spout-id is the message-id supplied with the message; tuple-finished-info contains the stream ID and content of the sent message; start-time-ms holds the time at which the message was sent if it was selected for statistical sampling, and is empty otherwise.
- If the message is from ACKER-ACK-STREAM-ID, call ack-spout-msg to process it; this function mainly invokes the ack callback of the user's spout object: (.ack spout msg-id). If the message is from ACKER-FAIL-STREAM-ID, call fail-spout-msg, which mainly invokes the fail callback of the spout object: (.fail spout msg-id).
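The branching described in the list above can be sketched in plain Clojure. This is a simplified model, not Storm's code: pending is an ordinary map in an atom rather than a rotating map, tuples are plain maps, the stream ID values are illustrative strings, and the function returns a description of the action instead of invoking spout callbacks. All names here are hypothetical.

```clojure
;; Illustrative stream ids; the values are assumptions, not Storm's constants.
(def system-tick "system-tick")
(def metrics-tick "metrics-tick")
(def ack-stream "ack")
(def fail-stream "fail")

(defn handle-tuple
  "pending: atom of {id [stored-task-id spout-id tuple-finished-info start-time-ms]}.
   tuple:   map with :stream-id and, for ack/fail, :id.
   Returns a value describing the action taken."
  [pending task-id {:keys [stream-id id]}]
  (cond
    (= stream-id system-tick)  :rotate-pending   ; real code expires old entries here
    (= stream-id metrics-tick) :metrics-tick
    :else
    (let [[stored-task-id spout-id _info _start-ms] (get @pending id)]
      (swap! pending dissoc id)                  ; clear the entry for this id
      (cond
        (nil? spout-id)               :unknown-id   ; expired or never tracked
        (not= stored-task-id task-id) :wrong-task   ; sketch only reports the mismatch
        (= stream-id ack-stream)      [:ack spout-id]
        (= stream-id fail-stream)     [:fail spout-id]))))
```

For example, with pending holding {7 [1 "msg-a" {...} nil]}, handling an ack tuple with :id 7 for task 1 yields [:ack "msg-a"] and leaves pending empty, so a later fail for the same ID is ignored.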
Spout message send function
The spout uses the send-spout-msg function to send messages.
Function prototype:
send-spout-msg (fn [out-stream-id values message-id out-task-id])
Parameter description:
out-stream-id is the stream ID of the message; values is the message content; message-id is the ID of the message and indicates whether the message is tracked; out-task-id is the task ID of the receiver and is used to send a message to a direct stream.
Method description:
- Call the tasks-fn function to get the target task IDs of the message. tasks-fn is the main function of a task: based on the stream ID and content of the message, it determines which tasks receive the message on that stream and how they receive it. For direct grouping, its main purpose is to check that the target out-task-id really does receive the stream by direct grouping.
- Internally, tasks-fn obtains the set of target tasks by calling the grouping function of each component subscribed to the stream; these are returned by the outbound-components function.
- Call the transfer-fn function to send the message. transfer-fn is created by mk-executor-transfer-fn and puts the message on the executor's send queue.
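A minimal sketch of the send path described above, under stated assumptions: groupers are plain functions from values to task IDs, pending is a map in an atom, the tracking ID is a random UUID string (chosen only for this example), and sampling of the start time is omitted. make-tasks-fn and make-send-spout-msg are hypothetical helper names.

```clojure
(defn make-tasks-fn
  "groupers: {stream-id (fn [values] -> seq of task ids)}.
   direct-streams: set of stream ids declared as direct."
  [groupers direct-streams]
  (fn
    ;; Normal emit: ask the stream's grouper for the target tasks.
    ([out-stream-id values]
     (if-let [grouper (get groupers out-stream-id)]
       (grouper values)
       []))
    ;; Direct emit: only check that the stream really is a direct stream.
    ([out-task-id out-stream-id _values]
     (when-not (contains? direct-streams out-stream-id)
       (throw (IllegalArgumentException. "stream is not a direct stream")))
     [out-task-id])))

(defn make-send-spout-msg
  "Builds a task-level send function in the shape of the prototype above."
  [task-id tasks-fn transfer-fn pending]
  (fn [out-stream-id values message-id out-task-id]
    (let [out-tasks (if out-task-id
                      (tasks-fn out-task-id out-stream-id values)
                      (tasks-fn out-stream-id values))
          ;; A non-nil message-id means the message is tracked.
          root-id   (when message-id (str (java.util.UUID/randomUUID)))]
      (doseq [t out-tasks]
        (transfer-fn t {:stream out-stream-id :values values :root root-id}))
      (when root-id
        (swap! pending assoc root-id
               [task-id message-id {:stream out-stream-id :values values} nil]))
      out-tasks)))
```

Passing nil as message-id sends the message without adding anything to pending, which matches the idea that an untracked message is never acked or failed.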
mk-executor-transfer-fn function prototype:
(defn mk-executor-transfer-fn [batch-transfer->worker])
Function description:
- batch-transfer->worker is the output disruptor queue of the executor.
- The returned function has three overloads (arities); the main difference between them is whether a message that cannot be sent is cached.
- When the consumer of the disruptor queue has not started, or the queue is out of space, overflow-buffer temporarily stores the messages waiting to be sent.
- If overflow-buffer is not empty when a message is sent, a capacity exception has already occurred and the disruptor queue does not have enough space, so the message is placed directly into overflow-buffer for efficiency.
- In the spout message loop, the data in overflow-buffer is sent first.
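The overflow behaviour in the list above can be sketched with standard JDK collections. This is an approximation under loud assumptions: an ArrayBlockingQueue stands in for the disruptor queue, a failed offer stands in for the capacity exception, and mk-transfer-fn-sketch and drain-overflow! are hypothetical names.

```clojure
(import '(java.util LinkedList)
        '(java.util.concurrent ArrayBlockingQueue))

(defn mk-transfer-fn-sketch
  [^ArrayBlockingQueue queue]
  (fn this
    ([task tuple block? ^LinkedList overflow-buffer]
     (cond
       ;; Earlier messages are already waiting: keep ordering, skip the queue.
       (and overflow-buffer (not (.isEmpty overflow-buffer)))
       (.add overflow-buffer [task tuple])

       ;; Blocking publish: wait until the queue has room.
       block?
       (.put queue [task tuple])

       ;; Non-blocking publish succeeded.
       (.offer queue [task tuple])
       true

       ;; Queue full: cache the message if a buffer was supplied.
       overflow-buffer
       (.add overflow-buffer [task tuple])

       :else
       (throw (IllegalStateException. "queue full and no overflow buffer"))))
    ;; With a buffer, never block; without one, block.
    ([task tuple overflow-buffer]
     (this task tuple (nil? overflow-buffer) overflow-buffer))
    ([task tuple]
     (this task tuple nil))))

(defn drain-overflow!
  "Moves buffered messages to the queue, oldest first, until it is full again."
  [^ArrayBlockingQueue queue ^LinkedList overflow-buffer]
  (loop []
    (when-let [msg (.peek overflow-buffer)]
      (when (.offer queue msg)
        (.removeFirst overflow-buffer)
        (recur)))))
```

Draining the buffer before emitting anything new is what preserves send order once a message has been diverted to overflow-buffer.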
Initialization of the spout object
The open method of each spout object in the executor is called, and it is called only once.
Description of the initialization process:
- Wait for the corresponding topology to become active.
- For each spout in the executor, obtain its tasks-fn function and define its send-spout-msg function; send-spout-msg uses tasks-fn to choose the target task IDs. Each spout gets its own send-spout-msg, which is task-level and not shared across the executor.
- Invoke the open callback of the spout object, instantiating a SpoutOutputCollector for it; the collector is used primarily to call send-spout-msg to send messages.
- Call the consumer-started! function to open the receive queue. Because the receive queue is not open until after the open callback has run, it is best not to send messages from the spout's open method.
Spout message loop
async-loop: this function uses a thread to call the supplied function afn in a loop. afn must return a time interval after each execution, and the loop waits for that interval before the next call.
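A minimal sketch of such a loop follows. It is not Storm's async-loop: the name async-loop-sketch, the :factory? option as written here, the use of milliseconds for the interval, and stopping when afn returns nil are all assumptions of this example.

```clojure
(defn async-loop-sketch
  "Runs afn repeatedly on a new thread. afn returns the number of
   milliseconds to wait before the next call, or nil to stop the loop.
   With :factory? true, afn is called once inside the thread and its
   result becomes the function that is looped."
  [afn & {:keys [factory?] :or {factory? false}}]
  (let [thread (Thread.
                 ^Runnable
                 (fn []
                   (let [f (if factory? (afn) afn)]
                     (loop []
                       (when-let [sleep-ms (f)]
                         (when (pos? sleep-ms)
                           (Thread/sleep (long sleep-ms)))
                         (recur))))))]
    (.start thread)
    thread))
```

Returning 0 from the loop function gives a busy loop with no sleep between passes, which is the usual choice for a message loop that does its own waiting when idle.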
Process description:
- Process the messages in the receive queue in a non-blocking manner.
- Send the data in overflow-buffer first.
- The spout may send messages if overflow-buffer is empty and either max-spout-pending is not set or pending holds fewer than max-spout-pending entries. If the topology is not active, the loop sleeps for 100 milliseconds instead.
- Call the spout's nextTuple callback to send messages. nextTuple uses the emit or emitDirect method of the SpoutOutputCollector passed to open, which eventually calls send-spout-msg to put the message on the executor's send queue; send-spout-msg also updates emitted-count.
- If emitted-count still equals curr-count, the value recorded before the call, nextTuple did not send a message, and the emptyEmit method of spout-wait-strategy is called to handle this. By default it sleeps for 1 millisecond (the TOPOLOGY_SLEEP_SPOUT_WAIT_STRATEGY_TIME_MS configuration item determines the sleep time).
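The decisions made in one pass of the loop, as described in the list above, can be modelled as a pure function. This sketch leaves out queue consumption and the overflow drain, and every key in the input map (:overflow-size, :pending-size, :max-spout-pending, :active?, :next-tuple!) is a hypothetical name introduced for the example.

```clojure
(defn spout-loop-pass
  "Returns a vector of the actions taken in one pass.
   :next-tuple! is a function that returns how many tuples it emitted;
   :max-spout-pending is nil when the limit is not set."
  [{:keys [overflow-size pending-size max-spout-pending active? next-tuple!]}]
  (let [can-emit? (and (zero? overflow-size)
                       (or (nil? max-spout-pending)
                           (< pending-size max-spout-pending)))
        ;; nextTuple is only called when there is room and the topology is active.
        emitted   (if (and can-emit? active?) (next-tuple!) 0)]
    (cond-> []
      (and can-emit? (not active?)) (conj :sleep-100ms)
      (pos? emitted)                (conj :emitted)
      ;; Nothing was emitted in this pass: hand over to the wait strategy.
      (zero? emitted)               (conj :empty-emit-wait))))
```

Keeping the pass free of side effects apart from next-tuple! makes each branch easy to check in isolation, for example that a non-empty overflow buffer suppresses the call to next-tuple! entirely.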
Storm series (15): architecture analysis, Executor-Spout