High reliability in Storm: Storm has a mechanism that guarantees every tuple emitted from a spout is fully processed.
Reliability mechanisms:
1. Failover on node failure: when a worker on one node fails, its work is automatically moved to another node.
2. Complete message processing
- A single message (tuple) emitted by a spout can cause hundreds or thousands of further messages to be created from it.
- Example: word count.
- The Storm job reads one complete English sentence at a time from the data source, splits it into individual words, and outputs each word with its running count in real time.
- Each message emitted by the spout (each English sentence) triggers the creation of many new messages: the words split out of the sentence.
- Together these messages form a tree structure called a "tuple tree".
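The word-count example can be sketched in plain Python to show how one spout tuple fans out into a tuple tree. This is an illustration only, not Storm code: `TupleNode`, `split_sentence`, and `count_words` are hypothetical names, and a real topology would use a spout and bolts instead of plain functions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TupleNode:
    """One message in the tuple tree (hypothetical stand-in for a Storm tuple)."""
    value: str
    children: list = field(default_factory=list)


def split_sentence(root: TupleNode) -> list:
    """Mimic a splitting step: each word becomes a child of the sentence tuple."""
    for word in root.value.split():
        root.children.append(TupleNode(word))
    return root.children


def count_words(words: list, counts: Counter) -> None:
    """Mimic a counting step: update the running count for each word tuple."""
    for node in words:
        counts[node.value.lower()] += 1


counts = Counter()
root = TupleNode("the cow jumped over the moon")  # tuple emitted by the spout
words = split_sentence(root)                      # new tuples, all children of root
count_words(words, counts)

print(len(root.children))  # 6
print(counts["the"])       # 2
```

Here the sentence is the root of the tree and each word is a child node. A longer pipeline would add further levels below the words.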
Under what conditions does Storm consider a message emitted by a spout fully processed?
- Its tuple tree has stopped growing.
- Every message in the tree has been marked as processed.
Reliability summary:
- Whenever a new node is added to a tuple tree, we must explicitly notify Storm.
- When we finish processing an individual message, we must tell Storm so that it can update the state of the tuple tree.
- With these two steps, Storm can detect when a tuple tree is fully processed and call the spout's corresponding ack or fail method.
- Anchoring: notifying Storm of a new node by linking an emitted tuple to the tuple it was derived from.
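The two steps in the summary can be sketched as a toy tracker for a single tuple tree. This is a simplified model under stated assumptions, not Storm's actual tracking algorithm: `TupleTreeTracker` and its methods are hypothetical names, tuple ids are plain strings, and timeouts are left out.

```python
class TupleTreeTracker:
    """Toy tracker for one tuple tree (hypothetical; not Storm's real acker)."""

    def __init__(self, root_id, on_ack, on_fail):
        self.root_id = root_id
        self.pending = {root_id}  # tuples created but not yet marked as processed
        self.on_ack = on_ack      # called once when the whole tree is processed
        self.on_fail = on_fail    # called once if any tuple in the tree fails
        self.done = False

    def anchor(self, parent_id, child_id):
        """Step 1: report that a new node was added under parent_id."""
        if self.done:
            raise RuntimeError("tuple tree already finished")
        if parent_id not in self.pending:
            raise ValueError("parent must still be pending when a child is anchored")
        self.pending.add(child_id)

    def ack(self, tuple_id):
        """Step 2: mark one tuple as processed; finish when none are pending."""
        if self.done:
            return
        self.pending.discard(tuple_id)
        if not self.pending:
            self.done = True
            self.on_ack(self.root_id)

    def fail(self, tuple_id):
        """Any failed tuple fails the whole tree."""
        if self.done:
            return
        self.done = True
        self.on_fail(self.root_id)


events = []
tracker = TupleTreeTracker(
    "sentence-1",
    on_ack=lambda root: events.append(("ack", root)),
    on_fail=lambda root: events.append(("fail", root)),
)
tracker.anchor("sentence-1", "word-1")  # children are anchored before the parent is acked
tracker.anchor("sentence-1", "word-2")
tracker.ack("sentence-1")
tracker.ack("word-1")                   # tree not finished yet: word-2 is still pending
tracker.ack("word-2")
print(events)  # [('ack', 'sentence-1')]
```

In Storm's Java API the same two steps correspond to `collector.emit(input, new Values(...))`, which anchors the new tuple to `input`, and `collector.ack(input)` or `collector.fail(input)`.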
4. Storm Reliability