Storm Multi-lang Protocol (Multi-Language Protocol), translated

Source: Internet
Author: User
Tags: ack, emit, stdin

Original address:

http://storm.apache.org/releases/1.0.1/Multilang-protocol.html

This page describes the multi-lang protocol as of Storm 0.7.1.

Support for multiple languages is implemented through the ShellBolt, ShellSpout, and ShellProcess classes. These classes implement the IBolt and ISpout interfaces, as well as the protocol for executing a script or program through the shell using Java's ProcessBuilder class.

When using this protocol from Java, you need to create a bolt that extends ShellBolt and declares its output fields in declareOutputFields.

The protocol itself is simple: it runs over the STDIN and STDOUT of the executed script or program, and all data exchanged is encoded as JSON, which makes it possible to support almost any language.

To run a shell component on a cluster, the scripts must be in the resources/ directory of the submitted jar. During development and testing in local mode, the resources directory only needs to be on the classpath.

Notes on the protocol:
    • Both ends of the protocol use a line-reading mechanism, so be sure to trim newlines from the input and append them to the output.
    • Every JSON input and output is terminated by a single line containing "end". This delimiter is not itself JSON-encoded.
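The framing rules above can be sketched in Python as follows. This is a minimal illustration, not Storm's own adapter: the helper names `read_message` and `send_message` are hypothetical, and the streams are passed in as parameters so that the same code works with `sys.stdin`/`sys.stdout` or with in-memory buffers.

```python
import io
import json

END_MARKER = "end"  # plain-text delimiter line; it is not JSON-encoded


def read_message(stream):
    """Read lines up to the "end" marker and decode them as one JSON value."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":  # EOF: the other side closed the pipe
            raise EOFError("stream closed before an 'end' line arrived")
        line = line.rstrip("\r\n")  # trim the newline from the input
        if line == END_MARKER:
            return json.loads("\n".join(lines))
        lines.append(line)


def send_message(stream, message):
    """Write one JSON value, then the "end" line, and flush."""
    stream.write(json.dumps(message) + "\n")  # append the newline on output
    stream.write(END_MARKER + "\n")
    stream.flush()


if __name__ == "__main__":
    # Round trip through an in-memory stream instead of a real pipe.
    buffer = io.StringIO()
    send_message(buffer, {"command": "log", "msg": "hello world!"})
    buffer.seek(0)
    print(read_message(buffer))
```

Flushing after every message matters in practice, because the parent process reads line by line and would otherwise wait on buffered output.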
The points below are written from the perspective of the script writer's STDIN and STDOUT.
  1. Initial handshake
    1. The initial handshake is the same for both types of shell component.
    2. STDIN: setup info
      1. This is a JSON object containing the Storm configuration, a PID directory, and a topology context.
      2. Example setup message:
         {
             "conf": {
                 "topology.message.timeout.secs": 3,
                 // other configuration entries follow the same format
             },
             "pidDir": "...",
             "context": {
                 "task->component": {
                     "1": "example-spout",
                     "2": "__acker",
                     "3": "example-bolt1",
                     "4": "example-bolt2"
                 },
                 "taskid": 3,
                 // Everything below this line is only available in Storm 0.10.0 and later
                 "componentid": "example-bolt",
                 "stream->target->grouping": {
                     "default": {
                         "example-bolt2": {
                             "type": "SHUFFLE"
                         }
                     }
                 },
                 "streams": ["default"],
                 "stream->outputfields": {"default": ["word"]},
                 "source->stream->grouping": {
                     "example-spout": {
                         "default": {
                             "type": "FIELDS",
                             "fields": ["word"]
                         }
                     }
                 },
                 "source->stream->fields": {
                     "example-spout": {
                         "default": ["word"]
                     }
                 }
             }
         }
      3. The script should create an empty file named after its PID in the PID directory (for example, an empty file named 1234 if the PID is 1234). This file lets the supervisor know the PID so that it can shut the process down later.
      4. As of Storm 0.10.0, the context sent to shell components has been substantially enhanced and covers all aspects of the topology context available to JVM components. A key addition: the stream->target->grouping and source->stream->grouping dictionaries let a shell component determine its targets and its sources in the topology, respectively.
    3. STDOUT: your PID in a JSON object, such as {"pid": 1234}. The shell component writes the PID to its log.
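The handshake steps above might look like this in a Python script. It is a sketch under stated assumptions: `handshake` is a hypothetical helper name, the key names `conf`, `pidDir`, and `context` follow the spelling used in the Storm documentation, and the streams are parameters so the function can be exercised without a running topology.

```python
import io
import json
import os
import tempfile


def handshake(stdin, stdout):
    """Run the initial handshake and return (conf, context)."""
    # 1. Read the setup message, which ends with a line containing only "end".
    lines = []
    for line in iter(stdin.readline, ""):
        line = line.rstrip("\r\n")
        if line == "end":
            break
        lines.append(line)
    setup = json.loads("\n".join(lines))

    # 2. Create an empty file named after our PID inside the PID directory.
    pid = os.getpid()
    open(os.path.join(setup["pidDir"], str(pid)), "w").close()

    # 3. Report the PID back, framed like every other message.
    stdout.write(json.dumps({"pid": pid}) + "\nend\n")
    stdout.flush()
    return setup["conf"], setup["context"]


if __name__ == "__main__":
    # Simulated setup message; a real one arrives from ShellBolt or ShellSpout.
    setup = {
        "conf": {"topology.message.timeout.secs": 3},
        "pidDir": tempfile.mkdtemp(),
        "context": {"taskid": 3},
    }
    reply = io.StringIO()
    conf, context = handshake(io.StringIO(json.dumps(setup) + "\nend\n"), reply)
    print(conf, context)
    print(reply.getvalue())
```

In a real component the call would be `handshake(sys.stdin, sys.stdout)`, made once before any tuple or command is read.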
  2. Spouts
    1. Shell spouts are synchronous. Everything below happens inside a while(true) loop.
    2. To STDIN: a next, ack, or fail command
        1. next is the equivalent of ISpout's nextTuple: {"command": "next"}
        2. ack looks like this: {"command": "ack", "id": "1231231"}
        3. fail looks like this: {"command": "fail", "id": "1231231"}
    3. To STDOUT: the results of the spout for the previous command
        1. The results can be a sequence of emits and logs.
        2. An emit looks like this:
           {
               "command": "emit",
               // The id for the tuple. Leave this out for an unreliable emit. The id can
               // be a string or a number.
               "id": "1231231",
               // The id of the stream this tuple was emitted to. Leave this empty to emit to the default stream.
               "stream": "1",
               // If doing an emit direct, indicate the task to send the tuple to
               "task": 9,
               // All the values in this tuple
               "tuple": ["field1", 2, 3]
           }
        3. A log looks like this:
           {
               "command": "log",
               // the message to log
               "msg": "hello world!"
           }
    4. To STDOUT: sync
        1. A sync command ends the sequence of emits and logs:
           {"command": "sync"}
           After the sync, ShellSpout does not read your output again until it sends another next, ack, or fail command. As with ISpout, if there are no tuples to emit for a next, sleep for a short time before syncing, because ShellSpout will not sleep for you.
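A spout loop following the steps above could be sketched as below. The names `read_msg`, `send_msg`, and `run_spout`, the `words` data source, and the replay-on-fail policy are all illustrative assumptions. The sketch also skips JSON arrays arriving on STDIN, on the assumption that task ids for non-direct emits may be reported to spouts the same way they are to bolts.

```python
import io
import json
import time


def read_msg(stdin):
    """Return the next JSON message, or None once stdin is closed."""
    lines = []
    for line in iter(stdin.readline, ""):
        line = line.rstrip("\r\n")
        if line == "end":
            return json.loads("\n".join(lines))
        lines.append(line)
    return None


def send_msg(stdout, msg):
    stdout.write(json.dumps(msg) + "\nend\n")
    stdout.flush()


def run_spout(stdin, stdout, words, idle_sleep=0.1):
    """Serve next/ack/fail commands until stdin closes; return unacked tuples."""
    queue = list(words)
    pending = {}  # tuple id -> values, kept until the tuple is acked
    next_id = 0
    while True:
        msg = read_msg(stdin)
        if msg is None:
            return pending
        if isinstance(msg, list):
            continue  # assumed: task ids for an earlier non-direct emit
        command = msg["command"]
        if command == "next":
            if queue:
                next_id += 1
                values = [queue.pop(0)]
                pending[str(next_id)] = values
                send_msg(stdout, {"command": "emit", "id": str(next_id), "tuple": values})
            else:
                time.sleep(idle_sleep)  # nothing to emit: sleep before syncing
        elif command == "ack":
            pending.pop(str(msg["id"]), None)
        elif command == "fail":
            values = pending.pop(str(msg["id"]), None)
            if values is not None:
                queue.append(values[0])  # replay the failed tuple later
        send_msg(stdout, {"command": "sync"})  # ends this command's output


if __name__ == "__main__":
    commands = io.StringIO('{"command": "next"}\nend\n{"command": "ack", "id": "1"}\nend\n')
    output = io.StringIO()
    run_spout(commands, output, ["hello"], idle_sleep=0)
    print(output.getvalue())
```

Every command, including ack and fail, is answered with exactly one sync, which keeps the script in step with the synchronous loop on the Java side.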
  3. Bolts
    1. The shell bolt protocol is asynchronous. Tuples arrive on STDIN as soon as they are available, and you may emit, ack, fail, and log at any time by writing to STDOUT.
    2. STDIN: a tuple, encoded as a JSON structure:
       {
           // The tuple's id - this is a string to support languages lacking 64-bit precision
           "id": "-6955786537413359385",
           // The id of the component that created this tuple
           "comp": "1",
           // The id of the stream this tuple was emitted to
           "stream": "1",
           // The id of the task that created this tuple
           "task": 9,
           // All the values in this tuple
           "tuple": ["snow white and the seven dwarfs", "field2", 3]
       }
    3. STDOUT: an ack, fail, emit, or log.
      1. An emit looks like this:
         {
             "command": "emit",
             // The ids of the tuples this output tuple should be anchored to
             "anchors": ["1231231", "-234234234"],
             // The id of the stream this tuple was emitted to. Leave this empty to emit to the default stream.
             "stream": "1",
             // If doing an emit direct, indicate the task to send the tuple to
             "task": 9,
             // All the values in this tuple
             "tuple": ["field1", 2, 3]
         }
         If you are not doing a direct emit, you will receive the task ids to which the tuple was emitted on STDIN, as a JSON array. Because the protocol is asynchronous, a read immediately after an emit may not return those task ids: it may instead return the task ids of an earlier emit, or a new tuple to process. The task id lists do, however, arrive in the same order as their corresponding emits.
      2. An ack looks like this:
         {
             "command": "ack",
             // the id of the tuple to ack
             "id": "123123"
         }
      3. A fail looks like this:
         {
             "command": "fail",
             // the id of the tuple to fail
             "id": "123123"
         }
      4. A log looks like this:
         {
             "command": "log",
             // the message to log
             "msg": "hello world!"
         }
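As a rough illustration of the bolt side, the sketch below splits the first field of each incoming tuple into words, emits each word anchored to the input tuple, and then acks it. The function names and the word-splitting behavior are assumptions chosen for the example. JSON arrays read from STDIN are treated as task id lists and skipped, and heartbeat tuples are not handled in this sketch.

```python
import io
import json


def read_msg(stdin):
    """Return the next JSON message, or None once stdin is closed."""
    lines = []
    for line in iter(stdin.readline, ""):
        line = line.rstrip("\r\n")
        if line == "end":
            return json.loads("\n".join(lines))
        lines.append(line)
    return None


def send_msg(stdout, msg):
    stdout.write(json.dumps(msg) + "\nend\n")
    stdout.flush()


def run_split_bolt(stdin, stdout):
    """Process tuples until stdin closes."""
    while True:
        msg = read_msg(stdin)
        if msg is None:
            return
        if isinstance(msg, list):
            continue  # task ids for an earlier non-direct emit; unused here
        try:
            for word in msg["tuple"][0].split():
                # Anchor each output tuple to the input tuple it came from.
                send_msg(stdout, {"command": "emit", "anchors": [msg["id"]], "tuple": [word]})
            send_msg(stdout, {"command": "ack", "id": msg["id"]})
        except Exception as exc:
            # Report the problem and fail the tuple so it can be replayed.
            send_msg(stdout, {"command": "log", "msg": "failed to process tuple: %s" % exc})
            send_msg(stdout, {"command": "fail", "id": msg["id"]})


if __name__ == "__main__":
    incoming = {"id": "-6955786537413359385", "comp": "1", "stream": "1",
                "task": 9, "tuple": ["snow white and the seven dwarfs", "field2", 3]}
    output = io.StringIO()
    run_split_bolt(io.StringIO(json.dumps(incoming) + "\nend\n"), output)
    print(output.getvalue())
```

Because task id lists and tuples share the same input stream, checking the decoded type before treating a message as a tuple is what keeps the loop correct under the asynchronous ordering described above.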
  4. Handling heartbeats (0.9.3 and later)
      1. As of Storm 0.9.3, heartbeats are exchanged between ShellSpout/ShellBolt and their multi-lang subprocesses in order to detect hanging or zombie subprocesses.
      2. For spouts
        1. Shell spouts are synchronous, and the subprocess always sends sync at the end of next(), so little extra work is needed to support heartbeats. However, the subprocess must not sleep longer than the worker timeout inside next().
      3. For bolts
        1. Shell bolts are asynchronous, so ShellBolt periodically sends a heartbeat tuple to its subprocess. The heartbeat tuple looks like this:
           {
               "id": "-6955786537413359385",
               "comp": "1",
               "stream": "__heartbeat",
               // this shell bolt's system task id
               "task": -1,
               "tuple": []
           }
           When the subprocess receives a heartbeat tuple, it must send a sync command back to ShellBolt.
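Heartbeat handling in a bolt subprocess can be sketched as a small check that runs before normal tuple processing. The helper names `is_heartbeat` and `answer_heartbeat` are hypothetical; the test on the `__heartbeat` stream and task id -1 follows the heartbeat tuple shown above.

```python
import io
import json

HEARTBEAT_STREAM = "__heartbeat"
SYSTEM_TASK_ID = -1


def is_heartbeat(tup):
    """True if the decoded tuple is a heartbeat sent by ShellBolt."""
    return tup.get("stream") == HEARTBEAT_STREAM and tup.get("task") == SYSTEM_TASK_ID


def answer_heartbeat(tup, stdout):
    """Reply with sync if tup is a heartbeat; return True when it was handled."""
    if not is_heartbeat(tup):
        return False
    stdout.write(json.dumps({"command": "sync"}) + "\nend\n")
    stdout.flush()
    return True


if __name__ == "__main__":
    heartbeat = {"id": "-6955786537413359385", "comp": "1",
                 "stream": "__heartbeat", "task": -1, "tuple": []}
    reply = io.StringIO()
    print(answer_heartbeat(heartbeat, reply))
    print(reply.getvalue())
```

A bolt loop would call `answer_heartbeat` on each incoming tuple and skip its own processing whenever the function returns True, so that heartbeats are never acked, failed, or passed to user code.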
