Storm multi-language support

Source: Internet
Author: User
Document directory
  • Using non-JVM languages with Storm
  • DSLs and multilang adapters

Using non-JVM languages with Storm

https://github.com/nathanmarz/storm/wiki/Using-non-JVM-languages-with-Storm

Multilang Protocol

https://github.com/nathanmarz/storm/wiki/Multilang-protocol

 

Using non-JVM languages with Storm

JVM languages are relatively simple to support: a DSL can wrap the Java API directly.
Non-JVM languages are a little more complicated. Storm's support for them divides into two parts: topologies and components (bolts and spouts).

Topologies are the easier part to implement in other languages. Because Nimbus is a Thrift server, a topology written in any language is eventually converted into a Thrift structure. In practice, the logic of a topology is relatively simple and can be written directly in Java, so there is not much need to use other languages for it.

Components work much like Hadoop streaming: each component runs as a shell subprocess and communicates with Storm through JSON messages over stdin/stdout.
This communication needs a protocol, so that each side produces JSON messages the other side can understand. Storm's protocol is relatively simple. For more information, see
Multilang Protocol
Storm currently ships Python, Ruby, and Fancy implementations. If you need to support another language, it is easy to implement the protocol yourself.
Multi-language support matters more for components than for topologies, because many existing analysis or statistics modules are not written in Java, and porting them is troublesome, whereas a topology definition is simple to write in Java.
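
To make the stdin/stdout exchange concrete, here is a minimal Python sketch of how one message could be framed. It assumes the framing described on the Multilang Protocol page, where each JSON message is followed by a line containing only end. The function names write_message and read_message are hypothetical and are not taken from Storm's adapters, and an in-memory buffer stands in for the real pipe.

```python
import io
import json


def write_message(stream, message):
    """Serialize one message as JSON, then terminate it with an 'end' line."""
    stream.write(json.dumps(message))
    stream.write("\nend\n")
    stream.flush()


def read_message(stream):
    """Collect lines until the 'end' terminator, then decode them as JSON."""
    lines = []
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before the 'end' terminator")
        if line.strip() == "end":
            break
        lines.append(line)
    return json.loads("".join(lines))


# Round trip through an in-memory buffer standing in for a stdin/stdout pipe.
pipe = io.StringIO()
write_message(pipe, {"command": "emit", "tuple": ["hello"]})
pipe.seek(0)
decoded = read_message(pipe)
```

A real component would pass sys.stdout to the writer and sys.stdin to the reader. Any language that can read lines and parse JSON can implement the same two operations.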

Two pieces: creating topologies and implementing spouts and bolts in other languages

  • Creating topologies in another language is easy since topologies are just Thrift structures (link to storm.thrift)
  • Implementing spouts and bolts in another language is called "multilang components" or "shelling"
    • Here's a specification of the protocol: Multilang Protocol
    • The Thrift structure lets you define multilang components explicitly as a program and a script (e.g., python and the file implementing your bolt)
    • In Java, you extend ShellBolt or ShellSpout to create multilang components
      • Note that output field declarations happen in the Thrift structure, so in Java you create multilang components like the following:
        • Declare the fields in Java, and put the processing code in the other language by specifying it in the constructor of ShellBolt
    • Multilang uses JSON messages over stdin/stdout to communicate with the subprocess
    • Storm comes with Ruby, Python, and Fancy adapters that implement the protocol. A Python example appears later in this section
      • Python supports emitting, anchoring, acking, and logging
  • The "storm shell" command makes constructing a jar and uploading it to Nimbus easy
    • Makes the jar and uploads it
    • Calls your program with the host/port of Nimbus and the jarfile ID
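
The emitting, anchoring, acking, and logging operations in the list above each correspond to one JSON command sent by the subprocess. The sketch below builds those commands as plain dictionaries, assuming the field names given on the Multilang Protocol page (command, tuple, anchors, stream, id, msg). The helper names emit_command, ack_command, and log_command are hypothetical and only illustrate the message shapes.

```python
import json


def emit_command(values, anchors=None, stream=None):
    """Build an emit command; anchors are the ids of the input tuples."""
    command = {"command": "emit", "tuple": list(values)}
    if anchors:
        command["anchors"] = list(anchors)
    if stream is not None:
        command["stream"] = stream
    return command


def ack_command(tuple_id):
    """Build the command that acknowledges a fully processed input tuple."""
    return {"command": "ack", "id": tuple_id}


def log_command(text):
    """Build the command that asks the parent worker to log a message."""
    return {"command": "log", "msg": text}


# Commands a bolt might send while handling the input tuple with id "t1".
commands = [
    log_command("splitting sentence"),
    emit_command(["word"], anchors=["t1"]),
    ack_command("t1"),
]
encoded = [json.dumps(command) for command in commands]
```

Anchoring an emitted tuple to its input is what lets Storm replay the input if a downstream tuple fails, which is why the emit carries the input's id.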

 

Bolts can be defined in any language. Bolts defined in other languages are executed as subprocesses, and Storm uses JSON messages to communicate with these subprocesses through stdin/stdout.
The communication protocol is implemented by a library of only about 100 lines. The Storm team has developed Ruby, Python, and Fancy versions of this library.

Unlike a bolt written purely in Java, the bolt definition for the Python version extends the ShellBolt class:

public static class SplitSentence extends ShellBolt implements IRichBolt {
    public SplitSentence() {
        super("python", "splitsentence.py");
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

The following is the definition of splitsentence.py:

import storm

class SplitSentenceBolt(storm.BasicBolt):
    def process(self, tup):
        words = tup.values[0].split(" ")
        for word in words:
            storm.emit([word])

SplitSentenceBolt().run()

The example above uses a Python component. First, the Java class extends ShellBolt, which indicates that input and output go through the subprocess's stdin/stdout.

Then python splitsentence.py is invoked directly as a subprocess.

In Python, the script first imports storm, the module that encapsulates the communication protocol in about 100 very simple lines.
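
To show what the storm module does on behalf of SplitSentenceBolt, the sketch below handles one incoming tuple message without importing storm at all. It assumes that BasicBolt anchors every emitted tuple to its input and acks the input after process returns, and that an incoming message carries the id, comp, stream, task, and tuple fields described on the Multilang Protocol page. The function split_sentence and the sample values are hypothetical.

```python
def split_sentence(message):
    """Return the commands a basic split-sentence bolt would send for one tuple."""
    tuple_id = message["id"]
    sentence = message["tuple"][0]
    # One emit per word, each anchored to the input tuple (assumed BasicBolt behavior).
    commands = [
        {"command": "emit", "anchors": [tuple_id], "tuple": [word]}
        for word in sentence.split(" ")
    ]
    # Ack the input once all of its words have been emitted.
    commands.append({"command": "ack", "id": tuple_id})
    return commands


# A sample incoming tuple message; every value here is a placeholder.
incoming = {
    "id": "t1",
    "comp": "1",
    "stream": "default",
    "task": 9,
    "tuple": ["the cow jumped"],
}
outgoing = split_sentence(incoming)
```

In the real adapter each of these commands is written to stdout as JSON, so the Python author only has to write the process method.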

 

 

DSLs and multilang adapters

https://github.com/nathanmarz/storm/wiki/DSLs-and-multilang-adapters

  • Scala DSL
  • JRuby DSL
  • Clojure DSL
  • Storm/Esper integration: streaming SQL on top of Storm
  • io-storm: Perl multilang adapter
  • storm-php: PHP multilang adapter

As mentioned above, for JVM languages it is easy to wrap the Java API and provide a DSL. The list above covers the DSLs and adapters linked from the Storm wiki.

Take Clojure as an example to learn more.

Clojure DSL

Storm comes with a Clojure DSL for defining spouts, bolts, and topologies. The Clojure DSL has access to everything the Java API exposes, so if you're a Clojure user you can code Storm topologies without touching Java at all.

https://github.com/nathanmarz/storm/wiki/Clojure-DSL

 

Defining a non-JVM language DSL for Storm

https://github.com/nathanmarz/storm/wiki/Defining-a-non-jvm-language-dsl-for-storm

For non-JVM languages, the storm shell command can also be used to implement something DSL-like.

There's a "storm shell" command that helps with submitting a topology. Its usage is like this:

storm shell resources/ python topology.py arg1 arg2

storm shell will then package resources/ into a jar, upload the jar to Nimbus, and call your topology.py script like this:

python topology.py arg1 arg2 {nimbus-host} {nimbus-port} {uploaded-jar-location}

Then you can connect to Nimbus using the Thrift API and submit the topology, passing {uploaded-jar-location} into the submitTopology method. For reference, here's the submitTopology definition:

void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite);
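
Inside topology.py, the three values that storm shell appends need to be separated from the script's own arguments before Nimbus can be contacted. The sketch below shows one way to do that, assuming the argument order from the invocation above. The names ShellArgs and parse_shell_args, and the sample host, port, and jar path, are hypothetical. The Thrift call itself is left out because it depends on client bindings generated from storm.thrift.

```python
from typing import List, NamedTuple


class ShellArgs(NamedTuple):
    user_args: List[str]         # arg1, arg2, ... passed on the command line
    nimbus_host: str             # {nimbus-host}
    nimbus_port: int             # {nimbus-port}
    uploaded_jar_location: str   # {uploaded-jar-location}


def parse_shell_args(argv):
    """Split script arguments from the three values appended by storm shell."""
    if len(argv) < 3:
        raise ValueError("expected nimbus host, port, and jar location at the end")
    *user_args, host, port, jar_location = argv
    return ShellArgs(list(user_args), host, int(port), jar_location)


# In a real script this would be parse_shell_args(sys.argv[1:]);
# the values below are placeholders.
args = parse_shell_args(
    ["arg1", "arg2", "nimbus.example.com", "6627", "/tmp/stormjar.jar"]
)
```

The uploaded_jar_location value is what gets passed as the second parameter of submitTopology.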
