The structure of the Storm codebase

Source: Internet
Author: User
Tags: ack, zookeeper

http://storm.apache.org/releases/1.0.1/Structure-of-the-codebase.html

    • Structure of the codebase
The source code is divided into three distinct layers. First, Storm was designed from the very beginning to work with multiple languages: Nimbus is a Thrift service and topologies are defined as Thrift structures, so Storm can be used from any language. Second, all of Storm's interfaces are declared as Java interfaces. Although much of the internal implementation is in Clojure, all usage must go through the Java API, which means that every feature can be called from Java. Third, Storm's implementation is largely in Clojure. By line count the code is roughly half Java and half Clojure, but because Clojure is more expressive, the majority of the implementation logic is in Clojure. The following sections explain each layer in detail.
    • storm.thrift
      • The generated code uses a fork of Thrift whose Java packages are renamed to org.apache.thrift7
      • Each component has a component ID
      • Spouts and bolts have the same Thrift definition
      • The Thrift definition for a bolt contains a ComponentObject structure and a ComponentCommon structure
        • ComponentObject defines the implementation of the bolt and is one of the following three types:
            • A serialized Java object that implements IBolt
            • A ShellComponent, which indicates that the implementation is in another language. Declaring a bolt this way causes Storm to instantiate a ShellBolt object to handle the communication between the JVM-based worker process and the non-JVM-based implementation of the component
            • A JavaObject structure, which gives Storm the class name and constructor arguments to use to instantiate the bolt. This is useful if you want to define a topology in a non-JVM language: you can use JVM-based spouts and bolts without creating and serializing a Java object yourself.
        • ComponentCommon defines everything else about the component, including:
            • Which streams the component emits, and the metadata for each stream
            • Which streams the component consumes
            • The parallelism of the component
            • The component-specific configuration
      • Note that the spout structure also contains a ComponentCommon field, so spouts can also declare input streams to consume. However, the Java API provides no way for a spout to consume other streams, and if you put any input declarations on a spout, you get an error when submitting the topology. The input declarations field on spouts is not meant for users but for Storm itself. Storm adds implicit streams and bolts to the topology to set up the acking framework, and each spout receives two of these implicit streams from the acker bolt. The acker sends an ack or fail message along these streams once a tuple is detected to be completed or failed. The user's topology therefore has to be transformed into a runtime topology.
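The structures described in this section can be pictured with a small plain-Java model. This is a minimal sketch under stated assumptions, not Storm's generated Thrift code: the class names `ComponentObjectSketch`, `ComponentCommonSketch`, and `ThriftModelSketch`, the helper methods, the class name `com.example.SplitBolt`, and the stream labels `acker:ack` and `acker:fail` are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ThriftModelSketch {
    // Mirrors the rule in the text: a user-submitted spout must not declare inputs.
    static void checkUserSpout(String componentId, ComponentCommonSketch common) {
        if (!common.consumedStreams.isEmpty()) {
            throw new IllegalArgumentException(
                "spout '" + componentId + "' must not declare input streams");
        }
    }

    // Stands in for the step that turns the user's topology into the runtime one:
    // each spout gains two implicit inputs from the acker (placeholder labels).
    static void addImplicitAckerInputs(ComponentCommonSketch spoutCommon) {
        spoutCommon.consumedStreams.add("acker:ack");
        spoutCommon.consumedStreams.add("acker:fail");
    }

    public static void main(String[] args) {
        // A bolt described by class name and constructor arguments (JavaObject style).
        ComponentObjectSketch bolt =
            ComponentObjectSketch.javaObject("com.example.SplitBolt", " ");
        System.out.println(bolt.kind + " " + bolt.payload + " " + bolt.constructorArgs);

        ComponentCommonSketch spout = new ComponentCommonSketch();
        spout.emittedStreams.put("default", Arrays.asList("sentence"));
        spout.parallelism = 2;

        checkUserSpout("sentences", spout); // passes: no user-declared inputs
        addImplicitAckerInputs(spout);      // done by the framework, not the user
        System.out.println(spout.consumedStreams);
    }
}

// Hypothetical stand-in for ComponentObject: exactly one of three kinds is set.
final class ComponentObjectSketch {
    enum Kind { SERIALIZED_JAVA, SHELL, JAVA_OBJECT }

    final Kind kind;
    final Object payload;               // bytes, a shell command, or a class name
    final List<Object> constructorArgs; // only meaningful for JAVA_OBJECT

    private ComponentObjectSketch(Kind kind, Object payload, List<Object> args) {
        this.kind = kind;
        this.payload = payload;
        this.constructorArgs = args;
    }

    // A serialized Java object that implements IBolt.
    static ComponentObjectSketch serializedJava(byte[] bytes) {
        return new ComponentObjectSketch(Kind.SERIALIZED_JAVA, bytes, new ArrayList<Object>());
    }

    // An implementation in another language, which Storm drives through ShellBolt.
    static ComponentObjectSketch shell(String command) {
        return new ComponentObjectSketch(Kind.SHELL, command, new ArrayList<Object>());
    }

    // A class name plus constructor arguments that Storm instantiates itself.
    static ComponentObjectSketch javaObject(String className, Object... args) {
        return new ComponentObjectSketch(Kind.JAVA_OBJECT, className, Arrays.asList(args));
    }
}

// Hypothetical stand-in for ComponentCommon: everything except the implementation.
final class ComponentCommonSketch {
    // stream id -> output field names (stream metadata, simplified)
    final Map<String, List<String>> emittedStreams = new LinkedHashMap<String, List<String>>();
    final List<String> consumedStreams = new ArrayList<String>();
    int parallelism = 1;
    final Map<String, Object> config = new LinkedHashMap<String, Object>();
}
```

The check runs before the implicit inputs are added, which reflects why the input field exists on spouts even though users cannot populate it.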
    • Java interfaces
      • Storm's interfaces are mainly declared as Java interfaces. The three main ones are:
          • IRichBolt
          • IRichSpout
          • TopologyBuilder
      • The strategy for most interfaces is:
          • Declare the interface as a Java interface
          • When appropriate, provide a base class with default implementations
      • One subtle aspect is the difference between IBolt/ISpout and IRichBolt/IRichSpout. The main difference is that the rich versions of the interfaces add the declareOutputFields method. The reason for the split is that the output fields declaration for each output stream needs to be part of the Thrift structure (so that it can be specified from any language), but as a user you want to declare the streams as part of your own class. When TopologyBuilder constructs the Thrift representation, it calls declareOutputFields to get the declaration and converts it into the Thrift structure.
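The relationship between the plain and rich interfaces, the base-class strategy, and the builder's call to declareOutputFields can be sketched in plain Java. This is a simplified model, not Storm's API: every type here (`BoltSketch`, `RichBoltSketch`, `DeclarerSketch`, `BaseRichBoltSketch`, `BuilderSketch`, `SplitSentenceSketch`) is a hypothetical stand-in, tuples are reduced to strings, and a plain map stands in for the Thrift structure.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RichInterfaceDemo {
    public static void main(String[] args) {
        BuilderSketch builder = new BuilderSketch();
        builder.setBolt("split", new SplitSentenceSketch());
        System.out.println(builder.streamsOf("split")); // prints {default=[word]}
    }
}

// Simplified stand-in for the plain bolt interface (no output declaration).
interface BoltSketch {
    void execute(String tuple);
    void cleanup();
}

// Receives stream declarations, reduced here to a stream id plus field names.
interface DeclarerSketch {
    void declareStream(String streamId, String... fields);
}

// The "rich" variant adds declareOutputFields on top of the plain interface.
interface RichBoltSketch extends BoltSketch {
    void declareOutputFields(DeclarerSketch declarer);
}

// Base class with a default implementation, following the
// interface-plus-base-class strategy.
abstract class BaseRichBoltSketch implements RichBoltSketch {
    @Override
    public void cleanup() {
        // default: nothing to clean up
    }
}

// Plays the builder's role: asks each bolt for its declarations and stores
// them in a map that stands in for the Thrift structure.
final class BuilderSketch {
    private final Map<String, Map<String, List<String>>> declared = new LinkedHashMap<>();

    void setBolt(String componentId, RichBoltSketch bolt) {
        final Map<String, List<String>> streams = new LinkedHashMap<>();
        bolt.declareOutputFields(new DeclarerSketch() {
            @Override
            public void declareStream(String streamId, String... fields) {
                streams.put(streamId, Arrays.asList(fields));
            }
        });
        declared.put(componentId, streams);
    }

    Map<String, List<String>> streamsOf(String componentId) {
        return declared.get(componentId);
    }
}

// A user bolt declares its output stream as part of its own class.
final class SplitSentenceSketch extends BaseRichBoltSketch {
    @Override
    public void execute(String tuple) {
        // processing logic omitted in this sketch
    }

    @Override
    public void declareOutputFields(DeclarerSketch declarer) {
        declarer.declareStream("default", "word");
    }
}
```

The user class never touches the topology representation directly; the builder pulls the declaration out of it, which is the division of labor the text describes.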
    • Implementation
      • All functionality is declared through Java interfaces, which ensures that every feature can be called from Java.
      • Although the codebase is roughly half Java and half Clojure by line count, the primary logic is implemented in Clojure. There are two exceptions: the DRPC and transactional topology implementations, which are written purely in Java. They serve as an illustration of how to implement a higher-level abstraction on top of Storm. The code for these lives in the org.apache.storm.coordination, org.apache.storm.drpc, and org.apache.storm.transactional packages.
      • The following is a summary of the main Java packages and Clojure namespaces and what they do:
      • Java Packages
        • org.apache.storm.coordination: Implements batch processing on top of Storm
        • org.apache.storm.drpc: Implementation of the DRPC higher-level abstraction
        • org.apache.storm.generated: The generated Thrift code for Storm
            • Generated with a fork of Thrift that simply renames the packages to org.apache.thrift7 to avoid conflicts with other Thrift versions
        • org.apache.storm.grouping: Contains the interface for custom stream groupings
        • org.apache.storm.hooks: Interfaces for hooking into various events, such as when a task emits a tuple or when a tuple is acked
        • org.apache.storm.serialization: How Storm serializes and deserializes tuples, built on Kryo
        • org.apache.storm.spout: Defines spouts and related interfaces
        • org.apache.storm.task: Defines bolts and related interfaces; TopologyContext is also defined here
        • org.apache.storm.testing: Contains a variety of test bolts and utilities used in unit tests
        • org.apache.storm.topology: A Java layer over the underlying Thrift structure that provides a clean, pure-Java API to Storm
        • org.apache.storm.transactional: Implementation of transactional topologies
        • org.apache.storm.tuple: Implementation of Storm's tuple data model
        • org.apache.storm.utils: Data structures and miscellaneous utilities used throughout the codebase
      • Clojure namespaces
        • org.apache.storm.bootstrap: Contains a macro that imports the various classes and namespaces used throughout the codebase
        • org.apache.storm.clojure: Implementation of the Clojure DSL for Storm
        • org.apache.storm.cluster: All the ZooKeeper logic needed by the Storm daemons is encapsulated in this file. Its code manages how cluster state is represented in the ZooKeeper file system.
        • org.apache.storm.command.*: Implement the various commands of the storm command-line client; these implementations are all very short
        • org.apache.storm.config: Config file reading and parsing code for Clojure
        • org.apache.storm.daemon.acker: Implementation of the acker bolt, a key part of how Storm guarantees data processing
        • org.apache.storm.daemon.common: Implementation of common functions used by the Storm daemons, such as getting the ID of a topology from its name
        • org.apache.storm.daemon.drpc: Implementation of the DRPC server
        • org.apache.storm.daemon.nimbus: Implementation of Nimbus
        • org.apache.storm.daemon.supervisor: Implementation of the Supervisor
        • org.apache.storm.daemon.task: Implementation of an individual task for a spout or bolt
        • org.apache.storm.daemon.worker: Implementation of a worker process
        • org.apache.storm.event: Implements an asynchronous function executor
        • org.apache.storm.log: Defines the functions used to log to log4j
        • org.apache.storm.messaging.*: Defines a higher-level interface for point-to-point messaging. Local mode uses in-memory Java queues; on a cluster, ZeroMQ is used. The generic interface is defined in protocol.clj.
        • org.apache.storm.stats: Implementation of the stats rollup routines used when sending stats to ZooKeeper for display in the UI
        • org.apache.storm.testing: Implementation of the facilities used to test topologies
        • org.apache.storm.thrift: Clojure wrappers around the Thrift API that make working with Thrift structures more pleasant
        • org.apache.storm.timer: Background timer
        • org.apache.storm.ui.*: Implementation of the Storm UI
        • org.apache.storm.util: Common utilities used throughout the codebase
        • org.apache.storm.zookeeper: Clojure wrapper around the ZooKeeper API that also implements high-level operations such as mkdirs and delete-recursive
