Kafka + Storm + HBase


This blog is based on the following software:


CentOS 7.3 (1611)
kafka_2.10-0.10.2.1.tgz
zookeeper-3.4.10.tar.gz
hbase-1.3.1-bin.tar.gz
apache-storm-1.1.0.tar.gz
hadoop-2.8.0.tar.gz
jdk-8u131-linux-x64.tar.gz
IntelliJ IDEA 2017.1.3 x64
IP            Role
172.17.11.85  NameNode, SecondaryNameNode, DataNode, HMaster, HRegionServer
172.17.11.86  DataNode, HRegionServer
172.17.11.87  DataNode, HRegionServer


1. First of all, the Kafka -> Storm part

I use a producer to publish data to a fixed topic:

import java.util.Properties;
import java.util.Random;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class Producer {
    private final KafkaProducer<String, String> producer;
    private final String topic;

    public Producer(String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "172.17.11.85:9092,172.17.11.86:9092,172.17.11.87:9092");
        props.put("client.id", "DemoProducer");
        props.put("batch.size", 16384);        // 16 KB batch size
        props.put("linger.ms", 1000);
        props.put("buffer.memory", 33554432);  // 32 MB send buffer
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<>(props);
        this.topic = topic;
    }

    public void producerMsg() throws InterruptedException {
        String data = "Apache Storm is a free and open source distributed realtime computation system. "
                + "Storm makes it easy to reliably process unbounded streams of data, doing for realtime "
                + "processing what Hadoop does for batch processing. Storm is simple, can be used with any "
                + "programming language, and is a lot of fun to use!\n"
                + "Storm has many use cases: realtime analytics, online machine learning, continuous "
                + "computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at "
                + "over a million tuples processed per second per node. It is scalable, fault-tolerant, "
                + "guarantees your data will be processed, and is easy to set up and operate.\n"
                + "Storm integrates with the queueing and database technologies you already use. A Storm "
                + "topology consumes streams of data and processes those streams in arbitrarily complex "
                + "ways, repartitioning the streams between each stage of the computation however needed. "
                + "Read more in the tutorial.";
        // strip punctuation, then split the text into single words
        data = data.replaceAll("[\\pP]", "");
        String[] words = data.split(" ");

        Random _rand = new Random();
        Random rnd = new Random();
        int events = 10;
        for (long nEvents = 0; nEvents < events; nEvents++) {
            int lastIpNum = rnd.nextInt(255);
            String ip = "192.168.2." + lastIpNum;            // random IP as the record key
            String msg = words[_rand.nextInt(words.length)]; // random word as the record value
            try {
                producer.send(new ProducerRecord<>(topic, ip, msg));
                System.out.println("Sent message: (" + ip + ", " + msg + ")");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        Thread.sleep(10000);
    }

    public static void main(String[] args) throws InterruptedException {
        // Constants.TOPIC holds the topic name (the Constants class is not shown in this post)
        Producer producer = new Producer(Constants.TOPIC);
        producer.producerMsg();
    }
}

The producer strips the punctuation from the two sentences, splits them into single words, and then produces each word to the given topic.
That part is straightforward. Next is the consumer side, which is Storm's spout, the KafkaSpout:


KafkaSpoutConfig<String, String> kafkaSpoutConfig = KafkaSpoutConfig
        .builder(args[0], args[1])  // args[0]: bootstrap servers, args[1]: topic
        .setProp(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true")
        // the interval value was garbled in the original post; 1000 ms is an assumption
        .setProp(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000")
        .setProp(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000")
        .setOffsetCommitPeriodMs(10000)
        .setGroupId(args[2])
        // the original value was garbled; 250 is a placeholder
        .setMaxUncommittedOffsets(250)
        .setFirstPollOffsetStrategy(KafkaSpoutConfig.FirstPollOffsetStrategy.LATEST)
        .build();

KafkaSpout<String, String> kafkaSpout = new KafkaSpout<>(kafkaSpoutConfig);

The consumer (the spout) consumes data from the specified topic and emits it to the next bolt. With storm-kafka-client's default record translator, each tuple carries the fields "topic", "partition", "offset", "key", and "value", which is why the bolt below reads the "value" field.


import java.util.HashMap;
import java.util.Map;

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WordCountBolt extends BaseBasicBolt {
    private Map<String, Integer> counts = new HashMap<>();

    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("value"); // the Kafka record value
        Integer count = counts.get(word);
        if (count == null)
            count = 0;
        count++;
        counts.put(word, count);
        System.out.println("WordCountBolt receive: " + word + " " + count);
        collector.emit(new Values(word, count.toString()));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}
2. Storm -> HBase

The first thing to do is to copy the hbase-site.xml configuration file from the cluster into the project's classpath (e.g. src/main/resources).

The next step is the API call:


SimpleHBaseMapper mapper = new SimpleHBaseMapper()
        .withRowKeyField("word")
        .withColumnFields(new Fields("count"))
        .withColumnFamily("result");
// args[3] is the HBase table name
HBaseBolt hbaseBolt = new HBaseBolt(args[3], mapper)
        .withConfigKey("hbase");
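
Note that HBaseBolt does not create the table; it must already exist with the column family the mapper writes to. A quick way to create it, assuming the table name passed in args[3] is "wordcount" (my example name, not from the original post), is the HBase shell:

create 'wordcount', 'result'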

3. Construction of the entire topology

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafkaSpout", kafkaSpout, 1);
// left commented out, as in the original post:
// builder.setBolt("wordSplitBolt", new WordSplitBolt(), 2)
//         .shuffleGrouping("kafkaSpout");
builder.setBolt("countBolt", new WordCountBolt(), 2)
        .fieldsGrouping("kafkaSpout", new Fields("value"));
builder.setBolt("hbaseBolt", hbaseBolt, 1)
        .addConfiguration("hbase", new HashMap<String, Object>())
        .shuffleGrouping("countBolt");
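
For completeness, here is a minimal sketch of how the topology might then be submitted; the original post does not show this step, and the use of args[4] as the topology name is my assumption:

Config conf = new Config();
conf.setNumWorkers(1);
if (args.length > 4) {
    // submit to a real cluster; args[4] is a hypothetical topology-name argument
    StormSubmitter.submitTopology(args[4], conf, builder.createTopology());
} else {
    // local mode for quick testing
    LocalCluster cluster = new LocalCluster();
    cluster.submitTopology("kafka-storm-hbase", conf, builder.createTopology());
}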


Now we come to the real point. Pay attention, these are the key issues!


Key 1: version information in the pom.xml file


<dependencies>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-core</artifactId>
        <version>1.1.0</version>
        <!--<scope>provided</scope>-->
    </dependency>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-hbase</artifactId>
        <version>1.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-kafka-client</artifactId>
        <version>1.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.7.3</version>
        <exclusions>
            <exclusion>
                <groupId>org.slf4j</groupId>
                <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>


I import hadoop-client 2.7.3. As for why: if I use 2.8.0, it produces the following exception:


java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosTicket(Ljavax/security/auth/Subject;)Z
    at org.apache.hadoop.security.UserGroupInformation.<init>(UserGroupInformation.java:652) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:843) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:802) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:675) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:285) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:281) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:185) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.UserProvider.getCurrent(UserProvider.java:88) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.common.HBaseClient.<init>(HBaseClient.java:43) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.AbstractHBaseBolt.prepare(AbstractHBaseBolt.java:75) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.HBaseBolt.prepare(HBaseBolt.java:109) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.daemon.executor$fn__5044$fn__5057.invoke(executor.clj:791) ~[storm-core-1.1.0.jar:1.1.0]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:482) [storm-core-1.1.0.jar:1.1.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
 
This appears to be caused by a version incompatibility: hadoop-common 2.8.0 calls KerberosUtil.hasKerberosTicket, a method that an older hadoop-auth jar pulled in transitively (via the HBase client) does not have.
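
If you do want to stay on 2.8.0, one plausible alternative (my assumption, not from the original post) is to pin the transitive hadoop-auth dependency to the same version, so hadoop-common 2.8.0 finds the method it expects:

<dependencyManagement>
    <dependencies>
        <!-- assumption: force the transitive hadoop-auth to match hadoop-common 2.8.0 -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-auth</artifactId>
            <version>2.8.0</version>
        </dependency>
    </dependencies>
</dependencyManagement>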


Key 2: log4j-over-slf4j.jar AND slf4j-log4j12.jar conflict


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/geekp/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.8/log4j-slf4j-impl-2.8.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/geekp/.m2/repository/org/slf4j/slf4j-log4j12/1.7.10/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


....

SLF4J: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the class path, preempting StackOverflowError. 
SLF4J: See also http://www.slf4j.org/codes.html#log4jDelegationLoop for more details.

[Thread-22-HbaseBolt-executor[1 1]] ERROR o.a.s.util - Async loop died!
java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosTicket(Ljavax/security/auth/Subject;)Z
    at org.apache.hadoop.security.UserGroupInformation.<init>(UserGroupInformation.java:652) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:843) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:802) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:675) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:285) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:281) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:185) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.UserProvider.getCurrent(UserProvider.java:88) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.common.HBaseClient.<init>(HBaseClient.java:43) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.AbstractHBaseBolt.prepare(AbstractHBaseBolt.java:75) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.HBaseBolt.prepare(HBaseBolt.java:109) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.daemon.executor$fn__5044$fn__5057.invoke(executor.clj:791) ~[storm-core-1.1.0.jar:1.1.0]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:482) [storm-core-1.1.0.jar:1.1.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
71976 [Thread-22-HbaseBolt-executor[1 1]] ERROR o.a.s.d.executor - 
java.lang.NoSuchMethodError: org.apache.hadoop.security.authentication.util.KerberosUtil.hasKerberosTicket(Ljavax/security/auth/Subject;)Z
    at org.apache.hadoop.security.UserGroupInformation.<init>(UserGroupInformation.java:652) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:843) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:802) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:675) ~[hadoop-common-2.8.0.jar:?]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:285) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User$SecureHadoopUser.<init>(User.java:281) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.User.getCurrent(User.java:185) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.hadoop.hbase.security.UserProvider.getCurrent(UserProvider.java:88) ~[hbase-common-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.common.HBaseClient.<init>(HBaseClient.java:43) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.AbstractHBaseBolt.prepare(AbstractHBaseBolt.java:75) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.hbase.bolt.HBaseBolt.prepare(HBaseBolt.java:109) ~[storm-hbase-1.1.0.jar:1.1.0]
    at org.apache.storm.daemon.executor$fn__5044$fn__5057.invoke(executor.clj:791) ~[storm-core-1.1.0.jar:1.1.0]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:482) [storm-core-1.1.0.jar:1.1.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
71976 [Thread-26-kafkaSpout-executor[4 4]] INFO  o.a.s.k.s.KafkaSpout - Initialization complete
71992 [Thread-22-HbaseBolt-executor[1 1]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
    at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.1.0.jar:1.1.0]
    at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
    at org.apache.storm.daemon.worker$fn__5642$fn__5643.invoke(worker.clj:759) [storm-core-1.1.0.jar:1.1.0]
    at org.apache.storm.daemon.executor$mk_executor_data$fn__4863$fn__4864.invoke(executor.clj:274) [storm-core-1.1.0.jar:1.1.0]
    at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:494) [storm-core-1.1.0.jar:1.1.0]
    at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

Process finished with exit code 1


The fix is simply not to pull in slf4j-log4j12; exclude it in the pom file:


<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.3</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>


Key 3: the HBase configuration file hbase-site.xml from the server
This issue is really, especially important; it bothered me for a whole day.

My configuration file on the server cluster is like this

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://master:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>master,slave1,slave2</value>
    </property>
    <property>
        <name>hbase.master.info.bindAddress</name>
        <value>0.0.0.0</value>
    </property>
    <property>
        <name>hbase.master.info.port</name>
        <value>16010</value>
    </property>
    <property>
        <name>hbase.master.port</name>
        <value>16000</value>
    </property>
</configuration>


When I downloaded the configuration file, I changed the hostnames in it to the corresponding IP addresses. But the data still would not go through, and after a lot of troubleshooting I found the real reason.
Very important: HBase does not rely solely on my local copy of the file. It uses it to reach the master, but it then asks the cluster for the addresses of the region servers, and on my cluster the slave nodes are registered by hostname (slave1 and slave2, as written in the configuration under the HBase installation path, e.g. conf/zoo.cfg). The client gets those hostnames back and tries to resolve them on my Windows machine. But! The corresponding IP addresses were not in my hosts file, so the client could not resolve the slave nodes, and that was the real culprit behind the failed writes to HBase!

The solution is to add the hostname-to-IP mappings in C:\Windows\System32\drivers\etc\hosts.
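
For example (assuming the cluster hostnames map to the IPs listed at the top of this post; your addresses may differ):

172.17.11.85  master
172.17.11.86  slave1
172.17.11.87  slave2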


A note to my future self: when writing cluster configuration files in the future, wherever an IP address can be written, write the IP address!
