1. The raw data is kept in HBase to prepare for subsequent offline analysis.
Solution ideas:
(1) Create an HBaseConsumer that acts as a Kafka consumer
(2) Save the data read from Kafka into HBase
2. Start the services
(1) Start ZooKeeper, Kafka, and Flume
$ ./zkServer.sh start
$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic mytopic
$ bin/flume-ng agent --conf conf --conf-file conf/flume-kafka-conf.properties --name agent1 -Dflume.root.logger=INFO,console
(2) Start DFS
$ start-dfs.sh
(3) Start HBase
$ start-hbase.sh
3. Create an HBase table
Create the table:
hbase(main):002:0> create 'log_info', {NAME => 'info'}
ERROR: java.io.IOException: Table Namespace Manager not ready yet, try again later
    at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3447)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1845)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:2025)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:42280)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2107)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
    at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Creating the HBase table fails with the error above. To resolve it, go into ZooKeeper with zkCli.sh and remove HBase's stale znode, and delete the HBase files under the hbase.rootdir path; after restarting HBase, the table can be created normally.
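As an alternative to the shell, the table can also be created from Java. The following is a minimal sketch using the HBaseAdmin API of this HBase generation; the class name CreateLogInfoTable and the ZooKeeper quorum address are assumptions, not part of the original post:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateLogInfoTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.2.20"); // assumed ZooKeeper address
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (!admin.tableExists("log_info")) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("log_info"));
            desc.addFamily(new HColumnDescriptor("info")); // one column family: info
            admin.createTable(desc);
        }
        admin.close();
    }
}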
4. Write a Kafka consumer that saves data to HBase
package com.yun.kafka;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.yun.hbase.HBaseUtils;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

/**
 * Read data from Kafka and save it to HBase.
 * @author SHENFL
 */
public class StormKafkaToHBaseCustomer extends Thread {

    // Matches log lines such as:
    // "Province company authentication interface url[http://hn.auth.com], response time[2000], current time[1444274868019]"
    Pattern p = Pattern.compile(
            "Province company authentication interface url\\[(.*)\\], response time\\[([0-9]+)\\], current time\\[([0-9]+)\\]");

    private ConsumerConnector consumerConnector;

    public StormKafkaToHBaseCustomer() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "192.168.2.20:2181");
        props.put("group.id", "jf-group"); // set the consumer group
        ConsumerConfig config = new ConsumerConfig(props);
        this.consumerConnector = Consumer.createJavaConsumerConnector(config);
    }

    @Override
    public void run() {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("mytopic", 1); // one stream for the topic "mytopic"
        Map<String, List<KafkaStream<byte[], byte[]>>> createMessageStreams =
                consumerConnector.createMessageStreams(topicCountMap);
        HBaseUtils hbase = new HBaseUtils();
        while (true) {
            // get messages from Kafka's topic
            KafkaStream<byte[], byte[]> kafkaStream = createMessageStreams.get("mytopic").get(0);
            ConsumerIterator<byte[], byte[]> iterator = kafkaStream.iterator();
            if (iterator.hasNext()) {
                MessageAndMetadata<byte[], byte[]> mm = iterator.next();
                String v = new String(mm.message());
                Matcher m = p.matcher(v);
                if (m.find()) {
                    String url = m.group(1);
                    String useTime = m.group(2);
                    String currentTime = m.group(3);
                    System.out.println(Thread.currentThread().getId()
                            + "=>" + url + "=>" + useTime + "=>" + currentTime);
                    // Keep the raw data in HBase, e.g. http://hn.auth.com->2000->1444274868019;
                    // the rowkey is "auth_" plus the current time.
                    hbase.put("log_info", "auth_" + currentTime, "info", "url", url);
                    hbase.put("log_info", "auth_" + currentTime, "info", "usetime", useTime);
                    hbase.put("log_info", "auth_" + currentTime, "info", "currenttime", currentTime);
                }
            }
        }
    }

    public static void main(String[] args) {
        StormKafkaToHBaseCustomer stormKafkaToHBaseCustomer = new StormKafkaToHBaseCustomer();
        stormKafkaToHBaseCustomer.start();
    }
}
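The HBaseUtils class imported above is not shown in the original post. Below is a minimal sketch of what its put(table, rowkey, family, qualifier, value) method could look like, assuming the HBase client API of this generation and the same ZooKeeper host as the Kafka consumer (both assumptions):

package com.yun.hbase;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

/** Sketch of the HBaseUtils helper used above (not shown in the original). */
public class HBaseUtils {

    private final Configuration conf;

    public HBaseUtils() {
        conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.2.20"); // assumed: same host as Kafka's ZooKeeper
    }

    // Write one cell: table / rowkey / column family / qualifier / value.
    public void put(String tableName, String rowKey, String family, String qualifier, String value) {
        try {
            HTable table = new HTable(conf, tableName);
            Put p = new Put(Bytes.toBytes(rowKey));
            p.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(p);
            table.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}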
5. Validate HBase Data
hbase(main):030:0> get 'log_info', 'auth_1444314527110'
COLUMN CELL
info:currenttime timestamp=1444314527104, value=1444314527110
info:url timestamp=1444314527087, value=http://hn.auth.com
info:usetime timestamp=1444314527096, value=2000
When you see this output, the consumer program has successfully consumed the data in Kafka and saved it to HBase, ready for subsequent analysis.
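For completeness, the same row can also be read back programmatically. A minimal sketch, again assuming the client API of this HBase generation and the ZooKeeper address used earlier (the class name VerifyLogInfo is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class VerifyLogInfo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "192.168.2.20"); // assumed ZooKeeper address
        HTable table = new HTable(conf, "log_info");
        // fetch the row written by the consumer and print each cell
        Result r = table.get(new Get(Bytes.toBytes("auth_1444314527110")));
        System.out.println("url = " + Bytes.toString(r.getValue(Bytes.toBytes("info"), Bytes.toBytes("url"))));
        System.out.println("usetime = " + Bytes.toString(r.getValue(Bytes.toBytes("info"), Bytes.toBytes("usetime"))));
        System.out.println("currenttime = " + Bytes.toString(r.getValue(Bytes.toBytes("info"), Bytes.toBytes("currenttime"))));
        table.close();
    }
}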