Flume custom hbasesink class

Source: Internet
Author: User

Reference (to the original author) http://ydt619.blog.51cto.com/316163/1230586
https://blogs.apache.org/flume/entry/streaming_data_into_apache_hbase

Sample configuration file for Flume 1.5

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/scut/Downloads/testFlume

# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.hbase.AsyncHBaseSink
# Set the HBase table name
a1.sinks.k1.table = Router
# Set the column family in HBase
a1.sinks.k1.columnFamily = log
# Set the HBase columns
a1.sinks.k1.serializer.payloadColumn = serviceTime,browerOS,clientTime,screenHeight,screenWidth,url,userAgent,mobileDevice,gwId,mac
# Set the serializer processing class
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.BaimiAsyncHbaseEventSerializer

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

The key attribute a1.sinks.k1.serializer.payloadColumn lists all column names, and a1.sinks.k1.serializer sets the Flume serializer processing class. The BaimiAsyncHbaseEventSerializer class reads the value of payloadColumn and splits it on commas (,) to obtain all column names, which it converts to lower case.

BaimiAsyncHbaseEventSerializer class

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
package org.apache.flume.sink.hbase;

import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.FlumeException;
import org.apache.flume.conf.ComponentConfiguration;
import org.apache.flume.sink.hbase.SimpleHbaseEventSerializer.KeyType;
import org.hbase.async.AtomicIncrementRequest;
import org.hbase.async.PutRequest;

import com.google.common.base.Charsets;

public class BaimiAsyncHbaseEventSerializer implements AsyncHbaseEventSerializer {
  private byte[] table;
  private byte[] cf;
  private byte[][] payload;
  private byte[][] payloadColumn;
  // Regex for the field delimiter: a literal '^' followed by 'A'.
  private final String payloadColumnSplit = "\\^A";
  private byte[] incrementColumn;
  private String rowSuffix;
  private String rowSuffixCol;
  private byte[] incrementRow;
  private KeyType keyType;

  @Override
  public void initialize(byte[] table, byte[] cf) {
    this.table = table;
    this.cf = cf;
  }

  @Override
  public List<PutRequest> getActions() {
    List<PutRequest> actions = new ArrayList<PutRequest>();
    if (payloadColumn != null) {
      byte[] rowKey;
      try {
        // rowSuffix is passed as the key prefix; the key type decides what follows it.
        switch (keyType) {
          case TS:
            rowKey = SimpleRowKeyGenerator.getTimestampKey(rowSuffix);
            break;
          case TSNANO:
            rowKey = SimpleRowKeyGenerator.getNanoTimestampKey(rowSuffix);
            break;
          case RANDOM:
            rowKey = SimpleRowKeyGenerator.getRandomKey(rowSuffix);
            break;
          default:
            rowKey = SimpleRowKeyGenerator.getUUIDKey(rowSuffix);
            break;
        }
        // Submit one put request per column, all sharing the same row key.
        for (int i = 0; i < this.payload.length; i++) {
          PutRequest putRequest =
              new PutRequest(table, rowKey, cf, payloadColumn[i], payload[i]);
          actions.add(putRequest);
        }
      } catch (Exception e) {
        throw new FlumeException("Could not get row key!", e);
      }
    }
    return actions;
  }

  @Override
  public List<AtomicIncrementRequest> getIncrements() {
    List<AtomicIncrementRequest> actions = new ArrayList<AtomicIncrementRequest>();
    if (incrementColumn != null) {
      AtomicIncrementRequest inc =
          new AtomicIncrementRequest(table, incrementRow, cf, incrementColumn);
      actions.add(inc);
    }
    return actions;
  }

  @Override
  public void cleanUp() {
    // Nothing to clean up.
  }

  @Override
  public void configure(Context context) {
    String pCol = context.getString("payloadColumn", "pCol");
    String iCol = context.getString("incrementColumn", "iCol");
    rowSuffixCol = context.getString("rowPrefixCol", "mac");
    String suffix = context.getString("suffix", "uuid");
    if (pCol != null && !pCol.isEmpty()) {
      if (suffix.equals("timestamp")) {
        keyType = KeyType.TS;
      } else if (suffix.equals("random")) {
        keyType = KeyType.RANDOM;
      } else if (suffix.equals("nano")) {
        keyType = KeyType.TSNANO;
      } else {
        keyType = KeyType.UUID;
      }
      // Read the column names from the configuration file.
      String[] pCols = pCol.replace(" ", "").split(",");
      payloadColumn = new byte[pCols.length][];
      for (int i = 0; i < pCols.length; i++) {
        // Convert the column name to lower case.
        payloadColumn[i] = pCols[i].toLowerCase().getBytes(Charsets.UTF_8);
      }
    }
    if (iCol != null && !iCol.isEmpty()) {
      incrementColumn = iCol.getBytes(Charsets.UTF_8);
    }
    incrementRow = context.getString("incrementRow", "incRow").getBytes(Charsets.UTF_8);
  }

  @Override
  public void setEvent(Event event) {
    // Decode explicitly as UTF-8 so the result does not depend on the platform charset.
    String strBody = new String(event.getBody(), Charsets.UTF_8);
    String[] subBody = strBody.split(this.payloadColumnSplit);
    // Only accept events whose field count matches the configured columns.
    if (subBody.length == this.payloadColumn.length) {
      this.payload = new byte[subBody.length][];
      for (int i = 0; i < subBody.length; i++) {
        this.payload[i] = subBody[i].getBytes(Charsets.UTF_8);
        if (new String(this.payloadColumn[i], Charsets.UTF_8).equals(this.rowSuffixCol)) {
          // The rowkey prefix is the value of one column; by default, the mac address.
          this.rowSuffix = subBody[i];
        }
      }
    }
  }

  @Override
  public void configure(ComponentConfiguration conf) {
    // Not used.
  }
}

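The parsing done by configure and setEvent can be tried without a Flume installation. The following is a minimal standalone sketch, not part of the serializer itself: the class name PayloadMappingSketch, its helper methods, and the sample values are hypothetical, and it assumes the field delimiter is the literal two-character sequence ^A, as the split pattern suggests.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical standalone sketch; it follows the serializer's parsing logic
// but does not depend on Flume or asynchbase.
public class PayloadMappingSketch {

    // Same regex as payloadColumnSplit: a literal '^' followed by 'A'.
    static final String SPLIT_PATTERN = "\\^A";

    // Follows configure(): strip spaces, split on commas, lower-case the names.
    // Locale.ROOT keeps the result independent of the default locale.
    public static String[] parseColumns(String payloadColumn) {
        String[] cols = payloadColumn.replace(" ", "").split(",");
        for (int i = 0; i < cols.length; i++) {
            cols[i] = cols[i].toLowerCase(Locale.ROOT);
        }
        return cols;
    }

    // Follows setEvent(): pair each field of the body with its column name.
    // Returns an empty map when the field count does not match the column count.
    public static Map<String, String> mapBody(String[] columns, String body) {
        Map<String, String> row = new LinkedHashMap<>();
        String[] fields = body.split(SPLIT_PATTERN);
        if (fields.length == columns.length) {
            for (int i = 0; i < fields.length; i++) {
                row.put(columns[i], fields[i]);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        String[] columns = parseColumns("url, gwId, mac");
        // Made-up sample values, joined with the literal ^A delimiter.
        Map<String, String> row =
            mapBody(columns, "http://example.com/^Agw01^A00:11:22:33:44:55");
        System.out.println(row);
        // The value of the "mac" column would become the row-key prefix.
        System.out.println("row-key prefix: " + row.get("mac"));
    }
}
```

One thing the sketch makes easy to check: String.split drops trailing empty strings, so an event whose last field is empty yields fewer fields than columns and fails the length comparison.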
Focus on the setEvent, configure, and getActions functions.

configure function: reads the content of the Flume configuration file, including the column names and the rowkey suffix.
setEvent function: gets the content of the Flume event and saves it to the payload array.
getActions function: creates the PutRequest instances and writes the rowkey, column family, column, and value into each one.

Compiling and running the custom BaimiAsyncHbaseEventSerializer

After adding the custom BaimiAsyncHbaseEventSerializer class to the source code, compile the source to generate the flume-ng-hbase-sink.*.jar package and use it to replace the original flume-ng-hbase-sink.*.jar package in Flume.

Download the Flume 1.5 source code, decompress the package, and copy the BaimiAsyncHbaseEventSerializer class above into the directory flume-1.5.0-src/flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/.

Go to flume-1.5.0-src/flume-ng-sinks/flume-ng-hbase-sink/ and run the Maven build command [mvn install -Dmaven.test.skip=true].

After compilation, Maven generates flume-ng-hbase-sink-1.5.0.jar in the flume-1.5.0-src/flume-ng-sinks/flume-ng-hbase-sink/target directory. Replace the jar package of the same name under $FLUME_HOME/lib with this one, then run the Flume command [flume-ng agent -c . -f conf/spoolDir.conf -n a1 -Dflume.root.logger=INFO,console].
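
To exercise the agent once it is running, a test file can be dropped into the spooling directory. The sketch below is one way to produce such a file. The class name SampleEventFile, the output file name, and all field values are made up for illustration, and it assumes that the fields must appear in the same order as payloadColumn and be joined with the literal ^A delimiter that the serializer splits on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

// Hypothetical helper for producing test input; not part of Flume.
public class SampleEventFile {

    // Joins the field values with the literal two-character delimiter "^A".
    public static String buildLine(String... fields) {
        return String.join("^A", fields);
    }

    public static void main(String[] args) throws IOException {
        // Ten made-up values, one per column listed in payloadColumn:
        // serviceTime, browerOS, clientTime, screenHeight, screenWidth,
        // url, userAgent, mobileDevice, gwId, mac
        String line = buildLine(
            "1400000000000", "Android", "1400000000123", "800", "480",
            "http://example.com/", "TestAgent/1.0", "phone", "gw01",
            "00:11:22:33:44:55");

        // Write to a temporary directory first: the spooling directory source
        // expects files to be complete and unchanged once they appear there.
        Path dir = Files.createTempDirectory("flume-sample");
        Path file = dir.resolve("sample.log");
        Files.write(file, Collections.singletonList(line), StandardCharsets.UTF_8);
        System.out.println("Wrote " + file);
    }
}
```

Moving the finished file into the directory configured as spoolDir (/home/scut/Downloads/testFlume in the sample configuration) should then result in one row in the Router table, with the mac value as the row-key prefix.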
