Flume-ng source reading: AvroSink

Source: Internet
Author: User

org.apache.flume.sink.AvroSink transmits data over the network. It sends events to an RPC server (such as an AvroSource), so AvroSink and AvroSource can be chained to build a tiered topology of agents. AvroSink extends AbstractRpcSink, which in turn extends AbstractSink and implements Configurable, just like any other sink. The points of interest are therefore the same four methods, configure, start, process, and stop, plus the implementation of the initializeRpcClient(Properties props) method.

First, the configure(context) method reads the hostname and port from the configuration file and sets the clientProps properties hosts=h1 and hosts.h1=hostname:port. It then copies every entry of the configuration into clientProps and reads cxnResetInterval, the interval after which the connection is reset. The default of 0 means the connection is never reset.
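
The property layout described above can be sketched in plain Java. This is a minimal illustration and not the Flume source: the class name ClientPropsSketch, the method buildClientProps, the host collector01, and the port 4141 are all made up for the example.

```java
import java.util.Map;
import java.util.Properties;

class ClientPropsSketch {
    // Builds client properties the way the text describes: a single host alias
    // "h1" mapped to "hostname:port", followed by every sink setting.
    // (Hypothetical helper, not a Flume method.)
    static Properties buildClientProps(String hostname, int port, Map<String, String> sinkSettings) {
        Properties props = new Properties();
        props.setProperty("hosts", "h1");
        props.setProperty("hosts.h1", hostname + ":" + port);
        for (Map.Entry<String, String> entry : sinkSettings.entrySet()) {
            props.setProperty(entry.getKey(), entry.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildClientProps("collector01", 4141, Map.of("batch-size", "200"));
        System.out.println(props.getProperty("hosts.h1")); // prints collector01:4141
    }
}
```

The RPC client only ever sees this flat Properties object, which is why the sink has to translate its own hostname and port settings into the hosts / hosts.h1 form first.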


Second, the start() method calls createConnection() to establish a connection; if an exception occurs, it calls destroyConnection() to close the connection and avoid resource leaks. createConnection() mainly initializes client = initializeRpcClient(clientProps) and creates a thread that calls destroyConnection() after the given delay cxnResetInterval. Because cxnResetInterval defaults to 0, this thread is not scheduled by default. The reason for destroying the connection is not obvious from the code alone. The Flume user guide describes the reset interval as a way to force the sink to reconnect to the next hop, so that it can reach hosts newly added behind a hardware load balancer without restarting the agent.

The initializeRpcClient(clientProps) method constructs the appropriate RpcClient from the configuration. The type is given by the "client.type" parameter and there are four choices: NettyAvroRpcClient (the default when "client.type" is not set), FailoverRpcClient, LoadBalancingRpcClient, and ThriftRpcClient. After instantiation, the client is configured by calling client.configure(properties):
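
The delayed reset can be sketched with a ScheduledExecutorService. This is an illustration under stated assumptions, not the AbstractRpcSink code: the class ConnectionResetSketch and the boolean connected flag stand in for the real client, and the interval is assumed to be in seconds.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class ConnectionResetSketch {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private boolean connected; // stands in for the real RPC client

    // Opens the simulated connection and, if intervalSeconds > 0, schedules a
    // one-shot task that tears it down after that delay.
    // Returns the scheduled task, or null when resets are disabled.
    synchronized ScheduledFuture<?> createConnection(long intervalSeconds) {
        connected = true;
        if (intervalSeconds <= 0) {
            return null; // default of 0: the connection is never reset
        }
        return scheduler.schedule(this::destroyConnection, intervalSeconds, TimeUnit.SECONDS);
    }

    synchronized void destroyConnection() {
        connected = false;
    }

    synchronized boolean isConnected() {
        return connected;
    }

    void shutdown() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        ConnectionResetSketch sink = new ConnectionResetSketch();
        System.out.println(sink.createConnection(0) == null);    // true: resets disabled
        System.out.println(sink.createConnection(3600) != null); // true: reset scheduled
        sink.shutdown();
    }
}
```

In this sketch a destroyed connection simply stays closed; a real sink would need to re-create it before sending the next batch.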

(1) The NettyAvroRpcClient.configure(Properties properties) method first acquires the lock and checks the connection state connState to make sure the client has not already been configured. It then reads "batch-size" and sets batchSize, falling back to the default of 100 if the configured value is less than 1. It reads "hosts" and, if several hosts are configured, uses only the first; it looks up the matching "hosts." prefixed entry for that host, parses out the hostname and port, and builds an InetSocketAddress object, address. It reads the connection timeout "connect-timeout" and sets connectTimeout; if the configured value is less than 1000, the default of 20000 is used (in ms). It reads the request timeout "request-timeout" and sets requestTimeout; again, a value below 1000 is replaced by the default of 20000 ms. It reads the compression type "compression-type"; if compression is configured, it also reads the compression level compressionLevel. Finally it calls connect() to connect to the RPC server.
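
The defaulting rules above can be written out as a small stand-alone parser. This is a sketch rather than the NettyAvroRpcClient implementation: the class NettyClientConfigSketch is invented for illustration, and it assumes that multiple host aliases in "hosts" are separated by whitespace.

```java
import java.util.Properties;

class NettyClientConfigSketch {
    static final int DEFAULT_BATCH_SIZE = 100;
    static final long DEFAULT_TIMEOUT_MS = 20_000L;

    final int batchSize;
    final long connectTimeoutMs;
    final long requestTimeoutMs;
    final String hostname;
    final int port;

    NettyClientConfigSketch(Properties props) {
        // batch-size below 1 falls back to the default of 100
        int configuredBatch = Integer.parseInt(props.getProperty("batch-size", "100"));
        batchSize = configuredBatch < 1 ? DEFAULT_BATCH_SIZE : configuredBatch;

        connectTimeoutMs = timeoutOrDefault(props.getProperty("connect-timeout"));
        requestTimeoutMs = timeoutOrDefault(props.getProperty("request-timeout"));

        String hosts = props.getProperty("hosts");
        if (hosts == null || hosts.trim().isEmpty()) {
            throw new IllegalArgumentException("hosts list is required");
        }
        // Assumption: aliases are whitespace-separated; only the first one is used.
        String firstAlias = hosts.trim().split("\\s+")[0];
        String address = props.getProperty("hosts." + firstAlias);
        if (address == null) {
            throw new IllegalArgumentException("no address for host alias " + firstAlias);
        }
        String[] hostAndPort = address.split(":");
        hostname = hostAndPort[0];
        port = Integer.parseInt(hostAndPort[1]);
    }

    // Timeouts that are missing or below 1000 ms fall back to 20000 ms.
    static long timeoutOrDefault(String configured) {
        if (configured == null) {
            return DEFAULT_TIMEOUT_MS;
        }
        long value = Long.parseLong(configured);
        return value < 1000 ? DEFAULT_TIMEOUT_MS : value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("hosts", "h1 h2");
        props.setProperty("hosts.h1", "collector01:4141");
        props.setProperty("batch-size", "0");
        props.setProperty("connect-timeout", "500");
        NettyClientConfigSketch cfg = new NettyClientConfigSketch(props);
        System.out.println(cfg.hostname + ":" + cfg.port + " batch=" + cfg.batchSize
                + " connectTimeout=" + cfg.connectTimeoutMs);
    }
}
```

Note that a too-small value is silently replaced by the default rather than rejected, so a typo such as a timeout given in seconds ends up as 20000 ms.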

The actual connection is made in the connect(long timeout, TimeUnit tu) method. It first builds a thread pool, callTimeoutPool, then builds the channel factory: CompressionChannelFactory if compression is configured, or NioClientSocketChannelFactory otherwise. Next it constructs the transceiver object, transceiver = new NettyTransceiver(this.address, socketChannelFactory, tu.toMillis(timeout)), and obtains an avroClient based on that transceiver. Finally it sets the connection state to ready.

(2) The FailoverRpcClient.configure(Properties properties) method calls configureHosts(Properties properties). That method reads the host list hosts from the configuration, reads the maximum number of attempts "max-attempts" and sets maxTries (the default is the size of hosts), and reads the batch size "batch-size" and sets batchSize, using the default of 100 if the configured value is less than 1. The client is then marked as active with isActive=true. As this shows, this client can use multiple hosts.

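The failover idea in (2) can be sketched as a loop over the host list. This is a simplification and not the FailoverRpcClient code: FailoverSketch, resolveMaxTries, and sendWithFailover are hypothetical names, and the Predicate stands in for an actual send to a host.

```java
import java.util.List;
import java.util.function.Predicate;

class FailoverSketch {
    // "max-attempts" defaults to the number of configured hosts.
    static int resolveMaxTries(String configured, int hostCount) {
        return configured == null ? hostCount : Integer.parseInt(configured);
    }

    // Tries hosts in list order until one accepts the event, for at most
    // maxTries attempts. Returns the host that accepted the event and throws
    // if every attempt failed.
    static String sendWithFailover(List<String> hosts, int maxTries, Predicate<String> trySend) {
        int attempts = Math.min(maxTries, hosts.size());
        for (int i = 0; i < attempts; i++) {
            String host = hosts.get(i);
            if (trySend.test(host)) {
                return host;
            }
        }
        throw new IllegalStateException("no host accepted the event after " + attempts + " attempts");
    }

    public static void main(String[] args) {
        List<String> hosts = List.of("h1", "h2", "h3");
        int maxTries = resolveMaxTries(null, hosts.size());
        // Pretend only h2 is reachable.
        System.out.println(sendWithFailover(hosts, maxTries, "h2"::equals)); // prints h2
    }
}
```
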
(3) LoadBalancingRpcClient.configure(Properties properties) reads the host list hosts from the configuration; fewer than two hosts is not allowed and causes an exception. It then reads the host selector "host-selector". There are two built-in selectors, LoadBalancingRpcClient.RoundRobinHostSelector and LoadBalancingRpcClient.RandomOrderHostSelector. The default is round_robin (that is, RoundRobinHostSelector, which takes the hosts in turn), and a custom selector can be supplied by implementing the LoadBalancingRpcClient.HostSelector interface. Next it reads "backoff" and sets backoff, a boolean that controls whether the backoff algorithm is used: when enabled, a failed host is penalized for a period during which it is no longer considered active (the default, false, leaves this disabled). It also reads the maximum backoff time "maxBackoff" and sets maxBackoff. Finally, depending on whether the selector is round_robin or random, it instantiates the corresponding selector class and passes it the host list with selector.setHosts(hosts).

Of these two built-in selectors, RoundRobinHostSelector actually delegates to RoundRobinOrderSelector, and RandomOrderHostSelector actually delegates to RandomOrderSelector. Both are described in the article "Flume-ng source reading: SinkGroups and SinkRunner" and are not explained again here.
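
As a rough illustration of round-robin selection, the sketch below returns the hosts in a rotated order on each call. It is not the RoundRobinOrderSelector implementation and it ignores backoff entirely; the class RoundRobinSketch is invented here, and the two-host check is placed in its constructor only to mirror the rule in (3).

```java
import java.util.ArrayList;
import java.util.List;

class RoundRobinSketch<T> {
    private final List<T> hosts;
    private int next = 0;

    RoundRobinSketch(List<T> hosts) {
        if (hosts.size() < 2) {
            throw new IllegalArgumentException("at least two hosts are required");
        }
        this.hosts = List.copyOf(hosts);
    }

    // Returns every host exactly once, starting from a position that advances
    // by one on each call, so load spreads evenly across the hosts.
    synchronized List<T> nextOrder() {
        List<T> order = new ArrayList<>(hosts.size());
        for (int i = 0; i < hosts.size(); i++) {
            order.add(hosts.get((next + i) % hosts.size()));
        }
        next = (next + 1) % hosts.size();
        return order;
    }

    public static void main(String[] args) {
        RoundRobinSketch<String> selector = new RoundRobinSketch<>(List.of("a", "b", "c"));
        System.out.println(selector.nextOrder()); // [a, b, c]
        System.out.println(selector.nextOrder()); // [b, c, a]
        System.out.println(selector.nextOrder()); // [c, a, b]
    }
}
```

Returning a full ordering rather than a single host lets the caller fall through to the next host when the first one fails.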

(4) ThriftRpcClient.configure(Properties properties) acquires the state lock with stateLock.lock(). It takes the first entry of the host list in the configuration, since only one host is used. It reads the batch size "batch-size" and sets batchSize, using the default of 100 if the configured value is less than 1. It obtains the hostname and port. It reads the request timeout requestTimeout, which is set to the default of 20000 ms if the value is less than 1000. It reads the connection pool size "maxConnections" and sets connectionPoolSize, using the default of 5 if the size is less than 1. It creates the connection pool manager, connectionManager = new ConnectionPoolManager(connectionPoolSize), sets the connection state to ready with connState = State.READY, and finally releases the lock with stateLock.unlock().
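
The lock, configure, unlock sequence is usually written with try/finally so the lock is released even when a setting fails to parse. The sketch below shows that pattern under stated assumptions: ThriftConfigSketch and its State enum are invented for illustration, and only the pool size and request timeout rules from the paragraph above are modelled.

```java
import java.util.Properties;
import java.util.concurrent.locks.ReentrantLock;

class ThriftConfigSketch {
    enum State { INIT, READY }

    private final ReentrantLock stateLock = new ReentrantLock();
    private State connState = State.INIT;
    private int connectionPoolSize;
    private long requestTimeoutMs;

    void configure(Properties props) {
        stateLock.lock();
        try {
            // Pool size below 1 falls back to the default of 5.
            int pool = Integer.parseInt(props.getProperty("maxConnections", "5"));
            connectionPoolSize = pool < 1 ? 5 : pool;
            // Request timeout below 1000 ms falls back to 20000 ms.
            long timeout = Long.parseLong(props.getProperty("request-timeout", "20000"));
            requestTimeoutMs = timeout < 1000 ? 20000L : timeout;
            connState = State.READY;
        } finally {
            stateLock.unlock(); // always released, even if parsing throws
        }
    }

    State state() { return connState; }
    int connectionPoolSize() { return connectionPoolSize; }
    long requestTimeoutMs() { return requestTimeoutMs; }
    boolean isLocked() { return stateLock.isLocked(); }

    public static void main(String[] args) {
        ThriftConfigSketch client = new ThriftConfigSketch();
        Properties props = new Properties();
        props.setProperty("maxConnections", "0");
        client.configure(props);
        System.out.println(client.state() + " pool=" + client.connectionPoolSize()); // READY pool=5
    }
}
```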

All four clients extend AbstractRpcClient and implement RpcClient.
