Flume MemoryChannel: mind the relationship between transactionCapacity and the sink's batchSize

Source: Internet
Author: User

A Flume error led me to look into transactionCapacity; here is what I found:

I was recently setting up real-time log collection with Flume. With the default configuration the logs did not arrive in real time, so I looked into it and concluded that the MemoryChannel's transactionCapacity was the cause: it defaults to 100, which I took to mean that the sink on the collecting agent commits the transaction (that is, sends to the next destination) only after gathering 100 events. I changed transactionCapacity to 10 to see whether delivery would become more real-time, and the log collection agent reported the following error on startup.

16/04/29 09:36:15 ERROR sink.AbstractRpcSink: Rpc Sink avro-sink: Unable to get event from channel Memorychannel. Exception follows.
org.apache.flume.ChannelException: Take list for MemoryTransaction, capacity, consider committing more frequently, increasing capacity, or increasing thread count
at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doTake(MemoryChannel.java:96)
at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:354)
at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
at java.lang.Thread.run(Thread.java:745)

That puzzled me: why does the default of 100 work, while 10 is rejected as too small? After some research I found that the sink's batchSize parameter was the cause. Here is how the pieces fit together. What does the sink's batchSize mean? It is the number of events the sink takes from the channel and sends at a time, and the send happens inside a transaction. The events of one batch are therefore first placed in a transaction buffer queue (takeList). This is a double-ended queue so that, if the transaction fails, the events can be rolled back, that is, the data taken out of the MemoryChannel is put back into the channel's queue. Its capacity is fixed at the value of transactionCapacity; the source code is: takeList = new LinkedBlockingDeque<Event>(transCapacity); The source code was shared in 1190000003586635.
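The take-and-rollback behavior described above can be sketched with plain JDK classes. This is only an illustration, not Flume's code: the class name TakeListRollbackSketch, the method run, the string events, and the five-event channel queue are all made up for the example. Only the bounded LinkedBlockingDeque sized by the transaction capacity comes from the source line quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical sketch: a channel queue plus a per-transaction takeList.
public class TakeListRollbackSketch {

    // Takes up to batchSize events, then either commits or rolls back.
    // Returns what is left in the channel queue afterwards.
    static List<String> run(int transCapacity, int batchSize, boolean fail) {
        // Stand-in for the channel's main queue, pre-filled with five events.
        LinkedBlockingDeque<String> queue = new LinkedBlockingDeque<>();
        for (int i = 0; i < 5; i++) {
            queue.addLast("event-" + i);
        }

        // The transaction buffer, bounded by the transaction capacity.
        LinkedBlockingDeque<String> takeList = new LinkedBlockingDeque<>(transCapacity);

        // The sink takes one batch; each taken event is remembered in takeList.
        for (int i = 0; i < batchSize && !queue.isEmpty(); i++) {
            takeList.addLast(queue.pollFirst());
        }

        if (fail) {
            // Rollback: push events back onto the head of the queue,
            // newest first, so the original order is restored.
            while (!takeList.isEmpty()) {
                queue.addFirst(takeList.removeLast());
            }
        } else {
            // Commit: the taken events are gone for good.
            takeList.clear();
        }
        return new ArrayList<>(queue);
    }

    public static void main(String[] args) {
        System.out.println("after rollback: " + run(10, 3, true));
        System.out.println("after commit:   " + run(10, 3, false));
    }
}
```

A double-ended queue is what makes the rollback cheap: events leave the head of the channel queue in order and can be pushed back onto the head in reverse order.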

Let's see where this error is thrown:

if (takeList.remainingCapacity() == 0) {
  throw new ChannelException("Take list for MemoryTransaction, capacity " +
      takeList.size() + " full, consider committing more frequently, " +
      "increasing capacity, or increasing thread count");
}
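To see this check trip in isolation, here is a small simulation using only the JDK. It is a sketch under assumptions: TakeListOverflowSketch and takesBeforeFull are hypothetical names, integers stand in for events, and the method returns a count at the point where Flume would throw the ChannelException.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical simulation of a sink filling a bounded takeList.
public class TakeListOverflowSketch {

    // Returns how many events were taken before the takeList was full,
    // or batchSize if the whole batch fit.
    static int takesBeforeFull(int transCapacity, int batchSize) {
        LinkedBlockingDeque<Integer> takeList = new LinkedBlockingDeque<>(transCapacity);
        for (int i = 0; i < batchSize; i++) {
            // Same condition as the check quoted above.
            if (takeList.remainingCapacity() == 0) {
                return i; // Flume would throw ChannelException at this point
            }
            takeList.addLast(i);
        }
        return batchSize;
    }

    public static void main(String[] args) {
        System.out.println(takesBeforeFull(10, 100));  // prints 10
        System.out.println(takesBeforeFull(100, 100)); // prints 100
    }
}
```

With a capacity of 10 and a batch of 100, the eleventh take finds the list full; with both at 100 the whole batch fits, which is why the defaults work together.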

In the case above, the sink tries to take 100 events in one batch and put them into takeList; once 10 have been put in, takeList is full and the next take throws the exception above. The fix is therefore: the channel's transactionCapacity must not be smaller than the sink's batchSize.
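The rule can be checked before starting an agent. The sketch below parses a properties-style configuration and compares the two values. The agent name "agent", the channel name "memoryChannel", the class BatchSizeCheckSketch and the method isConsistent are invented for illustration; the keys transactionCapacity and batch-size follow the Flume memory channel and Avro sink settings, and the fallback of 100 for both is an assumption based on the behavior described in this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical pre-flight check: transactionCapacity must be >= sink batch size.
public class BatchSizeCheckSketch {

    static boolean isConsistent(String config, String capacityKey, String batchKey) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumed defaults of 100 when a key is absent.
        int transactionCapacity = Integer.parseInt(props.getProperty(capacityKey, "100").trim());
        int batchSize = Integer.parseInt(props.getProperty(batchKey, "100").trim());
        return transactionCapacity >= batchSize;
    }

    public static void main(String[] args) {
        // The failing setup from this article: capacity lowered to 10, batch left at 100.
        String config =
              "agent.channels.memoryChannel.type = memory\n"
            + "agent.channels.memoryChannel.transactionCapacity = 10\n"
            + "agent.sinks.avro-sink.type = avro\n"
            + "agent.sinks.avro-sink.batch-size = 100\n";
        System.out.println(isConsistent(config,
            "agent.channels.memoryChannel.transactionCapacity",
            "agent.sinks.avro-sink.batch-size")); // prints false
    }
}
```

If lower latency is the goal, lowering the sink's batch size together with transactionCapacity keeps the two consistent.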
