[Flume] A brief analysis and summary of channels in Flume

Source: Internet
Author: User

Channels are the repositories where events are staged on an agent. A source adds events to the channel, and a sink removes them from it.
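The source, channel, and sink relationship can be sketched as a minimal agent configuration. This is only an illustration: the agent name a1, the component names r1, c1 and k1, and the netcat source and logger sink are assumptions chosen for the example, not something this article prescribes.

```properties
# Name the components of agent a1 (all names here are illustrative)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listens on a local port and adds events to the channel
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: stages events between the source and the sink
a1.channels.c1.type = memory

# Sink: removes events from the channel and logs them
a1.sinks.k1.type = logger

# Wiring: a source can write to several channels, a sink reads from exactly one
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Note that the source property is plural (channels) while the sink property is singular (channel).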

The built-in channels in Flume 1.5.2 include the memory, file, and JDBC channels.

1. Memory Channel

Events are stored in an in-memory queue. This channel is a good choice when high throughput is required and losing data is acceptable if the agent fails.

capacity: the maximum number of events that can be stored in the channel. The default is 100.

transactionCapacity: the maximum number of events the channel accepts from a source or hands to a sink in a single transaction. The default is 100.

keep-alive: the timeout, in seconds, for adding an event to the channel or removing one from it.

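byteCapacity: the limit on the total bytes of all events in the channel. Only the event body is counted, which is why byteCapacityBufferPercentage exists: it reserves a share of that limit for data in the event headers.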

a1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage = 20
a1.channels.c1.byteCapacity = 800000

The biggest drawback of this channel is that any events still held in memory are lost if the agent fails.

2. JDBC Channel

Events are persisted in a database. As of Flume 1.5.2, the only database supported is the embedded Derby.

This is a durable channel, and it suits flows where recoverability matters.

Of course, Derby is unlikely to fit your environment, since most Internet companies now run MySQL. Combining the JDBC channel with MySQL requires custom development tailored to your own situation.
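A minimal JDBC channel configuration might look like the sketch below. The agent and channel names (a1, c1) are assumptions carried over from the memory channel example, and the values shown simply restate what the Flume user guide lists as defaults, so in practice setting only the type is enough.

```properties
a1.channels = c1
a1.channels.c1.type = jdbc

# Database vendor; Derby is the only supported value in this release
a1.channels.c1.db.type = DERBY

# Maximum number of connections to the database
a1.channels.c1.maximum.connections = 10

# Maximum number of events in the channel; 0 means unlimited
a1.channels.c1.maximum.capacity = 0
```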


3. File Channel

By default the File Channel uses paths for its checkpoint and data directories that are within the user home directory (~/.flume/file-channel/checkpoint and ~/.flume/file-channel/data). As a result, if you have more than one File Channel instance active within the agent, only one will be able to lock the directories, and the other channels will fail to initialize. It is therefore necessary to provide explicit paths for all the configured channels, preferably on different disks. Furthermore, as the file channel syncs to disk after every commit, coupling it with a sink or source that batches events together may be necessary to get good performance where multiple disks are not available for the checkpoint and data directories.

Because channel data is synced to disk, performance naturally drops, but in return the checkpoint mechanism protects against data loss.
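As a sketch of the advice above, two file channels in one agent can each be given their own checkpoint and data directories. The channel names c1 and c2 and the /disk1 and /disk2 paths are hypothetical; substitute directories that exist on your own disks.

```properties
a1.channels = c1 c2

# First file channel, on one disk (paths are illustrative)
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /disk1/flume/c1/checkpoint
a1.channels.c1.dataDirs = /disk1/flume/c1/data

# Second file channel, on another disk, so the two never compete for the same directory lock
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /disk2/flume/c2/checkpoint
a1.channels.c2.dataDirs = /disk2/flume/c2/data
```

dataDirs accepts a comma-separated list, so a single channel can also spread its data files across several disks.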


The spillable memory channel, a variant that uses the memory channel and the file channel together, is not covered here, because the official documentation itself advises against using it in production.

The likely reasons are that it still does not fully solve the data loss problem, and that troubleshooting becomes more complicated when something goes wrong in production.

