Distributed Log Import Tool: Flume


Background

Flume is a distributed log-collection system sponsored by Apache. Its main function is to collect the logs generated by each worker in a cluster and deliver them to a specified location.

Why write this article? Because most of the documentation you can find today covers old versions of Flume. Flume 1.x (the flume-ng line) changed a great deal compared with earlier versions, so much of the material in circulation is outdated. Keep this in mind; I list a few newer, more useful references at the end.

Flume's advantages fall into several areas:
* Implemented in Java, so it is highly portable across platforms
* Has a certain degree of fault tolerance, plus mechanisms to protect data from loss
* Provides many ready-made agent components
* Easy to extend, with good developer support

Function


The standalone version takes the form above: three components, namely source, channel, and sink. To use it, simply install Flume and then write the corresponding conf file.
Source: where the log data originates (an agent may have multiple sources, and multiple data-source types are supported)
Channel: a queue-like buffer that stages the received log data
Sink: where the log data is written out (there are many options: print it to the screen, read it into a database, or write it to a specified file)

# Name the components in this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro          # avro is one of Flume's source types; reads the local log file
a1.sources.r1.bind = localhost     # together with the port below, matches the avro-client's address
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = com.waqu.sink.OdpsSink   # matches the package/class name in the sink code
a1.sinks.k1.sink.batchSize = -              # must be greater than 10
a1.sinks.k1.sink.table = *******            # your own hub table and key-id information
a1.sinks.k1.sink.project = *******
a1.sinks.k1.sink.odps.access_id = **********
a1.sinks.k1.sink.odps.access_key = **********
a1.sinks.k1.sink.odps.end_point = ***********
a1.sinks.k1.sink.tunnel.end_point = *******

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.checkpointDir = +
a1.channels.c1.dataDirs = -

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

The following sections introduce these three points in more detail.

Flume Workflow

An agent supports a variety of input sources. The more commonly used types are:
* http: listens on an HTTP port and collects the posted log data
* netcat: listens on a port for Telnet-like text data
* spooldir (spooling directory): watches a directory and ingests newly added files
* avro: receives events sent from a client for a specified file; it does not support real-time monitoring, meaning that if we watch a.log, we will not see entries appended to it afterwards
* exec: runs a command, which makes real-time monitoring of a file possible
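To make the list concrete, here is a minimal sketch of an agent built on the netcat source, following the standard single-node example from the Flume user guide. The agent and component names (a1, r1, k1, c1) and the port are arbitrary choices for illustration:

```properties
# minimal sketch: netcat source -> memory channel -> logger sink
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44445

# logger sink prints events to the console (at INFO level)
a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

With the agent running, anything typed into `telnet localhost 44445` shows up as a Flume event on the console.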

The highlight is the exec source. It is quite handy: it lets the agent run a shell command, so we can use the tail command to monitor new content appended to a file.

tail -F log.txt
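A sketch of what the corresponding source configuration could look like (agent/component names and the file path are placeholders, not from the original article):

```properties
# exec source: run tail -F so the agent sees lines as they are appended
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1
```

Note that because exec relies on an external process, Flume cannot guarantee delivery if that process dies; `tail -F` (rather than `-f`) at least survives log rotation.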
Development

* First use the official SDK package to develop and package a jar file
* Put the jar into Flume's lib directory
* Write the conf file
* Start the agent: flume-ng agent --conf conf --conf-file ./conf/my.conf --name a1 -Dflume.root.logger=INFO,console
* Start the data source: flume-ng avro-client -H localhost -p 44444 -F /home/garvin/log.txt -Dflume.root.logger=INFO,console
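The first step, developing a custom sink jar, can be sketched against the flume-ng-core API. This is only an illustrative outline, not the actual odps_sink code from the repository linked below; the package and class names here are invented:

```java
package com.example.sink; // illustrative package, not the article's com.waqu.sink

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

public class DemoSink extends AbstractSink implements Configurable {

    private int batchSize;

    @Override
    public void configure(Context context) {
        // picks up a1.sinks.k1.batchSize from the conf file, defaulting to 100
        batchSize = context.getInteger("batchSize", 100);
    }

    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction txn = channel.getTransaction();
        txn.begin();
        try {
            for (int i = 0; i < batchSize; i++) {
                Event event = channel.take();
                if (event == null) {
                    break; // channel drained for now
                }
                // deliver event.getBody() to the destination here
            }
            txn.commit();
            return Status.READY;
        } catch (Throwable t) {
            txn.rollback();
            return Status.BACKOFF;
        } finally {
            txn.close();
        }
    }
}
```

The key design point is the channel transaction: events are only committed once the sink has handled them, otherwise the transaction is rolled back and Flume retries, which is where the fault tolerance mentioned earlier comes from. (This sketch compiles only against the flume-ng-core dependency, so it is shown without a standalone test.)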

Recommend a few useful resources:
An example code implementation: https://github.com/waqulianjie/odps_sink
Developer documentation: http://flume.apache.org/FlumeUserGuide.html
A more complete introduction: http://www.aboutyun.com/thread-8917-1-1.html

This article comes from the blog "Bo Li Garvin"
Please indicate the source when reprinting: http://blog.csdn.net/buptgshengod

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
