Log Extraction Framework Flume: Introduction, Installation, and Configuration


  • I: Flume introduction and functional architecture
  • II: Flume installation, configuration, and a simple test
I: Flume introduction and functional architecture

1.1 Flume introduction:

 1.1.1 Flume is a highly available, highly reliable, distributed system provided by Cloudera for collecting, aggregating, and transporting massive amounts of log data. Flume supports customizing all kinds of data senders within a logging system to collect data; at the same time, it provides the ability to do simple processing on the data and write it to various (customizable) data receivers.

 1.1.2 Flume currently has two version lines: the 0.9.x releases are collectively called Flume-og, and the 1.x releases are collectively called Flume-ng. Flume-ng went through a major refactoring and differs considerably from Flume-og, so take care to distinguish the two.
1.2 Flume features:

 1.2.1 Flume is a distributed, reliable, highly available, and very efficient service for collecting, aggregating, and moving large volumes of log data. Flume runs only on Linux.

 1.2.2 Flume is built around streaming data, with a very simple (just write one configuration file) and flexible architecture: a robust, fault-tolerant, easily extensible data model suited to online real-time analytical applications. In practice, you write a source, a channel, and a sink, and a single command then starts the whole thing.

 1.2.3 Flume and Kafka collect data in real time; Spark and Storm process it in real time; Impala queries it in real time.
1.3 Flume structure diagram:

1.4 Explanation of Flume's structure diagram:

 Flume-ng has only one kind of node: the agent. An agent is composed of a source, a channel, and a sink.

 1. An Event is the basic unit of data transfer in Flume.
 2. Flume carries data from its origin to its final destination in the form of events.
 3. An Event consists of an optional header and a byte array carrying the payload.
   3.1 The payload is opaque to Flume.
   3.2 The header is an unordered collection of key-value string pairs; each key is unique within the collection.
   3.3 Headers can be used for contextual routing, as sketched below.
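As a minimal sketch of header-based contextual routing (the agent name a1, the channel names, and the header key "state" are assumptions for illustration, not from the original post), a multiplexing channel selector routes each event to a channel according to a header value:

# Assumed example: route events by the value of the "state" header
a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = state
a1.sources.r1.selector.mapping.CZ = c1    # events with state=CZ go to c1
a1.sources.r1.selector.mapping.US = c2    # events with state=US go to c2
a1.sources.r1.selector.default = c3       # everything else goes to c3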
1.5 Channel/event/sink Chart:

A source monitors some file and, when it obtains data, wraps it in an event and puts/commits it into a channel. The channel is a queue, and the advantage of a queue is first-in, first-out: once events are placed in it, they leave from the other end one by one. The sink actively pulls data from the channel and writes it somewhere, for example to HDFS. A sketch of that flow follows.
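As a hedged sketch of this source-channel-sink flow (the log path, the NameNode address, and the agent name a1 are assumptions for illustration), an exec source can tail a file while an HDFS sink drains the channel:

# Assumed example: tail a file and write the events to HDFS
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events/%Y%m%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1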
II: Installation and configuration of Flume

2.1 Flume installation:
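The original post shows this step only as a screenshot. As a sketch, assuming a Flume binary tarball (the version number here is hypothetical) has already been downloaded to the home directory, the install is just an unpack and rename:

tar -zxf apache-flume-1.5.0-bin.tar.gz -C /home/hadoop/yangyang/
mv /home/hadoop/yangyang/apache-flume-1.5.0-bin /home/hadoop/yangyang/flume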
2.2 Create the configuration files:

cd yangyang/flume/conf
cp -p flume-env.sh.template flume-env.sh
cp -p flume-conf.properties.template flume-conf.properties

Edit flume-env.sh to add the Java environment:

export JAVA_HOME=/home/hadoop/yangyang/jdk
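To confirm the installation before going further (a quick check, not part of the original post), the bundled flume-ng script can print its version:

cd /home/hadoop/yangyang/flume
bin/flume-ng version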

2.3 Install the Telnet package:
yum install -y telnet-*
rpm -ivh netcat-1.10-891.2.x86_64.rpm
2.4 Create the test-conf.properties file:

cd /home/hadoop/yangyang/flume/conf
cp -p flume-conf.properties test-conf.properties
echo "" > test-conf.properties    # empty the file

vim test-conf.properties
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2.5 Run an agent process:

cd /home/hadoop/yangyang/flume
bin/flume-ng agent --conf conf --conf-file conf/test-conf.properties --name a1 -Dflume.root.logger=INFO,console


2.6 Log in with telnet and send test data:
telnet localhost 44444
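In a second terminal, the session looks roughly like the following (the typed line "hello flume" is just an example; the "OK" acknowledgement comes from the netcat source):

$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
hello flume
OK

Each line typed here should then appear on the agent's console as an INFO event printed by the logger sink.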

