Recently the company's business data volume has kept growing, and our previous message-queue-based log system has found it harder and harder to keep up. The symptoms are message backlog, delayed logs, and a log retention period that is too short, so we started to redesign this part of the system. The industry already has a fairly mature, stream-based pipeline for this: Flume collects the logs and sends them to a Kafka queue for buffering, the Storm distributed real-time framework consumes and processes them, short-term data lands in HBase or MongoDB, and long-term data goes into Hadoop for storage. In this series I will write up the problems we ran into and what we learned along the way, both as a memo for myself and to share with anyone who needs it.
Learning Flume NG was not entirely smooth. Most of the Flume material that search engines turn up covers the older Flume OG (Flume Original Generation), which is not what we use. We want Flume NG (Flume Next Generation), so it is best to search for "flume ng" explicitly. For a detailed look at the differences between the two generations, see:
Flume NG: The first revolution in the history of Flume
It is very detailed, covering the changes between the two generations, installation, and configuration, and is worth reading carefully.
On using Flume NG to collect logs, the Meituan team has two good articles you can refer to:
Flume-based log collection system (I): Architecture and design
Flume-based log collection system (II): Improvement and optimization
For learning Flume NG, the official user documentation is the best resource. If reading English is difficult, @javachen wrote an article that explains most of the scenarios covered in the documentation, and I highly recommend it: Flume-ng principle and use
Once the articles above have given you the basic Flume concepts, have a look at this slide deck shared by a Yahoo engineer. (SlideShare is blocked in mainland China. If you cannot get past the firewall, see: How to use the Shadowsocks service to enjoy free online learning)
Feb HUG: Large Scale Data Ingest Using Apache Flume, from Yahoo! Developer Network
Flume NG cluster setup script
Setting up Flume NG is quite simple: run the script below on each machine. The important part is adapting the configuration to each machine in your production environment. The javachen article covers configuration in detail and is worth reading alongside.
#!/bin/bash
# author: xirong
# date: 2015-02-06

#####
# Flume cluster setup script
# Notes:
# 1. Requires a JDK 7 environment; if Java is not installed, configure it first
# 2. The /home/work directory must exist, otherwise installation fails
#####

# Unpack the archive (tar -C needs the target directory to exist)
mkdir -p /home/work/flume_cluster/
tar -zxf apache-flume-1.5.2-bin.tar.gz -C /home/work/flume_cluster/

# Configure the Flume environment
echo '## flume configuration' >> /etc/profile
echo 'export FLUME_HOME=/home/work/flume_cluster/apache-flume-1.5.2-bin' >> /etc/profile
echo 'export PATH=.:$PATH:$FLUME_HOME/bin' >> /etc/profile
source /etc/profile

# Add the Java environment variable
\cp -f $FLUME_HOME/conf/flume-env.sh.template $FLUME_HOME/conf/flume-env.sh
echo 'JAVA_HOME=/opt/jdk1.7.0_75/' >> $FLUME_HOME/conf/flume-env.sh
echo 'Congratulations! Flume has been installed and flume-env.sh has been set!'

# Verify the installation
flume-ng version
## Flume 1.5.2
## Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
## Revision: 229442aa6835ee0faa17e3034bcab42754c460f5
## Compiled by hshreedharan on Wed Nov 12 12:51:22 PST 2014
## From source with checksum 837f81bd1e304a65fcaf8e5f692b3f18
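The script's header lists two prerequisites: a JDK 7 environment and an existing /home/work directory. A minimal sketch of how they could be checked before installing is shown here. The helper names (check_dir, java_major, preflight) are hypothetical and not part of the original script, and "JDK 7" is read as "version 7 or newer", which is an assumption.

```shell
#!/bin/bash
# Pre-flight checks for the install script (a sketch; all helper names are hypothetical).

# Succeeds if the given directory exists.
check_dir() {
    [ -d "$1" ]
}

# Prints the major Java version for a version string,
# handling both the old "1.7.0_75" scheme and the newer "11.0.2" scheme.
java_major() {
    local v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;   # drop the legacy "1." prefix
    esac
    echo "${v%%.*}"           # keep everything before the first dot
}

# Runs both checks; returns non-zero and prints a reason on failure.
preflight() {
    check_dir /home/work || { echo "/home/work does not exist" >&2; return 1; }
    command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
    local ver
    # java -version writes to stderr; the version is the first quoted string
    ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    [ -n "$ver" ] || { echo "could not parse java version" >&2; return 1; }
    [ "$(java_major "$ver")" -ge 7 ] || { echo "JDK 7 or newer required, found $ver" >&2; return 1; }
}
```

Calling preflight || exit 1 at the top of the install script would stop it before anything is written to /etc/profile.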
For a hands-on follow-up, including tuned configuration for production environments and FAQs, see: Flume NG configuration in practice
To start Flume in the background, use the nohup command:
nohup flume-ng agent --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/flume_weblog.conf --name agent1 &
For details, see: Linux tips: Several ways to make a process run reliably in the background
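The nohup command above expects a config file that defines an agent named agent1. The contents of flume_weblog.conf are not given in this post, so the following is only a sketch under assumptions: it writes a minimal config (an exec source tailing a hypothetical /home/work/logs/web.log, a memory channel, and a logger sink) and starts the agent in the background. The function names, the log path, and the capacity values are illustrative, not the author's production settings.

```shell
#!/bin/bash
# Sketch: generate a minimal agent config and start the agent in the background.
# The source, channel and sink chosen here are assumptions for illustration only.

# Writes a minimal single-agent config to the given path.
write_agent_conf() {
    local conf_file="$1"
    cat > "$conf_file" <<'EOF'
# agent1: one source, one channel, one sink
agent1.sources = r1
agent1.channels = c1
agent1.sinks = k1

# Exec source tailing a hypothetical web log
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /home/work/logs/web.log
agent1.sources.r1.channels = c1

# In-memory channel (values are illustrative)
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 10000
agent1.channels.c1.transactionCapacity = 1000

# Logger sink, only useful for checking that events flow
agent1.sinks.k1.type = logger
agent1.sinks.k1.channel = c1
EOF
}

# Starts agent1 with nohup, sends its console output to log_file, prints the PID.
start_agent() {
    local conf_file="$1" log_file="$2"
    nohup flume-ng agent --conf "$FLUME_HOME/conf" --conf-file "$conf_file" \
        --name agent1 > "$log_file" 2>&1 &
    echo "$!"
}
```

For example, write_agent_conf "$FLUME_HOME/conf/flume_weblog.conf" followed by start_agent "$FLUME_HOME/conf/flume_weblog.conf" /tmp/flume_agent1.out. Redirecting output explicitly keeps nohup from creating nohup.out in the current directory.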
As with the earlier JStorm setup, I provide a one-click install script. The zip package contains all the required files and is on Baidu Netdisk (password: usfc). You are welcome to download it and try it out.
If you have any questions, please leave a comment or contact me using the contact information on the right.
Original http://www.ixirong.com/2015/05/18/how-to-install-flume-ng/
Distributed real-time log system (II): Setting up the Flume cluster environment / Flume NG resources