Deployment Readiness
Configure the log collection system (Flume + Kafka), versions:
apache-flume-1.8.0-bin.tar.gz
kafka_2.11-0.10.2.0.tgz
Assume an Ubuntu environment deployed on three worker nodes:
192.168.0.2
192.168.0.3
192.168.0.4
Flume Configuration Instructions
Assume Flume's working directory is /usr/local/flume. Monitor a log file (such as /tmp
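A minimal sketch of what such an agent could look like on Flume 1.8, tailing one log file and publishing to Kafka. The agent name a1, the log path /tmp/example.log, the topic example-logs, and the assumption that a Kafka broker listens on port 9092 on each of the three nodes are all placeholders for illustration; none of them come from the original article.

```properties
# Hypothetical agent "a1": one source, one channel, one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Exec source: follow a log file (path is a placeholder)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /tmp/example.log
a1.sources.r1.channels = c1

# Memory channel buffering events between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Kafka sink (broker list and topic are assumptions)
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = 192.168.0.2:9092,192.168.0.3:9092,192.168.0.4:9092
a1.sinks.k1.kafka.topic = example-logs
a1.sinks.k1.flumeBatchSize = 100
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.channel = c1
```

If this were saved as conf/example.conf under the Flume directory, the agent could be started with bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1.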
Transferred from: http://www.aboutyun.com/thread-7884-1-1.html
Question guide:
1. How do you implement a custom sink on the Flume side that saves logs according to your own rules?
2. How do you configure Flume so that the value of RootPath is read from the Flume configuration file?
Recently I needed to use Flume to collect remote logs, so I learned some
Overview
Flume is a highly available, highly reliable, distributed system for collecting, aggregating, and transmitting massive volumes of logs, provided by Cloudera.
The core of Flume is to collect data from the data source and then send the collected data to the specified destination (sink). To ensure that delivery succeeds, before sending to the destination (sink), the dat
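The source, channel, and sink roles described above can be sketched with the smallest kind of single-node agent. This follows the general shape of the introductory example in the Apache Flume user guide; the agent name a1 and port 44444 are illustrative choices, not values from this article.

```properties
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for lines of text on a local TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: events are held here until the sink has taken them
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: write each event to the agent's log
a1.sinks.k1.type = logger

# Wiring: a source can feed several channels, a sink drains exactly one
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Note the asymmetry in the last two lines: the source property is plural (channels) while the sink property is singular (channel).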
http://blog.csdn.net/alphags/article/details/52862578?locationNum=10fps=1
This article mainly draws on the Apache Flume user documentation (http://flume.apache.org/FlumeUserGuide.html). Because there are not many Chinese resources for Apache Flume 1.x, I am documenting my deployment process here, hoping to give some hints to people with the same needs. (The English documentation is extensive; here I only write so
Question guide: 1. What problems does Flume have? 2. What additional features were built on top of open-source Flume? 3. How is the Flume system tuned?
In "Flume-Based Log Collection System (Part 1): Architecture and Design", we detailed the architecture design of the flume
First, the architecture scheme is as follows:
Second, the components are installed as follows:
1) Zookeeper + Kafka: http://www.cnblogs.com/super-d2/p/4534323.html
2) HBase: http://www.cnblogs.com/super-d2/p/4755932.html
3) Flume installation:
Installing the JDK
Flume requires a Java runtime environment of version 1.6 or later. Download the JDK installation package from the Oracle web site, unzip the instal
Copyright notice: This is an original article by Wang Liang. When reprinting, please cite the source.
Original article link: https://www.qcloud.com/community/article/214
Source: Tengyun https://www.qcloud.com/community
Phenomenon
During long-running operation, the disks of the machines hosting the Flume cluster were found to be full, and the cause was traced to the Flume log directory.
Specific questions
Specifically, Flume's large file
Apache Flume is a distributed, reliable, and efficient log data collection component. We typically use Flume to gather log files scattered across multiple servers in a cluster into a central data platform, which addresses the problem of having to view and compute statistics from discrete log files. Of course, Flume not only collects log files, it also supports the colle
I. Overview
1. Build a highly available Flume deployment to collect data and store it on HDFS. The framework is composed as follows:
II. Agent configuration
1. cat flume-client.properties
# Name the components on this agent Declare the na
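The rest of that client configuration is not shown here. One common way to make a client agent highly available, offered only as a sketch and not as the original author's file, is to give it two Avro sinks pointing at two collector agents and group them under a failover sink processor. The agent name client, the collector host names, and port 4545 are hypothetical.

```properties
# Hypothetical client agent with two Avro sinks
client.sinks = k1 k2

# Primary collector (host name and port are placeholders)
client.sinks.k1.type = avro
client.sinks.k1.hostname = collector1.example.com
client.sinks.k1.port = 4545
client.sinks.k1.channel = c1

# Standby collector
client.sinks.k2.type = avro
client.sinks.k2.hostname = collector2.example.com
client.sinks.k2.port = 4545
client.sinks.k2.channel = c1

# Failover group: the sink with the highest priority is used while healthy
client.sinkgroups = g1
client.sinkgroups.g1.sinks = k1 k2
client.sinkgroups.g1.processor.type = failover
client.sinkgroups.g1.processor.priority.k1 = 10
client.sinkgroups.g1.processor.priority.k2 = 5
# Upper bound, in milliseconds, on the back-off applied to a failed sink
client.sinkgroups.g1.processor.maxpenalty = 10000
```

The source and channel c1 are omitted for brevity; both sinks drain the same channel, and only one of them is active at a time.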
http://blog.csdn.net/hijk139/article/details/8308224
A business system needed to collect monitoring system logs, which brought Hadoop's Flume to mind. After testing, although its functionality is not especially strong, it basically meets the requirements. Flume is a distributed, reliable, and highly available log collection tool, capable of completing log collection, storage, analysis, and other tasks s
Background
Flume is a distributed log management system sponsored by Apache. Its main function is to collect the logs generated by each worker in a cluster to a specific location.
Why write this article? Most of the material that turns up in searches covers old versions of Flume, and the Flume 1.x version, that is, flume-ng, changed a lot from earlier versions. Many of the documents in circulation are
Sqoop: used to import data from a structured data source, such as an RDBMS
Flume: used to move bulk streaming data into HDFS
HDFS: distributed file system used by the Hadoop ecosystem to store data
Sqoop: has a connector architecture. The connector knows how to connect to the appropriate data source and get the data
1. Create an example file under flume/conf and write the following configuration into it:
# Configure the agent; agent1 is the agent name
agent1.sources=source1
agent1.sinks=sink1
agent1.channels=channel1
# Configure source1
agent1.sources.source1.type=spooldir
agent1.sources.source1.spoolDir=/usr/bigdata/flume/conf/test/Hmbbs
agent1.sources.source1.channels=channel1
agent1.sources.source1.fileHeader=false
agent1.sources.so
-round.
3 Implementing the Architecture
The implementation architecture is shown in the following figure:
3.1 Analysis of the Producer Layer
Services within the PaaS platform are assumed to be deployed in Docker containers. To meet the non-functional requirements, a separate process is responsible for collecting logs, so log collection does not intrude on the service framework or its processes. Using Flume NG for log collection, this open s
1. A backlog of cache files occurs in the /flume/fchannel/spool/data/ directory
Possible causes: files are moved (mv) into two monitored directories on the same client at the same time, or multiple clients upload files to the server at the same time
2. After clearing the files in the /flume/fchannel/spool/data/ directory and restarting, files back up in the monitored directory and are not uploaded
The following exception repeats in flume.log: java.lang.IllegalStateE
data loss. Try to use tail -F. Note that the F is uppercase;
2. About channel:
1. We recommend the new composite SpillableMemoryChannel for collection nodes and the memory channel for summary (aggregation) nodes, depending on the actual data volume. Generally, the memory channel is recommended for Flume agents whose data volume exceeds MB per minute (the file channel processing speed is about 2 MB/s, which may vary with machin
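For reference, a spillable memory channel could be declared roughly as follows. This is a sketch under assumptions: the agent name a1, the capacities, and the two directories are illustrative values, and the Flume user guide itself describes this channel type as experimental, so it is worth testing before relying on it.

```properties
# Hypothetical agent "a1" with a spillable memory channel
a1.channels = c1
a1.channels.c1.type = SPILLABLEMEMORY

# Events held in the in-memory queue before spilling to disk
a1.channels.c1.memoryCapacity = 10000

# Events allowed in the on-disk overflow (backed by a file channel)
a1.channels.c1.overflowCapacity = 1000000

# Upper bound on the bytes of event bodies kept in memory
a1.channels.c1.byteCapacity = 800000

# Disk locations for the overflow; both paths are placeholders
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data
```

Setting overflowCapacity to 0 disables the disk overflow, which makes the channel behave like a plain memory channel; setting memoryCapacity to 0 makes it behave like a file channel.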
How is Flume's built-in monitoring integrated? Many people have asked this question. Currently, you can use graphical monitoring tools such as Cloudera Manager and Ganglia, fetch the metrics as JSON strings from a browser, or write a custom reporter to send them to other monitoring systems. What does the monitoring information contain? It is the statistical information of each component, such as the number of successfully received events, the number of successfully sent events, and the nu
Flume is a highly available, highly reliable, distributed system for collecting, aggregating, and transmitting massive volumes of logs, provided by Cloudera. Flume supports customizing various data senders in the log system to collect data, and it can also perform simple processing on the data and write it to various (customizable) data receivers. Currently belong