1. Background information
Many of the company's platforms generate large volumes of logs (typically streaming data, such as search-engine page views (PV) and queries). Handling them requires a dedicated log system, which in general needs the following characteristics:
(1) Bridge the application systems and the analysis systems, decoupling them from each other;
(2) Support both near-real-time online analysis systems and offline analysis systems such as Hadoop;
(3) with high scalabi
In "Flume-based log collection system (a): architecture and design", we detailed the architecture of the Flume-based log collection system and the reasons behind the design. In this section, we describe the problems encountered during actual deployment and use, the functional improvements made to Flume, and the optimizations made to the system.
1 Summa
netstat -ntpl
[root@bigdatahadoop sbin]# ./nginx -t -c /usr/tengine-2.1.0/conf/nginx.conf
nginx: [emerg] "upstream" directive is not allowed here in /usr/tengine-2.1.0/conf/nginx.conf:47
nginx: configuration file /usr/tengine-2.1.0/conf/nginx.conf test failed
Cause: there was one extra }.
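A stray closing brace like the one above can be hard to spot by eye. The following is a minimal Python sketch of a brace-balance check, not an nginx parser: the function name find_unbalanced_braces and the sample text are hypothetical, comments are stripped naively at the first #, and braces inside quoted strings or variables such as ${name} are counted as well.

```python
def find_unbalanced_braces(text):
    """Return (line_number, kind) tuples for unmatched braces.

    kind is "extra_close" for a '}' seen while no block is open, or
    "unclosed_open" for a '{' that is never closed.
    """
    problems = []
    open_lines = []  # stack of line numbers where a '{' was seen
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive: drop everything after '#'
        for ch in code:
            if ch == "{":
                open_lines.append(lineno)
            elif ch == "}":
                if open_lines:
                    open_lines.pop()
                else:
                    problems.append((lineno, "extra_close"))
    problems.extend((n, "unclosed_open") for n in open_lines)
    return problems


# Hypothetical configuration text with one closing brace too many.
sample = """\
http {
    server {
        listen 80;
    }
}
}
"""
print(find_unbalanced_braces(sample))  # [(6, 'extra_close')]
```

The check reports the line where the brace count goes negative. When an extra } closes a block early in the middle of a file, that line may be later than the misplaced brace itself, so the result is a starting point for inspection, not a diagnosis.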
16/06/26 14:06:01 WARN node.AbstractConfigurationProvider: No configuration found for this host:clin1
Possible cause: the Java environment variable (although this may not be what is wrong).
org.apache.commons.cli.ParseException: The specified configuration file does not exist
Flume supports configuring agents through ZooKeeper, but this is an experimental feature. The configuration file must first be uploaded to ZooKeeper. The ZooKeeper node tree for agents a1 and a2 looks like this:
- /flume
 |- /a1 [agent configuration file]
 |- /a2 [agent configuration file]
Classes that process the configuration file:
org.apache.flume.node.PollingZooKeeperConfigurationProvider: If
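To make the node layout above concrete, here is a small Python sketch that derives the znode path for an agent and assembles the flume-ng command line for starting it from ZooKeeper. The -z (ZooKeeper connection string) and -p (base path) options come from the Flume user guide. The helper names znode_for and zk_agent_command and the host names zk1 and zk2 are assumptions made for illustration.

```python
def znode_for(agent_name, base_path="/flume"):
    """Path of the znode that holds one agent's configuration file."""
    return base_path.rstrip("/") + "/" + agent_name


def zk_agent_command(agent_name, zk_hosts, base_path="/flume", conf_dir="conf"):
    """Argument list for starting an agent whose configuration lives in ZooKeeper."""
    return [
        "bin/flume-ng", "agent",
        "--conf", conf_dir,
        "-z", ",".join(zk_hosts),  # comma-separated host:port list
        "-p", base_path,           # base path under which agent configs are stored
        "--name", agent_name,
    ]


# zk1 and zk2 are hypothetical ZooKeeper hosts.
print(znode_for("a1"))
print(" ".join(zk_agent_command("a1", ["zk1:2181", "zk2:2181"])))
```

The agent name passed with --name has to match the znode name under the base path (a1 or a2 in the tree above).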
Single-node Flume deployment
1 Hadoop preparation
Create the flume directory in HDFS and assign ownership of the flume directory to the flume user:
hdfs dfs -mkdir flume
hdfs dfs -chown -R flume:flume /flume
2 flume-env.sh
Enter ${FLUME_HOME}/conf
cp
Tags: Flume
I won't go over the basic Flume demo here; you can search for it yourself. Most of the material online, however, covers Flume 1.4, and Flume 1.5 brought major changes. If you're ready to try it, I'll walk you through a minimal program structure in which data is stored in MongoDB using MongoSink. Completely independent of execution, wit
1. Overview
-> Three functions of Flume: collecting, aggregating, and moving
2. Block diagram
3. Architectural features
-> On streaming data flows: based on streaming data
Data flow: a job gets data continuously
Task flow: JOB1->JOB2->JOB3JOB4
-> For online analytic applications.
-> Flume only runs in a Linux environment
What if my log server is Windows?
-> Very simple: write a configuration file,
How to do the integration is actually quite simple, and there is a tutorial online: http://blog.csdn.net/fighting_one_piece/article/details/40667035
I used the first integration approach. When you actually do it, all sorts of problems come up. It took me roughly from 5 a.m. to 6:30 p.m. on 2014.12.17. Summed up, it is really very simple, but it took a long time. Every setback like this makes you a little wiser.
Question 1: a variety of packages need to be referenced, these packages to bre
Today's meeting discussed why log processing uses both Flume and Kafka: is it possible to use only Kafka, without Flume? The idea was to use Flume only for its interfaces, both the input interfaces (socket and file) and the output interfaces (Kafka, HDFS, HBase, etc.). Considering a single scenario, and from the perspective of simplifying the system, it might be better to use
Installing Flume on CentOS 6.5
Flume is installed here because it is used to collect and analyze game business logs.
1. Install the Java environment
rpm -ivh jdk-8u51-linux-x64.rpm
Preparing... ########################################### [100%]
1:jdk1.8.0_51 ########################################### [100%]
Unpacking JAR files...
rt.jar...
jsse.jar...
charsets.jar...
tools.jar...
localedata.ja
From: http://flume.apache.org/FlumeUserGuide.html#data-flow-model
Learning Flume through translation.
Introduction
Apache Flume is a distributed, reliable, and highly available system for efficiently collecting, aggregating, and moving large amounts of log data from many different data sources into a centralized data store.
The application scenarios of Apache
Deployment preparation
Configure the log collection system (Flume + Kafka). Versions:
apache-flume-1.8.0-bin.tar.gz
kafka_2.11-0.10.2.0.tgz
Suppose the system runs on Ubuntu and is deployed on three worker nodes:
192.168.0.2
192.168.0.3
192.168.0.4
Flume configuration instructions
Suppose Flume's working directory is /usr/local/flume. Monitor a log file (such as /tmp
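As a rough illustration of what such a configuration could contain, the Python sketch below renders a Flume properties file with an exec source that tails a log file, a memory channel, and a Kafka sink. The property keys (type = exec, command, kafka.bootstrap.servers, kafka.topic) are those documented for Flume 1.8. Everything else is an assumption: the function name render_agent_config, the component names r1, c1 and k1, the channel sizes, the log path /tmp/example.log, the topic name, and the idea that Kafka brokers listen on port 9092 on the three nodes.

```python
def render_agent_config(agent, log_path, brokers, topic):
    """Render a minimal exec-source -> memory-channel -> Kafka-sink configuration."""
    broker_list = ",".join(brokers)
    lines = [
        f"{agent}.sources = r1",
        f"{agent}.channels = c1",
        f"{agent}.sinks = k1",
        "",
        # Source: follow the monitored log file.
        f"{agent}.sources.r1.type = exec",
        f"{agent}.sources.r1.command = tail -F {log_path}",
        f"{agent}.sources.r1.channels = c1",
        "",
        # Channel: in-memory buffer (sizes are illustrative).
        f"{agent}.channels.c1.type = memory",
        f"{agent}.channels.c1.capacity = 10000",
        f"{agent}.channels.c1.transactionCapacity = 1000",
        "",
        # Sink: publish events to a Kafka topic.
        f"{agent}.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink",
        f"{agent}.sinks.k1.kafka.bootstrap.servers = {broker_list}",
        f"{agent}.sinks.k1.kafka.topic = {topic}",
        f"{agent}.sinks.k1.channel = c1",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values: log path, topic name, and broker port are assumptions.
config = render_agent_config(
    agent="a1",
    log_path="/tmp/example.log",
    brokers=["192.168.0.2:9092", "192.168.0.3:9092", "192.168.0.4:9092"],
    topic="flume-logs",
)
print(config)
```

The printed text could be saved under the conf directory of the Flume installation and passed to flume-ng with --conf-file, with the agent started as --name a1.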
Reposted from: http://www.aboutyun.com/thread-7884-1-1.html
Guiding questions:
1. How can a custom sink be implemented on the Flume side so that logs are saved according to our own rules?
2. How should the configuration be set up to read the value of RootPath from the Flume configuration file?
Recently I needed to use Flume to collect remote logs, so I learned some
Flume knowledge points:
An event is one row of data.
1. Flume is a distributed log collection system that transmits collected data to its destination.
2. Flume has a core concept called an agent. The agent is a Java process that runs on the log collection node.
3. An agent consists of 3 core components: source, channel, sink.
3.1 The source component is dedicated to collecti
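The agent structure described above (source, channel, sink) can be pictured with a toy model. This is a minimal Python sketch and not Flume's actual API: the names Channel, run_source and run_sink are hypothetical, and each event is simply one row of text.

```python
from collections import deque


class Channel:
    """Toy in-memory buffer between a source and a sink (not Flume's API)."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        if len(self._events) >= self.capacity:
            raise OverflowError("channel is full")
        self._events.append(event)

    def take(self):
        # Return the oldest event, or None when the channel is empty.
        return self._events.popleft() if self._events else None


def run_source(lines, channel):
    # Source: turn each input row into one event and put it on the channel.
    for line in lines:
        channel.put(line.rstrip("\n"))


def run_sink(channel, destination):
    # Sink: drain the channel and deliver each event to the destination.
    while True:
        event = channel.take()
        if event is None:
            break
        destination.append(event)


channel = Channel()
delivered = []
run_source(["GET /a\n", "GET /b\n"], channel)
run_sink(channel, delivered)
print(delivered)  # ['GET /a', 'GET /b']
```

The channel decouples the two sides: the source only puts events in and the sink only takes them out, so either side can be swapped without touching the other.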
Flume Introduction and Use (1)
Flume introduction
Flume is a distributed, reliable, and available service that efficiently collects, aggregates, and moves massive amounts of data from different data sources.
Distributed: multiple machines can collect data at the same time, and data is transmitted between different agents over the network.
Reliable: Flume w
Overview
Flume: a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large-scale log data.
We build a Flume + Spark Streaming platform that gets data from Flume and processes it.
There are two ways to do this: use the Flume-style push-based approach, or use a custo
entire data transfer process, what flows is the event. Transactions are guaranteed at the event level. Flume supports multi-level Flume agents, as well as fan-in and fan-out.
II. Environment preparation
1) Hadoop cluster (the author uses version 2.7.3 with 6 nodes in total; see http://www.cnblogs.com/qq503665965/p/6790580.html)
2) Flume cluster
http://flume.apache.org/install
1. Upload
2. Unzip
3. Modify the JDK directory in the conf/flume-env.sh file. Note: if transferring a very large file causes an out-of-memory error, modify the JAVA_OPTS setting.
4. Verify that the installation succeeded: ./flume-ng version
5. Configure environment variables: export FLUME_HOME=/home/apache-flu
The data source used in the previous article took data from a socket, which is a bit unorthodox; the proper approach is to take data from a message queue such as Kafka. The main supported sources, according to the official website, are as follows. Data can be acquired by either push or pull.
I. Spark Streaming integration with Flume
1. The push approach
The pull approach is the more recommended one. Introduce the dependency: Dependency