Business background: send the log files generated by a Java project to Flume.
Step one: output the logs to Flume. Configure log4j in the Java program and specify which Flume server receives the output:
log4j.rootLogger=INFO,flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=192.168.13.132
log4j.appender.flume.Port=41414
Step two: import java.util.
Spooling Directory Source: two parameters are explained below, fileHeader and fileHeaderKey.
fileHeader is a Boolean that can be set to true or false. It controls whether the file name is added to the headers of each event that Flume builds after reading the data.
fileHeaderKey takes effect when the event carries this header (that is, when fileHeader is set to true): the header stores the file name in the basename
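As a rough illustration of the two parameters above, the following Python sketch models what the headers of one event might look like. It is not Flume code: the function build_headers and the sample paths are hypothetical, and the default key name "file" is taken from the Flume user guide's documented default for fileHeaderKey.

```python
def build_headers(path, file_header=False, file_header_key="file"):
    """Return the headers a spooled-file event would carry (illustrative model only)."""
    headers = {}
    if file_header:
        # With fileHeader enabled, the file name is stored under fileHeaderKey.
        headers[file_header_key] = path
    return headers


if __name__ == "__main__":
    # Hypothetical path, used only to show the three cases.
    print(build_headers("/var/log/app/access.log"))
    print(build_headers("/var/log/app/access.log", file_header=True))
    print(build_headers("/var/log/app/access.log", file_header=True, file_header_key="src"))
```

With fileHeader left at false the event carries no file-name header at all, which is why fileHeaderKey only matters once fileHeader is switched on.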
pool. Each sink has a priority; the larger the value, the higher the priority, so a priority of 100 ranks above a priority of 80. If a sink fails to send an event, the available sink with the next-highest priority attempts to send the failed event.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
The sink group configured above contains two sinks, k1 and k2 ...
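The failover behaviour described above can be sketched in Python as a toy model. This is not Flume's implementation: the class FailoverSelector and its method names are hypothetical, and the doubling penalty that starts at 1000 ms is an assumption made for illustration. Only the priority rule and the maxpenalty cap come from the text and the configuration.

```python
class FailoverSelector:
    """Toy model of failover: the available sink with the largest priority wins."""

    def __init__(self, priorities, max_penalty_ms=10000):
        self.priorities = dict(priorities)
        self.max_penalty_ms = max_penalty_ms
        self.blocked_until = {}  # sink name -> time (ms) at which it may be retried
        self.failures = {}       # sink name -> consecutive failure count

    def pick(self, now_ms):
        """Return the available sink with the largest priority value, or None."""
        available = [name for name in self.priorities
                     if self.blocked_until.get(name, 0) <= now_ms]
        if not available:
            return None
        return max(available, key=lambda name: self.priorities[name])

    def report_failure(self, sink, now_ms):
        """Move a failed sink to the penalty pool; the penalty grows but is capped."""
        count = self.failures.get(sink, 0) + 1
        self.failures[sink] = count
        # Assumed backoff: 1000 ms doubled per consecutive failure, capped at maxpenalty.
        penalty = min(1000 * 2 ** (count - 1), self.max_penalty_ms)
        self.blocked_until[sink] = now_ms + penalty

    def report_success(self, sink):
        """Clear the failure history of a sink that delivered successfully."""
        self.failures.pop(sink, None)
        self.blocked_until.pop(sink, None)


if __name__ == "__main__":
    selector = FailoverSelector({"k1": 5, "k2": 10}, max_penalty_ms=10000)
    print(selector.pick(now_ms=0))       # k2 has the larger priority value
    selector.report_failure("k2", now_ms=0)
    print(selector.pick(now_ms=500))     # k2 is penalised, so k1 takes over
```

With the priorities from the configuration (k1 = 5, k2 = 10), k2 is preferred while it is healthy and k1 acts as the standby.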
1. Flume is a distributed log collection system that transmits collected data to its destination.
2. Flume has a core concept called the agent. The agent is a Java process that runs on the log collection node.
3. The agent consists of three core components: source, channel, and sink.
3.1 The source component is dedicated to collecting logs and can handle log data of various types and formats, including Avro, th
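The source, channel, and sink roles listed above can be pictured with a small in-process model. The Python sketch below is only an analogy under stated assumptions: MemoryChannel, run_source, and run_sink are hypothetical names, and the event layout (a dict with headers and body) is a simplification, not Flume's Java API.

```python
from collections import deque


class MemoryChannel:
    """Bounded in-memory buffer that sits between a source and a sink."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.events = deque()

    def put(self, event):
        if len(self.events) >= self.capacity:
            raise RuntimeError("channel is full")
        self.events.append(event)

    def take(self):
        """Return the oldest buffered event, or None when the channel is empty."""
        return self.events.popleft() if self.events else None


def run_source(lines, channel):
    """Source: wrap each collected log line in an event and put it on the channel."""
    for line in lines:
        channel.put({"headers": {}, "body": line})


def run_sink(channel, destination):
    """Sink: drain the channel and deliver each event body to the destination."""
    while True:
        event = channel.take()
        if event is None:
            break
        destination.append(event["body"])


if __name__ == "__main__":
    channel = MemoryChannel(capacity=10)
    delivered = []
    run_source(["GET /index", "GET /search"], channel)
    run_sink(channel, delivered)
    print(delivered)
```

The channel is what decouples the two ends: the source can keep collecting while the sink delivers at its own pace, up to the channel's capacity.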
Flume configuration: collect data and transfer it to the Kafka cluster. Create a new configuration file under the conf directory:
vim conf/file-monitor.conf
# Declare the agent
a1.sources=r1
a1.sinks=k1
a1.channels=c1
# Define the data source
a1.sources.r1.type=exec
a1.sources.r1.command=tail -f /data/xx.log
a1.sources.r1.channels=c1
# Filter
a1.sources.r1.interceptors=i1
a1.sources.r1.interceptors.i1.typ
Configuring a Flume cluster. Reference: https://www.cnblogs.com/jifengblog/p/9277793.html
Load balancing: introduction
Load balancing is a technique for the case where a single machine (or a single process) cannot handle all requests.
The Load Balancing Sink Processor implements load balancing. For example, Agent1 acts as a routing node that distributes the events staged in its channel across several sink components, and each
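One common way to spread events across several sinks is round-robin rotation, which the Flume user guide lists as a selector for the load-balancing sink processor. The Python sketch below shows only the rotation idea. The function round_robin_dispatch and the sink names k1 and k2 are hypothetical, and it leaves out failure handling and backoff.

```python
from itertools import cycle


def round_robin_dispatch(events, sinks):
    """Assign events to sinks in rotation and return {sink: [events]}."""
    if not sinks:
        raise ValueError("at least one sink is required")
    assignment = {sink: [] for sink in sinks}
    rotation = cycle(sinks)
    for event in events:
        # Each event goes to the next sink in the rotation.
        assignment[next(rotation)].append(event)
    return assignment


if __name__ == "__main__":
    print(round_robin_dispatch(["e1", "e2", "e3", "e4", "e5"], ["k1", "k2"]))
```

Rotation keeps the per-sink load within one event of each other, which is the property the routing node relies on.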
Preparation
ELK official website: https://www.elastic.co/ (package downloads and complete documentation).
ZooKeeper official website: https://zookeeper.apache.org/
Kafka official website: http://kafka.apache.org/documentation.html (package downloads and complete documentation).
Flume official website: https://flume.apache.org/
Heka official website: https://hekad.readthedocs.io/en/v0.10.0/
The system is a 64-bit CentOS 6.6 machine.
Version of the softwa
1. Background information
Many of the company's platforms generate large volumes of logs every day (typically streaming data such as search engine page views and queries). Processing these logs requires a dedicated logging system. In general, such a system needs the following characteristics:
(1) Act as a bridge between the application systems and the analysis systems, and decouple them from each other;
(2) Support near-real-time online analysis systems as well as offline analysis systems ...
To achieve near-real-time search, there must be a mechanism that processes data in real time and writes it into the Solr index. Flume NG provides exactly such a mechanism: it collects data in real time, runs ETL on it through MorphlineSolrSink, and finally writes it to the Solr index, so that newly arrived data can be queried in the Solr search engine in near real time.
Build steps:
1. This is only a demo, so we create a new file
Article from http://www.cnblogs.com/hark0623/p/4205756.html; please credit the source when reprinting.
The more I use Flume, the more questions I have. My plan for this month is to read the Flume source code in the hope of answering them. Once they are answered, I will post the process and conclusions on the blog and add the links to this post. The questions are as follows:
1. From reading the official website, I found how to request JSON to obtain
OK, straight to the practical content.
While using Flume NG I ran into a lot of pitfalls. Here is a summary, in the hope that you can avoid them and become proficient with Flume.
The first pitfall: a file cannot be decoded correctly, so it cannot be renamed correctly. Once the exception is thrown, Flume can no longer collect any files. This is a fairly serious error, caused by Flume
1. Download
http://www.apache.org/dist/flume/stable/
Download the latest tar.gz package.
2. Decompress
tar -zxvf ....
3. Configure environment variables
FLUME_HOME and PATH
Remember to run source /etc/profile
4. Add a simple test case
A. Create a file in the conf directory, test-conf.propertis, with the following content:
# Define the alias (sources -> channels -> sinks)
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# Describe the source
a1.sources.s1.type = avro
a
Note: Environment: Sklin-linux
How to download Flume:
wget http://www.apache.org/dyn/closer.lua/flume/1.6.0/apache-flume-1.6.0-bin.tar
After the download completes, unpack it with tar:
tar -zvxf apache-flume-1.6.0-bin.tar
Go into Flume's conf directory, run touch flume.conf, and then cp
First, overview:
This section first walks through a data transfer flow based on a netcat source + memory channel + logger sink, then dissects the code execution logic in NetcatSource.
Second, Flume configuration file:
The following configuration file, netcat.conf, defines a source of type netcat that listens on port 44444.
# Name the components on this agent
a1.sources=r1
a1.sinks=k1
a1.channels=c1
# Describe/configure the source
a1.sources.r1.type=netcat
a1
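According to the Flume user guide, the netcat source turns each line of text it receives into one event. The Python sketch below models only that line-splitting step and is not the NetcatSource code. The function lines_to_events and the dict-based event layout are hypothetical, and stripping a trailing carriage return is an assumption of this sketch.

```python
def lines_to_events(buffer):
    """Split received bytes into one event per complete line.

    Returns (events, remainder), where remainder is a trailing partial line
    that should be kept until more data arrives.
    """
    *complete, remainder = buffer.split(b"\n")
    events = [{"headers": {}, "body": line.rstrip(b"\r")} for line in complete]
    return events, remainder


if __name__ == "__main__":
    events, rest = lines_to_events(b"hello\r\nworld\npart")
    print([event["body"] for event in events])
    print(rest)
```

Keeping the remainder separate matters for a network source, because a single read from the socket can end in the middle of a line.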