Apache Flume restart script
Restarting Flume the ordinary way tends to leave multiple processes behind, and kill does not clean them all up, so it is worth writing a restart script. The -e flag of echo lets the script print the kill notice in red; shell color escape codes are easy to find online. cat bi-track_restart.sh shows a script along the following lines (the original is truncated; see the sketch below).
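A minimal sketch reconstructed from the truncated excerpt; the restart command at the end is an assumption about this deployment, not part of the original:

#!/bin/bash
# Find the PID(s) of the Flume/Java process listening on port 8787
pid=$(lsof -i:8787 | grep java | awk '{print $2}')
if [ -n "${pid}" ]; then
    # Print the kill notice in red (\033[31m ... \033[0m)
    echo -e "############\033[31m kill ${pid} \033[0m############"
    for i in ${pid}; do
        kill -9 "${i}"
    done
fi
# Assumed restart command; adjust the agent name and config path to your setup
nohup bin/flume-ng agent --conf conf --conf-file conf/bi-track.conf --name a1 >/dev/null 2>&1 &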
Log4j output directly to Flume
This jar is a utility provided by Cloudera's CDH distribution; with a little configuration it lets log4j write its log output directly into Flume, which simplifies log collection. In CDH 5.3.0 it is flume-ng-log4jappender-1.5.0-cdh5.3.0-jar-with-dependencies.jar, located in /opt/cloudera/parcels/CDH/lib/flume-ng/tools/. A usage example follows.
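A minimal sketch: put the jar on the application classpath, then point a Log4jAppender at a Flume agent that runs an Avro source. The hostname and port below are placeholders:

# log4j.properties (sketch); hostname and port are placeholders
log4j.rootLogger=INFO, flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=flume-host
log4j.appender.flume.Port=41414

On the Flume side, the agent needs a matching Avro source (a1.sources.r1.type = avro, with bind and port matching the appender).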
1. Install the JDK: refer to the JDK installation guide.
2. Install Zookeeper: refer to the "Fully distributed" section of my Zookeeper installation tutorial.
3. Install Kafka: refer to the "Fully distributed build" section of my Kafka installation tutorial.
4. Install Flume: refer to my Flume installation tutorial.
5. Configure Flume
5.1. Configure kafka-s.cfg
$ cd /software/flume/conf/
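The excerpt cuts off before the file content. A sketch of what a kafka-s.cfg might contain, assuming an exec source tailing a log file into the Flume 1.6 Kafka sink; the log path, broker list, and topic are placeholders:

# kafka-s.cfg (sketch)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# exec source tailing an assumed log file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Flume 1.6 Kafka sink; broker list and topic are placeholders
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.brokerList = kafka1:9092,kafka2:9092
a1.sinks.k1.topic = flume-logs

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1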
First, the requirement
Use Flume to capture file data on Linux and deliver it into the Kafka cluster.
Environment: the Zookeeper cluster and the Kafka cluster are already installed.
Second, configuring Flume
Download Flume from the official website; this article uses flume-1.6.0.
Official address: http://flume.apache.org/download.html
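For reference, fetching and unpacking the 1.6.0 binary from the Apache archive (the mirror choice is up to you):

wget http://archive.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
tar -zxvf apache-flume-1.6.0-bin.tar.gz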
1. Overview
Flume is a high-performance, highly available distributed log collection system originally developed at Cloudera.
The core of Flume is to collect data from a data source and send it to a destination. To ensure the transfer succeeds, the data is cached before being sent, and the cached copy is deleted only once the data has actually arrived at the destination.
First, Flume basics
Flume is a log collection framework. Background: logs are scattered across many machines, but we want to analyze them on a big data platform, so they must be collected from the various servers and moved to the cluster, with monitoring, timeliness, fault tolerance, and load balancing. Flume is normally driven by a configuration file. Overview and downloads: flume.apache.org
Structure:Nginx-flume->kafka->flume->kafka (because involved in the cross-room problem, between the two Kafka added a flume, egg pain. )Phenomenon:In the second layer, write Kafka topic and read Kafka topic same, manually set sink topic does not take effectTo open the debug log:SOURCE instantiation:APR 19:24:03,146 INFO [conf-file-poller-0] (org.apache.flume.sour
Flume's HDFS write happens in the HdfsEventSink.process method, and path construction is done by BucketPath. Analyzing its source code (ref: http://caiguangguang.blog.51cto.com/1652935/1619539) shows that %{} variable substitution can be used: you only need to put the time field from the event (the local time in the Nginx log) into a header and reference it in hdfs.path. The concrete implementation: 1. In the KafkaSource process method, add a line that extracts the date via a helper (the original snippet, dt = KafkaSourceUtil.getDate..., is truncated here).
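If patching the source is undesirable, the built-in regex_extractor interceptor achieves the same effect without custom code. A sketch, assuming the Nginx log line starts with a yyyy-MM-dd date (the regex is an assumption about the log format):

# extract the leading date of each line into a "dt" header
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
a1.sources.r1.interceptors.i1.regex = ^(\\d{4}-\\d{2}-\\d{2})
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.name = dt

# reference the header in the sink path
a1.sinks.k1.hdfs.path = hdfs://nameservice1/logs/%{dt}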
Transferred from: http://www.cnblogs.com/lxf20061900/p/4014281.html
The pathname of the HDFS sink in Flume-NG (the parameter hdfs.path, which must not be empty) and the file prefix (the parameter hdfs.filePrefix) support escape sequences that are resolved against a timestamp, so directories and file prefixes can be created automatically by time. In practice, note that this built-in parsing requires a timestamp header on each event (the original text is cut off here).
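A sketch of the standard escape-sequence usage; the path and prefix are placeholders:

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://nameservice1/logs/%Y%m%d/%H
a1.sinks.k1.hdfs.filePrefix = events-
# escape sequences need a "timestamp" header on each event;
# alternatively, let the sink use the agent's local clock:
a1.sinks.k1.hdfs.useLocalTimeStamp = true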
Flume architecture and core components
(1) Source: collection; responsible for where the data is gathered from
(2) Channel: buffering and recording
(3) Sink: output
Official documents:
http://flume.apache.org/FlumeUserGuide.html
http://flume.apache.org/FlumeUserGuide.html#starting-an-agent
Flume usage approach: the key to using Flume is writing the configuration file:
(1) Configure the source
(2) Configure the channel
(3) Configure the sink
(4) String the above three components together (see the sketch below)
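A minimal skeleton of those four steps in one file; all names are placeholders:

# (1) declare the source, (2) the channel, (3) the sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# (4) string them together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1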
Capture a directory to HDFS
Using Flume to capture a directory requires the HDFS cluster to be started.
vi spool-hdfs.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
## Note: files in the monitored directory must not reuse the same file name
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /root/Logs2
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1

(The excerpt truncates in the middle of the HDFS settings; a sketch of the remainder follows.)
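A hedged completion of the truncated HDFS sink settings plus the channel wiring; the path and roll values are assumptions:

a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.fileType = DataStream

# memory channel and source binding
a1.channels.c1.type = memory
a1.sources.r1.channels = c1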
Label:
Flume is a highly available, highly reliable system for the distributed collection, aggregation, and transport of massive logs. Consider the model: each Flume agent provides one Flume service, and each agent has three members: source, channel, and sink. As shown, data is fetched by the source and sent to the channel; the channel acts like a buffer, and the sink reads data from the channel.
Recently, Flume was used for data collection, and the spooldir source showed the following problems:
If a line of a file contains garbled characters that do not comply with the specified encoding, Flume throws an exception and stops there.
Once a file in the folder watched by spooldir is modified, Flume throws an exception and stops there.
(A mitigation sketch for the first problem follows.)
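A sketch assuming a reasonably recent Flume release; the spoolDir path and ignore pattern are placeholders. decodeErrorPolicy tells the source to skip or replace undecodable bytes instead of dying:

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /data/spool
# IGNORE (or REPLACE) skips undecodable bytes; the default FAIL stops the source
a1.sources.r1.decodeErrorPolicy = IGNORE
# skip files that are still being written; the pattern is a placeholder
a1.sources.r1.ignorePattern = ^.*\.tmp$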
First, netcat source + memory channel + logger sink
1. Modify configuration
1) Modify the flume-env.sh file under $FLUME_HOME/conf as follows:
export JAVA_HOME=/opt/modules/jdk1.7.0_67
2) Under the $FLUME_HOME/conf directory, create an agent subdirectory and a new netcat-memory-logger.conf with the following configuration:
# netcat-memory-logger
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
(The excerpt truncates here; a sketch of the remainder follows.)
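The remainder presumably follows the standard netcat example from the Flume user guide; host and port below are the usual defaults:

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1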
Background: Kafka serves as the message bus, so the data of every system can be aggregated at the Kafka nodes; the next task is to maximize the value of that data and let it "speak".
Environment preparation:
Kafka server.
CDH 5.8.3 server with the Flume, Solr, Hue, HDFS, and Zookeeper services installed.
Flume provides a scalable, real-time data transmission channel, Morphline provides lightweight ETL functionality, and SolrCloud + Hue provide high-performance search.
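A sketch of the Flume leg of that pipeline, assuming the Kafka source and MorphlineSolrSink shipped with CDH; the Zookeeper address, topic, and morphline path are placeholders:

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source; connection details are placeholders
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.zookeeperConnect = zk1:2181
a1.sources.r1.topic = logs

a1.channels.c1.type = memory

# Morphline/Solr sink; the morphline file path is a placeholder
a1.sinks.k1.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
a1.sinks.k1.morphlineFile = /etc/flume-ng/conf/morphlines.conf

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1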
Environment description: master server IP 192.168.80.128
1. Prepare the apache-flume-1.7.0-bin.tar file
2. Upload it to the master (192.168.80.128) server
3. Unpack apache-flume-1.7.0-bin.tar:
tar -xvf apache-flume-1.7.0-bin.tar
4. Enter Flume's configuration file directory:
cd /apache-flume-1.7.0-bin/conf
5. Modify the configuration file (the excerpt is truncated here)
1. First you need to know whether Flume's HTTP monitoring is enabled; refer to the blog post on Flume monitoring parameters. When it is, http://localhost:3000/metrics serves the metrics content.
2. Install the Flume monitor plugin in Open-Falcon; refer to the official documentation http://book.open-falcon.org/zh_0_2/usage/flume.html. The official documentation is rather unclear, so follow the next steps instead (the excerpt is truncated here).
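For reference, HTTP monitoring is enabled with JVM properties at agent start; a sketch using port 3000 to match the URL above (the agent name and config file are placeholders):

bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 \
  -Dflume.monitoring.type=http \
  -Dflume.monitoring.port=3000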
Flume architecture
Flume consists mainly of three components: source, channel, and sink. These three components carry events along Flume's data flow, or pipeline. Their function is described in Flume's own introduction: when a Flume source receives an event, it stores it into one or more channels. The channel is a passive store that keeps the event until it is consumed by a Flume sink.
Flume 1.7 Installing and running under Windows
Install Java and configure environment variables.
Install Flume: download it from the official website http://flume.apache.org/ and simply unpack it.
Second, running Flume
Create a configuration file: under the unpacked directory, create apache-flume-1.6.0-bin/conf/example.conf as follows; the content is the standard netcat example shown earlier.
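A sketch of starting the agent on Windows, assuming the distribution's flume-ng.cmd wrapper and the netcat example.conf; the agent name a1 is a placeholder:

cd apache-flume-1.6.0-bin
bin\flume-ng.cmd agent --conf conf --conf-file conf\example.conf --name a1 -Dflume.root.logger=INFO,console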