Flume - a first look at Flume: sources and sinks
Directory: basic concepts; common sources; common sinks.
Basic concepts. What is Flume? A distributed, reliable tool for collecting, aggregating, and moving large volumes of log data.
Event: an event is the byte data of one line of input, and is the basic unit of data that Flume transfers.
I. Introduction to Flume
Flume is a distributed, reliable, and highly available system for aggregating massive logs. It allows the customization of various data senders for data collection, and it also provides the ability to do simple processing of the data and write it to various (customizable) data receivers.
Design goals:
(1) Reliability
When a node fails, logs can be transmitted to other nodes without loss.
I recently learned how to use Flume, in line with my company's plan to develop an independent log system. The official user manual: http://flume.apache.org/FlumeUserGuide.html
Flume architecture
A. Components
First, the architecture diagram from the official site. As you can see from the diagram, a Flume data flow is composed of agents; an agent is actually a JVM process
I was testing the HDFS sink and found that the file-rolling configuration items on the sink side did not take effect. The configuration was as follows:

```properties
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.uselocaltimestamp = true
a1.sinks.k1.hdfs.path = hdfs://192.168.11.177:9000/flume/events/%y/%m/%d/%h/%m
a1.sinks.k1.hdfs.fileprefix = xxx
a1.sinks.k1.hdfs.rollinterval = 60
a1.sinks.k1.hdfs.rollsize = 0
a1.sinks.k1.hdfs.rollcount = 0
a1.sinks.k1.hdfs.idletimeout = 0
```
The configur
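Flume property names are case-sensitive, so one likely reason the rolling settings are ignored is the all-lowercase spelling above. The same settings as spelled in the Flume user guide (camelCase; note also the uppercase `%Y`/`%H` escape sequences for year and hour) would be:

```properties
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.path = hdfs://192.168.11.177:9000/flume/events/%Y/%m/%d/%H/%M
a1.sinks.k1.hdfs.filePrefix = xxx
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.idleTimeout = 0
```

With rollSize and rollCount set to 0 (disabled), files should roll purely on the 60-second rollInterval.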
"... consider committing more frequently, increasing capacity, or increasing thread count" (the tail of the exception thrown when the put list is full).
A pre-check is also done before a take: if the takeList is full, it means the take operation is too slow and events are accumulating, so the transaction capacity should be adjusted.
What happens when a transaction commits? Commit is the transaction commit, and there are two cases:
1. Committing put events:

```java
while (!putList.isEmpty()) {
    if (!queue.offer(putList.removeFirst())) {
        throw new RuntimeException("Queue add failed, this shouldn't be able to happen");
    }
}
```
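To make the commit semantics above concrete, here is a self-contained sketch (simplified stand-ins, not Flume's actual classes): events are staged in a putList while the transaction is open and only become visible in the channel's shared queue at commit time.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a MemoryChannel-style transaction: puts are staged locally and
// drained into the shared queue on commit, mirroring the loop quoted above.
class SketchTransaction {
    private final Deque<String> putList = new ArrayDeque<>();
    private final BlockingQueue<String> queue;

    SketchTransaction(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    void put(String event) {
        putList.addLast(event);  // staged only; takers cannot see it yet
    }

    void commit() {
        // Drain the staging list into the channel's shared queue.
        while (!putList.isEmpty()) {
            if (!queue.offer(putList.removeFirst())) {
                throw new RuntimeException("Queue add failed, this shouldn't be able to happen");
            }
        }
    }

    void rollback() {
        putList.clear();  // uncommitted puts are simply discarded
    }
}
```

This also illustrates why a memory channel loses uncommitted events on failure: a rollback just discards the staged list.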
From the bin/flume shell script you can see that Flume starts from the org.apache.flume.node.Application class, which is where Flume's main function lives.
The main method first parses the shell command, throwing an exception if the specified configuration file does not exist.
Depending on whether the command contains the "no-reload-conf" parameter, it decides which way to load the configuration.
This article is a simple practice run and description of the official example from the Flume documentation:
http://flume.apache.org/FlumeUserGuide.html#a-simple-example
Flume's netcat source automatically creates a socket server; data can be ingested simply by sending it to the socket of Flume's netcat source.
Examples are as follows:
1. First, configure the agent in Flume's conf directory
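For reference, the complete configuration from the official simple example (agent a1, netcat source on localhost:44444, logger sink, memory channel) is:

```properties
# example.conf: a single-node Flume configuration
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Start the agent with `bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console`, then send data with `telnet localhost 44444` (or `nc localhost 44444`).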
This article takes TimestampInterceptor as an example to analyze how interceptors work in Flume. First, consider the implementation structure of an interceptor.
1. The Interceptor interface
The methods of the interface are defined as follows:

```java
public interface Interceptor {
    public void initialize();
    public Event intercept(Event event);
    public List<Event> intercept(List<Event> events);
    public void close();

    /** Builder implementations MUST have a no-arg constructor */
    public interface Builder extends Configurable {
        public Interceptor build();
    }
}
```
2.
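As an illustration of what TimestampInterceptor does with this interface, here is a self-contained sketch. The `Event` class below is a simplified stand-in for org.apache.flume.Event, and `preserveExisting` mirrors the interceptor's configuration option of the same name: each event's headers are stamped with the current time unless a timestamp is already present.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Simplified stand-in for org.apache.flume.Event (illustration only).
class Event {
    final Map<String, String> headers = new HashMap<>();
    final byte[] body;
    Event(byte[] body) { this.body = body; }
}

// Sketch of TimestampInterceptor's behavior, not Flume's actual class.
class TimestampInterceptorSketch {
    private final boolean preserveExisting;

    TimestampInterceptorSketch(boolean preserveExisting) {
        this.preserveExisting = preserveExisting;
    }

    Event intercept(Event event) {
        if (preserveExisting && event.headers.containsKey("timestamp")) {
            return event;  // keep a timestamp set upstream
        }
        event.headers.put("timestamp", Long.toString(System.currentTimeMillis()));
        return event;
    }

    List<Event> intercept(List<Event> events) {
        // The batch form simply applies the single-event form to each event.
        return events.stream().map(this::intercept).collect(Collectors.toList());
    }
}
```

Sinks such as the HDFS sink rely on this `timestamp` header to resolve path escape sequences like `%Y/%m/%d`.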
{ return null; } }
4. Returning a usable sink
If a failure has occurred, look at the execution logic of the first half of the code in process():

```java
long now = System.currentTimeMillis();
while (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) {
```

Precondition: failedSinks is not empty and the reactivation (refresh) time of the sink at the head of the queue is earlier than the current time. Then:
1. Poll the first failedSink out of the queue.
2. Process with that sink; if processing succeeds, then
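The retry check above can be sketched with a small self-contained example (simplified stand-ins, not Flume's actual FailoverSinkProcessor): failed sinks wait in a priority queue ordered by their refresh time and become retry candidates once that time has passed.

```java
import java.util.PriorityQueue;
import java.util.Queue;

// A failed sink plus the earliest time (ms) at which it may be retried.
class FailedSink implements Comparable<FailedSink> {
    final String name;
    final long refresh;

    FailedSink(String name, long refresh) {
        this.name = name;
        this.refresh = refresh;
    }

    long getRefresh() { return refresh; }

    @Override
    public int compareTo(FailedSink other) {
        return Long.compare(refresh, other.refresh);
    }
}

// Sketch of the failover recovery loop's precondition.
class FailoverSketch {
    final Queue<FailedSink> failedSinks = new PriorityQueue<>();

    // Returns the head of the queue if it is due for a retry, else null.
    FailedSink nextRetryCandidate(long now) {
        if (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) {
            return failedSinks.poll();
        }
        return null;  // nothing is due yet
    }
}
```

Ordering the queue by refresh time means only the head ever needs checking: if the head is not yet due, no other failed sink can be due either.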
Objective
First, look at the definition of an event on the Flume official website: a line of text content is deserialized into an event. (Serialization is the process of converting an object's state into a format that can be persisted or transmitted; deserialization is the reverse, transforming a stream back into an object. Together, the two processes make it easy to store and transfer data.) By default the maximum length of an event's line is 2048 bytes; lines exceeding this limit are truncated.
1. Introduction to Flume
Flume is a distributed, reliable, and highly available system that aggregates massive logs. It supports customizing various data senders to collect data, and it provides the ability to do simple processing of the data and write it to various (customizable) data receivers.
Design goals:
(1) Reliability
When a node fails, logs can be transmitted to other nodes without loss.
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The compressed file could not be uploaded because of a problem with the file name; a video file would presumably fare even worse.
16/06/26 18:18:59 INFO ipc.NettyServer: [id: 0x6fef6466, /192.168.184.188:40594 => /192.168.184.188:44444] CONNECTED: /192.168.184.188:40594
16/06/26 18:19:05 INF
```java
CuratorFrameworkFactory.newClient(zkConnString,
    new ExponentialBackoffRetry(1000, 1));
```

Flume uses Curator as its ZooKeeper client. Curator is Netflix's open-source ZooKeeper client; it offers a higher level of abstraction than the native client that ZooKeeper provides, simplifying the development of ZooKeeper clients.
Curator's maven configuration:
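A minimal dependency block would look like the following (the version shown is illustrative; choose a Curator release compatible with your ZooKeeper version):

```xml
<dependency>
  <groupId>org.apache.curator</groupId>
  <artifactId>curator-framework</artifactId>
  <version>2.12.0</version>
</dependency>
```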
ZooKeeper also has a native client; its Maven configuration:
Using the native client,
Flume load balancing means selecting, by some algorithm, which sink each event is output to. If the output volume is very large, load balancing is still necessary: output pressure is relieved by spreading output across multiple sinks. Flume's built-in load-balancing algorithm defaults to round_robin, a polling algorithm that selects sinks in order. Here is a concrete example:

```properties
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.chann
```
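A complete load-balancing setup additionally declares a sink group. A sketch following the user guide's properties (the agent and sink names are assumed from the snippet above):

```properties
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.backoff = true
```

With backoff enabled, a failed sink is temporarily blacklisted instead of being retried on every round.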
Multiplexing sends an event to a specific channel based on configuration information. A source instance can specify multiple channels, but a sink instance can only specify one channel. Flume supports fanning out the flow from one source to multiple channels; there are two modes of fan-out, replicating and multiplexing.
1. In replicating mode, the event data received by the source is output to all channels confi
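A multiplexing selector is configured on the source. A sketch modeled on the user guide's example (the header name `state` and the mapping values are illustrative):

```properties
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = state
a1.sources.r1.selector.mapping.CZ = c1
a1.sources.r1.selector.mapping.US = c2
a1.sources.r1.selector.default = c3
```

An event whose `state` header is CZ goes to c1, US goes to c2, and anything else falls through to the default channel c3. (Replicating mode, by contrast, needs only `selector.type = replicating`.)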
Basic concepts of Flume, the data flow model, and Flume data flow
1. Basic concepts of Flume
All Flume-related terms are given in italicized English; their meanings are as follows.
Flume: a reliable, distributed system for collecting, aggregating, and transmitting massive log data.
Web Server: a party that generates events.
Agent: a node in a Flume system; each agent contains three components: source, channel, and sink
You can see the following printed information.
V. Log Collection Test
1) Start the ZooKeeper cluster (readers who have not set up ZooKeeper can skip this).
2) Start HDFS: start-dfs.sh
3) Simulate website logs; the author simply grabbed some test data here.
4) Upload it to /hadoop/home/logs
HADOOP01 output:
HADOOP05 output: because HADOOP05 is configured with a higher priority than HADOOP06, HADOOP06 has no log writes.
We then checked HDFS again to see whether the log file was successfully uploaded
http://flume.apache.org/install
1. Upload
2. Unzip
3. Modify the JDK directory in the conf/flume-env.sh file. Note: if the files being transferred are too large and a memory overflow is reported, the JAVA_OPTS configuration item needs to be adjusted.
4. Verify the installation succeeded: ./flume-ng version
5. Configure environment variables: export FLUME_HOME=/home/apa