1. Install the JDK: refer to the JDK installation instructions.
2. Install Flume
2.1. Download Flume from http://flume.apache.org/download.html and click the link apache-flume-1.7.0-bin.tar.gz to download it.
2.2. Unpack the installation package:
$ tar zxvf apache-flume-1.7.0-bin.tar.gz
This article works through the simple example from the Flume official documentation and describes how to run it in practice:
http://flume.apache.org/FlumeUserGuide.html#a-simple-example
Flume's netcat source automatically creates a socket server; data can be collected simply by sending it to that socket, where the netcat source picks it up.
Examples are as follows:
1. First, configure the agent in Flume's conf directory:
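A minimal single-node configuration along the lines of the official simple example referenced above looks as follows (the agent name a1, the file name example.conf, and the property values are the ones used in the user guide):

# example.conf: a single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the netcat source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# A sink that simply logs each event
a1.sinks.k1.type = logger

# A memory channel that buffers events between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

The agent can then be started with
$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
and exercised from a second terminal by typing lines into
$ telnet localhost 44444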
c) Channel: the agent's internal data transfer channel, used to pass data from the Source to the Sink. Note: the data passed from Source to Channel to Sink is packaged as events; an event is the unit of data flow. Flume collection system structure diagrams: 1. Simple structure, a single agent collecting data; 2. Complex structure, multiple agents chained in series. Flume practical cases: installation and deployment of Flume, 1. ...
... { return null; } }
4. Return a usable sink. If a failure has occurred, look at the logic in the first half of process():
long now = System.currentTimeMillis();
while (!failedSinks.isEmpty() && failedSinks.peek().getRefresh() < now) { ... }
Preconditions: failedSinks is not empty and the reactivation (refresh) time of the sink at the head of the queue is earlier than the current time. Then:
1. Poll the first failed sink from the head of the queue.
2. Process with it as the current sink; if processing succeeds, then ...
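To make the flow described above concrete, here is a simplified, hypothetical sketch of a failover-style sink selection loop. The class, field, and method names (FailedSink, refresh, liveSinks, trySend) and the 30-second backoff are illustrative assumptions, not the actual Flume source; the sketch only mirrors the retry logic explained in this section.

import java.util.ArrayDeque;
import java.util.PriorityQueue;

// Simplified sketch: failed sinks wait in a queue until their "refresh"
// (retry) time has passed, then are retried in order before live sinks are used.
public class FailoverSketch {

    // A sink that previously failed, plus the earliest time it may be retried.
    static class FailedSink implements Comparable<FailedSink> {
        final String name;
        final long refresh;   // earliest retry time, in milliseconds
        FailedSink(String name, long refresh) { this.name = name; this.refresh = refresh; }
        public int compareTo(FailedSink o) { return Long.compare(refresh, o.refresh); }
    }

    private final PriorityQueue<FailedSink> failedSinks = new PriorityQueue<>();
    private final ArrayDeque<String> liveSinks = new ArrayDeque<>();

    // Mirrors the logic described above: before using a live sink, retry any
    // failed sink whose refresh time has already passed.
    String selectSink() {
        long now = System.currentTimeMillis();
        while (!failedSinks.isEmpty() && failedSinks.peek().refresh < now) {
            FailedSink candidate = failedSinks.poll();   // 1. poll the head of the queue
            if (trySend(candidate.name)) {               // 2. if it works again, use it
                liveSinks.addFirst(candidate.name);
                return candidate.name;
            }
            // still failing: push it back with a later retry time (illustrative 30s backoff)
            failedSinks.add(new FailedSink(candidate.name, now + 30_000));
        }
        return liveSinks.peek();                         // fall back to a live sink (may be null)
    }

    // Stand-in for actually delivering a batch of events to the named sink.
    boolean trySend(String sink) { return true; }
}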
Flume is a real-time message collection system. It defines a variety of sources, channels, and sinks that can be chosen according to the situation at hand. Flume download and documentation: http://flume.apache.org/
Kafka
Kafka is a high-throughput distributed publish-subscribe messaging system with the following features:
It provides message persistence through an O(1) disk data structure, a structure that maintains stable performance even with terabytes of stored messages.
This article takes TimestampInterceptor as an example to analyze how interceptors work in Flume. First, look at the structure of an interceptor implementation.
1. The Interceptor interface is implemented. The interface methods are defined as follows:
public void initialize();
public Event intercept(Event event);
public List<Event> intercept(List<Event> events);
public void close();
/** Builder implementations must have a no-arg constructor */
public interface Builder extends Configurable {
    public Interceptor build();
}
2. ...
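As an illustration of the interface above, here is a minimal sketch of a timestamp-style interceptor. It follows the same pattern as TimestampInterceptor (adding a time header to every event), but the class name and the exact header handling shown here are assumptions for the example, not the actual Flume source.

import java.util.List;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

// Minimal sketch: stamp each event with the current time in a "timestamp" header.
public class SimpleTimestampInterceptor implements Interceptor {

    @Override
    public void initialize() {
        // no state to set up in this sketch
    }

    @Override
    public Event intercept(Event event) {
        // add (or overwrite) a "timestamp" header with the current time in millis
        event.getHeaders().put("timestamp", Long.toString(System.currentTimeMillis()));
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event e : events) {
            intercept(e);
        }
        return events;
    }

    @Override
    public void close() {
        // nothing to clean up
    }

    // Builder used by Flume to construct the interceptor from the agent configuration.
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new SimpleTimestampInterceptor();
        }

        @Override
        public void configure(Context context) {
            // this sketch takes no configuration parameters
        }
    }
}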
Comparison of the Scribe, Chukwa, Kafka, and Flume log systems
1. Background
Many of a company's platforms generate large numbers of logs every day (typically streaming data, such as search engine page views and queries). Processing these logs requires a dedicated logging system; in general, such systems need the following characteristics: (1) build a bridge between application systems and analysis systems, and decouple the two; (2) ...
1. Background introduction
Many of a company's platforms generate large numbers of logs every day (typically streaming data, for example search engine page views and queries). Processing these logs requires a dedicated log system; in general, such systems need the following characteristics: (1) build a bridge between application systems and analysis systems, and decouple them; (2) support both near-real-time online analysis systems and offline analysis systems ...
1. Background information
Many of a company's platforms generate large numbers of logs (typically streaming data, such as search engine page views and queries). Processing these logs requires a dedicated log system, which in general needs the following characteristics:
(1) build a bridge between application systems and analysis systems, and decouple the two;
(2) support both near-real-time online analysis systems and offline analysis systems similar to Hadoop;
Objective
First, look at the definition of an event on the Flume official website. A line of text content is deserialized into an event. (Serialization is the process of converting an object's state into a format that can be persisted or transmitted; deserialization is the reverse process, turning a stream back into an object. Together, the two make it easy to store and transfer data.) The maximum size of an event is 2048 bytes by default; lines exceeding ...
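As a quick illustration of what an event is (a byte-array body plus a map of string headers), the snippet below builds one with Flume's EventBuilder; the header key and body text are made up for the example.

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

public class EventExample {
    public static void main(String[] args) {
        // headers: arbitrary string key/value metadata attached to the event
        Map<String, String> headers = new HashMap<>();
        headers.put("timestamp", Long.toString(System.currentTimeMillis()));

        // body: the raw payload, here one line of text as UTF-8 bytes
        Event event = EventBuilder.withBody(
                "one line of log text".getBytes(StandardCharsets.UTF_8), headers);

        System.out.println(event.getHeaders());
        System.out.println(new String(event.getBody(), StandardCharsets.UTF_8));
    }
}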
From the bin/flume-ng shell script you can see that Flume starts from the org.apache.flume.node.Application class, which is where Flume's main function lives. The main method first parses the command line; if the specified configuration file does not exist, it throws an exception. Depending on whether the command line contains the "no-reload-conf" parameter, it then decides whether the configuration file should be polled and reloaded automatically when it changes ...
This is an original blog post; please cite the source when reproducing it: http://www.cnblogs.com/MrFee/p/4683953.html
1. appendToFile
Function: appends the contents of one or more source files to a target file on the file system.
Usage: hadoop fs -appendToFile <source file 1> <source file 2> ... <target file>
Example: hadoop fs -appendToFile /flume/web_output/part-r-00000 ...
For daily terabytes of data collection, these systems typically require the following characteristics:
Build a bridge between application systems and analysis systems, and decouple the two;
Support for near real-time online analysis systems and offline analysis systems like Hadoop;
With high scalability. That is, when the amount of data increases, it can be scaled horizontally by increasing the nodes.
Description
Flume load balancing uses a chosen algorithm to decide which sink each event is sent out through. If the volume of output is very large, load balancing is well worth having: spreading the output across multiple sinks relieves the output pressure. Flume's built-in load balancing algorithm defaults to round_robin, a polling algorithm that selects sinks in order. Here is a concrete example (the rest of the configuration is sketched below):
# Name the components in this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = ...
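To show how such a load-balancing setup is usually completed, here is a hypothetical sketch of the rest of the configuration. The component names (c1, the sink group g1) and the capacity values are assumptions for illustration; the sinkgroups and processor keys are the standard Flume load_balance settings.

# Channel shared by both sinks (names are illustrative)
a1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Group the two sinks and balance events across them
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

# Bind the source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1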
Big data: we all know about Hadoop, but Hadoop is not all there is. How do we build a large data project? For offline processing, Hadoop is still the better fit, but when the real-time requirements are strong and the data volume is large, we can use Storm. So what technologies should Storm be combined with to build a project that suits our own needs?
1. What are the characteristics ...
Multiplexing is meant to send an event to a specific channel based on configuration information. A source instance can specify multiple channels, but a sink instance can only specify one channel. Flume supports fanning out the flow from one source to multiple channels; there are two modes of fan-out, replicating and multiplexing. 1. In replicating (copy) mode, the event data received by the source is sent to every configured channel ... A configuration sketch for both modes follows below.
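The following hypothetical snippet illustrates the two fan-out modes. The agent and channel names, the header field, and the mapping values are assumptions for the example; selector.type, selector.header, selector.mapping.*, and selector.default are the standard Flume channel-selector properties.

# Replicating (copy) mode: every event goes to all listed channels
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating

# Multiplexing mode: route by the value of a header field (here "state");
# shown commented out because a source uses only one selector at a time
# a1.sources.r1.selector.type = multiplexing
# a1.sources.r1.selector.header = state
# a1.sources.r1.selector.mapping.CZ = c1
# a1.sources.r1.selector.mapping.US = c2
# a1.sources.r1.selector.default = c1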
Basic concepts of Flume, the data flow model, and the Flume data flow
1. Basic concepts of Flume
All Flume-related terms are given in italicized English; their meanings are as follows.
Flume: a reliable, distributed system for collecting, aggregating, and transporting massive amounts of log data.
Web Server: a producer of Events.
Agent: a node in the Flume system; it contains three components: Source, Channel, and Sink ...