Article from http://www.cnblogs.com/hark0623/p/4205756.html; please credit the source when reprinting. I still have some doubts about Flume. The plan for this month is to read the Flume source code, which I hope will resolve them; once they are resolved I will also write up the process and the conclusions on the blog and update this post with the links. The doubts are as follows: 1. From reading the official site, how to request the JSON interface to obtain ...
From the information above you can see the problem: the connection information on the server and the client does not match up. The server holds many established connections that are in fact useless. At first this also seemed strange to me and I could not find the reason, so I could only look at the logs. The log showed that an exception had occurred, but oddly, just before the exception there was a line "Rpc sink {} closing Rpc client: {}". Here destroyConnection had torn down a connection ...
1. Download
Download the latest tar.gz package from http://www.apache.org/dist/flume/stable/.
2. Decompress
tar -zxvf ...
3. Configure environment variables
Set FLUME_HOME and PATH, and remember to execute source /etc/profile.
4. Add a simple test case
A. Create a file named test-conf.properties in the conf directory with the following content:
# Define the aliases (sources -> channels -> sinks)
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# Describe the source
a1.sources.s1.type = avro
...
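The excerpt breaks off at this point. For reference, a minimal complete test-conf.properties along these lines could look as follows; the bind address, port, memory channel and logger sink are assumptions added to make the sketch self-contained, not values from the original post:
# Define the aliases (sources -> channels -> sinks)
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# Describe the source: an Avro source listening on port 4141 (assumed values)
a1.sources.s1.type = avro
a1.sources.s1.bind = 0.0.0.0
a1.sources.s1.port = 4141
# An in-memory channel (assumed)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
# Log incoming events (assumed)
a1.sinks.k1.type = logger
# Wire source and sink to the channel
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1
Such an agent could then be started with something like bin/flume-ng agent --conf conf --conf-file conf/test-conf.properties --name a1 -Dflume.root.logger=INFO,console.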
Note: environment: Sklin-linux. How to download Flume:
wget http://www.apache.org/dyn/closer.lua/flume/1.6.0/apache-flume-1.6.0-bin.tar...
After the download is complete, unzip it with tar:
tar -zvxf apache-flume-1.6.0-bin.tar...
Enter the Flume conf directory, create a configuration file with touch flume.conf, and then cp ...
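The excerpt stops at the cp step. A typical sequence along these lines might look as follows; the .tar.gz extension and the template file being copied are assumptions, since the original text is cut off:
wget http://www.apache.org/dyn/closer.lua/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
tar -zvxf apache-flume-1.6.0-bin.tar.gz
cd apache-flume-1.6.0-bin/conf
touch flume.conf
# copy the shipped template as a starting point (assumed intent of the cp step)
cp flume-conf.properties.template flume.conf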
First, overview: this section first walks through a data transfer flow based on a netcat source + memory channel + logger sink, and then dissects the code execution logic inside NetcatSource. Second, the Flume configuration file: the following configuration file, netcat.conf, defines a netcat source that listens on port 44444.
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
...
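The configuration is truncated in the excerpt. A complete netcat.conf matching the components described here might look like this; the bind address, channel capacities and wiring are the stock single-node example values, assumed rather than taken from the original post:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
It can be started with bin/flume-ng agent --conf conf --conf-file netcat.conf --name a1 -Dflume.root.logger=INFO,console and tested by sending lines to port 44444, for example with telnet localhost 44444.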
Background: Kafka + Flume + Morphline + Solr is used for real-time statistics. Solr had received no data since December 23. Looking at the logs showed that a colleague had added malformed tracking (buried-point) data, which produced a large number of errors. The inference was that because a memory channel was used and it filled up, messages could not be processed in time and new data was lost. Flume was therefore changed to use a file channel:
kafka2solr.sources = source_from_k...
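The excerpt is cut off here. A file channel definition for this agent might look roughly as follows; the channel name, directory paths and capacity are hypothetical placeholders, not values from the original post:
kafka2solr.channels = file_channel
kafka2solr.channels.file_channel.type = file
# checkpoint and data directories are placeholder paths
kafka2solr.channels.file_channel.checkpointDir = /data/flume/kafka2solr/checkpoint
kafka2solr.channels.file_channel.dataDirs = /data/flume/kafka2solr/data
kafka2solr.channels.file_channel.capacity = 1000000
Because the file channel persists events to disk, a burst of bad or slow-to-process data fills the disk-backed queue instead of exhausting memory and dropping new events.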
For plain log tailing I do not think monitoring matters much, because the write rate is generally not particularly high. But with a spooldir source, where several gigabytes of data can be dropped in for Flume to parse, and especially in combination with Kafka or other frameworks, monitoring is important: it lets you analyze where the bottleneck of the whole architecture is.
Flume's monitoring is JSON-based: the metrics data is produced via JMX and can be accessed directly through ...
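The excerpt is cut off, but this presumably refers to Flume's built-in HTTP reporting, which exposes the JMX counters as JSON. A typical way to enable and query it (the port number and config file here are arbitrary examples, not from the original post):
bin/flume-ng agent --conf conf --conf-file conf/netcat.conf --name a1 -Dflume.monitoring.type=http -Dflume.monitoring.port=34545
# the counters are then served as JSON:
curl http://localhost:34545/metrics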
Because Flume's spooldir source does not support recursively detecting files in subdirectories, and the business needed that, the source code was modified and recompiled.
The code modification follows http://blog.csdn.net/yangbutao/article/details/8835563. In 1.4, however, the class to modify is no longer SpoolingFileLineReader, but apache-flume-1.4.0-src\flume-ng-core\src\main\java...
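As a rough illustration of the kind of change involved (this is a generic sketch, not the actual Flume patch; the class and method names below are invented for the example), the flat directory listing can be replaced with a recursive walk:
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists candidate files under a spool directory, descending into subdirectories. */
public final class RecursiveSpoolLister {

    private RecursiveSpoolLister() { }

    /** Recursively collects files accepted by the filter, instead of calling File.listFiles() on the top level only. */
    public static List<File> listFiles(File dir, FileFilter filter) {
        List<File> result = new ArrayList<>();
        File[] children = dir.listFiles();
        if (children == null) {
            return result; // not a directory, or an I/O error occurred
        }
        for (File child : children) {
            if (child.isDirectory()) {
                result.addAll(listFiles(child, filter)); // descend into the subdirectory
            } else if (filter == null || filter.accept(child)) {
                result.add(child);
            }
        }
        return result;
    }
}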
After downloading Flume, unzip it, add a configuration file, and write the configuration. I wrote the config file under conf and named it flume-conf-spooldir.properties. Flume run command:
bin/flume-ng agent --conf conf --conf-file conf/flume-conf-spooldir.properties --name logAgent -Dflume.root.logger=DEBUG,console
where the -Dflume.root.logger option ...
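The excerpt does not include the contents of flume-conf-spooldir.properties. A minimal spooldir configuration consistent with the command above might look like this; the agent name logAgent comes from the --name flag, while the spool directory path, memory channel and logger sink are assumptions:
logAgent.sources = s1
logAgent.channels = c1
logAgent.sinks = k1
# spooldir source watching a drop directory (path is a placeholder)
logAgent.sources.s1.type = spooldir
logAgent.sources.s1.spoolDir = /data/logs/spool
logAgent.sources.s1.channels = c1
# in-memory channel and a logger sink for debugging (assumed)
logAgent.channels.c1.type = memory
logAgent.sinks.k1.type = logger
logAgent.sinks.k1.channel = c1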
https://www.quora.com/Why-does-flume-take-more-resource-cpu-when-file-channel-is-used-compared-to-when-memory-channel-is-used
In the case of the file channel, CPU is used for the following:
Serializing/deserializing events to/from the file channel. In the memory channel, events are simply stored in RAM, so no serialization is required.
A small CPU overhead per disk write to determine the disk location where it needs to write. Typically this is ...
The client SDK for the Android phone logs was finished last week, and this week I started debugging the log server. Flume is used for log collection, and the data then goes to Kafka. During testing I always found that only some of the events arrived; later I learned that the channel and sink were being used incorrectly. When multiple sinks use the same channel, the sinks compete for events from it and the events are split among them, rather than each sink getting its own copy. In the end I changed to multiple channels, each channel corresponding to one sink.
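A sketch of the corrected topology described here, with one source replicating into two channels and each sink reading its own channel; the agent and component names and the sink types are made up for illustration:
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2
# replicate every event to both channels (the replicating selector is Flume's default)
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2
a1.channels.c1.type = memory
a1.channels.c2.type = memory
# each sink consumes from its own channel, so both see every event
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
a1.sinks.k2.type = logger
a1.sinks.k2.channel = c2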
Flume can monitor and manage the running state of its components and automatically pull a component back up when it shuts down. It does this by starting a scheduled-task thread pool (MonitorService, with a maximum of 30 threads) that runs monitoring threads (MonitorRunnable). Every 3 seconds each thread checks whether the state of its component (including Channel and SinkRunner) matches the desired state (the desired states are START and STOP), and if not it calls the ...
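As a simplified illustration of this supervision pattern (not Flume's actual supervisor code; the interface and class names below are invented for the sketch), a scheduled executor can poll a component every 3 seconds and restart it when its actual state drifts from the desired one:
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a component supervisor, loosely modelled on the behaviour described above. */
public class ComponentSupervisor {

    /** Minimal stand-in for a supervised component (Channel, SinkRunner, ...). */
    public interface Supervised {
        boolean isRunning();
        void start();
    }

    // pool of monitor threads, capped like the MonitorService described above
    private final ScheduledExecutorService monitorService =
        Executors.newScheduledThreadPool(30);

    /** Checks the component every 3 seconds and restarts it if the desired state is START but it is not running. */
    public void supervise(Supervised component) {
        monitorService.scheduleWithFixedDelay(() -> {
            if (!component.isRunning()) {
                component.start(); // pull the component back up
            }
        }, 0, 3, TimeUnit.SECONDS);
    }
}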
A management solution that implements a software production pipeline and guarantees correctness and reliability: guided creation and import of projects, integrated version control (Git/SVN), project management (Trac/Redmine), code quality (Sonar), continuous integration (Jenkins); private deployment and unified management, for developers. Distributed: distributed services: Dubbo + ZooKeeper + proxy + RESTful; distributed message middleware: Kafka + Flume + ZooKeeper; distributed ca...
This course starts from the production and flow of real-time data and integrates the mainstream distributed log collection framework Flume, the distributed message queue Kafka, the distributed column-oriented database HBase, and the currently most popular Spark Streaming to build a hands-on real-time stream processing project, letting you master the entire real-time processing pipeline and reach an intermediate level of big data research and development.
Origin:
Since Hadoop is used, and the project is not currently distributed but runs in a cluster environment, the business logs have to be moved over manually every time before Hadoop can analyze them. In that case it is better to bring in the distributed Flume together with out-of-the-box HDFS and avoid the unnecessary manual steps. Preparation environment:
You must have a ready-to-use Hadoop installation. My version is 2.7.3. If you don't know how to install it ...
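For this kind of setup, an HDFS sink along the following lines would let Flume write the business logs straight into Hadoop; the agent name, NameNode address and target path are placeholders, not values from the original post:
a1.sinks = k1
a1.sinks.k1.type = hdfs
# NameNode address and target directory are placeholders
a1.sinks.k1.hdfs.path = hdfs://namenode:9000/flume/business-logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.rollInterval = 300
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.channel = c1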
Last time, Flume + Kafka + HBase + ELK was implemented: http://www.cnblogs.com/super-d2/p/5486739.html. This time we can add Storm. A simple configuration of storm-0.9.5 is as follows. Install the dependencies:
wget http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz
tar zxvf jdk-8u45-linux-x64.tar.gz
Then edit /etc/profile and add the following:
export JAVA_HOME=/home/dir/jdk1.8.0_45
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA...