For loading logs into an Elasticsearch cluster via Flume, see: Flume log import into Elasticsearch. Kibana introduction (Kibana home): Kibana is a powerful Elasticsearch data visualization client. Logstash ships with Kibana built in, but Kibana can also be deployed on its own; the latest version, Kibana 3, is a pure HTML+JS client, so it is very easy to deploy to Apache, Nginx, or any other HTTP server. Kibana 3 address: https://github.com/elasticsearch
Preparatory work:
1. Download Flume from Apache.
2. Unzip Flume.
3. Modify flume-env.sh and configure JAVA_HOME.
Netcat capture demo:
1. Create netcat-logger.conf in conf:
# define the name of each component in the agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# describe and configure the source component: r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# describe and configure the sink component: k1
a1.
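The excerpt cuts off at the sink definition. For reference, here is a minimal complete sketch of this demo in the shape of the standard Flume quickstart; the logger sink and memory channel are assumptions, since the excerpt is truncated:

# netcat-logger.conf — minimal complete sketch (quickstart-style)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# netcat source listens for lines of text on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# logger sink prints each event to the agent's log/console
a1.sinks.k1.type = logger
# memory channel buffers events in RAM
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Start the agent with bin/flume-ng agent --conf conf --conf-file conf/netcat-logger.conf --name a1 -Dflume.root.logger=INFO,console, then telnet localhost 44444 and type a line to see it logged.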
Business background: output the log files generated by a Java project to Flume.
Step one: output the log to Flume. Configure log4j in the Java program and specify which Flume server to send to:
log4j.rootLogger=INFO,flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=192.168.13.132
log4j.appender.flume.Port=41414
Step two: import java.util.
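On the Flume side, the Log4j appender sends events over Avro RPC, so the receiving agent needs an Avro source listening on the port configured above. A minimal sketch (the agent name, memory channel, and logger sink are assumptions):

# receiving agent for the Log4j appender
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Avro source matching log4j.appender.flume.Port
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1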
Spooling directory source: the following two parameters deserve explanation, fileHeader and fileHeaderKey. fileHeader is a boolean, configurable as true or false, indicating whether the file name is added to the header of each event Flume encapsulates after reading the data. fileHeaderKey specifies which header key the file name is stored under when fileHeader is set to true; the basename
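A short sketch showing these two parameters in context (the agent name and spool directory path are assumptions):

# spooldir source with file-name headers enabled
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/log/flume-spool
# add the absolute file name to each event's headers
a1.sources.r1.fileHeader = true
# store it under the key "file" (which is also the default key)
a1.sources.r1.fileHeaderKey = file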
pool. Each sink has a priority: the larger the value, the higher the priority, so priority 100 ranks above priority 80. If a sink fails while sending an event, the remaining sink with the highest priority attempts to send the failed event.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000
The above configuration defines a group containing the two sinks k1 and k2; since k2's priority (10) is higher than k1's (5), k2 sends events first and k1 takes over when k2 fails.
Article from http://www.cnblogs.com/hark0623/p/4205756.html; please credit the source when reprinting. I still have some doubts about Flume. My plan for this month is to read the Flume source code, which I hope will resolve them; once they are resolved I will also write up the process and conclusions on the blog, and will update this post with the link. The doubts are as follows: 1. From reading the official site, I found out how to request JSON to obtain
From the above information you can see the problem: the connection state on the server and the client do not match. The server holds many ESTABLISHED connections that are in fact useless. At first I also found this very strange and could not find the reason, so all I could do was examine the logs. The logs showed that an exception had occurred, but strangely, right before the exception there was a line "RPC sink {} closing RPC client: {}". Here destroyConnection destroyed a connection, wh
OK, straight to the substance. While using Flume NG I stepped into a lot of pits; I am recording them here so you can go around them and reach the goal of using Flume proficiently. The first pit: a file that could not be decoded correctly could not be renamed; after the exception was thrown, no files at all could be collected by Flume. This is a rather serious failure, caused by Flume
1. Download
http://www.apache.org/dist/flume/stable/ — download the latest tar.gz package.
2. Decompress
tar -zxvf ....
3. Configure environment variables
Set FLUME_HOME and PATH; remember to run source /etc/profile.
4. Add a simple test case
a. Create a file in the conf directory, test-conf.properties, with the following content:
# define the alias (sources -> channels -> sinks)
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# describe the source
a1.sources.s1.type = avro
a1.
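The excerpt stops at the source type, so here is a complete sketch of what this test case could look like; the bind address, port, memory channel, and logger sink are assumptions:

# test-conf.properties — complete sketch of the test case
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# Avro source accepting RPC events
a1.sources.s1.type = avro
a1.sources.s1.bind = 0.0.0.0
a1.sources.s1.port = 4141
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1

Run it with $FLUME_HOME/bin/flume-ng agent --conf conf --conf-file conf/test-conf.properties --name a1 -Dflume.root.logger=INFO,console.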
Background: using Kafka + Flume + Morphline + Solr for real-time statistics. Solr had received no data since December 23. Checking the logs revealed a large number of errors caused by malformed tracking data a colleague had added. The inference: because the memory channel had filled up, messages could not be processed in time and new data was lost. Modify Flume to use the file channel instead:
kafka2solr.sources = source_from_k
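A minimal sketch of the file-channel change, using the agent name from the excerpt; the channel name, directory paths, and sizing values are assumptions:

# switch the agent from a memory channel to a file channel
kafka2solr.channels = file_channel
kafka2solr.channels.file_channel.type = file
# where the channel keeps its checkpoint and on-disk event data
kafka2solr.channels.file_channel.checkpointDir = /data/flume/checkpoint
kafka2solr.channels.file_channel.dataDirs = /data/flume/data
# how many events the channel can hold on disk
kafka2solr.channels.file_channel.capacity = 1000000
kafka2solr.channels.file_channel.transactionCapacity = 10000

The file channel trades throughput and CPU for durability: events survive an agent crash instead of being lost with the process, which addresses exactly the data-loss failure described above.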
For plain log tailing, I think monitoring is not very meaningful, because the write rate is generally not particularly fast. But with a spooldir source ingesting several gigabytes of data for Flume to parse, especially in combination with Kafka or another framework, monitoring is important: it lets you analyze the bottleneck of the entire architecture.
Flume's monitoring is JSON based: metrics data is generated through JMX and can be accessed directly through the
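For reference, Flume's built-in JSON reporting is enabled with system properties on the agent command line; the port number here is an assumption (any free port works):

bin/flume-ng agent --conf conf --conf-file conf/netcat-logger.conf --name a1 \
  -Dflume.monitoring.type=http \
  -Dflume.monitoring.port=34545

# then fetch the JSON metrics:
curl http://localhost:34545/metrics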
Because Flume's spooldir source does not support recursively detecting files in subdirectories, and the business required it, I modified the source code and recompiled.
The code modification is based on: http://blog.csdn.net/yangbutao/article/details/8835563. In 1.4, however, it is not the SpoolingFileLineReader class that is modified, but apache-flume-1.4.0-src\flume-ng-core\src\main\java
===========> Create the HBase tables and column families first. Case 1: one row of source data corresponds to one HBase row (no problem on hbase-1.12).
================================================================================
# Note: in this case Flume listens on the directory /home/hadoop/flume_hbase and captures data into HBase; you must first create the table and column families in HBase.
Data catalog:
vi /home/hadoop/flume_hbase/word.txt
1001 pan nan
2200 lili nv
create 'tb_words', '
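A sketch of the Flume side of this case; since the excerpt's create statement is truncated, the column family name "info" and the choice of the regex serializer are assumptions:

# spooldir source feeding the HBase sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/hadoop/flume_hbase
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
# HBase sink writing each event into tb_words
a1.sinks.k1.type = hbase
a1.sinks.k1.table = tb_words
a1.sinks.k1.columnFamily = info
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.k1.channel = c1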
https://www.quora.com/Why-does-flume-take-more-resource-cpu-when-file-channel-is-used-compared-to-when-memory-channel-is-used
In the case of the file channel, the CPU is used for the following:
Serializing/deserializing events to/from the file channel. In the memory channel, events are simply stored in RAM, so no serialization is required.
A small CPU overhead per disk write for determining the disk location where it needs to write. Typically this is ba
The client SDK for Android log collection was completed last week, and this week I started debugging the log server. Flume is used for log collection, which then feeds Kafka. During testing I kept finding that some events were missing, and I later learned I was using channels and sinks incorrectly: when multiple sinks share the same channel, they consume events from it competitively; each sink does not get its own copy. The fix is to use multiple channels, where each channel corresponds to a
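A minimal sketch of the corrected layout: one source replicated into two channels so each sink receives its own copy of every event. Component names are assumptions:

# replicating is the default channel selector; it copies every event
# to all channels listed on the source
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2
# each sink drains its own channel, so neither steals the other's events
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2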
I. Introduction to Flume NG
Please refer to the official documentation: http://flume.apache.org/FlumeUserGuide.html
II. Example
Requirements: a directory needs to be monitored, its contents automatically uploaded to the server, and the transmission encrypted.
Overall solution: n client-agent --> server-agent
Client-agent:
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.basenameHead
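The encrypted hop between client-agent and server-agent can be done with Flume's Avro sink over SSL. A sketch of the client-agent sink side; the hostname, port, and truststore details are assumptions:

# Avro sink with SSL enabled, pointing at the server-agent
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = server.example.com
a1.sinks.k1.port = 4545
a1.sinks.k1.ssl = true
# trust the server-agent's certificate via a truststore
a1.sinks.k1.truststore = /opt/flume/conf/truststore.jks
a1.sinks.k1.truststore-password = changeit
a1.sinks.k1.truststore-type = JKS

On the server-agent, the matching Avro source must also set ssl = true and point at its keystore.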