Requirement:
Use the open-source software Fluentd to collect Apache access logs from each device, send them to a Fluentd forwarding server, and have that server write them to the HDFS file system through the WebHDFS interface.
Software versions:
Hadoop version: 1.1.2
Fluentd version: 1.1.21
Test environment:
node29 server: Apache is installed, along with Fluentd, which acts as the Fluentd client;
node1 server: the Hadoop NameNode;
Fluentd configuration file on the node29 server:
<source>
  type tail
  format apache2
  path /var/log/httpd/access_log
  pos_file /var/log/td-agent/access_log.pos
  time_format %Y-%m-%d %H:%M:%S
  localtime
  tag apache.access
</source>

# Forward logs to the node1 server
<match apache.access>
  type forward
  # time_slice_format %Y%m%d
  # time_slice_wait 10m
  # localtime
  # Define the time format of the stored logs
  time_format %Y-%m-%d %H:%M:%S
  # localtime is very important; if it is not set, the log time differs from the system time by 8 hours
  localtime
  # Define the time of the stored logs
  <server>
    host node1
    port 24224
  </server>
  flush_interval 1s
</match>
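The 8-hour difference mentioned in the comment can be reproduced outside Fluentd. The following is only a sketch: `format_time` is a hypothetical helper that mimics the `time_format` pattern with `date`, it assumes GNU `date` (for `-d @epoch`), and it assumes the servers run in a UTC+8 time zone, which the article implies but does not state.

```shell
# format_time is a hypothetical helper, not part of Fluentd.
# It renders an epoch timestamp with the same pattern as time_format.
format_time() {
  # $1 = epoch seconds, $2 = POSIX TZ string (CST-8 means UTC+8)
  TZ="$2" date -d "@$1" +"%Y-%m-%d %H:%M:%S"
}

format_time 1420531200 UTC0    # 2015-01-06 08:00:00 (UTC, no localtime)
format_time 1420531200 CST-8   # 2015-01-06 16:00:00 (local time, UTC+8)
```

The same instant is rendered eight hours apart, which is the gap the `localtime` setting is meant to avoid.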
Configuration of the node1 server. This server runs the Hadoop NameNode and also acts as the Fluentd forwarding server. Its configuration file is as follows:
<source>
  type forward
  port 24224
</source>

<match apache.access>
  type webhdfs
  host node1.test.com
  port 50070
  path /apache/%Y%m%d_%H/access.log.${hostname}
  time_slice_format %Y%m%d
  time_slice_wait 10m
  # Define the time format of the stored logs
  time_format %Y-%m-%d %H:%M:%S
  localtime
  flush_interval 1s
</match>
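To make the `path` template concrete, the sketch below approximates how it expands for a given time and host. `expand_path` is a hypothetical helper rather than anything Fluentd provides; it assumes GNU `date` and a UTC+8 local time zone. In the webhdfs output plugin, `${hostname}` resolves to the host running the plugin (node1 here), not the host that produced the log, which matches the file name seen later in this article.

```shell
# expand_path is a hypothetical helper that imitates the path template
# /apache/%Y%m%d_%H/access.log.${hostname} from the configuration above.
expand_path() {
  # $1 = epoch seconds, $2 = POSIX TZ string, $3 = hostname of the Fluentd server
  dir=$(TZ="$2" date -d "@$1" +"/apache/%Y%m%d_%H")
  printf '%s/access.log.%s\n' "$dir" "$3"
}

expand_path 1420531200 CST-8 node1.test.com
# prints /apache/20150106_16/access.log.node1.test.com
```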
After the configuration is complete, restart the Fluentd service on both servers.
To start the test, run the ab (ApacheBench) command on node29 to send requests to Apache and generate access logs:
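The exact ab invocation is not given in the text, so the following is only a sketch. `run_ab` is a hypothetical wrapper, and the URL, request count, and concurrency are assumed values; only the `-n` (total requests) and `-c` (concurrent requests) options are standard ab flags.

```shell
# run_ab is a hypothetical wrapper around ab (ApacheBench).
# The default request count and concurrency are assumptions for illustration.
run_ab() {
  url=$1                 # target URL; ab expects a path, e.g. a trailing /
  requests=${2:-1000}    # total number of requests (-n)
  concurrency=${3:-10}   # number of concurrent requests (-c)
  ab -n "$requests" -c "$concurrency" "$url"
}

# Example, run on node29 against the local Apache:
# run_ab http://localhost/ 1000 10
```

Each request adds one line to /var/log/httpd/access_log, which the tail source then picks up.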
[Screenshot: ab.png] http://s3.51cto.com/wyfs02/M02/58/2E/wKioL1Srm8vR_ovaAAPZIkcRXtw457.jpg
Then, on the node1 server, check the HDFS file system to confirm that the expected files and directories have been generated.
View the generated directory:
[Screenshot: generated directory listing] http://s3.51cto.com/wyfs02/M00/58/2E/wKioL1SrnD2T4tjMAANsBMDhnlw258.jpg
View the log entries inside a specific file:
hadoop fs -cat /apache/20150106_16/access.log.node1.test.com
[Screenshot: log file contents] http://s3.51cto.com/wyfs02/M01/58/32/wKiom1Srm-KBfz8qAAYjJTfE3zY387.jpg
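The same checks can also be made over the WebHDFS REST interface that Fluentd writes through, which helps confirm that WebHDFS is reachable on port 50070. This is a minimal sketch: `webhdfs_url` is a hypothetical helper that only builds the request URL, while `LISTSTATUS` and `OPEN` are standard WebHDFS operations.

```shell
# webhdfs_url is a hypothetical helper that builds a WebHDFS REST URL.
webhdfs_url() {
  # $1 = NameNode host, $2 = HTTP port, $3 = absolute HDFS path, $4 = operation
  printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

webhdfs_url node1.test.com 50070 /apache LISTSTATUS
# prints http://node1.test.com:50070/webhdfs/v1/apache?op=LISTSTATUS
```

Passing the resulting URL to `curl -s` should return a JSON directory listing. With `OPEN` and `curl -L`, the request is redirected to a DataNode that returns the file contents. Depending on the cluster's authentication settings, a `user.name` query parameter may also be needed.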
As shown, Fluentd has collected the Apache logs from the node29 server into the HDFS file system in forwarding mode, so the next step, offline analysis with Hadoop, can proceed.
This article is from the "Shine_forever blog"; please keep this source link when reposting: http://shineforever.blog.51cto.com/1429204/1599771
Open-source log collection software Fluentd: forwarding (forward) architecture configuration