Collect logs into the HDFS file system using the open source log collection software Fluentd


Description: I originally researched Flume, the open source log system, but found its configuration troublesome. Searching online, I found that Fluentd is also an open source log collection system, with simpler configuration and good performance, so I studied it. From the official homepage, fluentd.org, we can see that it supports 300+ plugins, which looks promising!


Fluentd communicates with HDFS through Hadoop's WebHDFS interface, so before configuring Fluentd, make sure WebHDFS is working properly and that data can be written to HDFS via WebHDFS!

The schematic diagram is as follows:

[Figure: http-to-hdfs.png — schematic of the HTTP → Fluentd → WebHDFS → HDFS flow]


For WebHDFS configuration and testing, see this article: http://shineforever.blog.51cto.com/1429204/1585942
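As a quick sanity check before involving Fluentd at all, the WebHDFS REST endpoint can be queried directly with curl. This is a sketch: LISTSTATUS is a standard WebHDFS operation, but the host and port here are the values used later in this article and must be adjusted for your cluster.

```shell
# NameNode host/port as used later in this article (adjust for your cluster).
NAMENODE=node1.test.com
PORT=50070

# LISTSTATUS lists a directory; a healthy WebHDFS endpoint answers
# HTTP 200 with a JSON "FileStatuses" body for the root directory.
URL="http://${NAMENODE}:${PORT}/webhdfs/v1/?op=LISTSTATUS"
echo "$URL"
# curl -i "$URL"
```

If this request fails, fix Hadoop/WebHDFS first; no amount of Fluentd configuration will help.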


Installation environment overview:

1) Fluentd and the Hadoop NameNode are installed on the same physical machine;

2) OS version: RHEL 5.7 64-bit

3) Hadoop version: 1.2.1

4) JDK: 1.7.0_67

5) Ruby version: ruby 2.1.2p95


1. Pre-installation preparation: install Ruby, since Fluentd is developed in Ruby.

yum install openssl-devel zlib-devel gcc gcc-c++ make autoconf readline-devel curl-devel expat-devel gettext-devel


Uninstall the Ruby version that ships with the system:

yum erase ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-ri ruby-docs


Install Ruby from source:

wget -c http://cache.ruby-lang.org/pub/ruby/2.1/ruby-2.1.2.tar.gz

Then unpack the archive, compile it, install Ruby into the directory /usr/local/ruby, and set the environment variables in your profile.
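The unpack/compile/install steps above can be sketched as follows. This is a sketch rather than the article's verbatim commands: the guard around the tarball and the prefix variable are additions for safety, and the PATH line must also be added to /etc/profile to survive a new login.

```shell
PREFIX=/usr/local/ruby

# Build only if the tarball is actually present in the current directory.
if [ -f ruby-2.1.2.tar.gz ]; then
  tar xzf ruby-2.1.2.tar.gz
  cd ruby-2.1.2
  ./configure --prefix="$PREFIX"
  make && make install
  cd ..
fi

# Put the new ruby first on PATH; persist this line in /etc/profile.
export PATH="$PREFIX/bin:$PATH"
```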

Test Ruby:

[[email protected] install]# ruby -v

ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]


If you see output like the above, Ruby was installed successfully.


2. Fluentd installation:

Fluentd can be installed in three ways: from source, via gem, or via RPM.

This article uses the RPM method. The official documentation provides an install script, so we just execute it:


curl -L http://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh


After the installation succeeds, the startup command is: /etc/init.d/td-agent start

The configuration file path is: /etc/td-agent/


[[email protected] install]# cd /etc/td-agent/

You have new mail in /var/spool/mail/root

[[email protected] td-agent]# pwd

/etc/td-agent

[[email protected] td-agent]# ls

logrotate.d  plugin  prelink.conf.d  td-agent.conf


3. Install the Fluentd plugin fluent-plugin-webhdfs with gem

1) Since the local firewall blocks the default ruby gem source, switch the gem source:

[[email protected] bin]# td-agent-gem source --remove https://ruby.taobao.org/

https://ruby.taobao.org/ removed from sources

[[email protected] bin]# td-agent-gem source -a https://ruby.taobao.org/

https://ruby.taobao.org/ added to sources


2) Install the plugin:

td-agent-gem install fluent-plugin-webhdfs


To view the list of installed gems:

td-agent-gem list


*** LOCAL GEMS ***

bigdecimal (1.2.4)
bundler (1.7.7)
cool.io (1.2.4)
fluent-mixin-config-placeholders (0.3.0)
fluent-mixin-plaintextformatter (0.2.6)
fluent-plugin-webhdfs (0.4.1)
fluentd (0.12.0.pre.2)
http_parser.rb (0.6.0)
io-console (0.4.2)
json (1.8.1)
ltsv (0.1.0)
minitest (4.7.5)
msgpack (0.5.9)
psych (2.0.5)
rake (10.1.0)
rdoc (4.1.0)
sigdump (0.2.2)
string-scrub (0.0.5)
test-unit (2.1.2.0)
thread_safe (0.3.4)
tzinfo (1.2.2)
tzinfo-data (1.2014.10)
uuidtools (2.1.5)
webhdfs (0.6.0)
yajl-ruby (1.2.1)


4) Configure fluentd to load the fluent-plugin-webhdfs module.

Add the following section:

vim /etc/td-agent/td-agent.conf

<match hdfs.*.*>
  type webhdfs
  host node1.test.com
  port 50070
  path /log/%Y%m%d_%H/access.log.${hostname}
  flush_interval 1s
</match>
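The time placeholders in path (%Y%m%d_%H in strftime notation) mean each hour's events land in their own HDFS directory, which is why the error logs later in this article show paths like /log/20141203_15/. A quick illustration of the expansion, using GNU date (illustration only; fluentd does this expansion itself):

```shell
# An event flushed at 2014-12-03 15:56 is written under /log/20141203_15/ :
date -d '2014-12-03 15:56' +%Y%m%d_%H
```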


Restart the td-agent service: /etc/init.d/td-agent restart


5) Set up the HDFS side:

Create the log directory:

hadoop fs -mkdir /log

Give the /log directory 777 permissions. If you don't, the data cannot be written in; the official documentation doesn't mention this, and it took a long time of testing to find!

hadoop fs -chmod 777 /log


6) Restart the td-agent service once more and start testing. The test command is as follows:

curl -X POST -d 'json={"json":"message"}' http://172.16.41.151:8888/hdfs.access.test
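To unpack that test command: the URL path after the port becomes the Fluentd event tag (hdfs.access.test, which is what the <match hdfs.*.*> section matches), and the json= form field carries the JSON event record. The request is answered by the http input that td-agent's stock configuration normally listens with on port 8888. A sketch that just assembles the same command (the IP is the article's test host):

```shell
TAG=hdfs.access.test            # becomes the event tag; matched by <match hdfs.*.*>
BODY='json={"json":"message"}'  # form field "json" holds the JSON event record
HOST=172.16.41.151:8888         # the in_http endpoint of td-agent's stock config

echo "curl -X POST -d '$BODY' http://$HOST/$TAG"
```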


At this point, we can see that new log files have appeared in HDFS!

[Screenshot: the newly written log files visible in Hadoop]

Errors encountered during installation and configuration:

1)

2014-12-03 15:56:12 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141203_15/access.log.node1.test.com
2014-12-03 15:56:12 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-03 15:56:28 +0800 error_class="WebHDFS::ClientError" error="{\"RemoteException\":{\"exception\":\"IllegalArgumentException\",\"javaClassName\":\"java.lang.IllegalArgumentException\",\"message\":\"n must be positive\"}}" instance=23456251808160
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:313:in `request'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:231:in `operate_requests'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'


If you see the above, there is a problem with the HDFS file system itself (it cannot accept writes, etc.); please test HDFS operations separately to confirm they work!
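One way to test HDFS writes in isolation, without Fluentd in the loop, is a manual WebHDFS CREATE. This is a sketch: CREATE is a standard two-step WebHDFS operation (the NameNode answers 307 with a DataNode Location to which the data is then PUT), but the host, file name, and user.name here are assumptions to adjust for your cluster.

```shell
NAMENODE=node1.test.com

# Step 1: ask the NameNode to create the file; it replies HTTP 307 with a
# DataNode URL in the Location header. Step 2: PUT the data to that URL.
URL="http://${NAMENODE}:50070/webhdfs/v1/log/probe.txt?op=CREATE&user.name=hadoop"
echo "$URL"
# curl -i -X PUT "$URL"
# curl -i -X PUT -T probe.txt '<Location from the 307 response>'
```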


2)

2014-12-04 14:44:55 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com
2014-12-04 14:44:55 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:45:30 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'

(The same warning and stack trace repeat on every retry, e.g. again at 14:45:31, until the permissions are fixed.)


The above situation generally means your HDFS permissions are not set correctly; chmod 777 the HDFS directory that stores the logs and it will work!


Once logs are being written to HDFS properly, the td-agent log shows: 2014-12-04 14:48:40 +0800 [warn]: retry succeeded. instance=23456251808060



This article is from the "shine_forever" blog; please keep this source: http://shineforever.blog.51cto.com/1429204/1586347
