Collecting Logs into HDFS with the Open-Source Log Collector fluentd


Tags: hadoop   hdfs   fluentd   log collection

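Note: I originally studied flume as my open-source log collection system, but found it rather cumbersome to configure. Searching online, I came across fluentd, another open-source log collector that is much simpler to configure and performs well, so I switched to studying it instead! The official site is fluentd.org; it supports 300+ plugins, so it should be a solid choice!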


fluentd talks to HDFS through Hadoop's WebHDFS interface, so before configuring fluentd, make sure that WebHDFS is reachable and that data can be written to HDFS through it!
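
Before touching fluentd, it may help to confirm that the WebHDFS REST endpoint answers at all. The sketch below assumes WebHDFS is enabled on the NameNode and listening on port 50070 (the port used in the fluentd configuration in this article). webhdfs_url and check_webhdfs are hypothetical helper names, not part of Hadoop or fluentd.

```shell
# Hypothetical helper: build a WebHDFS REST URL.
# $1 = NameNode host, $2 = port, $3 = HDFS path (starting with /), $4 = operation
webhdfs_url() {
    printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: list an HDFS directory over WebHDFS.
# $1 = NameNode host, $2 = port, $3 = HDFS path
check_webhdfs() {
    curl -s --max-time 10 "$(webhdfs_url "$1" "$2" "$3" LISTSTATUS)"
}
```

For example, check_webhdfs node1.test.com 50070 / should return a JSON document containing FileStatuses if the endpoint is working; an empty reply or a connection error suggests WebHDFS needs attention first.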

The architecture is shown below:

Figure (http-to-hdfs.png): http://s3.51cto.com/wyfs02/M00/54/5B/wKiom1SAB_PCQmAmAADSV4dSD3E785.jpg


For WebHDFS configuration and testing, see this article: http://shineforever.blog.51cto.com/1429204/1585942


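Overview of the installation environment:

1) fluentd and the Hadoop NameNode should be installed on the same physical machine;

2) OS version: RHEL 5.7, 64-bit

3) Hadoop version: 1.2.1

4) JDK version: 1.7.0_67

5) Ruby version: ruby 2.1.2p95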


1. Preparation: install Ruby, because fluentd is written in Ruby:

yum install openssl-devel zlib-devel gcc gcc-c++ make autoconf readline-devel curl-devel expat-devel gettext-devel


Remove the Ruby version that ships with the system:

yum erase ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-ri ruby-docs


Install Ruby from source:

wget -c http://cache.ruby-lang.org/pub/ruby/2.1/ruby-2.1.2.tar.gz

Then unpack the archive, compile it, install Ruby into /usr/local/ruby, and add it to the environment variables in your profile.
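
The unpack, compile, and install steps can be sketched as follows. This is only an outline under assumptions: it uses the standard configure/make sequence, assumes the tarball downloaded above is in the current directory, and appends to /etc/profile as root. ruby_profile_line and build_ruby are hypothetical helper names.

```shell
# Install prefix taken from the article.
PREFIX=/usr/local/ruby

# Hypothetical helper: print the line that puts the new ruby on PATH.
ruby_profile_line() {
    printf 'export PATH=%s/bin:$PATH\n' "$1"
}

# Hypothetical helper: unpack, build, install, then extend /etc/profile.
build_ruby() {
    tar -xzf ruby-2.1.2.tar.gz &&
    cd ruby-2.1.2 &&
    ./configure --prefix="$PREFIX" &&
    make &&
    make install &&
    ruby_profile_line "$PREFIX" >> /etc/profile
}
```

After running build_ruby, open a new login shell (or source /etc/profile) so that the updated PATH takes effect.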

Test Ruby:

# ruby -v

ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]


If you see output like the above, Ruby has been installed successfully.


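2. Installing fluentd:

fluentd can be installed in three ways: from source, via gem, or via rpm;

This article uses the rpm method. The official documentation already provides a script, so you can simply run it: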


curl -L http://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh


After a successful installation, the service is started with: /etc/init.d/td-agent start

The configuration files are located in: /etc/td-agent/


# cd /etc/td-agent/

# pwd

/etc/td-agent

# ls

logrotate.d  plugin  prelink.conf.d  td-agent.conf


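3. Use gem to install the fluentd plugin fluent-plugin-webhdfs

1) Because the firewall in mainland China blocks the default Ruby gem source, switch gem to a mirror. (The transcript below removes and re-adds the mirror itself; on a fresh installation the source to remove would normally be the default one, https://rubygems.org/.)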

# td-agent-gem source --remove https://ruby.taobao.org/

https://ruby.taobao.org/ removed from sources

# td-agent-gem source -a https://ruby.taobao.org/

https://ruby.taobao.org/ added to sources


2) Install the plugin:

td-agent-gem install fluent-plugin-webhdfs


List the installed gems:

td-agent-gem list


*** LOCAL GEMS ***


bigdecimal (1.2.4)

bundler (1.7.7)

cool.io (1.2.4)

fluent-mixin-config-placeholders (0.3.0)

fluent-mixin-plaintextformatter (0.2.6)

fluent-plugin-webhdfs (0.4.1)

fluentd (0.12.0.pre.2)

http_parser.rb (0.6.0)

io-console (0.4.2)

json (1.8.1)

ltsv (0.1.0)

minitest (4.7.5)

msgpack (0.5.9)

psych (2.0.5)

rake (10.1.0)

rdoc (4.1.0)

sigdump (0.2.2)

string-scrub (0.0.5)

test-unit (2.1.2.0)

thread_safe (0.3.4)

tzinfo (1.2.2)

tzinfo-data (1.2014.10)

uuidtools (2.1.5)

webhdfs (0.6.0)

yajl-ruby (1.2.1)


4) Configure fluentd to load the fluent-plugin-webhdfs plugin;

Edit the configuration file:

vim /etc/td-agent/td-agent.conf

and add the following section:

<match hdfs.*.*>
  type webhdfs
  host node1.test.com
  port 50070
  path /log/%Y%m%d_%H/access.log.${hostname}
  flush_interval 1s
</match>
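
To see which HDFS file a given event should land in, the path setting can be expanded by hand. The sketch below assumes that %Y%m%d_%H follows strftime conventions in local time and that ${hostname} resolves to the machine's host name. expand_hdfs_path is a hypothetical helper, and date -d requires GNU date.

```shell
# Hypothetical helper: expand /log/%Y%m%d_%H/access.log.${hostname} by hand.
# $1 = event timestamp (any format GNU date -d accepts), $2 = host name
expand_hdfs_path() {
    printf '/log/%s/access.log.%s\n' "$(date -d "$1" +%Y%m%d_%H)" "$2"
}
```

For instance, expand_hdfs_path "2014-12-03 15:56:12" node1.test.com prints /log/20141203_15/access.log.node1.test.com, which matches the path reported in the td-agent warnings later in this article.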


Restart the td-agent service (for example, with /etc/init.d/td-agent restart);


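5) Set up HDFS:

Create the log directory:

hadoop fs -mkdir /log

Set the permissions on the log directory to 777. Without this, data cannot be written. The official documentation does not mention it, and it took me a lot of testing to find out!

hadoop fs -chmod 777 /log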
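
Opening the directory to everyone with 777 works, but a narrower alternative may be to hand the directory to the user that the WebHDFS writes arrive as. The sketch below assumes that user is webuser (the name reported in the permission error later in this article) and that the commands are run as the HDFS superuser. run and grant_log_dir are hypothetical helper names, and DRY_RUN is an illustrative switch that prints the commands instead of executing them.

```shell
# Hypothetical helper: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: give one HDFS user ownership of the log directory.
# $1 = HDFS user the writes arrive as, $2 = HDFS directory
grant_log_dir() {
    run hadoop fs -chown "$1" "$2"
    run hadoop fs -chmod 755 "$2"
}
```

Running DRY_RUN=1 grant_log_dir webuser /log first shows the two hadoop fs commands without changing anything; dropping DRY_RUN applies them.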


6) Restart the td-agent service again and start testing. The test command is:

curl -X POST -d 'json={"json":"message"}' http://172.16.41.151:8888/hdfs.access.test


You should now see that the files in Hadoop have changed!

Screenshot of the result (QQ圖片20141204162108.jpg): http://s3.51cto.com/wyfs02/M00/54/5B/wKioL1SAGl3hvxMKAAEJZGry3HE760.jpg
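
The manual test can also be wrapped in a small script that posts one event and then lists the current hour's directory in HDFS. This is only a sketch: event_url and send_and_check are hypothetical helpers, the 5-second wait is an arbitrary allowance for the 1s flush_interval, and the directory layout assumes the /log/%Y%m%d_%H path from the configuration step.

```shell
# Hypothetical helper: URL of fluentd's HTTP input for a given tag.
# $1 = fluentd host, $2 = HTTP input port, $3 = tag
event_url() {
    printf 'http://%s:%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: post one test event, wait, then list this hour's directory.
send_and_check() {
    curl -s -X POST -d 'json={"json":"message"}' "$(event_url "$1" "$2" "$3")"
    sleep 5
    hadoop fs -ls "/log/$(date +%Y%m%d_%H)"
}
```

With the values used in this article, the call would be send_and_check 172.16.41.151 8888 hdfs.access.test; a new access.log.* file in the listing indicates the event reached HDFS.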

Errors encountered during installation and configuration:

1)

2014-12-03 15:56:12 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141203_15/access.log.node1.test.com

2014-12-03 15:56:12 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-03 15:56:28 +0800 error_class="WebHDFS::ClientError" error="{\"RemoteException\":{\"exception\":\"IllegalArgumentException\",\"javaClassName\":\"java.lang.IllegalArgumentException\",\"message\":\"n must be positive\"}}" instance=23456251808160

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:313:in `request'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:231:in `operate_requests'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'


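If you see the error above, something is wrong with your HDFS file system itself (for example, it cannot accept writes). Test separately whether HDFS is running properly!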


2)

2014-12-04 14:44:55 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com

2014-12-04 14:44:55 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:45:30 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'

2014-12-04 14:45:31 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com

2014-12-04 14:45:31 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:46:26 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'

  2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'


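If you see the error above, it usually means that permissions on HDFS are not set correctly. Run chmod 777 on the HDFS directory that stores the logs and it will work!


Once logs are being written to HDFS normally, the td-agent log shows: 2014-12-04 14:48:40 +0800 [warn]: retry succeeded. instance=23456251808060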
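
To tell at a glance whether failed flushes were followed by a successful retry, the two messages quoted in this article can be counted in the td-agent log. This is a minimal sketch: flush_summary is a hypothetical helper, and the log location /var/log/td-agent/td-agent.log mentioned in the usage note is an assumption about the default td-agent package layout.

```shell
# Hypothetical helper: count flush failures and successful retries in a log file.
# $1 = path to the td-agent log
flush_summary() {
    log=$1
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    failed=$(grep -c 'temporarily failed to flush the buffer' "$log" || true)
    ok=$(grep -c 'retry succeeded' "$log" || true)
    echo "failed=$failed succeeded=$ok"
}
```

A typical call would be flush_summary /var/log/td-agent/td-agent.log; a non-zero failed count with no succeeded entries suggests the HDFS or permission problems described above are still present.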



This article comes from the "shine_forever的部落格" blog; please keep this attribution: http://shineforever.blog.51cto.com/1429204/1586347
