Build a Distributed Log Collection System

Tags: kibana, rabbitmq, logstash

Preface

As a system grows, it gets split into multiple independent processes, such as a web front end plus WCF/Web API services, and becomes a distributed system.

It then becomes hard to see how a request travels from start to finish, and debugging and tracing become more complicated; the difficulty grows with the number of processes.

This is where a distributed log collection system comes in.

Today we will introduce an open-source log collection and display stack: logstash (Java-based) + kibana (JRuby-based, bundled with logstash) + elasticsearch + rabbitmq.

The architecture diagram (the image is borrowed) is as follows: multiple logstash shippers feed a broker (redis/rabbitmq), a logstash indexer drains the broker into elasticsearch, and kibana sits on top as the query interface.

    • Although the diagram shows redis, that doesn't matter; it can be swapped for rabbitmq, as in this article.
    • The broker (redis/rabbitmq) can be removed, but without it elasticsearch gets dragged down during peak hours; its purpose is to smooth out traffic peaks.
    • The diagram shows three shipper logstash instances, meaning there can be any number of them, distributed across different servers, whether Windows or Linux.
    • After reading the above three points, you should be fairly reassured about the scalability of this architecture; in all fairness, it is very flexible. For details, see http://logstash.net/docs/1.4.2/

Installation Method

A quick search turns up plenty of installation guides. Note that kibana is already bundled in the latest version of logstash; you don't need to download the kibana code separately, just run the logstash web command directly.
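For reference, with logstash 1.4.x the two commands involved look roughly like this (the config file name is a placeholder):

    # run a logstash agent (shipper or indexer) with a given config file
    bin/logstash agent -f shipper.conf

    # serve the bundled kibana UI; no separate kibana download is needed
    bin/logstash web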

 

Body

The logstash input type used in this article is the file input: logs are collected by watching text files for new lines. (logstash supports many inputs, and text files are only one of them; for details, see the documentation URL above.)
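Here is a minimal sketch of the shipper's input section, assuming logstash 1.4.x; the log path is hypothetical and must point at the file your application actually writes:

    input {
      file {
        # hypothetical path; point this at the application's log file
        path => "/var/log/myapp/log.txt"
        start_position => "beginning"
      }
    }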

Assume that the log file log.txt gains a new line of log output, for example:

[192.168.1.1] [23:59:00] [Error] [page1.page_load] Null exception, Bal...

The logstash shipper, whose configuration file points at this log.txt in the local directory, detects the new line with the content above, and then it will:

    • Match it with a regular expression
      • 192.168.1.1 ==> serverip
      • 23:59:00 ==> eventtime
      • Error ==> loglevel
      • page1.page_load ==> method
      • Null exception, Bal... ==> messagebody
      • Of course, these matching rules have to be set up in the configuration file (see the sketch after this list).
    • Send it on to the next node
      • In this article, that is the rabbitmq node.
      • Of course, this also has to be set up in the configuration file (again, see below).
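A minimal sketch of the shipper's filter and output sections, assuming logstash 1.4.x plugin options and a hypothetical broker host; the grok pattern mirrors the bracketed log format above:

    filter {
      grok {
        # parse "[ip] [time] [level] [method] message" into named fields
        match => [ "message", "\[%{IP:serverip}\] \[%{TIME:eventtime}\] \[%{WORD:loglevel}\] \[%{DATA:method}\] %{GREEDYDATA:messagebody}" ]
      }
    }

    output {
      rabbitmq {
        host          => "mq.example.local"   # hypothetical broker address
        exchange      => "logstash"
        exchange_type => "direct"
        key           => "logstash"
      }
    }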

Rabbitmq here plays the role of a buffer that absorbs peak traffic.

So who consumes the rabbitmq messages? The logstash indexer. The indexer is actually very simple: it only receives messages from MQ and forwards them to the elasticsearch inverted-index engine at the back end.
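And a minimal sketch of the indexer's configuration, under the same assumptions (logstash 1.4.x options, hypothetical host names):

    input {
      rabbitmq {
        host     => "mq.example.local"   # hypothetical broker address
        exchange => "logstash"           # bind to the shipper's exchange
        key      => "logstash"
        queue    => "logstash"
        durable  => true
      }
    }

    output {
      elasticsearch {
        host     => "es.example.local"   # hypothetical elasticsearch node
        protocol => "http"
      }
    }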

Finally, at the kibana web query console, developers query the logs collected by logstash through the kibana query interface. Kibana is described below.

Kibana's data source:

Elasticsearch: a distributed, scalable inverted-index search engine whose core is based on Lucene.

The customizable kibana query interface offers:

    • Columns can be changed flexibly
    • You can select a time range with the mouse (to view the log list for that range)
    • The log list refreshes automatically
    • You can scope queries to a particular deployment being monitored (such as the production system, UAT system, and development demo)
    • You can view pie charts and other statistical charts for a field over a given period of time
    • Flexible sorting
    • You can define the position of a column (front or back)
    • You can define whether a column is displayed

Take a look at the figure above.
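As an illustration, the kibana search box accepts Lucene query string syntax over the fields that grok produced. Assuming the field names configured earlier, a query for all errors from one server might look like this:

    loglevel:Error AND serverip:"192.168.1.1"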

 

When building the whole collection system, beyond setting up the logstash components, you also need to pay attention to the log file storage format, that is, the bracketed one-record-per-line format shown in the section above. This format must correspond to the logstash parsing rules, and the logstash field names are in turn mapped onto the kibana query interface.

 

On the application side, make sure all processes record logs through a unified logging function, so that the text file format is guaranteed and the whole loop is closed.
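A minimal sketch of such a helper, in C# since the article mentions a Web + WCF/Web API stack; the class name, log path, and IP lookup are hypothetical, and the point is only that every process emits the same bracketed single-line format the grok pattern expects:

    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Net.Sockets;

    public static class AppLogger
    {
        private static readonly object Sync = new object();
        // hypothetical path; must match the path watched by the shipper's file input
        private const string LogPath = @"C:\logs\log.txt";

        public static void Log(string level, string method, string message)
        {
            // newlines inside the message would break the one-record-per-line format
            message = message.Replace("\r", " ").Replace("\n", " ");
            string line = string.Format("[{0}] [{1:HH:mm:ss}] [{2}] [{3}] {4}",
                GetLocalIp(), DateTime.Now, level, method, message);
            lock (Sync)
            {
                File.AppendAllText(LogPath, line + Environment.NewLine);
            }
        }

        private static string GetLocalIp()
        {
            var addr = Dns.GetHostAddresses(Dns.GetHostName())
                .FirstOrDefault(a => a.AddressFamily == AddressFamily.InterNetwork);
            return addr == null ? "127.0.0.1" : addr.ToString();
        }
    }

A call such as AppLogger.Log("Error", "page1.page_load", "Null exception, Bal...") then produces exactly the sample line parsed earlier.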

 

For specific configuration details, you can join a logstash QQ group or refer to the documentation link at the beginning of this article.

 

Done.

 
