Logstash is a data analysis tool primarily designed to process log data. The whole stack can be thought of as an MVC model: Logstash is the controller layer, Elasticsearch the model layer, and Kibana the view layer.
First, data is passed to Logstash, which filters it and formats it as JSON, then hands it to Elasticsearch for storage and search indexing. Kibana provides the front-end pages for search and chart visualization; it calls the Elasticsearch interface and visualizes the data it returns. Logstash and Elasticsearch run on the JVM, while Kibana is built on Node.js.
The official website, https://www.elastic.co/, has very detailed usage instructions; besides the docs there are also video tutorials. This post collects some of the more important settings and usages from the docs and videos.

I. Configuration of Logstash

1. Define the data source
Write a configuration file that can be named logstash.conf and enter the following:
input {
    file {
        path => "/data/web/logstash/logfile/*/*"
        start_position => "beginning" # read from the beginning of the file
    }
    # stdin {} # data can also be read from standard input
}
The supported data sources include files, stdin, Kafka, Twitter, and so on; you can even write an input plugin yourself. If you specify the file path with a wildcard as above, Logstash will automatically pick up any new log file copied into the directory.
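For illustration, a minimal Kafka input might look like the sketch below. This assumes the logstash-input-kafka plugin is installed; the broker address and topic name are made-up placeholders, and older plugin versions take zk_connect and topic_id instead of these options:

input {
    kafka {
        bootstrap_servers => "localhost:9092" # assumed broker address
        topics => ["weblogs"]                 # hypothetical topic name
    }
}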
2. Define the format of the data
Match the fields with regular expressions according to the format of the log:
filter {
    # define the format of the data
    grok {
        match => { "message" => "%{DATA:timestamp}\|%{IP:serverip}\|%{IP:clientip}\|%{DATA:logsource}\|%{DATA:userid}\|%{DATA:requrl}\|%{DATA:requri}\|%{DATA:refer}\|%{DATA:device}\|%{DATA:textduring}\|%{DATA:duringtime:int}\|\|" }
    }
}
This is because the log records look like this:
2015-05-07-16:03:04|10.4.29.158|120.131.74.116|web|11299073|http://quxue.renren.com/shareapp?isappinstalled=0&userid=11299073&from=groupmessage|/shareapp|null|Mozilla/5.0 (iPhone; CPU iPhone OS 8_2 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Mobile/12D508 MicroMessenger/6.1.5 NetType/WIFI|duringtime|98||
The fields are separated by the | symbol: the first is the access time, a timestamp used as the Logstash timestamp, followed by the server IP, the client IP, the machine type (web/app/admin), the user's ID (0 if none), the full URL of the request, the requested controller path, the referer, the device information, the literal text "duringtime", and the time the request took.
As in the code above, each field is defined in turn and matched with a regular expression; DATA is a pattern predefined by Logstash (essentially the non-greedy .*?), and the part after the colon names the field.
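For reference, these definitions come from the grok-patterns file bundled with Logstash (the IP pattern used above is defined there as well):

DATA .*?
GREEDYDATA .*
IP (?:%{IPV6}|%{IPV4})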
We take the access time as the Logstash timestamp; based on it we can inspect how requests in a given period were parsed, and if this time cannot be matched, Logstash uses the current time as that record's timestamp instead. Inside the filter you need to define the timestamp's format, i.e. the format it has in the log:
filter {
    # define the format of the data
    grok { # same as above ... }
    # define the format of the timestamp
    date {
        match => ["timestamp", "yyyy-MM-dd-HH:mm:ss"]
        locale => "cn"
    }
}
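Note that the match string of the date filter uses Joda-Time tokens, so case is significant:

yyyy  four-digit year
MM    month of year
dd    day of month
HH    hour of day (0-23)
mm    minute of hour
ss    second of minute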
Among the fields above you also need to tell Logstash which one is the client IP; Logstash will then automatically look up the location information for that IP:
filter {
    # define the format of the data
    grok { # same as above }
    # define the format of the timestamp
    date { # same as above }
    # define which field is the client IP (as defined in the data format above)
    geoip {
        source => "clientip"
    }
}
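To give a sense of the result, the geoip filter attaches a geoip sub-object to each event, roughly like the sketch below; the field names come from the filter's bundled GeoIP database, while the location values here are invented for illustration:

"geoip" => {
    "ip"           => "120.131.74.116",
    "country_name" => "China",      # illustrative value
    "city_name"    => "Beijing",    # illustrative value
    "location"     => [116.4, 39.9] # [longitude, latitude]
}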
The same goes for the client's UA: because UAs come in many formats, Logstash also analyzes them automatically and extracts the operating system and other related information:
    # define which field is the client device
    useragent {
        source => "device"
        target => "userdevice"
    }
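Parsing the UA from the sample log line would yield a userdevice sub-object with fields along these lines; the field names are those produced by the useragent filter, while the values are illustrative, since the exact browser name depends on the UA database version:

"userdevice" => {
    "name"  => "MicroMessenger", # illustrative value
    "os"    => "iOS 8.2",        # illustrative value
    "major" => "6",
    "minor" => "1"
}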
You also need to tell Logstash which fields are integers, so they can be sorted in later analysis; in this data there is only one, the request time:
    # fields that need conversion; here the request time is converted to int before being passed to Elasticsearch
    mutate {
        convert => ["duringtime", "integer"]
    }
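As an aside, newer Logstash releases also accept a hash form of convert, equivalent to the array form above:

mutate {
    convert => { "duringtime" => "integer" }
}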
3. Output Configuration
Finally, configure the output, which sends the filtered data to Elasticsearch:
output {
    # save the output to Elasticsearch; records whose timestamp was not matched are not saved,
    # because URL parameters in the log can contain newlines
    if [timestamp] =~ /^\d{4}-\d{2}-\d{2}/ {
        elasticsearch { host => localhost }
    }
    # output to stdout
    # stdout { codec => rubydebug }
    # define the username and password for accessing the data
    # user => webservice
    # password => 1q2w3e4r
}
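A version note: in Logstash 2.x and later the elasticsearch output takes a hosts array instead of host, so the equivalent line would be roughly:

elasticsearch { hosts => ["localhost:9200"] }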
Save the above configuration as logstash.conf, then run Logstash:
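A typical invocation from the Logstash installation directory (the config file path here is an example):

bin/logstash -f logstash.conf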
After Logstash has started, feed it the access record above and Logstash will output the filtered data:
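Assuming the stdout { codec => rubydebug } line above is uncommented, the printed event would look roughly like the sketch below; the values are derived from the sample log line, the message field is shortened here, and @timestamp is the access time converted to UTC:

{
       "message" => "2015-05-07-16:03:04|10.4.29.158|120.131.74.116|web|...",
    "@timestamp" => "2015-05-07T08:03:04.000Z",
     "timestamp" => "2015-05-07-16:03:04",
      "serverip" => "10.4.29.158",
      "clientip" => "120.131.74.116",
     "logsource" => "web",
        "userid" => "11299073",
    "duringtime" => 98,
         "geoip" => { ... },    # location fields as sketched earlier
    "userdevice" => { ... }     # UA fields as sketched earlier
}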
You can see that Logstash automatically looks up the IP's location and analyzes the device field of the request.

II. Configuration of Elasticsearch and Kibana

1. Elasticsearch
Nothing needs to be configured here; the defaults work. The configuration file is config/elasticsearch.yml.
If you need the data to expire after some time, you can add these two lines (found by eyeballing the docs and not verified; readers can try it for themselves):
# set data to expire after 30 days
indices.cache.filter.expire: 30d
index.cache.filter: 30d
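Since Logstash indices are named by day (e.g. logstash-2015.05.05, as in the health output further below), expired data can also be removed by deleting whole indices through the standard delete-index API, a sketch:

curl -XDELETE 'localhost:9200/logstash-2015.05.05'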
Elasticsearch listens on port 9200 by default; through it you can query and manage the cluster, for example checking the health status of the indices:
curl 'localhost:9200/_cluster/health?level=indices&pretty'
Output
{" cluster_name ":" Elasticsearch "," status ":" Yellow "," timed_out ": false," numb Er_of_nodes ": 2," Number_of_data_nodes ": 1," active_primary_shards ": 161," Active_shards ": 161," relocating_sh
Ards ": 0," Initializing_shards ": 0," unassigned_shards ": 161," Number_of_pending_tasks ": 0," indices ": { "logstash-2015.05.05": {"status"