Danbo Time: 2016-03-13
1. Save into Elasticsearch
Logstash can write data to Elasticsearch over several different protocol implementations; this section describes the HTTP approach.
Example configuration:
output {
    elasticsearch {
        hosts => ["192.168.0.2:9200"]
        index => "logstash-%{type}-%{+YYYY.MM.dd}"
        document_type => "%{type}"
        workers => 1
        flush_size => 20000
        idle_flush_time => 10
        template_overwrite => true
    }
}
Explanation:
Bulk Send
flush_size and idle_flush_time together control how Logstash sends bulk requests to Elasticsearch. In the example above, Logstash buffers events until 20,000 have accumulated and then sends them in a single bulk request, with a maximum wait time of 10 s.
By default flush_size is 500 and idle_flush_time is 1 s. This is also why many people who only raise flush_size see no improvement in Elasticsearch write performance.
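To make the interaction concrete with a made-up event rate: if the pipeline receives about 2,000 events per second and flush_size is raised to 20,000 while idle_flush_time stays at the default 1 s, the idle timer still fires every second and each bulk request carries only about 2,000 events; the 20,000 threshold is never reached. Raising idle_flush_time as well, as in the configuration above, is what allows the larger batches to actually form.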
Index name
The name of the Elasticsearch index to write to; sprintf variables can be used here. To better fit the logging use case, Logstash also provides the %{+YYYY.MM.dd} notation: when the index name is expanded and a variable begins with a + sign, the remainder is treated as a time format string and applied to the event's timestamp.
The index name cannot contain capital letters, otherwise Elasticsearch will log an InvalidIndexNameException, while Logstash itself reports no error.
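As an illustration (the type value and timestamp are made up), an event with type nginx and an @timestamp on 2016-03-13, written through the configuration above, ends up in the index:

    logstash-nginx-2016.03.13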
Template
Elasticsearch lets you predefine index settings and mappings through index templates, and Logstash ships with a default template optimized for log data.
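The exact contents depend on the Logstash version; the following is a trimmed, illustrative sketch in the spirit of the 2.x-era template rather than a verbatim copy:

    {
      "template": "logstash-*",
      "settings": {
        "index.refresh_interval": "5s"
      },
      "mappings": {
        "_default_": {
          "dynamic_templates": [
            {
              "string_fields": {
                "match": "*",
                "match_mapping_type": "string",
                "mapping": {
                  "type": "string",
                  "index": "analyzed",
                  "omit_norms": true,
                  "fields": {
                    "raw": { "type": "string", "index": "not_analyzed", "doc_values": true, "ignore_above": 256 }
                  }
                }
              }
            }
          ],
          "properties": {
            "@timestamp": { "type": "date" },
            "geoip": {
              "type": "object",
              "dynamic": true,
              "properties": {
                "location": { "type": "geo_point", "doc_values": true }
              }
            }
          }
        }
      }
    }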
The key settings include:
Template for Index-pattern
This template is applied only to indices whose names match logstash-*. If you change Logstash's default index name, remember to also install a template that matches your custom index name. The simpler option is to keep the "logstash-" prefix and put your custom name after it, i.e. index => "logstash-custom-%{+YYYY.MM.dd}".
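If you really need a completely custom index name, the elasticsearch output can also upload a template of your own through its template options; a sketch, in which the file path, template name and index pattern are hypothetical:

    output {
        elasticsearch {
            hosts => ["192.168.0.2:9200"]
            index => "customlog-%{+YYYY.MM.dd}"
            # hypothetical template file whose "template" field is "customlog-*"
            template => "/etc/logstash/templates/customlog.json"
            template_name => "customlog"
            template_overwrite => true
        }
    }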
Refresh_interval for indexing
Elasticsearch is a near-real-time search engine: by default it refreshes newly written data every 1 s to make it searchable. Log analysis usually does not need to be that real-time, so the template shipped with Logstash raises this to 5 s, and you can increase the refresh interval further to improve write performance.
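As a sketch of raising it further on an existing index (the index name and the 30 s value are only examples):

    curl -XPUT 'http://192.168.0.2:9200/logstash-nginx-2016.03.13/_settings' -d '
    {
      "index": { "refresh_interval": "30s" }
    }'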
Multi-field with not_analyzed
Elasticsearch automatically analyzes string fields with its default analyzer (splitting on spaces, dots, slashes and so on). Analysis is very important for search and scoring, but it noticeably hurts index write performance and aggregation requests. The Logstash template therefore maps string fields as multi-fields, adding a sub-field that is not analyzed. That is, when you want aggregated results for the url field, do not use "url" directly; use "url.raw" as the field name instead.
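For example, a top-ten terms aggregation over the raw sub-field might look like this (the index name and the url field are assumptions carried over from the discussion above):

    curl -XPOST 'http://192.168.0.2:9200/logstash-nginx-2016.03.13/_search?size=0' -d '
    {
      "aggs": {
        "top_urls": {
          "terms": { "field": "url.raw", "size": 10 }
        }
      }
    }'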
Geo_point
Elasticsearch supports the geo_point type, geo distance aggregations and filters, and so on. For example, you can request the total number of data points within a 10 km radius of a given geo_point.
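A sketch of such a request on the 1.x/2.x versions current when this was written, assuming the geoip filter has populated a geoip.location field (the coordinates, given as [lon, lat], and the index name are made up):

    curl -XPOST 'http://192.168.0.2:9200/logstash-nginx-2016.03.13/_search?size=0' -d '
    {
      "query": {
        "filtered": {
          "filter": {
            "geo_distance": {
              "distance": "10km",
              "geoip.location": [116.40, 39.90]
            }
          }
        }
      }
    }'

With size=0, the hits.total value in the response is the count of points inside the 10 km circle.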
Doc_values
doc_values is a feature introduced in Elasticsearch 1.0. When it is enabled, fielddata is built while indexing and written to disk alongside the index, instead of being loaded into heap memory at query time. doc_values can only be enabled on fields that are not analyzed (for string fields, those mapped with "index": "not_analyzed").
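A mapping excerpt for a hypothetical status field with doc_values enabled could look like this:

    "status": {
      "type": "string",
      "index": "not_analyzed",
      "doc_values": true
    }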
*******
ELK Example - Lite Version 2