vim /usr/local/logstash/etc/hello_search.conf
Enter the following:
input {
  stdin {
    type => "human"
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    host => "192.168.33.10"
    port => 9300
  }
}
Note that the port is 9300 rather than 9200: the elasticsearch output here talks to Elasticsearch over its transport port, not the HTTP port.
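If you would rather go through the HTTP port 9200, a sketch along the following lines should work on Logstash 1.4 and later; the protocol option is version-dependent, so treat this as an assumption to verify against your release:
elasticsearch {
  host => "192.168.33.10"
  # protocol => "http" switches the output to the REST API on port 9200
  protocol => "http"
  port => 9200
}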
Start:
/usr/local/logstash/bin/logstash agent -f /usr/local/logstash/etc/hello_search.conf
Starting in standard input/output mode:
bin/logstash -e 'input { stdin {} } output { stdout {} }'
Changing the output format:
bin/logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'
Multiple outputs:
bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost } stdout {} }'
Default configuration: one index per day
You will find that Logstash is smart enough to create indices in Elasticsearch by date: the default index name follows the pattern logstash-YYYY.MM.DD. At midnight (GMT), Logstash automatically rolls over to a new index based on the timestamp. We can decide how much data to keep according to how far back we need to query, and of course older data can be moved (re-indexed) elsewhere so it stays easy to query. If you simply want to delete data older than a certain period, Elasticsearch Curator does the job.
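Because each day lives in its own index, dropping old data is just a matter of deleting indices. A minimal sketch using the Elasticsearch REST API directly (the host and date below are only placeholders; Curator automates the same idea on a schedule):
# delete the daily index for 2013-11-01 (example date and host)
curl -XDELETE 'http://192.168.33.10:9200/logstash-2013.11.01'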
The life cycle of an event
Inputs, filters, codecs, and outputs make up the core configuration items of Logstash. Logstash builds an event-processing pipeline that extracts data from your logs and stores it in Elasticsearch, laying the groundwork for efficient querying. To give you a quick sense of the many options Logstash offers, let's discuss some of the most common configurations. For more information, see the Logstash event pipeline documentation.
Inputs
Inputs are how log data gets into Logstash. The most common ones are:
File: reads from a file on the filesystem, much like the UNIX command "tail -0F".
Syslog: listens on the well-known port 514 and parses log data according to RFC3164.
Redis: reads from a Redis server, supporting both channel (publish/subscribe) and list modes. Redis commonly plays the "broker" role in a centralized Logstash deployment, buffering the event queue that the Logstash indexers consume.
Lumberjack: receives data over the Lumberjack protocol, now provided by logstash-forwarder.
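As a quick illustration, a file input plus a syslog input might be sketched like this (the paths and ports are only examples):
input {
  file {
    # follow the web server's access log, like tail -0F
    path => "/var/log/httpd/access_log"
  }
  syslog {
    # accept RFC3164 syslog messages on port 514
    port => 514
  }
}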
Filters
Filters are intermediate processing components in the Logstash pipeline. They are often combined with conditionals so that specific actions run on events that match particular rules. Common filters include:
Grok: parses arbitrary text and structures it. Grok is currently the best way to turn unstructured log data into structured, queryable data; more than 120 built-in patterns cover most needs.
Mutate: the mutate filter lets you change events, e.g. rename, remove, replace, or modify fields while the event is being processed.
Drop: discards an event entirely, for example debug events.
Clone: copies an event, optionally adding or removing fields in the copy.
GeoIP: adds geographic information about IP addresses (used by Kibana for map visualizations).
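A minimal sketch of mutate and drop working together (the field names here are purely illustrative, not from the original article):
filter {
  mutate {
    # rename the "hostip" field to "clientip" (example field names)
    rename => ["hostip", "clientip"]
  }
  if [loglevel] == "debug" {
    # throw away debug-level events altogether
    drop {}
  }
}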
Outputs
Outputs are the final stage of the Logstash pipeline. An event may pass through several outputs, but once all outputs have run, the event has completed its lifecycle. Some commonly used outputs:
Elasticsearch: if you plan to store your data efficiently and query it simply, Elasticsearch is the way to go. Yes, there is a whiff of advertising here.
File: writes event data to a file on disk.
Graphite: sends event data to Graphite, a popular open source component for storing and graphing metrics. http://graphite.wikidot.com/
Statsd: statsd is a service that collects statistics such as counters and timers over UDP and forwards aggregates to one or more backends; if you are already using statsd, this output will be useful.
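For example, events can be archived to disk alongside Elasticsearch with a sketch like this (the path is only a placeholder):
output {
  elasticsearch { host => localhost }
  file {
    # keep a local copy of every event as well
    path => "/var/log/logstash/archive.log"
  }
}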
Codecs
Codecs are stream filters that can operate as part of an input or output. They make it easy to separate the transport of your messages from the serialization process. Popular codecs include json, msgpack, and plain (text).
Json: encode/decode data in JSON format.
Multiline: merges multiple lines into a single event, for example a Java exception together with its stack trace.
For the complete list of configuration options, refer to the "plugin configuration" section of the Logstash documentation.
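As an illustration, the multiline codec can fold a Java stack trace into the event that precedes it; a sketch (with an example path) looks like this:
input {
  file {
    path => "/var/log/app/app.log"
    codec => multiline {
      # any line that starts with whitespace belongs to the previous event
      pattern => "^\s"
      what => "previous"
    }
  }
}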
More interesting Logstash features
Using configuration files
Specifying the configuration on the command line with the -e parameter is very common, but it quickly becomes unwieldy once you need more settings. In that case we create a configuration file and point Logstash at it. Create a file named "logstash-simple.conf" in the same directory as Logstash, with the following contents:
input { stdin {} }
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
Next, execute the command:
bin/logstash -f logstash-simple.conf
Logstash now runs using the configuration file we just created, which is much more convenient. Note that where we previously used the -e parameter to pass the configuration on the command line, we now use -f to read it from a file. This was a very simple example; of course, we'll keep going and write some more complex ones.
Filters
Filters are an in-line processing mechanism that lets you shape the data into whatever you need. Let's look at an example built around the grok filter; save the following as logstash-filter.conf:
input { stdin {} }
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
Run Logstash with this configuration:
bin/logstash -f logstash-filter.conf
Now paste the following line into your terminal (Logstash will pick it up from standard input):
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
You will see feedback similar to the following:
{
  "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
  "@timestamp" => "2013-12-11T08:01:45.000Z",
  "@version" => "1",
  "host" => "cadenza",
  "clientip" => "127.0.0.1",
  "ident" => "-",
  "auth" => "-",
  "timestamp" => "11/Dec/2013:00:01:45 -0800",
  "verb" => "GET",
  "request" => "/xampp/status.php",
  "httpversion" => "1.1",
  "response" => "200",
  "bytes" => "3891",
  "referrer" => "\"http://cadenza/xampp/navi.php\"",
  "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}
As you can see, Logstash (using the grok filter) was able to split one line of log data (in Apache's "combined log" format) into separate fields. This pays off later when we query and analyze our log data: filtering by HTTP response code, by IP address, and so on becomes easy. Very few common log formats lack a grok pattern, so if you are parsing a widespread format, chances are someone has already done the work. For the full set of matching rules, see the Logstash grok patterns.
The other filter used here is the date filter. It parses the timestamp out of the log line and uses it as the event's timestamp, regardless of when the data was actually collected by Logstash. You'll notice in this example that the @timestamp field is set to December 11, 2013, even though Logstash processed the event some time after the log was generated. This is handy when importing historical logs; without it the value would simply be the time at which Logstash processed the event.
A practical example
Apache logs (read from a file)
Now let's configure something actually useful: Apache access logs! We'll read the log from a local file and use conditionals to process only the events that meet our needs. First, create a configuration file named logstash-apache.conf with the following contents (adjust the file name and paths to your own setup):
input {
  file {
    path => "/tmp/access_log"
    start_position => beginning
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }
}
output {
  elasticsearch {
    host => localhost
  }
  stdout { codec => rubydebug }
}
Next, create the input file referenced above (in this example "/tmp/access_log"); you can use the following log lines as its contents (or substitute logs generated by your own web server):
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
Now run Logstash with the -f parameter as before:
bin/logstash -f logstash-apache.conf
You can see the Apache log data being imported into Elasticsearch. Logstash reads and processes the file specified in the configuration, and any lines added to the file later will also be captured, processed, and stored in ES. As a bonus, the type field of each event is replaced with "apache_access" (as specified in the configuration).
This configuration only has Logstash watching the Apache access_log, but in practice that is rarely enough: you usually want to watch the error_log as well. Changing the configuration above as follows takes care of it:
input {
  file {
    path => "/tmp/*_log"
...
Now Logstash processes both the error log and the access log. However, if you inspect your data (perhaps with elasticsearch-kopf), you'll notice that the access_log entries are split into separate fields while the error_log entries are not. That's because our grok filter is only configured to match the COMBINEDAPACHELOG format, so only logs in that format are split automatically. Wouldn't it be nice if we could control how each line is parsed based on its own format? Right, we can.
You may also have noticed that Logstash did not re-process events it had already handled in the file. Logstash records its position in each file and only processes newly appended lines. Neat!
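A minimal sketch of how to override that behaviour while testing, using the file input's sincedb_path option (pointing it at /dev/null is a common trick, not something from the original article):
input {
  file {
    path => "/tmp/*_log"
    # forget the read position so the files are re-read on every run
    sincedb_path => "/dev/null"
    start_position => beginning
  }
}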
Conditionals
Let's build on the previous example to introduce conditionals, a concept that will be familiar to most programmers: you can use if, else if, and else statements just like in any ordinary programming language. We'll label each event according to the file it came from (access_log, error_log, and any other file ending in "log").
input {
  file {
    path => "/tmp/*_log"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { type => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "apache_error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
You'll notice that we use the type field to label each event, but we don't actually parse the "error" or "random" logs. In reality there may be many kinds of error logs; how to parse them is left as an exercise for the reader, and you can use the logs you already have as test material.
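As a starting point for that exercise, here is a sketch of a grok filter for a typical Apache error log line. The pattern is assembled from standard grok patterns purely as an illustration; it is not the original author's configuration and may need tuning for your Apache version:
filter {
  if [type] == "apache_error" {
    grok {
      # e.g. [Wed May 18 12:40:18 2011] [error] [client 134.39.72.245] File does not exist: /var/www/favicon.ico
      match => { "message" => "\[%{DATA:error_timestamp}\] \[%{LOGLEVEL:loglevel}\] (?:\[client %{IPORHOST:clientip}\] )?%{GREEDYDATA:error_message}" }
    }
  }
}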
Syslog
OK, now let's look at another very practical example: syslog. Syslog is one of the longest-standing use cases for Logstash, and one it handles very well (as long as the lines roughly follow RFC3164). Syslog is the de facto UNIX logging standard: clients send log messages to a local file or to a central log server. For this example you don't need to run a real syslog instance; we'll fake it from the command line so you can get a feel for what happens.
First, let's create a simple configuration file for Logstash plus syslog, named logstash-syslog.conf:
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => ["received_at", "%{@timestamp}"]
      add_field => ["received_from", "%{host}"]
    }
    syslog_pri {}
    date {
      match => ["syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss"]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
Run Logstash:
bin/logstash -f logstash-syslog.conf
Normally a client would connect to port 5000 on the Logstash server and send its log data there. For this simple demonstration we'll just telnet to the Logstash server and type log lines by hand (much like we entered log lines on standard input in the earlier examples). Open a new shell window and run:
telnet localhost 5000
Copy and paste the following sample lines (you can type other text, but then the grok filter may not parse it correctly):
Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
Back in the window where Logstash is running, you'll see the lines being processed and parsed!
{
  "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
  "@timestamp" => "2013-12-23T22:30:01.000Z",
  "@version" => "1",
  "type" => "syslog",
  "host" => "0:0:0:0:0:0:0:1:52617",
  "syslog_timestamp" => "Dec 23 14:30:01",
  "syslog_hostname" => "louis",
  "syslog_program" => "CRON",
  "syslog_pid" => "619",
  "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
  "received_at" => "2013-12-23 22:49:22 UTC",
  "received_from" => "0:0:0:0:0:0:0:1:52617",
  "syslog_severity_code" => 5,
  "syslog_facility_code" => 1,
  "syslog_facility" => "user-level",
  "syslog_severity" => "notice"
}
Congratulations! If you've made it this far, you're already a capable Logstash user. You can configure and run Logstash comfortably and send events to it, but there is plenty more to dig into as you keep using it.
Reference: http://www.2cto.com/os/201411/352015.html
Kafka configuration: http://www.tuicool.com/articles/IRZvIz
http://my.oschina.net/abcfy2/blog/372138
Chinese documentation: http://kibana.logstash.es/content/