Logstash Quick Start


Original address: http://www.2cto.com/os/201411/352015.html

Original address: http://logstash.net/docs/1.4.2/tutorials/getting-started-with-logstash (English original)

The translator's English is limited; if there are errors, please point them out.

Brief introduction

Logstash is a tool for receiving, processing, and forwarding logs. It supports system logs, web server logs, error logs, application logs, and in general any kind of log you can throw at it. Sounds good, doesn't it?
In the typical usage scenario (ELK), Elasticsearch serves as the back-end data store and Kibana as the front end for reports and visualization. Logstash plays the role of the porter in this process, building a powerful pipeline for storing data, querying reports, and parsing logs. Logstash ships with a wide range of input, filter, codec, and output plugins that make it easy to build powerful functionality. All right, let's get started.

Prerequisite: Java

Logstash only requires a Java Runtime Environment (JRE) to run. You can check your installation by running java -version on the command line, which should print something like the following:

java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

To make sure things run smoothly, it is recommended to use a fairly recent JRE. You can get an open source version from http://openjdk.java.net, or download the Oracle JDK from http://www.oracle.com/technetwork/java/index.html. Once the JRE is installed on your system, we can continue.

Start and run Logstash with two commands

First, download Logstash:

curl -O https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz

Now you should have a file called logstash-1.4.2.tar.gz. Let's unpack it:

tar zxvf logstash-1.4.2.tar.gz
cd logstash-1.4.2

Now let's run it:

bin/logstash -e 'input { stdin { } } output { stdout { } }'

Now type some characters at the command line and you will see Logstash's output:

hello world
2013-11-21T01:22:14.405+0000 0.0.0.0 hello world

OK, that's kind of interesting... In the example above we define an input called "stdin" and an output called "stdout", and whatever characters we type, Logstash returns them in a structured format. Note that we used the -e flag on the command line, which lets Logstash accept a configuration directly from the command line. This is especially handy for quickly and repeatedly testing whether a configuration is correct without having to write a configuration file.
Let's try a slightly more interesting example. First press CTRL-C to exit the Logstash instance that is still running, then restart Logstash with the following command:

bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'

Enter some characters again; this time type "goodnight moon":

goodnight moon
{
  "message" => "goodnight moon",
  "@timestamp" => "2013-11-20T23:48:05.335Z",
  "@version" => "1",
  "host" => "my-laptop"
}

In the example above we changed what Logstash outputs by reconfiguring the "stdout" output (adding a "codec" parameter). In the same way, by adding or modifying inputs, outputs, and filters in your configuration, you can shape arbitrary log data into whatever form makes it easiest to store and query.
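For instance, if you would rather see each event as a single line of JSON, a hedged variation of the same command (only the codec swapped) would be:

bin/logstash -e 'input { stdin { } } output { stdout { codec => json } }'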

Using Elasticsearch to store logs

Now you might say, "That looks nice, but typing characters by hand and echoing them back to the console is hardly practical." Fair enough, so next we will set up Elasticsearch to store the log data sent into Logstash. If you have not installed Elasticsearch yet, you can download the RPM/DEB packages, or fetch and start the tarball manually with the following commands:

curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.tar.gz
tar zxvf elasticsearch-1.1.1.tar.gz
cd elasticsearch-1.1.1/
./bin/elasticsearch


Note: this article uses Logstash 1.4.2 and Elasticsearch 1.1.1. Each Logstash release has a recommended Elasticsearch version, so make sure the Elasticsearch version matches your Logstash version!
For more information on installing and configuring Elasticsearch, refer to the Elasticsearch website. Since this is an introduction to Logstash, the default Elasticsearch installation and configuration is all we need.
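If you want a quick sanity check that Elasticsearch came up and is answering on its default port (9200), you can hit its root endpoint:

curl 'http://localhost:9200/?pretty'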
Elasticsearch should now be running and listening on port 9200 (everybody got this far, right?). With that in place, pointing Logstash at Elasticsearch as its back end only takes a small change. The defaults are good enough for both Logstash and Elasticsearch, so we omit any extra options and simply set Elasticsearch as the output:

bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'

Type a few characters; Logstash processes them as before (though this time you will not see any output on the console, because we did not configure stdout as an output).

You know, for logs

We can use the curl command to send a request and see whether Elasticsearch has received the data:

curl 'http://localhost:9200/_search?pretty'

The return content is as follows:

{  "took": 2,  "Timed_out": false,  "_shards": {    "total": 5,    "successful": 5,    "failed": 0  } ,  "hits": {    "total": 1,    "Max_score": 1.0,    "hits": [{      "_index": "logstash-2013.11.21",      "_ Type ":" Logs ","      _id ":" 2ijaokqarqgvbmgp3bspja ","      _score ": 1.0," _source ": {" message ":" You know, for logs "," @ Timestamp ":" 2013-11-21t18:45:09.862z "," @version ":" 1 "," host ":" My-laptop "}    }]}  }

Congratulations, you've successfully used Elasticsearch and Logstash to collect log data.

Elasticsearch plugins (an aside)

Here is another useful tool for browsing your Logstash data (that is, the data stored in Elasticsearch): the elasticsearch-kopf plugin. For more information, see the Elasticsearch plugins documentation. To install elasticsearch-kopf, simply run the following command in the directory where Elasticsearch is installed:

bin/plugin -install lmenezes/elasticsearch-kopf

Next, visit http://localhost:9200/_plugin/kopf to browse the data, settings, and mappings stored in Elasticsearch!

Multiple outputs

As a simple example of configuring multiple outputs, let's rerun Logstash with both stdout and Elasticsearch set as outputs:

bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } }'

Now when we type a few phrases, the input is echoed to our terminal and saved to Elasticsearch! (You can verify this with curl or the kopf plugin.)
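Outputs are not limited to two; you can keep appending them. A hedged sketch (the file path here is only an example) that additionally writes every event to a local file would be:

bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } file { path => "/tmp/logstash_out.log" } }'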

Default configuration: one index per day

You will find that Logstash is smart enough to create the indices on Elasticsearch for you... By default it creates one index per day, named in the format logstash-YYYY.MM.DD, and rolls over to a new index at midnight (GMT). We can decide how much data to keep based on how far back we need to query; older data can be moved elsewhere (re-indexed) so it remains easy to query, and if you simply want to delete data older than a certain period, Elasticsearch Curator is a good fit.
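That daily pattern corresponds to the elasticsearch output's index option. If you ever need a different naming scheme, a hedged sketch (the non-default prefix is chosen purely for illustration) would be:

output {
  elasticsearch {
    host => localhost
    # override the default "logstash-%{+YYYY.MM.dd}" index name
    index => "weblogs-%{+YYYY.MM.dd}"
  }
}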

Next

Next we begin to learn more about advanced configuration items. In the following chapters, we focus on some of the core features of Logstash and how to interact with the Logstash engine.

The life cycle of an event

Inputs, filters, codecs, and outputs make up the core configuration of Logstash. Logstash builds an event-processing pipeline that extracts data from your logs and stores it in Elasticsearch, laying the foundation for efficient queries. To give you a quick feel for the many options Logstash offers, let's discuss the most common configurations. For more information, please refer to the Logstash event pipeline documentation.

Inputs

Inputs are how log data gets into Logstash. Common inputs include:

file: reads from a file on the filesystem, much like the UNIX command "tail -0a"
syslog: listens on the well-known port 514 and parses log data according to the RFC 3164 standard
redis: reads data from a Redis server, supporting both channel (publish/subscribe) and list modes. Redis is often used as the "broker" in a centralized Logstash setup, queueing events for the Logstash instances that consume them.
lumberjack: receives data using the lumberjack protocol, now implemented by logstash-forwarder

Filters

Filters are the intermediate processing components in the Logstash pipeline. They are often combined with conditionals to act on events that match particular rules. Common filters include:

grok: parses irregular text and turns it into a structured format. Grok is currently the best way in Logstash to transform unstructured data into structured, queryable data, and it ships with more than 120 patterns, so one of them will likely meet your needs.
mutate: lets you change an event in flight; you can rename, remove, move, or modify fields while the event is being processed.
drop: drops an event entirely so it is not processed further, for example debug events.
clone: makes a copy of an event, optionally adding or removing fields.
geoip: adds geographical information (useful for graphical display in Kibana).

Outputs

Outputs are the final stage of the Logstash pipeline. An event may pass through multiple outputs, but once all outputs have run, the event has completed its life cycle. Commonly used outputs include:

elasticsearch: if you plan to store your data efficiently and query it easily and simply, Elasticsearch is the way to go. Yes, there is a bit of advertising here.
file: writes event data to a file on disk.
graphite: sends event data to Graphite, a popular open source tool for storing and graphing metrics. http://graphite.wikidot.com/
statsd: statsd is a service that listens for statistics, such as counters and timers, sent over UDP and forwards aggregates to one or more back-end services. If you already use statsd, this output will be useful to you.

Codecs

Codecs are stream filters that operate on the data itself and can be configured as part of an input or output. They make it easy to handle data that arrives already serialized in a particular format. Popular codecs include json, msgpack, and plain (text).

json: encodes/decodes data in JSON format
multiline: merges data from multiple lines into a single event, for example a Java exception together with its stack trace

For complete configuration information, refer to the "plugin configuration" section of the Logstash documentation.
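As a concrete illustration of a codec, here is a hedged sketch of the multiline codec on a file input, gluing a Java stack trace back into a single event (the path and pattern are examples only; adjust them to your logs):

input {
  file {
    path => "/var/log/myapp/app.log"
    codec => multiline {
      # any line starting with whitespace belongs to the previous event
      pattern => "^\s"
      what => "previous"
    }
  }
}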

More fun with Logstash: using configuration files

Specifying the configuration on the command line with the -e flag is very common, but it becomes unwieldy once you need to configure more settings. In that case we create a configuration file and tell Logstash to use it. Let's create a file named "logstash-simple.conf" in the same directory as Logstash, with the following contents:

input { stdin { } }
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

Next, execute the command:

bin/logstash -f logstash-simple.conf

Logstash now runs with the configuration from the file we just created, which is much more convenient. Note that this time we used the -f flag to read the configuration from a file, whereas before we used -e to read it from the command line. That was a very simple example, of course, so let's keep going and write something more complex.
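As configuration files grow, it helps to validate them before starting Logstash for real. Logstash 1.x offers a --configtest flag for this; if your build supports it, a hedged usage sketch is:

bin/logstash -f logstash-simple.conf --configtest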

Filters

Filters are an in-line processing mechanism that massage data into the shape you need. Let's look at an example that uses the grok filter. Create a configuration file called logstash-filter.conf with the following contents:

input { stdin { } }
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}

Then run Logstash with this configuration:

bin/logstash -f logstash-filter.conf

Now paste the following line into your terminal (Logstash will pick it up from standard input):

127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"

You will see feedback similar to the following:

{"message" = "127.0.0.1--[11/dec/2013:00:01:45-0800] \" get/xampp/status.php HTTP /1.1\ "3891 \" Http://cadenza/xampp/navi.php\ "\" mozilla/5.0 (Macintosh; Intel Mac OS X 10.9;           rv:25.0) gecko/20100101 firefox/25.0\ "", "@timestamp" = "2013-12-11t08:01:45.000z", "@version" and "1", "Host" = "Cadenza", "clientip" = "127.0.0.1", "Ident" and "-", "auth" and "-" , "timestamp" = "11/dec/2013:00:01:45-0800", "verb" and "GET", "request" and "="/xampp/status . php "," httpversion "=" 1.1 "," Response "and" $ "," bytes "and" 3891 "," referrer "and" = " \ "Http://cadenza/xampp/navi.php\" "," agent "=" \ "mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) gecko/20100101 firefox/25.0\ ""} 

As you can see, Logstash (using the grok filter) was able to split one line of log data (in Apache's "combined log" format) into separate fields. This pays off later when parsing and querying our own log data; for example, filtering on HTTP status codes or client IP addresses becomes trivial. Very few common log formats are missing from Logstash's bundled grok patterns, so if you are trying to parse a well-known format, chances are someone has already done the work for you. For details, see the list of Logstash grok patterns.
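If your format is not covered by an existing pattern, you can combine the bundled building blocks yourself. A hedged sketch (the field names are invented for illustration) that pulls a client IP, HTTP verb, and request path out of a line would be:

filter {
  grok {
    match => { "message" => "%{IP:clientip} %{WORD:verb} %{URIPATHPARAM:request}" }
  }
}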
The other filter in this example is the date filter. It parses the timestamp found in the log line and uses it as the event's @timestamp field (regardless of when the data actually reached Logstash). You may notice that in this example @timestamp is set to December 11, 2013, even though Logstash ingested the line some time after it was generated. This is handy when importing historical logs: the event carries the time recorded in the log rather than the time Logstash processed it.

A useful example: Apache logs (from a file)

Now let's configure something actually useful: the Apache access log! We will read the log from a file on the local machine and use conditionals to process only the events that meet our needs. First, create a configuration file named logstash-apache.conf with the following contents (adjust the file names and paths to your situation):

input {
  file {
    path => "/tmp/access_log"
    start_position => beginning
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch {
    host => localhost
  }
  stdout { codec => rubydebug }
}

Next, create the input file referenced in the configuration above (in this example "/tmp/access_log"). You can use the following lines as its content, or use logs generated by your own web server:

71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"

Now run the example above with the -f flag:

bin/logstash -f logstash-apache.conf

You should now see your Apache log data in Elasticsearch. Logstash reads and processes the file specified in the configuration, and any lines appended to the file later are also picked up, processed, and saved to Elasticsearch. As a bonus, each event's type field is replaced with "apache_access" (as specified in the configuration).
This configuration only has Logstash watching the Apache access_log, but in practice that is rarely enough: you usually want the error_log as well. That only takes one small change in the configuration above:

input {
  file {
    path => "/tmp/*_log"
...


Now Logstash processes both the error log and the access log. However, if you inspect your data (with elasticsearch-kopf, for example), you will see that the access_log events are split into separate fields while the error_log events are not. That is because our grok filter is only configured to match the COMBINEDAPACHELOG format, so only lines that fit that format are automatically broken into fields. Wouldn't it be nice if we could control how each log is parsed based on its own format? Right.
You may also notice that Logstash does not reprocess events it has already seen in the file. Logstash records its position in each file and only processes newly added lines. Handy!
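That position bookkeeping lives in a "sincedb" file maintained by the file input. If you want to control where it is kept, or read a not-yet-seen file from the start while testing, a hedged sketch (paths chosen for illustration only) is:

input {
  file {
    path => "/tmp/*_log"
    start_position => beginning                         # only applies to files with no sincedb entry yet
    sincedb_path => "/var/lib/logstash/sincedb_apache"  # where the read positions are stored
  }
}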

Conditionals

Let's build on the previous example to introduce conditionals, a concept that should already be familiar to most Logstash users. You can use if, else if, and else statements just as in an ordinary programming language. Here we label each event according to the file it came from (access_log, error_log, or any other file ending in "log"):

input {
  file {
    path => "/tmp/*_log"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { type => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "apache_error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}


As you have probably noticed, we still use the "type" field to label every event, but we do not actually parse the "error" or "random" logs... In reality there are many kinds of error logs; how to parse them is left as an exercise for the reader, and you can work from the logs you already have.
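To get you started on that exercise, here is a hedged sketch of what the else if branch could grow into, using only bundled grok building blocks (the field names are invented for illustration, and the pattern assumes the classic Apache error_log layout):

  } else if [path] =~ "error" {
    mutate { replace => { type => "apache_error" } }
    grok {
      # e.g. [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] File does not exist: ...
      match => { "message" => "\[%{DATA:error_timestamp}\] \[%{LOGLEVEL:loglevel}\] %{GREEDYDATA:error_message}" }
    }
  }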

Syslog

OK, now let's move on to a very practical example: syslog. Syslog is one of the most common use cases for Logstash, and one it handles very well (as long as the messages roughly conform to RFC 3164). Syslog is the de facto UNIX logging standard: clients send log data to a local file or to a log server. For this example you don't need a working syslog server at all; we will fake one from the command line so you can get a feel for what happens.
First, let's create a simple configuration file named logstash-syslog.conf that implements Logstash + syslog:

input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}


Run Logstash:

bin/logstash -f logstash-syslog.conf


Normally, a client would connect to port 5000 on the Logstash server and send its log data there. In this simple demonstration we just use telnet to connect to the Logstash server and type log lines by hand (much as we entered log lines on standard input in the earlier examples). Open a new shell window and run:

telnet localhost 5000

Copy and paste the following sample lines (you can try other text, but the grok filter may then fail to parse it correctly):

Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 23 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.


Back in the window where Logstash is running, you will see the lines being processed and parsed!

{"message" = "Dec 14:30:01 Louis cron[619]: (www-data) CMD (Php/usr/shar e/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log) "," @timestamp "=" 2013-12-23t 22:30:01.000z "," @version "=" 1 "," type "=" syslog "," host "=          > "0:0:0:0:0:0:0:1:52617", "syslog_timestamp" = "Dec 14:30:01", "Syslog_hostname" and "Louis",  "Syslog_program" = "CRON", "Syslog_pid" and "619", "syslog_message" and "=" (Www-data)  CMD (php/usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log) "," Received_at "  = = "2013-12-23 22:49:22 UTC", "Received_from" and "0:0:0:0:0:0:0:1:52617", "syslog_severity_code" + = 5, "syslog_facility_code" = 1, "syslog_facility" = "User-level", "syslog_severity" and "Noti" Ce "} 


Congratulations, if you have made it this far you are a competent Logstash user. You can comfortably configure and run Logstash and send events to it, and there is plenty more to dig into as you keep using it.
