ELK + FileBeat log analysis system construction
The log analysis system has been rebuilt. The chosen technical solution is ELK, that is, ElasticSearch, LogStash, and Kibana, with FileBeat and Kafka added.
Over the past two days the log analysis system was rebuilt. No code was written; data collection relies entirely on mature, off-the-shelf components. How the collected data will be used later is still under consideration.
The figure below shows the overall solution:
The log collection tool is FileBeat, the transport layer is Kafka, data processing is handled by a LogStash cluster, and storage is ES. Of course, the system can be simplified, for example to FileBeat + ElasticSearch or FileBeat + LogStash + ElasticSearch; if data filtering and processing are not required, both of those setups work. However, the goal of this platform is to handle the logs of the entire website, so Kafka is added to make sure the pipeline can absorb the data volume.
FileBeat collects the data and transmits it to the Kafka message queue; LogStash then consumes the data from the queue, filters it, and finally writes it to ES. FileBeat emits data in JSON format, is lightweight, and consumes very few system resources. LogStash offers a variety of filter plug-ins for rough data processing. The advantages of Kafka and ES need no introduction. To ensure stability, every component here runs as a cluster: the Kafka cluster uses three virtual machines, the LogStash cluster two, and the ElasticSearch cluster two. The installation and configuration of each component are described below.
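To summarize the intended data flow (Kibana, if adopted, sits at the end for visualization):

FileBeat (collect, JSON) -> Kafka (buffer) -> LogStash (filter) -> ElasticSearch (store) -> Kibana (visualize)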
FileBeat
FileBeat is a lightweight log collector introduced to replace Logstash Forwarder. Compared with Logstash Forwarder, FileBeat consumes far fewer system resources. The server I use here runs Windows, but it can just as well be deployed on a Linux machine; the configuration is the same.
First download FileBeat: https://www.elastic.co/downloads/beats/filebeat?ga-release. The installation and usage steps are straightforward: download, modify the configuration file, start, and observe. See the figure below:
The version I used is 5.2.2. The full installation directory is shown in the figure below:
filebeat.yml is the configuration file, and filebeat.template.json defines the JSON format of the output data. The details of the configuration file are as follows:
filebeat.prospectors:
- input_type: log
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - D:\Log\Error\*
  # encoding and document type
  encoding: GB2312
  document_type: Error
  # multiline reading
  multiline.pattern: '^[0-2][0-9]:[0-5][0-9]:[0-5][0-9]'
  multiline.negate: true
  multiline.match: after
- input_type: log
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - D:\Log\Info*
  encoding: GB2312
  document_type: Info
  multiline.pattern: '^[0-2][0-9]:[0-5][0-9]:[0-5][0-9]'
  multiline.negate: true
  multiline.match: after

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect.
  #hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The host addresses of Logstash (these two are placeholders)
  hosts: ["106.205.10.138:5044", "106.205.10.139:5044"]
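For reference, the multiline settings assume that every log entry begins with an HH:MM:SS timestamp, which is what the pattern matches. A hypothetical Error entry (invented purely for illustration) would be grouped like this:

14:23:05 System.NullReferenceException: Object reference not set to an instance of an object.
   at OrderService.Process(Order order)
   at Web.Controllers.OrderController.Submit()

Only the first line matches the pattern; with negate: true and match: after, the stack-trace lines are appended to the preceding line, so the whole exception ships as a single event.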
The configuration above defines two log types, Error and Info. The output can go to ES, LogStash, or Kafka; however, since the Kafka cluster has not yet been set up by operations, I temporarily send the output to LogStash.
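For later, once the Kafka cluster is online, the LogStash output can be swapped for a Kafka output. A minimal sketch, assuming placeholder broker addresses and one topic per document type (none of these values come from the actual setup):

#------------------------------ Kafka output ----------------------------------
output.kafka:
  # placeholder broker addresses for the three-node Kafka cluster
  hosts: ["kafka1:9092", "kafka2:9092", "kafka3:9092"]
  # route each document_type (Error / Info) to its own topic
  topic: '%{[type]}'
  required_acks: 1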
Once the configuration file is written, FileBeat can be started. On Linux this is a simple command; on Windows the installation is a little more involved and requires running a PowerShell script. Open a CMD window and run the following:
PowerShell.exe -ExecutionPolicy UnRestricted -File .\install-service-filebeat.ps1
The execution process is shown in the figure below:
After installation, start the service from the Windows Services panel, or run net start filebeat in the CMD window.
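For completeness, the service can also be checked and stopped from the same CMD window with standard Windows service commands (nothing FileBeat-specific here):

net start filebeat
sc query filebeat
net stop filebeat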
LogStash
The main job of LogStash is to filter the raw data collected by FileBeat and then write the filtered data to ES. If no filtering is needed, data can be written to ES directly from FileBeat or Kafka. However, to get log time parsing, redundant field removal, and similar features, we still add LogStash. Its installation and configuration are as follows.
First, download the installation package. The version I used here is 2.4.0, running on a Linux host:
wget https://download.elastic.co/logstash/logstash/logstash-2.4.0.tar.gz
sudo tar -zxf logstash-2.4.0.tar.gz -C /usr/local/
cd /usr/local/logstash-2.4.0
sudo vim simple.conf
These commands download the package, unpack it, and create the configuration file. Once simple.conf is created, its contents are as follows:
input {
  beats {
    codec => plain { charset => "UTF-8" }
    port => "5044"
  }
}
filter {
  mutate {
    remove_field => "@version"
    remove_field => "offset"
    remove_field => "input_type"
    remove_field => "beat"
    remove_field => "tags"
  }
  ruby {
    code => "event.timestamp.time.localtime"
  }
}
output {
  elasticsearch {
    codec => plain { charset => "UTF-8" }
    hosts => ["106.205.10.138", "106.205.10.139"]
  }
}
The input block takes FileBeat (beats) data as the source, the filter block removes redundant fields, and the output block pushes the data to ES. code => "event.timestamp.time.localtime" converts the timestamp from UTC to local time; since this is GMT+8, eight hours are added.
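As an example of the log time parsing mentioned earlier, a grok plus date filter could pull out the HH:mm:ss prefix written by the application and use it as the event time. This is only a rough sketch, assuming the log lines really do start with that timestamp; the field name logtime is made up for illustration:

filter {
  grok {
    # extract the leading HH:mm:ss into a hypothetical "logtime" field
    match => { "message" => "^%{TIME:logtime}" }
  }
  date {
    # use it as the event timestamp, interpreted as China time
    match => ["logtime", "HH:mm:ss"]
    timezone => "Asia/Shanghai"
  }
}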
After the configuration is complete, start LogStash with the following command:
bin/logstash -f simple.conf
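Optionally, the configuration file can be syntax-checked before starting (a standard LogStash 2.x flag):

bin/logstash -f simple.conf --configtest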
That completes the LogStash installation and configuration.
ElasticSearch
Finally, the installation and configuration of ES, which was covered in detail in a previous article; for the installation and configuration process, see: http://blyang.cn/index.php/archives/26/
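Once the two-node ES cluster is up, a quick sanity check is the cluster health API (here using one of the placeholder addresses from the LogStash output above):

curl 'http://106.205.10.138:9200/_cluster/health?pretty'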
Conclusion
So, at this point, the entire installation and configuration process is complete. Here is a first look at the result:
This is the state after two days of collecting test data. Subsequent data analysis will build on this foundation. For visualization we may use Kibana or develop something of our own; I will write that up once it is decided.