Using Frigga on CentOS for web-based service monitoring

Source: Internet
Author: User

I. Introduction

The company's important internal services are already monitored with Zabbix, as described in an earlier article, but several recent incidents prompted us to improve our monitoring approach.

During the National Day holiday (October 1), more than 300 alert messages arrived in a single night, all reporting that hosts had dropped off the network and then reconnected. Based on the alert content, I first suspected an unstable network, but the person on duty (from the business department) reported that the service applications ran normally that night, which ruled out a network fault. I then suspected that the monitoring system itself was underperforming because of its MySQL database; had I been able to check the database service right away, I could have confirmed whether performance was the cause. However, the maintenance staff could not perform remote maintenance at the time (they had only their mobile phones during the holiday), and the problem did not justify interrupting the holiday. It struck me how convenient it would be to refresh a web page and see the status of specific services.

A few days ago, I received a report that the business system could not be accessed, yet the maintenance staff's mobile phones had received no alarm. Troubleshooting showed that the monitoring service had been stopped manually. It is therefore essential to monitor the monitoring service itself.

1.1 Introduction to Frigga

Frigga is Xiaomi's open-source process monitoring tool (https://github.com/xiaomi-sa/frigga). It is a simple and highly scalable process monitoring framework. It is based on the open-source god, modified and extended with a web interface and an RPC interface to meet the service management needs of large clusters.

In Norse mythology, Frigga is the wife of Odin; she presides over marriage and family and weaves the clouds.

1.2 Frigga Functions

Integrates god, which acts as the supervisor for the monitored programs

Client/server architecture with support for multiple authentication methods, suited to operations and maintenance of large clusters

Provides API interfaces for the basic functions to make extension easier

Supports a standalone web interface for god, for easy viewing and management

Supports log viewing

Supports adding custom xmlrpc interfaces for secondary development

1.3 Frigga environment dependencies

Ruby 1.9.3 and bundle
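
One quick way to confirm that the interpreter meets this requirement is to compare versions from within Ruby. This is only a minimal sketch: the helper name ruby_at_least? is made up for illustration, and treating 1.9.3 as a minimum rather than an exact pin is an assumption (whether Frigga runs on newer Ruby releases is not stated here).

```ruby
require 'rubygems' # provides Gem::Version; loaded by default on Ruby 1.9+

# Hypothetical helper: true when `current` is at least `required`.
def ruby_at_least?(required, current = RUBY_VERSION)
  Gem::Version.new(current) >= Gem::Version.new(required)
end

puts "Ruby #{RUBY_VERSION} meets the 1.9.3 requirement: #{ruby_at_least?('1.9.3')}"
```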

II. Install and configure Frigga

2.1 Configure the yum repositories

Configure the yum repositories for the system as follows:

    wget http://mirrors.163.com/.help/CentOS6-Base-163.repo
    mv CentOS6-Base-163.repo /etc/yum.repos.d/
    wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
    wget http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
    rpm -ivh epel-release-6-8.noarch.rpm
    rpm -ivh rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm
    rpm -ivh http://rpms.famillecollet.com/enterprise/remi-release-6.rpm
    rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-remi

2.2 Install Frigga

Install Frigga and the packages it depends on as follows:

    yum install git                               # Install git
    yum -y install ruby gems                      # Install the system Ruby
    curl -L https://get.rvm.io | bash -s stable   # Install RVM, used to upgrade Ruby
    rvm install ruby-1.9.3                        # You may need to log in again before this step
    gem install bundle                            # Install bundle
    cd /opt/                                      # Install Frigga
    git clone https://github.com/xiaomi-sa/frigga.git
    cd /opt/frigga
    ./script/run.rb start                         # Start Frigga
    god status                                    # View the startup status

2.3 Configure Frigga

By default, Frigga serves its web interface on port 9001 once it has started. The user name and password used for authentication are stored in /opt/frigga/conf/frigga.yml, as shown below:

    cat /opt/frigga/conf/frigga.yml
    ---
    port: 9001
    http_auth: ["admin", "password"]
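
To check these settings from a script, the file can be parsed with Ruby's standard YAML library. The sketch below works on an inline copy of the content shown above so that it runs anywhere; read_frigga_settings is a hypothetical helper, not part of Frigga, and it assumes http_auth is always a two-element list.

```ruby
require 'yaml'

# Hypothetical helper: parse frigga.yml text and return port, user, password.
def read_frigga_settings(yaml_text)
  config = YAML.load(yaml_text)
  user, password = config['http_auth'] # assumed to be [user, password]
  { :port => config['port'], :user => user, :password => password }
end

# Inline copy of the sample configuration; on a real host you would pass
# File.read('/opt/frigga/conf/frigga.yml') instead.
sample_config = <<-EOS
---
port: 9001
http_auth: ["admin", "password"]
EOS

settings = read_frigga_settings(sample_config)
puts "Web interface port: #{settings[:port]}, user: #{settings[:user]}"
```

Changing the default password in this file before exposing port 9001 is advisable.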

2.4 Configure SSH monitoring

Configuration files, which end with the .god suffix, are stored in the gods folder. The following example monitors the sshd service on CentOS 6.4. If the sshd process is started or restarted five times within five minutes, the watch is switched to unmonitored. After 10 minutes, monitoring is tried again. If the process is found to be flapping five times within two hours, god gives up on it completely.

    cd /opt/frigga/gods
    vim sshd.god

    God.watch do |w|
      w.name = 'sshd'
      w.start = "/etc/init.d/sshd start"
      w.stop = "/etc/init.d/sshd stop"
      w.restart = "/etc/init.d/sshd restart"
      w.interval = 30.seconds
      w.start_grace = 10.seconds
      w.restart_grace = 10.seconds
      w.pid_file = '/var/run/sshd.pid'
      w.behavior(:clean_pid_file)
      # start sshd whenever the process is not running
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.interval = 5.seconds
          c.running = false
        end
      end
      # lifecycle
      w.lifecycle do |on|
        on.condition(:flapping) do |c|
          c.to_state = [:start, :restart]
          c.times = 5
          c.within = 5.minutes
          c.transition = :unmonitored
          c.retry_in = 10.minutes
          c.retry_times = 5
          c.retry_within = 2.hours
        end
      end
    end

Make sure the path to the sshd init script is written correctly. After completing the settings above, run the following command to load the sshd watch:

    god load sshd.god
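
The first part of the flapping rule in the sshd watch (five start or restart events within five minutes) can be illustrated with a small sliding-window counter. This is a sketch of the idea only and not god's actual implementation; the class name FlapDetector is invented for the example, and the retry settings (retry_in, retry_times, retry_within) are not modelled.

```ruby
# Sliding-window counter: reports flapping once `times` start/restart
# events have been recorded within `within` seconds.
class FlapDetector
  def initialize(times, within)
    @times = times
    @within = within
    @events = []
  end

  # Record an event at time `now` (in seconds); true means "flapping".
  def record(now)
    @events << now
    @events.reject! { |t| now - t > @within } # drop events outside the window
    @events.size >= @times
  end
end

detector = FlapDetector.new(5, 5 * 60) # mirrors c.times = 5 and c.within of five minutes
# Five restarts one minute apart: the fifth one trips the detector.
results = [0, 60, 120, 180, 240].map { |t| detector.record(t) }
puts results.inspect # prints [false, false, false, false, true]
```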


2.5 Configure Apache monitoring

    God.watch do |w|
      w.name = 'apache'
      w.start = "/etc/init.d/httpd start"
      w.stop = "/etc/init.d/httpd stop"
      w.restart = "/etc/init.d/httpd restart"
      w.interval = 30.seconds
      w.start_grace = 10.seconds
      w.restart_grace = 10.seconds
      w.pid_file = '/var/run/httpd/httpd.pid'
      w.behavior(:clean_pid_file)
      # start httpd whenever the process is not running
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.interval = 5.seconds
          c.running = false
        end
      end
      # lifecycle
      w.lifecycle do |on|
        on.condition(:flapping) do |c|
          c.to_state = [:start, :restart]
          c.times = 5
          c.within = 5.minutes
          c.transition = :unmonitored
          c.retry_in = 10.minutes
          c.retry_times = 5
          c.retry_within = 2.hours
        end
      end
    end


2.7 Configure MySQL monitoring

    God.watch do |w|
      w.name = 'mysql'
      w.start = "/etc/init.d/mysqld start"
      w.stop = "/etc/init.d/mysqld stop"
      w.restart = "/etc/init.d/mysqld restart"
      w.interval = 30.seconds
      w.start_grace = 10.seconds
      w.restart_grace = 10.seconds
      w.pid_file = '/var/run/mysqld/mysqld.pid'
      w.behavior(:clean_pid_file)
      # start mysqld whenever the process is not running
      w.start_if do |start|
        start.condition(:process_running) do |c|
          c.interval = 5.seconds
          c.running = false
        end
      end
      # lifecycle
      w.lifecycle do |on|
        on.condition(:flapping) do |c|
          c.to_state = [:start, :restart]
          c.times = 5
          c.within = 5.minutes
          c.transition = :unmonitored
          c.retry_in = 10.minutes
          c.retry_times = 5
          c.retry_within = 2.hours
        end
      end
    end



This article is from the "virtual reality" blog. Please retain this source attribution: http://waringid.blog.51cto.com/65148/1313557

