Talk about monitoring

Source: Internet
Author: User

Contents
    • 1. Background
    • 2. Overview
    • 3. How to monitor
      • 3.1. Satellite monitoring
      • 3.2. Stepwise diagnosis
      • 3.3. Simulating manual operation
      • 3.4. Data analysis
      • 3.5. Monitoring and development
    • 4. Summary
1. Background

Every enterprise knows that monitoring matters, but at 80% of enterprises monitoring is still at the primary stage.

What is the primary stage?
    1. Passive monitoring: when a fault occurs, the operations staff are never the first to discover it.
    2. Monitoring only IP addresses and TCP ports: often port 80 still accepts HTTP requests while the Web server is not working properly.
    3. Human monitoring by sheer headcount: desks full of monitors, even projectors, with staff watching one dashboard after another, plus assorted workflows and KPI assessments for the monitoring personnel.
    4. Manual testing: monitoring staff must operate the system by hand every few minutes to confirm that it works (for example, every 15 minutes log in once, place an order, make a payment, and so on).
    5. The cure-all: restart, up to and including restarting every server.
What is the intermediate stage?
    1. Alarms: SMS to a mobile phone is the more reliable channel, because the phone is always carried (email does not count: it arrives slowly and is unstable for various reasons).
    2. Service monitoring: probe service availability rather than just ports. Note that I mean monitoring of private protocols (HTTP, SMTP, FTP, MySQL and the like are not included).
    3. Fault analysis: use logs and debugging tools to analyze software bugs and guide developers in improving software quality, so that the same failure does not happen again and problems are solved without resorting to a restart.
    4. Semi-automated testing.
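
The difference between checking a port and probing a service can be sketched as follows. This is only an illustration under stated assumptions: the request/reply pair (PING answered by PONG) is a made-up private protocol, and the function names are hypothetical, not taken from any monitoring product.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Port-level check: can a TCP connection be established at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exchange(sock, request, expected_prefix):
    """Send one request on a connected socket and check how the reply starts.

    A single recv() is enough for the short replies assumed here.
    """
    sock.sendall(request)
    reply = sock.recv(1024)
    return reply.startswith(expected_prefix)


def service_is_healthy(host, port, request=b"PING\n",
                       expected_prefix=b"PONG", timeout=3.0):
    """Service-level check: the port is open AND the service answers correctly.

    The PING/PONG exchange is a hypothetical protocol used for illustration.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            return exchange(sock, request, expected_prefix)
    except OSError:
        # Refused connections and timeouts both count as unhealthy.
        return False
```

A host can pass port_is_open() and still fail service_is_healthy(), which is exactly the case a port-only monitor misses.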

What is the advanced stage? I think the advanced stage is the integration of monitoring with recovery systems. Monitoring is also closely tied to development: data collection for monitoring has to be prepared during development, and for every new feature you should ask whether it will need monitoring in the future and how it would be monitored. Collecting data in advance and mining it are very important; monitoring can do more than software and hardware performance analysis, it can also provide decision support.

Closely related to monitoring is automatic failover; if you are interested, see my other articles at http://netkiller.github.io/journal/

2. Overview

If you search for monitoring on Baidu, most of the results are installation and configuration guides for open source or commercial software. These articles show you how to monitor CPU, memory, disk space, and network IP addresses and port numbers.

The open source options are the usual ones: Nagios, Cacti, MRTG, Zabbix and so on. My e-book, Netkiller Monitoring Codex, describes in detail how to install and configure them.

There is also plenty of commercial software, such as SolarWinds, WhatsUp, PRTG ...

Suppose you already monitor all of your servers and network equipment. Measured against the stages above, which stage is your monitoring at?

3. How to monitor

What are the ways and means of monitoring?

3.1. Satellite monitoring

The usual approach is to reach the remote host through its IP address. Common methods are SNMP, SSH, and various agents: the monitor sends a request, receives the result, and judges the host's state from it.

              Monitor Server
                    |
      -------------------------------
      |             |               |
    [Web]        [Mail]        [Database]

With the monitoring server at the center, connections radiate out to the monitored nodes in a star. The layout has no particular advantage, and its disadvantage is that communication between nodes, such as between the Web and Mail nodes, is not monitored.
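
A minimal sketch of this star-shaped polling is shown below. It assumes the actual probe (an SNMP query, an SSH command, an agent call) is supplied as a function; poll_nodes and its parameters are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def poll_nodes(nodes, check, max_workers=8):
    """Run check(address) for every node and return {name: "UP" or "DOWN"}.

    nodes maps a node name to its address. check is any callable that
    returns a truthy value when the node answers correctly.
    """
    def safe_check(address):
        try:
            return "UP" if check(address) else "DOWN"
        except Exception:
            # A probe that raises is treated the same as a failed probe.
            return "DOWN"

    names = list(nodes)
    # Probe the nodes in parallel so one slow host does not delay the rest.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        states = list(pool.map(safe_check, (nodes[name] for name in names)))
    return dict(zip(names, states))
```

Every probe here starts at the monitor, so a broken link between two monitored nodes never shows up in the result.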

3.2. Stepwise diagnosis

I coined this term myself and am not sure it is accurate. The idea is to probe downward level by level to find the point of failure.

              Monitor Server
                    |
      -------------------------------
      |             |               |
      V             V               V
    [WEB] ---> [Cache] ---> [Database]
      \                         ^
       '------------------------|

First the monitoring server monitors every node in the star topology. Then it has the Web node access the Cache node and report the result back to the monitor, and so on: the Cache node accesses the Database node, and the Web node accesses the Database node.

Simulate every business path in turn, and issue an alert immediately when one of them fails.
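
The hop-by-hop probing can be sketched like this. The hop list mirrors the arrows in the diagram; probe is an assumed callable that asks the source node to reach the target node and report back, and both it and find_fault_points are hypothetical names.

```python
# Hops in probing order: first the star from the monitor, then node to node.
HOPS = [
    ("monitor", "web"),
    ("monitor", "cache"),
    ("monitor", "database"),
    ("web", "cache"),
    ("cache", "database"),
    ("web", "database"),
]


def find_fault_points(hops, probe):
    """Return every (source, target) hop whose probe fails, in probing order.

    probe(source, target) should return True when source can use target.
    A probe that raises is counted as a failed hop.
    """
    faults = []
    for source, target in hops:
        try:
            ok = probe(source, target)
        except Exception:
            ok = False
        if not ok:
            faults.append((source, target))
    return faults
```

If every hop from the monitor succeeds but ("web", "cache") fails, the fault lies between those two nodes, which the star check alone cannot see.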

3.3. Simulating manual operation

The main goal here is to monitor whether services are available by checking that the software actually works, which draws on the testing process.

Automated testing tools can assist monitoring: by simulating mouse clicks and keyboard input, you can monitor both graphical desktop programs and Web applications.

Windows monitoring can be implemented through the Windows Automation API: a program drives the software as a person would, then matches the returned results to achieve automated monitoring.

There are many options for Web page monitoring. The classic ones are the tools derived from WebDriver, of which Selenium (Web Browser Automation) is the most famous. I use it to simulate user actions such as registration, login, posting, and placing an order, then match the returned results for automated monitoring and alarms.
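
The "simulate an action, then match the result" loop can be sketched without tying it to a particular tool. In the sketch below each action is a plain callable that returns the text of the resulting page; in practice it might drive a browser through WebDriver, but that part is left out, and run_scenario is a hypothetical name.

```python
import time


def run_scenario(steps):
    """Run (name, action, expected) steps in order and stop at the first mismatch.

    action() performs one simulated user operation and returns the resulting
    text; the step passes when expected appears in that text.
    Returns (ok, report), where report holds (name, passed, seconds) for
    every step that was run.
    """
    report = []
    for name, action, expected in steps:
        started = time.monotonic()
        try:
            passed = expected in action()
        except Exception:
            # An action that crashes counts as a failed step.
            passed = False
        report.append((name, passed, time.monotonic() - started))
        if not passed:
            # Later steps depend on earlier ones, so stop and raise the alarm.
            return False, report
    return True, report
```

The timing recorded per step is a by-product that can also be fed into the data analysis described in the next section.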

3.4. Data analysis

Through data analysis, a fault can be eliminated before it causes a failure. For example, suppose developers forget to set an expiration time on Redis keys: the program keeps working fine, but Redis memory keeps growing, and sooner or later it will fail.

We discovered this problem by collecting Redis status information and analyzing how the data changed over time.
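
One simple way to analyze such a series is a straight-line fit. The sketch below assumes you already have (timestamp, used memory) samples, for example the used_memory value from the Redis INFO output collected at intervals; the collection itself is omitted, the function names are hypothetical, and a linear trend is only one possible model.

```python
def linear_trend(samples):
    """Least-squares line through (timestamp, value) samples.

    Returns (slope, intercept), with slope in value units per second.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must differ")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def seconds_until_limit(samples, limit):
    """Estimate seconds after the last sample until the value reaches limit.

    Returns None when the series is flat or shrinking.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    last_t = samples[-1][0]
    t_hit = (limit - intercept) / slope
    return max(0.0, t_hit - last_t)
```

An alert can then fire when the estimate drops below some lead time, long before memory actually runs out.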

3.5. Monitoring and development

When monitoring comes up, many people assume it is a matter for operations. In fact, as the joke goes, a tester who does not understand operations is not a good developer.

Development needs to take monitoring into account. The Nginx status module, MySQL's SHOW STATUS command, and Redis's INFO command are all provided for monitoring. So when you develop a program, do you consider how it will be monitored?

A program can provide its operational status to the monitoring collector through logs, a pipe, or a socket.
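
As a rough sketch of building this into a program, the code below keeps a few counters and writes them as one JSON line to a connected socket. The class, the counter names, and the JSON line format are all assumptions made for the example; the same snapshot could just as well be written to a log or a pipe.

```python
import json
import socket
import threading
import time


class StatusRegistry:
    """Thread-safe counters that a program updates as it works."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}
        self._started = time.time()

    def incr(self, name, amount=1):
        """Add amount to the named counter, creating it if needed."""
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + amount

    def snapshot(self):
        """Return a copy of all counters plus the process uptime."""
        with self._lock:
            data = dict(self._counters)
        data["uptime_seconds"] = int(time.time() - self._started)
        return data


def send_status(conn, registry):
    """Write one JSON status line to a connected socket for the collector."""
    line = json.dumps(registry.snapshot(), sort_keys=True) + "\n"
    conn.sendall(line.encode("utf-8"))
```

The business code only calls incr() at the points worth watching, for example registry.incr("orders_created"), and the collector decides what to do with the numbers.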

4. Summary

Good monitoring lets you know your system well, so that you always know where things stand. Let the data speak.

http://netkiller-github-com.iteye.com/blog/2190593
