Nagios practices: Nagios skills you should know

Source: Internet
Author: User

Nagios is generally used to monitor intranet machines. In practice, if network conditions are good, monitoring machines over the public Internet is also feasible: we once monitored 28 SQL Server 2008 databases over the public network (China Telecom to China Telecom), and the results were acceptable.

Nagios can monitor not only the real-time status of Linux/Unix servers but also the performance of Windows servers. Once you are familiar with the configuration, setting up NSClient on Windows is straightforward, but remember to open port 12489 in the Windows Firewall and verify it. Not sure whether it is open? From another server, telnet to the Windows server's IP address on port 12489 to confirm that the port is reachable.
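The telnet check described above can also be scripted. The following is a minimal sketch in Python, not part of the original setup: the function name `port_is_open` and the 3-second timeout are assumptions chosen for illustration, and only the port number 12489 comes from the text.

```python
import socket


def port_is_open(host, port=12489, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This is the same test that `telnet <ip> 12489` performs by hand.
    The function name and default timeout are illustrative assumptions.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

For example, `port_is_open("192.0.2.10")` (a placeholder address) returns False if the firewall still blocks the port or the service is not listening.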

It is best to host your business websites in your own machine room, because Nagios is most effective for intranet monitoring. Nagios uses ping to detect whether a server is alive. If Nagios cannot reach a monitored server because of poor network conditions or some other cause, the result is an absurd false alarm: it raises a critical alert saying the server is down, when in fact the server is healthy and merely unreachable from the Nagios machine. We hope you can recognize this situation when it happens.

How can we tell whether our website is actually down? Nagios can only check your servers themselves in real time. What if your firewall or data center fails? We recommend purchasing an external real-time scanning service (such as Alertbot), which scans your website and emails you if a problem occurs. If alerts from both Alertbot and Nagios arrive in your mailbox at the same time, take them seriously.

Our systems team sometimes needs logs left behind for analysis, especially when a system is busy: is the system under attack, did a developer configure something improperly, or did operations staff modify the system configuration? With only a few machines this may not matter much, but the company's CDN cluster has more than one hundred servers and is still growing. So we designed a nagios + vmstat shell script for Nagios: when the system is busy, logs are saved separately so that colleagues on the systems team can analyze them and find the root cause. For details, refer to other articles on 51cto.com; I will not repeat them here.
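The original shell script is not reproduced in this article. As a rough illustration of the idea only, here is a sketch in Python (not the author's shell script) of a check that reports load in the usual Nagios plugin style and saves vmstat output when the host is busy. The thresholds (4.0 and 8.0), the log directory `/var/log/nagios-vmstat`, and all function names are hypothetical; the exit codes 0, 1, and 2 are the standard Nagios plugin states.

```python
import os
import subprocess
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_load(load1, warn, crit):
    """Map a 1-minute load average to a Nagios state."""
    if load1 >= crit:
        return CRITICAL
    if load1 >= warn:
        return WARNING
    return OK


def capture_vmstat(log_dir, samples=5):
    """Save the output of `vmstat 1 <samples>` to a timestamped file."""
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, time.strftime("vmstat-%Y%m%d-%H%M%S.log"))
    try:
        result = subprocess.run(
            ["vmstat", "1", str(samples)],
            capture_output=True, text=True, check=False,
        )
        output = result.stdout
    except FileNotFoundError:
        output = "vmstat not found on this host\n"
    with open(path, "w") as fh:
        fh.write(output)
    return path


def main(warn=4.0, crit=8.0, log_dir="/var/log/nagios-vmstat"):
    """Hypothetical defaults; tune thresholds and path for your hosts."""
    load1 = os.getloadavg()[0]
    state = classify_load(load1, warn, crit)
    if state != OK:
        capture_vmstat(log_dir)  # keep evidence only when the host is busy
    print(f"LOAD {LABELS[state]} - load1={load1:.2f}")
    return state
```

To run this as a plugin, end the file with `raise SystemExit(main())` so that Nagios receives the state as the process exit code.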

I do not use an SMS alert modem (the "text message cat"), but I recommend trying it together with Apsara stack. For now I use neither, because there is a better and easier way. This is not to say they are bad; I just find them complicated.

The system administrators on our side have been using a China Mobile 139 mailbox to receive alarm emails from Nagios, and it has always worked well. However, it seems to work best on the GoTone ("global communication") plan; on Shenzhouxing or M-Zone ("dynamic zone") the results are poor, and alarm emails often do not arrive. Having seen how well it worked for my colleagues on GoTone, I bought a Shenzhouxing card myself. It was a disaster: I could not receive alert text messages. I had been using a China Unicom Ruyitong phone, which received text messages only at the beginning, and for the rest of the time I basically ignored the existence of Nagios. Later I turned to my BlackBerry business phone. After enabling mobile mail, I bound my 163 mailbox directly to my Unicom mobile number, which completely solved the problem of alarm text messages on the phone. All of these methods work; try them if you are interested.

Nagios can be used together with traffic-monitoring software such as Cacti or MRTG to find system faults, or with AWStats to analyze Apache or Postfix logs. However, I find AWStats configuration too cumbersome, so I analyze the logs directly with shell scripts.
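The author's shell scripts are not shown. As an illustration of the kind of lightweight analysis meant here, the following Python sketch (a stand-in, not the author's method) counts hits per client address and per HTTP status code. It assumes the Apache common or combined log format; the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Start of an Apache common/combined log line:
#   client ident user [time] "request" status size
# Lines whose request contains an escaped quote will not match and are skipped.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) (\S+)')


def summarize(lines):
    """Count hits per client address and per HTTP status code."""
    clients, statuses = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        clients[match.group(1)] += 1
        statuses[match.group(2)] += 1
    return clients, statuses
```

Passing an open access-log file to `summarize` and then calling `clients.most_common(10)` lists the busiest client addresses, which is often enough to spot an abusive source without setting up AWStats.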

Suppose one server, for example a jail host, runs together with eight sub-VMs in the production environment. Its load is naturally high, but Nagios keeps raising alarms, which is not normal and is very annoying. In this case, click the server's load item and select "Disable notifications for this service", and the world goes quiet. We use FreeBSD jails directly in our production environment because they are convenient and efficient to configure. Jails do have a disadvantage: because all sub-jails share the host's CPU, memory, and disk, if the load on any sub-jail becomes too high or its disk fills up, the Nagios alarm for the host is triggered. At present there is no good solution; we can only optimize as much as possible.

At work I need to keep an eye on the Nagios system that monitors server hosts and services in real time. Keeping a web page open that refreshes automatically feels troublesome and wasteful. A colleague shared a small Nagios helper program that minimizes to the taskbar and shows a floating prompt window when something goes wrong, which is quite convenient. But as someone who likes to embed as much as possible in the browser, I preferred to find a Firefox plug-in with similar functionality: the Nagios Check plugin. (An aside: I don't know why I depend more and more on the browser these days. I would like to do everything related to my work and life with just one browser open, avoid opening too many applications, and add as much functionality as possible to Firefox through plug-ins.) Installation is very simple. However, I have never found an equivalent plug-in for IE or Chrome, which is something of a pity. Everyone has their own way of working, so I will not go into more detail here. It looks as shown below (pay attention to the bottom right corner):
