Author: Tian Yi Source: BKJIA
I left the office at noon that day to attend a Microsoft new-product launch event. While I was in a taxi on the way to the venue, a colleague called to say that the primary database was reporting write-protection errors. This was serious: every application depends on that database. My one thought was that nothing must go wrong now, or I would miss the event! So I told my colleague over the phone to restart the MySQL database and try again. Fortunately, the restart cleared the problem.
As soon as the event was over, I hurried to find the cause of the failure. First, a description of the platform, to make the logical relationships clear. The application consists of a web front-end server, a Tomcat application server, and a MySQL server, all running Linux. A user's request goes to the front-end Apache server. If the requested page is a .jsp, Apache forwards the request to the Tomcat server, which then reads data from the database or inserts a record into it. This is a typical three-tier application.
I logged on to the MySQL server, connected with the mysql client, and ran mysql> show processlist. Nothing unusual showed up, and the load was low. There seemed to be no clue here. Next I checked the MySQL error log and found the following errors:
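The routing rule described above can be sketched in a few lines of Python. This is only an illustration of the decision the front end makes, not the actual Apache configuration; the function name choose_backend and the sample paths are hypothetical.

```python
def choose_backend(path: str) -> str:
    """Return which tier handles a request path in this illustrative setup."""
    # Ignore any query string when looking at the file extension.
    resource = path.split("?", 1)[0]
    if resource.lower().endswith(".jsp"):
        # Dynamic page: the front end hands the request to the application server,
        # which in turn talks to the database.
        return "tomcat"
    # Everything else is answered by the front-end web server itself.
    return "apache"


if __name__ == "__main__":
    for sample in ("/index.html", "/order/list.jsp?page=2"):
        print(sample, "->", choose_backend(sample))
```

The point of the sketch is that only .jsp requests ever reach Tomcat, so only those requests can put load on the database behind it.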
080313 11:25:35 InnoDB: Error: cannot allocate 1064960 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 1233305429 bytes. Operating system errno: 12
InnoDB: Check if you should increase the swap file or
InnoDB: ulimits of your operating system.
InnoDB: On FreeBSD check you have compiled the OS with
InnoDB: a big enough maximum process size.
InnoDB: Note that in most 32-bit computers the process
InnoDB: memory space is limited to 2 GB or 4 GB.
InnoDB: We keep retrying the allocation for 60 seconds...
080313 11:26:08 [ERROR] /usr/local/mysql/bin/mysqld: Sort aborted
080313 11:26:19 InnoDB: Error: cannot allocate 1064960 bytes of
InnoDB: memory with malloc! Total allocated memory
The error message says that memory was essentially exhausted and no more could be allocated. From this I concluded that something had generated a huge load and squeezed the system out of memory. However, the database server was now stable, and nothing on it pointed to a cause.
Having gathered this basic information, I paused for a moment. Then I checked my mailbox and found an alert email. Its content was as follows:
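To put the numbers in the log into perspective, here is a minimal Python sketch that pulls the sizes out of the first error entry and converts them to more readable units. The function name summarize_alloc_failure and the LOG_EXCERPT constant are hypothetical helpers written for this illustration; the excerpt is copied from the log above. On Linux, errno 12 is ENOMEM ("Cannot allocate memory").

```python
import re

# First entry of the error log shown above.
LOG_EXCERPT = """\
080313 11:25:35 InnoDB: Error: cannot allocate 1064960 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 1233305429 bytes. Operating system errno: 12
"""


def summarize_alloc_failure(text: str) -> dict:
    """Extract the requested size, total allocated size and errno from the entry."""
    # Join the wrapped lines and drop the "InnoDB:" prefixes so the
    # message reads as a single sentence.
    flat = re.sub(r"InnoDB:\s*", "", " ".join(text.split()))
    requested = re.search(r"cannot allocate (\d+) bytes", flat)
    total = re.search(r"Total allocated memory by InnoDB (\d+) bytes", flat)
    err = re.search(r"errno: (\d+)", flat)
    if not (requested and total and err):
        raise ValueError("log entry does not look like an InnoDB malloc failure")
    return {
        "requested_bytes": int(requested.group(1)),
        "total_bytes": int(total.group(1)),
        "errno": int(err.group(1)),
    }


if __name__ == "__main__":
    info = summarize_alloc_failure(LOG_EXCERPT)
    print(f"requested: {info['requested_bytes'] / 1024:.0f} KiB")
    print(f"already allocated by InnoDB: {info['total_bytes'] / 2**30:.2f} GiB")
    print(f"errno: {info['errno']}")
```

In other words, InnoDB had already allocated well over 1 GiB when a request for about 1 MiB more failed, which fits the log's own note about the limited process address space on 32-bit machines.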
***** Nagios 2.9 *****
Notification Type: PROBLEM
Service: check_load
Host: tomcat nch100
Address: 61.154.105.100
State: WARNING
Date/Time: Thu Mar 13 10:59:53 CST 2008
Additional Info:
WARNING - load average: 3.94, 8.56, 9.17
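For readers who want to pick such an alert apart programmatically, the following Python sketch splits the notification into fields and extracts the three load averages, which check_load reports for the last 1, 5, and 15 minutes. The function name parse_alert and the ALERT constant are hypothetical and exist only for this illustration; the text is the notification quoted above.

```python
import re

# The notification quoted above, one field per line.
ALERT = """\
***** Nagios 2.9 *****
Notification Type: PROBLEM
Service: check_load
Host: tomcat nch100
Address: 61.154.105.100
State: WARNING
Date/Time: Thu Mar 13 10:59:53 CST 2008
Additional Info:
WARNING - load average: 3.94, 8.56, 9.17
"""


def parse_alert(text: str):
    """Return (fields, loads): the 'Key: value' pairs and the load averages."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so the time of day stays intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text)
    loads = tuple(float(x) for x in match.groups()) if match else None
    return fields, loads


if __name__ == "__main__":
    fields, loads = parse_alert(ALERT)
    print(fields["Address"], fields["State"], fields["Date/Time"])
    if loads:
        one, five, fifteen = loads
        print(f"1 min: {one}, 5 min: {five}, 15 min: {fifteen}")
```

One detail worth noticing in the parsed values: the 1-minute figure (3.94) is lower than the 5-minute and 15-minute figures, which suggests the load had been higher earlier and was already easing when this notification was sent.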
The alert shows that the load on host 61.154.105.100 was too high. This host is the Tomcat server, so it looked as though the problem lay there. To confirm this suspicion, let's take a look at the network traffic:
[Figure: network traffic graph]
The traffic graph shows that the abnormal traffic appeared at exactly the time the alert was generated. I called my colleague again and asked: "What did you all do on the machine 61.154.105.100?" The answer: "We ran an incorrect SQL statement." With that, the cause was found!