Yesterday afternoon (August 6, 2013), the two cloud servers hosting the main site www.cnblogs.com each restarted automatically once. Because both servers sit behind load balancing (SLB), the restarts did not affect normal access to the site.
The Windows Event log associated with this reboot is as follows:
Cloud server 1 (8-core CPU, 8 GB memory):
14:36:22 -- Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: w3wp.exe (1968) consumed 3200438272 bytes, w3wp.exe (6272) consumed 3027517440 bytes, and w3wp.exe (584) consumed 643993600 bytes.
(15:09:04 -- reboot ...)
15:09:56 -- Crash dump initialization failed! (This error occurs because virtual memory is disabled, so the dump cannot be created.)
15:09:58 -- The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
15:10:43 -- The previous system shutdown at 15:09:04 on 2013/8/6 was unexpected.
Cloud server 2 (8-core CPU, 8 GB memory):
17:33:45 -- Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: w3wp.exe (5720) consumed 3194974208 bytes, w3wp.exe (2020) consumed 3034882048 bytes, and w3wp.exe (1832) consumed 517074944 bytes.
(17:40:44 -- reboot ...)
17:42:11 -- Crash dump initialization failed!
17:42:12 -- The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
17:42:40 -- The previous system shutdown at 17:40:44 on 2013/8/6 was unexpected.
As the event log shows, for some time before each reboot the w3wp processes were consuming a large amount of virtual memory (a single w3wp process consumed nearly 3 GB). We had disabled Windows virtual memory, and under those conditions some unknown factor triggered a system crash, which caused Windows to restart automatically.
(You might ask: Windows enables virtual memory by default, so why bother disabling it? Because we had found that enabling virtual memory caused high, fluctuating CPU usage on the cloud servers; see the blog post "Road to Cloud Computing - Alibaba Cloud: Two Important Breakthroughs".)
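The "nearly 3 GB per w3wp process" figure can be checked directly from the byte counts quoted in the two low-virtual-memory events. The following is only a minimal sketch: the names `EVENTS`, `parse_usage`, and `summarize` are hypothetical helpers introduced for illustration, the event text is copied from the log entries above, and 1 GiB is taken as 1024^3 bytes.

```python
import re

GIB = 1024 ** 3  # assumption: "G" in the post is read as GiB (1024^3 bytes)

# Process list from each "low virtual memory" event, copied from the log above.
EVENTS = {
    "Cloud server 1": (
        "w3wp.exe (1968) consumed 3200438272 bytes, "
        "w3wp.exe (6272) consumed 3027517440 bytes, "
        "and w3wp.exe (584) consumed 643993600 bytes."
    ),
    "Cloud server 2": (
        "w3wp.exe (5720) consumed 3194974208 bytes, "
        "w3wp.exe (2020) consumed 3034882048 bytes, "
        "and w3wp.exe (1832) consumed 517074944 bytes."
    ),
}

# Matches "<process> (<pid>) consumed <n> bytes".
PATTERN = re.compile(r"(\S+) \((\d+)\) consumed (\d+) bytes")


def parse_usage(message):
    """Return {pid: bytes} for every process named in the event message."""
    return {int(pid): int(size) for _name, pid, size in PATTERN.findall(message)}


def summarize(message):
    """Convert per-process byte counts to GiB and total them."""
    usage = parse_usage(message)
    total = sum(usage.values())
    return {
        "per_process_gib": {pid: size / GIB for pid, size in usage.items()},
        "total_bytes": total,
        "total_gib": total / GIB,
    }


if __name__ == "__main__":
    for server, message in EVENTS.items():
        summary = summarize(message)
        for pid, gib in summary["per_process_gib"].items():
            print(f"{server}: w3wp.exe ({pid}) {gib:.2f} GiB")
        print(f"{server}: total {summary['total_gib']:.2f} GiB")
```

On both servers the three listed w3wp processes together account for roughly 6.4 GiB and 6.3 GiB respectively, a large share of the 8 GB of memory on each machine. How much headroom that actually left depends on what else was running, which the event log does not show.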
Last night we re-enabled virtual memory on both cloud servers as a temporary fix for the reboot problem.
But today, while we were writing this post, at around 11:35~11:40, both cloud servers unexpectedly hit 100% CPU, and the site could not be accessed normally.
The cause of today's failure still needs further analysis. If it turns out to be caused by enabling Windows virtual memory, then we are in a dilemma: virtual memory, disable it or not?