Why PHP cannot recover after being overloaded

Source: Internet
Author: User
An analysis of why PHP could not recover after being overloaded. Recently our PHP servers were frequently overloaded and then stopped serving requests altogether: any request sent to an affected machine caused the php process handling it to use 100% of the CPU. Our load balancing policy is that once a PHP request to a machine times out, that machine's weight is reduced, so it becomes less likely to receive further requests. The effect lags somewhat, but it should eventually relieve the pressure and let the service recover. Recently this policy suddenly stopped working. Once a machine was in this state, every request sent to php-fpm pinned the CPU at 100%, even a request for an empty PHP file. The suspicion was that eAccelerator was the cause.

Our php-fpm request_terminate_timeout is set to 5s, so any request that runs for more than 5 seconds is killed by php-fpm, and a large number of these 5s timeouts appeared around the time the problem occurred. My first guess was that eAccelerator's shared memory was involved: a child process was killed partway through a write to shared memory, leaving it corrupted and breaking every later request. But that does not explain why a brand-new file would also hang. So I went through the eAccelerator code and found the following:

[Cpp]

#define spinlock_try_lock(rw) asm volatile ("lock ; decl %0" : "=m" ((rw)->lock) : : "memory")

#define _spinlock_unlock(rw) asm volatile ("lock ; incl %0" : "=m" ((rw)->lock) : : "memory")

static int mm_do_lock(mm_mutex *lock, int kind)
{
    while (1) {
        spinlock_try_lock(lock);
        if (lock->lock == 0) {
            lock->pid = getpid();
            lock->locked = 1;
            return 1;
        }
        _spinlock_unlock(lock);
        sched_yield();
    }
    return 1;
}

static int mm_do_unlock(mm_mutex *lock) {
    if (lock->locked && (lock->pid == getpid())) {
        lock->pid = 0;
        lock->locked = 0;
        _spinlock_unlock(lock);
    }
    return 1;
}

[Cpp]

Here mm_mutex points into shared memory. In other words, eAccelerator uses shared memory as an inter-process lock, implemented as a spinlock. That explains everything. Suppose php-fpm kills a process after it has acquired the lock and before it unlocks. From then on no php-fpm child process can acquire the lock, so all of them spin in the while (1) loop. That was my guess, but how could I confirm it? My first idea was to read the shared memory directly, but the segment turned out to be created with IPC_PRIVATE, so there was no way to read it from outside. The only option was to wait for the problem to recur in production and attach gdb to inspect the memory. That gave us the final evidence.

[Html]

(gdb) p *mm->lock

$8 = {lock = 4294966693, pid = 21775, locked = 1}

This shows that the lock is held by the process with pid 21775, yet that process was killed long ago.
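Two details of that gdb output can be checked mechanically. The sketch below is illustrative C and not part of eAccelerator; the helper names lock_word_as_signed and pid_exists are made up for this example. The first reinterprets the unsigned value gdb printed as a signed 32-bit counter, and the second uses kill() with signal 0, the POSIX way to ask whether a pid exists, to test whether the recorded holder is still running.

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: view the unsigned lock word as the signed
 * counter that the decl/incl instructions operate on. */
static int32_t lock_word_as_signed(uint32_t raw)
{
    int64_t v = (int64_t)raw;
    if (v > INT32_MAX) {
        v -= (int64_t)1 << 32;   /* undo the two's-complement wraparound */
    }
    return (int32_t)v;
}

/* Hypothetical helper: 1 if some process has this pid, 0 if none does,
 * -1 on any other error. Signal 0 performs the checks without sending. */
static int pid_exists(pid_t pid)
{
    if (kill(pid, 0) == 0) {
        return 1;
    }
    if (errno == EPERM) {
        return 1;                /* exists, but we may not signal it */
    }
    if (errno == ESRCH) {
        return 0;                /* no such process */
    }
    return -1;
}
```

With the value from the gdb session, lock_word_as_signed(4294966693u) gives -603: the counter is negative, not the 0 that mm_do_lock tests for. pid_exists(21775) would return 0 on a machine where that process is gone. Because pids are recycled, a return of 1 only proves that some process has that pid now, not that it is the original lock holder.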

With the problem confirmed, let's look back at the conditions that trigger it.

1. A request runs for too long and is killed by php-fpm.

2. At the moment it is killed, PHP is in the middle of a require and eAccelerator holds the lock.
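The two conditions can be reproduced in a small single-process model. This is a hedged sketch and not eAccelerator's code: demo_mutex and the demo_* functions are hypothetical names, C11 atomics stand in for the inline assembly, and the counter is assumed to start at 1 when the lock is free (which is what the == 0 test after a decrement implies).

```c
#include <stdatomic.h>

/* Hypothetical stand-in for mm_mutex; only the counter matters here. */
typedef struct {
    atomic_int lock;   /* assumed: 1 = free, the winner's decrement makes it 0 */
} demo_mutex;

/* One pass of the loop body in mm_do_lock:
 * decrement, test for 0, undo the decrement on failure. */
static int demo_try_lock(demo_mutex *m)
{
    if (atomic_fetch_sub(&m->lock, 1) - 1 == 0) {
        return 1;                      /* acquired */
    }
    atomic_fetch_add(&m->lock, 1);     /* same effect as _spinlock_unlock */
    return 0;                          /* real code would sched_yield() and retry */
}

/* The step a killed worker never reaches. */
static void demo_unlock(demo_mutex *m)
{
    atomic_fetch_add(&m->lock, 1);
}

/* How many of `attempts` tries succeed with the lock in its current state. */
static int demo_count_successes(demo_mutex *m, int attempts)
{
    int ok = 0;
    for (int i = 0; i < attempts; i++) {
        ok += demo_try_lock(m);
    }
    return ok;
}
```

To play the scenario through, initialise lock to 1 with atomic_init and call demo_try_lock once for the worker that takes the lock and is then killed. After that, demo_count_successes(&m, 1000) returns 0, because every later attempt sees a counter below zero and backs off. Only after demo_unlock, the call the killed worker never made, does demo_try_lock return 1 again.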

Given these conditions, certain situations increase the probability of hitting the problem.

1. request_terminate_timeout is set to a short value.

2. Files are loaded through autoload, or by require statements inside the execution logic. If all files were loaded before the request logic starts, the process would not be killed during a require unless the requires alone exceeded the timeout. With autoload there is an ugly workaround: in the autoload function, check how long the request has been running and, if it has run too long, exit directly instead of calling require.

In my opinion, the best solution is to set request_terminate_timeout long enough, for example 30s or even 300s, and to do all timeout handling in the application layer rather than relying on php-fpm. The php-fpm timeout should only be the last line of defense, used when nothing else has worked. PHP also has its own timeout, max_execution_time, but in CGI mode it measures CPU time, so it is of little help here.
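The difference between a CPU-time limit and a wall-clock deadline can be seen in a short C sketch. It assumes a POSIX system; wall_seconds, cpu_seconds, measure_blocking_wait and deadline_exceeded are names invented for this illustration, and nanosleep stands in for a request that is blocked waiting on something slow. A CPU-time limit barely notices such a wait, while a wall-clock deadline checked by the application does.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/* Wall-clock seconds from a monotonic clock (unaffected by clock changes). */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Processor time consumed by this process so far, in seconds. */
static double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}

typedef struct {
    double wall;   /* elapsed wall-clock seconds */
    double cpu;    /* elapsed CPU seconds */
} elapsed;

/* Block for `ms` milliseconds and report both kinds of elapsed time. */
static elapsed measure_blocking_wait(long ms)
{
    double w0 = wall_seconds();
    double c0 = cpu_seconds();
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&req, NULL);             /* stand-in for a slow I/O wait */
    elapsed e = { wall_seconds() - w0, cpu_seconds() - c0 };
    return e;
}

/* Application-level check: has the request used up its wall-clock budget? */
static int deadline_exceeded(double started_at, double budget_seconds)
{
    return wall_seconds() - started_at > budget_seconds;
}
```

Calling measure_blocking_wait(200) reports roughly 0.2 seconds of wall time and close to zero CPU time. An application-layer timeout in the spirit of the advice above would record wall_seconds() when the request starts and call deadline_exceeded at safe points, so the request can stop itself cleanly instead of being killed partway through an operation.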
