Fault detection
1. First, use top to see which of CPU, RAM, and swap is under the most pressure.
The analysis shows 602 processes in total, of which 601 are sleeping. Something seems wrong: there are about 80 kernel processes, and adding memcached, Nginx, and mysqld still gives no more than 90. Apart from these, only the php-cgi processes managed by php-fpm remain. Could they be ...?
The CPU figures show little load; the CPU is essentially idle. Looking at memory, very little of the 4 GB remains (free + buffers): more than 95% has been allocated. We can ignore swap usage for now. top also lists the most resource-intensive processes; mysqld, which has the longest accumulated CPU time (TIME+, about 2 hours), consumes few resources. A single php-cgi process does not occupy much memory either. So a bold guess is that server memory is tight, not because one process occupies a large amount of it, but because many sleeping processes hold memory that is not released. We can check this idea by examining memory usage further with free.
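The sleeping/running split that top reports can also be tallied from ps. The following is a minimal sketch, assuming a procps-style ps whose stat column begins with the state letter (R running, S sleeping, D uninterruptible sleep); the function name count_states is made up for illustration.

```shell
#!/bin/sh
# count_states: read one process state string per line (for example the
# output of `ps -eo stat=`) and print "STATE COUNT" for each state letter.
# Only the first character of each state string is used, so "Ss" and "S<"
# are both counted as S.
count_states() {
    awk 'NF { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage on a live system (assumes procps ps):
#   ps -eo stat= | count_states
```

A result dominated by S with only a handful of R entries matches the picture top gives here: many idle processes, very little actual CPU work.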
2. Use free to check RAM usage. You can also view the file /proc/meminfo.
Start with the Mem line. total is the amount of physical memory, about 4 GB. used is memory that has been allocated; it includes buffers and cached, so it does not mean memory that is actually in use. free is unallocated memory, while buffers and cached are memory that has been allocated for caching but can still be reclaimed. On the second line (-/+ buffers/cache), used is the memory actually in use, computed from the first line as used - buffers - cached, and free is the memory not actually in use, computed from the first line as free + buffers + cached. The Swap line shows swap usage. A small, infrequent amount of swapping does not affect server performance, because the system occasionally swaps some memory pages out or adjusts the size of buffers and cached. Frequent swapping, however, may mean the server is short of physical memory: free memory has dropped below the threshold at which pages have to be swapped out.
When reading free output, we mainly look at the second line. At a glance: of 4 GB of memory, 3898 MB is in use and only 49 MB is not, so memory is nearly exhausted. This confirms the guess from the first step. We further suspect that when memory is this short, processes block and the system keeps swapping unused data out (so) and reading data back in when it is needed (si). We can verify this conjecture with vmstat.
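The second-line figures described above can be recomputed directly from /proc/meminfo. This is a sketch only: the function name mem_summary is hypothetical, and it assumes the MemTotal, MemFree, Buffers, and Cached fields are present and expressed in kB.

```shell
#!/bin/sh
# mem_summary: read /proc/meminfo-style text on stdin and print, in MB,
#   free_mb = MemFree + Buffers + Cached     (memory not actually in use)
#   used_mb = MemTotal - free_mb             (memory actually in use)
# This mirrors the "-/+ buffers/cache" arithmetic of older free versions.
mem_summary() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END {
            avail = free + buffers + cached
            printf "used_mb=%d free_mb=%d\n", (total - avail) / 1024, avail / 1024
        }'
}

# Usage on a live system:
#   mem_summary < /proc/meminfo
```

If free_mb stays in the tens of megabytes on a 4 GB machine, as in the case above, the buffers and page cache have already been squeezed and the next step is to check for swapping.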
3. Use vmstat to monitor memory usage
For memory monitoring, the columns we care about most are swpd, free, si, and so. On a system that is not busy, swpd and so do not stay high and are often 0. Here swpd is 1.5 GB and free is small, which again indicates that physical memory is insufficient. si reports the amount of memory moved from swap to physical memory per second, and so reports the amount moved from physical memory to swap per second. si is sometimes large, and that need not cause alarm: a program that needs a lot of memory to read and write media files, for example, will raise si. so, on the other hand, is usually a signal of memory shortage: if it stays large for a long time, memory is probably insufficient, while small fluctuations can be ignored. Next, we use ps to find the culprit consuming the memory.
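To tell sustained swap-out from small fluctuations, it helps to count how many vmstat samples show so above zero. A rough sketch, assuming the usual procps vmstat layout in which a header row starting with r names the columns; swap_out_samples is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# swap_out_samples: read vmstat output on stdin and print "BUSY/TOTAL",
# where TOTAL is the number of sample rows and BUSY is the number of rows
# whose "so" (swap-out) column is greater than zero.
# The column position is looked up in the header row instead of hardcoded.
swap_out_samples() {
    awk '
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "so") col = i; next }
        col && $1 ~ /^[0-9]+$/ { total++; if ($col > 0) busy++ }
        END { printf "%d/%d\n", busy, total }'
}

# Usage on a live system (one sample per second, 60 samples):
#   vmstat 1 60 | swap_out_samples
```

Keep in mind that the first row vmstat prints is an average since boot, so a result such as 1/60 is not a concern, while something close to 60/60 points to a real memory shortage.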
4. Use ps to identify the culprit that consumes memory
# ps -A --sort -rss -o comm,pmem,pcpu | uniq -c | head -15
      1 COMMAND         %MEM %CPU
      1 mysqld           0.6  0.0
    503 php-cgi          0.3  0.0
      5 php-cgi          0.2  0.0
      1 php-cgi          0.1  0.0
      1 php-cgi          0.0  0.0
      1 memcached        0.0  0.0
      1 sshd             0.0  0.0
      1 nginx            0.0  0.0
      1 sshd             0.0  0.0
      1 nginx            0.0  0.0
      2 bash             0.0  0.0
      3 nginx            0.0  0.0
      1 sshd             0.0  0.0
      1 nginx            0.0  0.0
ps is commonly used and relatively simple. From the output above, php-cgi stands out at a glance. A single php-cgi process does not use much memory, but 503 of them is alarming: together they take up almost all of the memory (503 × 0.3%). We can guess that php-cgi is managed by php-fpm and that one of the php-fpm parameters is misconfigured, so that far too many php-cgi processes are started.
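The uniq -c listing groups only adjacent identical rows, so the same command can appear several times. Summing resident memory per command name gives a single line for each program. This is a sketch under the assumption that input lines have the form "RSS_KB COMMAND"; rss_by_command is a hypothetical helper name.

```shell
#!/bin/sh
# rss_by_command: read "RSS_KB COMMAND" lines on stdin and print
# "COMMAND COUNT TOTAL_RSS_KB" per command, largest total first.
# Note: RSS counts shared pages once per process, so totals overstate
# the real footprint of programs that share memory between workers.
rss_by_command() {
    awk 'NF >= 2 { n[$2]++; kb[$2] += $1 }
         END { for (c in n) printf "%s %d %d\n", c, n[c], kb[c] }' |
        sort -k3,3nr
}

# Usage on a live system (assumes procps ps):
#   ps -A -o rss= -o comm= | rss_by_command | head
```

With 503 php-cgi workers the first output line makes the culprit obvious even when each individual process looks small.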
5. Limit the number of processes php-fpm manages
After resetting max_children in php-fpm.conf to 150, system memory usage returned to normal. free, si, so, and b all indicate that memory resources are normal and under no pressure.
The memory freed by a php-cgi process is not immediately reclaimed by the system, and one php-cgi process consumes roughly 20 MB of memory (depending on the PHP extensions you load). It is therefore necessary to limit the number of php-cgi processes you start. How many is appropriate? You can count the php-cgi processes at peak load with top. You can also follow the php-fpm suggestion and collect data with netstat -np | grep 127.0.0.1:9000, then set max_children so that the number of waiting connections is as small as possible.
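A quick upper bound for max_children is the memory you are willing to give PHP divided by the memory of one worker. The sketch below only does that division; the function name and the 3000 MB budget in the usage line are illustrative assumptions, not figures from this server.

```shell
#!/bin/sh
# max_children_for BUDGET_MB PER_PROC_MB
# Print how many workers of PER_PROC_MB each fit into BUDGET_MB
# (integer division, rounded down). Returns 1 on a non-positive size.
max_children_for() {
    budget_mb=$1
    per_proc_mb=$2
    if [ "$per_proc_mb" -le 0 ]; then
        echo "per-process size must be positive" >&2
        return 1
    fi
    echo $(( budget_mb / per_proc_mb ))
}

# Example: a hypothetical 3000 MB budget with ~20 MB per php-cgi worker
#   max_children_for 3000 20
```

Treat the result as a ceiling. The peak process count seen in top and the number of waiting connections from netstat should decide the value actually configured.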
6. How much memory a php-cgi process consumes
How much memory does one php-cgi process take? Roughly 20 MB. The pmap command shows where the memory goes. Avoid loading unnecessary PHP extension modules to reduce wasted memory.
# pmap $(pgrep php-cgi | head -1)
6746:   /usr/local/php/bin/php-cgi --fpm --fpm-config /usr/local/php/etc/php-fpm.conf
0000000000400000   6680K r-x--  /usr/local/php/bin/php-cgi
0000000000c86000    268K rw---  /usr/local/php/bin/php-cgi
0000000000cc9000     56K rw---    [ anon ]
0000000005012000   2240K rw---    [ anon ]
0000003efd200000    112K r-x--  /lib64/ld-2.5.so
.....
00002ac28a7a5000   2048K -----  /usr/local/php/lib/php/extensions/no-debug-non-zts-20060613/xhprof.so
00002ac28a9a5000      4K rw---  /usr/local/php/lib/php/extensions/no-debug-non-zts-20060613/xhprof.so
00002ac28a9a6000     84K r-x--  /usr/local/php/lib/php/extensions/no-debug-non-zts-20060613/apc.so
00002ac28a9bb000   2048K -----  /usr/local/php/lib/php/extensions/no-debug-non-zts-20060613/apc.so
00002ac28abbb000      8K rw---  /usr/local/php/lib/php/extensions/no-debug-non-zts-20060613/apc.so
00002ac28abbd000     32K rw---    [ anon ]
00002ac28abd4000     40K r-x--  /lib64/libnss_files-2.5.so
00002ac28abde000   2044K -----  /lib64/libnss_files-2.5.so
00002ac28addd000      4K r----  /lib64/libnss_files-2.5.so
00002ac28adde000      4K rw---  /lib64/libnss_files-2.5.so
00007fffa717e000     84K rw---    [ stack ]
ffffffffff600000   8192K -----    [ anon ]
 total           154172K
Conventions: paths used below
- /usr/local/php/sbin/php-fpm
- /usr/local/php/etc/php-fpm.conf
- /usr/local/php/etc/php.ini
I. php-fpm startup parameters
# Test the php-fpm configuration
/usr/local/php/sbin/php-fpm -t
/usr/local/php/sbin/php-fpm -c /usr/local/php/etc/php.ini -y /usr/local/php/etc/php-fpm.conf -t
# Start php-fpm
/usr/local/php/sbin/php-fpm
/usr/local/php/sbin/php-fpm -c /usr/local/php/etc/php.ini -y /usr/local/php/etc/php-fpm.conf
# Stop php-fpm
kill -INT `cat /usr/local/php/var/run/php-fpm.pid`
# Restart php-fpm
kill -USR2 `cat /usr/local/php/var/run/php-fpm.pid`
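II. Important php-fpm.conf parameters in detail
pid = run/php-fpm.pid
# PID file. Default: var/run/php-fpm.pid under the installation directory. Enabling it is recommended.
error_log = log/php-fpm.log
# Error log. Default: var/log/php-fpm.log under the installation directory.
log_level = notice
# Log level. Available levels: alert (must be handled immediately), error (error conditions), warning (warning conditions), notice (normal but significant information), debug (debugging information). Default: notice.
emergency_restart_threshold = 60
emergency_restart_interval = 60s
# If more than emergency_restart_threshold php-cgi processes exit with SIGSEGV or SIGBUS within emergency_restart_interval, php-fpm restarts gracefully. These two options are usually left at their defaults.
process_control_timeout = 0
# Time limit for child processes to wait for a reaction to signals from the master process. Available units: s (seconds), m (minutes), h (hours), d (days). Default unit: s. Default: 0.
daemonize = yes
# Run fpm in the background. Default: yes. Set to no for debugging. FPM can run several process pools with different settings; the settings below can be set separately for each pool.
listen = 127.0.0.1:9000
# Address fpm listens on, that is, the address nginx passes PHP requests to. The default is usually fine. Accepted formats: 'ip:port', 'port', '/path/to/unix/socket'. Required for every pool.
listen.backlog = -1
# Backlog size. -1 means unlimited and is decided by the operating system; this line can simply be commented out. For the meaning of backlog see: http://www.3gyou.cc/?p=41
listen.allowed_clients = 127.0.0.1
# IP addresses allowed to connect to the FastCGI process. any means no IP restriction. If nginx on another host must reach this FPM process, listen has to be set to a local IP that host can reach. Default: any. Addresses are separated by commas. If unset or empty, connections from any server are accepted.
listen.owner = www
listen.group = www
listen.mode = 0666
# Unix socket options. Comment these out if you connect over TCP.
user = www
group = www
# Account and group the processes run as.
pm = dynamic
# On a dedicated server, pm can be set to static.
# How child processes are controlled: static or dynamic. With static, pm.max_children sets a fixed number of child processes. With dynamic, the following parameters decide:
pm.max_children
# Maximum number of child processes.
pm.start_servers
# Number of processes created at startup.
pm.min_spare_servers
# Minimum number of idle processes. If fewer are idle, new child processes are created.
pm.max_spare_servers
# Maximum number of idle processes. If more are idle, the excess are cleaned up.
pm.max_requests = 1000
# Number of requests each child process serves before it is respawned. Very useful for third-party modules that may leak memory. '0' means requests are accepted indefinitely. Equivalent to the PHP_FCGI_MAX_REQUESTS environment variable. Default: 0.
pm.status_path = /status
# URL of the FPM status page. If unset, the status page is not available. Default: none. Used by munin monitoring.
ping.path = /ping
# Ping URL of the FPM monitoring page. If unset, the ping page is not available. It lets external tools check that FPM is alive and able to respond. It must start with a slash (/).
ping.response = pong
# Response returned to a ping request, sent as text/plain with HTTP 200. Default: pong.
request_terminate_timeout = 0
# Timeout after which a single request is terminated. Useful for scripts that max_execution_time in php.ini fails to stop for some reason. '0' means 'Off'. Try changing this option when 502 errors occur frequently.
request_slowlog_timeout = 10s
# When a request runs longer than this timeout, the full PHP call stack is written to the slow log. '0' means 'Off'.
slowlog = log/$pool.log.slow
# Log file for slow requests; used together with request_slowlog_timeout.
rlimit_files = 1024
# rlimit for open file descriptors. Default: system defined value. The system default is 1024 open handles; check it with ulimit -n and change it with ulimit -n 2048.
rlimit_core = 0
# Maximum core size rlimit. Possible values: 'unlimited', 0, or a positive integer. Default: system defined value.
chroot =
# Directory to chroot to at startup. Must be an absolute path. If unset, chroot is not used.
chdir =
# Directory to chdir to at startup. Must be an absolute path. Default: the current directory, or / when chroot is used.
catch_workers_output = yes
# Redirect the workers' stdout and stderr into the main error log. If unset, stdout and stderr are redirected to /dev/null according to the FastCGI rules. Default: empty.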
III. Common errors and solutions
1. Resource problems caused by the request_terminate_timeout setting
If request_terminate_timeout is set to 0 or to too long a value, file_get_contents can cause resource problems.
If the remote resource requested by file_get_contents responds too slowly, file_get_contents stays stuck and does not time out. max_execution_time in php.ini can set the maximum execution time of PHP scripts, but under php-cgi (php-fpm) this parameter has no effect. What really controls the maximum execution time of a PHP script is the request_terminate_timeout parameter in php-fpm.conf.
The default value of request_terminate_timeout is 0 seconds, which means a PHP script is allowed to run indefinitely. When all php-cgi processes are stuck in file_get_contents(), the Nginx + PHP web server can no longer handle new PHP requests and Nginx returns "502 Bad Gateway" to the user. Setting this parameter to cap script execution time is necessary, but it treats the symptom rather than the cause. With a value of 30s, for example, if file_get_contents() is fetching a slow page, 150 php-cgi processes can handle only 5 requests per second, and the web server can still hardly avoid "502 Bad Gateway". The workaround is to set request_terminate_timeout to 10s or another reasonable value, or to pass a timeout to file_get_contents through a stream context:
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 10 // set a timeout, in seconds
    )
));
// $str is the URL to fetch; the second argument (use_include_path) stays off
file_get_contents($str, false, $ctx);
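The "5 requests per second" figure for a 30s limit is simple arithmetic: if every worker is held for the full timeout by a slow remote page, the pool completes at most workers / timeout requests per second. A small sketch of that calculation, with a made-up function name:

```shell
#!/bin/sh
# worst_case_rps WORKERS TIMEOUT_S
# Print the worst-case number of requests per second a pool can complete
# when every request blocks for the full timeout (integer division).
worst_case_rps() {
    workers=$1
    timeout_s=$2
    echo $(( workers / timeout_s ))
}

# 150 workers with a 30s limit, then with a 10s limit:
#   worst_case_rps 150 30
#   worst_case_rps 150 10
```

Lowering the limit from 30s to 10s triples the worst-case throughput, which is why a short timeout helps but still does not remove the underlying slowness.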
2. Improper configuration of the max_requests parameter may cause intermittent 502 errors:
This parameter sets the number of requests each child process serves before it is respawned. It is useful for third-party modules that may leak memory. If set to '0', requests are accepted indefinitely. It is equivalent to the PHP_FCGI_MAX_REQUESTS environment variable. Default value: 0.
With a value of 500, for example, a php-cgi process is restarted automatically once it has handled 500 requests in total.
But why restart the process?
Most projects use some third-party PHP libraries, and these libraries often leak memory. If the php-cgi processes are not restarted periodically, memory usage is bound to keep growing. So php-fpm, as the manager of php-cgi, provides this monitoring function: it restarts any php-cgi process that has served the specified number of requests, so that memory usage does not keep increasing.
Precisely because of this mechanism, high-concurrency sites often see 502 errors. My guess is that php-fpm does not handle the queue of requests coming from Nginx well. However, I am still using PHP 5.3.2 and do not know whether the problem still exists in PHP 5.3.3.
Our current solution is to set this value as large as is practical, so that php-cgi respawns as rarely as possible and overall performance improves. In our production environment the memory leak turned out not to be significant, so we set the value very high (204800). Set this value according to your own situation; do not increase it blindly.
In other words, the only purpose of this mechanism is to keep php-cgi from taking up too much memory, so why not handle it by checking memory directly? I agree with Gao Chunhui that restarting a php-cgi process when it reaches a configured peak memory consumption would be a better solution.
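php-fpm has no such memory-based setting in the configuration described here, but the idea can be approximated from outside by an operator script. The sketch below only selects the candidates: it assumes input lines of the form "PID RSS_KB", and both the function name over_limit and the 50 MB threshold in the usage line are illustrative.

```shell
#!/bin/sh
# over_limit LIMIT_KB
# Read "PID RSS_KB" lines on stdin and print the PIDs whose resident
# memory exceeds LIMIT_KB. What to do with those PIDs (log them, signal
# them) is deliberately left to the caller.
over_limit() {
    awk -v limit="$1" 'NF >= 2 && $2 + 0 > limit + 0 { print $1 }'
}

# Usage on a live system (assumes procps ps and workers named php-cgi):
#   ps -C php-cgi -o pid= -o rss= | over_limit 51200
```

Run from cron, a check like this targets only the workers that have actually grown, instead of recycling every worker after a fixed number of requests.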
3. The php-fpm slow log, a powerful tool for debugging and troubleshooting:
request_slowlog_timeout sets the timeout, and slowlog sets where the slow log is stored.
tail -f /var/log/www.slow.log
The command above shows the PHP requests that run too slowly.
The log reveals recurring problems such as network reads that take too long or MySQL queries that are too slow, and the information it records gives troubleshooting a clear direction.
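Beyond watching the log live, counting entries per script shows which pages are slow most often. This sketch assumes each slow log entry contains a line of the form "script_filename = /path/to/script.php"; check that against your own log before relying on it. The name slow_scripts is hypothetical.

```shell
#!/bin/sh
# slow_scripts: read a php-fpm slow log on stdin and print
# "COUNT SCRIPT" for every script that appears, most frequent first.
# Only lines shaped like "script_filename = <path>" are counted; stack
# frames and timestamps are ignored.
slow_scripts() {
    awk '$1 == "script_filename" && $2 == "=" { n[$3]++ }
         END { for (s in n) print n[s], s }' |
        sort -k1,1nr -k2
}

# Usage with the log path from the text:
#   slow_scripts < /var/log/www.slow.log | head
```

The scripts at the top of the list are the first places to look for slow network reads or MySQL queries.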
Analysis of the system resources occupied by PHP-FPM