Problem description: The boss told me a problem they encountered two days ago. php started with fastcgi and opened the slowlog. When fsockopen was called to read a connection, when the connection exceeds the slowlog setting time, the fpm process throws a warning to record some basic information about the full request. Detailed description
Problem description: The boss told me a problem they encountered two days ago. php started with fastcgi and opened the slow log. When fsockopen was called to read a connection, when the connection exceeds the slowlog setting time, the fpm process throws a warning to record some basic information about the full request. Detailed description
Problem description:
Two days ago, the boss told me a problem they encountered. php started with fastcgi and opened the slow log. When fsockopen was called to read a connection, when the connection exceeds the slowlog setting time, the fpm process throws a warning to record some basic information about the full request.
The detailed description is as follows:
For example, if the slowlog time is set to 5 s, and a nonexistent ip address and port are accessed through fsockopen, the connection timeout is set to 10 s. The Code is as follows:
We access a non-existent IP address and port and access it through a browser. In the error log, we will receive two warning logs, one is the slowlog thrown by fpm, and the other is PHP Warning :? Fsockopen (): unable to connect to 2.3.2.1: 9999 (Operation now in progress ).
Normally, Connection timed out should occur when an error occurs in the fsockopen Connection, but this is not the case. It seems that the fpm throws the script and ends after timeout. Why?
Later, the script was run on the command line, but there was no Operation now in progress, which is the error of fsockopen Connection timed out. This indicates that it was not a problem of fsockopen and occurred on fpm.
This issue tracking is divided into two parts. This article mainly introduces the running mechanism of fpm. For the answer, see
Fpm enables slowlog Fsockopen. Operation now in SS problem tracking 2
Problem tracking
Because it is an online problem, the online php version is 5.2.8, And I installed php 5.3 locally. After testing, php 5.3 has the same problem.
Operation now in progress exception, itself is no problem, it is a non-blocking connect return value, this value indicates that connect is in progress, and so on, non-blocking? The problem may occur here,
If non-blocking connect is used, the select statement should be called later to check the write event of the select statement. Once returned, available data is received.
To locate this problem, there seems to be no other way. I can only look at the source code of fpm, and I did find the root cause of the problem when I looked at the source code.
However, to solve this problem, we must first understand the fpm workflow.
FPM workflow Overview
Fpm is a multi-process program. When fpm is started, it uses a master to fork out max_children as the working process, and save the sub-process id and other related information to the children pointer pointed to by struct fpm_worker_pool_s.
Working processes are used to receive network requests. When a link is asked via nginx, nginx finds that we are accessing the PHP file, it will pass the absolute directory of the file to port 9000 of fpm through the network port specified by fastcgi_pass,
Fpm receives the request, searches for idle processes from children, and processes the request through the idle process.
Details
When looking at the fpm source code process, it is found that the main process is not listening to a network event, but a time event. After the current time exceeds the time set by the event, the fpm_got_signal function is triggered, this function communicates between the main process and sub-process through the sp [2] pipeline. It is mainly used to process signals such as SIGCHLD, SIGINT, SIGTERM, SIGQUIT, SIGUSR1, and SIGUSR2, different signals have different processing logic:
Code Segment 1:
Static void fpm_got_signal (struct fpm_event_s * ev, short which, void * arg)/* {*/{char c; int res, ret; // fd is sp [2] pipe int fd = ev-> fd ;........ switch (c) {case 'C':/* SIGCHLD */zlog (ZLOG_DEBUG, "received SIGCHLD"); // call waitpid and wait for the sub-process to generate SIGCHLD fpm_children_bury (); break; case 'I':/* SIGINT */zlog (ZLOG_DEBUG, "pinned ed SIGINT"); zlog (ZLOG_NOTICE, "Terminating... "); fpm_pctl (FPM_PCTL_STATE_TERMINATING, FPM_PCTL_ACTION_SET); break; case 'T':/* SIGTERM */zlog (ZLOG_DEBUG," received SIGTERM "); zlog (ZLOG_NOTICE, "Terminating... "); fpm_pctl (FPM_PCTL_STATE_TERMINATING, FPM_PCTL_ACTION_SET); break; case 'q':/* SIGQUIT */zlog (ZLOG_DEBUG," received SIGQUIT "); zlog (ZLOG_NOTICE, "Finishing... "); fpm_pctl (FPM_PCTL_STATE_FINISHING, FPM_PCTL_ACTION_SET); break ;........}
The following shows an extremely complex figure:
In order to clarify the details, I read most of the fpm source code and drew the figure above, which is very complicated,
It doesn't matter if you don't understand it. I will talk about it based on the source code.
The main function of Fpm is in row 1548 of sapi/fpm/fpm_main.c,
The above sections can be ignored and used to process fpm startup parameters and php environment initialization. For details, see row 1824,
Code Segment 2:
Int main (int argc, char ** argv) {...... // fpm initialization if (0> fpm_init (argc, argv, fpm_config? Fpm_config: CGIG (fpm_config), fpm_prefix, fpm_pid, test_conf, delimiter, force_daemon) {// error ----- if (writable [1]) {int writeval = 0; zlog (ZLOG_DEBUG, "Sending \" 0 \ "(error) to parent via fd = % d", fpm_globals.send_config_pipe [1]); write (fpm_globals.send_config_pipe [1], & writeval, sizeof (writeval); close (fpm_globals.send_config_pipe [1]);} return FPM_EXIT_CONFIG ;}.........}
Fpm_init is initialized, that is, the position of A in the figure. Its main work is marked by 3. below is the main code
Code Segment 3:
Int fpm_init (int argc, char ** argv, char * config, char * prefix, char * pid, int test_conf, int run_as_root, int force_daemon) /*{{{*/{.......... if (0> fpm_php_init_main () | 0> fpm_stdio_init_main () | // initialize io 0> fpm_conf_init_main (test_conf, force_daemon) | // load the fpm configuration file 0> fpm_unix_init_main () | 0> fpm_scoreboard_init_main () | 0> fpm_pctl_init_main () | 0> fpm_env_init_main () | 0> fpm_signals_init_main () | // registers the process signal callback, which is critical. 0> fpm_children_init_main () | 0> fpm_sockets_init_main () | // socket initialization 0> fpm_worker_pool_init_main () | // workpool initialization 0> fpm_event_init_main () {// event IO initialization if (fpm_globals.test_successful) {exit (FPM_EXIT_ OK);} else {zlog (ZLOG_ERROR, "FPM initialization failed"); return-1 ;}} if (0> fpm_conf_write_pid ()) {// zlog (ZLOG_ERROR, "fpm initialization failed"); return-1 ;}..........}
The main work has been commented on to the source code,
Then there is a fpm_run function in the main function.
Code segment 4:
....... // Fcgi_fd is the socket handle fcgi_fd = fpm_run (& max_requests); parent = 0; // The sub-process continues to run, the parent process will be in fpm_run while/* onced forked tell zlog to also send messages through logging () */zlog_set_external_logger (sapi_cgi_log_fastcgi );........
The following operations are encapsulated in this function.
Code segment 5:
Int fpm_run (int * max_requests)/* {*/{struct fpm_worker_pool_s * wp;/* create initial children in all pools */for (wp = fpm_worker_all_pools; wp; wp = wp-> next) {int is_parent; is_parent = fpm_children_create_initial (wp); // sub-process if (! Is_parent) {goto run_child;}/* handle error */if (is_parent = 2) {// fpm_pctl (FPM_PCTL_STATE_TERMINATING, FPM_PCTL_ACTION_SET) failed to be created by the sub-process ); fpm_event_loop (1) ;}}/* run event loop forever * // main process listener event fpm_event_loop (0); run_child: /* only workers reach this point */fpm_cleanups_run (FPM_CLEANUP_CHILD); * max_requests = fpm_globals.max_requests; // return the socket handle return fpm_globals.listening_socket ;}
This function first creates a sub-process and places the process-related information in the struct fpm_worker_pool_s * wp pointer,
The parent process calls fpm_event_loop (0), the child process directly returns the sock handle, and then continues to run the code after fpm_main.c 1848 for the accept operation;
Fpm_event_loop is the time event listening operation,
Code segment 6:
Void fpm_event_loop (int err)/* {*/{static struct fpm_event_s signal_fd_event;/* sanity check */if (fpm_globals.parent_pid! = Getpid () {return;} // register callback = fpm_got_signal/** set the signal_fd_event parameter, ev-> fd = sp [0] fpm_signals_get_fd () is the MPs queue handle sp [0]: read; **/fpm_event_set (& signal_fd_event, events (), FPM_EV_READ, & fpm_got_signal, NULL); fpm_event_add (& signal_fd_event, 0 ); /* add timers */if (fpm_globals.heartbeat> 0) {// fpm timeout detection processing fpm_pctl_heartbeat (NULL, 0, NULL );}....... while (1 ){............... // wait event ret = module-> wait (fpm_event_queue_fd, timeout );............. while (q) {fpm_clock_get (& now); if (q-> ev) {if (timercmp (& now, & q-> ev-> timeout,>) | timercmp (& now, & q-> ev-> timeout, = )) {// Signal Processing: SIGCHILD fpm_got_signal fpm_event_fire (q-> ev );.............}}}}
Register the fpm_got_signal callback function, which will be executed during the following listening time events.
Pay attention to the fpm_pctl_heartbeat function. Here is the place where the slow script record is executed. Let's see it later,
Fpm_process_ctl.c row 442
Code segment 7:
Void fpm_pctl_heartbeat (struct fpm_event_s * ev, short which, void * arg)/* {*/{static struct fpm_event_s heartbeat; struct timeval now; if! = Getpid () {return;/* sanity check */} if (which = FPM_EV_TIMEOUT) {fpm_clock_get (& now); // obtain the time // detect slowlog, values (& now); return;}/* ensure heartbeat is not lower than FPM_PCTL_MIN_HEARTBEAT */fpm_globals.heartbeat = MAX (fpm_globals.heartbeat, FPM_PCTL_MIN_HEARTBEAT ); /* first call without setting to initialize the timer */zlog (ZLOG_DEBUG, "heartbeat have been set up with a timeout of % dms", timeout); terminate (& heartbeat, FPM_EV_PERSIST, & fpm_pctl_heartbeat, NULL); fpm_event_add (& heartbeat, fpm_globals.heartbeat );}
When the fpm_ptcl_check_request_timeout function is called, it may be called only when witch = FPM_EV_TIMEOUT. The witch passed by the above function is 0, and it cannot be executed. Yes, the first execution will not be performed. The first execution will call rows 463 and 464 to register the time event and use fpm_ptcl_hearbeat as a callback function. Therefore, this does not affect us. Continue to check the fpm_pctl_check_request_timeout function.
Code Segment 8:
Static void fpm_pctl_check_request_timeout (struct timeval * now )/*{{{*/{........ // detect slow log fpm_request_check_timed_out (child, now, terminate_timeout, slowlog_timeout );..........}
Another call, fpm_request_check_timed_out, continues
Fpm_request.c, row 230, finally arrived,
Code segment 9:
Void fpm_request_check_timed_out (struct fpm_child_s * child, struct timeval * now, int terminate_timeout, int slowlog_timeout )/*{{{*/{......... # if HAVE_FPM_TRACE/** detects the sub-process running status and time **/if (child-> slow_logged. TV _sec = 0 & slowlog_timeout & proc. request_stage = FPM_REQUEST_EXECUTING & TV. TV _sec> = slowlog_timeout) {str_purify_filename (purified_script_filename, proc. script_filename, sizeof (proc. script_filename); child-> slow_logged = proc. accepted; child-> tracer = fpm_php_trace; // register the callback function that outputs the trace information // pause the child process fpm_trace_signal (child-> pid ); // log zlog for slow log (ZLOG_WARNING, "[pool % s] child % d, script '% s' (request: \" % s \") executing too slow (% d. % 06d sec), logging ", child-> wp-> config-> name, (int) child-> pid, purified_script_filename, proc. request_method, proc. request_uri, (int) TV. TV _sec, (int) TV. TV _usec);} else # endif if (terminate_timeout & TV. TV _sec> = terminate_timeout) {str_purify_filename (purified_script_filename, proc. script_filename, sizeof (proc. script_filename); fpm_pctl_kill (child-> pid, FPM_PCTL_TERM); zlog (ZLOG_WARNING, "[pool % s] child % d, script '% s' (request: \ "% s \") execution timed out (% d. % 06d sec), terminating ", child-> wp-> config-> name, (int) child-> pid, purified_script_filename, proc. request_method, proc. request_uri, (int) TV. TV _sec, (int) TV. TV _usec );}}.........}
Here is the key code. The above judgment is that if the slow log & slowlog_timeout & the current process status is Running & the execution time of the process is greater than the configured time,
Slowlog_timeout is the request_slowlog_timeout parameter configured in php-fpm.conf,
Then a callback function fpm_php_trace will be registered to execute in the main process, pause the sub-process, and record the slowlog logs.
If (terminate_timeout & TV. TV _sec> = terminate_timeout) {here is the code for interrupting the request. if request_terminate_timeout is set, it will be executed here.
What does the above pause process mean? Why do we pause the sub-process?
The purpose of suspending a child process is to allow the parent process to obtain detailed information about the current running of the child process and print the trace,
Now return to the fpm_event_loop function. Let's take a look at this process and return to the fpm_event_loop function.
Row 3 is an epoll_wait event. It supports select, poll, and epoll.
Let's look at Row 3. The fpm_event_fire function will execute the fpm_got_signal function. Do you still remember the signal processing function mentioned above for callback?
Right, it will be executed when a time event is generated:
Fpm_events.c 52 rows
The fpm_children_bury function calls waitpid,
WNOHANG | WUNTRACED indicates that if any process is paused, the system returns immediately,
If the process is detected to be paused, The fpm_php_trace will be executed to throw trace information to the log (fpm_php_trace is the callback registered in code block 9 above), and finally send a signal to the sub-process to continue running.
Figure 8
When the sub-process receives a signal to continue running, it will continue to run the code after fpm_main.c 1848, receive user requests, and process php scripts.
This is the execution process of Fpm, which is complex and involves a large amount of code.
The Fpm workflow is clear, but the above question has not been answered. Why does the process crash after the log of the slow script is thrown?
For the answer to this question, see: fpm enables slowlog Fsockopen and Operation now in progress problem tracking 2.
Source: http://www.imsiren.com/archives/1088