Sharing data between the worker processes of Nginx, php-fpm and similar multi-process services
Concepts:
1. minit: PHP extension module-initialization hook, called once when the module starts.
2. rinit: PHP extension request-initialization hook, called once per request.
3. ClusterMap (cm): provides service location and cluster-map functions; it collects node state through heartbeats and active probing, manages heterogeneous clusters uniformly, and replaces hardware load-balancing equipment.
4. cmsubproxy: the ClusterMap subscriber client agent inside the service; it communicates with the server periodically, fetches the latest cluster information, and updates the machine list it maintains internally.
Problem Description
Nginx and php-cgi both use multiple processes to serve high concurrency. If such a service wants to offer a common internal function module, you have to write an extension or module for it. A recent example is the ClusterMap subscriber client, a PHP extension: when a request arrives, the extension talks to cmserver to fetch the latest machine list, but paying that cost on every request is expensive.
In Apache's module mode the implementation is simple. Apache first starts a parent process A, which calls the minit method and then forks the other httpd child processes B. A and B are parent and child, so A can refresh the cluster information periodically and push it to the children through a pipe; on each request a child reads the pipe message (that is, the machine list) to perform service location. The php-fpm model is slightly different: the php-fpm process manager starts a process A that calls minit, then forks an fpm-master process B, and B starts several php-cgi child processes C. Once startup completes, the initial process A exits, and the children only call rinit on each request. The parent-child pipe between A and C is therefore never established, the pipe data is never consumed, and once the pipe is written full, writes to it block. This situation is actually quite common, and patching PHP's source code to work around it is not necessarily a good solution.
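For contrast, here is a minimal, self-contained sketch of the Apache-style model described above (a toy machine list and a single forked worker, not the real Apache or ClusterMap code): the process that ran the minit-style initialization stays alive as the parent, periodically writes the list into a pipe, and the forked worker consumes it per "request".
#include <cstdio>
#include <string>
#include <unistd.h>
#include <sys/wait.h>
// Toy illustration of the Apache-style model: the initializing process stays
// alive, periodically publishes a (hypothetical) machine list into a pipe,
// and the forked worker reads it.
int main()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t pid = fork();
    if (pid == 0)                       // worker ("httpd child")
    {
        close(fds[1]);
        char buf[256];
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof(buf) - 1)) > 0)   // per "request"
        {
            buf[n] = '\0';
            fprintf(stdout, "worker got machine list: %s\n", buf);
        }
        return 0;
    }
    close(fds[0]);                      // parent ("the process that ran minit")
    for (int i = 0; i < 3; ++i)
    {
        std::string list = "10.0.0.1,10.0.0.2 v" + std::to_string(i);
        write(fds[1], list.c_str(), list.size());   // periodic cluster update
        sleep(1);
    }
    close(fds[1]);
    waitpid(pid, NULL, 0);
}
In php-fpm mode the process on the writing side of this pipe exits after startup, which is exactly why the pipe data is never refreshed or consumed there.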
Problem Analysis
To sum up: we have a number of worker processes, and on each request a process first needs to do service location, that is, obtain the latest machine list (a network round trip), before forwarding the request to other services. Taking fpm-php as the example, let's work through the options.
Scenario one: every request fetches the latest machine list from the server in rinit.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ... ten seconds have passed.
If you still haven't spotted the problem, there's no point learning nginx, that thing that runs faster than a donkey. Obviously every request pays a network round trip in rinit to fetch the latest machine list from the server, which greatly increases the response time of the whole request.
At this point someone will say the problem is easy to fix: just don't update so often. Update once every 10 or 100 requests, or once every 1s or 10s; that barely affects performance yet still keeps the list reasonably fresh. It is a trade-off between performance and update frequency, and you can always tune it. Hence scenario two.
Scenario two: each request in rinit first fetches the latest machine list from the server, along with an expiration time; subsequent requests skip the server round trip until that expiration time has passed.
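A minimal sketch of scenario two, per worker process (names such as fetchFromCmServer and getMachineList are illustrative, not the real extension API): cache the machine list together with the expiration time granted by the server, and only pay the network round trip once the cache has expired.
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>
// Per-process cache of the machine list plus its expiry (illustrative names).
static std::vector<std::string> g_machines;
static int64_t g_expireAtUs = 0;              // cache expiry, microseconds
static int64_t nowUs()
{
    using namespace std::chrono;
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}
// Stand-in for the real cmserver round trip: fills the list, returns a TTL in us.
static int64_t fetchFromCmServer(std::vector<std::string>& machines)
{
    machines = {"10.0.0.1:8080", "10.0.0.2:8080"};
    return 10 * 1000 * 1000;                  // pretend the server granted a 10s TTL
}
// Called from the rinit path of each request.
const std::vector<std::string>& getMachineList()
{
    int64_t now = nowUs();
    if (now >= g_expireAtUs)                  // expired: pay one network round trip
    {
        g_expireAtUs = now + fetchFromCmServer(g_machines);
    }
    return g_machines;                        // fresh enough: serve from the cache
}
int main()
{
    for (int i = 0; i < 3; ++i)
        fprintf(stdout, "request %d sees %zu machines\n", i, getMachineList().size());
}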
Scenario two does solve the problem, but strictly speaking only part of it: it treats the symptom, not the cause.
The reason is that for better performance each process has to stretch out its update cycle and lose accuracy, while for higher accuracy each process has to update frequently. For search and advertising, high-concurrency and latency-sensitive services, this trade-off is far too unfriendly. Worst of all, every worker process has to do its own update even though they all end up with exactly the same information. This is not to say the nginx model is bad; the model has its point. Each worker is an independent process, programming is simple, no locking is needed, and processes do not affect each other, which reduces risk.
Scenario three: use shared memory. Start a dedicated update process that refreshes the cluster node information in real time and writes it into shared memory; in rinit, each request simply reads the shared memory to get the latest machine list.
Scenario three exploits the fact that all worker processes need exactly the same machine list. By sharing the data between processes through shared memory, a worker needs no network round trip and can get the latest machine list almost instantly.
Solution
At present, ClusterMap adopts scenario three and solves the problem with shared memory.
cmsubproxy is a dedicated update process. Every 500ms it sends a request to cmserver to fetch the latest machine list; after receiving the response it updates the machine list it maintains internally, and once the update succeeds it writes the list into shared memory.
In the php-fpm processes, when a request arrives the extension first reads the machine list from shared memory and then forwards the request to one of the available machines in the list; the machine can be chosen by any of several strategies (round robin, random, weights, consistent hashing, and so on).
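For illustration, two of those selection strategies over the machine list read from shared memory might look like this (the Machine struct and function names are hypothetical):
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>
struct Machine { std::string addr; uint32_t weight; };
// Round robin: a simple per-process counter is enough.
const Machine& pickRoundRobin(const std::vector<Machine>& list)
{
    static uint64_t counter = 0;
    return list[counter++ % list.size()];
}
// Weighted random: draw in [0, totalWeight) and walk the weights.
const Machine& pickWeightedRandom(const std::vector<Machine>& list)
{
    uint64_t total = 0;
    for (const Machine& m : list) total += m.weight;
    uint64_t r = (uint64_t)rand() % total;
    for (const Machine& m : list)
    {
        if (r < m.weight) return m;
        r -= m.weight;
    }
    return list.back();                 // not reached when weights are consistent
}
int main()
{
    std::vector<Machine> list = {{"10.0.0.1:8080", 1}, {"10.0.0.2:8080", 3}};
    for (int i = 0; i < 4; ++i)
        fprintf(stdout, "rr -> %s, wr -> %s\n",
                pickRoundRobin(list).addr.c_str(),
                pickWeightedRandom(list).addr.c_str());
}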
The shared memory is opened via mmap. Note that both updates and reads must take a read-write lock, and the lock itself must live in shared memory. To share a pthread read-write lock across processes:
pthread_rwlockattr_t attr;
pthread_rwlockattr_init(&attr);
pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
The man page describes pthread_rwlockattr_setpshared as follows:
int pthread_rwlockattr_setpshared(pthread_rwlockattr_t *attr, int pshared);
DESCRIPTION
The pthread_rwlockattr_setpshared() function sets the process-shared attribute of attr to the value referenced by pshared. pshared may be one of two values:
PTHREAD_PROCESS_SHARED: Any thread of any process that has access to the memory where the read/write lock resides can manipulate the lock.
The data cmsubproxy writes into shared memory comes in a full part plus an incremental part, with the incremental data written after the full data; the details are not covered here.
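Purely as an illustration of "incremental written after full" (the real cmsubproxy layout is not documented here), the shared block could be pictured like this:
#include <cstdint>
#include <pthread.h>
// Hypothetical layout, only to make "incremental after full" concrete.
struct SharedHeader
{
    pthread_rwlock_t _lock;     // process-shared read-write lock (see above)
    int64_t  _version;          // bumped on every successful publish
    int64_t  _updateTime;
    uint32_t _fullSize;         // byte length of the full machine-list snapshot
    uint32_t _incrSize;         // byte length of the incremental records
    // the mapping then continues with: [full snapshot bytes][incremental bytes]
};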
Shared memory between processes: deadlock caused by a process exiting abnormally
Having confirmed that the problem is caused by a process exiting abnormally after acquiring the read lock, I wrote a test program to reproduce it.
(!2293)-> cat test/read_shared.cpp
#include <pthread.h>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>
#include <unistd.h>
// (the project header declaring sharedupdatedata and cm_sub::cmmapfile was
//  stripped from the original listing)
sharedupdatedata* _sharedupdatedata = NULL;
cm_sub::cmmapfile* _mmapfile = NULL;
int32_t initsharedmemread(const std::string& mmap_file_path)
{
    _mmapfile = new (std::nothrow) cm_sub::cmmapfile();
    if (_mmapfile == NULL || !_mmapfile->open(mmap_file_path.c_str(), FILE_OPEN_WRITE))
    {
        return -1;
    }
    _sharedupdatedata = (sharedupdatedata*)_mmapfile->offset2addr(0);
    return 0;
}
int main(int argc, char** argv)
{
    if (initsharedmemread(argv[1]) != 0) return -1;
    int cnt = 100;
    while (cnt > 0)
    {
        pthread_rwlock_rdlock(&(_sharedupdatedata->_lock));
        fprintf(stdout, "version = %ld, readers = %u\n",
                _sharedupdatedata->_version, _sharedupdatedata->_lock.__data.__nr_readers);
        if (cnt == 90)
        {
            exit(0);    // exit abnormally while still holding the read lock
        }
        sleep(1);
        pthread_rwlock_unlock(&(_sharedupdatedata->_lock));
        --cnt;
        usleep(100*1000);
    }
    delete _mmapfile;
}
(!2293)-> cat test/write_shared.cpp
#include <pthread.h>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <new>
#include <sys/mman.h>
#include <unistd.h>
// (the project headers declaring sharedupdatedata, cm_sub::cmmapfile and
//  autil::TimeUtility were stripped from the original listing)
sharedupdatedata* _sharedupdatedata = NULL;
cm_sub::cmmapfile* _mmapfile = NULL;
int32_t initsharedmemwrite(const char* mmap_file_path)
{
    _mmapfile = new (std::nothrow) cm_sub::cmmapfile();
    if (_mmapfile == NULL || !_mmapfile->open(mmap_file_path, FILE_OPEN_WRITE, 1024))
    {
        return -1;
    }
    _sharedupdatedata = (sharedupdatedata*)_mmapfile->offset2addr(0);
    madvise(_sharedupdatedata, 1024, MADV_SEQUENTIAL);
    pthread_rwlockattr_t attr;
    memset(&attr, 0x0, sizeof(pthread_rwlockattr_t));
    if (pthread_rwlockattr_init(&attr) != 0 ||
        pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0)
    {
        return -1;
    }
    pthread_rwlock_init(&(_sharedupdatedata->_lock), &attr);
    _sharedupdatedata->_updatetime = autil::TimeUtility::currentTime();
    _sharedupdatedata->_version = 0;
    return 0;
}
int main()
{
    if (initsharedmemwrite("data.mmap") != 0) return -1;
    int cnt = 200;
    while (cnt > 0)
    {
        pthread_rwlock_wrlock(&(_sharedupdatedata->_lock));
        ++_sharedupdatedata->_version;
        fprintf(stdout, "version = %ld, readers = %u\n",
                _sharedupdatedata->_version, _sharedupdatedata->_lock.__data.__nr_readers);
        sleep(1);
        pthread_rwlock_unlock(&(_sharedupdatedata->_lock));
        --cnt;
        usleep(100*1000);
    }
    delete _mmapfile;
}
Whether it is the read process or the write process, if it exits after taking the lock and before it gets a chance to release it, this problem appears.
How to solve
Now that the problem is reproduced, how do we solve it properly? Searching around, there is no good off-the-shelf answer for a read-write lock, so it has to be handled in our own logic. What I could come up with is a timeout mechanism: the writer adds a timeout when taking the lock, and if it still cannot acquire the lock when the timeout expires, it assumes a deadlock and decrements the reader count by 1. This is a brute-force fix; if anyone has a better solution, please point me to it.
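A rough sketch of that brute-force timeout idea, assuming the same _lock layout as the test programs above. It pokes at the glibc-internal __data.__nr_readers field (the same one the test programs print), so it is fragile and only meant to illustrate the approach:
#include <cerrno>
#include <cstdio>
#include <ctime>
#include <pthread.h>
// Writer-side acquire with a timeout; on timeout, assume a reader died while
// holding the lock and forcibly give back one reader count (glibc-internal,
// brittle, purely illustrative).
bool wrlockWithRecovery(pthread_rwlock_t* lock, int timeoutSec)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec += timeoutSec;
    int ret = pthread_rwlock_timedwrlock(lock, &ts);
    if (ret == 0) return true;
    if (ret != ETIMEDOUT) return false;
    fprintf(stderr, "wrlock timed out, forcing reader count down by one\n");
    if (lock->__data.__nr_readers > 0)
        --lock->__data.__nr_readers;
    return false;                      // caller retries on the next update cycle
}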
A few words on read-write locks versus mutexes. Compared with a mutex, a read-write lock suits read-mostly scenarios, particularly when readers need to hold the lock for a relatively long time. My scenario is read-mostly with very short hold times for both reads and writes, so the performance of a mutex and a read-write lock should not differ much. In fact a read-write lock uses a mutex internally anyway, just held for a very short time: it locks the mutex-protected area, checks whether anyone is writing, and releases it.
Note that read-write locks prefer writers by default: while a write is in progress, or a writer is queued waiting to write, no new read locks are granted and readers have to wait.
OK, let's see whether a mutex can solve our problem. Mutexes have an attribute for exactly this situation: the robust lock.
Marking a lock as robust: pthread_mutexattr_setrobust_np
The robustness attribute defines the behavior when the owner of a mutex dies. The value of robustness could be either PTHREAD_MUTEX_ROBUST_NP or PTHREAD_MUTEX_STALLED_NP, which are defined by the header <pthread.h>. The default value of the robustness attribute is PTHREAD_MUTEX_STALLED_NP.
When the owner of a mutex with the PTHREAD_MUTEX_STALLED_NP robustness attribute dies, all future calls to pthread_mutex_lock(3C) for this mutex will be blocked from progress in an unspecified manner.
Recovering an inconsistent robust lock: pthread_mutex_consistent_np
A consistent mutex becomes inconsistent and is unlocked if its owner dies while holding it, or if the process containing the owner of the mutex unmaps the memory containing the mutex or performs one of the exec(2) functions. A subsequent owner of the mutex will acquire the mutex with pthread_mutex_lock(3C), which will return EOWNERDEAD to indicate that the acquired mutex is inconsistent.
The pthread_mutex_consistent_np() function should be called while holding the mutex acquired by a previous call to pthread_mutex_lock() that returned EOWNERDEAD.
Since the critical section protected by the mutex could have been left in an inconsistent state by the dead owner, the caller should make the mutex consistent only if it is able to make the critical section protected by the mutex consistent.
In simple terms: when pthread_mutex_lock returns EOWNERDEAD, the calling thread becomes the new owner of the lock; pthread_mutex_consistent_np then checks that the mutex is a robust lock whose previous owner died and marks it consistent again, so the lock becomes usable once more. Very simple.
Recovering the lock is only half the story. When processes share data, you also have to care about the integrity of that data. This is where process sharing differs from thread sharing: if one thread of a process dies, the whole process exits, but processes are independent, so if the writing process dies partway through writing the shared data, the data is left incomplete and readers will read garbage. Data integrity is actually easy to ensure: put a completion flag in the shared memory; lock the shared area, write the data, and only then set the flag to complete; readers check the flag before using the data.
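A minimal sketch of that completion-flag idea on top of a robust mutex (the SharedBlock fields and function names are illustrative, not the actual cmsubproxy code): the writer clears the flag, writes the payload, then sets the flag; any reader, including one that recovers the lock via EOWNERDEAD, checks the flag and simply ignores half-written data.
#include <pthread.h>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
// Illustrative layout: a completion flag next to the payload in the shared block.
struct SharedBlock
{
    pthread_mutex_t _lock;        // robust, process-shared mutex
    volatile int32_t _complete;   // 1 = payload fully written, 0 = write in progress
    int64_t _version;
    char    _payload[256];
};
// Take the lock, recovering it if the previous owner died while holding it.
static void lockRecovering(SharedBlock* blk)
{
    if (pthread_mutex_lock(&blk->_lock) == EOWNERDEAD)
        pthread_mutex_consistent_np(&blk->_lock);
}
void publish(SharedBlock* blk, const char* data)
{
    lockRecovering(blk);
    blk->_complete = 0;                                   // mark write in progress
    strncpy(blk->_payload, data, sizeof(blk->_payload) - 1);
    blk->_payload[sizeof(blk->_payload) - 1] = '\0';
    ++blk->_version;
    blk->_complete = 1;                                   // mark fully written
    pthread_mutex_unlock(&blk->_lock);
}
bool readLatest(SharedBlock* blk, char* out, size_t outLen)
{
    lockRecovering(blk);
    bool ok = (blk->_complete == 1);                      // skip half-written data
    if (ok)
        strncpy(out, blk->_payload, outLen - 1);
    pthread_mutex_unlock(&blk->_lock);
    return ok;
}
int main()
{
    static SharedBlock blk;                               // in real use this lives in the mmap
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust_np(&attr, PTHREAD_MUTEX_ROBUST_NP);
    pthread_mutex_init(&blk._lock, &attr);
    publish(&blk, "10.0.0.1,10.0.0.2");
    char buf[256] = {0};
    if (readLatest(&blk, buf, sizeof(buf)))
        fprintf(stdout, "version %ld complete, machines: %s\n", blk._version, buf);
}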
Test code:
(!2295)-> cat test/read_shared_mutex.cpp
#include <pthread.h>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <new>
#include <string>
#include <unistd.h>
// (the project header declaring sharedupdatedata and cm_sub::cmmapfile was
//  stripped from the original listing)
sharedupdatedata* _sharedupdatedata = NULL;
cm_sub::cmmapfile* _mmapfile = NULL;
int32_t initsharedmemread(const std::string& mmap_file_path)
{
    _mmapfile = new (std::nothrow) cm_sub::cmmapfile();
    if (_mmapfile == NULL || !_mmapfile->open(mmap_file_path.c_str(), FILE_OPEN_WRITE))
    {
        return -1;
    }
    _sharedupdatedata = (sharedupdatedata*)_mmapfile->offset2addr(0);
    return 0;
}
int main(int argc, char** argv)
{
    if (argc != 2) return -1;
    if (initsharedmemread(argv[1]) != 0) return -1;
    int cnt = 10000;
    int ret = 0;
    while (cnt > 0)
    {
        ret = pthread_mutex_lock(&(_sharedupdatedata->_lock));
        if (ret == EOWNERDEAD)
        {
            fprintf(stdout, "%s: version = %ld, lock = %d, %u, %d\n",
                    strerror(ret),
                    _sharedupdatedata->_version,
                    _sharedupdatedata->_lock.__data.__lock,
                    _sharedupdatedata->_lock.__data.__count,
                    _sharedupdatedata->_lock.__data.__owner);
            ret = pthread_mutex_consistent_np(&(_sharedupdatedata->_lock));
            if (ret != 0)
            {
                fprintf(stderr, "%s\n", strerror(ret));
                pthread_mutex_unlock(&(_sharedupdatedata->_lock));
                continue;
            }
        }
        fprintf(stdout, "version = %ld, lock = %d, %u, %d\n",
                _sharedupdatedata->_version,
                _sharedupdatedata->_lock.__data.__lock,
                _sharedupdatedata->_lock.__data.__count,
                _sharedupdatedata->_lock.__data.__owner);
        sleep(5);
        pthread_mutex_unlock(&(_sharedupdatedata->_lock));
        usleep(500*1000);
        --cnt;
    }
    fprintf(stdout, "go on\n");
    delete _mmapfile;
}
(!2295)-> cat test/write_shared_mutex.cpp
#include <pthread.h>
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <new>
#include <sys/mman.h>
#include <unistd.h>
// (the project header declaring sharedupdatedata and cm_sub::cmmapfile was
//  stripped from the original listing)
sharedupdatedata* _sharedupdatedata = NULL;
cm_sub::cmmapfile* _mmapfile = NULL;
int32_t initsharedmemwrite(const char* mmap_file_path)
{
    _mmapfile = new (std::nothrow) cm_sub::cmmapfile();
    if (_mmapfile == NULL || !_mmapfile->open(mmap_file_path, FILE_OPEN_WRITE, 1024))
    {
        return -1;
    }
    _sharedupdatedata = (sharedupdatedata*)_mmapfile->offset2addr(0);
    madvise(_sharedupdatedata, 1024, MADV_SEQUENTIAL);
    pthread_mutexattr_t attr;
    memset(&attr, 0x0, sizeof(pthread_mutexattr_t));
    if (pthread_mutexattr_init(&attr) != 0 ||
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0)
    {
        return -1;
    }
    if (pthread_mutexattr_setrobust_np(&attr, PTHREAD_MUTEX_ROBUST_NP) != 0)
    {
        return -1;
    }
    pthread_mutex_init(&(_sharedupdatedata->_lock), &attr);
    _sharedupdatedata->_version = 0;
    return 0;
}
int main()
{
    if (initsharedmemwrite("data.mmap") != 0) return -1;
    int cnt = 200;
    int ret = 0;
    while (cnt > 0)
    {
        ret = pthread_mutex_lock(&(_sharedupdatedata->_lock));
        if (ret == EOWNERDEAD)
        {
            fprintf(stdout, "%s: version = %ld, lock = %d, %u, %d\n",
                    strerror(ret),
                    _sharedupdatedata->_version,
                    _sharedupdatedata->_lock.__data.__lock,
                    _sharedupdatedata->_lock.__data.__count,
                    _sharedupdatedata->_lock.__data.__owner);
            ret = pthread_mutex_consistent_np(&(_sharedupdatedata->_lock));
            if (ret != 0)
            {
                fprintf(stderr, "%s\n", strerror(ret));
                pthread_mutex_unlock(&(_sharedupdatedata->_lock));
                continue;
            }
        }
        ++_sharedupdatedata->_version;
        fprintf(stdout, "version = %ld, lock = %d, %u, %d\n", _sharedupdatedata->_version,
                _sharedupdatedata->_lock.__data.__lock,
                _sharedupdatedata->_lock.__data.__count,
                _sharedupdatedata->_lock.__data.__owner);
        usleep(1000*1000);
        pthread_mutex_unlock(&(_sharedupdatedata->_lock));
        --cnt;
        usleep(500*1000);
    }
    delete _mmapfile;
}
BTW: everyone knows locks have overhead, not only the waiting caused by mutual exclusion but also the cost of the system call into the kernel when acquiring the lock. There is a kind of mutual-exclusion primitive called a futex (fast userspace mutex), supported by Linux since 2.5.7. A futex performs better because it is a hybrid user-space/kernel synchronization mechanism: when there is no contention the lock can be acquired entirely in user space with no system call, and the kernel is entered only when there is contention.
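As a rough illustration of that fast path (a simplified version of the classic userspace lock from Ulrich Drepper's "Futexes Are Tricky", not the lock ClusterMap actually uses): the uncontended acquire and release are a single atomic operation in user space, and the futex system call is made only under contention.
#include <atomic>
#include <cstdio>
#include <thread>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
// State: 0 = unlocked, 1 = locked without waiters, 2 = locked with waiters.
static long sys_futex(void* addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, nullptr, nullptr, 0);
}
struct futex_lock
{
    std::atomic<int> state{0};
    void lock()
    {
        int c = 0;
        // Fast path: uncontended acquire is one CAS in user space, no syscall.
        if (state.compare_exchange_strong(c, 1)) return;
        // Slow path: mark the lock contended and sleep in the kernel.
        if (c != 2) c = state.exchange(2);
        while (c != 0)
        {
            sys_futex(&state, FUTEX_WAIT, 2);   // sleep while the value is still 2
            c = state.exchange(2);
        }
    }
    void unlock()
    {
        // Fast path: if nobody was marked as waiting, no syscall is needed.
        if (state.exchange(0) == 2)
            sys_futex(&state, FUTEX_WAKE, 1);   // wake one waiter
    }
};
int main()
{
    futex_lock lk;
    long counter = 0;
    auto work = [&]() {
        for (int i = 0; i < 100000; ++i) { lk.lock(); ++counter; lk.unlock(); }
    };
    std::thread t1(work), t2(work);
    t1.join(); t2.join();
    fprintf(stdout, "counter = %ld (expected 200000)\n", counter);
}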
Of course, any lock still has a cost that cannot be avoided entirely. Techniques such as double buffering, deferred-release lists, and reference counting can to some extent replace the use of locks.
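For example, a minimal double-buffer sketch (illustrative only): the writer fills the inactive copy and atomically flips an index, and readers read whichever copy the index points at without taking a lock. Readers that are still inside the old copy when the writer reuses it are exactly why real systems combine this with reference counting or a grace period.
#include <atomic>
#include <cstdio>
#include <string>
struct DoubleBuffer
{
    std::string bufs[2];
    std::atomic<int> current{0};
    void publish(const std::string& data)
    {
        int next = 1 - current.load(std::memory_order_relaxed);
        bufs[next] = data;                                   // write the inactive copy
        current.store(next, std::memory_order_release);      // flip: readers now see it
    }
    std::string read() const
    {
        int cur = current.load(std::memory_order_acquire);
        return bufs[cur];                                     // lock-free read
    }
};
int main()
{
    DoubleBuffer db;
    db.publish("10.0.0.1,10.0.0.2");
    fprintf(stdout, "readers see: %s\n", db.read().c_str());
    db.publish("10.0.0.1,10.0.0.3");
    fprintf(stdout, "readers see: %s\n", db.read().c_str());
}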