In server development, in order to make full use of multiple cores or even multiple CPUs, or simply to make the logic easier to write, we use multiple processes (e.g. one process responsible for one piece of the logic), multiple threads (different users assigned to different threads), or coroutines (different users assigned to different coroutines, switching to another coroutine whenever one has to wait). These techniques, such as multi-process combined with multi-threading, are often used at the same time.
A classic server cluster can be described by the following architecture: a set of server processes work together to provide services to the users. A central server schedules the cluster, while logical server 1 provides the login service and logical server 2 the purchase service; alternatively the two servers provide the same services, and users are assigned to one server or the other depending on the current load.
To ease development, these servers usually turn to multi-threading. A classic pattern is several network threads handling network events plus one thread processing the user logic, and the user logic itself can be further multi-threaded. Under multi-threading, access to shared resources becomes troublesome: different threads have to take locks, and locks are notoriously a root of evil, bringing hard-to-find deadlocks and poor performance. Still, there are specific techniques for each of these problems. Take the deadlock caused by the same thread acquiring the same lock twice: it can be solved with a reentrant counting lock. While one thread holds the lock, any other thread that tries to lock it is put on a wait queue and scheduled out; every time the owning thread locks it again the reference count is incremented, every unlock decrements it, and the lock is really released only when the count drops to zero. Pseudo-code for such a lock looks roughly like this:
// int GetThreadId()              -- get the id of the calling thread
// void SwitchToThread()          -- yield the CPU to another thread
// void Wakeup(int id)            -- wake up the thread with the given id
// bool Exchange(bool* m, bool v) -- atomically set *m to v, return the old value
// bool Load(bool* m)             -- atomically read a bool variable

struct Mutex {
    Mutex() { islock = false; lockid = -1; ref = 0; }
    bool islock;
    volatile int lockid;     // id of the owning thread
    int ref;                 // recursion count of the owning thread
    std::queue<int> waitth;  // threads waiting for the lock
};

void Lock(Mutex& mu) {
    int id = GetThreadId();
    if (id == mu.lockid) {   // already the owner: recursive acquisition
        mu.ref++;
        return;
    }
    // Try to take the lock; park and retry until the exchange succeeds.
    while (Exchange(&mu.islock, true)) {
        mu.waitth.push(id);
        SwitchToThread();    // scheduled out until Unlock wakes us
    }
    mu.lockid = id;
    mu.ref = 1;
}

void Unlock(Mutex& mu) {
    int id = GetThreadId();
    if (id != mu.lockid)
        return;              // only the owner may unlock
    if (--mu.ref == 0) {     // last unlock: actually release
        mu.lockid = -1;
        Exchange(&mu.islock, false);
        if (!mu.waitth.empty()) {
            int next = mu.waitth.front();
            mu.waitth.pop();
            Wakeup(next);    // let one waiter retry the exchange
        }
    }
}
Given this pseudo-code, the atomic operations and thread scheduling map roughly to the following on Windows:
int GetThreadId()    { return (int)GetCurrentThread(); } // note: this returns a pseudo-handle, not a real thread id
int SwitchToThread() { return ::SwitchToThread(); }      // qualified to call the Win32 API, not this wrapper
int Wakeup(int id)   { return ResumeThread((HANDLE)id); }
In practice, libraries already provide this: boost::recursive_mutex in Boost and the CRITICAL_SECTION on Windows are reentrant, whereas the default pthread_mutex_t on Linux is not.
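That said, on Linux a pthread mutex can be made reentrant explicitly through a mutex attribute; a minimal sketch:

#include <pthread.h>

pthread_mutex_t mu;

void init_recursive_mutex(void) {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    // Default pthread mutexes are not reentrant; request a recursive one.
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&mu, &attr);
    pthread_mutexattr_destroy(&attr);
}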
As for the deadlock caused by two threads acquiring the same two resources in opposite orders, it can be avoided by always accessing the resources in the same global order.
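For example, a minimal sketch: if every thread locks resource A before resource B, the circular wait that produces this kind of deadlock can never form.

#include <mutex>

std::mutex mu_a;  // protects resource A
std::mutex mu_b;  // protects resource B

void worker() {
    // Every thread follows the same order: A first, then B.
    std::lock_guard<std::mutex> la(mu_a);
    std::lock_guard<std::mutex> lb(mu_b);
    // ... use both resources ...
}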
As for the performance cost of locks, one pragmatic trick is to keep each user's data independent and to access the shared resources with efficient algorithms such as RCU or lock-free algorithms.
For example, in a many-readers, many-writers situation where read requests far outnumber write requests, we can use the RCU algorithm.
With RCU, a read simply reads the data directly, while a write takes a local copy of what it read and registers a write callback; once the outstanding read operations have completed, the callback is invoked to finish the write. Between write operations a traditional synchronization mechanism is used.
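The core of the idea can be sketched with an atomic pointer swap in C++ (a simplified copy/update/publish sketch, not kernel RCU; safe reclamation of the old version is elided):

#include <atomic>
#include <mutex>

struct Config { int value; };

std::atomic<Config*> g_config{new Config{0}};
std::mutex g_write_mutex;  // traditional synchronization between writers

// Readers take no lock: they just load the current snapshot.
Config* ReadConfig() {
    return g_config.load(std::memory_order_acquire);
}

// Writers copy the current version, modify the copy, then publish it.
void UpdateConfig(int v) {
    std::lock_guard<std::mutex> guard(g_write_mutex);
    Config* old_version = g_config.load(std::memory_order_relaxed);
    Config* copy = new Config(*old_version);
    copy->value = v;
    g_config.store(copy, std::memory_order_release);
    // Real RCU defers freeing old_version until every reader that might
    // still see it has finished (the grace period); omitted here.
}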
As for lock-free algorithms, the classic examples are lock-free queues such as the Michael-Scott queue (MS-queue) and the optimistic queue.
The principle of the MS-queue is to use atomic operations: on enqueue, a new node is created and the data written into it, then CAS operations at the tail first point the next pointer of the tail node at the new node and then swing the tail pointer to the new node. On dequeue, a CAS operation moves the head pointer to the node after the current head.
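A sketch of this scheme with C++ atomics follows (the standard dummy-node MS-queue; safe memory reclamation of dequeued nodes, which requires hazard pointers or epochs, is deliberately omitted):

#include <atomic>

template <typename T>
class MSQueue {
    struct Node {
        T value;
        std::atomic<Node*> next{nullptr};
        Node() = default;
        explicit Node(const T& v) : value(v) {}
    };
    std::atomic<Node*> head;  // points at a dummy node
    std::atomic<Node*> tail;
public:
    MSQueue() {
        Node* dummy = new Node();
        head.store(dummy);
        tail.store(dummy);
    }
    void enqueue(const T& v) {
        Node* node = new Node(v);
        for (;;) {
            Node* last = tail.load();
            Node* next = last->next.load();
            if (last != tail.load()) continue;          // tail moved, retry
            if (next == nullptr) {
                // Link the new node after the current last node.
                if (last->next.compare_exchange_weak(next, node)) {
                    // Swing the tail; failure is fine, another thread helped.
                    tail.compare_exchange_weak(last, node);
                    return;
                }
            } else {
                // Tail is lagging behind; help advance it.
                tail.compare_exchange_weak(last, next);
            }
        }
    }
    bool dequeue(T& out) {
        for (;;) {
            Node* first = head.load();
            Node* last = tail.load();
            Node* next = first->next.load();
            if (first != head.load()) continue;
            if (first == last) {
                if (next == nullptr) return false;      // queue is empty
                tail.compare_exchange_weak(last, next); // help the tail along
            } else {
                out = next->value;
                // Move head to the next node; the old dummy becomes garbage.
                if (head.compare_exchange_weak(first, next))
                    return true;
            }
        }
    }
};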
However, while all this makes full use of the CPU, it also brings great inconvenience to programming; for example, extending a multi-threaded design across a cluster to push performance further turns it into multi-process code full of complex asynchronous calls.
The ideal situation would be that cross-process communication worked like this: start a remote request, wait for the remote response, then continue executing; just as in the multi-threaded case one thread per user can send a request and then call recv, continuing execution once the reply has returned.
However, thread scheduling at that granularity carries a huge overhead. In that situation the far more lightweight coroutine becomes the better choice: one coroutine per user. After a coroutine issues a remote request, execution switches to another awakened user coroutine; when the reply for the first coroutine comes back, that coroutine resumes and continues executing its logic code, as the sketch below shows.
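In code the flow looks roughly like this (purely illustrative names, not a real framework API):

// Illustrative sketch only: every name here is hypothetical.
struct Request {};
struct Response {};

void send_request(const Request&);   // hand the request to the async network layer
Response fetch_reply();              // take the reply stored by the network layer
void suspend_current_coroutine();    // yield into the coroutine scheduler

// Looks like a blocking call to the logic code, but never blocks the thread.
Response call_remote(const Request& req) {
    send_request(req);            // fire the request without waiting
    suspend_current_coroutine();  // scheduler runs other users' coroutines
    // ...the scheduler resumes us here once the reply has arrived...
    return fetch_reply();
}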
The problem is that coroutine scheduling is controlled by user code, so you have to implement the scheduling algorithm yourself, along with a matching lock algorithm. Even though the coroutines may all execute on a single thread, contention for resources between different coroutines still requires locks; and because the logic executes synchronously, using a spinlock would block the entire thread, so that under asynchronous network I/O even network events would get no response.
The scheduling of the coroutines is as follows:
context::context* scheduling::onscheduling() {
    context::context* _context = 0;
    // 1. Timed-wait tasks whose deadline has expired.
    {
        while (!time_wait_task_que.empty()) {
            uint64_t t = timer::clock();
            time_wait_handle top = time_wait_task_que.top();
            if (t <= top.handle->waittime)
                break;                        // earliest deadline not reached yet
            time_wait_task_que.pop();
            uint32_t _state = top.handle->_state_queue.front();
            top.handle->_state_queue.pop();
            if (_state == time_running) {
                continue;                     // already running, skip
            }
            _context = top.handle->_context;
            goto do_task;
        }
    }
    // 2. Coroutines that have been signalled (e.g. a lock was released).
    {
        while (!in_signal_context_list.empty()) {
            context::context* _context_ = in_signal_context_list.front();
            in_signal_context_list.pop();
            actuator* _actuator = context2actuator(_context_);
            if (_actuator == 0) {
                continue;
            }
            task* _task = _actuator->current_task();
            if (_task == 0) {
                continue;
            }
            if (_task->_state == time_wait_task) {
                _task->_state = running_task;
                _task->_wait_context._state_queue.front() = time_running;
            }
            _context = _context_;
            goto do_task;
        }
    }
    // 3. A user-supplied scheduling hook.
    {
        if (!_fn_scheduling.empty()) {
            _context = _fn_scheduling();
            goto do_task;
        }
    }
    // 4. Low-priority coroutines.
    {
        if (!low_priority_context_list.empty()) {
            _context = low_priority_context_list.front();
            low_priority_context_list.pop();
            goto do_task;
        }
    }
do_task:
    // Nothing runnable: create a fresh actuator and run its context.
    {
        if (_context == 0) {
            actuator* _actuator = _abstract_factory_actuator.create_product();
            _context = _actuator->context();
            _list_actuator.push_back(_actuator);
        }
    }
    return _context;
}
The lock algorithm follows the same recursive_mutex principle as before: when acquiring the lock fails, the current coroutine is scheduled out in favor of other user coroutines, and unlocking wakes up a waiting coroutine:
void mutex::lock() {
    if (!_mutex) {           // lock is free: take it and keep running
        _mutex = true;
    } else {
        // Lock is held: yield into the scheduler so other coroutines run;
        // this coroutine sits on wait_context_list until unlock() wakes it.
        _service_handle->scheduler();
    }
}

void mutex::unlock() {
    if (_mutex) {
        if (!wait_context_list.empty()) {
            // Wake one coroutine parked in lock().
            auto weak_up_ct = wait_context_list.back();
            wait_context_list.pop_back();
            _service_handle->wake_up_context(weak_up_ct);
        }
        _mutex = false;
    }
}
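From logic code the coroutine mutex is then used like an ordinary lock, except that contention yields to another coroutine instead of blocking the thread; a hypothetical usage:

mutex account_mu;  // the coroutine mutex above

void buy_item_coroutine() {
    account_mu.lock();    // on contention: parks this coroutine, runs another
    // ... modify the shared account data ...
    account_mu.unlock();  // wakes one coroutine parked in lock()
}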