I recently studied the muduo network library again in detail and learned a lot. This article summarizes and analyzes the design ideas and key technical details of muduo. For reasons of space, most key techniques are only mentioned briefly; readers will need to find further study material for the concrete details.
muduo/base
Date class
The Date class wraps a calendar date and stores it as a Julian day number, which makes the difference between two dates easy to calculate. For the specific formulas and ideas, see a reference on the calculation of Julian days.
Exception class
The Exception class provides what() to output the error message and stackTrace() to output the stack trace. Throw it with throw muduo::Exception("oops"), catch it with catch (const muduo::Exception& ex), and use ex.what() / ex.stackTrace() to obtain the details.
Atomic class
Atomic operations are cheaper than locks, so muduo uses the atomic increment and decrement operations provided by GCC. Keep in mind that the smallest unit of execution is the assembly instruction, not the language statement, so a statement such as i++ is not atomic by itself.
CountDownLatch class
CountDownLatch can be used to make all child threads wait until the main thread gives the "start" signal, or to make the main thread wait until the child threads have finished initializing. MutexLockGuard wraps locking with the RAII technique, and the holder_ member of the mutex records which thread the lock currently belongs to.
Timestamp class
Timestamp inherits from less_than_comparable<>, which relies on template metaprogramming: only operator< needs to be implemented, and >, <=, and >= are provided automatically.
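To make the less_than_comparable<> idea concrete, here is a minimal sketch of how such a base class can derive the other operators from operator<. LessThanComparable and Stamp are hypothetical names used only for illustration; they are neither the Boost template nor muduo's Timestamp.

```cpp
#include <cstdint>

// Hypothetical stand-in for less_than_comparable<T>: it derives >, <= and >=
// from an operator< that the user supplies for T.
template <typename T>
struct LessThanComparable {
  friend bool operator>(const T& a, const T& b) { return b < a; }
  friend bool operator<=(const T& a, const T& b) { return !(b < a); }
  friend bool operator>=(const T& a, const T& b) { return !(a < b); }
};

// Hypothetical timestamp-like class (microseconds since some epoch).
class Stamp : public LessThanComparable<Stamp> {
 public:
  explicit Stamp(std::int64_t microSeconds) : microSeconds_(microSeconds) {}
  std::int64_t microSeconds() const { return microSeconds_; }

 private:
  std::int64_t microSeconds_;
};

// The only comparison written by hand.
inline bool operator<(const Stamp& a, const Stamp& b) {
  return a.microSeconds() < b.microSeconds();
}
```

With this in place, expressions such as Stamp(2) > Stamp(1) compile even though only operator< was written by hand.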
BlockingQueue
BlockingQueue and BoundedBlockingQueue are an unbounded and a bounded queue respectively. Both are essentially the producer-consumer problem, which can be solved with semaphores or condition variables. ThreadPool is in essence also a producer-consumer problem: the code that puts task functions into the task queue is the producer, and the threads in the thread queue are the consumers. The basic flow diagram is as follows:
[Figure: basic flow diagram of the thread pool]
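A bounded blocking queue of the kind described above can be sketched with one mutex and two condition variables. This is a simplified illustration built on the C++11 standard library, not muduo's BoundedBlockingQueue; the class name BoundedQueue and its member names are assumptions.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded producer-consumer queue.
template <typename T>
class BoundedQueue {
 public:
  explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

  // Producer side: blocks while the queue is full.
  void put(T value) {
    std::unique_lock<std::mutex> lock(mutex_);
    notFull_.wait(lock, [this] { return queue_.size() < capacity_; });
    queue_.push_back(std::move(value));
    notEmpty_.notify_one();
  }

  // Consumer side: blocks while the queue is empty.
  T take() {
    std::unique_lock<std::mutex> lock(mutex_);
    notEmpty_.wait(lock, [this] { return !queue_.empty(); });
    T front(std::move(queue_.front()));
    queue_.pop_front();
    notFull_.notify_one();
    return front;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

 private:
  const std::size_t capacity_;
  mutable std::mutex mutex_;
  std::condition_variable notEmpty_;
  std::condition_variable notFull_;
  std::deque<T> queue_;
};
```

A thread pool would store task functions in such a queue: the submitting code calls put(), and each worker thread loops on take().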
Asynchronous log class
A general log class needs (1) an overloaded << for formatted output, (2) log level handling, and (3) buffers. To increase efficiency and avoid blocking business threads, one background thread is responsible for collecting log messages and writing them to the log file, and the business threads simply send their log messages to this log thread. This is called asynchronous logging. The basic implementation is still producers (business threads), a consumer (the log thread), and a buffer. Such a simple model, however, writes to the file too often: every message signals the log thread, which then writes to the file, so efficiency is low. muduo uses multiple buffering: the log thread is signaled only when a buffer is full or a timeout expires, and if messages pile up, only 2 buffers are kept and the rest are discarded. Buffers are also exchanged with swap to reduce contention, so the log thread gets all pending messages at once and then writes them to the file.
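The buffering idea can be sketched as follows. This is a heavily simplified illustration under stated assumptions, not muduo's AsyncLogging: BufferedLog, append() and collect() are hypothetical names, the buffers are plain std::string objects, and the thread, condition variable, and timeout that would drive collect() are left out.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical front end for asynchronous logging with multiple buffers.
class BufferedLog {
 public:
  explicit BufferedLog(std::size_t bufferSize) : bufferSize_(bufferSize) {}

  // Called by business threads. Returns true when at least one full buffer
  // is waiting, i.e. when the log thread should be signaled.
  bool append(const std::string& line) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!current_.empty() && current_.size() + line.size() > bufferSize_) {
      full_.push_back(std::move(current_));  // retire the filled buffer
      current_.clear();                      // start a fresh one
    }
    current_ += line;
    return !full_.empty();
  }

  // Called by the log thread after a signal or a timeout. All pending
  // buffers are swapped out under the lock; the caller writes them to the
  // file afterwards without holding the lock.
  std::vector<std::string> collect() {
    std::vector<std::string> toWrite;
    std::lock_guard<std::mutex> lock(mutex_);
    if (!current_.empty()) {
      full_.push_back(std::move(current_));
      current_.clear();
    }
    toWrite.swap(full_);
    return toWrite;
  }

 private:
  const std::size_t bufferSize_;
  std::mutex mutex_;
  std::string current_;            // buffer currently being filled
  std::vector<std::string> full_;  // buffers waiting for the log thread
};
```

The design choice to note is that the critical section contains only a string append or a vector swap; the slow file write happens outside it.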
__type_traits Techniques
StringPiece is an efficient string class from Google that uses __type_traits to optimize further for different types. The STL uses traits to provide generic operations without losing efficiency: by defining a struct or class and specializing it through templates, a type is given certain characteristics, and these characteristics vary from type to type. Code can then query the traits of a type so that the same operation behaves differently depending on the type. Refer to an article on traits for a fuller explanation.
muduo/net
Reactor
A reactor plus a thread pool is suitable for CPU-intensive workloads. Multiple reactors are suitable for bursty I/O, generally with one reactor per gigabit network interface. Multiple reactors (threads) plus a thread pool adapt better to both bursty I/O and intensive computation. With multiple reactors, the main reactor registers only OP_ACCEPT events and hands the I/O events of accepted connections to the sub-reactors, which are assigned by a round-robin mechanism. See EventLoopThread for the concrete implementation.
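The round-robin assignment mentioned above amounts to a counter that wraps around. A minimal sketch, assuming the sub-reactors are identified by an index; RoundRobinPicker is a hypothetical name, not a muduo class.

```cpp
#include <cstddef>

// Hypothetical helper: picks the sub-reactor for each new connection.
class RoundRobinPicker {
 public:
  // loopCount must be greater than zero.
  explicit RoundRobinPicker(std::size_t loopCount)
      : loopCount_(loopCount), next_(0) {}

  // Returns the index of the sub-reactor that gets the next connection.
  std::size_t nextLoop() {
    std::size_t picked = next_;
    next_ = (next_ + 1) % loopCount_;
    return picked;
  }

 private:
  const std::size_t loopCount_;
  std::size_t next_;
};
```

The main reactor would call nextLoop() once per accepted connection and register that connection's events with the chosen sub-reactor.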
TcpConnection
TcpConnection is an abstraction of a connected socket. Channel is a selectable I/O channel that registers and responds to I/O events, but it does not own the file descriptor.
Channel objects are members of Acceptor, Connector, EventLoop, TimerQueue, and TcpConnection, and their life cycle is controlled by these owners.
TimerQueue class
timers_ and activeTimers_ store the same data: timers_ is sorted by expiration time and activeTimers_ is sorted by object address. TimerQueue only watches the earliest timer, so when a readable event occurs it has to call getExpired() to collect all expired timers, because several timers may expire at the same time.
runInLoop
Implementation of runInLoop: an eventfd wake-up is required in two situations. (1) The thread calling queueInLoop is not the current IO thread. (2) The caller is the current IO thread, but it is in the middle of calling the pending functors.
RVO optimization
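When a C++ function returns a vector or a user-defined type by value, return value optimization (RVO) lets the compiler construct the result directly in the caller's storage, which avoids the overhead of an extra copy constructor and destructor call.
shared_from_this()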
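shared_from_this() returns a shared_ptr to the object itself that shares ownership with the existing shared_ptr, so the reference count increases by 1. Constructing a new shared_ptr directly from this would instead create a second, independent reference count for the same object.
TcpConnection life cycle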
The TcpConnection object must not be destroyed inside removeConnection: if handleEvent() in Channel is still executing, destroying the object causes a core dump. The object is therefore managed by a shared_ptr with a reference count of 1. Channel maintains a weak_ptr (tie_), and assigning the shared_ptr to tie_ leaves the reference count at 1. When the connection closes, handleEvent() promotes tie_ to a shared_ptr, which raises the reference count to 2 and keeps the object alive until handleEvent() returns.
Buffer class
Buffer is muduo's own growable buffer. Its member variables are a vector<char>, readerIndex_, and writerIndex_, and it also handles the "sticky packet" problem. The extra buffer in Buffer::readFd() combines space on the heap with space on the stack, which avoids a large memory overhead. Compared with simply enlarging the buffer in advance, adding stack space lets readFd() find out exactly how much data there is, which avoids wasting huge buffers and reduces the number of read system calls.
muduo/examples
chargen: tests server throughput. A gigabit NIC runs at about 100 MB/s (1000 Mbit/s / 8); a mechanical hard disk reads and writes at almost the same speed, and a solid-state disk can reach 500 MB/s.
filetransfer: file transfer. Send 64 KB each time and set writeCompleteCallback_ so that the next small block is sent once the previous one has been written. This keeps the application-layer buffer from occupying a lot of memory.
chat: chat room. A mutex protects the vector of connections, so multiple messages cannot be sent in parallel and lock contention is high. Optimization one: use shared_ptr to implement copy-on-write; modifications are made on a copy, so messages can be sent in parallel. Optimization two: there is a delay between the message reaching the first client and reaching the last client, so each connection can send the message in its own IO thread.
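The copy-on-write optimization can be sketched like this. It is a simplified illustration, not the code of the chat example: CowList is a hypothetical name and plain strings stand in for connections.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical copy-on-write list guarded by a mutex and a shared_ptr.
class CowList {
 public:
  typedef std::vector<std::string> List;

  CowList() : list_(std::make_shared<List>()) {}

  // Readers hold the lock only long enough to copy the shared_ptr, then
  // iterate over their snapshot without any lock.
  std::shared_ptr<const List> snapshot() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return list_;
  }

  // Writers copy the list only when a reader still holds the current one.
  void add(const std::string& name) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (list_.use_count() > 1) {
      list_ = std::make_shared<List>(*list_);  // readers keep the old copy
    }
    list_->push_back(name);
  }

 private:
  mutable std::mutex mutex_;
  std::shared_ptr<List> list_;
};
```

Because a sender iterates over its own snapshot, a long broadcast no longer blocks other senders or clients that join and leave in the meantime.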
NTP: network time synchronization
Client              Server
  |                   |
T1| ----------------> |T2
  |                   |T3
T4| <---------------- |
RTT = (T4 - T1) - (T3 - T2)
clock offset = [(T4 + T1) - (T2 + T3)] / 2 (to derive it, add a coefficient k that represents the difference between the client clock and the server clock; the formula then follows easily)
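The derivation hinted at above can be spelled out, assuming the one-way delay d is the same in both directions and the client clock is ahead of the server clock by k. Then T2 = T1 + d - k and T4 = T3 + d + k. Adding the two equations gives 2d = (T4 - T1) - (T3 - T2), which is the RTT; subtracting them gives 2k = (T4 + T1) - (T2 + T3), which is twice the clock offset. A small sketch of the two formulas follows; NtpSample and the function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical container for the four timestamps, e.g. in microseconds.
// t1 and t4 are read from the client clock, t2 and t3 from the server clock.
struct NtpSample {
  std::int64_t t1;  // client sends request
  std::int64_t t2;  // server receives request
  std::int64_t t3;  // server sends reply
  std::int64_t t4;  // client receives reply
};

// Time spent on the network in both directions.
inline std::int64_t roundTripTime(const NtpSample& s) {
  return (s.t4 - s.t1) - (s.t3 - s.t2);
}

// Client clock minus server clock, assuming symmetric network delay.
inline std::int64_t clockOffset(const NtpSample& s) {
  return ((s.t4 + s.t1) - (s.t2 + s.t3)) / 2;
}
```

For example, with a client that is 50 units ahead, a one-way delay of 10, and 5 units of server processing, the timestamps T1 = 1000, T2 = 960, T3 = 965, T4 = 1025 give an RTT of 20 and an offset of 50.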
other key technical points
volatile
volatile prevents the compiler from optimizing away accesses to a variable: for a variable declared with this keyword, the program always reads the data from memory and does not use a copy kept in a register. It is used in Atomic.h.
__thread
Variables declared with __thread are kept in thread-local storage. __thread can only be applied to POD types and to pointers to classes; non-POD types can use thread-specific data (TSD) instead.
Singleton pattern
A class has only one instance and provides a global access point to it. (1) A private constructor. (2) The class definition contains a static private object of that class. (3) A static public function returns the static private object.
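The three points above map directly onto code. The following is a minimal sketch with a hypothetical class named Config; it only mirrors the description here and is not how muduo's own Singleton template is built.

```cpp
// Hypothetical singleton following the three points listed above.
class Config {
 public:
  // (3) Static public function that returns the static private object.
  static Config& instance() { return instance_; }

  int value() const { return value_; }
  void setValue(int value) { value_ = value; }

 private:
  // (1) Private constructor: no other code can create a Config.
  Config() : value_(0) {}
  Config(const Config&) = delete;
  Config& operator=(const Config&) = delete;

  // (2) Static private object of the class itself.
  static Config instance_;

  int value_;
};

// The definition of the static member belongs in exactly one .cpp file.
Config Config::instance_;
```

Every caller of Config::instance() sees the same object. One caveat of this particular layout is that the object is constructed during static initialization, so it should not be used from the constructors of other global objects.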
understanding of asynchronous callbacks
In the so-called asynchronous callback model, the main thread uses poll/epoll for its event loop. Events include the various IO events and the timer events implemented with timerfd. When no event occurs, the thread blocks in poll/epoll; when an event occurs, the loop traverses the active channels and calls their callback functions. The timer is implemented with a hardware clock interrupt, which is different from blocking in software with sleep. So what we often call "waking a thread by writing a message" means triggering an IO event that unblocks poll/epoll so that execution continues down to the callback functions.
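The wake-up mechanism described above can be sketched with the Linux eventfd and poll system calls. This is a Linux-only illustration, not muduo's EventLoop; the three function names are hypothetical.

```cpp
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#include <cstdint>

// Creates the wake-up descriptor; returns -1 on failure.
inline int createWakeupFd() {
  return ::eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

// Called from any thread: makes the descriptor readable so that a thread
// blocked in poll() returns.
inline bool wakeup(int fd) {
  std::uint64_t one = 1;
  return ::write(fd, &one, sizeof one) == static_cast<ssize_t>(sizeof one);
}

// One iteration of a tiny event loop. Returns the number of wake-ups that
// were pending, 0 if the timeout expired first, or -1 on error.
inline long pollOnce(int fd, int timeoutMs) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  int ready = ::poll(&pfd, 1, timeoutMs);
  if (ready < 0) return -1;
  if (ready == 0 || !(pfd.revents & POLLIN)) return 0;
  std::uint64_t count = 0;  // reading resets the eventfd counter to zero
  if (::read(fd, &count, sizeof count) != static_cast<ssize_t>(sizeof count)) {
    return -1;
  }
  return static_cast<long>(count);
}
```

In a real loop the read callback of the wake-up channel would drain the counter, and the loop would then run whatever functors other threads queued before calling wakeup().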
CAS lock-free operation
A CAS primitive has three parameters: a memory address, an expected value, and a new value. If the value at the memory address equals the expected value, the location has not been modified in the meantime and can be changed to the new value. Otherwise the modification fails and false is returned. For the implementation of a lock-free queue, refer to CoolShell. Note that a lock-free structure is not necessarily faster than a locked one. The lock instruction itself is cheap; what really affects performance is lock contention. When contention happens, the locked version goes to sleep in the kernel, while the lock-free version keeps spinning. Sleeping in the kernel has a cost, and when the critical section is very small this cost is a very large share of the total, which is why lock-free performs well in that case. The significance of lock-free is not absolute high performance. Its advantage over a mutex is that it avoids problems such as deadlock, livelock, and priority inversion, but the ABA problem, memory ordering, and similar issues make lock-free code harder to implement than a mutex. Unless a bottleneck has been identified, it is best to use mutex + condvar.
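The three-parameter primitive and the usual retry loop around it can be sketched with std::atomic from C++11. This is only an illustration of CAS itself; compareAndSwap and storeMax are hypothetical helper names, and a lock-free queue needs considerably more care than this.

```cpp
#include <atomic>

// The primitive described above: if target still holds `expected`, store
// `desired` and return true; otherwise change nothing and return false.
inline bool compareAndSwap(std::atomic<int>& target, int expected, int desired) {
  return target.compare_exchange_strong(expected, desired);
}

// Typical lock-free retry loop: atomically raise target to at least candidate.
inline void storeMax(std::atomic<int>& target, int candidate) {
  int seen = target.load();
  // On failure compare_exchange_weak reloads `seen` with the current value,
  // so the condition is re-checked against fresh data on every iteration.
  while (seen < candidate && !target.compare_exchange_weak(seen, candidate)) {
  }
}
```

The loop in storeMax is the spinning referred to above: under contention a thread retries instead of sleeping in the kernel.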
Avoid including redundant header files
In a header file, use a reference or pointer rather than a value, and use a forward declaration instead of directly including the other class's header file. Use the pimpl technique: put simply, the class contains only a pointer, and the implementation lives in the .cpp file.
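A minimal sketch of the pimpl technique, shown in a single listing for brevity. The class name LineStore and its members are hypothetical; the comments mark which part would go into the header and which into the .cpp file.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// ---- would live in LineStore.h: no <vector> needed here ----
class LineStore {
 public:
  LineStore();
  ~LineStore();  // defined in the .cpp, where Impl is a complete type
  void append(const std::string& line);
  std::size_t lineCount() const;

 private:
  class Impl;                   // forward declaration only
  std::unique_ptr<Impl> impl_;  // the class contains just a pointer
};

// ---- would live in LineStore.cpp ----
#include <vector>

class LineStore::Impl {
 public:
  std::vector<std::string> lines;
};

LineStore::LineStore() : impl_(new Impl) {}
LineStore::~LineStore() {}
void LineStore::append(const std::string& line) { impl_->lines.push_back(line); }
std::size_t LineStore::lineCount() const { return impl_->lines.size(); }
```

Code that includes the header never sees <vector> or the layout of Impl, so changing the implementation does not force its users to recompile.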
(void)ret
Prevents the compiler warning about an unused variable. This only matters in release builds, where assert is compiled out, for example: int n = ...; assert(n == 6);
Vim: add a comment at the beginning of lines
:1,10s/^/#/g adds a # comment marker at the beginning of lines 1-10
Reference materials
muduo source code
muduo manual
"Linux Multithreaded Server-Side Programming"