Question:
I recently read the source code of memcached and intend to write up what I learned along the way. Since this is my first serious contact with memcached, comments and corrections are welcome. This series of articles divides memcached into several modules for analysis, based on my own understanding, using memcached-1.4.6 as the example.
I. libevent Introduction
Network data transmission and processing in memcached depend entirely on libevent. I will cover libevent itself in another article; here I only describe how it is used, starting with some related definitions.
1) A file descriptor is in the readable or writable state when read/write system calls issued by a user thread on that descriptor will immediately move data out of, or into, the kernel buffer and return, instead of blocking because there is no data to read or no buffer space to write into. A descriptor that is not in this state becomes readable or writable only once its I/O conditions are ready.
2) An I/O event is the state change in which a file descriptor goes from not being readable or writable to becoming readable or writable. An I/O event is therefore always associated with a file descriptor, and events are divided into readable events, writable events, and other event types.
A typical libevent program follows these steps:
1) The user thread creates an event_base object through the event_init() function. The event_base object manages all I/O events registered to it. In a multi-threaded environment, an event_base object cannot be shared by multiple threads; that is, each event_base object corresponds to exactly one thread.
2) The thread then registers the I/O events it is interested in for a given file descriptor with the event_base object through the event_add() function, specifying the event handler (callback) to be called when the event occurs. Server programs usually listen for readable events on sockets. For example, a server thread might register the EV_READ event of socket sock1 and specify event_handler1() as the callback for that event. libevent encapsulates I/O events into struct event objects and uses constant flags such as EV_READ and EV_WRITE for the event types.
3) After the events are registered, the thread calls event_base_loop() to enter the event loop. Inside the loop, an I/O multiplexing function such as epoll is called, putting the thread into the blocked state until an event of interest occurs on a registered descriptor; the thread then calls the previously specified callback to handle it. For example, when a readable event occurs on socket sock1, i.e. data has arrived in sock1's kernel buffer, the blocked thread wakes up and calls event_handler1() to process the event.
4) After handling the events picked up in this iteration, the thread blocks again and waits until the next event occurs.
II. memcached thread model
1. Multi-thread initialization and startup
memcached is a typical single-process, multi-threaded server. After memcached starts, the main thread initializes each module; for example, it calls slabs_init() to initialize the memory management module. It also creates the worker threads and initializes their related data, and finally calls event_base_loop() to enter its own event loop.
This section describes how the threads and their related data are created and initialized. Before going into the code, let us first introduce the main data structure. memcached wraps the raw thread ID (pthread_t) in a LIBEVENT_THREAD object, which corresponds one-to-one with a thread. The object is defined as follows:
```c
/** file: memcached.h */
typedef struct {
    pthread_t thread_id;        /* the thread's ID */
    struct event_base *base;    /* event_base object managing all I/O events of this thread */
    struct event notify_event;  /* event object associated with the notify_receive_fd descriptor */
    int notify_receive_fd;      /* receiving end of the pipe used to communicate with the main thread */
    int notify_send_fd;         /* sending end of the pipe used to communicate with the main thread */
    struct thread_stats stats;  /* stats generated by this thread */
    struct conn_queue *new_conn_queue; /* a lock-protected synchronized queue, mainly used to pass
                                          the data needed to initialize a conn object from the main
                                          thread to this worker thread */
    cache_t *suffix_cache;      /* suffix cache */
} LIBEVENT_THREAD;

/** file: thread.c
 *  array of thread objects corresponding to all worker threads */
static LIBEVENT_THREAD *threads;
```
We focus on the commented fields in the LIBEVENT_THREAD definition; for their specific purposes, see the corresponding comments.
Thread creation and initialization are mainly performed by the thread_init() and setup_thread() functions. The main code of thread_init() is as follows:
```c
/** file: thread.c -- thread_init() */

/* 1) this for loop initializes the array of worker thread objects */
for (i = 0; i < nthreads; i++) {
    /* 1.1) create the pipe used to communicate with the main thread,
       and initialize the notify_*_fd descriptors */
    int fds[2];
    if (pipe(fds)) {
        perror("Can't create notify pipe");
        exit(1);
    }
    threads[i].notify_receive_fd = fds[0];
    threads[i].notify_send_fd = fds[1];

    /* 1.2) register the I/O event associated with the notify_receive_fd
       descriptor of the threads[i] thread */
    setup_thread(&threads[i]);
}

/* 2) this for loop starts the worker threads. worker_libevent() mainly
   calls event_base_loop() to loop over the I/O events registered by
   this thread. */
/* Create threads after we've done all the libevent setup. */
for (i = 0; i < nthreads; i++) {
    create_worker(worker_libevent, &threads[i]);
}

/* 3) wait for all child threads; this function returns only after every
   worker thread has started */
/* Wait for all the threads to set themselves up before returning. */
pthread_mutex_lock(&init_lock);
while (init_count < nthreads) {
    pthread_cond_wait(&init_cond, &init_lock);
}
pthread_mutex_unlock(&init_lock);
```
The focus of thread_init() is using setup_thread() to register, for each worker thread, the I/O event associated with its notify_receive_fd descriptor. notify_receive_fd is the receiving end of the pipe through which the worker thread communicates with the main thread; by registering an I/O event on this descriptor, the worker thread can listen for data (events) sent by the main thread. The main code of setup_thread() is as follows:
```c
/** file: thread.c -- setup_thread() */

/* 1.2.1) initialize the notify_event object in the thread object and
   register it with the event_base object */
/* Listen for notifications from other threads */
event_set(&me->notify_event, me->notify_receive_fd,
          EV_READ | EV_PERSIST, thread_libevent_process, me);
event_base_set(me->base, &me->notify_event);
if (event_add(&me->notify_event, 0) == -1) {
    fprintf(stderr, "Can't monitor libevent notify pipe\n");
    exit(1);
}

/* 1.2.2) create and initialize the new_conn_queue queue */
me->new_conn_queue = malloc(sizeof(struct conn_queue));
if (me->new_conn_queue == NULL) {
    perror("Failed to allocate memory for connection queue");
    exit(EXIT_FAILURE);
}
cq_init(me->new_conn_queue);
```
As code segment 1.2.1) shows, each worker thread listens for readable events on its notify_receive_fd descriptor, that is, for readable events on the pipe connecting it to the main thread, and specifies thread_libevent_process() as the function that handles those events.
After code segment 3) executes, every worker thread has been initialized and started, and each one is listening for and waiting to process I/O events on its notify_receive_fd descriptor.
After the worker threads start, the main thread needs to create a listening socket to wait for client connection requests. Note that listening for client connections here is different from monitoring I/O events in libevent. In memcached, sockets, like thread IDs, are further encapsulated: a socket is wrapped in a conn object representing a connection with a client. The struct has a large definition; the fields relevant to this topic are:
```c
/** file: memcached.h */
typedef struct conn conn;
struct conn {
    int sfd;                     /* the raw socket */
    sasl_conn_t *sasl_conn;
    enum conn_states state;      /* the state variable for this connection, marking its
                                    status at run time. This field is very important;
                                    its values are defined by the conn_states enum. */
    enum bin_substates substate; /* similar to the state field */
    struct event event;          /* event object associated with this socket (the sfd field) */
    short ev_flags;              /* related to the previous field: the event types being
                                    monitored, such as EV_READ */
    short which;                 /** which events were just triggered */
    /* remaining fields omitted */
};
```
The following is where the main thread creates the listening socket:
```c
/** file: memcached.c -- server_socket() */

/* 4) the main thread creates and initializes the listening socket here,
   including registering the I/O event associated with the conn object.
   Note that the conn_listening argument specifies the initial state of
   the conn object. */
if (!(listen_conn_add = conn_new(sfd, conn_listening,
                                 EV_READ | EV_PERSIST, 1,
                                 transport, main_base))) {
    fprintf(stderr, "failed to create listening connection\n");
    exit(EXIT_FAILURE);
}
listen_conn_add->next = listen_conn;
listen_conn = listen_conn_add;
```
conn_new() is an important function in memcached. It wraps a raw socket in a conn object, registers the I/O event associated with that conn object, and sets the initial state of the connection. Note that the conn object for the listening socket is initialized to the conn_listening state, which will matter later. Part of the conn_new() code is as follows:
```c
/** file: memcached.c -- conn_new() */

/* 4.1) initialize the fields of the conn object; note the state field */
c->sfd = sfd;
c->state = init_state;
/* intermediate initialization steps omitted */

/* 4.2) register the I/O event */
event_set(&c->event, sfd, event_flags, event_handler, (void *)c);
event_base_set(base, &c->event);
c->ev_flags = event_flags;
if (event_add(&c->event, 0) == -1) {
    if (conn_add_to_freelist(c)) {
        conn_free(c);
    }
    perror("event_add");
    return NULL;
}
```
Again, the state field of the conn object is a very important variable: it indicates the status of the conn object at run time, and its values are defined by the conn_states enumeration. Recall the conn_listening constant passed to conn_new() in code segment 4): the main thread creates the listening connection with the initial state conn_listening. We can reveal in advance that when a worker thread accepts a connection dispatched by the main thread (introduced in the next section), it creates a conn object whose initial state is conn_new_cmd.
Registering I/O events should be familiar by now. Note that the handler for all conn-related events in memcached is the event_handler() function, which internally hands the bulk of the event processing to drive_machine(). That function is solely responsible for handling events related to client connections. After initialization, the main thread enters its event loop through event_base_loop() and starts waiting for connection requests on the listening socket.
2. Establish and dispatch client connections
After the startup steps described in the previous section complete, the main memcached thread is listening for readable events on the listening socket, i.e. waiting for client connection requests, while the worker threads are listening for readable events on their respective notify_receive_fd descriptors, waiting for data from the main thread. Now let's look at how memcached handles a connection request sent by a client to the server. Recall the creation of the listening socket in the previous section: when a client sends a connection request, a readable event occurs on the listening socket, the main thread wakes up and calls event_handler(), which in turn calls drive_machine(). The part that handles the client connection request is as follows:
```c
/** file: memcached.c -- drive_machine() */
switch (c->state) {
case conn_listening:
    /* 5) the following lines accept the client connection and obtain
       the sfd socket */
    addrlen = sizeof(addr);
    if ((sfd = accept(c->sfd, (struct sockaddr *)&addr, &addrlen)) == -1) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* these are transient, so don't log anything */
            stop = true;
        } else if (errno == EMFILE) {
            if (settings.verbose > 0)
                fprintf(stderr, "Too many open connections\n");
            accept_new_conns(false);
            stop = true;
        } else {
            perror("accept()");
            stop = true;
        }
        break;
    }
    if ((flags = fcntl(sfd, F_GETFL, 0)) < 0 ||
        fcntl(sfd, F_SETFL, flags | O_NONBLOCK) < 0) {
        perror("setting O_NONBLOCK");
        close(sfd);
        break;
    }

    /* 6) this function passes the raw socket created by the main thread,
       plus some initialization data, to a chosen worker thread */
    dispatch_conn_new(sfd, conn_new_cmd, EV_READ | EV_PERSIST,
                      DATA_BUFFER_SIZE, tcp_transport);
    stop = true;
    break;
```
Here is where the state field of the conn object comes into play :). The drive_machine() function is one huge switch statement that selects a branch based on the current state of the conn object, i.e. the value of its state field. Because the conn object of the listening socket was initialized to conn_listening, drive_machine() executes the case conn_listening branch of the switch, which accepts and dispatches the client connection; see code segment 5).
Next, the main thread uses dispatch_conn_new() to pass the client connection socket (still just a raw socket at this point) and other related initialization data to one of the worker threads. This is where the pipe between the main thread and the worker thread, and the new_conn_queue queue in the thread object, come in. The code is as follows:
```c
void dispatch_conn_new(int sfd, enum conn_states init_state,
                       int event_flags, int read_buffer_size,
                       enum network_transport transport) {
    /* 6.1) create a CQ_ITEM object and use a simple modulo (round-robin)
       scheme to pick the worker thread that will receive it */
    CQ_ITEM *item = cqi_new();
    int tid = (last_thread + 1) % settings.num_threads;
    LIBEVENT_THREAD *thread = threads + tid;
    last_thread = tid;

    /* 6.2) initialize the new CQ_ITEM object */
    item->sfd = sfd;
    item->init_state = init_state;
    item->event_flags = event_flags;
    item->read_buffer_size = read_buffer_size;
    item->transport = transport;

    /* 6.3) push the CQ_ITEM object onto the new_conn_queue queue */
    cq_push(thread->new_conn_queue, item);

    /* 6.4) write one byte of data into the pipe connected to the chosen
       worker thread */
    MEMCACHED_CONN_DISPATCH(sfd, thread->thread_id);
    if (write(thread->notify_send_fd, "", 1) != 1) {
        perror("Writing to thread notify pipe");
    }
}
```
This function creates and initializes a CQ_ITEM object, which carries the initialization data needed to create a conn object, such as the raw socket (sfd) and init_state, and then passes the CQ_ITEM to the selected worker thread. As introduced in the previous section, the new_conn_queue queue in the LIBEVENT_THREAD object is used to transfer data between two threads; here it carries the CQ_ITEM to the worker thread. Also note that the main thread writes one byte into the pipe connecting it to the worker thread. This is done solely to trigger a readable event on the notify_receive_fd descriptor at the other end of the pipe. Now let's see what happens on the worker thread's side of the pipe.
Ever since memcached started, the worker thread has been listening for readable events on its notify_receive_fd descriptor. Because the main thread wrote a byte into the pipe, the worker thread wakes up on a readable event on notify_receive_fd and calls thread_libevent_process(). The main code of this function is as follows:
```c
/** file: thread.c -- thread_libevent_process() */

/* 7) read one byte from the pipe; this is the byte the main thread
   previously wrote to the notify_send_fd descriptor */
if (read(fd, buf, 1) != 1)
    if (settings.verbose > 0)
        fprintf(stderr, "Can't read from libevent pipe\n");

/* 8) pop a CQ_ITEM object off the new_conn_queue queue; this is the
   object the main thread pushed earlier */
item = cq_pop(me->new_conn_queue);

/* 9) create and initialize a conn object from the CQ_ITEM; this conn
   object handles the communication between the client and this worker */
if (NULL != item) {
    conn *c = conn_new(item->sfd, item->init_state, item->event_flags,
                       item->read_buffer_size, item->transport, me->base);
    /* ... */
}
```
Note that in code segment 7), the one byte read from the pipe is the byte the main thread wrote in code segment 6.4). Obviously, the data itself carries no meaning; its only purpose is to trigger a readable event on the worker thread's notify_receive_fd descriptor. The worker then creates and initializes a conn object from the CQ_ITEM it pops off the queue. Note that because the main thread initialized the init_state field of the CQ_ITEM to conn_new_cmd in code segment 6), the state field of the conn object created by the worker thread is initialized to conn_new_cmd.
At this point, we have walked through the whole process: a connection request arrives from the client, the main thread accepts it and creates the raw socket, the raw socket and other initialization data are dispatched to a worker thread, and finally the worker thread creates a conn object and begins communicating with the client. From here on, the worker thread listens for readable events on the client connection, ready to process data sent from the client via the event_handler() function.