Notes on memcached's processing logic for the set command

Source: Internet
Author: User
Tags: callback

This note records the main logic of the connection state machine and traces how a SET command is executed. Memory allocation is left out for now. Most of what follows is commented source code.

First, it helps to know which states a conn can be in, from the moment a client connection sends data, through the processing of that data, to the response being returned.

enum conn_states {
    conn_listening,  /** only the socket that listens for connections is in this state */
    conn_new_cmd,    /** waiting for the next command; a newly initialized client connection also starts here */
    conn_waiting,    /** waiting for data to become readable */
    conn_read,       /** reading command data */
    conn_parse_cmd,  /** try to parse a command from the read buffer */
    conn_write,      /** writing out a simple response */
    conn_nread,      /** reading a fixed number of bytes: the byte count is already known, as for the value of a set command */
    conn_swallow,    /** swallowing unnecessary bytes w/o storing */
    conn_closing,    /** closing the connection */
    conn_mwrite,     /** writing out several items in order */
    conn_closed,     /** the connection has been closed */
    conn_max_state   /**< Max state value (used for assertion) */
};
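When following the walkthrough with a debugger or extra logging, it helps to print states by name rather than by number. The following is a minimal sketch, not code from memcached or from this note: `state_name` is a hypothetical helper, and the enum is redeclared only so that the fragment compiles on its own.

```c
#include <assert.h>
#include <string.h>

/* Same order as the enum above; redeclared so the sketch is self-contained. */
enum conn_states {
    conn_listening, conn_new_cmd, conn_waiting, conn_read, conn_parse_cmd,
    conn_write, conn_nread, conn_swallow, conn_closing, conn_mwrite,
    conn_closed, conn_max_state
};

/* Hypothetical helper: maps a state value to its identifier for logging. */
static const char *state_name(enum conn_states s) {
    static const char *const names[] = {
        "conn_listening", "conn_new_cmd", "conn_waiting", "conn_read",
        "conn_parse_cmd", "conn_write", "conn_nread", "conn_swallow",
        "conn_closing", "conn_mwrite", "conn_closed"
    };
    /* conn_max_state is a sentinel, not a real state. */
    if ((int)s < 0 || s >= conn_max_state)
        return "unknown";
    return names[s];
}
```

For example, `state_name(conn_nread)` returns the string "conn_nread", and the sentinel `conn_max_state` maps to "unknown".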

The state machine is driven by libevent notifications. When libevent reports that data has arrived, the state machine is re-entered in the conn_read state.

static void drive_machine(conn *c) {
    bool stop = false;
    int sfd;
    socklen_t addrlen;
    struct sockaddr_storage addr;
    int nreqs = settings.reqs_per_event;
    int res;
    const char *str;

    while (!stop) {
        switch (c->state) {
        case conn_listening:
            ... /** accept the new connection */
            break;

        case conn_waiting:
            /** in this state we only register for readable events with libevent */
            if (!update_event(c, EV_READ | EV_PERSIST)) {
                if (settings.verbose > 0)
                    fprintf(stderr, "Couldn't update event\n");
                conn_set_state(c, conn_closing);
                break;
            }
            /** the read event is registered; the next state is conn_read */
            conn_set_state(c, conn_read);
            stop = true; /** we must leave the state machine here; when the event fires, we re-enter at conn_read */
            break;

        /** libevent has reported that data arrived; read it here */
        case conn_read:
            /** only TCP is followed in this note; see try_read_network further down */
            res = IS_UDP(c->transport) ? try_read_udp(c) : try_read_network(c);

            switch (res) {
            case READ_NO_DATA_RECEIVED:
                /** nothing was read; wait for the next readable event */
                conn_set_state(c, conn_waiting);
                break;
            case READ_DATA_RECEIVED:
                /** data was read; go on to parse the command */
                conn_set_state(c, conn_parse_cmd);
                break;
            case READ_ERROR:
                /** the read failed; close the connection */
                conn_set_state(c, conn_closing);
                break;
            case READ_MEMORY_ERROR: /* Failed to allocate more memory (rbuf could not be resized) */
                /* State already set by try_read_network */
                break;
            }
            /** note that stop is not set here, so the loop continues with the state chosen above */
            break;

        /** after a successful read we arrive here */
        case conn_parse_cmd:
            /** call the command parsing function */
            if (try_read_command(c) == 0) {
                /* we need more data! */
                conn_set_state(c, conn_waiting); /** go back and wait for a read event */
            }
            /** again we stay in the loop; if a set command line was parsed successfully,
             *  the state is now conn_nread, ready to read the value bytes */
            break;

        case conn_new_cmd:
            /* Only process nreqs at a time to avoid starving other connections */
            /** each time an event fires for this connection, at most nreqs commands are
             *  processed, so that other clients' connections are not kept waiting */
            --nreqs; /** one more command is about to be handled */
            if (nreqs >= 0) {
                reset_cmd_handler(c); /** back into the while loop; on a fresh connection there is
                                       *  no data yet, so the next state is conn_waiting */
            } else {
                pthread_mutex_lock(&c->thread->stats.mutex);
                c->thread->stats.conn_yields++;
                pthread_mutex_unlock(&c->thread->stats.mutex);
                if (c->rbytes > 0) {
                    /* We have already read in data into the input buffer,
                       so libevent will most likely not signal read events
                       on the socket (unless more data is available. As a
                       hack we should just put in a request to write data,
                       because that should be possible ;-)
                    */
                    /** The request budget for this event is used up, but unprocessed bytes are
                     *  already sitting in rbuf. Because they have been pulled off the socket,
                     *  no new read event will fire for them. The trick is to register for a
                     *  writable event instead: the socket is normally writable, so the
                     *  connection is called back soon and can process the buffered data. */
                    if (!update_event(c, EV_WRITE | EV_PERSIST)) {
                        if (settings.verbose > 0)
                            fprintf(stderr, "Couldn't update event\n");
                        conn_set_state(c, conn_closing);
                        break;
                    }
                }
                stop = true;
                /** on the next call nreqs starts again from settings.reqs_per_event (20 by default) */
            }
            break;

        /** after a set command line has been parsed we arrive here */
        case conn_nread:
            if (c->rlbytes == 0) {
                /** nothing is left to read; complete_nread() finishes the command, and the
                 *  out_string() it calls (commented below) moves c to the write state */
                complete_nread(c);
                break;
            }

            /* Check if rbytes < 0, to prevent crash */
            if (c->rlbytes < 0) {
                if (settings.verbose) {
                    fprintf(stderr, "Invalid rlbytes to read: len %d\n", c->rlbytes);
                }
                conn_set_state(c, conn_closing);
                break;
            }

            /* first check if we have leftovers in the conn_read buffer */
            /** value data still has to be read; see whether rbuf holds unparsed bytes */
            if (c->rbytes > 0) {
                int tocopy = c->rbytes > c->rlbytes ? c->rlbytes : c->rbytes; /** number of bytes to take */
                if (c->ritem != c->rcurr) {
                    /** copy tocopy unparsed bytes to c->ritem; c->ritem points into the data
                     *  area of the item that was allocated, so the value lands directly in
                     *  the item and a later memory copy is saved */
                    memmove(c->ritem, c->rcurr, tocopy);
                }
                c->ritem += tocopy;
                c->rlbytes -= tocopy;
                c->rcurr += tocopy;
                c->rbytes -= tocopy;
                if (c->rlbytes == 0) {
                    break; /** nothing more to read: leave the switch, loop back into
                            *  conn_nread, and run complete_nread() */
                }
            }

            /** the buffered data was not enough; keep reading from the socket */
            /*  now try reading from the socket */
            /** read directly into c->ritem */
            res = read(c->sfd, c->ritem, c->rlbytes);
            if (res > 0) {
                pthread_mutex_lock(&c->thread->stats.mutex);
                c->thread->stats.bytes_read += res;
                pthread_mutex_unlock(&c->thread->stats.mutex);
                if (c->rcurr == c->ritem) {
                    c->rcurr += res;
                }
                c->ritem += res;
                c->rlbytes -= res; /** keep looping until rlbytes == 0 */
                break;
            }
            if (res == 0) { /* end of stream */
                conn_set_state(c, conn_closing);
                break;
            }
            if (res == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                if (!update_event(c, EV_READ | EV_PERSIST)) {
                    if (settings.verbose > 0)
                        fprintf(stderr, "Couldn't update event\n");
                    conn_set_state(c, conn_closing);
                    break;
                }
                stop = true;
                break;
            }
            /* otherwise we have a real error, on which we close the connection */
            if (settings.verbose > 0) {
                fprintf(stderr, "Failed to read, and not due to blocking:\n"
                        "errno: %d %s \n"
                        "rcurr=%lx ritem=%lx rbuf=%lx rlbytes=%d rsize=%d\n",
                        errno, strerror(errno),
                        (long)c->rcurr, (long)c->ritem, (long)c->rbuf,
                        (int)c->rlbytes, (int)c->rsize);
            }
            conn_set_state(c, conn_closing);
            break;

        case conn_swallow:
            /* we are reading sbytes and throwing them away */
            if (c->sbytes == 0) {
                conn_set_state(c, conn_new_cmd);
                break;
            }

            /* first check if we have leftovers in the conn_read buffer */
            if (c->rbytes > 0) {
                int tocopy = c->rbytes > c->sbytes ? c->sbytes : c->rbytes;
                c->sbytes -= tocopy;
                c->rcurr += tocopy;
                c->rbytes -= tocopy;
                break;
            }

            /*  now try reading from the socket */
            res = read(c->sfd, c->rbuf, c->rsize > c->sbytes ? c->sbytes : c->rsize);
            if (res > 0) {
                pthread_mutex_lock(&c->thread->stats.mutex);
                c->thread->stats.bytes_read += res;
                pthread_mutex_unlock(&c->thread->stats.mutex);
                c->sbytes -= res;
                break;
            }
            if (res == 0) { /* end of stream */
                conn_set_state(c, conn_closing);
                break;
            }
            if (res == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                if (!update_event(c, EV_READ | EV_PERSIST)) {
                    if (settings.verbose > 0)
                        fprintf(stderr, "Couldn't update event\n");
                    conn_set_state(c, conn_closing);
                    break;
                }
                stop = true;
                break;
            }
            /* otherwise we have a real error, on which we close the connection */
            if (settings.verbose > 0)
                fprintf(stderr, "Failed to read, and not due to blocking\n");
            conn_set_state(c, conn_closing);
            break;

        /** responses produced by out_string() are normally sent from here */
        case conn_write:
            /*
             * We want to write out a simple response. If we haven't already,
             * assemble it into a msgbuf list (this will be a single-entry
             * list for TCP or a two-entry list for UDP).
             */
            if (c->iovused == 0 || (IS_UDP(c->transport) && c->iovused == 1)) {
                if (add_iov(c, c->wcurr, c->wbytes) != 0) {
                    if (settings.verbose > 0)
                        fprintf(stderr, "Couldn't build response\n");
                    conn_set_state(c, conn_closing);
                    break;
                }
            }

            /* fall through... */
            /** there is no break: execution continues into the conn_mwrite logic */
        case conn_mwrite:
            if (IS_UDP(c->transport) && c->msgcurr == 0 && build_udp_headers(c) != 0) {
                if (settings.verbose > 0)
                    fprintf(stderr, "Failed to build UDP headers\n");
                conn_set_state(c, conn_closing);
                break;
            }
            /** transmit() does the actual sending */
            switch (transmit(c)) {
            case TRANSMIT_COMPLETE:
                if (c->state == conn_mwrite) {
                    conn_release_items(c);
                    /* XXX:  I don't know why this wasn't the general case */
                    if (c->protocol == binary_prot) {
                        conn_set_state(c, c->write_and_go);
                    } else {
                        conn_set_state(c, conn_new_cmd);
                    }
                } else if (c->state == conn_write) {
                    if (c->write_and_free) {
                        free(c->write_and_free);
                        c->write_and_free = 0;
                    }
                    conn_set_state(c, c->write_and_go); /** move to the state chosen to follow the write */
                } else {
                    if (settings.verbose > 0)
                        fprintf(stderr, "Unexpected state %d\n", c->state);
                    conn_set_state(c, conn_closing);
                }
                break;

            case TRANSMIT_INCOMPLETE:
            case TRANSMIT_HARD_ERROR:
                break; /* Continue in state machine. */

            case TRANSMIT_SOFT_ERROR:
                stop = true;
                break;
            }
            break;

        case conn_closing:
            if (IS_UDP(c->transport))
                conn_cleanup(c);
            else
                conn_close(c);
            stop = true;
            break;

        case conn_closed:
            /* This only happens if dormando is an idiot. */
            abort();
            break;

        case conn_max_state:
            assert(false);
            break;
        }
    }

    return;
}

/*
 * read from network as much as we can, handle buffer overflow and connection
 * close.
 * before reading, move the remaining incomplete fragment of a command
 * (if any) to the beginning of the buffer.
 *
 * To protect us from someone flooding a connection with bogus data causing
 * the connection to eat up all available memory, break out and start looking
 * at the data I've got after a number of reallocs...
 *
 * @return enum try_read_result
 */
/**
 * Read as much as possible from the socket, handling buffer growth and client
 * disconnects. Before reading, move any leftover, not yet parsed command
 * fragment to the front of the buffer.
 *
 * To stop a client from flooding the connection with bogus data and eating all
 * available memory, the number of reallocations per call is limited.
 * (a rough translation -_-)
 */
static enum try_read_result try_read_network(conn *c) {
    enum try_read_result gotdata = READ_NO_DATA_RECEIVED; /** initially, no data has been read */
    int res;
    int num_allocs = 0; /** number of times rbuf has been reallocated in this call */
    assert(c != NULL);

    /** if rcurr is no longer at the start of rbuf */
    if (c->rcurr != c->rbuf) {
        if (c->rbytes != 0) /* otherwise there's nothing to copy */
            memmove(c->rbuf, c->rcurr, c->rbytes); /** move the unparsed bytes to the front of rbuf */
        c->rcurr = c->rbuf; /** rcurr points at rbuf again; new data is appended after the existing bytes */
    }

    while (1) {
        if (c->rbytes >= c->rsize) {
            /** the unparsed data fills rbuf completely, so rbuf has to grow */
            if (num_allocs == 4) {
                /** at most four reallocations per call; once the limit is reached, return
                 *  whatever has been read so far. Starting from the default 2048-byte (2 KB)
                 *  buffer, four doublings give at most 32 KB within a single call */
                return gotdata;
            }
            ++num_allocs;
            char *new_rbuf = realloc(c->rbuf, c->rsize * 2); /** double the buffer size */
            if (!new_rbuf) {
                /** allocation failed */
                STATS_LOCK();
                stats.malloc_fails++;
                STATS_UNLOCK();
                if (settings.verbose > 0) {
                    fprintf(stderr, "Couldn't realloc input buffer\n");
                }
                c->rbytes = 0; /* ignore what we read */
                out_of_memory(c, "SERVER_ERROR out of memory reading request");
                c->write_and_go = conn_closing;
                return READ_MEMORY_ERROR;
            }
            /** the new buffer was allocated successfully */
            c->rcurr = c->rbuf = new_rbuf;
            c->rsize *= 2; /** record the new size */
        }

        int avail = c->rsize - c->rbytes; /** free space: total size minus unparsed bytes */
        res = read(c->sfd, c->rbuf + c->rbytes, avail);
        if (res > 0) {
            pthread_mutex_lock(&c->thread->stats.mutex);
            c->thread->stats.bytes_read += res; /** statistics: bytes read */
            pthread_mutex_unlock(&c->thread->stats.mutex);
            gotdata = READ_DATA_RECEIVED;
            c->rbytes += res; /** update the unparsed length */
            if (res == avail) {
                /** the buffer was filled, so there may be more data; loop again, which grows rbuf */
                continue;
            } else {
                break; /** everything available was read; READ_DATA_RECEIVED is returned */
            }
        }
        if (res == 0) {
            /** the peer closed the connection */
            return READ_ERROR;
        }
        if (res == -1) {
            /** the read failed; EAGAIN/EWOULDBLOCK only means no more data for now */
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                break;
            }
            return READ_ERROR;
        }
    }
    return gotdata;
}

/*
 * if we have a complete line in the buffer, process it.
 */
/** as the official comment above says: once rbuf holds a complete line, start processing it */
static int try_read_command(conn *c) {
    assert(c != NULL);
    assert(c->rcurr <= (c->rbuf + c->rsize));
    assert(c->rbytes > 0);

    /** skipped for now */
    if (c->protocol == negotiating_prot || c->transport == udp_transport) {...}

    /** skipped for now */
    if (c->protocol == binary_prot) {...} else {
        /** go straight to c->protocol == ascii_prot */
        char *el, *cont;

        if (c->rbytes == 0) /** no data to parse */
            return 0;

        /** for "set bico 0 0 5\r\nhello\r\n", split at the first '\n' */
        el = memchr(c->rcurr, '\n', c->rbytes); /** look for '\n', which ends the command line */
        if (!el) {
            /** not found */
            if (c->rbytes > 1024) {
                /*
                 * We didn't have a '\n' in the first k. This _has_ to be a
                 * large multiget, if not we should just nuke the connection.
                 */
                char *ptr = c->rcurr;
                while (*ptr == ' ') { /* ignore leading whitespaces */
                    ++ptr;
                }

                if (ptr - c->rcurr > 100 ||
                    (strncmp(ptr, "get ", 4) && strncmp(ptr, "gets ", 5))) {
                    conn_set_state(c, conn_closing);
                    return 1;
                }
            }
            return 0;
        }
        /** the end of the command line was found */
        cont = el + 1; /** cont points at the data after the '\n' */
        if ((el - c->rcurr) > 1 && *(el - 1) == '\r') {
            el--; /** el now points at the '\r' */
        }
        /** overwrite the '\r' with '\0' so that rcurr, read as a C string, contains only
         *  the data before the first '\n'; for "set bico 0 0 5\r\nhello\r\n" rcurr now
         *  reads as just "set bico 0 0 5" */
        *el = '\0';

        assert(cont <= (c->rcurr + c->rbytes));

        c->last_cmd_time = current_time;
        /** c->rcurr holds only the command line, e.g. "set bico 0 0 5" */
        process_command(c, c->rcurr); /** see process_command() below */

        c->rbytes -= (cont - c->rcurr); /** update the length of the unparsed data */
        c->rcurr = cont; /** everything after the command line becomes the new rcurr */

        assert(c->rcurr <= (c->rbuf + c->rsize));
    }

    return 1;
}

/**
 * The main job of process_command is to check that the command is valid and
 * to dispatch to the matching handler.
 */
static void process_command(conn *c, char *command) {
    token_t tokens[MAX_TOKENS];
    size_t ntokens;
    int comm;

    /*
     * for commands set/add/replace, we build an item and read the data
     * directly into it, then continue in nread_complete().
     */

    c->msgcurr = 0;
    c->msgused = 0;
    c->iovused = 0;
    if (add_msghdr(c) != 0) {
        out_of_memory(c, "SERVER_ERROR out of memory preparing response");
        return;
    }

    /**
     * tokenize_command is fairly simple: it splits the command on spaces and
     * stores the pieces one by one in the tokens array, up to 8 tokens.
     * For "set bico 0 0 5": set is the command, bico the key, 0 the flags,
     * 0 the expiry and 5 the value length, so
     * tokens = {"set", "bico", "0", "0", "5", NULL};
     */
    ntokens = tokenize_command(command, tokens, MAX_TOKENS);
    if (ntokens >= 3 &&
        ((strcmp(tokens[COMMAND_TOKEN].value, "get") == 0) ||
         (strcmp(tokens[COMMAND_TOKEN].value, "bget") == 0))) {
        /** get command */
        process_get_command(c, tokens, ntokens, false);
    } else if ((ntokens == 6 || ntokens == 7) &&
               ((strcmp(tokens[COMMAND_TOKEN].value, "set") == 0 && (comm = NREAD_SET)) ||
                ...
                (strcmp(tokens[COMMAND_TOKEN].value, "append") == 0 && (comm = NREAD_APPEND)))) {
        /** the set command arrives here and enters process_update_command */
        process_update_command(c, tokens, ntokens, comm, false);
    } else ...
    return;
}

/** this is where the command is actually handled; it involves allocating memory for the item */
static void process_update_command(conn *c, token_t *tokens, const size_t ntokens, int comm, bool handle_cas) {
    char *key;
    size_t nkey;
    unsigned int flags;
    int32_t exptime_int = 0;
    time_t exptime;
    int vlen;
    uint64_t req_cas_id = 0;
    item *it;

    /** reject keys longer than the maximum of 250 bytes */
    if (tokens[KEY_TOKEN].length > KEY_MAX_LENGTH) {
        out_string(c, "CLIENT_ERROR bad command line format");
        return;
    }

    /** get the key */
    key = tokens[KEY_TOKEN].value;
    nkey = tokens[KEY_TOKEN].length;

    /** get the remaining fields: flags, exptime, value length */
    if (!(safe_strtoul(tokens[2].value, (uint32_t *)&flags)
          && safe_strtol(tokens[3].value, &exptime_int)
          && safe_strtol(tokens[4].value, (int32_t *)&vlen))) {
        out_string(c, "CLIENT_ERROR bad command line format");
        return;
    }

    /* Ubuntu 8.04 breaks when I pass exptime to safe_strtol */
    exptime = exptime_int;

    /* Negative exptimes can underflow and end up immortal. realtime() will
       immediately expire values that are greater than REALTIME_MAXDELTA, but less
       than process_started, so lets aim for that. */
    if (exptime < 0)
        exptime = REALTIME_MAXDELTA + 1;

    // does cas value exist?
    if (handle_cas) {
        if (!safe_strtoull(tokens[5].value, &req_cas_id)) {
            out_string(c, "CLIENT_ERROR bad command line format");
            return;
        }
    }

    vlen += 2; /** +2 for the trailing "\r\n" */
    if (vlen < 0 || vlen - 2 < 0) {
        out_string(c, "CLIENT_ERROR bad command line format");
        return;
    }
    ...
    /** allocate an item; roughly, this requests a piece of memory. It belongs to memory
     *  management, which a later note will cover, so it is not explored here */
    it = item_alloc(key, nkey, flags, realtime(exptime), vlen);

    if (it == 0) {
        /** allocation failed */
        if (!item_size_ok(nkey, flags, vlen))
            out_string(c, "SERVER_ERROR object too large for cache");
        else
            out_of_memory(c, "SERVER_ERROR out of memory storing object");
        /* swallow the data line */
        c->write_and_go = conn_swallow;
        c->sbytes = vlen;

        /* Avoid stale data persisting in cache because we failed alloc.
         * Unacceptable for SET. Anywhere else too? */
        if (comm == NREAD_SET) {
            it = item_get(key, nkey);
            if (it) {
                item_unlink(it);
                item_remove(it);
            }
        }

        return;
    }
    /** allocation succeeded */
    ITEM_set_cas(it, req_cas_id);

    c->item = it;
    c->ritem = ITEM_data(it); /** pointing ritem at the item's data area is what saves a memory copy later */
    c->rlbytes = it->nbytes;  /** nbytes is the length of the value, including the trailing "\r\n"; see the item structure */
    c->cmd = comm;            /** the command currently being processed */
    conn_set_state(c, conn_nread); /** next, read it->nbytes bytes; control now returns through
                                    *  process_command to the state machine */
}

/** send the result of command processing back to the client */
static void out_string(conn *c, const char *str) {
    size_t len;
    ...
    /* Nuke a partial output... */
    c->msgcurr = 0;
    c->msgused = 0;
    c->iovused = 0;
    add_msghdr(c);

    len = strlen(str);
    if ((len + 2) > c->wsize) {
        /* ought to be always enough. just fail for simplicity */
        str = "SERVER_ERROR output line too long";
        len = strlen(str);
    }

    memcpy(c->wbuf, str, len); /** copy the response into wbuf */
    memcpy(c->wbuf + len, "\r\n", 2);
    c->wbytes = len + 2;
    c->wcurr = c->wbuf;

    conn_set_state(c, conn_write);  /** switch to the write state; the state machine continues in the write logic */
    c->write_and_go = conn_new_cmd; /** the state to enter once the write has completed */
    return;
}
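The line-splitting step of try_read_command and the tokenizing step of process_command can be tried in isolation. The sketch below is a simplified stand-in written for this note, not memcached code: `split_line`, `split_tokens`, `set_value_length` and `SKETCH_MAX_TOKENS` are hypothetical names, the tokens are plain `char *` pointers rather than `token_t`, and unlike tokenize_command it counts only the real tokens (5 for a set line, with no terminating entry).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_TOKENS 8 /* assumption: mirrors the "up to 8 tokens" noted above */

/* Terminate the first line of buf at its "\r\n" (or bare "\n"). Returns a
 * pointer to the byte after the '\n', or NULL if no complete line is there. */
static char *split_line(char *buf, size_t len) {
    char *el = memchr(buf, '\n', len);
    if (el == NULL)
        return NULL;                 /* need more data */
    char *cont = el + 1;             /* data after the command line */
    if (el - buf > 1 && *(el - 1) == '\r')
        el--;                        /* step back onto the '\r' */
    *el = '\0';                      /* the line is now a C string */
    return cont;
}

/* Split line in place on spaces; returns the number of tokens stored. */
static size_t split_tokens(char *line, char *tokens[], size_t max) {
    size_t n = 0;
    char *p = line;
    while (*p != '\0' && n < max) {
        while (*p == ' ')
            p++;                     /* skip separators */
        if (*p == '\0')
            break;
        tokens[n++] = p;             /* start of a token */
        while (*p != '\0' && *p != ' ')
            p++;
        if (*p == ' ')
            *p++ = '\0';             /* terminate the token */
    }
    return n;
}

/* For a buffer starting with "set <key> <flags> <exptime> <bytes>\r\n",
 * returns <bytes> and points *value_out at the data that follows the line.
 * Returns -1 if the line is incomplete or malformed. */
static long set_value_length(char *buf, size_t len, char **value_out) {
    char *tokens[SKETCH_MAX_TOKENS];
    char *cont = split_line(buf, len);
    if (cont == NULL)
        return -1;
    size_t n = split_tokens(buf, tokens, SKETCH_MAX_TOKENS);
    if (n != 5 || strcmp(tokens[0], "set") != 0)
        return -1;
    char *end;
    long vlen = strtol(tokens[4], &end, 10);
    if (end == tokens[4] || *end != '\0' || vlen < 0)
        return -1;
    *value_out = cont;
    return vlen;
}
```

Given the buffer "set bico 0 0 5\r\nhello\r\n", `set_value_length` reports a length of 5 and leaves `*value_out` pointing at "hello\r\n". In memcached those remaining bytes are what conn_nread copies into the item.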

The comments above follow the simple set command, which makes it easier to see the whole logical flow.
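Read end to end, the comments describe one path through the states for a SET on a fresh TCP connection whose command line and value arrive in a single read. The toy below only replays that order; it is a sketch with hypothetical names (`sketch_state`, `trace_set`), it does no I/O, and it ignores the error, swallow and UDP branches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced copy of the states that a successful SET touches. */
enum sketch_state {
    S_NEW_CMD, S_WAITING, S_READ, S_PARSE_CMD, S_NREAD, S_WRITE
};

/* Records the states visited, in order, into out[] and returns how many
 * were recorded (at most max). */
static size_t trace_set(enum sketch_state *out, size_t max) {
    enum sketch_state s = S_NEW_CMD;
    int replied = 0;                 /* set once the response was written */
    size_t n = 0;

    while (n < max) {
        out[n++] = s;
        if (s == S_NEW_CMD && replied)
            break;                   /* back at the start: the SET is done */
        switch (s) {
        case S_NEW_CMD:   s = S_WAITING;   break; /* no buffered data yet */
        case S_WAITING:   s = S_READ;      break; /* read event registered, then fired */
        case S_READ:      s = S_PARSE_CMD; break; /* data was received */
        case S_PARSE_CMD: s = S_NREAD;     break; /* "set ..." line parsed, item allocated */
        case S_NREAD:     s = S_WRITE;     break; /* value complete, reply prepared */
        case S_WRITE:     replied = 1;
                          s = S_NEW_CMD;   break; /* write_and_go = conn_new_cmd */
        }
    }
    return n;
}
```

With room for at least seven entries, the recorded order is new_cmd, waiting, read, parse_cmd, nread, write, new_cmd.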

The next note will focus on memory management.

My knowledge is limited, so there may be misunderstandings and errors in these notes; corrections are welcome. Thank you!
