Anatomy of the Redis networking library
1. Introduction to the Redis networking library
The Redis networking library is implemented in networking.c. This file is primarily responsible for:
client creation and release
receiving commands and replying to them
parsing the Redis communication protocol
implementing the CLIENT command
Each of these parts is analyzed below, with the relevant source code listed.
2. Creation and release of clients
Annotated source of the Redis networking library
2.1 Client creation
A Redis server is a program that maintains connections with many clients at the same time. When a client connects, the server creates a server.h/client structure to hold that client's state. Creating a client therefore means initializing such a structure. The source code of createClient() is as follows:
    client *createClient(int fd) {
        client *c = zmalloc(sizeof(client));    // allocate the structure

        // If fd is -1, a pseudo-client without a network connection is created,
        // for use when executing Lua scripts.
        // If fd is not -1, a client with a network connection is created.
        if (fd != -1) {
            // Put fd into non-blocking mode
            anetNonBlock(NULL,fd);
            // Disable the Nagle algorithm (TCP_NODELAY), so that every packet
            // handed to the kernel is sent out immediately
            anetEnableTcpNoDelay(NULL,fd);
            // If tcpkeepalive is enabled, set SO_KEEPALIVE on the connection
            if (server.tcpkeepalive)
                anetKeepAlive(NULL,fd,server.tcpkeepalive);
            // Register a file event on server.el that watches fd for readability,
            // so that incoming commands can be accepted
            if (aeCreateFileEvent(server.el,fd,AE_READABLE,
                readQueryFromClient, c) == AE_ERR)
            {
                close(fd);
                zfree(c);
                return NULL;
            }
        }

        selectDb(c,0);                          // database 0 by default
        c->id = server.next_client_id++;        // client ID
        c->fd = fd;                             // client socket
        c->name = NULL;                         // client name
        c->bufpos = 0;                          // offset into the static reply buffer
        c->querybuf = sdsempty();               // input buffer
        c->querybuf_peak = 0;                   // peak size of the input buffer
        c->reqtype = 0;                         // request protocol type (inline or multi-bulk), initially 0
        c->argc = 0;                            // number of arguments
        c->argv = NULL;                         // argument list
        c->cmd = c->lastcmd = NULL;             // current command and last executed command
        c->multibulklen = 0;                    // number of multi-bulk arguments left to read
        c->bulklen = -1;                        // length of the argument being read, -1 if unknown
        c->sentlen = 0;                         // number of bytes already sent
        c->flags = 0;                           // client state flags
        c->ctime = c->lastinteraction = server.unixtime; // creation time and time of last interaction
        c->authenticated = 0;                   // authentication state
        c->replstate = REPL_STATE_NONE;         // replication state, initially none
        c->repl_put_online_on_ack = 0;          // install the slave write handler on the first ACK
        c->reploff = 0;                         // replication offset
        c->repl_ack_off = 0;                    // offset received with the last ACK
        c->repl_ack_time = 0;                   // time of the last ACK
        c->slave_listening_port = 0;            // listening port of the slave
        c->slave_ip[0] = '\0';                  // IP address of the slave
        c->slave_capa = SLAVE_CAPA_NONE;        // capabilities of the slave
        c->reply = listCreate();                // reply list
        c->reply_bytes = 0;                     // number of bytes in the reply list
        c->obuf_soft_limit_reached_time = 0;    // time the output buffer soft limit was reached
        // Free and dup methods of the reply list
        listSetFreeMethod(c->reply,decrRefCountVoid);
        listSetDupMethod(c->reply,dupClientReplyValue);
        c->btype = BLOCKED_NONE;                // blocking type
        c->bpop.timeout = 0;                    // blocking timeout
        c->bpop.keys = dictCreate(&setDictType,NULL); // keys the client is blocked on
        c->bpop.target = NULL;                  // destination key that receives the pushed element
        c->bpop.numreplicas = 0;                // number of replicas being waited for
        c->bpop.reploffset = 0;                 // replication offset to reach
        c->woff = 0;                            // global replication offset of the last write
        c->watched_keys = listCreate();         // WATCHed keys
        c->pubsub_channels = dictCreate(&setDictType,NULL); // subscribed channels
        c->pubsub_patterns = listCreate();      // subscribed patterns
        c->peerid = NULL;                       // cached peer ID, in ip:port form
        // Free and match methods of the pattern list
        listSetFreeMethod(c->pubsub_patterns,decrRefCountVoid);
        listSetMatchMethod(c->pubsub_patterns,listMatchObjects);
        // Only real clients are added to the server's client list
        if (fd != -1) listAddNodeTail(server.clients,c);
        // Initialize the client's transaction (MULTI/EXEC) state
        initClientMultiState(c);
        return c;
    }
Depending on the file descriptor fd passed in, createClient() creates a client for one of two scenarios. The fd is the file descriptor the server obtains when it accepts a client connection.
fd == -1: a client without a network connection is created. It is used mainly when executing Lua scripts.
fd != -1: a normal connection has been accepted, so a client with a network connection is created. A file event is registered to watch fd for readability, and the event fires whenever the client sends data. The Nagle algorithm is also disabled when this kind of client is created.
The Nagle algorithm coalesces many small buffered messages into fewer packets, a process called nagling, which makes a networked system more efficient by reducing the number of packets that must be sent. However, both the server and the client need low-latency communication, so the Nagle algorithm is disabled and every packet handed to the kernel is sent out immediately.
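The two socket options discussed above can be illustrated outside Redis with plain POSIX calls. The following is a minimal sketch, not the Redis implementation: the helper names set_nonblocking(), set_tcp_nodelay() and get_tcp_nodelay() are hypothetical, and the real anetNonBlock() and anetEnableTcpNoDelay() also report errors through a message buffer, which is omitted here.

```c
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: disable the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on error. */
int set_tcp_nodelay(int fd) {
    assert(fd >= 0);
    int yes = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &yes, sizeof(yes));
}

/* Hypothetical helper: read the option back.
 * Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int get_tcp_nodelay(int fd) {
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) == -1) return -1;
    return val != 0;
}
```

Both helpers would be applied to the descriptor returned by accept() before any read event is registered on it, which is the same order createClient() uses.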
Creating a client initializes every member of the server.h/client structure. The most important members are the following:
uint64_t id: the server assigns an ID to every incoming connection. Client IDs start at 1, and the counter is reset each time the server restarts.
int fd: the client's socket descriptor. It distinguishes clients without a network connection (-1) from clients with one.
int flags: the client's state flags. Redis 3.2.8 defines 23 of them in server.h.
robj *name: clients are created without a name by default. A name can be set with the CLIENT SETNAME command, whose implementation is described later.
int reqtype: the type of the request protocol. Because the Redis server also accepts telnet connections, a command sent over telnet has the request type PROTO_REQ_INLINE, while a command sent by redis-cli has the request type PROTO_REQ_MULTIBULK.
Members used to hold the commands the server receives from the client:
sds querybuf: the input buffer that holds the command requests sent by the client, stored in Redis protocol format.
size_t querybuf_peak: the peak size of the input buffer.
int argc: the number of command arguments.
robj **argv: the list of command arguments.
Members used to hold the server's replies to the client:
char buf[16*1024]: the static buffer that holds the reply produced by executing a command. Its size is fixed, so it mainly holds relatively short replies. The 16 KB array is part of the client structure and is allocated together with it.
int bufpos: the offset into the static buffer, that is, the number of bytes of the buf array already in use.
list *reply: the list of command replies. Because the static buffer has a fixed size, it mainly holds short replies; when a command produces a large reply, the reply is chained together in this linked list.
unsigned long reply_bytes: the number of bytes held in the reply list.
size_t sentlen: the number of reply bytes already sent.
2.2 Release of the client
The freeClient() function, which releases a client, mainly frees the client's data structures and empties its buffers, so its source code is not listed here. The asynchronous release of a client, however, is worth a look. The source code is as follows:
    // Release a client asynchronously
    void freeClientAsync(client *c) {
        // Return immediately if the client is already scheduled to be closed
        // or is the Lua script pseudo-client
        if (c->flags & CLIENT_CLOSE_ASAP || c->flags & CLIENT_LUA) return;
        c->flags |= CLIENT_CLOSE_ASAP;
        // Add the client to the list of clients to be closed
        listAddNodeTail(server.clients_to_close,c);
    }
server.clients_to_close: the server's linked list of all clients that are waiting to be closed.
Asynchronous release exists so that a client is not freed while a lower-level function is still writing data into its output buffer, which would be unsafe. Instead, Redis releases the client later, at a safe point in the serverCron() function.
The queued clients are actually released by freeClientsInAsyncFreeQueue(), which clears the CLIENT_CLOSE_ASAP flag of each client and then calls freeClient() to release it immediately. The source code is as follows:
    // Release the clients that were scheduled for asynchronous release
    void freeClientsInAsyncFreeQueue(void) {
        // Walk through all clients waiting to be closed
        while (listLength(server.clients_to_close)) {
            listNode *ln = listFirst(server.clients_to_close);
            client *c = listNodeValue(ln);

            // Clear the close-as-soon-as-possible flag
            c->flags &= ~CLIENT_CLOSE_ASAP;
            freeClient(c);
            // Remove the node from the list of clients to be closed
            listDelNode(server.clients_to_close,ln);
        }
    }
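The mark-then-drain pattern behind these two functions can be reduced to a small standalone sketch. Everything in it is hypothetical and chosen for illustration: demo_client stands in for client, a fixed array stands in for server.clients_to_close, and free() stands in for freeClient(). Unlike the Redis code, the drain step does not clear the flag, because the object is released right away.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CLOSE_ASAP (1 << 0)   /* stand-in for CLIENT_CLOSE_ASAP */
#define DEMO_MAX_PENDING 16        /* arbitrary capacity for the sketch */

typedef struct demo_client {
    int id;
    int flags;
} demo_client;

/* Stand-in for server.clients_to_close */
static demo_client *demo_pending[DEMO_MAX_PENDING];
static int demo_pending_len = 0;

/* Mark the client and queue it. A client that is already marked is
 * ignored, so it can never be queued (and later freed) twice. */
void demo_free_async(demo_client *c) {
    if (c->flags & DEMO_CLOSE_ASAP) return;
    assert(demo_pending_len < DEMO_MAX_PENDING);
    c->flags |= DEMO_CLOSE_ASAP;
    demo_pending[demo_pending_len++] = c;
}

/* Number of clients currently waiting to be released. */
int demo_pending_count(void) {
    return demo_pending_len;
}

/* Called from a safe point: release every queued client.
 * Returns the number of clients released. */
int demo_drain_pending(void) {
    int n = demo_pending_len;
    for (int i = 0; i < n; i++) {
        free(demo_pending[i]);
        demo_pending[i] = NULL;
    }
    demo_pending_len = 0;
    return n;
}
```

The flag check is what makes repeated calls harmless: code deep in a write path can request the release as often as it likes, and the actual free happens once, later, when no caller still holds the pointer.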
3. Receiving commands and replying to them
3.1 Receiving commands
When a client connects to the Redis server, the server obtains a file descriptor fd and watches it for read events, as we saw in the createClient() function. When the client then sends a command, the AE_READABLE event fires and the callback readQueryFromClient() is called to read the command from fd into the input buffer querybuf. This callback is one of the handlers stored in the rfileProc and wfileProc callback pointers that we described in the article on the Redis event-processing implementation. So let's start by analyzing the readQueryFromClient() function.
    // Read what the client sent into its input buffer
    void readQueryFromClient(aeEventLoop *el, int fd, void *privdata, int mask) {
        client *c = (client*) privdata;
        int nread, readlen;
        size_t qblen;
        UNUSED(el);
        UNUSED(mask);

        // Read length, 16 KB by default
        readlen = PROTO_IOBUF_LEN;
        /* If this is a multi bulk request, and we are processing a bulk reply
         * that is large enough, try to maximize the probability that the query
         * buffer contains exactly the SDS string representing the object, even
         * at the risk of requiring more read(2) calls. This way the function
         * processMultibulkBuffer() can avoid copying buffers to create the
         * Redis Object representing the argument. */
        // For a multi-bulk request, set the read length according to the size of the argument
        if (c->reqtype == PROTO_REQ_MULTIBULK && c->multibulklen && c->bulklen != -1
            && c->bulklen >= PROTO_MBULK_BIG_ARG)
        {
            int remaining = (unsigned)(c->bulklen+2)-sdslen(c->querybuf);

            if (remaining < readlen) readlen = remaining;
        }

        // Current length of the input buffer
        qblen = sdslen(c->querybuf);
        // Update the peak size of the buffer
        if (c->querybuf_peak < qblen) c->querybuf_peak = qblen;
        // Grow the buffer
        c->querybuf = sdsMakeRoomFor(c->querybuf, readlen);
        // Read the command sent by the client into the input buffer
        nread = read(fd, c->querybuf+qblen, readlen);
        // Read error
        if (nread == -1) {
            if (errno == EAGAIN) {
                return;
            } else {
                serverLog(LL_VERBOSE, "Reading from client: %s",strerror(errno));
                freeClient(c);
                return;
            }
        // The client closed the connection
        } else if (nread == 0) {
            serverLog(LL_VERBOSE, "Client closed connection");
            freeClient(c);
            return;
        }

        // Update the used and free sizes of the input buffer
        sdsIncrLen(c->querybuf,nread);
        // Record the time of the last interaction between server and client
        c->lastinteraction = server.unixtime;
        // If the client is our master, update the replication offset
        if (c->flags & CLIENT_MASTER) c->reploff += nread;
        // Update the count of bytes read from the network
        server.stat_net_input_bytes += nread;
        // If the input buffer exceeds the maximum length configured on the server
        if (sdslen(c->querybuf) > server.client_max_querybuf_len) {
            // Describe the client as an sds string
            sds ci = catClientInfoString(sdsempty(),c), bytes = sdsempty();

            // Save the first bytes of the input buffer in bytes
            bytes = sdscatrepr(bytes,c->querybuf,64);
            // Write both to the log
            serverLog(LL_WARNING,"Closing client that reached max query buffer length: %s (qbuf initial bytes: %s)", ci, bytes);
            // Free the strings
            sdsfree(ci);
            sdsfree(bytes);
            freeClient(c);
            return;
        }
        // Process the commands in the client's input buffer
        processInputBuffer(c);
    }
In essence, readQueryFromClient() is a wrapper around read(): it reads data from the file descriptor fd into the input buffer querybuf, updates the buffer's peak size querybuf_peak, and checks the resulting length. If the buffer is longer than server.client_max_querybuf_len, the client is closed. This threshold is initialized to PROTO_MAX_QUERYBUF_LEN (1024*1024*1024), that is, 1 GB.
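The read, peak-tracking and size-limit steps can be sketched without SDS strings or the event loop. The code below is an illustration under stated assumptions, not the Redis code: demo_inbuf, demo_read_query() and the DEMO_* result codes are hypothetical names, the buffer is grown with realloc() instead of sdsMakeRoomFor(), and the caller decides what to do with each result instead of the function freeing the client itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DEMO_IOBUF_LEN (16 * 1024)   /* bytes requested per read() call */

typedef struct demo_inbuf {
    char *data;       /* heap buffer, may be NULL while empty */
    size_t len;       /* bytes in use */
    size_t cap;       /* bytes allocated */
    size_t peak;      /* largest len observed before a read */
    size_t max_len;   /* limit after which the client must be dropped */
} demo_inbuf;

enum {
    DEMO_OK = 0,        /* data appended to the buffer */
    DEMO_AGAIN = 1,     /* nothing to read right now */
    DEMO_CLOSED = -1,   /* peer closed the connection */
    DEMO_ERROR = -2,    /* read or allocation failure */
    DEMO_TOO_BIG = -3   /* buffer grew past max_len */
};

int demo_read_query(int fd, demo_inbuf *b) {
    assert(b != NULL);

    /* Track the peak before the buffer grows */
    if (b->peak < b->len) b->peak = b->len;

    /* Make room for one more chunk */
    if (b->cap - b->len < DEMO_IOBUF_LEN) {
        size_t newcap = b->len + DEMO_IOBUF_LEN;
        char *p = realloc(b->data, newcap);
        if (p == NULL) return DEMO_ERROR;
        b->data = p;
        b->cap = newcap;
    }

    ssize_t nread = read(fd, b->data + b->len, DEMO_IOBUF_LEN);
    if (nread == -1) return errno == EAGAIN ? DEMO_AGAIN : DEMO_ERROR;
    if (nread == 0) return DEMO_CLOSED;

    b->len += (size_t)nread;
    if (b->len > b->max_len) return DEMO_TOO_BIG;
    return DEMO_OK;
}
```

A caller would close the connection on DEMO_CLOSED, DEMO_ERROR and DEMO_TOO_BIG, return to the event loop on DEMO_AGAIN, and hand the buffer to the parser on DEMO_OK.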
Recall that the command implementations analyzed earlier all work on the client's argv and argc members. The server therefore has to turn the data in the input buffer querybuf into the objects of the argument list, which is the job of the processInputBuffer() function called above. The source code is as follows:
    // Process the commands in the client's input buffer
    void processInputBuffer(client *c) {
        server.current_client = c;
        /* Keep processing while there is something in the input buffer */
        while(sdslen(c->querybuf)) {
            /* Return if clients are paused. */
            // Stop if clients are paused (slaves are exempt)
            if (!(c->flags & CLIENT_SLAVE) && clientsArePaused()) break;

            /* Immediately abort if the client is in the middle of something. */
            // Stop if the client is blocked
            if (c->flags & CLIENT_BLOCKED) break;

            // Stop if the client is about to be closed
            if (c->flags & (CLIENT_CLOSE_AFTER_REPLY|CLIENT_CLOSE_ASAP)) break;

            /* Determine request type when unknown. */
            if (!c->reqtype) {
                // A request that starts with '*' is a multi-bulk request, sent by a client program
                if (c->querybuf[0] == '*') {
                    c->reqtype = PROTO_REQ_MULTIBULK;
                // Otherwise it is an inline request, for example sent over telnet
                } else {
                    c->reqtype = PROTO_REQ_INLINE;
                }
            }

            // Inline request
            if (c->reqtype == PROTO_REQ_INLINE) {
                // Parse the inline command and store the resulting objects in the client's argument list
                if (processInlineBuffer(c) != C_OK) break;
            // Multi-bulk request
            } else if (c->reqtype == PROTO_REQ_MULTIBULK) {
                // Convert the protocol content in querybuf into the objects of the client's argument list
                if (processMultibulkBuffer(c) != C_OK) break;
            } else {
                serverPanic("Unknown request type");
            }

            /* Multibulk processing could see a <= 0 length. */
            // With zero arguments, just reset the client
            if (c->argc == 0) {
                resetClient(c);
            } else {
                /* Only reset the client when the command was executed. */
                if (processCommand(c) == C_OK)
                    resetClient(c);
                /* freeMemoryIfNeeded may flush slave output buffers. This may result
                 * into a slave, that may be the active client, to be freed. */
                if (server.current_client == NULL) break;
            }
        }
        // Done: clear the client pointer that is used for crash reports
        server.current_client = NULL;
    }
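The request-type decision inside the loop above depends only on the first byte of the buffer. A minimal sketch of that one step, with hypothetical names (demo_reqtype, demo_detect_reqtype) and enum values chosen for this example only, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PROTO_REQ_* constants */
enum demo_reqtype {
    DEMO_REQ_NONE = 0,       /* nothing buffered yet, type still unknown */
    DEMO_REQ_INLINE = 1,     /* plain text line, e.g. typed over telnet */
    DEMO_REQ_MULTIBULK = 2   /* "*<argc>\r\n..." as sent by client libraries */
};

/* Decide the request type from the first buffered byte. */
enum demo_reqtype demo_detect_reqtype(const char *buf, size_t len) {
    assert(buf != NULL);
    if (len == 0) return DEMO_REQ_NONE;
    return buf[0] == '*' ? DEMO_REQ_MULTIBULK : DEMO_REQ_INLINE;
}
```

Because the type is stored on the client and only cleared when the client is reset, a request that arrives in several reads keeps the type chosen from its first byte.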
The processInputBuffer() function simply determines and sets the request type in reqtype and dispatches on it. As mentioned earlier, the Redis server accepts telnet connections, so a command sent over telnet has the request type PROTO_REQ_INLINE and is handled by processInlineBuffer(), while a command sent by redis-cli has the request type PROTO_REQ_MULTIBULK and is handled by processMultibulkBuffer(). We only need to look at processMultibulkBuffer(), which shows how a command in Redis protocol format is turned into the objects of the argument list. The source code is as follows:
    // Convert the protocol content in the client's querybuf into the objects of the client's argument list
    int processMultibulkBuffer(client *c) {
        char *newline = NULL;
        int pos = 0, ok;
        long long ll;

        // No arguments are pending, so a new command starts here
        if (c->multibulklen == 0) {
            /* The client should have been reset */
            serverAssertWithInfo(c,NULL,c->argc == 0);

            /* Multi bulk length cannot be read without a \r\n */
            // Look for the first line break
            newline = strchr(c->querybuf,'\r');
            // No \r found: the line is incomplete or violates the protocol
            if (newline == NULL) {
                if (sdslen(c->querybuf) > PROTO_INLINE_MAX_SIZE) {
                    addReplyError(c,"Protocol error: too big mbulk count string");
                    setProtocolError(c,0);
                }
                return C_ERR;
            }

            /* Buffer should also contain \n */
            if (newline-(c->querybuf) > ((signed)sdslen(c->querybuf)-2))
                return C_ERR;

            /* We know for sure there is a whole line since newline != NULL,
             * so go ahead and find out the multi bulk length. */
            // The first character must be '*'
            serverAssertWithInfo(c,NULL,c->querybuf[0] == '*');
            // Convert the number after '*' to an integer, e.g. the 3 in "*3\r\n"
            ok = string2ll(c->querybuf+1,newline-(c->querybuf+1),&ll);
            if (!ok || ll > 1024*1024) {
                addReplyError(c,"Protocol error: invalid multibulk length");
                setProtocolError(c,pos);
                return C_ERR;
            }

            // pos now points just past the "\r\n" of "*3\r\n"
            pos = (newline-c->querybuf)+2;
            // Empty command: drop what has been read and keep the unread part
            if (ll <= 0) {
                sdsrange(c->querybuf,pos,-1);
                return C_OK;
            }

            // Number of arguments
            c->multibulklen = ll;

            /* Setup argv array on client structure */
            if (c->argv) zfree(c->argv);
            c->argv = zmalloc(sizeof(robj*)*c->multibulklen);
        }

        serverAssertWithInfo(c,NULL,c->multibulklen > 0);
        // Read multibulklen arguments and store each one as an object in the argument list
        while(c->multibulklen) {
            /* Read bulk length if unknown */
            if (c->bulklen == -1) {
                // Look for the line break and make sure "\r\n" is present
                newline = strchr(c->querybuf+pos,'\r');
                if (newline == NULL) {
                    if (sdslen(c->querybuf) > PROTO_INLINE_MAX_SIZE) {
                        addReplyError(c,
                            "Protocol error: too big bulk count string");
                        setProtocolError(c,0);
                        return C_ERR;
                    }
                    break;
                }

                /* Buffer should also contain \n */
                if (newline-(c->querybuf) > ((signed)sdslen(c->querybuf)-2))
                    break;

                // In "$3\r\nSET\r\n..." the argument must start with '$'
                if (c->querybuf[pos] != '$') {
                    addReplyErrorFormat(c,
                        "Protocol error: expected '$', got '%c'",
                        c->querybuf[pos]);
                    setProtocolError(c,pos);
                    return C_ERR;
                }

                // Store the argument length in ll
                ok = string2ll(c->querybuf+pos+1,newline-(c->querybuf+pos+1),&ll);
                if (!ok || ll < 0 || ll > 512*1024*1024) {
                    addReplyError(c,"Protocol error: invalid bulk length");
                    setProtocolError(c,pos);
                    return C_ERR;
                }

                // Move to the start of the argument itself, e.g. the S of SET
                pos += newline-(c->querybuf+pos)+2;
                // Optimization for very long arguments
                if (ll >= PROTO_MBULK_BIG_ARG) {
                    size_t qblen;

                    /* If we are going to read a large object from network
                     * try to make it likely that it will start at c->querybuf
                     * boundary so that we can optimize object creation
                     * avoiding a large copy of data. */
                    // Keep only the unread part and reset the offset
                    sdsrange(c->querybuf,pos,-1);
                    pos = 0;
                    qblen = sdslen(c->querybuf);
                    /* Hint the sds library about the amount of bytes this string is
                     * going to contain. */
                    if (qblen < (size_t)ll+2)
                        c->querybuf = sdsMakeRoomFor(c->querybuf,ll+2-qblen);
                }
                // Save the length of the argument
                c->bulklen = ll;
            }

            /* Read bulk argument */
            // Not enough data has been read yet: leave the loop and wait for the next read
            if (sdslen(c->querybuf)-pos < (unsigned)(c->bulklen+2)) {
                /* Not enough data (+2 == trailing \r\n) */
                break;
            // Otherwise create an object for the argument
            } else {
                /* Optimization: if the buffer contains JUST our bulk element
                 * instead of creating a new object by *copying* the sds we
                 * just use the current sds string. */
                // Only for arguments of at least 32 KB
                if (pos == 0 &&
                    c->bulklen >= PROTO_MBULK_BIG_ARG &&
                    (signed) sdslen(c->querybuf) == c->bulklen+2)
                {
                    c->argv[c->argc++] = createObject(OBJ_STRING,c->querybuf);
                    sdsIncrLen(c->querybuf,-2); /* remove CRLF */
                    /* Assume that if we saw a fat argument we'll see another one
                     * likely... */
                    c->querybuf = sdsnewlen(NULL,c->bulklen+2);
                    sdsclear(c->querybuf);
                    pos = 0;
                // Create a string object and store it in the client's argument list
                } else {
                    c->argv[c->argc++] =
                        createStringObject(c->querybuf+pos,c->bulklen);
                    pos += c->bulklen+2;
                }
                // Reset the argument length
                c->bulklen = -1;
                // One argument read, one fewer left to read
                c->multibulklen--;
            }
        }

        /* Trim to pos */
        // Drop what has been read and keep the unread part
        if (pos) sdsrange(c->querybuf,pos,-1);

        /* We're done when c->multibulk == 0 */
        // All arguments of the command have been read
        if (c->multibulklen == 0) return C_OK;

        /* Still not ready to process the command */
        return C_ERR;
    }
Let us walk through the function using the multi-bulk format. A multi-bulk request is prefixed with *<argc>\r\n, followed by argc bulk strings, one per argument. The command SET mykey myvalue is therefore encoded in the Redis protocol as follows:
"*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n"
When processMultibulkBuffer() is entered for the first time for a command, multibulklen, the number of arguments still to be read, is 0 and the argument list is empty, so the code under if (c->multibulklen == 0) runs. This code parses *3\r\n, stores 3 in multibulklen as the number of arguments that follow, and allocates space for argv based on that number.
Next, the while loop runs multibulklen times and reads one argument per iteration. For $3\r\nSET\r\n, for example, it reads the argument length and stores it in bulklen, then creates an object for the argument SET and stores it in the argument list. Each time an argument is read, multibulklen is decremented by 1; when it reaches 0, all arguments of the command have been read into the argument list.
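The two phases described above (read the argument count, then read each length-prefixed argument) can be sketched as a standalone parser. This is a simplified illustration, not the Redis function: demo_args, demo_read_len(), demo_parse_multibulk() and demo_free_args() are hypothetical names, arguments are copied into plain C strings instead of robj objects, and the parser expects the whole command in one buffer, whereas processMultibulkBuffer() keeps multibulklen and bulklen on the client so that it can resume after a partial read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ARGS 16   /* arbitrary limit for the sketch */

typedef struct demo_args {
    int argc;
    char *argv[DEMO_MAX_ARGS];     /* heap copies, NUL-terminated */
    size_t arglen[DEMO_MAX_ARGS];  /* lengths, since arguments may contain NUL */
} demo_args;

void demo_free_args(demo_args *a) {
    for (int i = 0; i < a->argc; i++) free(a->argv[i]);
    a->argc = 0;
}

/* Parse a decimal number followed by \r\n starting at buf[*pos].
 * On success stores it in *out, moves *pos past the \r\n, returns 0. */
static int demo_read_len(const char *buf, size_t len, size_t *pos, long *out) {
    size_t i = *pos;
    long v = 0;
    if (i >= len || buf[i] < '0' || buf[i] > '9') return -1;
    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        v = v * 10 + (buf[i] - '0');
        if (v > 1024 * 1024) return -1;   /* sketch-only sanity limit */
        i++;
    }
    if (i + 1 >= len || buf[i] != '\r' || buf[i + 1] != '\n') return -1;
    *pos = i + 2;
    *out = v;
    return 0;
}

/* Returns 0 if one complete command was parsed into out, -1 otherwise. */
int demo_parse_multibulk(const char *buf, size_t len, demo_args *out) {
    size_t pos = 1;
    long count, blen;

    assert(buf != NULL && out != NULL);
    out->argc = 0;
    if (len == 0 || buf[0] != '*') return -1;
    /* Phase 1: "*<argc>\r\n" */
    if (demo_read_len(buf, len, &pos, &count) != 0) return -1;
    if (count > DEMO_MAX_ARGS) return -1;

    /* Phase 2: argc times "$<len>\r\n<bytes>\r\n" */
    for (long i = 0; i < count; i++) {
        if (pos >= len || buf[pos] != '$') goto fail;
        pos++;
        if (demo_read_len(buf, len, &pos, &blen) != 0) goto fail;
        if (len - pos < (size_t)blen + 2) goto fail;   /* +2 == trailing \r\n */

        char *copy = malloc((size_t)blen + 1);
        if (copy == NULL) goto fail;
        memcpy(copy, buf + pos, (size_t)blen);
        copy[blen] = '\0';
        out->argv[out->argc] = copy;
        out->arglen[out->argc] = (size_t)blen;
        out->argc++;
        pos += (size_t)blen + 2;
    }
    return 0;

fail:
    demo_free_args(out);
    return -1;
}
```

Feeding it the encoded SET mykey myvalue request yields argc == 3 with the arguments SET, mykey and myvalue; a truncated buffer is rejected instead of being remembered for the next read, which is the main simplification compared with Redis.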
This completes the whole process of receiving a command.
3.2 Replying to commands
The command reply function is also one of the event handler's callbacks. When a client's reply buffer contains data, the server calls aeCreateFileEvent(