Packing and unpacking TCP packets (Classic Collection)
In TCP-based communication programs there is one very important problem to solve: packing and unpacking packets. In the roughly three years I have been doing network programming, I have kept thinking about and improving how I handle it. Below are my thoughts on the problem. If anything here is wrong, please correct me. Thanks in advance.
I. Why TCP-based communication programs need packing and unpacking.
TCP is a "stream" protocol. A stream is a sequence of bytes with no boundaries, like water in a river: it is one continuous whole with no dividing lines. Communication programs, however, usually need to define separate packets, for example a login packet and a logout packet. Because TCP is a stream, and depending on network conditions, data can arrive in several different ways.
Suppose we call send twice in a row to send two pieces of data, data1 and data2. The receiving end may see any of the following (these are representative cases, not the only ones).
A. data1 is received first, then data2.
B. Part of data1 is received first, then the rest of data1 together with all of data2.
C. All of data1 and part of data2 are received first, then the rest of data2.
D. All of data1 and data2 are received at once.
Case A is exactly what we want and needs no further discussion. Cases B, C and D are what we usually call "sticky packets": the received data has to be split back into separate packets. To make that splitting possible, the data must be packed on the sending side.
In addition: UDP has no unpacking problem, because UDP is a "datagram" protocol. Each piece of data has boundaries, so provided the receive buffer is large enough, the receiver either gets nothing or gets one complete piece of data, never less and never more.
II. Why cases B, C and D occur.
"Sticky packets" can be caused on the sending side and on the receiving side.
1. Sticky packets caused by the Nagle algorithm on the sending side: the Nagle algorithm improves network transmission efficiency. Put simply, when we hand TCP a small piece of data to send, TCP may not send it immediately but hold it back for a short time to see whether more data is submitted in the meantime. If so, the pieces are sent together in one go. This is only a simplified explanation of the Nagle algorithm; for the details please consult the relevant books. Cases C and D are probably caused by the Nagle algorithm.
2. Sticky packets caused by the receiving side not reading in time: TCP stores received data in its own buffer and then notifies the application layer to fetch it. If the application layer cannot take the data out of the TCP buffer in time for some reason, several pieces of data accumulate in the buffer.
III. How to pack and unpack.
When I first ran into the "sticky packet" problem, I worked around it by calling sleep between two sends. The drawbacks are obvious: transmission efficiency drops sharply, and it is still unreliable. Later I used acknowledgements: the sender waits for a reply before sending the next piece. That works most of the time, but it cannot handle case B, and the replies add traffic and network load (although protocols such as FTP do work in this request/reply fashion). In the end I settled on packing the data on the sending side and unpacking it on the receiving side.
Packing:
Packing means putting a header in front of a piece of data, so that a packet consists of two parts, the header and the body (later, when filtering illegal packets, a "packet tail" is added as well). The header is a fixed-size structure. One of its members holds the length of the body; this is the crucial member, and the others can be defined as you see fit. From the fixed length of the header and the body length stored in it, a complete packet can be split off correctly.
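As a minimal sketch of the packing step, assuming a header with the same three fields as the one defined later in this article: the names PacketHeader and PackPacket are made up for illustration, and the fields are written in host byte order, which only works when both ends share the same byte order.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct PacketHeader {          // hypothetical stand-in for the article's header
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;     // length of the body in bytes
};
#pragma pack(pop)

// Copies header and body into one contiguous buffer that can be handed
// to send() in a single call. Host byte order is an assumption here.
std::vector<char> PackPacket(std::uint8_t version, std::uint16_t command,
                             const char* body, std::uint16_t bodyLen)
{
    PacketHeader head;
    head.version = version;
    head.command = command;
    head.dataLen = bodyLen;

    std::vector<char> packet(sizeof(head) + bodyLen);
    std::memcpy(packet.data(), &head, sizeof(head));
    if (bodyLen > 0)
        std::memcpy(packet.data() + sizeof(head), body, bodyLen);
    return packet;
}
```

Sending header and body as one buffer keeps them from being interleaved with other sends on the same socket; the receiver still has to unpack, because TCP may deliver the buffer in pieces.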
There are two unpacking methods that I use most often.
1. Dynamic buffer staging. The buffer is called dynamic because it is enlarged whenever the data to be buffered exceeds its current length.
The process is roughly as follows:
A. Allocate a buffer dynamically for each connection and associate it with the socket, usually through a structure.
B. When data is received, first store it in the buffer.
C. Check whether the buffer holds at least one header's worth of data; if not, do not unpack yet.
D. Parse the header to obtain the member that gives the length of the body.
E. Check whether the data in the buffer beyond the header is at least as long as the body; if not, do not unpack yet.
F. Take out the whole packet. "Take" here means not only copying the packet out of the buffer but also removing it from the buffer, which is done by moving the data behind the packet to the start of the buffer.
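Steps B to F can be sketched in portable C++ roughly as follows. This is an illustration only, not the code used in this article: StreamUnpacker and PacketHeader are hypothetical names, std::vector takes the place of the hand-managed buffer, and there is no locking.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct PacketHeader {          // hypothetical, same fields as the article's header
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;     // length of the body in bytes
};
#pragma pack(pop)

class StreamUnpacker {
public:
    // Step B: store whatever one receive call returned.
    void Append(const char* data, std::size_t len) {
        buffer_.insert(buffer_.end(), data, data + len);
    }

    // Steps C to F: returns true and fills `packet` (header + body) when a
    // complete packet is buffered; returns false when more data is needed.
    bool TryExtract(std::vector<char>& packet) {
        if (buffer_.size() < sizeof(PacketHeader))                 // step C
            return false;
        PacketHeader head;
        std::memcpy(&head, buffer_.data(), sizeof(head));          // step D
        const std::size_t total = sizeof(head) + head.dataLen;
        if (buffer_.size() < total)                                // step E
            return false;
        packet.assign(buffer_.begin(), buffer_.begin() + total);   // step F: copy out
        buffer_.erase(buffer_.begin(), buffer_.begin() + total);   // step F: remove
        return true;
    }

private:
    std::vector<char> buffer_;  // step A: one instance per connection
};
```

A caller would typically call Append after every receive and then call TryExtract in a loop until it returns false, since one receive may contain several packets.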
This approach has two drawbacks. 1. Dynamically allocating a buffer for each connection increases memory usage. 2. Data is copied in three places: when storing received data in the buffer, when copying a complete packet out of the buffer, and when removing the packet from the buffer. The improved unpacking methods described later remove some of these drawbacks.
The relevant code is given below.
First, the definition of the packet header structure:
#pragma pack(push, 1) // start of the packet definitions, using 1-byte alignment
/*----------------------Packet header---------------------*/
typedef struct tagPackageHead
{
    BYTE Version;
    WORD Command;
    WORD nDataLen; // length of the packet body
} PACKAGE_HEAD;
#pragma pack(pop) // end of the packet definitions, restore the original alignment
Next, the functions that store data and "take" data.
/*****************************************************************************
Description: Adds data to the buffer.
Input: pBuff[in] - data to be added; nLen[in] - length of the data to be added
Return: FALSE if nLen is negative or if the buffer did not have enough free
        space and had to be grown; otherwise TRUE. Unless nLen is negative,
        the data is stored in both cases.
******************************************************************************/
BOOL CDataBufferPool::AddBuff(char *pBuff, int nLen)
{
    m_cs.Lock(); // enter the critical section
    if (nLen < 0)
    {
        m_cs.Unlock();
        return FALSE;
    }
    if (nLen <= GetFreeSize()) // is the remaining space enough to store nLen bytes?
    {
        memcpy(m_pBuff + m_nOffset, pBuff, nLen);
        m_nOffset += nLen;
    }
    else // not enough space: enlarge the buffer
    {
        char *p = m_pBuff;
        m_nSize += nLen * 2; // grow by 2*nLen each time
        m_pBuff = new char[m_nSize];
        memcpy(m_pBuff, p, m_nOffset); // keep the data already buffered
        delete[] p;
        memcpy(m_pBuff + m_nOffset, pBuff, nLen);
        m_nOffset += nLen;
        m_cs.Unlock();
        return FALSE;
    }
    m_cs.Unlock();
    return TRUE;
}
/*****************************************************************************
Description: Gets one complete packet.
Output: Buf[out] - the packet obtained; nLen[out] - length of the packet obtained
Return: 0 - a complete packet was copied to Buf
        1 - the buffer does not yet hold a complete header
        2 - the buffer does not yet hold a complete body
******************************************************************************/
int CDataBufferPool::GetFullPacket(char *Buf, int& nLen)
{
    m_cs.Lock();
    if (m_nOffset < m_PacketHeadLen) // the buffer does not yet hold a complete header
    {
        m_cs.Unlock();
        return 1;
    }
    PACKAGE_HEAD *p = (PACKAGE_HEAD *)m_pBuff;
    if ((m_nOffset - m_PacketHeadLen) < (int)p->nDataLen) // the buffer does not yet hold a complete body
    {
        m_cs.Unlock();
        return 2;
    }
    // Check the legitimacy of the packet
    /* int IsIntegrallity = ValidatePackIntegrality(p);
    if (IsIntegrallity != 0)
    {
        m_cs.Unlock();
        return IsIntegrallity;
    }
    */
    nLen = m_PacketHeadLen + p->nDataLen;
    memcpy(Buf, m_pBuff, nLen); // copy the packet out
    m_nOffset -= nLen;
    // Remove the packet by moving the remaining data to the front.
    // Source and destination may overlap, so memmove is required here, not memcpy.
    memmove(m_pBuff, m_pBuff + nLen, m_nOffset);
    m_cs.Unlock();
    return 0;
}
The drawbacks of this approach were listed above. An improved version uses a ring buffer. That improvement does not remove the first drawback (a buffer per connection) or the first copy; it only removes the third copy, the one that deletes a packet from the buffer, which is the copy that moves the most data. The second unpacking method solves both remaining problems.
A ring buffer is implemented with two pointers that mark the head and the tail of the valid data. Storing and deleting data only moves the head and tail pointers.
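The head and tail bookkeeping can be sketched on its own as follows. This is an illustrative fixed-capacity version with hypothetical names (RingBuffer, Write, Read); it tracks the head index and a byte count instead of two pointers, and it reports failure when full instead of growing.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity)
        : buf_(capacity), head_(0), count_(0) {}

    std::size_t Size() const { return count_; }

    // Appends len bytes at the tail; returns false if they do not fit.
    bool Write(const char* data, std::size_t len) {
        if (len > buf_.size() - count_) return false;
        if (len == 0) return true;
        const std::size_t tail  = (head_ + count_) % buf_.size();
        const std::size_t first = std::min(len, buf_.size() - tail);
        std::memcpy(buf_.data() + tail, data, first);         // up to the end
        std::memcpy(buf_.data(), data + first, len - first);  // wrapped part
        count_ += len;
        return true;
    }

    // Copies len bytes starting at the head into out. With consume == true
    // the bytes are also removed, which only advances the head index.
    bool Read(char* out, std::size_t len, bool consume) {
        if (len > count_) return false;
        if (len == 0) return true;
        const std::size_t first = std::min(len, buf_.size() - head_);
        std::memcpy(out, buf_.data() + head_, first);
        std::memcpy(out + first, buf_.data(), len - first);
        if (consume) {
            head_ = (head_ + len) % buf_.size();
            count_ -= len;
        }
        return true;
    }

private:
    std::vector<char> buf_;
    std::size_t head_;   // index of the first valid byte
    std::size_t count_;  // number of valid bytes
};
```

Removing a packet costs no copy at all here, which is exactly the saving over the linear buffer; reading with consume set to false corresponds to peeking at the header before the body has arrived.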
The code below illustrates this. Note: it is taken from an open source game server, and I have modified it.
int CCircularBufferPool::PutData(TCHAR *pData, int len)
{
    if (len <= 0)
        return 1;
    EnterCriticalSection(&m_cs);
    while (IsOverFlowCondition(len)) // is the remaining space enough to hold len bytes?
    {
        BufferResize(len); // if not, enlarge the buffer
    }
    if (IsIndexOverFlow(len)) // would the data run past the end of the buffer from the "tail" position?
    {
        int FirstCopyLen = m_iBufSize - m_iTailPos;
        int SecondCopyLen = len - FirstCopyLen;
        CopyMemory(m_pBuffer + m_iTailPos, pData, FirstCopyLen); // fill up to the end of the buffer
        if (SecondCopyLen)
        {
            CopyMemory(m_pBuffer, pData + FirstCopyLen, SecondCopyLen); // wrap around to the start
            m_iTailPos = SecondCopyLen;
        }
        else
            m_iTailPos = 0;
    }
    else
    {
        CopyMemory(m_pBuffer + m_iTailPos, pData, len);
        m_iTailPos += len;
    }
    LeaveCriticalSection(&m_cs);
    return 0;
}
// Copies len bytes starting at the head into pData.
// If Delete is true, the bytes are also removed by advancing the head pointer.
void CCircularBufferPool::GetData(TCHAR *pData, int len, bool Delete)
{
    if (len < m_iBufSize - m_iHeadPos)
    {
        CopyMemory(pData, m_pBuffer + m_iHeadPos, len);
        if (Delete == true)
            m_iHeadPos += len;
    }
    else // the data wraps around the end of the buffer
    {
        int fc, sc;
        fc = m_iBufSize - m_iHeadPos;
        sc = len - fc;
        CopyMemory(pData, m_pBuffer + m_iHeadPos, fc);
        if (sc) CopyMemory(pData + fc, m_pBuffer, sc);
        if (Delete == true)
            m_iHeadPos = sc;
        if (m_iHeadPos >= m_iBufSize)
            m_iHeadPos = 0;
    }
}
//
// Parse one custom packet
//
int CCircularBufferPool::GetFullPacket(TCHAR *Buf, int &nLen)
{
    EnterCriticalSection(&m_cs);
    if (GetValidCount() < m_PacketHeadLen) // the buffer does not yet hold a complete header
    {
        LeaveCriticalSection(&m_cs);
        return 1;
    }
    GetData(Buf, m_PacketHeadLen, false); // peek at the header without removing it
    PACKAGE_HEAD *p = (PACKAGE_HEAD *)Buf;
    if ((GetValidCount() - m_PacketHeadLen) < (int)p->nDataLen) // the buffer does not yet hold a complete body
    {
        LeaveCriticalSection(&m_cs);
        return 2;
    }
    // Check the legitimacy of the packet
    int IsIntegrallity = ValidatePackIntegrality(p);
    if (IsIntegrallity != 0)
    {
        LeaveCriticalSection(&m_cs);
        return IsIntegrallity;
    }
    nLen = m_PacketHeadLen + p->nDataLen; // read the length before Buf is overwritten
    GetData(Buf, nLen, true); // copy out the whole packet and remove it
    LeaveCriticalSection(&m_cs);
    return 0;
}
2. Using the underlying buffer to unpack.
TCP maintains a buffer of its own, so we can let the TCP buffer cache our data instead of allocating a buffer for each connection. We also know that recv and WSARecv take a parameter that specifies how much data we want to receive. With these two facts we can improve on the first method.
For a blocking socket, we can loop until one header's length of data has been received, parse out the member that gives the body length, and then loop again until that many bytes of body have been received.
The relevant code is as follows:
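char PackageHead[1024];
char PackageContext[1024*20];
int len;
PACKAGE_HEAD *pPackageHead;
while (m_bClose == false)
{
    memset(PackageHead, 0, sizeof(PACKAGE_HEAD));
    len = m_tcpSock.ReceiveSize((char*)PackageHead, sizeof(PACKAGE_HEAD)); // receive exactly one header
    if (len == SOCKET_ERROR)
    {
        break;
    }
    if (len == 0)
    {
        break;
    }
    pPackageHead = (PACKAGE_HEAD *)PackageHead;
    memset(PackageContext, 0, sizeof(PackageContext));
    if (pPackageHead->nDataLen > sizeof(PackageContext)) // body would not fit in the buffer: treat as illegal
    {
        break;
    }
    if (pPackageHead->nDataLen > 0)
    {
        len = m_tcpSock.ReceiveSize((char*)PackageContext, pPackageHead->nDataLen); // receive exactly one body
        if (len == SOCKET_ERROR || len == 0)
        {
            break;
        }
    }
    // PackageHead and PackageContext now hold one complete packet
}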
m_tcpSock is an instance of a class that wraps a socket. Its ReceiveSize method receives a fixed amount of data: it does not return until the requested number of bytes has been received or a network error occurs.
int WinSocket::ReceiveSize(char* strData, int iLen)
{
    if (strData == NULL)
        return ERR_BADPARAM;
    char *p = strData;
    int len = iLen;    // bytes still to be received
    int ret = 0;
    int returnLen = 0; // bytes received so far
    while (len > 0)
    {
        ret = recv(m_hSocket, p + (iLen - len), iLen - returnLen, 0);
        if (ret == SOCKET_ERROR || ret == 0) // network error, or the peer closed the connection
        {
            return ret;
        }
        len -= ret;
        returnLen += ret;
    }
    return returnLen;
}
For a non-blocking socket, for example with a completion port, we can post a request to receive one header's length of data. When GetQueuedCompletionStatus returns, we check whether the number of bytes received equals the length requested. If it does, we post a request to receive the body, using the body length from the header. If it does not, we post a request to receive the remaining data. The body is received in the same way.
The relevant code is given below.
enum IOType
{
    IOInitialize,
    IORead,
    IOWrite,
    IOIdle
};
class OverlappedPlus
{
public:
    OVERLAPPED m_ol;
    IOType m_ioType;
    BOOL m_bIsPackageHead; // whether the data currently being received is header data
    int m_count;
    WSABUF m_wsaBuffer;
    int m_recvPos;
    char m_buffer[1024*8]; // this buffer should be as large as possible
    OverlappedPlus(IOType ioType) {
        ZeroMemory(this, sizeof(OverlappedPlus));
        m_ioType = ioType;
    }
};
The first receive request, issued right after the connection is established, asks for exactly one header's length of data.
OverlappedPlus *pOverlappedPlus = new OverlappedPlus(IORead);
pOverlappedPlus->m_wsaBuffer.buf = pOverlappedPlus->m_buffer;
pOverlappedPlus->m_wsaBuffer.len = PACKAGE_HEAD_LEN; // length of the header
pOverlappedPlus->m_bIsPackageHead = true;
pOverlappedPlus->m_recvPos = 0;
pOverlappedPlus->m_ioType = IORead;
DWORD RecvBytes;
DWORD Flags;
Flags = 0;
if (WSARecv(ClientSocket, &(pOverlappedPlus->m_wsaBuffer), 1, &RecvBytes, &Flags,
            &pOverlappedPlus->m_ol, NULL) == SOCKET_ERROR)
{
    if (WSAGetLastError() != ERROR_IO_PENDING)
    {
        delete pOverlappedPlus;
        // related error handling
    }
    else
    {
        // the request is pending; its completion is reported by GetQueuedCompletionStatus
    }
}
else
{
    // the receive completed immediately; by default its completion is still queued to the completion port
}
In the function that calls GetQueuedCompletionStatus:
if (pOverlapPlus->m_ioType == IORead)
{
    if (pOverlapPlus->m_wsaBuffer.len == dwIoSize) // everything requested has arrived
    {
        if (pOverlapPlus->m_bIsPackageHead == true) // what was received is the header
        {
            PACKAGE_HEAD *pPackageHead = (PACKAGE_HEAD *)(pOverlapPlus->m_buffer);
            if (pThis->IsLegalityPackageHead(pPackageHead) == false) // check whether the header is legal
            {
                closesocket(lpClientContext->m_socket);
                continue;
            }
            // next, request the body and store it right after the header
            pOverlapPlus->m_bIsPackageHead = false;
            pOverlapPlus->m_wsaBuffer.len = pPackageHead->nDataLen;
            pOverlapPlus->m_recvPos += dwIoSize;
            pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer + pOverlapPlus->m_recvPos;
        }
        else // what was received is the body
        {
            pOverlapPlus->m_recvPos += dwIoSize;
            // At this point pOverlapPlus->m_buffer holds one complete packet
            // whose length is pOverlapPlus->m_recvPos.
            // Go on to request the header of the next packet.
            pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer;
            memset(pOverlapPlus->m_buffer, 0, sizeof(pOverlapPlus->m_buffer));
            pOverlapPlus->m_wsaBuffer.len = PACKAGE_HEAD_LEN;
            pOverlapPlus->m_bIsPackageHead = true;
            pOverlapPlus->m_recvPos = 0;
        }
    }
    else // the data received so far is not complete yet: request the rest
    {
        pOverlapPlus->m_wsaBuffer.len -= dwIoSize;
        pOverlapPlus->m_recvPos += dwIoSize;
        pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer + pOverlapPlus->m_recvPos;
    }
    pOverlapPlus->m_ioType = IORead;
    state = WSARecv(lpClientContext->m_socket, &(pOverlapPlus->m_wsaBuffer), 1, &RecvBytes, &Flags,
                    &pOverlapPlus->m_ol, NULL);
    if (state == SOCKET_ERROR)
    {
        if (WSAGetLastError() != ERROR_IO_PENDING)
        {
            // close the socket and release the associated resources
            continue;
        }
    }
}
IV. How to judge the legitimacy of a packet.
The legitimacy of a packet can be judged in the following two ways. To identify illegal packets with 100% certainty, however, you need techniques from information security, which I will not go into here.
1. Judging the legitimacy of a packet from the header structure.
Originally I judged the validity of a packet from its header alone, for example whether Command is outside the range of valid commands and whether nDataLen exceeds the maximum packet length. This method cannot filter out every illegal packet, though. When an illegal packet is detected, the only thing we can do is disconnect, which is perhaps also the best way to handle it.
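A header check of this kind might look like the sketch below. The limits kMaxCommand and kMaxBodyLen and the function name IsHeaderPlausible are assumptions chosen for illustration; a real protocol would use its own values.

```cpp
#include <cstdint>

#pragma pack(push, 1)
struct PacketHeader {          // hypothetical, same fields as the article's header
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;     // length of the body in bytes
};
#pragma pack(pop)

// Assumed limits, for illustration only.
const std::uint16_t kMaxCommand = 32;    // valid commands are 1..kMaxCommand
const std::uint16_t kMaxBodyLen = 4096;  // largest body the receiver accepts

// Returns false when the header cannot belong to a legal packet.
// The caller is expected to disconnect in that case.
bool IsHeaderPlausible(const PacketHeader& head)
{
    if (head.command == 0 || head.command > kMaxCommand)
        return false;                    // command outside the valid range
    if (head.dataLen > kMaxBodyLen)
        return false;                    // body longer than the maximum
    return true;
}
```

Checking the length before receiving the body also keeps a bogus header from making the receiver read far more data than its buffer can hold.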
We can also give every complete packet a start flag and an end flag; a flag can be an integer or a short string. Take the first unpacking method as an example. To split off a complete packet, we first search for the start flag beginning at the address of the head of the valid data in the buffer. Once it is found and enough data for a header follows it, we check whether the start flag and the header are legal. If they are, we use the body length in the header to locate the end of the packet and check whether the tail flag there matches the one we defined. If it matches, the packet is legitimate. If not, we go on searching for the start flag of the next packet and discard the data in front of it.
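The flag search can be sketched as follows, under assumptions that are mine and not part of the method as described: a frame is laid out as start flag, header, body, end flag; both flags are two bytes with made-up values; and the data sits in a plain std::vector rather than the buffer classes shown earlier.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

#pragma pack(push, 1)
struct PacketHeader {          // hypothetical, same fields as the article's header
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;     // length of the body in bytes
};
#pragma pack(pop)

// Assumed flag values and size limit, for illustration only.
const unsigned char kStartFlag[2] = {0xAA, 0x55};
const unsigned char kEndFlag[2]   = {0x55, 0xAA};
const std::size_t   kMaxBodyLen   = 4096;

// Looks for the next legal frame in buf. On success, copies header + body
// into packet, removes the frame and any garbage before it, and returns true.
// Returns false when more data is needed; garbage already ruled out is dropped.
bool NextFrame(std::vector<char>& buf, std::vector<char>& packet)
{
    const std::size_t minFrame = 2 + sizeof(PacketHeader) + 2;
    std::size_t pos = 0;
    while (buf.size() - pos >= minFrame) {
        if (std::memcmp(buf.data() + pos, kStartFlag, 2) != 0) { ++pos; continue; }
        PacketHeader head;
        std::memcpy(&head, buf.data() + pos + 2, sizeof(head));
        if (head.dataLen > kMaxBodyLen) { ++pos; continue; }   // illegal header
        const std::size_t frameLen = minFrame + head.dataLen;
        if (buf.size() - pos < frameLen) break;                // wait for the rest
        if (std::memcmp(buf.data() + pos + frameLen - 2, kEndFlag, 2) != 0) {
            ++pos;                                             // tail flag mismatch
            continue;
        }
        packet.assign(buf.begin() + pos + 2, buf.begin() + pos + frameLen - 2);
        buf.erase(buf.begin(), buf.begin() + pos + frameLen);
        return true;
    }
    buf.erase(buf.begin(), buf.begin() + pos);  // discard data before the candidate
    return false;
}
```

On a mismatch the search resumes one byte after the failed start flag rather than after the whole suspect frame, so a real frame that begins inside the damaged region is not skipped.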
2. Judging the legitimacy of a packet in the logic layer.
After taking out a well-formed packet, we still have to judge its validity against the logic of what is currently being processed. For example, if the server receives another login packet from the same client some time after that client has already logged in successfully, we can conclude that the packet is illegal; the simple way to handle it is to disconnect.
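The login example can be expressed as a small per-connection state check. The command values and the names Session and AcceptCommand below are hypothetical; the point is only that the same packet can be legal in one state and illegal in another.

```cpp
#include <cstdint>

// Assumed command codes, for illustration only.
const std::uint16_t kCmdLogin  = 1;
const std::uint16_t kCmdLogout = 2;

enum class SessionState { Connected, LoggedIn };

struct Session {
    SessionState state = SessionState::Connected;
};

// Returns false when the command is illegal in the current state;
// the caller is expected to disconnect the client in that case.
bool AcceptCommand(Session& session, std::uint16_t command)
{
    switch (session.state) {
    case SessionState::Connected:
        if (command != kCmdLogin)
            return false;               // nothing but login is allowed yet
        session.state = SessionState::LoggedIn;
        return true;
    case SessionState::LoggedIn:
        if (command == kCmdLogin)
            return false;               // a second login packet is illegal
        if (command == kCmdLogout)
            session.state = SessionState::Connected;
        return true;
    }
    return false;
}
```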