Packing and unpacking packets over sockets

Source: Internet
Author: User
Tags bool error handling memory usage pack socket split ftp protocol

When developing a communication program on top of TCP, one very important problem has to be solved: packing and unpacking packets. Here are my thoughts on the problem. If anything is wrong, please correct me. Thanks in advance.

I. Why TCP-based communication programs need packing and unpacking

TCP is a "stream" protocol. A stream is a sequence of bytes with no boundaries. Think of the water in a river: it is one continuous body with no dividing lines in it. Application protocols, however, usually need to define discrete packets, such as a packet for logging in and a packet for logging off. Because TCP is a stream, and depending on network conditions, the data can arrive in several different ways.
Suppose we call send twice in a row to send two pieces of data, data1 and data2. The receiving end may see any of the following (these are not the only possibilities, just representative ones):
A. data1 is received first, then data2.
B. Part of data1 is received first, then the rest of data1 together with all of data2.
C. All of data1 and part of data2 are received first, then the rest of data2.
D. All of data1 and data2 are received at once.

Case A is exactly what we want, so it needs no further discussion. Cases B, C and D are what is commonly called "sticky packets": the received data has to be split back into separate packets. To make that splitting possible, each packet must be packed (framed) on the sending side.
In addition: UDP does not have this unpacking problem, because UDP is a "datagram" protocol. Each piece of data has boundaries, and the receiving end either receives nothing or receives one complete piece of data, never less and never more.

II. Why cases B, C and D occur
Sticky packets can be caused on the sending side or on the receiving side.
1. Sticky packets caused on the sending side by the Nagle algorithm: the Nagle algorithm improves network transmission efficiency. Put simply, when we hand a small piece of data to TCP, TCP may not send it immediately. It holds the data back for a short time, and if more data is submitted while it waits, the pieces are sent together. This is only a simplified explanation of the Nagle algorithm; consult the relevant books for details. Cases C and D are probably caused by the Nagle algorithm.
2. Sticky packets caused on the receiving side by not receiving promptly: TCP places received data in its own buffer and then notifies the application layer to fetch it. If, for some reason, the application layer does not fetch the data in time, several pieces of data accumulate in the TCP buffer.

III. How to pack and unpack
When I first ran into the "sticky packet" problem, I worked around it by calling sleep between the two send calls. The drawbacks are obvious: transmission efficiency drops sharply, and it is still not reliable. Later I solved it with an acknowledgement scheme, in which the receiver replies before the next packet is sent. That works most of the time, but it does not handle case B, and the acknowledgements add traffic and increase the network load (although protocols such as FTP do use this request/reply approach). In the end I settled on packing each packet on the sending side and unpacking on the receiving side.
Packing:
Packing means prefixing a piece of data with a header, so that a packet consists of two parts: a header and a body (a "tail" is added later, when filtering illegal packets is discussed). The header is a fixed-size struct. One of its members holds the length of the body; this is the essential member, and the other members can be defined as needed. From the fixed length of the header and the body length stored in it, a complete packet can be split out correctly.
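
To make the header-plus-body layout concrete, here is a minimal, portable C++ sketch of packing. It assumes the header layout used later in this article (a one-byte version, a two-byte command, a two-byte body length). `MakePacket` and the fixed version value of 1 are hypothetical, chosen only for illustration, and the multi-byte fields are written in host byte order, so both ends are assumed to share the same endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

#pragma pack(push, 1)
struct PackageHead {           // assumed layout: 1 + 2 + 2 = 5 bytes
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;     // length of the body in bytes
};
#pragma pack(pop)

// Hypothetical helper: prefix `body` with a header and return the bytes to send.
std::vector<char> MakePacket(std::uint16_t command, const std::string& body) {
    assert(body.size() <= 0xFFFF);             // dataLen is only 16 bits wide
    PackageHead head{};
    head.version = 1;                          // illustrative value
    head.command = command;
    head.dataLen = static_cast<std::uint16_t>(body.size());

    std::vector<char> packet(sizeof(head) + body.size());
    std::memcpy(packet.data(), &head, sizeof(head));
    if (!body.empty())
        std::memcpy(packet.data() + sizeof(head), body.data(), body.size());
    return packet;                             // header immediately followed by body
}
```

The whole vector would then be handed to send in one call. The receiver reads the five header bytes first and learns from `dataLen` how many body bytes belong to the same packet.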
For unpacking, I most commonly use the following two methods.
1. Dynamic buffer staging. The buffer is called dynamic because it grows when the data to be buffered exceeds its current length.
The process is roughly as follows:
A. Dynamically allocate a buffer for each connection and associate it with the socket, usually through a struct.
B. When data is received, first append it to the buffer.
C. Check whether the buffer holds at least one header's worth of data; if not, do not unpack yet.
D. Parse the header to obtain the member holding the body length.
E. Check whether the buffer holds at least one whole packet (header plus body); if not, do not unpack yet.
F. Take out the whole packet. "Take out" here means not only copying the packet out of the buffer but also deleting it from the buffer. The packet is deleted by moving the data behind it to the start address of the buffer.
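
Steps A to F can be sketched in portable C++ as follows. This is only an illustration under assumptions: `StagingBuffer`, `Append` and `TakePacket` are hypothetical names, `std::vector` stands in for the hand-managed buffer, locking is omitted, and the header layout is the five-byte one used in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

#pragma pack(push, 1)
struct PackageHead {
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;    // length of the body in bytes
};
#pragma pack(pop)

// Hypothetical per-connection staging buffer (step A: one instance per socket).
class StagingBuffer {
public:
    // Step B: append whatever the last receive call returned.
    void Append(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        buf_.insert(buf_.end(), data, data + len);
    }

    // Steps C to F: returns true and fills `packet` once a whole packet is buffered.
    bool TakePacket(std::vector<char>& packet) {
        if (buf_.size() < sizeof(PackageHead)) return false;     // step C
        PackageHead head;
        std::memcpy(&head, buf_.data(), sizeof(head));           // step D
        const std::size_t total = sizeof(head) + head.dataLen;
        if (buf_.size() < total) return false;                   // step E
        const auto end = buf_.begin() + static_cast<std::ptrdiff_t>(total);
        packet.assign(buf_.begin(), end);                        // step F: copy out
        buf_.erase(buf_.begin(), end);                           // step F: shift the rest forward
        return true;
    }

private:
    std::vector<char> buf_;
};
```

A receive loop would call `Append` after every successful receive and then call `TakePacket` repeatedly until it returns false, since one receive may deliver several packets at once (case D).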

This method has two drawbacks. 1. Dynamically allocating a buffer for each connection increases memory usage. 2. Data is copied in three places: once to store the data in the buffer, once to take the complete packet out of the buffer, and once to delete the packet from the buffer. The improved methods described later address some of these shortcomings.

The relevant code is given below.

First, the definition of the header struct:

#pragma pack(push, 1) // start of packet definitions: use 1-byte alignment
/*----------------------Header---------------------*/
typedef struct tagPackageHead
{
    BYTE Version;
    WORD Command;
    WORD nDataLen; // length of the body
} PACKAGE_HEAD;
#pragma pack(pop) // end of packet definitions: restore the original alignment

Next, the functions that store data and "take out" data:

/*****************************************************************************
Description: Add data to the buffer.
Input: pBuff[in] - data to add; nLen[in] - length of the data to add
Return: FALSE if nLen is negative, or if the buffer did not have enough free
        space and had to be grown (the data is still stored); otherwise TRUE.
******************************************************************************/
BOOL CDataBufferPool::AddBuff(char *pBuff, int nLen)
{
    m_cs.Lock(); // enter the critical section

    if (nLen < 0)
    {
        m_cs.Unlock();
        return FALSE;
    }

    if (nLen <= GetFreeSize()) // is the free space enough to store nLen bytes?
    {
        memcpy(m_pBuff + m_nOffset, pBuff, nLen);
        m_nOffset += nLen;
    }
    else // not enough: grow the buffer
    {
        char *p = m_pBuff;
        m_nSize += nLen * 2; // grow by 2*nLen each time
        m_pBuff = new char[m_nSize];
        memcpy(m_pBuff, p, m_nOffset);
        delete[] p;
        memcpy(m_pBuff + m_nOffset, pBuff, nLen);
        m_nOffset += nLen;
        m_cs.Unlock();
        return FALSE;
    }
    m_cs.Unlock();
    return TRUE;
}

/*****************************************************************************
Description: Get one complete packet.
Input: buf[out] - the packet obtained; nLen[out] - length of the packet obtained
Return: 0 - a complete packet was copied to buf;
        1 - the buffer does not yet hold a full header;
        2 - the buffer does not yet hold a full packet
******************************************************************************/

int CDataBufferPool::GetFullPacket(char *buf, int& nLen)
{
    m_cs.Lock();

    if (m_nOffset < m_PacketHeadLen) // not enough data for one header
    {
        m_cs.Unlock();
        return 1;
    }
    PACKAGE_HEAD *p = (PACKAGE_HEAD *)m_pBuff;
    if ((m_nOffset - m_PacketHeadLen) < (int)p->nDataLen) // not enough data for one whole packet
    {
        m_cs.Unlock();
        return 2;
    }
    // Check the legality of the packet
    /* int IsIntegrallity = ValidatePackIntegrality(p);
    if (IsIntegrallity != 0)
    {
        m_cs.Unlock();
        return IsIntegrallity;
    }
    */
    nLen = m_PacketHeadLen + p->nDataLen;
    memcpy(buf, m_pBuff, nLen);
    m_nOffset -= nLen;
    // Source and destination can overlap here, so memmove is used instead of memcpy.
    memmove(m_pBuff, m_pBuff + nLen, m_nOffset);

    m_cs.Unlock();
    return 0;
}


The drawbacks of this method were mentioned earlier. The following is an improved version that uses a ring (circular) buffer. This improvement does not remove the first drawback (a buffer per connection) or the first copy (storing the data in the buffer). It only removes the third copy, the one made when deleting a packet from the buffer, which is also the place where the most data is copied.

The second unpacking method, described later, solves both of the remaining problems.
A ring buffer is implemented with two pointers that point to the head and the tail of the valid data. Storing and deleting data only moves the head and tail pointers. The code below illustrates this. Note: it is adapted from the code of an open-source game server, with my own modifications.

int CCircularBufferPool::PutData(TCHAR *pData, int len)
{
    if (len <= 0)
        return 1;

    EnterCriticalSection(&m_cs);
    while (IsOverFlowCondition(len)) // is the free space enough to hold len bytes?
    {
        BufferResize(len); // not enough: grow the buffer
    }

    if (IsIndexOverFlow(len)) // check the tail position: does the write wrap around?
    {
        int FirstCopyLen = m_iBufSize - m_iTailPos;
        int SecondCopyLen = len - FirstCopyLen;
        CopyMemory(m_pBuffer + m_iTailPos, pData, FirstCopyLen);
        if (SecondCopyLen)
        {
            CopyMemory(m_pBuffer, pData + FirstCopyLen, SecondCopyLen);
            m_iTailPos = SecondCopyLen;
        }
        else
            m_iTailPos = 0;
    }
    else
    {
        CopyMemory(m_pBuffer + m_iTailPos, pData, len);
        m_iTailPos += len;
    }

    LeaveCriticalSection(&m_cs);
    return 0;
}


void CCircularBufferPool::GetData(TCHAR *pData, int len, bool Delete)
{
    if (len < m_iBufSize - m_iHeadPos)
    {
        CopyMemory(pData, m_pBuffer + m_iHeadPos, len);
        if (Delete == true)
            m_iHeadPos += len;
    }
    else // the requested data wraps around the end of the buffer
    {
        int fc, sc;
        fc = m_iBufSize - m_iHeadPos; // bytes from the head to the end of the buffer
        sc = len - fc;                // bytes continuing from the start of the buffer
        CopyMemory(pData, m_pBuffer + m_iHeadPos, fc);
        if (sc) CopyMemory(pData + fc, m_pBuffer, sc);
        if (Delete == true)
            m_iHeadPos = sc;
        if (m_iHeadPos >= m_iBufSize)
            m_iHeadPos = 0;
    }
}

//
// Parse out one custom packet
//
int CCircularBufferPool::GetFullPacket(TCHAR *buf, int &nLen)
{
    EnterCriticalSection(&m_cs);
    if (GetValidCount() < m_PacketHeadLen) // not enough data for one header
    {
        LeaveCriticalSection(&m_cs);
        return 1;
    }

    GetData(buf, m_PacketHeadLen, false); // peek at the header without removing it
    PACKAGE_HEAD *p = (PACKAGE_HEAD *)buf;
    if ((GetValidCount() - m_PacketHeadLen) < (int)p->nDataLen) // not enough data for one whole packet
    {
        LeaveCriticalSection(&m_cs);
        return 2;
    }

    // Check the legality of the packet
    int IsIntegrallity = ValidatePackIntegrality(p);
    if (IsIntegrallity != 0)
    {
        LeaveCriticalSection(&m_cs);
        return IsIntegrallity;
    }

    GetData(buf, m_PacketHeadLen + p->nDataLen, true); // copy out the whole packet and remove it
    nLen = m_PacketHeadLen + p->nDataLen;

    LeaveCriticalSection(&m_cs);

    return 0;
}
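
The head/tail bookkeeping of a ring buffer can also be shown in a small portable form. The sketch below is not the game-server code above: `RingBuffer`, `Put` and `Get` are hypothetical names, the capacity is fixed rather than growing on demand, and locking is omitted. It keeps an explicit byte count so that a full buffer and an empty buffer can be told apart.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed-capacity ring buffer.
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity) {
        assert(capacity > 0);
    }

    std::size_t Size() const { return count_; }

    // Copies `len` bytes in at the tail; refuses the write if it does not fit.
    bool Put(const char* data, std::size_t len) {
        if (len > buf_.size() - count_) return false;
        const std::size_t first = std::min(len, buf_.size() - tail_);  // up to the physical end
        std::memcpy(buf_.data() + tail_, data, first);
        std::memcpy(buf_.data(), data + first, len - first);            // wrapped remainder
        tail_ = (tail_ + len) % buf_.size();
        count_ += len;
        return true;
    }

    // Copies `len` bytes out from the head; the head only moves if `remove` is true.
    bool Get(char* out, std::size_t len, bool remove) {
        if (len > count_) return false;
        const std::size_t first = std::min(len, buf_.size() - head_);
        std::memcpy(out, buf_.data() + head_, first);
        std::memcpy(out + first, buf_.data(), len - first);
        if (remove) {                       // "deleting" data is just an index update
            head_ = (head_ + len) % buf_.size();
            count_ -= len;
        }
        return true;
    }

private:
    std::vector<char> buf_;
    std::size_t head_ = 0, tail_ = 0, count_ = 0;
};
```

Calling `Get` with `remove` set to false corresponds to peeking at the header, and calling it with true corresponds to taking out the whole packet. Neither call moves the remaining data, which is the saving over the first version.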

2. Using the underlying buffer for unpacking
Because TCP itself maintains a buffer, we can let the TCP buffer hold our data, which makes it unnecessary to allocate a buffer for each connection. In addition, recv and WSARecv take a parameter that specifies how much data we want to receive. With these two facts we can optimize the first method.
For a blocking socket, we can receive header-length data in a loop, parse out the member holding the body length, and then receive body-length data in a loop.
The relevant code is as follows:
   
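char PackageHead[1024];
char PackageContext[1024 * 20];

int len;
PACKAGE_HEAD *pPackageHead;
while (m_bClose == false)
{
    memset(PackageHead, 0, sizeof(PACKAGE_HEAD));
    len = m_TcpSock.ReceiveSize((char*)PackageHead, sizeof(PACKAGE_HEAD));
    if (len == SOCKET_ERROR)
    {
        break;
    }
    if (len == 0)
    {
        break;
    }
    pPackageHead = (PACKAGE_HEAD *)PackageHead;
    memset(PackageContext, 0, sizeof(PackageContext));
    if (pPackageHead->nDataLen > sizeof(PackageContext))
    {
        break; // the declared body length does not fit in PackageContext
    }
    if (pPackageHead->nDataLen > 0)
    {
        len = m_TcpSock.ReceiveSize((char*)PackageContext, pPackageHead->nDataLen);
        if (len == SOCKET_ERROR || len == 0)
        {
            break; // error or connection closed while receiving the body
        }
    }
    // A complete packet is now available: header in PackageHead, body in PackageContext.
}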

m_TcpSock is an instance of a class that wraps a socket. Its ReceiveSize method receives a given length of data and returns only when that much data has been received or a network error occurs.

int WinSocket::ReceiveSize(char* strData, int iLen)
{
    if (strData == NULL)
        return ERR_BADPARAM;
    char *p = strData;
    int len = iLen;      // bytes still to receive
    int ret = 0;
    int returnlen = 0;   // bytes received so far
    while (len > 0)
    {
        ret = recv(m_hSocket, p + (iLen - len), iLen - returnlen, 0);
        if (ret == SOCKET_ERROR || ret == 0)
        {
            return ret; // error, or the peer closed the connection
        }

        len -= ret;
        returnlen += ret;
    }

    return returnlen;
}
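
The same "receive exactly N bytes" loop can be exercised without a real socket. In the sketch below, `ReadFn` is a hypothetical stand-in for recv (it returns the number of bytes read, 0 when the peer has closed, or a negative value on error) and `ReadExactly` is a hypothetical name for the loop. It is meant only to show why the loop is needed: a single read may return fewer bytes than requested.

```cpp
#include <cassert>
#include <functional>

// Stand-in for recv(): bytes read (> 0), 0 on orderly close, negative on error.
using ReadFn = std::function<int(char* buf, int maxLen)>;

// Hypothetical helper: keep reading until exactly `want` bytes have arrived.
// Returns `want` on success, otherwise the failing read's return value.
int ReadExactly(const ReadFn& read, char* out, int want) {
    assert(out != nullptr && want >= 0);
    int got = 0;
    while (got < want) {
        const int ret = read(out + got, want - got);  // never ask for more than is missing
        if (ret <= 0) return ret;                     // closed or failed before completion
        got += ret;
    }
    return got;
}
```

With a real blocking socket, the callable would wrap a call to recv on that socket. The header is read with `want` set to the header size, and the body with `want` set to the length found in the header.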
For a non-blocking socket, for example with an I/O completion port, we can post a request to receive header-length data. When GetQueuedCompletionStatus returns, we check whether the length of the received data equals the requested length. If it does, we post a request to receive body-length data; if it does not, we post a request to receive the remaining data. The same approach is used when receiving the body.
The relevant code is given below:

enum IOType
{
    IOInitialize,
    IORead,
    IOWrite,
    IOIdle
};

class OverlappedPlus
{
public:
    OVERLAPPED m_ol;
    IOType m_ioType;
    BOOL m_bIsPackageHead; // whether the data currently being received is header data

    int m_count;
    WSABUF m_wsaBuffer;
    int m_recvPos;
    char m_buffer[1024 * 8]; // make this buffer as large as practical

    OverlappedPlus(IOType ioType) {
        ZeroMemory(this, sizeof(OverlappedPlus));
        m_ioType = ioType;
    }
};
After the connection is established, post the first request, asking to receive header-size data:
OverlappedPlus *pOverlappedPlus = new OverlappedPlus(IORead);
pOverlappedPlus->m_wsaBuffer.buf = pOverlappedPlus->m_buffer;
pOverlappedPlus->m_wsaBuffer.len = PACKAGE_HEAD_LEN; // length of the header
pOverlappedPlus->m_bIsPackageHead = true;
pOverlappedPlus->m_recvPos = 0;
pOverlappedPlus->m_ioType = IORead;

DWORD RecvBytes;
DWORD Flags;
Flags = 0;
if (WSARecv(ClientSocket, &(pOverlappedPlus->m_wsaBuffer), 1, &RecvBytes, &Flags,
            &pOverlappedPlus->m_ol, NULL) == SOCKET_ERROR)
{
    if (WSAGetLastError() != ERROR_IO_PENDING)
    {
        // the request failed: release the context and do the related error handling
        delete pOverlappedPlus;
    }
    else
    {
        // the request is pending; its completion is reported later
    }
}
else
{
    // the receive completed immediately; related handling
}


In the function that calls GetQueuedCompletionStatus:
if (pOverlapPlus->m_ioType == IORead)
{
    if (pOverlapPlus->m_wsaBuffer.len == dwIoSize)
    {
        if (pOverlapPlus->m_bIsPackageHead == true) // what was received is a header
        {
            PACKAGE_HEAD *pPackageHead = (PACKAGE_HEAD *)(pOverlapPlus->m_buffer);

            if (pThis->IsLegalityPackageHead(pPackageHead) == false) // check whether the header is legal
            {
                closesocket(lpClientContext->m_Socket);
                continue;
            }

            pOverlapPlus->m_bIsPackageHead = false;
            pOverlapPlus->m_wsaBuffer.len = pPackageHead->nDataLen;
            pOverlapPlus->m_recvPos += dwIoSize;
            pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer + pOverlapPlus->m_recvPos;
        }
        else // what was received is a body
        {
            pOverlapPlus->m_recvPos += dwIoSize;
            // At this point m_buffer holds one complete packet of length m_recvPos.
            // Process it here, before the buffer is cleared below.

            // Go on to request the header of the next packet.
            pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer;
            memset(pOverlapPlus->m_buffer, 0, sizeof(pOverlapPlus->m_buffer));
            pOverlapPlus->m_wsaBuffer.len = PACKAGE_HEAD_LEN;
            pOverlapPlus->m_bIsPackageHead = true;
            pOverlapPlus->m_recvPos = 0;
        }
    }
    else // the requested data has not been fully received yet
    {
        pOverlapPlus->m_wsaBuffer.len -= dwIoSize;
        pOverlapPlus->m_recvPos += dwIoSize;
        pOverlapPlus->m_wsaBuffer.buf = pOverlapPlus->m_buffer + pOverlapPlus->m_recvPos;
    }
    pOverlapPlus->m_ioType = IORead;
    state = WSARecv(lpClientContext->m_Socket, &(pOverlapPlus->m_wsaBuffer), 1, &RecvBytes, &Flags,
                    &pOverlapPlus->m_ol, NULL);
    if (state == SOCKET_ERROR)
    {
        if (WSAGetLastError() != ERROR_IO_PENDING)
        {
            // close the socket and release the associated resources
            continue;
        }
    }
}
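
The header/body bookkeeping in the handler above can be separated from the Winsock calls and tested on its own. The following is a portable sketch under assumptions: `RecvState`, `RecvResult` and `OnBytesReceived` are hypothetical names, the header is the five-byte layout used in this article, and a zero-length body is treated as a complete packet. The bounds check on the body length is an addition that the original handler leaves to IsLegalityPackageHead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct PackageHead {
    std::uint8_t  version;
    std::uint16_t command;
    std::uint16_t dataLen;    // length of the body in bytes
};
#pragma pack(pop)

// Hypothetical receive state, a portable analogue of the fields in OverlappedPlus.
struct RecvState {
    bool        isHead    = true;                 // currently receiving a header?
    std::size_t recvPos   = 0;                    // bytes of the current packet stored so far
    std::size_t remaining = sizeof(PackageHead);  // bytes the next receive should request
    char        buffer[1024 * 8];
};

enum class RecvResult { NeedMore, PacketReady, BadHeader };

// Call after a receive has stored `n` bytes at buffer + recvPos.
// On PacketReady, buffer[0, packetLen) holds one packet and the state is reset.
RecvResult OnBytesReceived(RecvState& s, std::size_t n, std::size_t& packetLen) {
    assert(n > 0 && n <= s.remaining);
    s.recvPos   += n;
    s.remaining -= n;
    if (s.remaining > 0) return RecvResult::NeedMore;   // short read: request the rest
    if (s.isHead) {
        PackageHead head;
        std::memcpy(&head, s.buffer, sizeof(head));
        if (sizeof(head) + head.dataLen > sizeof(s.buffer))
            return RecvResult::BadHeader;               // body would overflow the buffer
        if (head.dataLen > 0) {
            s.isHead    = false;                        // header done, now request the body
            s.remaining = head.dataLen;
            return RecvResult::NeedMore;
        }
    }
    packetLen   = s.recvPos;
    s.isHead    = true;                                 // the next receive starts a new header
    s.recvPos   = 0;
    s.remaining = sizeof(PackageHead);
    return RecvResult::PacketReady;
}
```

After each call, the next receive request would use `buffer + recvPos` as its destination and `remaining` as its length, which mirrors how the handler above updates m_wsaBuffer.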

IV. How to judge the legitimacy of a packet
The legitimacy of a packet can be judged in the following two ways. Identifying 100% of illegal packets, however, requires information security techniques, which are not covered here.
1. Judging legitimacy from the structure of the packet.
Originally I judged validity from the header alone, for example whether Command is outside the range of defined commands, or whether nDataLen is greater than the maximum packet length. This cannot filter out every illegal packet. When an illegal packet is detected, the only thing we can do is disconnect, and that is probably also the best way to handle it.
We can also give each complete packet a start flag and an end flag. A flag can be an integer or a string. Take the first unpacking method as an example. To split out a complete packet, we first search for the start flag, starting from the head of the valid data in the buffer. Once it is found and the remaining data is enough for a header, we check whether the start flag and the header are legal. If they are, we use the body length from the header to locate the end of the packet and check whether the tail flag matches our definition. If it matches, the packet is legitimate. If it does not, we go on to look for the start flag of the next packet and discard the data in front of the next start flag.
2. Judging validity at the logic layer.
After a legitimate packet has been taken out, its validity also has to be judged against the logic of the current session. For example, if the server receives another login packet from the same client some time after that client has already logged in successfully, we can decide that the packet is illegal; the simple way to handle it is to disconnect.
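
The start-flag and end-flag check from method 1 can be sketched as follows. Everything specific here is an assumption made for illustration: the frame layout (a two-byte start flag, a two-byte body length in host byte order, the body, a two-byte end flag), the flag values, the `kMaxBody` limit and the name `ExtractFrame`. The article's real header also carries Version and Command, which are omitted to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Assumed frame layout: [START: 2][bodyLen: 2][body: bodyLen][END: 2]
constexpr std::uint16_t kStart    = 0xA55A;
constexpr std::uint16_t kEnd      = 0x5AA5;
constexpr std::size_t   kOverhead = 6;      // bytes in a frame besides the body
constexpr std::size_t   kMaxBody  = 4096;   // larger declared lengths are treated as illegal

static std::uint16_t ReadU16(const std::vector<char>& b, std::size_t at) {
    std::uint16_t v;
    std::memcpy(&v, b.data() + at, sizeof(v));
    return v;
}

// Extracts one valid body from `buf`, discarding bytes that cannot start a frame.
// Returns false when more data is needed before a decision can be made.
bool ExtractFrame(std::vector<char>& buf, std::string& body) {
    std::size_t pos = 0;
    while (pos + kOverhead <= buf.size()) {
        if (ReadU16(buf, pos) != kStart) { ++pos; continue; }          // keep searching
        const std::size_t len = ReadU16(buf, pos + 2);
        if (len > kMaxBody) { ++pos; continue; }                       // illegal length: resync
        const std::size_t total = kOverhead + len;
        if (pos + total > buf.size()) break;                           // frame not complete yet
        if (ReadU16(buf, pos + 4 + len) != kEnd) { ++pos; continue; }  // bad tail: resync
        body.assign(buf.data() + pos + 4, len);
        buf.erase(buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(pos + total));
        return true;
    }
    // Drop everything already ruled out; keep a possible partial frame.
    buf.erase(buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(pos));
    return false;
}
```

A flag value can also occur by chance inside a body, so this kind of check reduces the number of illegal packets accepted but does not eliminate them. That is consistent with the point above that complete filtering needs security techniques.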
