Large file transfer (stream form) based on TCP protocol under Windows

Last Update:2016-06-21 Source: Internet

Author: User


A simple implementation of efficient large-file transfer over TCP

Transferring a large file over TCP is not like transferring a small one, where the whole file can be packed into a single buffer and sent at once. A large file may be 1 GB, 2 GB, or more, so the first concern is efficiency and the second is the TCP "sticky packet" problem. The server side in particular needs a stricter design. The following describes a simple way to transfer large files over TCP.

Why sticky packets occur: they arise in stream transmission. UDP does not produce sticky packets, because it preserves message boundaries (see the book Windows Network Programming).
1 The sender waits until its buffer is full before sending, so several packets go out together.
2 The receiver does not read packets from its buffer in time, so several packets are received at once.

Workaround:
To avoid sticky packets, the following measures can be taken:

First, sticky packets caused by the sender can be avoided in the program. TCP provides a push operation that forces data to be sent immediately: once the TCP software receives this instruction, it sends the data right away instead of waiting for the send buffer to fill.

Second, for sticky packets caused by the receiver, optimize the program design, lighten the workload of the receiving process, and raise its priority, so that it receives data promptly.

Third, let the receiver take control: split each packet of data into fields according to its structure, receive it in several parts under program control, and then merge the parts.

For communication programs built on TCP, one very important problem must be solved: packing and unpacking data packets.

One. Why do TCP-based communication programs need packing and unpacking?

TCP is a "stream" protocol. A stream is a run of data with no boundaries, like the water in a river: it is continuous, with no dividing lines in it. Communication programs, however, generally need to define separate packets, such as a login packet and a logout packet. Because of TCP's stream nature and varying network conditions, data can arrive in several different ways.
Suppose we call send twice in a row to send two pieces of data, data1 and data2. The receiving end may see any of the following (these are only representative cases, not the only ones):
A. data1 is received first, then data2.
B. Part of data1 is received first, then the rest of data1 together with all of data2.
C. All of data1 and part of data2 are received first, then the rest of data2.
D. All of data1 and data2 are received at once.

Case a is exactly what we want, so it needs no further discussion. Cases b, c, and d are what is usually called "sticky packets": the received data must be taken apart and split back into separate, independent packets. For the receiver to be able to split the data, the sender must pack it first.
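To make the four cases concrete, the sketch below models the connection as a single byte string and cuts it where a hypothetical sequence of recv calls might return. The function name `deliver` and the sample payloads are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits the byte stream `stream` into the chunks a receiver might see.
// `chunk_sizes` lists how many bytes each hypothetical recv() call returns;
// any bytes left over arrive in one final chunk.
std::vector<std::string> deliver(const std::string& stream,
                                 const std::vector<std::size_t>& chunk_sizes) {
    std::vector<std::string> chunks;
    std::size_t pos = 0;
    for (std::size_t n : chunk_sizes) {
        if (pos >= stream.size()) break;
        if (n == 0) continue;
        chunks.push_back(stream.substr(pos, n));
        pos += chunks.back().size();
    }
    if (pos < stream.size()) chunks.push_back(stream.substr(pos));
    return chunks;
}
```

With data1 = "LOGIN" and data2 = "LOGOUT", the stream is "LOGINLOGOUT". Chunk sizes {5, 6} give case a ("LOGIN", "LOGOUT"), {3} gives case b ("LOG", "INLOGOUT"), {8} gives case c ("LOGINLOG", "OUT"), and {11} gives case d (one chunk). The bytes are identical in every case; only the boundaries the receiver sees differ.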

Also, UDP has no unpacking problem, because UDP is a "datagram" protocol: each piece of data has a boundary, so the receiver either gets nothing or gets one complete piece of data, never less and never more.

Two. Why cases b, c, and d occur.
"Sticky packets" can be caused at the sending end and at the receiving end.
1. Sticky packets caused at the sending end by the Nagle algorithm: The Nagle algorithm improves network transmission efficiency. Simply put, when we hand a piece of data to TCP for sending, TCP does not send it immediately but waits for a short time to see whether more data arrives; if so, it sends the two pieces together. This is only a brief explanation; see the relevant books for details. Cases c and d are probably caused by the Nagle algorithm.
2. Sticky packets caused by the receiving end not fetching data in time: TCP stores received data in its own buffer and then notifies the application layer to fetch it. When the application layer cannot take the data out in time for some reason, several pieces of data accumulate in the TCP buffer.
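In socket code, the Nagle algorithm is controlled per socket with the `TCP_NODELAY` option. The sketch below shows one way to toggle it; the helper names `set_no_delay` and `get_no_delay` and the `socket_t` alias are assumptions for illustration, and on Windows `WSAStartup` must already have succeeded. Note that disabling Nagle only changes when data is sent: TCP remains a stream, so the receiver can still see several sends merged and still needs to unpack.

```cpp
#ifdef _WIN32
#include <winsock2.h>
using socket_t = SOCKET;
#else
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
using socket_t = int;
#endif

// Turns the Nagle algorithm off (enable = true) or back on (enable = false)
// for a TCP socket. Returns true on success.
bool set_no_delay(socket_t s, bool enable) {
    int flag = enable ? 1 : 0;
    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                      reinterpret_cast<const char*>(&flag), sizeof(flag)) == 0;
}

// Reads the current TCP_NODELAY setting; returns false if the query fails.
bool get_no_delay(socket_t s) {
    int flag = 0;
#ifdef _WIN32
    int len = sizeof(flag);
#else
    socklen_t len = sizeof(flag);
#endif
    if (getsockopt(s, IPPROTO_TCP, TCP_NODELAY,
                   reinterpret_cast<char*>(&flag), &len) != 0) {
        return false;
    }
    return flag != 0;
}
```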

Three. How to pack and unpack.
When I first ran into the "sticky packet" problem, I called sleep between the two send calls to pause briefly. The drawback is obvious: transmission efficiency drops sharply, and the approach is not reliable. Later I used acknowledgements, with the sender waiting for a reply before sending the next piece. This works most of the time, but it cannot handle cases like b, and the replies add traffic and increase network load. Finally I solved the problem by packing and unpacking the data.
Packing:
Packing means adding a header to a piece of data, so that a packet consists of two parts, the header and the body (a "packet tail" can be added later to filter out illegal packets). The header is a fixed-size struct with one member that gives the length of the body. This member is essential; the other members can be defined as needed. With the fixed header length and the body length stored in the header, a complete packet can be split out correctly.
There are two unpacking methods that I use most often.
1. Dynamic staging buffer. The buffer is called dynamic because it grows when the data to be buffered exceeds its current length.
The approximate process is as follows:
A, a buffer is dynamically allocated for each connection and associated with the socket, usually through a struct.
B, when data is received, it is first stored in the buffer.
C, check whether the buffer holds at least one header's worth of data; if not, do not unpack yet.
D, parse the header to get the member that gives the length of the body.
E, check whether the data in the buffer after the header is at least as long as the body; if not, do not unpack yet.
F, take out the whole packet. Here "take" means not only copying the packet out of the buffer but also deleting it from the buffer. The packet is deleted by moving the data behind it to the start address of the buffer.
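A minimal sketch of such a header follows. The layout is an assumption for illustration (a 4-byte magic number followed by a 4-byte body length, both big-endian); `kMagic`, `pack`, and `parse_header` are hypothetical names. The fields are written byte by byte rather than by copying a struct, so the result does not depend on the machine's byte order or struct padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical header layout (not from the article): 4-byte magic number,
// then 4-byte body length, both big-endian.
constexpr std::uint32_t kMagic = 0x50414B31u;  // arbitrary marker value
constexpr std::size_t kHeaderSize = 8;

// Appends v to out as four big-endian bytes.
void put_u32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Reads four big-endian bytes starting at p.
std::uint32_t get_u32(const std::uint8_t* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Packing: the header (magic + body length) followed by the body bytes.
std::vector<std::uint8_t> pack(const std::string& body) {
    std::vector<std::uint8_t> packet;
    packet.reserve(kHeaderSize + body.size());
    put_u32(packet, kMagic);
    put_u32(packet, static_cast<std::uint32_t>(body.size()));
    packet.insert(packet.end(), body.begin(), body.end());
    return packet;
}

// Parses a header. Returns false if fewer than kHeaderSize bytes are
// available or the magic number does not match; otherwise sets body_len.
bool parse_header(const std::uint8_t* p, std::size_t len, std::uint32_t& body_len) {
    if (len < kHeaderSize || get_u32(p) != kMagic) return false;
    body_len = get_u32(p + 4);
    return true;
}
```

The magic number plays the role of a simple validity check: a receiver that reads a header with the wrong marker knows it has lost its place in the stream.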
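Steps A to F can be sketched as a small class that owns one growable buffer per connection. This is an illustration under assumptions, not the author's implementation: the header here is only a 4-byte big-endian body length, and `PacketBuffer`, `frame`, and `take` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical header: just a 4-byte big-endian body length.
constexpr std::size_t kHeaderSize = 4;

// Sender side: prefix the body with its length.
std::string frame(const std::string& body) {
    std::string out;
    const std::uint32_t n = static_cast<std::uint32_t>(body.size());
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((n >> shift) & 0xFF));
    return out + body;
}

// Receiver side: one of these per connection (step A).
class PacketBuffer {
public:
    // Step B: store whatever a receive call returned; the vector grows as needed.
    void append(const char* data, std::size_t len) {
        buf_.insert(buf_.end(), data, data + len);
    }

    // Steps C-F: if a whole packet is buffered, copy its body into `out`,
    // remove the packet from the buffer, and return true.
    bool take(std::string& out) {
        if (buf_.size() < kHeaderSize) return false;            // C: header incomplete
        std::uint32_t body_len = 0;                              // D: parse body length
        for (std::size_t i = 0; i < kHeaderSize; ++i)
            body_len = (body_len << 8) | static_cast<std::uint8_t>(buf_[i]);
        if (buf_.size() - kHeaderSize < body_len) return false;  // E: body incomplete
        const auto first = buf_.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
        const auto last = first + static_cast<std::ptrdiff_t>(body_len);
        out.assign(first, last);                                 // F: copy the body out
        buf_.erase(buf_.begin(), last);                          // F: shift the rest forward
        return true;
    }

private:
    std::vector<char> buf_;
};
```

A receive loop would call `append` after every receive and then call `take` repeatedly until it returns false, since one receive may complete several packets at once.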

This approach has two drawbacks. 1. Dynamically allocating a buffer for each connection increases memory usage. 2. Data is copied in three places: when storing received data in the buffer, when taking a complete packet out of the buffer, and when removing the packet from the buffer. The second unpacking method addresses these shortcomings.

One improvement to this method is to use a ring buffer. It does not remove the first drawback or the first copy; it only eliminates the third copy, which is the one that moves the most data. The second unpacking method solves both remaining problems.
A ring buffer is implemented with two pointers that point to the head and the tail of the valid data. Storing and deleting data only moves the head and tail pointers.
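A minimal fixed-capacity ring buffer might look like the sketch below. `RingBuffer` and its member names are assumptions for illustration. It tracks a head index and a byte count, and derives the tail from them, which is a common variant of the head/tail pair that avoids the ambiguity between a full and an empty buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : data_(capacity) {}

    std::size_t size() const { return size_; }
    std::size_t free_space() const { return data_.size() - size_; }

    // Copies `len` bytes in at the tail, wrapping around the end if needed.
    // Fails without writing anything if the bytes do not fit.
    bool write(const char* src, std::size_t len) {
        if (len > free_space()) return false;
        if (len == 0) return true;
        const std::size_t tail = (head_ + size_) % data_.size();
        const std::size_t first = std::min(len, data_.size() - tail);
        std::memcpy(data_.data() + tail, src, first);
        std::memcpy(data_.data(), src + first, len - first);
        size_ += len;
        return true;
    }

    // Copies `len` bytes from the head into dst without consuming them.
    bool peek(char* dst, std::size_t len) const {
        if (len > size_) return false;
        if (len == 0) return true;
        const std::size_t first = std::min(len, data_.size() - head_);
        std::memcpy(dst, data_.data() + head_, first);
        std::memcpy(dst + first, data_.data(), len - first);
        return true;
    }

    // Deletes `len` bytes from the head by moving the head index only;
    // no buffered data is copied.
    bool consume(std::size_t len) {
        if (len > size_) return false;
        if (len == 0) return true;
        head_ = (head_ + len) % data_.size();
        size_ -= len;
        return true;
    }

private:
    std::vector<char> data_;
    std::size_t head_ = 0;  // index of the first valid byte
    std::size_t size_ = 0;  // number of valid bytes
};
```

Unpacking with it follows the same steps as before: `peek` the header, check that `size()` covers header plus body, `peek` the packet out, then `consume` it. The `consume` call replaces the memory move of the dynamic buffer.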

2. Using the underlying buffer to unpack
TCP itself maintains a buffer, so we can let the TCP buffer hold our data and avoid allocating a buffer for each connection. In addition, recv and WSARecv take a parameter that specifies how many bytes we want to receive. With these two facts, we can improve on the first method.
For a blocking socket, first use a loop to receive exactly one header's worth of data, parse the body length out of it, and then use another loop to receive exactly that many bytes of body.
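The two loops can be sketched as follows. To keep the example independent of a live connection, the receive call is passed in as a callable; `RecvFn`, `recv_exact`, `recv_packet`, and the `kMaxBody` limit are hypothetical, and the header is assumed to be a 4-byte big-endian body length. With a real blocking socket the callable would simply wrap `recv(sock, buf, len, 0)`.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for recv(): fills up to `len` bytes and returns the
// count, 0 when the peer has closed the connection, or a negative value on error.
using RecvFn = std::function<int(char* buf, int len)>;

// Assumed sanity limit so a corrupt header cannot trigger a huge allocation.
constexpr std::uint32_t kMaxBody = 16u * 1024u * 1024u;

// Loops until exactly `len` bytes have arrived; false on close or error.
bool recv_exact(const RecvFn& recv_fn, char* buf, std::size_t len) {
    std::size_t got = 0;
    while (got < len) {
        const std::size_t want =
            std::min<std::size_t>(len - got, static_cast<std::size_t>(INT_MAX));
        const int n = recv_fn(buf + got, static_cast<int>(want));
        if (n <= 0) return false;
        got += static_cast<std::size_t>(n);
    }
    return true;
}

// Receives one packet: first the 4-byte header, then exactly body-length bytes.
bool recv_packet(const RecvFn& recv_fn, std::string& body) {
    unsigned char header[4];
    if (!recv_exact(recv_fn, reinterpret_cast<char*>(header), sizeof(header)))
        return false;
    const std::uint32_t len =
        (std::uint32_t(header[0]) << 24) | (std::uint32_t(header[1]) << 16) |
        (std::uint32_t(header[2]) << 8) | std::uint32_t(header[3]);
    if (len > kMaxBody) return false;
    body.resize(len);
    return len == 0 || recv_exact(recv_fn, &body[0], len);
}
```

Because each receive asks for no more than the bytes still missing from the current packet, data belonging to the next packet stays in the TCP buffer, and no per-connection staging buffer is needed.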

TCP is a stream with no boundaries, but that does not matter here. TCP is a protocol, and the socket is an interface to it. This article sends a single large file as a stream, so packing and unpacking are not a concern; see the following code. To send several large files in succession, however, packing and unpacking must be considered!

// TCPRecvfile.h
#ifndef TCPRECVFILE_H
#define TCPRECVFILE_H

#include <stdio.h>
#include <winsock2.h>
#include <iostream>
#include <time.h>

#pragma comment(lib, "ws2_32.lib")   // Link against Winsock (MSVC)

#define DESTADDRESS "192.168.27.170"
#define FILENAME "d:\\file.jpg"
#define SERVER_PORT 5210             // Listening port
#define BEGIN_NUM 19900711
#define DATA_NUM 20160113
#define END_NUM 11700991
#define BLOCK_DATA_SIZE (10 * 1024)
#define FILE_HEAD 4
#define BLOCK_HEAD 4

class CTcpRecvFile
{
public:
    CTcpRecvFile();
    ~CTcpRecvFile();
    void RecvFile();
    void Recv();
private:
    void InitSocket();
    void CloseSocket();
    SOCKET m_listen, m_server;                  // Listening socket, connection socket
    struct sockaddr_in serveraddr, clientaddr;  // Address information
};

#endif

// TCPRecvfile.cpp
#include "TCPRecvfile.h"
#include <string.h>

CTcpRecvFile::CTcpRecvFile()
{
    InitSocket();
}

CTcpRecvFile::~CTcpRecvFile()
{
    CloseSocket();
}

void CTcpRecvFile::InitSocket()
{
    m_listen = INVALID_SOCKET;
    m_server = INVALID_SOCKET;

    WORD wVersionRequested = MAKEWORD(2, 2);  // Version of the Winsock DLL we want to use
    WSADATA wsaData;

    // Initialize Winsock
    int ret = WSAStartup(wVersionRequested, &wsaData);
    if (ret != 0)
    {
        printf("WSAStartup() failed!\n");
        return;
    }

    // Create the socket, using TCP
    m_listen = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (m_listen == INVALID_SOCKET)
    {
        printf("socket() failed!\n");
        return;
    }

    // Build the local address information
    serveraddr.sin_family = AF_INET;                      // Address family
    serveraddr.sin_port = htons(SERVER_PORT);             // Convert to network byte order
    serveraddr.sin_addr.S_un.S_addr = htonl(INADDR_ANY);  // INADDR_ANY means any local address

    // Bind
    ret = bind(m_listen, (struct sockaddr *)&serveraddr, sizeof(serveraddr));
    if (ret == SOCKET_ERROR)
    {
        printf("bind() failed! code:%d\n", WSAGetLastError());
        return;
    }

    // Listen for connection requests
    ret = listen(m_listen, 1);
    if (ret == SOCKET_ERROR)
    {
        printf("listen() failed! code:%d\n", WSAGetLastError());
        return;
    }

    // Accept one connection; the peer address is stored in clientaddr
    int length = sizeof(clientaddr);
    m_server = accept(m_listen, (struct sockaddr *)&clientaddr, &length);
    if (m_server == INVALID_SOCKET)
    {
        printf("accept() failed! code:%d\n", WSAGetLastError());
        return;
    }
    printf("Server is connected!\n");
}

void CTcpRecvFile::Recv()
{
    char head[2 * FILE_HEAD] = {0};
    unsigned int flagStatus = 0;  // File start identifier sent by the peer
    unsigned int dwFileSize = 0;  // File size announced by the peer

    // 1. Read the 8-byte header. recv() may return fewer bytes than requested,
    //    so loop until the whole header has arrived.
    int got = 0;
    while (got < 2 * FILE_HEAD)
    {
        int ret = recv(m_server, head + got, 2 * FILE_HEAD - got, 0);
        if (ret <= 0)
        {
            printf("recv() of the header failed!\n");
            return;
        }
        got += ret;
    }
    for (int i = 0; i < 4; i++)
    {
        flagStatus += ((unsigned int)(UCHAR)head[i]) << (8 * (4 - i - 1));              // Bytes 1-4: start identifier
        dwFileSize += ((unsigned int)(UCHAR)head[FILE_HEAD + i]) << (8 * (4 - i - 1));  // Bytes 5-8: file size
    }
    if (flagStatus != BEGIN_NUM)
    {
        printf("Unexpected file start identifier!\n");
        return;
    }

    FILE *fp = fopen(FILENAME, "wb");
    if (fp == NULL)
    {
        printf("fopen() failed!\n");
        return;
    }

    int start = clock();

    // 2. Receive the file body into memory, one block at a time.
    //    Never ask for more than is still missing, so the buffer cannot overflow.
    char *eachBuf = new char[BLOCK_DATA_SIZE];
    char *fileBuffer = new char[dwFileSize];
    unsigned int dataPos = 0;
    while (dataPos < dwFileSize)
    {
        unsigned int remaining = dwFileSize - dataPos;
        int want = remaining < BLOCK_DATA_SIZE ? (int)remaining : BLOCK_DATA_SIZE;
        int ret = recv(m_server, eachBuf, want, 0);
        if (ret <= 0)
            break;
        memcpy(fileBuffer + dataPos, eachBuf, ret);
        dataPos += ret;
    }

    // 3. Write the received bytes to disk and release the buffers
    fwrite(fileBuffer, 1, dataPos, fp);
    fclose(fp);
    delete[] fileBuffer;
    delete[] eachBuf;

    int end = clock();
    std::cout << "Time: " << end - start << " ms, received " << dataPos << " bytes" << std::endl;
}

void CTcpRecvFile::CloseSocket()
{
    closesocket(m_server);  // Close the sockets
    closesocket(m_listen);
    WSACleanup();
}

// TCPSendfile.h
#ifndef TCPSENDFILE_H
#define TCPSENDFILE_H

#include <stdio.h>
#include <stdlib.h>
#include <winsock2.h>
#include <iostream>
#include <tchar.h>
#include <time.h>

#pragma comment(lib, "ws2_32.lib")   // Link against Winsock (MSVC)

#define SERVER_PORT 5210             // Listening port
#define DESTADDRESS "192.168.27.170"
#define FILENAME "d:\\10m.jpg"
#define BEGIN_NUM 19900711
#define DATA_NUM 20160113
#define END_NUM 11700991
#define BLOCK_DATA_SIZE (10 * 1024)
#define FILE_HEAD 4
#define BLOCK_HEAD 4

class CTcpSendFile
{
public:
    CTcpSendFile();
    ~CTcpSendFile();
    void SendFile();
    void Send();
private:
    void InitSocket();
    void CloseSocket();
    SOCKET m_client;                   // Connection socket
    struct sockaddr_in m_clientaddr;   // Server address information
};

#endif

// TCPSendfile.cpp
#include "TCPSendfile.h"

CTcpSendFile::CTcpSendFile()
{
    InitSocket();
}

CTcpSendFile::~CTcpSendFile()
{
    CloseSocket();
}

void CTcpSendFile::InitSocket()
{
    m_client = INVALID_SOCKET;

    WORD wVersionRequested = MAKEWORD(2, 2);  // Version of the Winsock DLL we want to use
    WSADATA wsaData;

    int ret = WSAStartup(wVersionRequested, &wsaData);  // Load the Winsock library
    if (ret != 0)
    {
        printf("WSAStartup() failed!\n");
        return;
    }

    // Confirm that the Winsock DLL supports version 2.2
    if (LOBYTE(wsaData.wVersion) != 2 || HIBYTE(wsaData.wVersion) != 2)
    {
        WSACleanup();  // Release the resources allocated for the program and stop using Winsock
        printf("Invalid Winsock version!\n");
        return;
    }

    // Create the socket, using TCP
    m_client = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (m_client == INVALID_SOCKET)
    {
        printf("socket() failed!\n");
        return;
    }

    // Build the server address information
    m_clientaddr.sin_family = AF_INET;                            // Address family
    m_clientaddr.sin_port = htons(SERVER_PORT);                   // Convert to network byte order
    m_clientaddr.sin_addr.S_un.S_addr = inet_addr(DESTADDRESS);

    // Connect to the server, retrying once per second until it succeeds
    do
    {
        ret = connect(m_client, (struct sockaddr *)&m_clientaddr, sizeof(m_clientaddr));
        if (ret == SOCKET_ERROR)
        {
            printf("connect() failed! Try it again!\n");
            Sleep(1000);
        }
        else
            printf("Client is connected\n");
    } while (ret == SOCKET_ERROR);
}

void CTcpSendFile::SendFile()
{
    // 1. Open the file and get its size
    HANDLE hFile = CreateFile(_T(FILENAME), GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
    {
        std::cout << "open failed" << std::endl;
        return;
    }
    DWORD dwHighSize = 0, dwBytesRead = 0;
    DWORD dwFileSize = GetFileSize(hFile, &dwHighSize);
    std::cout << "dwFileSize=" << dwFileSize << std::endl;

    // 2. Read the contents of the file into fileData
    char *fileData = new char[dwFileSize];
    BOOL bSuccess = ReadFile(hFile, fileData, dwFileSize, &dwBytesRead, NULL);
    CloseHandle(hFile);

    // 3. Check that the file was read successfully
    if (!bSuccess || (dwBytesRead != dwFileSize))
    {
        std::cout << "read failed" << std::endl;
        delete[] fileData;
        return;
    }

    // 4. Send the 8-byte header: start identifier, then file size, both big-endian
    char head[2 * FILE_HEAD];
    int pos = 0;
    head[pos++] = (char)(BEGIN_NUM >> 24 & 0xFF);   // File start identifier
    head[pos++] = (char)(BEGIN_NUM >> 16 & 0xFF);
    head[pos++] = (char)(BEGIN_NUM >> 8 & 0xFF);
    head[pos++] = (char)(BEGIN_NUM & 0xFF);
    head[pos++] = (char)(dwFileSize >> 24 & 0xFF);  // File size
    head[pos++] = (char)(dwFileSize >> 16 & 0xFF);
    head[pos++] = (char)(dwFileSize >> 8 & 0xFF);
    head[pos++] = (char)(dwFileSize & 0xFF);
    int retval = send(m_client, head, 2 * FILE_HEAD, 0);
    if (retval == SOCKET_ERROR)
    {
        std::cout << "send error!" << std::endl;
        delete[] fileData;
        return;
    }

    // 5. Send the file body. send() takes an int length and may accept
    //    only part of the data, so loop until everything has been sent.
    int start = clock();
    DWORD sent = 0;
    while (sent < dwFileSize)
    {
        DWORD remaining = dwFileSize - sent;
        int want = remaining > 0x7FFFFFFF ? 0x7FFFFFFF : (int)remaining;
        retval = send(m_client, fileData + sent, want, 0);
        if (retval == SOCKET_ERROR)
        {
            std::cout << "send error!" << std::endl;
            break;
        }
        sent += retval;
    }
    int end = clock();
    std::cout << "Time: " << end - start << " ms" << std::endl;
    delete[] fileData;
}

void CTcpSendFile::CloseSocket()
{
    closesocket(m_client);  // Close the socket
    WSACleanup();
}

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
