Completing an HTTP Download Directly with Sockets

Source: Internet
Author: User
Screenshot of program running effect:



There are many ways to download a file from an HTTP server, and Microsoft helpfully provides the WinInet classes, which are convenient to use. We can also implement the same functionality ourselves: by formatting the request headers by hand, it is easy to support resumable downloads (breakpoint continuation), update checks, and so on. The project that accompanies this article contains a DLL that supports the HTTP/1.1 protocol. It implements downloading directly on top of a socket and provides the following functions:
1. Connecting to the host
2. Formatting request headers
3. Setting receive and send timeouts
4. Receiving and parsing response headers
I will not elaborate on connecting, sending, setting timeouts, or receiving data. Windows Sockets already provides all of this, so it is enough to call the corresponding functions.
To download a file from a server, first send a request to it. The HTTP request header consists of several lines of text. The following example shows the format. Suppose you want to download the page http://www.sina.com.cn/index.html; the request header is written as follows:
Line 1: method, requested resource, HTTP protocol version
Downloads generally use the GET method. The requested resource is "/index.html". The protocol version is the version the client supports; for download software it matters little, so use version 1.1, "HTTP/1.1":
"GET /index.html HTTP/1.1"
Line 2: host name, in the format "Host: hostname"
In this example: "Host: www.sina.com.cn"
Line 3: the accepted data types. Download software should of course accept all data types, so:
"Accept: */*"
Line 4: the browser type
Some servers add or remove content depending on the type of client, so in this example you can write:
"User-Agent: Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)"
Line 5: connection settings
To keep the connection open: "Connection: Keep-Alive"
Line 6: to resume an interrupted download, specify which part of the data to receive, in the following format:
"Range: bytes=start-end"
Byte positions are counted from 0 and the end position is inclusive. For example, to read the first 500 bytes: "Range: bytes=0-499"; to download from the 1000th byte onward: "Range: bytes=999-"
Finally, don't forget to add an empty line at the end of the request header. Every line, including the empty one, ends with a carriage return and line feed (CRLF). The entire request header is as follows:
GET /index.html HTTP/1.1
Host: www.sina.com.cn
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)
Connection: Keep-Alive
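The Range arithmetic described above is easy to get wrong by one, so here is a minimal sketch of it. The helper names makeRangeHeader and makeResumeHeader are hypothetical and are not part of the article's DLL; the sketch only assumes that offsets are zero-based and that the end position is inclusive.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of CHttpSocket): builds one Range header line.
// Offsets are zero-based and inclusive; pass to < from for an open-ended range.
std::string makeRangeHeader(long from, long to = -1) {
    assert(from >= 0);
    std::string line = "Range: bytes=" + std::to_string(from) + "-";
    if (to >= from) {
        line += std::to_string(to);  // closed range, e.g. 0-499
    }
    return line + "\r\n";
}

// Resuming: if bytesAlreadySaved bytes are on disk, the next byte
// wanted is the one at offset bytesAlreadySaved.
std::string makeResumeHeader(long bytesAlreadySaved) {
    return makeRangeHeader(bytesAlreadySaved);
}
```

For example, makeRangeHeader(0, 499) asks for the first 500 bytes, and makeResumeHeader(1000) continues a download of which 1000 bytes have already been saved.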
CHttpSocket provides a FormatRequestHeader() function that formats the HTTP request header. The code is as follows:
// Builds the HTTP request header for the requested relative URL.
// Requires <stdio.h>, <string.h> and <stdlib.h>; _ltoa is Microsoft-specific.
const char *CHttpSocket::FormatRequestHeader(char *pServer, char *pObject, long &Length,
	char *pCookie, char *pReferer, long nFrom, long nTo, int nServerType)
{
	char szPort[10];
	char szTemp[20];
	sprintf(szPort, "%d", m_port);
	memset(m_requestheader, '\0', 1024);

	// Line 1: method, requested path, version
	strcat(m_requestheader, "GET ");
	strcat(m_requestheader, pObject);
	strcat(m_requestheader, " HTTP/1.1");
	strcat(m_requestheader, "\r\n");

	// Line 2: host
	strcat(m_requestheader, "Host: ");
	strcat(m_requestheader, pServer);
	strcat(m_requestheader, "\r\n");

	// Line 3: referer (optional)
	if (pReferer != NULL)
	{
		strcat(m_requestheader, "Referer: ");
		strcat(m_requestheader, pReferer);
		strcat(m_requestheader, "\r\n");
	}

	// Line 4: accepted data types
	strcat(m_requestheader, "Accept: */*");
	strcat(m_requestheader, "\r\n");

	// Line 5: browser type
	strcat(m_requestheader, "User-Agent: Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)");
	strcat(m_requestheader, "\r\n");

	// Line 6: connection settings, keep alive
	strcat(m_requestheader, "Connection: Keep-Alive");
	strcat(m_requestheader, "\r\n");

	// Line 7: cookie (optional). A request carries cookies in the "Cookie"
	// field; "Set-Cookie" is only used by servers in responses.
	if (pCookie != NULL)
	{
		strcat(m_requestheader, "Cookie: ");
		strcat(m_requestheader, pCookie);
		strcat(m_requestheader, "\r\n");
	}

	// Line 8: start byte position of the requested data (the key to resuming).
	// Note that the Range line is only sent when nFrom > 0.
	if (nFrom > 0)
	{
		strcat(m_requestheader, "Range: bytes=");
		_ltoa(nFrom, szTemp, 10);
		strcat(m_requestheader, szTemp);
		strcat(m_requestheader, "-");
		if (nTo > nFrom)
		{
			_ltoa(nTo, szTemp, 10);
			strcat(m_requestheader, szTemp);
		}
		strcat(m_requestheader, "\r\n");
	}

	// Last line: empty line
	strcat(m_requestheader, "\r\n");

	// Return the result
	Length = strlen(m_requestheader);
	return m_requestheader;
}
Once the request header has been sent to the server, the response header from the server can be received. The response header also consists of several lines of text. Apart from the first line and the final blank line, each line consists of a field name and a value. The first line contains the server's response status code. Each class of status code has a different meaning; see the RFC documents for details. For a download, the ones to care about are: 2XX indicates success, and you can go on to read the data; 3XX indicates that the target has moved, and the new address is in the "Location" field; 4XX indicates a client error, such as a wrong download address; 5XX indicates a server-side error.
The response header contains fields such as "Content-Length", "Accept-Ranges", "Content-Type", "Date", "Last-Modified", and "Location". The fields of most interest for a download are "Content-Length" and "Location". "Content-Length" gives the size of the file being downloaded. "Location" gives the actual location of the target; when the status code is 3XX, the client reconnects using the value of that field.
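To make the structure of the response header concrete, here is a minimal sketch of reading the status code and looking up a field. The functions parseStatusCode, classify, and getField are hypothetical stand-ins written for this illustration; they are not the CHttpSocket methods from the accompanying source, and they assume the whole header has already been read into one string that ends with a blank line.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns the status code from a status line such as "HTTP/1.1 200 OK",
// or -1 if the header does not start with a well-formed status line.
int parseStatusCode(const std::string& header) {
    std::size_t sp = header.find(' ');
    if (header.compare(0, 5, "HTTP/") != 0 || sp == std::string::npos) return -1;
    if (sp + 4 > header.size()) return -1;
    int code = 0;
    for (std::size_t i = sp + 1; i < sp + 4; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(header[i]))) return -1;
        code = code * 10 + (header[i] - '0');
    }
    return code;
}

enum class Action { ReadBody, Redirect, Fail };

// Maps a status code to what a downloader does next.
Action classify(int code) {
    if (code >= 200 && code < 300) return Action::ReadBody;
    if (code >= 300 && code < 400) return Action::Redirect;  // follow "Location"
    return Action::Fail;
}

static std::string toLower(std::string s) {
    for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Looks up a field by name, ignoring case. Returns false if it is absent.
bool getField(const std::string& header, const std::string& name, std::string& value) {
    std::size_t pos = header.find("\r\n");  // end of the status line
    while (pos != std::string::npos) {
        std::size_t start = pos + 2;
        std::size_t end = header.find("\r\n", start);
        if (end == std::string::npos || end == start) break;  // blank line ends the header
        std::string line = header.substr(start, end - start);
        std::size_t colon = line.find(':');
        if (colon != std::string::npos && toLower(line.substr(0, colon)) == toLower(name)) {
            std::size_t v = line.find_first_not_of(" \t", colon + 1);
            value = (v == std::string::npos) ? "" : line.substr(v);
            return true;
        }
        pos = end;
    }
    return false;
}
```

Field names are compared without regard to case because servers differ in how they capitalize them.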
The CHttpSocket class in the accompanying source code provides several methods for reading the server status code, the value of a field, a single line of the response header, and the entire response header:
int GetServerState();                                                // returns the server status code; -1 means failure
int GetField(const char *szSession, char *szValue, int nMaxLength);  // returns the value of a field; -1 means failure
int GetResponseLine(char *pLine, int nMaxLength);                    // gets one line of the response header
const char *GetResponseHeader(int &Length);                          // gets the complete response header

After getting the response header, if the status code is 2XX and the value of "Content-Length" is not 0, the file data can be received. The next step is simple: call CHttpSocket::Receive() repeatedly until the amount of data received equals the value of "Content-Length".
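The receive loop can be sketched as follows. This is an illustration under stated assumptions, not the code from the accompanying project: RecvFn is a hypothetical stand-in for a blocking receive call, receiveBody is a made-up name, and makeChunkedSource is a fake data source that lets the loop be tried without a network.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Stand-in for a blocking receive: fills buf with up to len bytes and returns
// the number of bytes read, 0 if the peer closed, or a negative value on error.
using RecvFn = std::function<int(char* buf, int len)>;

// Reads exactly contentLength bytes into body.
// Returns false if the connection ends or fails before that.
bool receiveBody(const RecvFn& recvFn, long contentLength, std::string& body) {
    assert(contentLength >= 0);
    char buf[4096];
    body.clear();
    while (static_cast<long>(body.size()) < contentLength) {
        long remaining = contentLength - static_cast<long>(body.size());
        int want = static_cast<int>(std::min(remaining, static_cast<long>(sizeof(buf))));
        int got = recvFn(buf, want);  // a single call may return fewer bytes than asked
        if (got <= 0) return false;
        body.append(buf, static_cast<std::size_t>(got));
    }
    return true;
}

// Fake source for experiments: hands out data in pieces of at most chunk bytes.
RecvFn makeChunkedSource(std::string data, int chunk) {
    auto offset = std::make_shared<std::size_t>(0);
    return [data, chunk, offset](char* buf, int len) {
        std::size_t left = data.size() - *offset;
        std::size_t n = std::min({static_cast<std::size_t>(len),
                                  static_cast<std::size_t>(chunk), left});
        data.copy(buf, n, *offset);
        *offset += n;
        return static_cast<int>(n);
    };
}
```

The loop matters because one receive call is not guaranteed to deliver the whole body; the count of bytes received so far is also what a progress display or a later resume would use.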
A complete download consists of the following steps:
1. Call AfxParseURL() to parse the URL and obtain the server name and download path.
2. Call CHttpSocket::Socket() to create a socket.
3. Call CHttpSocket::Connect() to connect to the server.
4. Call CHttpSocket::FormatRequestHeader() to format the request header.
5. Call CHttpSocket::SendRequest() to send the request header to the server.
6. Call CHttpSocket::GetServerState() to get the response status code.
7. Call CHttpSocket::GetField("Content-Length") to get the size of the file to download.
8. Call CHttpSocket::Receive() to receive data until all of it has arrived.
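Step 1 relies on MFC's AfxParseURL(). For readers without MFC, the idea of that step can be sketched in portable C++. The UrlParts struct and splitHttpUrl function below are hypothetical names for this illustration; the sketch handles only plain "http://host[:port][/path]" URLs and is not a replacement for a full URL parser.

```cpp
#include <cassert>
#include <string>

struct UrlParts {
    std::string host;
    int port = 80;           // default HTTP port
    std::string path = "/";  // default object when the URL has no path
};

// Splits "http://host[:port][/path]". Returns false for anything else.
bool splitHttpUrl(const std::string& url, UrlParts& out) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;

    std::size_t hostStart = scheme.size();
    std::size_t pathStart = url.find('/', hostStart);
    std::string authority = (pathStart == std::string::npos)
        ? url.substr(hostStart)
        : url.substr(hostStart, pathStart - hostStart);

    out = UrlParts{};
    if (pathStart != std::string::npos) out.path = url.substr(pathStart);

    std::size_t colon = authority.find(':');
    out.host = authority.substr(0, colon);
    if (colon != std::string::npos) {
        const std::string portText = authority.substr(colon + 1);
        if (portText.empty() || portText.size() > 5) return false;
        int port = 0;
        for (char c : portText) {
            if (c < '0' || c > '9') return false;
            port = port * 10 + (c - '0');
        }
        if (port == 0 || port > 65535) return false;
        out.port = port;
    }
    return !out.host.empty();
}
```

The host goes into the "Host" line and the connect call, and the path becomes the requested resource on the first line of the request header.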
The source code included with this article is an example project that uses CHttpSocket to implement downloading. Note that all calls are blocking, so it is a good idea to create a separate thread for each download task; otherwise the interface will stop responding to user input. The program's interface, shown in the screenshot above, displays the request header, the response header, and the download progress.
Of course, there is still a lot of work to do before this becomes a real multi-task, multi-threaded downloader. This article only discusses how to implement a download yourself, and I hope it helps readers. Feedback by email is welcome.
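One way to keep blocking calls off the interface thread is sketched below using the standard library rather than Win32 or MFC threads. DownloadTask and its step callback are hypothetical names for this illustration: step stands in for one blocking receive and returns the number of bytes read, or 0 when the download is finished.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

struct DownloadTask {
    std::atomic<long> bytesDone{0};    // can be polled by the interface thread
    std::atomic<bool> finished{false};
    std::thread worker;

    // Runs the blocking loop on a worker thread so the caller returns at once.
    void start(std::function<long()> step) {
        worker = std::thread([this, step] {
            for (long n = step(); n > 0; n = step()) {
                bytesDone += n;
            }
            finished = true;
        });
    }

    // Blocks until the worker has finished.
    void wait() {
        if (worker.joinable()) worker.join();
    }

    ~DownloadTask() { wait(); }
};
```

The interface thread can read bytesDone periodically to update a progress bar; because both members are atomic, no extra locking is needed for that.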
