Implementing HTTP breakpoint continuation (resumable downloads) in Java
Contents:
(i) The principle of breakpoint continuation
(ii) Key points in the Java implementation of breakpoint continuation
(iii) The core implementation of breakpoint continuation
About the author
At COSL (zhong_hua@263.net)
May 2001
(i) The principle of breakpoint continuation
The principle of breakpoint continuation is actually very simple: the HTTP request differs slightly from that of an ordinary download.
Suppose the server domain name is wwww.sjtu.edu.cn and the file name is down.zip.
When a browser requests this file from the server in the ordinary way, the request looks like this:
GET /down.zip HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, */*
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Connection: Keep-Alive
After the server receives the request, it looks for the requested file, extracts the file's information, and returns it to the browser with status code 200.
Breakpoint continuation means continuing a download from the point where the previous attempt stopped. To do that, the client has to pass one extra piece of information to the web server: where to start.
The following request is sent by a "browser" of our own. It asks the web server to start the transfer at byte 2000070.
GET /down.zip HTTP/1.0
User-Agent: Netfox
Range: bytes=2000070-
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Look closely and you will find one extra line: Range: bytes=2000070-
This line tells the server to send down.zip starting at byte 2000070; the bytes before that offset are not transmitted.
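As a minimal sketch of how such a header value can be built, the helper below formats both an open-ended range (as in the request above) and a closed range. The class name RangeHeader and its method names are hypothetical and not part of the original program. Note that in HTTP byte ranges both the first and the last offset are inclusive.

```java
public class RangeHeader {
    /** Open-ended range: everything from byte offset start to the end of the file. */
    static String from(long start) {
        if (start < 0) {
            throw new IllegalArgumentException("start must be >= 0");
        }
        return "bytes=" + start + "-";
    }

    /** Closed range: both offsets are inclusive. */
    static String between(long start, long endInclusive) {
        if (start < 0 || endInclusive < start) {
            throw new IllegalArgumentException("invalid range");
        }
        return "bytes=" + start + "-" + endInclusive;
    }

    public static void main(String[] args) {
        System.out.println(from(2000070L));     // bytes=2000070-
        System.out.println(between(0L, 1023L)); // bytes=0-1023
    }
}
```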
After the server receives this request, the information returned is as follows:
206
Content-Length=106786028
Content-Range=bytes 2000070-106786027/106786028
Date=Mon, Apr 2001 12:55:20 GMT
ETag=W/"02ca57e173c11:95b"
Content-Type=application/octet-stream
Server=Microsoft-IIS/5.0
Last-Modified=Mon, Apr 2001 12:55:20 GMT
Compared with the response to an ordinary request, there is one extra line:
Content-Range=bytes 2000070-106786027/106786028
The status code also changes to 206 instead of 200.
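A client can read the Content-Range value to confirm that the server really started where it was asked to. The following is a minimal sketch of such a parser; the class ContentRange is hypothetical, and it assumes the simple "bytes start-end/total" form shown above (a total given as "*" is not handled).

```java
public class ContentRange {
    final long start; // first byte offset, inclusive
    final long end;   // last byte offset, inclusive
    final long total; // full length of the file

    ContentRange(long start, long end, long total) {
        this.start = start;
        this.end = end;
        this.total = total;
    }

    /** Number of bytes covered by the range. */
    long length() {
        return end - start + 1;
    }

    /** Parses a value such as "bytes 2000070-106786027/106786028". */
    static ContentRange parse(String value) {
        String v = value.trim();
        if (!v.startsWith("bytes ")) {
            throw new IllegalArgumentException("Unsupported unit: " + value);
        }
        String spec = v.substring("bytes ".length()).trim();
        int dash = spec.indexOf('-');
        int slash = spec.indexOf('/');
        if (dash < 0 || slash < dash) {
            throw new IllegalArgumentException("Malformed Content-Range: " + value);
        }
        long start = Long.parseLong(spec.substring(0, dash));
        long end = Long.parseLong(spec.substring(dash + 1, slash));
        long total = Long.parseLong(spec.substring(slash + 1));
        return new ContentRange(start, end, total);
    }

    public static void main(String[] args) {
        ContentRange r = parse("bytes 2000070-106786027/106786028");
        System.out.println(r.start + " " + r.end + " " + r.total + " " + r.length());
    }
}
```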
With these principles understood, we can start programming breakpoint continuation.
(ii) Key points in the Java implementation of breakpoint continuation
(1) How to submit Range: bytes=2000070-.
Of course this can be done with a raw socket, but that is too much trouble. In fact, Java's java.net package already provides this functionality. The code is as follows:
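// Server and file name as assumed earlier in this article
URL url = new URL("http://wwww.sjtu.edu.cn/down.zip");
HttpURLConnection httpConnection = (HttpURLConnection) url.openConnection();
// Set User-Agent
httpConnection.setRequestProperty("User-Agent", "Netfox");
// Set the starting position of the breakpoint continuation
httpConnection.setRequestProperty("RANGE", "bytes=2000070-");
// Get the input stream
InputStream input = httpConnection.getInputStream();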
The bytes read from this input stream are the contents of down.zip starting at byte 2000070.
As you can see, implementing breakpoint continuation in Java is quite simple.
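One detail the snippet above leaves out is checking the status code. A server that does not support ranges may ignore the header and answer 200 with the whole file, in which case writing must start at position 0 rather than at the requested offset. The sketch below illustrates one way to handle this; the class RangeRequest and its methods are hypothetical and are not part of the original program.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class RangeRequest {
    /** Opens a connection that asks for the file from byte offset onward. */
    static HttpURLConnection open(URL url, long offset) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", "Netfox");
        conn.setRequestProperty("Range", "bytes=" + offset + "-");
        return conn;
    }

    /**
     * Returns the file position at which the response body must be written:
     * the requested offset for 206, or 0 when the server ignored the Range
     * header and answered 200 with the whole file.
     */
    static long writeOffset(int status, long requestedOffset) {
        if (status == HttpURLConnection.HTTP_PARTIAL) {
            return requestedOffset;
        }
        if (status == HttpURLConnection.HTTP_OK) {
            return 0L;
        }
        throw new IllegalStateException("Unexpected HTTP status: " + status);
    }
}
```

A caller would pass the result of conn.getResponseCode() to writeOffset before seeking in the output file.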
The next step is to save the received stream into a file.
(2) The method used to save the file.
I use the RandomAccessFile class in the java.io package.
The operation is fairly simple. Assuming the file is to be saved starting at byte 2000070, the code is as follows:
RandomAccessFile oSavedFile = new RandomAccessFile("down.zip", "rw");
long nPos = 2000070;
// Move the file pointer to position nPos
oSavedFile.seek(nPos);
byte[] b = new byte[1024];
int nRead;
// Read bytes from the input stream and write them to the file
while ((nRead = input.read(b, 0, 1024)) > 0)
{
    oSavedFile.write(b, 0, nRead);
}
That is quite simple too, isn't it?
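The offset 2000070 is hard-coded in the snippet above. A minimal sketch of a reusable version follows, under the assumption that the partially downloaded file contains exactly the bytes received so far, so that its current length is the offset to resume from. The class ResumeWriter and its methods are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

public class ResumeWriter {
    /** Offset to resume from: the length of the partial file, or 0 if it does not exist. */
    static long resumeOffset(String path) {
        return new File(path).length();
    }

    /**
     * Writes everything read from input into the file, starting at position nPos,
     * and returns the number of bytes written.
     */
    static long writeFrom(String path, long nPos, InputStream input) throws IOException {
        long written = 0;
        try (RandomAccessFile oSavedFile = new RandomAccessFile(path, "rw")) {
            // Move the file pointer to the resume position
            oSavedFile.seek(nPos);
            byte[] b = new byte[1024];
            int nRead;
            // read() returns -1 at the end of the stream
            while ((nRead = input.read(b, 0, b.length)) != -1) {
                oSavedFile.write(b, 0, nRead);
                written += nRead;
            }
        }
        return written;
    }
}
```

The value returned by resumeOffset is the one that would go into the Range header of the next request.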
The next step is to integrate these pieces into a complete program, including thread control and so on.
(iii) The core implementation of breakpoint continuation
Six classes are used, including a test class.
SiteFileFetch.java is responsible for fetching the entire file and controlling the internal threads (FileSplitterFetch instances).
FileSplitterFetch.java is responsible for fetching one part of the file.
FileAccessI.java is responsible for storing the file.
SiteInfoBean.java holds information about the file to fetch, such as the directory where it is saved, its name, and its URL.
Utility.java is a utility class containing a few simple methods.
TestMethod.java is the test class.
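The division of work between SiteFileFetch and its FileSplitterFetch threads comes down to cutting the file into contiguous byte ranges, one per thread. The sketch below shows that calculation in isolation. The class SegmentPlanner is hypothetical; each segment is returned as a {start, end} pair with an inclusive start and an exclusive end, and the last segment absorbs the remainder of the division.

```java
public class SegmentPlanner {
    /** Splits fileLength bytes into parts contiguous segments of {start, end}. */
    static long[][] split(long fileLength, int parts) {
        if (fileLength < 0 || parts <= 0) {
            throw new IllegalArgumentException("invalid length or part count");
        }
        long[][] segments = new long[parts][2];
        long chunk = fileLength / parts;
        for (int i = 0; i < parts; i++) {
            segments[i][0] = i * chunk;
            // Every segment ends where the next one starts; the last ends at the file length
            segments[i][1] = (i == parts - 1) ? fileLength : (i + 1) * chunk;
        }
        return segments;
    }

    public static void main(String[] args) {
        for (long[] s : split(10L, 3)) {
            System.out.println(s[0] + " - " + s[1]);
        }
    }
}
```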
/*
**SiteFileFetch.java
*/
SiteInfoBean siteInfoBean = null; // File information bean
long[] nStartPos; // Start positions
long[] nEndPos; // End positions
FileSplitterFetch[] fileSplitterFetch; // Child thread objects
long nFileLength; // File length
boolean bFirst = true; // Whether this is the first time the file is fetched
boolean bStop = false; // Stop flag
File tmpFile; // Temporary file holding download progress information
DataOutputStream output; // Output stream to the file
public SiteFileFetch(SiteInfoBean bean) throws IOException
{
    siteInfoBean = bean;
    // tmpFile = File.createTempFile("Zhong", "1111", new File(bean.getSFilePath()));
    tmpFile = new File(bean.getSFilePath() + File.separator + bean.getSFileName() + ".info");
    if (tmpFile.exists())
    {
        // A progress file exists, so this is a resumed download
        bFirst = false;
        read_nPos();
    }
    else
    {
        nStartPos = new long[bean.getNSplitter()];
        nEndPos = new long[bean.getNSplitter()];
    }
}
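public void run()
{
    // Get the file length
    // Split the file
    // Instantiate the FileSplitterFetch threads
    // Start the FileSplitterFetch threads
    // Wait for the child threads to return
    try{
        if (bFirst)
        {
            // getFileSize() is not shown; -1 (length unknown) and -2 (file
            // inaccessible) are assumed to be its error codes
            nFileLength = getFileSize();
            if (nFileLength == -1)
            {
                System.err.println("File length is not known!");
            }
            else if (nFileLength == -2)
            {
                System.err.println("File is not accessible!");
            }
            else
            {
                // Each segment starts at a multiple of (file length / segment count)
                for (int i = 0; i < nStartPos.length; i++)
                {
                    nStartPos[i] = (long) (i * (nFileLength / nStartPos.length));
                }
                // Each segment ends where the next one starts
                for (int i = 0; i < nEndPos.length - 1; i++)
                {
                    nEndPos[i] = nStartPos[i + 1];
                }
                nEndPos[nEndPos.length - 1] = nFileLength;
            }
        }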
/*
**FileSplitterFetch.java
*/
String sURL; // File URL
long nStartPos; // File segment start position
long nEndPos; // File segment end position
int nThreadID; // Thread's ID
boolean bDownOver = false; // Whether this segment has finished downloading
boolean bStop = false; // Stop flag
FileAccessI fileAccessI = null; // File access interface
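/*
**FileAccessI.java
*/
import java.io.IOException;
import java.io.RandomAccessFile;

public class FileAccessI {
    RandomAccessFile oSavedFile;
    long nPos;

    public FileAccessI() throws IOException
    {
        this("", 0);
    }

    public FileAccessI(String sName, long nPos) throws IOException
    {
        oSavedFile = new RandomAccessFile(sName, "rw");
        this.nPos = nPos;
        // Position the file pointer where this segment begins
        oSavedFile.seek(nPos);
    }

    // Writes nLen bytes of b starting at nStart; returns nLen, or -1 on failure
    public synchronized int write(byte[] b, int nStart, int nLen)
    {
        int n = -1;
        try{
            oSavedFile.write(b, nStart, nLen);
            n = nLen;
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        return n;
    }
}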
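Because each FileAccessI instance opens its own RandomAccessFile and seeks to its own start position, several threads can fill different regions of the same file independently. The following minimal sketch demonstrates that idea with two threads writing adjacent four-byte regions of a temporary file. The class SegmentWriteDemo and its methods are hypothetical and stand in for the download threads of the original program.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SegmentWriteDemo {
    /** Writes data at the given offset through a file handle of its own. */
    static void writeAt(String path, long offset, byte[] data) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.seek(offset);
            file.write(data, 0, data.length);
        }
    }

    /** Builds a thread that writes text at offset, as one download thread would. */
    static Thread writer(String path, long offset, String text) {
        return new Thread(() -> {
            try {
                writeAt(path, offset, text.getBytes(StandardCharsets.US_ASCII));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Runs two writers on disjoint regions and returns the resulting file content. */
    static String demo() throws IOException, InterruptedException {
        Path tmp = Files.createTempFile("segments", ".bin");
        try {
            Thread first = writer(tmp.toString(), 0, "AAAA");
            Thread second = writer(tmp.toString(), 4, "BBBB");
            first.start();
            second.start();
            first.join();
            second.join();
            return new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The regions must not overlap; the order in which the two threads run does not affect the result.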
/*
**SiteInfoBean.java
*/
package Netfox;
public class SiteInfoBean {
    private String sSiteURL; // Site's URL
    private String sFilePath; // Saved file's path
    private String sFileName; // Saved file's name
    private int nSplitter; // Number of segments the download is split into
    public SiteInfoBean()
    {
        // Default value of nSplitter is 5
        this("", "", "", 5);
    }