Recently, while working on the Epartner project, I ran into a file upload problem. I had implemented file uploads before, but only for small files of no more than 2 MB. This time the requirement was to upload files larger than 100 MB, and I could not find any material on the subject to study. Web-based file upload can use one of two protocols, FTP or HTTP. FTP transfers are stable, but security is a serious concern, and the FTP server has to read the user database to obtain access permissions, which makes it inconvenient for users. That leaves HTTP. HTTP offers three approaches: PUT, WebDAV, and RFC 1867. The first two are not suitable for large file uploads, so the web uploads we use today are all based on RFC 1867, form-based file upload in HTML.
First, a brief introduction to the RFC 1867 (Form-based File Upload in HTML) standard:
1. HTML forms with file submission
The existing HTML specification defines eight possible values for the TYPE attribute of the INPUT element: CHECKBOX, HIDDEN, IMAGE, PASSWORD, RADIO, RESET, SUBMIT, and TEXT. In addition, when a form uses the POST method, its ENCTYPE attribute defaults to "application/x-www-form-urlencoded".
The RFC 1867 standard makes two changes to HTML:
1) It adds a FILE option for the TYPE attribute of the INPUT element.
2) It allows the INPUT tag to carry an ACCEPT attribute that specifies the file types or file formats that can be uploaded.
In addition, the standard defines a new MIME type, multipart/form-data, and specifies how a client should behave when it processes a form with enctype="multipart/form-data" and/or one that contains <input type="file"> tags.
For example, when an HTML form author wants to allow a user to upload one or more files, he can write:
<FORM ENCTYPE="multipart/form-data" ACTION="_URL_" METHOD=POST>
File to process:
<INPUT NAME="userfile1" TYPE="file">
<INPUT TYPE="submit" VALUE="Send File">
</FORM>
The change to the HTML DTD is to add one option to the InputType entity. In addition, the standard recommends giving the INPUT tag an ACCEPT attribute that holds a comma-separated list of file types.
... (other elements) ...
<!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
                       RADIO | SUBMIT | RESET |
                       IMAGE | HIDDEN | FILE )">
<!ELEMENT INPUT - O EMPTY>
<!ATTLIST INPUT
        TYPE %InputType TEXT
        NAME CDATA #IMPLIED  -- required for all but submit and reset --
        VALUE CDATA #IMPLIED
        SRC %URI #IMPLIED  -- for image inputs --
        CHECKED (CHECKED) #IMPLIED
        SIZE CDATA #IMPLIED  -- like NUMBERS,
                                but delimited with comma, not space --
        MAXLENGTH NUMBER #IMPLIED
        ALIGN (top|middle|bottom) #IMPLIED
        ACCEPT CDATA #IMPLIED  -- list of content types --
        >
... (other elements) ...
2. File transfer delays
In some situations it is advisable for the server to validate certain elements of the form data (such as the user name or account number) before it actually prepares to receive the file data. However, after some consideration, it seems best for servers that want to do this to use a series of forms, sending the previously validated data elements back to the client as "hidden" fields, or to arrange the form so that the elements that need validation come first. That way, only the servers that build a complex application have to maintain the state of the transaction, and simple applications can stay simple.
The HTTP protocol may require the total content length of the whole transaction. Even where it is not strictly required, HTTP clients are encouraged to supply the total content length of all uploaded files, so that a busy server can tell when the file data is too large to process, return an error code, and close the connection without waiting for all of the data to arrive. Some existing CGI applications already require a content length for every POST transaction.
If the INPUT tag carries a MAXLENGTH attribute, the client should treat its value as the maximum number of bytes the server will accept for a transferred file. This lets the server tell the client, before the upload begins, how much space it has available for file uploads. Note, however, that this is only a hint: the server's actual requirements may change between the time the form is created and the time the file is uploaded.
In any case, any HTTP server may abort a transfer partway through if the file it is receiving is too large.
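To make the early length check described above concrete, here is a minimal Python sketch of how a server could decide, from the request headers alone, whether to accept an upload. The function name precheck_upload and the 200 MB limit are assumptions made for illustration; only the HTTP status codes (400, 411, 413) are standard.

```python
from typing import Mapping

# Hypothetical limit chosen for this example: 200 MB.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def precheck_upload(headers: Mapping[str, str], limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return an HTTP status code decided before any body bytes are read.

    Assumes the header name has already been normalised to "Content-Length".
    """
    raw = headers.get("Content-Length")
    if raw is None:
        return 411  # Length Required: the client did not declare a size
    try:
        length = int(raw)
    except ValueError:
        return 400  # Bad Request: the declared size is not a number
    if length < 0:
        return 400
    if length > limit:
        return 413  # too large: reject without waiting for the data
    return 200


# A 1 GB upload is refused from its headers alone.
print(precheck_upload({"Content-Length": str(1024 ** 3)}))
```

A real server would still have to enforce the limit while reading, because the declared length is only what the client claims.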
3. Other approaches to transmitting binary data
Some people have suggested using a new top-level MIME type "aggregate", such as aggregate/mixed, or a content-transfer-encoding of "packet", to describe binary data of indeterminate length instead of relying on multipart-style boundaries. We are not opposed to this, but it would take additional design and standardization work before "aggregate" was understood and accepted. The multipart mechanism, on the other hand, already works well, is very simple to implement in both the sending client and the receiving server, and is as efficient as the other ways of combining binary data.
4. Examples
Assume that the server provides the following HTML:
<FORM ACTION="http://server.dom/cgi/handle"
      ENCTYPE="multipart/form-data"
      METHOD=POST>
What is your name? <INPUT TYPE=TEXT NAME=submitter>
What files are you sending? <INPUT TYPE=FILE NAME=pics>
</FORM>
The user types "Joe Blow" in the name field and, for the question 'What files are you sending?', selects a text file "file1.txt".
The client might send back the following data:
Content-type: multipart/form-data, boundary=AaB03x

--AaB03x
content-disposition: form-data; name="field1"

Joe Blow
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--
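A body like the one above can be assembled by hand. The following Python sketch is only an illustration: the function name build_form_data and its argument layout are assumptions, and it writes "; boundary=" with a semicolon, as current browsers do, where the RFC 1867 example uses a comma.

```python
def build_form_data(fields, files, boundary="AaB03x"):
    """Build a multipart/form-data request body.

    fields: dict mapping field name -> text value
    files:  dict mapping field name -> (filename, content_type, data_bytes)
    Returns (content_type_header_value, body_bytes).
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")  # blank line separates part headers from part body
        lines.append(value.encode("utf-8"))
    for name, (filename, content_type, data) in files.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8")
        )
        lines.append(f"Content-Type: {content_type}".encode("ascii"))
        lines.append(b"")
        lines.append(data)
    lines.append(delimiter + b"--")  # closing delimiter ends the whole body
    lines.append(b"")
    body = b"\r\n".join(lines)
    return f"multipart/form-data; boundary={boundary}", body


content_type, body = build_form_data(
    {"field1": "Joe Blow"},
    {"pics": ("file1.txt", "text/plain", b"... contents of file1.txt ...")},
)
print(content_type)
print(body.decode("ascii"))
```

The sketch does not check that the boundary string is absent from the file data, which a real client must guarantee.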
If the user also selects a second file, the image "file2.gif", the client might send the following data:
Content-type: multipart/form-data, boundary=AaB03x

--AaB03x
content-disposition: form-data; name="field1"

Joe Blow
--AaB03x
content-disposition: form-data; name="pics"
Content-type: multipart/mixed, boundary=BbC04y

--BbC04y
Content-disposition: attachment; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--BbC04y
Content-disposition: attachment; filename="file2.gif"
Content-type: image/gif
Content-Transfer-Encoding: binary

... contents of file2.gif ...
--BbC04y--
--AaB03x--
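To show how a receiver can take such a nested body apart, here is a small Python sketch that uses the standard-library email package. It is an illustration only: that package was written for mail messages and holds the whole body in memory, so it is no answer for 100 MB uploads. The helper name iter_fields and the sample data are assumptions, and the sample uses "; boundary=" with a semicolon rather than the comma shown in the RFC 1867 example.

```python
import email


def iter_fields(message):
    """Yield (field_name, filename, payload_bytes) for every leaf part.

    Files inside a nested multipart/mixed part inherit the field name of
    the enclosing form-data part.
    """
    if not message.is_multipart():
        raise ValueError("expected a multipart message")

    def walk(part, inherited_name=None):
        name = part.get_param("name", header="content-disposition") or inherited_name
        if part.is_multipart():
            for child in part.get_payload():
                yield from walk(child, name)
        else:
            yield name, part.get_filename(), part.get_payload(decode=True)

    for top in message.get_payload():
        yield from walk(top)


# Sample request: one Content-Type header line followed by the body.
raw = (
    b"Content-Type: multipart/form-data; boundary=AaB03x\r\n\r\n"
    b"--AaB03x\r\n"
    b'Content-Disposition: form-data; name="field1"\r\n\r\n'
    b"Joe Blow\r\n"
    b"--AaB03x\r\n"
    b'Content-Disposition: form-data; name="pics"\r\n'
    b"Content-Type: multipart/mixed; boundary=BbC04y\r\n\r\n"
    b"--BbC04y\r\n"
    b'Content-Disposition: attachment; filename="file1.txt"\r\n'
    b"Content-Type: text/plain\r\n\r\n"
    b"hello\r\n"
    b"--BbC04y\r\n"
    b'Content-Disposition: attachment; filename="file2.gif"\r\n'
    b"Content-Type: image/gif\r\n"
    b"Content-Transfer-Encoding: binary\r\n\r\n"
    b"GIF89a\r\n"
    b"--BbC04y--\r\n"
    b"--AaB03x--\r\n"
)

for field, filename, payload in iter_fields(email.message_from_bytes(raw)):
    print(field, filename, len(payload))
```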
Second, two ways of handling file uploads under the RFC 1867 standard:
1. Read all of the uploaded data at once, then parse and process it.
After reading a lot of code, I found that the component-free upload scripts and some COM components all use the Request.BinaryRead method: they read the uploaded data in one go and then parse it. This is why uploading large files is so slow. Leaving IIS timeouts aside, even once a file of several hundred megabytes has arrived, parsing it still takes a while.
2. Write to disk while receiving the file.
Looking at commercial components from abroad, the popular ones are Power-Web, AspUpload, ActiveFile, ABCUpload, AspSmartUpload, and SA-FileUp. The better ones are AspUpload and SA-FileUp. They claim to handle files of 2 GB (the EE edition of SA-FileUp even claims no file size limit), and they are also very efficient. Can the choice of programming language really make that much difference? I looked up some material and concluded that they all operate directly on the file stream, which is why file size does not constrain them. But these products are not perfect either: when AspUpload handles a big file, its memory usage is astonishing, and 1 GB or so is commonplace. As for SA-FileUp, it is a good product but hard to get hold of. I then found two .NET upload components, Lion.Web.UpLoadModule and AspNetUpload, which also operate on the file stream, but their upload speed and CPU usage are not as good as those of the commercial components.
I ran a test, uploading a 1 GB file over the LAN. AspUpload averaged 4.4 MB/s, with CPU usage of 10-15% and memory usage of 700 MB. SA-FileUp was similar. AspNetUpload peaked at only 1.5 MB/s and averaged 700 KB/s, with CPU usage of 15-39%. Test environment: PIII 800, 256 MB of memory, 100 Mbit/s LAN. I think AspNetUpload is slow probably because it writes to disk while it receives the file: the price of low resource consumption is lower transfer speed. Still, I have to admire the commercial components for keeping CPU usage so low.
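The second approach, writing to disk while receiving, comes down to a loop like the one in this Python sketch. It does not reproduce how any of the components above work internally; the function name save_stream, the 64 KB chunk size, and the progress callback are assumptions made for illustration.

```python
import io
import os
import tempfile


def save_stream(src, dest_path, chunk_size=64 * 1024, on_progress=None):
    """Copy a readable binary stream to dest_path one chunk at a time.

    At most chunk_size bytes are held in memory at once, however large
    the upload is. Returns the total number of bytes written.
    """
    total = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of stream
                break
            out.write(chunk)
            total += len(chunk)
            if on_progress is not None:
                on_progress(total)  # e.g. update an upload progress display
    return total


# Demo: an in-memory stream stands in for the incoming request body.
incoming = io.BytesIO(b"x" * 300_000)
with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "upload.bin")
    written = save_stream(incoming, target, on_progress=lambda n: None)
    print(written, os.path.getsize(target))
```

By contrast, the first approach amounts to a single src.read() with no size argument, which needs as much memory as the whole upload.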
Third, problems encountered when uploading files in ASP.NET
We have all run into one problem or another when uploading large files with ASP.NET. Setting a large maxRequestLength value does not completely solve the problem, because ASP.NET blocks until the entire file has been loaded into memory and only then processes it. In fact, if the file is very large, Internet Explorer often displays "The page cannot be displayed - Cannot find server or DNS Error", and there seems to be no way to catch this error. Why? Because it is a client-side error, and the server-side Application_Error handler never sees it.
Fourth, a solution for large file uploads in ASP.NET
The solution is to use the underlying HttpWorkerRequest and its GetPreloadedEntityBody and ReadEntityBody methods to read the data in chunks from the pipe that IIS creates for ASP.NET. Chris Hynes provides a scheme (using an HttpModule) that lets you upload large files and show the upload progress in real time.
Both .NET components, Lion.Web.UpLoadModule and AspNetUpload, use this scheme.
How the scheme works:
Use an HttpHandler to implement functionality similar to an ISAPI extension: process the request information and send the response.
Key points of the scheme:
1. HttpHandler or HttpModule
A. Intercept the request before the ASP.NET process handles it
B. Read and write the data in chunks
C. Track the upload progress in real time and update the meta information
2. Use the underlying HttpWorkerRequest and its GetPreloadedEntityBody and ReadEntityBody methods to process the file stream
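// Requires: using System; using System.Web;
// HttpContext implements IServiceProvider, which exposes the worker request.
IServiceProvider provider = (IServiceProvider)HttpContext.Current;
HttpWorkerRequest wr = (HttpWorkerRequest)provider.GetService(typeof(HttpWorkerRequest));
// The part of the request body that has already been read.
byte[] bs = wr.GetPreloadedEntityBody();
....
// Read the rest of the body from the pipe in 1 KB chunks.
if (!wr.IsEntireEntityBodyIsPreloaded())
{
    int n = 1024;
    byte[] bs2 = new byte[n];
    while (wr.ReadEntityBody(bs2, n) > 0)
    {
        .....
    }
}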
3. Custom multipart MIME parser
Automatically detect the MIME boundary delimiters
Write the file in chunks to a temporary file
Update the Application state in real time (ReceivingData, Error, Complete)
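The hard part of such a parser is that a boundary delimiter can be split across two chunks. The Python sketch below shows one way to handle that, by holding back a short tail of each chunk. It is not the parser used by the components above: the names scan and collect are assumptions, and a full parser would also have to read the part headers and treat the first and the closing delimiter specially.

```python
def scan(chunks, delimiter):
    """Yield ("data", bytes) and ("boundary", b"") events from byte chunks.

    The last len(delimiter) - 1 bytes of the buffer are held back after
    each chunk, because they might be the start of a delimiter whose
    remainder arrives in the next chunk.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    keep = len(delimiter) - 1
    buf = b""
    for chunk in chunks:
        buf += chunk
        while True:
            idx = buf.find(delimiter)
            if idx < 0:
                break
            if idx:
                yield ("data", buf[:idx])
            yield ("boundary", b"")
            buf = buf[idx + len(delimiter):]
        # No complete delimiter remains: flush everything except the tail.
        if len(buf) > keep:
            yield ("data", buf[:len(buf) - keep])
            buf = buf[len(buf) - keep:]
    if buf:
        yield ("data", buf)


def collect(events):
    """Join the data events between boundaries into a list of segments."""
    segments = [b""]
    for kind, data in events:
        if kind == "boundary":
            segments.append(b"")
        else:
            segments[-1] += data
    return segments


stream = b"aaa\r\n--Xbbb\r\n--Xccc"
three_byte_chunks = [stream[i:i + 3] for i in range(0, len(stream), 3)]
print(collect(scan(three_byte_chunks, b"\r\n--X")))
```

In an upload module each "data" event would be appended to the temporary file, and each "boundary" event would close the current part and update the status.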