Organization: China Interactive Publishing Network (http://www.china-pub.com/)
RFC Document Chinese Translation Plan (http://www.china-pub.com/compters/emook/aboutemook.htm)
Copyright: The copyright of this Chinese translation belongs to China Interactive Publishing Network. It may be freely reproduced for non-commercial purposes, provided that the translation and copyright information of this document is retained.
Network Working Group                                           E. Nebel
Request for Comments: 1867                                   L. Masinter
Category: Experimental                                 Xerox Corporation
Form-based File Upload in HTML
(RFC 1867: Form-based File Upload in HTML)
Status of this Memo
This memo defines an Experimental Protocol for the Internet community. This memo does not specify an Internet standard of any kind. Discussion and suggestions for improvement are requested. Distribution of this memo is unlimited.
1. Summary
2. HTML forms with file submission
3. Suggested implementation
3.1 Display of the FILE widget
3.2 Action on submit
3.3 Use of multipart/form-data
3.4 Interpretation of other attributes
4. Backward compatibility issues
5. Other considerations
5.1 Compression, encryption
5.2 Deferred file transmission
5.3 Other choices for return transmission of binary data
5.4 Not overloading <INPUT>
5.5 Default content-type of field data
5.6 Allow form ACTION to be "mailto:"
5.7 Remote files with third-party transfer
5.8 File transfer with ENCTYPE=x-www-form-urlencoded
5.9 CRLF used as line separator
5.10 Relationship to multipart/related
5.11 Non-ASCII field names
6. Example
7. Registration of multipart/form-data
8. Security considerations
9. Conclusion
Authors' Addresses
A. Media type registration for multipart/form-data
1. Summary
Currently, HTML forms allow the producer of a form to request information from the user reading the form. These forms have proven useful in a wide variety of applications in which input from the user is necessary. However, this capability is limited because HTML forms do not provide a way to ask the user to submit files of data. Service providers who need to get files from the user have had to implement custom user applications. (Examples of these custom browsers have appeared on the www-talk mailing list.) Since file upload is a feature that will benefit many applications, this memo proposes an extension to HTML that allows information providers to express file upload requests uniformly, together with a MIME-compatible representation for file upload responses. It also describes a backward compatibility strategy that allows new servers to interact with existing HTML user agents.
The proposal is independent of which version of HTML it becomes a part of.
2. HTML forms with file submission
The current HTML specification defines eight possible values for the TYPE attribute of an INPUT element: CHECKBOX, HIDDEN, IMAGE, PASSWORD, RADIO, RESET, SUBMIT, TEXT. In addition, a FORM that uses the POST method has a default ENCTYPE of "application/x-www-form-urlencoded".
This proposal makes two changes to HTML:
1) Add a FILE option for the TYPE attribute of INPUT.
2) Allow an ACCEPT attribute for the INPUT tag, which lists the media types or type patterns allowed for the input.
In addition, this proposal defines a new MIME media type, multipart/form-data, and specifies how HTML user agents should behave when interpreting a form with ENCTYPE="multipart/form-data" and/or <INPUT TYPE="file"> tags.
These changes might be considered independently, but all of them are necessary for reasonable file upload.
The author of an HTML form who wants to request one or more files from a user would write, for example:
     <FORM ENCTYPE="multipart/form-data" ACTION="_URL_" METHOD=POST>
     File to process: <INPUT NAME="userfile1" TYPE="file">
     <INPUT TYPE="submit" VALUE="Send File">
     </FORM>
The change to the HTML DTD is to add one item to the entity "InputType". In addition, it is proposed that the INPUT tag have an ACCEPT attribute, which is a comma-separated list of media types.
     ... (other elements) ...
     <!ENTITY % InputType "(TEXT | PASSWORD | CHECKBOX |
                            RADIO | SUBMIT | RESET |
                            IMAGE | HIDDEN | FILE )">
     <!ELEMENT INPUT - O EMPTY>
     <!ATTLIST INPUT
             TYPE %InputType TEXT
             NAME CDATA #IMPLIED  -- required for all but submit and reset --
             VALUE CDATA #IMPLIED
             SRC %URI #IMPLIED  -- for image inputs --
             CHECKED (CHECKED) #IMPLIED
             SIZE CDATA #IMPLIED  -- like NUMBERS,
                                     but delimited with comma, not space --
             MAXLENGTH NUMBER #IMPLIED
             ALIGN (top|middle|bottom) #IMPLIED
             ACCEPT CDATA #IMPLIED  -- list of content types --
             >
     ... (other elements) ...
3. Suggested implementation
While user agents that interpret HTML have wide leeway to choose the most appropriate mechanism for their context, this section suggests how one class of user agent, WWW browsers, might implement file upload.
3.1 Display of the FILE widget
When an INPUT tag of type FILE is encountered, the browser might show a display of (previously selected) file names and a "Browse" button or similar selection method. Selecting the "Browse" button would cause the browser to enter a file selection mode appropriate for the platform. Window-based browsers might pop up a file selection window, for example. In such a file selection dialog, the user would have the option of replacing the current selection, adding a new file to the current selection, and so on. Browser implementors might choose to let the list of file names be edited manually.
If an ACCEPT attribute is present, the browser might constrain the file patterns offered to those that match the corresponding file extensions for the platform.
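One way a browser might map an ACCEPT list to platform file extensions is sketched below in Python. This is only an illustration under stated assumptions: the helper names `accepted_extensions` and `matches_accept` are hypothetical, the mapping relies on the standard `mimetypes` table rather than any platform-specific typing information, and wildcard patterns are not handled.

```python
import mimetypes
import os


def accepted_extensions(accept_attr):
    """Collect the file extensions known for each media type in an ACCEPT list."""
    extensions = set()
    for media_type in accept_attr.split(","):
        media_type = media_type.strip().lower()
        if media_type:
            extensions.update(mimetypes.guess_all_extensions(media_type))
    return extensions


def matches_accept(filename, accept_attr):
    """Return True if the file's extension belongs to one of the accepted types."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in accepted_extensions(accept_attr)
```

A file dialog could call `matches_accept` on each candidate file name to decide whether to show it.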
3.2 Action on submit
When the user completes the form and selects the SUBMIT element, the browser should send the form data and the content of the selected files. The encoding type application/x-www-form-urlencoded is inefficient for sending large quantities of binary data or text containing non-ASCII characters. Thus, a new media type, multipart/form-data, is proposed as a way of efficiently sending the values associated with a filled-out form from client to server.
3.3 Use of multipart/form-data
The definition of multipart/form-data is included in section 7. A boundary is selected that does not occur in any of the data. (This selection is sometimes done probabilistically.) Each field of the form is sent, in the order in which it occurs in the form, as a part of the multipart stream. Each part identifies the INPUT name within the original HTML form. Each part should be labelled with an appropriate content-type if the media type is known (e.g., inferred from the file extension or operating system typing information), or otherwise as application/octet-stream.
If multiple files are selected, they should be transferred together using the multipart/mixed format.
While the HTTP protocol can transport arbitrary binary data, the default for mail transport (e.g., if the ACTION is a "mailto:" URL) is the 7BIT encoding. If the value supplied for a part does not conform to the default encoding, it may need to be encoded and the "content-transfer-encoding" header supplied. (See section 5 of RFC 1521 for more details.)
The original local file name may be supplied as well, either as a 'filename' parameter of the 'content-disposition: form-data' header or, in the case of multiple files, in a 'content-disposition: file' header of the subpart. The client application should make a best effort to supply the file name; if the file name on the client's operating system is not in US-ASCII, it might be approximated or encoded using the method of RFC 1522. This is a convenience for cases where, for example, the uploaded files contain references to each other, such as a TeX file and its .sty auxiliary style description.
On the server end, the ACTION might point to an HTTP URL that implements the form's action via CGI. In such a case, the CGI program would note that the content-type is multipart/form-data and parse the various fields (checking for validity, writing the file data to local files for subsequent processing, etc.).
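The client-side steps of section 3.3 can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names `choose_boundary` and `encode_form`, the use of random UUIDs as boundary candidates, and the `(name, data, filename)` entry layout are all assumptions made for the example. It writes the boundary parameter with the standard MIME ';' separator and does not cover the multipart/mixed case for several files in one field.

```python
import mimetypes
import uuid

CRLF = b"\r\n"


def choose_boundary(payloads):
    """Pick a random boundary; retry until it occurs in none of the payloads."""
    while True:
        boundary = uuid.uuid4().hex.encode("ascii")
        if not any(boundary in data for data in payloads):
            return boundary


def encode_form(entries):
    """Encode a filled-out form as multipart/form-data.

    entries: list of (name, data_bytes, filename_or_None) in form order.
    Returns (content_type_header_value, body_bytes).
    """
    boundary = choose_boundary([data for _, data, _ in entries])
    lines = []
    for name, data, filename in entries:
        disposition = f'Content-Disposition: form-data; name="{name}"'
        lines.append(b"--" + boundary)
        if filename is None:
            # Plain field: no Content-Type header, so it defaults to text/plain.
            lines.append(disposition.encode("ascii"))
        else:
            # File field: label with a guessed type, else application/octet-stream.
            media_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
            lines.append(f'{disposition}; filename="{filename}"'.encode("ascii"))
            lines.append(f"Content-Type: {media_type}".encode("ascii"))
        lines += [b"", data]  # blank line separates part headers from content
    lines += [b"--" + boundary + b"--", b""]
    content_type = "multipart/form-data; boundary=" + boundary.decode("ascii")
    return content_type, CRLF.join(lines)
```

For instance, `encode_form([("submitter", b"Joe Blow", None), ("pics", b"...", "file1.txt")])` returns the header value and body a client could send in a POST.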
3.4 Interpretation of other attributes
The VALUE attribute might be used with <INPUT TYPE=file> tags to supply a default file name. This use is probably platform dependent. It might be useful, however, in sequences of more than one transaction, e.g., to avoid prompting the user for the same file name over and over again.
The SIZE attribute might be specified using SIZE=width,height, where width is some default for the file name width and height is the expected size of the area showing the list of selected files. For example, this would be useful for form designers who expect several files and would like to show a multiline file input field in the browser (with a "browse" button beside it, hopefully). It would be useful to show a one-line text field when no height is specified (when the form designer expects only one file) and to show a multiline text area with scrollbars when the height is greater than 1 (when the form designer expects multiple files).
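The SIZE=width,height convention could be interpreted along the following lines. This Python sketch is only illustrative; the function name `parse_size` and the choice to treat a missing height as 1 are assumptions, not something the proposal specifies.

```python
def parse_size(value):
    """Split a SIZE attribute of the form 'width' or 'width,height'.

    Returns (width, height, multiline), where multiline is True when the
    height is greater than 1 and a scrolling file list would be shown.
    """
    parts = [part.strip() for part in value.split(",")]
    width = int(parts[0])
    # Assumption: a missing height means a single-line file input field.
    height = int(parts[1]) if len(parts) > 1 and parts[1] else 1
    return width, height, height > 1
```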
4. Backward compatibility issues
While not necessary for successful adoption of an enhancement to the current WWW form mechanism, it is useful to plan for a migration strategy: users with older browsers can still participate in file upload dialogs by using a helper application. Most current web browsers, when given <INPUT TYPE=FILE>, will treat it as <INPUT TYPE=TEXT> and give the user a text box. The user can type a file name into this text box. In addition, current browsers seem to ignore the ENCTYPE parameter in the <FORM> element and always transmit the data as application/x-www-form-urlencoded.
Thus, the server CGI might be written so that it notes when the form data returned has content-type application/x-www-form-urlencoded instead of multipart/form-data, and concludes that the user's browser does not implement file upload.
In this case, rather than replying with a "text/html" response, the CGI on the server could send back a data stream for a helper application to process. This would be a data stream of type "application/x-please-send-files", which contains:
     * The (fully qualified) URL to which the actual form data should be posted (terminated with CRLF)
     * The list of field names that were supposed to be file contents (space separated, terminated with CRLF)
     * The entire original application/x-www-form-urlencoded form data as originally sent from client to server
In this case, the browser needs to be configured to launch a helper application to process application/x-please-send-files.
The helper would read the form data, note which fields contain "local file names" that need to be replaced with their data content, possibly prompt the user to change or add to the list of files, and then repackage the data and file contents as multipart/form-data for retransmission to the server.
The helper would generate the kind of data that a "new" browser would have sent in the first place, with the intention that the URL to which it is sent corresponds to the original ACTION URL. The point of this is that the server can use the *same* CGI to deal with both old and new browsers.
The helper need not display the form data, but *should* ensure that the user is actually prompted about the suitability of sending the files requested. (This is to avoid a security problem with malicious servers that ask for files the user never agreed to send.) It would be useful if the status of the file transfer could be displayed.
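The three-part "application/x-please-send-files" stream described above could be produced and read roughly as in this Python sketch. The function names are hypothetical, and the sketch assumes ASCII URLs and field names; it only illustrates the layout (URL line, field-name line, then the original urlencoded data) and leaves out the user prompt and the retransmission.

```python
from urllib.parse import parse_qsl


def build_please_send_files(action_url, file_fields, original_form_data):
    """Server side: URL + CRLF, space-separated file field names + CRLF, form data."""
    return (action_url.encode("ascii") + b"\r\n"
            + " ".join(file_fields).encode("ascii") + b"\r\n"
            + original_form_data)


def parse_please_send_files(stream):
    """Helper side: recover the target URL, the file field names, and the form fields."""
    url_line, fields_line, form_data = stream.split(b"\r\n", 2)
    fields = parse_qsl(form_data.decode("ascii"), keep_blank_values=True)
    return url_line.decode("ascii"), fields_line.decode("ascii").split(), fields
```

After parsing, a helper would replace the value of each listed file field with the content of the named local file, once the user has confirmed the transfer.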
5. Other considerations
5.1 Compression, encryption
This scheme does not address the possible compression of files. After some consideration, it seemed that the optimization issues of file compression were too complex to have browsers decide automatically which files should be compressed. Many link-layer transport mechanisms (e.g., high-speed modems) perform data compression over the link, and optimizing for compressed data might not be appropriate. It might be possible for browsers to optionally produce a content-transfer-encoding of x-compress for file data, and for servers to decompress the data before processing, if desired; this was left out of the proposal, however.
Similarly, the proposal does not contain a mechanism for encryption of the data; this should be handled by whatever other mechanisms are in place for secure transmission of data, whether via secure HTTP or mail.
5.2 Deferred file transmission
In some situations, it might be advisable to have the server validate various elements of the form data (user name, account, etc.) before actually preparing to receive the data. However, after some consideration, it seemed best to require that servers wishing to do this implement the transaction as a sequence of forms, where data elements that were previously validated are sent back to the client as "hidden" fields, or arrange the form so that the elements that need validation occur first. This puts the onus of maintaining the state of a transaction only on those servers that wish to build a complex application, while allowing cases with simple input needs to be built simply.
The HTTP protocol may require a content-length for the overall transmission. Even if it does not, HTTP clients are encouraged to supply the content-length of the overall file input, so that a busy server can detect whether the proposed file data is too large to be processed reasonably and simply return an error code and close the connection, without waiting to process all of the incoming data. Some current implementations of CGI require a content-length in all POST transactions.
If the INPUT tag includes the MAXLENGTH attribute, the user agent should consider its value to be the maximum Content-Length (in bytes) that the server will accept for transferred files. In this way, servers can hint to the client how much space they have available for a file upload before the upload takes place. It is important to note, however, that this is only a hint, and the actual requirements of the server may change between form creation and file submission.
In any case, an HTTP server may abort a file upload in the middle of the transaction if the file being received is too large.
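A server-side check of the declared Content-Length before reading any file data might look like the Python sketch below. The function name `should_accept_upload`, the dictionary of request headers, and the specific status codes (411 and 413, taken from later HTTP/1.1 usage) are assumptions for illustration; the text above only says that the server returns an error code and closes the connection.

```python
def should_accept_upload(headers, max_bytes):
    """Decide from the declared Content-Length whether to read the request body.

    headers: mapping of request header names to string values.
    Returns (accept, status_code).
    """
    declared = headers.get("Content-Length")
    if declared is None or not declared.strip().isdigit():
        return False, 411  # assumption: reject requests without a usable length
    if int(declared) > max_bytes:
        return False, 413  # too large: refuse before reading the incoming data
    return True, 200
```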
5.3 Other choices for return transmission of binary data
Various people have suggested using a new MIME top-level type "aggregate", e.g., aggregate/mixed, or a content-transfer-encoding of "packet" to express binary data of indeterminate length, rather than relying on multipart-style boundaries. While we are not opposed to doing so, this would require additional design and standardization work to gain acceptance of "aggregate". On the other hand, the "multipart" mechanisms are well established, simple to implement on both the sending client and the receiving server, and as efficient as other methods of dealing with multiple combinations of binary data.
5.4 Not overloading <INPUT>
Various people have wondered about the advisability of overloading INPUT for this function, rather than providing a different type of FORM element. Among other considerations, the migration strategy that using <INPUT> allows is important. In addition, the <INPUT> field *is* already overloaded to contain most kinds of data input; rather than creating multiple kinds of input tags, it seems most reasonable to enhance <INPUT>. The "type" of INPUT is not the content-type of what is returned, but rather the "widget-type"; that is, it identifies the style of interaction with the user. The description here is carefully written to allow <INPUT TYPE=FILE> to work for text browsers or audio markup.
5.5 Default content-type of field data
Many input fields in HTML are to be typed in. There has been some ambiguity as to how form data should be transmitted back to servers. Making the content-type of input fields text/plain clearly resolves this ambiguity: the client should properly encode the data, with CRLF line breaks, before sending it back to the server.
5.6 Allow form ACTION to be "mailto:"
Independent of this proposal, it would be very useful for HTML-interpreting user agents to allow the ACTION in a form to be a "mailto:" URL. This seems like a good idea, with or without this proposal. Similarly, the ACTION for an HTML form that is received via mail should probably default to the "reply-to:" of the message. These two proposals would allow HTML forms to be served via HTTP servers but sent back via mail or, alternatively, allow HTML forms to be sent by mail, filled out by HTML-aware mail recipients, and the results mailed back.
5.7 Remote files with third-party transfer
In some scenarios, the user operating the client software might want to specify a URL for remote data rather than a local file. In this case, is there a way for the browser to send the server a pointer to the external data rather than the entire contents? This capability could be implemented, for example, by having the client send the server data of type "message/external-body" with "access-type" set to, say, "uri", and the URL of the remote data in the body of the message.
5.8 File transfer with ENCTYPE=x-www-form-urlencoded
If a form contains <INPUT TYPE=file> elements but the enclosing <FORM> does not contain an ENCTYPE, the behavior is not specified. It is probably inappropriate to URL-encode large quantities of data and send them to servers that do not expect it.
5.9 CRLF used as line separator
As with all MIME transmissions, CRLF is used as the line separator when form data is sent by POST as multipart/form-data.
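A client could normalize the line breaks of a typed-in text field to CRLF before placing it in a part, for example as in this Python sketch. The helper name `to_crlf` is hypothetical, and the sketch assumes the field value is available as a text string; file contents sent as binary data would not be rewritten this way.

```python
import re


def to_crlf(text):
    """Rewrite bare CR, bare LF, and existing CRLF line breaks as CRLF."""
    # The alternation tries CRLF first, so existing pairs are not doubled.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)
```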
5.10 Relationship to multipart/related
The MIMESGML group is proposing a new type called multipart/related. While it contains features similar to those of multipart/form-data, the use and application of form-data is different enough that form-data is being described separately.
It might be possible at some point to encode the result of HTML forms (including files) in a multipart/related body part; this is not incompatible with this proposal.
5.11 Non-ASCII field names
Note that MIME headers are generally required to consist only of 7-bit data in the US-ASCII character set. Hence, field names that contain characters outside that set should be encoded according to the method of RFC 1522. In HTML 2.0, the default character set is ISO-8859-1, but non-ASCII characters in field names should still be encoded.
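As an illustration of the RFC 1522 encoding mentioned above, Python's standard `email.header` module can produce and decode encoded words. The helper names `encode_field_name` and `decode_field_name` are hypothetical, and ISO-8859-1 is assumed as the source character set because it is the HTML 2.0 default; this is a sketch rather than a statement of how any particular browser behaves.

```python
from email.header import Header, decode_header, make_header


def encode_field_name(name, charset="iso-8859-1"):
    """Return the name unchanged if it is US-ASCII, else as an RFC 1522 encoded word."""
    try:
        name.encode("ascii")
        return name
    except UnicodeEncodeError:
        return Header(name, charset).encode()


def decode_field_name(value):
    """Turn a possibly encoded field name back into text."""
    return str(make_header(decode_header(value)))
```

The encoded form would then be used as the value of the name parameter in the part's content-disposition header.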
6. Example
Suppose the server supplies the following HTML:
     <FORM ACTION="http://server.dom/cgi/handle"
           ENCTYPE="multipart/form-data"
           METHOD=POST>
     What is your name? <INPUT TYPE=TEXT NAME=submitter>
     What files are you sending? <INPUT TYPE=FILE NAME=pics>
     </FORM>
and the user types "Joe Blow" in the name field and, in answer to 'What files are you sending?', selects a text file "file1.txt".
The client might send back the following data:
     Content-type: multipart/form-data, boundary=AaB03x

     --AaB03x
     content-disposition: form-data; name="field1"

     Joe Blow
     --AaB03x
     content-disposition: form-data; name="pics"; filename="file1.txt"
     Content-Type: text/plain

     ... contents of file1.txt ...
     --AaB03x--
If the user also selected an image file "file2.gif" in answer to 'What files are you sending?', the client might send back the following data:
     Content-type: multipart/form-data, boundary=AaB03x

     --AaB03x
     content-disposition: form-data; name="field1"

     Joe Blow
     --AaB03x
     content-disposition: form-data; name="pics"
     Content-type: multipart/mixed, boundary=BbC04y

     --BbC04y
     Content-disposition: attachment; filename="file1.txt"
     Content-Type: text/plain

     ... contents of file1.txt ...
     --BbC04y
     Content-disposition: attachment; filename="file2.gif"
     Content-type: image/gif
     Content-Transfer-Encoding: binary

     ... contents of file2.gif ...
     --BbC04y--
     --AaB03x--
7. Registration of multipart/form-data
The media type multipart/form-data follows the rules of all multipart MIME data streams as outlined in RFC 1521. It is intended for returning the data that results from filling out a form. In a form (in HTML, although other applications may also use forms), there is a series of fields to be supplied by the user who fills out the form. Each field has a name. Within a given form, the names are unique.
multipart/form-data contains a series of parts. Each part is expected to contain a content-disposition header whose value is "form-data" and whose name attribute specifies the field name within the form, e.g., 'content-disposition: form-data; name="xxxxx"', where xxxxx is the field name corresponding to that field. Field names originally in non-ASCII character sets may be encoded using the method outlined in RFC 1522.
As with all multipart MIME types, each part has an optional Content-Type, which defaults to text/plain. If the contents of a file are returned via filling out a form, the file input is identified as application/octet-stream or, if known, as the appropriate media type. If multiple files are to be returned as the result of a single form entry, they can be returned as multipart/mixed embedded within the multipart/form-data.
Each part may be encoded, and the "content-transfer-encoding" header supplied, if the value of that part does not conform to the default encoding.
File inputs may also identify the file name. The file name may be described using the 'filename' parameter of the "content-disposition" header. This is not required, but it is strongly recommended whenever the original file name is known. This is useful or necessary in many applications.
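On the receiving side, a CGI program could split a multipart/form-data body into its parts with a general MIME parser. The Python sketch below uses the standard `email` package; the function name `parse_form_data` and the returned tuple layout are assumptions for illustration, and the Content-Type value passed in is expected to use the standard MIME ';' separator before the boundary parameter.

```python
from email import message_from_bytes


def parse_form_data(content_type, body):
    """Return a list of (field_name, filename, media_type, payload_bytes)."""
    # Prepend the request's Content-Type so the parser sees a complete MIME entity.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    results = []
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        # Several files for one field arrive as an embedded multipart/mixed.
        subparts = part.get_payload() if part.is_multipart() else [part]
        for sub in subparts:
            results.append((name,
                            sub.get_filename(),      # None for plain fields
                            sub.get_content_type(),  # defaults to text/plain
                            sub.get_payload(decode=True)))
    return results
```

Each returned tuple gives the form field name, the original file name if one was supplied, the declared or default media type, and the part content as bytes.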
8. Security considerations
It is important that a user agent not send any file that the user has not explicitly asked to be sent. Thus, HTML-interpreting agents are expected to confirm any default file names suggested with <INPUT TYPE=file VALUE="yyyy">. Hidden fields must never be able to specify any file.
This proposal does not contain a mechanism for encryption of the data; this should be handled by whatever other mechanisms are in place for secure transmission of data, whether via secure HTTP or by the security provided by MOSS (described in RFC 1848).
Once the file is uploaded, it is up to the receiver to process and store the file appropriately.
9. Conclusion
The suggested implementation gives the client a lot of flexibility in the number and types of files it can send to the server, it gives the server control over the decision to accept the files, and it gives servers a chance to interact with browsers that do not support INPUT TYPE "file".
The change to the HTML DTD is very simple, but very powerful. It enables a much greater variety of services to be implemented via the World-Wide Web than is currently possible, given the lack of a file submission facility. This would be an extremely valuable addition to the capabilities of the World-Wide Web.
Authors' Addresses
     L. Masinter
     Xerox Palo Alto Research Center
     3333 Coyote Hill Road
     Palo Alto, CA 94304
     Phone: (415) 812-4365
     Fax: (415) 812-4333
     E. Nebel
     XSoft, Xerox Corporation
     10875 Rancho Bernardo Road, Suite 200
     San Diego, CA 92127-2116
     Phone: (619) 676-7817
     Fax: (619) 676-7865
A. Media type registration for multipart/form-data
Media type name:
     multipart
Media subtype name:
     form-data
Encoding considerations:
     No additional considerations other than as for other multipart types.
Security considerations:
     The multipart/form-data type introduces no new security considerations beyond what might occur with any of the enclosed parts.
References
[RFC 1521] MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. N. Borenstein & N. Freed.
[RFC 1522] MIME (Multipurpose Internet Mail Extensions) Part Two: Message Header Extensions for Non-ASCII Text.
[RFC 1806] Communicating Presentation Information in Internet Messages: The Content-Disposition Header. R. Troost & S. Dorner.