Uploading files in Django is simple, especially with request.FILES. In terms of performance, however, it is not very good.
I am using Django SVN trunk revision 6635. When a large file is uploaded, both memory usage and CPU usage are high. After checking the code, I found that uploaded files are obtained by the parse_file_upload function in django/http/__init__.py. It parses the data POSTed by the client and puts it into the POST and FILES collections. The code is as follows:
    def parse_file_upload(header_dict, post_data):
        "Returns a tuple of (POST QueryDict, FILES MultiValueDict)"
        import email, email.Message
        from cgi import parse_header
        raw_message = '\r\n'.join(['%s:%s' % pair for pair in header_dict.items()])
        raw_message += '\r\n\r\n' + post_data
        msg = email.message_from_string(raw_message)
        POST = QueryDict('', mutable=True)
        FILES = MultiValueDict()
        for submessage in msg.get_payload():
            if submessage and isinstance(submessage, email.Message.Message):
                name_dict = parse_header(submessage['Content-Disposition'])[1]
                # name_dict is something like {'name': 'file', 'filename': 'test.txt'} for file uploads
                # or {'name': 'blah'} for POST fields
                # We assume all uploaded files have a 'filename' set.
                if 'filename' in name_dict:
                    assert type([]) != type(submessage.get_payload()), "Nested MIME messages are not supported"
                    if not name_dict['filename'].strip():
                        continue
                    # IE submits the full path, so trim everything but the basename.
                    # (We can't use os.path.basename because that uses the server's
                    # directory separator, which may not be the same as the
                    # client's one.)
                    filename = name_dict['filename'][name_dict['filename'].rfind("\\")+1:]
                    FILES.appendlist(name_dict['name'], FileDict({
                        'filename': filename,
                        'content-type': 'Content-Type' in submessage and submessage['Content-Type'] or None,
                        'content': submessage.get_payload(),
                    }))
                else:
                    POST.appendlist(name_dict['name'], submessage.get_payload())
        return POST, FILES
Reading this is alarming. The entire request body is held in memory (raw_message), and the content of each file is then stored as a string in FILES. If someone uploads a large file, the server could well be brought down. Both modpython.py and wsgi.py call parse_file_upload to obtain the files.
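To make the in-memory behaviour concrete, here is a minimal sketch of the same parsing idea using the email module from a current Python 3 standard library. It is not Django's code: the function name parse_multipart_in_memory and the sample body are made up for illustration, and get_param stands in for the cgi.parse_header call above.

```python
import email


def parse_multipart_in_memory(content_type, body):
    """Parse a multipart/form-data body held entirely in one string.

    Hypothetical helper mirroring the approach of parse_file_upload:
    headers and body are glued into one message and parsed in memory.
    """
    raw_message = "Content-Type: %s\r\n\r\n%s" % (content_type, body)
    msg = email.message_from_string(raw_message)
    fields, files = {}, {}
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        filename = part.get_param("filename", header="content-disposition")
        if filename:
            # The whole file content ends up as one in-memory string.
            files[name] = (filename, part.get_payload())
        else:
            fields[name] = part.get_payload()
    return fields, files


# A made-up request body: one text field and one 1000-character "file".
sample_body = (
    "--XYZ\r\n"
    'Content-Disposition: form-data; name="title"\r\n'
    "\r\n"
    "hello\r\n"
    "--XYZ\r\n"
    'Content-Disposition: form-data; name="file"; filename="test.txt"\r\n'
    "Content-Type: text/plain\r\n"
    "\r\n"
    + "x" * 1000 + "\r\n"
    "--XYZ--\r\n"
)

fields, files = parse_multipart_in_memory(
    "multipart/form-data; boundary=XYZ", sample_body
)
```

With this approach the upload exists at least twice in memory at once (the raw message and the parsed payload), which is why memory usage grows with the size of the file.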
I looked around online and found that many people have asked similar questions. Some say that in a production environment the size of client POST requests can be limited on the front-end web server (Apache, for example). Others have proposed improving Django's upload mechanism so that large file uploads are handled through temporary files. I think that approach is more reliable. There is a ticket for it, #2070, on Django's Trac, and work on the problem has been under way since last year. The most recent update I saw is dated October 27, so a fix seems to be coming soon. For more information, see:
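The temporary-file idea can be sketched roughly as follows. This is not the patch from the ticket, and it does not parse multipart data. It only shows, under assumed names (spool_to_tempfile, CHUNK_SIZE, max_bytes), how a request body could be copied to disk in fixed-size chunks, with an optional size cap in the application as a counterpart to the front-end limit.

```python
import io
import tempfile

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen for illustration


def spool_to_tempfile(stream, max_bytes=None, chunk_size=CHUNK_SIZE):
    """Copy a file-like request body to a temporary file chunk by chunk.

    Only one chunk is held in memory at a time. Raises ValueError once
    more than max_bytes have been read (when max_bytes is given).
    Returns the temporary file (rewound to the start) and the byte count.
    """
    tmp = tempfile.TemporaryFile()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if max_bytes is not None and total > max_bytes:
            tmp.close()
            raise ValueError("request body exceeds %d bytes" % max_bytes)
        tmp.write(chunk)
    tmp.seek(0)
    return tmp, total


# Usage with an in-memory stream standing in for the request input.
data = b"a" * 200000
tmp, total = spool_to_tempfile(io.BytesIO(data), chunk_size=4096)
```

In a real handler the stream would be the request input object supplied by the server, and the multipart boundaries would still have to be parsed while streaming, which is the harder part of the problem.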
Ticket #2070
Also, this document
Streaming file uploads with Django