In Python 2, we typically use the tools provided by urllib2 to make HTTP requests, such as POSTing data to a server. Usually all of the data is URL-encoded and Content-Type is set to application/x-www-form-urlencoded. However, in some special cases (for example, when the server does not accept that type of submission) or when uploading files, the POST must be submitted in multipart/form-data format.
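For comparison, here is a minimal sketch of the default URL-encoded form body. It is written for Python 3, where `urlencode` lives in `urllib.parse`; the field names and values are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical form fields, used only to show the encoding.
fields = [("title", "hello world"), ("lang", "python")]

# urlencode joins name=value pairs with "&" and escapes special characters;
# the result is what gets sent with application/x-www-form-urlencoded.
body = urlencode(fields).encode("ascii")
print(body)
```

This flat key=value format has no natural place for a filename or a per-part content type, which is one reason file uploads use multipart/form-data instead.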
When multipart/form-data is required, we can build the request body by hand, as the following code does:
    import mimetools
    import mimetypes


    def encode_multipart_formdata(fields, files):
        """
        fields is a sequence of (name, value) elements for regular form fields.
        files is a sequence of (name, filename, value) elements for data to be uploaded as files
        Return (content_type, body) ready for httplib.HTTP instance
        """
        BOUNDARY = mimetools.choose_boundary()
        CRLF = '\r\n'
        L = []
        # Regular form fields: boundary line, header, blank line, value.
        for (key, value) in fields:
            L.append('--' + BOUNDARY)
            L.append('Content-Disposition: form-data; name="%s"' % key)
            L.append('')
            L.append(value)
        # File fields additionally carry a filename and a Content-Type header.
        for (key, filename, value) in files:
            L.append('--' + BOUNDARY)
            L.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
            L.append('Content-Type: %s' % get_content_type(filename))
            L.append('')
            L.append(value)
        # The closing delimiter is the boundary followed by two more dashes.
        L.append('--' + BOUNDARY + '--')
        L.append('')
        body = CRLF.join(L)
        content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
        return content_type, body


    def get_content_type(filename):
        # Guess the MIME type from the file extension, with a generic fallback.
        return mimetypes.guess_type(filename)[0] or 'application/octet-stream'
The encode_multipart_formdata() function does the main work here: it encapsulates all of the POST data and returns a tuple of the Content-Type header value and the POST entity (body).
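To make the returned tuple concrete, here is a rough Python 3 sketch of the same encoding. It is an adaptation, not the original code: `mimetools` does not exist in Python 3, so a random hex string from `uuid` stands in for `mimetools.choose_boundary()`, the body is built as bytes, and the name `encode_multipart` and its `boundary` parameter are hypothetical.

```python
import mimetypes
import uuid


def encode_multipart(fields, files, boundary=None):
    """Build a multipart/form-data body as bytes (Python 3 sketch).

    fields: iterable of (name, value) string pairs.
    files:  iterable of (name, filename, content) with content as bytes.
    """
    # Assumption: a random hex string replaces mimetools.choose_boundary().
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields:
        parts += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    for name, filename, content in files:
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        parts += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8"),
            ("Content-Type: %s" % ctype).encode("ascii"),
            b"",
            content,
        ]
    # Closing delimiter, then an empty item so the body ends with CRLF.
    parts += [delimiter + b"--", b""]
    body = b"\r\n".join(parts)
    return "multipart/form-data; boundary=%s" % boundary, body


content_type, body = encode_multipart(
    [("title", "hello")],
    [("upload", "notes.txt", b"file contents")],
    boundary="XYZ",  # fixed only so the printed layout is predictable
)
print(content_type)
print(body.decode("utf-8"))
```

In the printed body, every part starts with `--XYZ`, followed by its headers, a blank line, and the value; the body ends with the closing delimiter `--XYZ--`.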
With the encoding function in place, we can then use httplib.HTTPConnection to complete the request:
    import httplib


    def post_data(host, path, fields, files, port=None):
        content_type, body = encode_multipart_formdata(fields, files)
        # port=None lets httplib use a port embedded in host, or the default 80.
        client = httplib.HTTPConnection(host, port)
        headers = {'content-type': content_type}
        client.request('POST', path, body, headers)
        response = client.getresponse()
        return response.read()
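Since the introduction mentions urllib2, here is a hedged sketch of attaching a multipart body to a request object in Python 3, where that functionality lives in `urllib.request`. The URL is a placeholder, the boundary and body are hard-coded sample values, and the request is only constructed, not sent.

```python
import urllib.request

# Sample values for illustration; normally they come from the encoding function.
content_type = "multipart/form-data; boundary=XYZ"
body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--XYZ--\r\n"
)

# http://example.com/upload is a placeholder, not a real upload endpoint.
req = urllib.request.Request(
    "http://example.com/upload",
    data=body,
    headers={"Content-Type": content_type},
)

# Supplying data makes the request a POST.
print(req.get_method())
# Request stores header names capitalized, hence "Content-type" here.
print(req.get_header("Content-type"))

# To actually send it:
# response = urllib.request.urlopen(req)
# print(response.read())
```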
Python crawler: POST entity encapsulation and submission in multipart/form-data format