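0. References
[Notes] Encoding and decoding of URL addresses in HTTP (GET or POST) requests
How does urlopen in Python 3 handle URLs that contain Chinese characters?
The encoding problem of Chinese URLs
1. RFC 1738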
2.1. The main parts of URLs

A full BNF description of the URL syntax is given in Section 5.

In general, URLs are written as follows:

    <scheme>:<scheme-specific-part>

A URL contains the name of the scheme being used (<scheme>) followed by a colon and then a string (the <scheme-specific-part>) whose interpretation depends on the scheme.

Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http").
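The case-insensitivity rule for scheme names quoted above can be observed with Python 3's standard `urllib.parse.urlsplit`, which lower-cases the scheme while parsing. This is only a small sketch; the host `www.example.com` is a placeholder chosen for illustration and does not come from the RFC text.

```python
from urllib.parse import urlsplit

# Two spellings of the same URL that differ only in the case of the scheme.
upper = urlsplit('HTTP://www.example.com/index.html')
lower = urlsplit('http://www.example.com/index.html')

# urlsplit normalises the scheme to lower case, so both compare equal.
print(upper.scheme)
print(upper.scheme == lower.scheme)
```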
Note that scheme names are case-insensitive.
2. Python 2
2.1
>>> import urllib
>>> url = 'http://web page.com'
>>> url_en = urllib.quote(url)            # a space is encoded as "%20"
>>> url_plus = urllib.quote_plus(url)     # a space is encoded as "+"
>>> url_en_twice = urllib.quote(url_en)   # quoting an already-quoted string
>>> url
'http://web page.com'
>>> url_en
'http%3A//web%20page.com'
>>> url_plus
'http%3A%2F%2Fweb+page.com'
>>> url_en_twice                          # "%25" in the result indicates double encoding
'http%253A//web%2520page.com'
>>> # the corresponding decoding
>>> urllib.unquote(url_en)
'http://web page.com'
>>> urllib.unquote_plus(url_plus)
'http://web page.com'
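Double encoding is easy to introduce by accident when a string passes through more than one layer that quotes it. The sketch below uses Python 3's `urllib.parse` (not the Python 2 `urllib` of the session above) and a hypothetical helper, `quote_once`, that decodes before encoding so that an already-quoted string is not quoted again. It assumes the input never contains a literal "%" followed by two hex digits that is meant to be kept as-is; such input would be altered.

```python
from urllib.parse import quote, unquote

def quote_once(text):
    """Hypothetical helper: undo any existing percent-encoding, then encode once."""
    return quote(unquote(text))

raw = 'http://web page.com'
once = quote(raw)      # 'http%3A//web%20page.com'
twice = quote(once)    # every '%' becomes '%25'

print(twice)
print(quote_once(once) == once)  # an already-encoded string is left unchanged
print(quote_once(raw) == once)   # a raw string is encoded exactly once
```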
2.2 URLs containing Chinese characters
>>> import urllib
>>> url_zh = u'http://movie.douban.com/tag/美国'
>>> url_zh_en = urllib.quote(url_zh.encode('utf-8'))   # quote() takes a byte string, so encode to UTF-8 first
>>> url_zh_en
'http%3A//movie.douban.com/tag/%E7%BE%8E%E5%9B%BD'
>>> print urllib.unquote(url_zh_en).decode('utf-8')
http://movie.douban.com/tag/美国
3. Python 3
3.1
>>> import urllib.parse
>>> url = 'http://web page.com'
>>> url_en = urllib.parse.quote(url)            # note: in Python 3 it is urllib.parse.quote
>>> url_plus = urllib.parse.quote_plus(url)
>>> url_en
'http%3A//web%20page.com'
>>> url_plus
'http%3A%2F%2Fweb+page.com'
>>> urllib.parse.unquote(url_en)
'http://web page.com'
>>> urllib.parse.unquote_plus(url_plus)
'http://web page.com'
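The difference between the two results above (`//` kept by `quote`, but turned into `%2F%2F` by `quote_plus`) comes from the `safe` parameter: `quote` defaults to `safe='/'`, while `quote_plus` defaults to `safe=''`. The following sketch shows how changing `safe` changes the result for the same example URL. The choice of `safe=':/'` is only an illustration, not a general recommendation.

```python
from urllib.parse import quote, quote_plus

url = 'http://web page.com'

default_quote = quote(url)            # safe='/' by default, so "/" is kept
no_safe = quote(url, safe='')         # nothing is exempt, so "/" is escaped too
keep_scheme = quote(url, safe=':/')   # additionally keep ":" so "http://" survives
plus_form = quote_plus(url)           # safe='' by default, and space becomes "+"

print(default_quote)
print(no_safe)
print(keep_scheme)
print(plus_form)
```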
3.2 URLs containing Chinese characters
>>> import urllib.parse
>>> url_zh = 'http://movie.douban.com/tag/美国'
>>> url_zh_en = urllib.parse.quote(url_zh)      # str is accepted directly and encoded as UTF-8 by default
>>> url_zh_en
'http%3A//movie.douban.com/tag/%E7%BE%8E%E5%9B%BD'
>>> urllib.parse.unquote(url_zh_en)
'http://movie.douban.com/tag/美国'
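Quoting the whole URL also escapes the colon after the scheme (`http%3A`), so the result is no longer usable as a URL. One possible approach, sketched here under the assumption that only the path contains non-ASCII characters, is to split the URL, quote just the path, and reassemble it. The helper name `quote_url_path` is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def quote_url_path(url):
    """Hypothetical helper: percent-encode only the path component of a URL."""
    parts = urlsplit(url)
    # quote() keeps "/" by default, so the path separators are preserved.
    return urlunsplit(parts._replace(path=quote(parts.path)))

encoded = quote_url_path('http://movie.douban.com/tag/美国')
print(encoded)
```

The scheme and host pass through unchanged, and only the Chinese characters in the path are converted to their UTF-8 percent-encoded form.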
4. Miscellaneous
>>> help(urllib.urlencode)
Help on function urlencode in module urllib:

urlencode(query, doseq=0)
    Encode a sequence of two-element tuples or dictionary into a URL query string.

    If any values in the query arg are sequences and doseq is true, each
    sequence element is converted to a separate parameter.

    If the query arg is a sequence of two-element tuples, the order of the
    parameters in the output will match the order of parameters in the
    input.

>>>
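The effect of `doseq` described in the help text can be seen in a short sketch. It uses the Python 3 equivalent, `urllib.parse.urlencode`, rather than the Python 2 `urllib.urlencode` shown above. The parameter names `tag` and `page` are made up for illustration.

```python
from urllib.parse import urlencode

# A sequence of two-element tuples; the second value is itself a sequence.
params = [('tag', '美国'), ('page', [1, 2])]

# Without doseq the list is converted with str() and quoted as one value.
flat = urlencode(params)

# With doseq=True each element of the list becomes its own parameter.
expanded = urlencode(params, doseq=True)

print(flat)
print(expanded)
```

In both cases the output order follows the order of the input tuples, as the help text states.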
URL encoding and decoding