Copyright notice: This article is the blogger's own study record; please credit the source when reprinting.
urlparse()
# urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
from urllib.parse import urlparse

# urlstring: required; this is the URL to parse
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment')
print(type(result), result)

# scheme: the default protocol; it takes effect only if the URL itself contains no scheme
# Note: without a leading //, www.baidu.com is parsed as part of path and netloc stays empty
result = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')
print(result)

# allow_fragments: when set to False, the fragment is not split off; it is parsed as part of
# path, params, or query, and the fragment field is empty
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment', allow_fragments=False)
print(result)

# The returned result is actually a tuple; its fields can be read by index or by attribute name
# ParseResult attributes: scheme (protocol, before ://), netloc (domain, before the first /),
# path (access path), params (after ;), query (query conditions, after ?), fragment (anchor, after #)
result = urlparse('http://www.baidu.com/index.html;user?id=5#comment', allow_fragments=False)
print(result.scheme, result[0], result.netloc, result[1], sep='\n')
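The effect of the scheme and allow_fragments arguments is easier to see with the parsed fields written out. The following is a minimal sketch using the same example URLs; the variant with a leading // is an added illustration, not part of the original examples.

```python
from urllib.parse import urlparse

# No scheme and no leading '//': the host name is treated as part of the path.
no_scheme = urlparse('www.baidu.com/index.html;user?id=5#comment', scheme='https')
print(no_scheme.scheme)  # the default scheme is applied
print(no_scheme.netloc)  # empty string
print(no_scheme.path)    # 'www.baidu.com/index.html'

# Added illustration: a leading '//' marks what follows as the network location.
with_slashes = urlparse('//www.baidu.com/index.html;user?id=5#comment', scheme='https')
print(with_slashes.netloc, with_slashes.path)

# allow_fragments=False: '#comment' stays attached to the query.
no_frag = urlparse('http://www.baidu.com/index.html;user?id=5#comment',
                   allow_fragments=False)
print(repr(no_frag.query), repr(no_frag.fragment))
```

If the host should end up in netloc, the input needs either a scheme or a leading //.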
urlunparse()
# urlunparse() takes an iterable of exactly 6 components; with any other length it raises
# an error about too few or too many values
from urllib.parse import urlunparse

data = ['http', 'www.baidu.com', 'index.html', 'user', 'a=6', 'comment']
print(urlunparse(data))
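Because the result of urlparse() is itself a 6-item tuple, it can be passed straight back to urlunparse(). The sketch below shows such a round trip, a modified copy made with the named tuple's _replace() method, and the error raised for a sequence of the wrong length; the replacement query id=6 is an arbitrary illustrative value.

```python
from urllib.parse import urlparse, urlunparse

url = 'http://www.baidu.com/index.html;user?id=5#comment'
parts = urlparse(url)

# Round trip: parse, then rebuild from the same 6 components.
rebuilt = urlunparse(parts)
print(rebuilt == url)

# _replace() returns a copy of the named tuple with one field changed.
changed = urlunparse(parts._replace(query='id=6'))
print(changed)

# A sequence that does not have exactly 6 items is rejected.
error = None
try:
    urlunparse(['http', 'www.baidu.com', 'index.html'])
except ValueError as exc:
    error = exc
print(type(error).__name__, error)
```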
urlsplit()
# urlsplit() is similar to urlparse(), but it does not parse params separately; they stay in path
from urllib.parse import urlsplit

result = urlsplit('http://www.baidu.com/index.html;user?id=5#comment')
print(result)

# The returned result is also a tuple; its fields can be read by index or by attribute name
# SplitResult attributes: scheme (protocol, before ://), netloc (domain, before the first /),
# path (access path), query (query conditions, after ?), fragment (anchor, after #)
result = urlsplit('http://www.baidu.com/index.html;user?id=5#comment', allow_fragments=False)
print(result.scheme, result[0])
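The difference between the two functions shows up most clearly when both parse the same URL. This is a small sketch of that comparison, using the example URL from above.

```python
from urllib.parse import urlparse, urlsplit

url = 'http://www.baidu.com/index.html;user?id=5#comment'

parsed = urlparse(url)  # 6 fields, params split out of the path
split = urlsplit(url)   # 5 fields, params left inside the path

print(len(parsed), parsed.path, parsed.params)
print(len(split), split.path)
```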
urlunsplit()
# urlunsplit() is similar to urlunparse(), except that it takes 5 components
from urllib.parse import urlunsplit

data = ['http', 'www.baidu.com', 'index.html', 'a=6', 'comment']
print(urlunsplit(data))
urljoin()
# urljoin() is another way of building links. Unlike the previous two methods, it does not
# require a sequence of a fixed length.
# urljoin() takes two arguments: base_url (the base link) first and the new link second.
# It parses the scheme, netloc, and path of base_url, uses these three parts to fill in
# whatever is missing from the new link, and returns the result.
from urllib.parse import urljoin

print(urljoin('http://www.baidu.com', 'https://cuiqingcai.com/FAQ.html'))
print(urljoin('http://www.baidu.com/about.html', 'https://cuiqingcai.com/FAQ.html'))
print(urljoin('http://www.baidu.com', 'faq.html'))
print(urljoin('http://www.baidu.com/about.html', 'https://cuiqingcai.com/FAQ.html?question=2'))
print(urljoin('http://www.baidu.com?wd=abc', 'https://cuiqingcai.com/index.php'))
print(urljoin('http://www.baidu.com', '?category=2#comment'))
print(urljoin('www.baidu.com', '?category=2#comment'))
print(urljoin('www.baidu.com#comment', '?category=2'))
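The filling-in rule is easier to follow with the expected results next to each call. The sketch below pairs some of the inputs above with what urljoin() returns for them; the relative FAQ.html joined onto about.html is an added combination of the same example names.

```python
from urllib.parse import urljoin

cases = [
    # (base_url, new link, result)
    # The new link is complete, so nothing is taken from the base.
    ('http://www.baidu.com', 'https://cuiqingcai.com/FAQ.html',
     'https://cuiqingcai.com/FAQ.html'),
    # Added combination: a relative file name replaces the last path segment of the base.
    ('http://www.baidu.com/about.html', 'FAQ.html',
     'http://www.baidu.com/FAQ.html'),
    # Only query and fragment are given: scheme, netloc, and path come from the base.
    ('http://www.baidu.com', '?category=2#comment',
     'http://www.baidu.com?category=2#comment'),
    # The fragment of the base is never carried over.
    ('www.baidu.com#comment', '?category=2',
     'www.baidu.com?category=2'),
]

results = [urljoin(base, link) for base, link, _ in cases]
for (base, link, expected), actual in zip(cases, results):
    print(f'{base!r} + {link!r} -> {actual!r}', actual == expected)
```

The params, query, and fragment of base_url play no part once the new link supplies its own query.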
urlencode()
# urlencode() serializes a dictionary into a request query string
from urllib.parse import urlencode

params = {
    'name': 'germey',
    'age': 22
}
base_url = 'http://www.baidu.com?'
url = base_url + urlencode(params)
print(url)
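Beyond joining keys and values, urlencode() also escapes characters that would otherwise break the query string. The following sketch shows this, along with the doseq argument for list values; the keys wd and tag and their values are made-up illustrations.

```python
from urllib.parse import urlencode

# Plain values are joined as key=value pairs separated by '&'.
simple = urlencode({'name': 'germey', 'age': 22})
print(simple)

# Hypothetical value containing a space and '&': both are escaped.
escaped = urlencode({'wd': 'a b&c'})
print(escaped)

# Hypothetical list value: doseq=True emits one pair per list item.
repeated = urlencode({'tag': ['x', 'y']}, doseq=True)
print(repeated)
```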
parse_qs()
# parse_qs() deserializes: it converts request parameters back into a dictionary
from urllib.parse import parse_qs

query = 'name=germey&age=22'
params = parse_qs(query)
print(params)
parse_qsl()
# parse_qsl() converts the parameters into a list of tuples
from urllib.parse import parse_qsl

query = 'name=germey&age=22'
params = parse_qsl(query)
print(params)
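The two parsing functions differ only in the shape of what they return, and both results can be turned back into a query string with urlencode(). This is a minimal sketch of that round trip, using the same query string as above.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

query = 'name=germey&age=22'

as_dict = parse_qs(query)    # every value is a list of strings
as_pairs = parse_qsl(query)  # list of (key, value) tuples, order preserved
print(as_dict)
print(as_pairs)

# Both shapes serialize back to the original query string.
# The dictionary form holds lists, so it needs doseq=True.
from_pairs = urlencode(as_pairs)
from_dict = urlencode(as_dict, doseq=True)
print(from_pairs == query, from_dict == query)
```

Note that all parsed values are strings; the 22 is not converted back to an integer.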
quote()
# quote() converts content into URL-encoded format. A URL that carries Chinese parameters
# can end up garbled; encoding the parameter with quote() avoids the problem.
from urllib.parse import quote

keyword = '壁纸'
url = 'http://www.baidu.com/s?wd=' + quote(keyword)
print(url)
unquote()
# unquote() is the counterpart of the method above; it decodes a URL-encoded string
from urllib.parse import unquote

url = 'http://www.baidu.com/s?wd=%E5%A3%81%E7%BA%B8'
print(unquote(url))
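quote() and unquote() are inverses of each other, which a short round trip makes concrete. The sketch below also shows the safe argument of quote(), which by default leaves / unescaped; the string 'a b/c' is a made-up illustration.

```python
from urllib.parse import quote, unquote

keyword = '壁纸'
encoded = quote(keyword)    # UTF-8 bytes, percent-encoded
decoded = unquote(encoded)  # back to the original text
print(encoded, decoded)

# Hypothetical value: '/' is kept by default because safe='/'.
default_safe = quote('a b/c')
# With safe='' the slash is escaped as well.
nothing_safe = quote('a b/c', safe='')
print(default_safe, nothing_safe)
```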
Python urllib library: common link-parsing methods of the parse module