This article presents example code for crawling Baidu Cloud links with Python's urllib; interested readers may find it a useful reference.
Looking back at a program I wrote earlier, a crawler that grabs lots of Baidu Cloud resources (originally written so I could find Transformers), I see it took me about two days to finish when I had only just started learning Python. Reading the code now, it is really crude. I'm not much better today, ha, but I keep studying, so I won't over-explain and will just post the code. I've even forgotten what some of the variable declarations were for (proud of that), I didn't know how to write files at the time, and I didn't know a class could be initialized with __init__. Learning Python has taught me a lot; thanks, Python.
```python
from bs4 import BeautifulSoup
import urllib
import requests
import re

adr = []

'''url-encode the name of the search resource'''
search_text = raw_input('Please enter the search resource name: ')
search_text = search_text.decode('GBK')
search_text = search_text.encode('utf-8')
search_text = urllib.quote(search_text)

'''get the file address'''
home = urllib.urlopen('http://www.panduoduo.net/s/name/' + search_text)

'''get the Baidu Cloud address'''
def getbaidu(adr):
    for i in adr:
        url = urllib.urlopen('http://www.panduoduo.net' + i)
        bs = BeautifulSoup(url)
        bs1 = bs.select('.dbutton2')
        href = re.compile('http\%(\%|\d|\w|\/\/|\/|\.)*')
        b = href.search(str(bs1))
        name = str(bs.select('.center')).decode('utf-8')
        text1 = re.compile('\
```

(The listing is cut off at this point in the original article.)
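The listing above is Python 2 (`raw_input`, `urllib.quote`, `urllib.urlopen`). As a hedged sketch only, the two core steps, URL-encoding the search term and extracting a percent-encoded Baidu Cloud link with a regex like the one above, might look like this in Python 3. The HTML fragment, the search term, and the variable names here are illustrative assumptions, not taken from the article, and `www.panduoduo.net` may no longer serve these pages, so the sketch runs on a static string instead of a live request:

```python
import re
from urllib.parse import quote

# URL-encode a search term, as the Python 2 code did with
# search_text.encode('utf-8') followed by urllib.quote(...).
search_text = "transformers"
encoded = quote(search_text.encode("utf-8"))
search_url = "http://www.panduoduo.net/s/name/" + encoded

# Extract a percent-encoded link from a page fragment, mirroring the
# article's pattern http\%(\%|\d|\w|\/\/|\/|\.)* -- applied here to a
# hypothetical snippet of result-page HTML rather than a fetched page.
html = '<a class="dbutton2" href="/r/jump?url=http%3A%2F%2Fpan.baidu.com%2Fs%2F1abcDEF">down</a>'
link_re = re.compile(r'http%(?:%|\d|\w|/|\.)*')
match = link_re.search(html)
if match:
    # match.group(0) is the percent-encoded Baidu Cloud URL
    print(match.group(0))
```

Decoding the captured group with `urllib.parse.unquote` would then recover the plain `http://pan.baidu.com/...` address.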