python3.4 Crawler Bulk Download music

Source: Internet
Author: User

Recently learning Python, using the version of python3.4, the development environment for Eclipse using the Pydev plugin. Just think http://www.dexiazai.com/?page_id=23 on the music good, decided to use Python bulk download down.

1. Music Address

After analysis, the address of the page embedded in the shrimp player is as follows, followed by a comma-delimited character for the music ID, such as the music address is http://www.xiami.com/song/2088578

<span style= "FONT-SIZE:14PX;" ><span style= "FONT-SIZE:14PX;" > <embed src= "http://www.xiami.com/widget/0_    2088578,163414,603408,1769697211,1519594,1944911,1769482546,1771498226,2050705,1769213208,2957497, 1769779902,1769897948,1770077250,1771638197,109133,1769220090,3469026,2456779,3673869,385167,1769528393,1770130506,101458    2,3418745,1769554460, 1769692638,1279925,1769582513,136064,1769528375,1769455237,1769075782,2095209,1770618381,3427512,2108249,1771186364,20875    41,1769384565,1770432131, 2149137,2083819,1768911382,3429194,2089207,1770177060,1770427913,1769279049,2089339,2085205,3437055,3646041,2070983,20707 41,3619123,1770068122, 2082956,2071004,1768679,1769683697,3567557,109133,1769572701,2152946,3489617,1770292731_ 235_346_ff8719_494949_1/multiplayer.swf "type=" Application/x-shockwave-flash " width=" 235 "height=" "wmode" = "opaque" ></EMBED></SPAN></SPAN> 

After analysis, the XML information of music can be queried in http://www.xiami.com/song/playlist/id/2088578/object_name/default/object_id/0, Where location is an encrypted source address that can be decrypted to get the correct address. The specific operation can be referred to the "Python Crawl Shrimp Music" this blog.


2. Get the ID of all music, form the list

<span style= "FONT-SIZE:14PX;" ><span style= "FONT-SIZE:14PX;" >    dexiazai_url= "http://www.dexiazai.com/?page_id=23"    req=urllib2. Request (Dexiazai_url, headers={    ' Connection ': ' keep-alive ',    ' Accept ': ' text/html, application/xhtml+xml, * * * ',    ' accept-language ': ' en-us,en;q=0.8,zh-hans-cn;q=0.5,zh-hans;q=0.3 ',    ' user-agent ': ' mozilla/5.0 ( Windows NT 6.3; WOW64; trident/7.0; rv:11.0) like Gecko '    })    Response=urllib2.urlopen (req)    content=response.read (). Decode (' Utf-8 ')    Pattern=re.compile (' <embed.*?src= ' http://www.xiami.com/widget/0_ (. *?) /multiplayer.swf "', Re. S)    Ids=re.search (pattern,content). Group (1)    idarr=ids.split (",")      </span></span>

3. Get the music name (plus serial number)

<span style= "FONT-SIZE:14PX;" ><span style= "FONT-SIZE:14PX;" > url= "http://www.xiami.com/song/" +str (Idarr[i]) print ("==================num:" +str (i) + "================ ======= ") print (URL) #获取歌词名 req=urllib2.        Request (URL, headers={' Connection ': ' keep-alive ', ' Accept ': ' text/html, Application/xhtml+xml, */* ', ' Accept-language ': ' en-us,en;q=0.8,zh-hans-cn;q=0.5,zh-hans;q=0.3 ', ' user-agent ': ' mozilla/5.0 (Windows NT 6.3; WOW64; trident/7.0; rv:11.0) like Gecko '}) Rep=urllib2.urlopen (req) Cont=rep.read (). Decode (' Utf-8 ') Pat=re.compil E (' <div.*?id= ' title > (. *?) 

5. Effect





6, use Cx_freeze packaging release exe

Because python3.4 in Py2exe or Pyinstaller release a bit of a problem (not supported), so with Cx_freeze release, Cx_freeze for Http://www.lfd.uci.edu/~gohlke/pythonlibs /#cx_freeze I downloaded the CYTHON?0.22?CP34?NONE?WIN32.WHL, is the python3.4 installation directory using py3.4 install D:\****\cython?0.22?cp34?none? WIN32.WHL installation.

And then in python3.4 's installation directory \lib\site-packages\cx_freeze\samples\ PYQT4 Copy the setup.py and edit the installation file name as the file to be published, and then execute the Python setup.py build command at the command line, generate the build folder with the executable file Xiami_download_ Dexiazai.exe






python3.4 Crawler Bulk Download music

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.