This article describes how to use Python to crawl music URLs from QQ Music and bulk-download the songs along with their related information. It walks through the approach in detail and gives example code, which should be a useful reference for anyone who needs it.
Objective
QQ Music has a large catalog, and sometimes I want to download a good song, but having to log in every time I download from the web page is annoying. So I wrote a QQ Music crawler. In my view, the most important step for a crawler built around a for loop is finding the URL of the element you want to crawl. Let's start looking for it (don't laugh at me).
The implementation is as follows.
#Finding the URL:
These URLs are not as easy to find as on other sites. Finding them wore me out. The main difficulty is the volume of data: you have to pick the useful fields out of many responses and then combine them into the real music URL. When I worked on this yesterday, I sorted out a few intermediate URLs:
#url1: https://c.y.qq.com/soso/fcgi-bin/client_search_cp?&lossless=0&flag_qc=0&p=1&n=20&w=Rain Butterfly
#url2: https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?&jsonpcallback=musicjsoncallback&cid=205361747&songmid=[songmid]&filename=C400[mid].m4a&guid=6612300644
#url3: http://dl.stream.qqmusic.qq.com/[filename]?vkey=[vkey] (where [vkey] is replaced by a string specific to each song)
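To make the relationship between the three URLs concrete, here is a minimal sketch that assembles them from their parts. The helper names (build_url1, build_filename, build_url2, build_url3) are hypothetical and chosen for illustration. The parameter names and values are copied from the URLs above, and the sketch assumes the filename takes the form C400[mid].m4a.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = 'https://c.y.qq.com/soso/fcgi-bin/client_search_cp'
VKEY_URL = 'https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg'
STREAM_URL = 'http://dl.stream.qqmusic.qq.com/'
GUID = '6612300644'  # value copied from the article's URLs


def build_url1(keyword, page=1, count=20):
    """Search URL; the keyword is percent-encoded so spaces are safe."""
    params = {'lossless': 0, 'flag_qc': 0, 'p': page, 'n': count, 'w': keyword}
    return SEARCH_URL + '?' + urlencode(params, quote_via=quote)


def build_filename(mid):
    """File name shared by url2 and url3 (assumed form: C400 + mid + .m4a)."""
    return 'C400' + mid + '.m4a'


def build_url2(songmid, mid):
    """URL that returns the vkey for one song."""
    params = {
        'jsonpcallback': 'musicjsoncallback',
        'cid': 205361747,
        'songmid': songmid,
        'filename': build_filename(mid),
        'guid': GUID,
    }
    return VKEY_URL + '?' + urlencode(params)


def build_url3(mid, vkey):
    """Final stream URL for the music file."""
    return STREAM_URL + build_filename(mid) + '?' + urlencode({'vkey': vkey})


print(build_url1('Rain Butterfly'))
```

Building the query string with urlencode rather than string concatenation avoids stray spaces and unescaped characters in the keyword.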
requests.get(url1)
The search results give the songmid and mid of each song (from my observation, both values are specific to each song). With these two values, the complete url2 can be filled in.
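The extraction step can be sketched as follows. This is an illustration only: strip_jsonp and extract_songs are hypothetical helper names, and the sample string is made up to match the shape the article's code reads (data, song, list, with media_mid, songmid, songname and singer fields). It is not a real response.

```python
import json


def strip_jsonp(text):
    """Return the JSON object inside a JSONP wrapper such as callback({...})."""
    start = text.find('{')
    end = text.rfind('}')
    if start == -1 or end == -1:
        raise ValueError('no JSON object found in response')
    return text[start:end + 1]


def extract_songs(text):
    """Collect (media_mid, songmid, songname, singer) for each search hit."""
    songs = []
    for item in json.loads(strip_jsonp(text))['data']['song']['list']:
        try:
            songs.append((item['media_mid'], item['songmid'],
                          item['songname'], item['singer'][0]['name']))
        except (KeyError, IndexError):
            continue  # skip incomplete entries so the fields stay aligned
    return songs


# Made-up sample in the assumed response shape; the second entry is incomplete.
sample = ('callback({"data": {"song": {"list": ['
          '{"media_mid": "m1", "songmid": "s1", "songname": "Song", '
          '"singer": [{"name": "Singer"}]}, '
          '{"songmid": "s2"}]}}})')
print(extract_songs(sample))
```

Keeping each song's fields together in one tuple means a malformed entry is dropped as a whole instead of leaving the separate lists out of step.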
requests.get(url2)
This returns the vkey for each song in the search results. From my observation, filename is C400 followed by the song's mid (the media_mid field of the search result) and .m4a. That determines every part of url3, which is the real URL of the music file. I have not studied the other parameters of the search URL thoroughly, so each run returns at most 20 music URLs. With these URLs, Tencent's music is yours to enjoy.
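Reading the vkey out of the url2 response might look like the sketch below. The function name extract_vkey is hypothetical, the sample bodies are made up in the shape the article's code reads (data, items, first element, vkey), and treating a missing or empty vkey as a failure is an assumption rather than documented behavior.

```python
import json


def extract_vkey(text):
    """Return the vkey from a url2 response body, or None if it is absent.

    Accepts plain JSON as well as a JSONP-wrapped body.
    """
    try:
        payload = json.loads(text[text.find('{'):text.rfind('}') + 1])
        vkey = payload['data']['items'][0]['vkey']
    except (ValueError, KeyError, IndexError, TypeError):
        return None
    return vkey or None  # assumption: an empty vkey cannot be used


# Made-up sample bodies, not real responses.
print(extract_vkey('{"data": {"items": [{"vkey": "ABC123"}]}}'))
print(extract_vkey('musicjsoncallback({"data": {"items": [{"vkey": ""}]}})'))
```

Returning None instead of raising lets the caller skip a song that has no usable vkey and continue with the rest of the list.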
#Code
Here is the code that builds the srcs list:

    import json
    import urllib.request  # used by the download step

    import requests

    word = 'rain butterfly'  # search keyword
    res1 = requests.get('https://c.y.qq.com/soso/fcgi-bin/client_search_cp?&t=0&aggr=1&cr=1&catzhida=1&lossless=0&flag_qc=0&p=1&n=20&w=' + word)
    # The response is wrapped as callback(...); strip the wrapper before parsing
    jm1 = json.loads(res1.text.strip('callback()[]'))
    jm1 = jm1['data']['song']['list']
    mids = []
    songmids = []
    srcs = []
    songnames = []
    singers = []
    for j in jm1:
        try:
            # Read every field first so the lists stay aligned if one is missing
            mid, songmid = j['media_mid'], j['songmid']
            songname, singer = j['songname'], j['singer'][0]['name']
        except (KeyError, IndexError):
            print('wrong')
            continue
        mids.append(mid)
        songmids.append(songmid)
        songnames.append(songname)
        singers.append(singer)
    for n in range(0, len(mids)):
        # The filename here must match the one used in the final URL (C400 + mid + .m4a)
        res2 = requests.get('https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?&jsonpcallback=musicjsoncallback&cid=205361747&songmid=' + songmids[n] + '&filename=C400' + mids[n] + '.m4a&guid=6612300644')
        jm2 = json.loads(res2.text)
        vkey = jm2['data']['items'][0]['vkey']
        srcs.append('http://dl.stream.qqmusic.qq.com/C400' + mids[n] + '.m4a?vkey=' + vkey + '&guid=6612300644&uin=0&fromtag=66')
#Download:
With srcs in hand, downloading is no problem. Since the singer and song name are also available, you could simply paste a src into the browser to download it. You can also bulk-download with Python, which is nothing more than a loop, similar to the method we used earlier to download Sogou images (my Python version: Python 3.3.3):

    print('For ' + word + ' Start download...')
    x = len(srcs)
    for m in range(0, x):
        print(str(m) + ' ***** ' + songnames[m] + ' - ' + singers[m] + '.m4a ***** Downloading...')
        try:
            urllib.request.urlretrieve(srcs[m], 'd:/music/' + songnames[m] + ' - ' + singers[m] + '.m4a')
        except Exception:
            # Count only the files that were actually saved
            x = x - 1
            print('Download wrong~')
    print('For [' + word + '] Download complete ' + str(x) + ' files!')
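One reason a download in the loop above can fail has nothing to do with the network: a song or singer name may contain characters that Windows does not allow in file names, such as / or ?. The sketch below shows one way to guard against that. The helper name safe_filename is hypothetical, and replacing the forbidden characters with an underscore is an arbitrary choice.

```python
import re


def safe_filename(songname, singer, ext='.m4a'):
    """Build 'songname - singer.m4a' with Windows-forbidden characters replaced."""
    name = songname + ' - ' + singer
    # \ / : * ? " < > | are not allowed in Windows file names
    name = re.sub(r'[\\/:*?"<>|]', '_', name).strip()
    return name + ext


print(safe_filename('AC/DC: Live?', 'Band'))
```

In the download loop, the target path would then be 'd:/music/' + safe_filename(songnames[m], singers[m]).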
Put the two pieces of code above in the same .py file and run it to download the music matching the keyword.
#Result:
The download starts. Afterwards, check the download directory:
The music has been downloaded successfully.
This concludes the description of the idea and implementation of the QQ Music URL crawler.
#Uses:
If you are building the shell of a music player, this should be useful to you. I actually wrote it to serve my own HTML-based music player, but I am currently stuck on calling the Python script from JS. I am still looking into it. If anyone knows how, please tell me. Thank you very much!