QQ Music still has a lot of music, and sometimes I want to download a good song, but downloading from the web page is annoying every time: it asks you to log in and so on. So here comes a QQ Music crawler.
At least in my view, the most important thing for a for-loop crawler is finding the URL of the element you want to crawl. Let's start looking for it (don't laugh at me).
#Finding the URL:
This site's URLs are not as easy to find as those of other sites. Finding them wore me out. The main problem is the amount of data: you have to pick the useful pieces out of all of it and finally combine them into the real URL of the music. When I did this yesterday, I sorted out a few intermediate URLs:
#url1: https://c.y.qq.com/soso/fcgi-bin/client_search_cp?&lossless=0&flag_qc=0&p=1&n=20&w=
#url2: https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?&jsonpCallback=MusicJsonCallback&cid=205361747&songmid=[songmid]&filename=C400[mid].m4a&guid=6612300644
#url3: http://dl.stream.qqmusic.qq.com/[filename]?vkey=[vkey]&guid=6612300644&uin=0&fromtag=66
Requesting url1 returns the search list, which gives the songmid and mid of each song (as far as I can tell, both values are unique to each song). With these two values, the exact value of the complete url2 can be filled in.
requests(url2)
This gives the vkey of each song in the search results. As far as I can tell, the filename is C400[mid].m4a (that is how the code builds it), which fixes the exact value of url3. url3 is the real URL of the music. I have not studied the other parameters of these URLs thoroughly enough, so each run returns at most 20 music URLs. With those URLs, Tencent's music is yours to enjoy.
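The way the three URLs fit together can be sketched as a few small helper functions. This is only an illustration under assumptions: the function names (`build_search_url`, `build_vkey_url`, `build_stream_url`) are made up for this sketch, the parameter values are copied from the URLs above, and whether the endpoints still accept them is not something this sketch checks.

```python
from urllib.parse import urlencode

GUID = "6612300644"  # guid value copied from the URLs listed above


def build_search_url(word, page=1, per_page=20):
    """url1: search request for a keyword (hypothetical helper name)."""
    query = urlencode({"lossless": 0, "flag_qc": 0,
                       "p": page, "n": per_page, "w": word})
    return "https://c.y.qq.com/soso/fcgi-bin/client_search_cp?" + query


def build_vkey_url(songmid, mid):
    """url2: request that returns the vkey for one song."""
    query = urlencode({
        "jsonpCallback": "MusicJsonCallback",
        "cid": "205361747",
        "songmid": songmid,
        "filename": "C400" + mid + ".m4a",
        "guid": GUID,
    })
    return ("https://c.y.qq.com/base/fcgi-bin/"
            "fcg_music_express_mobile3.fcg?" + query)


def build_stream_url(mid, vkey):
    """url3: the address the music file is downloaded from."""
    query = urlencode({"vkey": vkey, "guid": GUID, "uin": 0, "fromtag": 66})
    return "http://dl.stream.qqmusic.qq.com/C400" + mid + ".m4a?" + query


# Placeholder values, only to show the shape of the result.
print(build_stream_url("MID", "VKEY"))
```

Using `urlencode` also percent-encodes the search word, which matters when the keyword is not plain ASCII.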
Code:
import json
import urllib.request

import requests

word = 'lei'  # search keyword
# url1: search for the keyword (at most 20 results per request)
res1 = requests.get('https://c.y.qq.com/soso/fcgi-bin/client_search_cp?&t=0&aggr=1&cr=1&catZhida=1&lossless=0&flag_qc=0&p=1&n=20&w=' + word)
# The response is wrapped as callback(...); strip the wrapper characters from both ends
jm1 = json.loads(res1.text.strip('callback()[]'))
jm1 = jm1['data']['song']['list']
mids = []
songmids = []
srcs = []
songnames = []
singers = []
for j in jm1:
    try:
        # Read all four fields before appending so the lists stay aligned
        mid, songmid = j['media_mid'], j['songmid']
        songname, singer = j['songname'], j['singer'][0]['name']
    except Exception:
        print('wrong')
        continue
    mids.append(mid)
    songmids.append(songmid)
    songnames.append(songname)
    singers.append(singer)
for n in range(0, len(mids)):
    # url2: request the vkey for this song
    res2 = requests.get('https://c.y.qq.com/base/fcgi-bin/fcg_music_express_mobile3.fcg?&jsonpCallback=MusicJsonCallback&cid=205361747&songmid=' + songmids[n] + '&filename=C400' + mids[n] + '.m4a&guid=6612300644')
    jm2 = json.loads(res2.text)
    vkey = jm2['data']['items'][0]['vkey']
    # url3: the real URL of the music
    srcs.append('http://dl.stream.qqmusic.qq.com/C400' + mids[n] + '.m4a?vkey=' + vkey + '&guid=6612300644&uin=0&fromtag=66')
print('For ' + word + ' Start download...')
x = len(srcs)
for m in range(0, x):
    print(str(m) + ' ***** ' + songnames[m] + ' - ' + singers[m] + '.m4a *****')
    try:
        # The d:/music/ folder must already exist; the file is saved with an
        # .mp3 extension although the stream itself is .m4a
        urllib.request.urlretrieve(srcs[m], 'd:/music/' + songnames[m] + ' - ' + singers[m] + '.mp3')
    except Exception:
        x = x - 1
        print('Download wrong~')
print('For [' + word + '] Download complete ' + str(x) + ' files!')
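One detail in the script is worth a closer look: the search response arrives wrapped as `callback(...)`, and the script removes the wrapper with `str.strip`, which deletes any of the listed characters from both ends of the string. That works here because the JSON body starts with `{` and ends with `}`. A slightly more explicit alternative, shown only as a sketch, is a small regex-based helper. The name `unwrap_jsonp` is made up for this example and is not part of `requests` or `json`.

```python
import json
import re


def unwrap_jsonp(text):
    """Parse 'name({...})' or 'name({...});'; fall back to plain JSON."""
    match = re.match(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


# A made-up sample in the same shape as the search response.
sample = 'callback({"data": {"song": {"list": []}}})'
print(unwrap_jsonp(sample)["data"]["song"]["list"])
```

In the script this would replace the `json.loads(res1.text.strip(...))` line with `unwrap_jsonp(res1.text)`. The helper also accepts a response that is already plain JSON.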
Python crawls qqmusic music URLs and downloads in bulk