This time this really is dry goods oh, last night made a half night,,,, from 8 points after eating dinner began to write, has been nearly 12 o'clock to finish,,, novice, can't afford to hurt Ah ....
First of all, Baidu provides a music search API, you want Baidu request similar to
http://box.zhangmen.baidu.com/x?op=12&count=1&title= best loss friend $$ Eason Chan $ $
Address, Baidu will return you a section of XML, as shown below
This XML file does not appear to have any style information associated with it.
The document tree is shown below. <result> <count>1</count> <url> <encode> <! [Cdata[http://zhangmenshiting.baidu.com/data2/music/12762845/ Ymrqamdua21fn6nndk6ap5wxcjlrmg1xljhobwibmgpjk5ztmwizcwrjz5lqbgyelgkwlztubgljz5lka2uanwsxy1qin5t1ywbmzw5ocglhawdnbgtqbze $]]> </encode> <decode> <! [cdata[12762845.mp3?xcode=e6b69cf593ea22ac9d2b9314e565fc0caf85125f065ce3e0&mid=0.31929107437537]]> </ decode> <type>8</type> <lrcid>2829</lrcid> <flag>1</flag> </url> < Durl> <encode> <! [Cdata[http://zhangmenshiting2.baidu.com/data2/music/7345405/ Agvnawlmbgaeomzzrzmmnjzvmgqxbhcbl2dsz5qxawqslwpsmmdrb2mxamxpbxcclgnsmw2ba25mymxtapmzcwqtwagemnrox2vkbwdvaghozmzramluoa $$]]> </encode> <decode> <! [Cdata[7345405.mp3?xcode=e6b69cf593ea22ac78e1478e78479dc19e8e4650995cb99a&mid=0.31929107437537]]> </decode> <type>8</type> <lrcid>2829</lrcid> <flag>1</flag> </durl> <p2p>
The simple explanation, since all we have to do is get to the LRC lyrics address of the song, so the only useful one is 2829 this label.
and encode and decode inside the stitching up is the MP3 download address, as in this case
http://zhangmenshiting.baidu.com/data2/music/12762845/ Ymrqamdua21fn6nndk6ap5wxcjlrmg1xljhobwibmgpjk5ztmwizcwrjz5lqbgyelgkwlztubgljz5lka2uanwsxy1qin5t1ywbmzw5ocglhawdnbgtqbze $12762845.mp3?xcode=e6b69cf593ea22ac9d2b9314e565fc0caf85125f065ce3e0&mid=0.31929107437537
is to download the address, but the sound quality is too poor, there is time to study this.
Keep saying the lyrics, notice the 2829 in the lrcid tag.
http://box.zhangmen.baidu.com/bdlrc/This is Baidu LRC lyrics store address,
And then this example's lyrics address is HTTP://BOX.ZHANGMEN.BAIDU.COM/BDLRC/28/2829.LRC
See, the two digits behind the address of the lyrics are the integers that lrcid divided by 100, the first number, and then the second number is lrcid, followed by the suffix. LRC's done.
It's easy to get a LRC address, just ask for the address, and then write the acquired content to the file.
Well, that's probably it, here's the code:
Import OS import os.path import re import eyed3 import urllib2 import urllib from urllib import urlencode import sys IM Port OS Reload (SYS) sys.setdefaultencoding (' utf8 ') Music_path = r "E:\music" Lrc_path = r "E:\LRC" Os.remove (' Nolrc.txt ') ) os.remove (' lrcxml.txt ') the_file = open (' Lrcxml.txt ', ' a ') Nolrc_file = open (' Nolrc.txt ', ' a ') for root,dirs,files in Os.walk (Music_path): for filepath in Files:the_path = Os.path.join (Root,filepath) if (The_path.find ("MP3")!=-1): P Rint The_path the_music = eyed3.load (the_path) The_teg = The_music.tag._getalbum () the_artist = The_music.tag._getAr Tist () The_title = The_music.tag._gettitle () # print The_teg # print The_title # print The_artist b = the_title. Replace (', ' + ') # Print B a = The_artist.replace (', ' + ') #print UrlEncode (str (b)) if Isinstance (A,unicode): A = A.encode (' UTF8 ') Song_url = "http://box.zhangmen.baidu.com/x?op=12&count=1&title=" +b+ "$$" +a+ "$ $ $" The_f Ile.write (song_url+ ' \ n ')
page = Urllib2.urlopen (song_url). Read () print page Theid = 0 lrcid = re.compile (' <lrcid> (. *?) </lrcid> ', Re. S). FindAll (page) HAVE_LRC = True if Lrcid!= []: Theid = lrcid[0] else:nolrc_file.write (the_title+ ' \ n ') h AVE_LRC = False Print Theid if have_lrc:firstid = Int (theid)/100 lrcurl = "http://box.zhangmen.baidu.com/bd
lrc/"+str (FirstID) +"/"+theid+". LRC "Print Lrcurl LRC = Urllib2.urlopen (Lrcurl). Read () if (Lrc.find (' html ') = = 1): Lrcfile = open (lrc_path+ "\ +the_title+". LRC "," W ") Lrcfile.writelines (LRC) lrcfile.close () Else:nolrc_file.wri Te (the_title+ ' \ n ') the_file.close () nolrc_file.close () print "end!"
Useful the first step of the request is obtained in the XML format, so would like to parse XML to get lrcid, but in the implementation of the process encountered a variety of problems, and other easy, in this together the longest waste of time, the tangle of fruit, can only use regular expressions to obtain ... It just means it's not a fine apprentice.
Blog of the Lost days» Use Python to scan local music and download lyrics