This article mainly introduces how to use the Baidu Music Search api in python to download the lrc lyrics of a specified song and analyze the songs, if you need a friend, you can refer to this article. this is really a dry product. I got it for half a night last night. well, I started to write it after dinner at, and it's been ,,, newbie, sorry ....
To put it simply, Baidu provides an api for music search.
Http://box.zhangmen.baidu.com/x? Op = 12 & count = 1 & title = Friend loss $ Chan Xun $
, Baidu will return you a piece of xml, as shown below
This XML file does not appear to have any style information associated with it. The document tree is shown below.
1
http://zhangmenshiting.baidu.com/data2/music/12762845/YmRqamdua21fn6NndK6ap5WXcJlrmG1xlJhobWibmGpjk5ZtmWiZcWRjZ5lqbGyelGKWlZtubGljZ5lka2uanWSXY1qin5t1YWBmZW5ocGlhaWdnbGtqbzE$
12762845.mp3?xcode=e6b69cf593ea22ac9d2b9314e565fc0caf85125f065ce3e0&mid=0.31929107437537
8
2829
1
http://zhangmenshiting2.baidu.com/data2/music/7345405/aGVnaWlmbGaeomZzrZmmnJZvmGqXbHCbl2dsZ5qXaWqSlWpsmmdrb2mXamxpbXCclGNsmW2ba25mYmxtapmZcWqTWaGemnRoX2VkbWdvaGhoZmZramluOA$$
7345405.mp3?xcode=e6b69cf593ea22ac78e1478e78479dc19e8e4650995cb99a&mid=0.31929107437537
8
2829
1
f98b6772aa97966550ec80617879becee0233bf4
mp3
3778335
128
In a simple description, we only need to obtain the lrc address of the song, so only the 2829 tag is useful.
However, the combination of encode and decode is mp3. In this example
http://zhangmenshiting.baidu.com/data2/music/12762845/YmRqamdua21fn6NndK6ap5WXcJlrmG1xlJhobWibmGpjk5ZtmWiZcWRjZ5lqbGyelGKWlZtubGljZ5lka2uanWSXY1qin5t1YWBmZW5ocGlhaWdnbGtqbzE$12762845.mp3?xcode=e6b69cf593ea22ac9d2b9314e565fc0caf85125f065ce3e0&mid=0.31929107437537
That is, but the sound quality is too poor. I have time to study this.
Continue with the lyrics. pay attention to the 2829 in the lrcid label.
Http://box.zhangmen.baidu.com/bdlrc/ this is Baidu lrc lyrics storage address,
The lyrics in this example are then http://box.zhangmen.baidu.com/bdlrc/28/2829.lrc
You can see that the calculation method of the two numbers after the lyrics address is the integer obtained after dividing lrcid by 100, that is, the first number, the second number is lrcid, and then the suffix is added. lrc is done.
It is easy to get the lrc address. you only need to request the address and write the obtained content to the file.
Okay, this is probably the case. the following is the code:
import osimport os.pathimport reimport eyed3import urllib2import urllibfrom urllib import urlencodeimport sys import osreload(sys)sys.setdefaultencoding('utf8') music_path = r"E:\music"lrc_path = r"e:\lrc" os.remove('nolrc.txt')os.remove('lrcxml.txt') the_file = open('lrcxml.txt','a')nolrc_file = open('nolrc.txt','a') for root,dirs,files in os.walk(music_path): for filepath in files: the_path = os.path.join(root,filepath) if (the_path.find("mp3") != -1): print the_path the_music = eyed3.load(the_path) the_teg = the_music.tag._getAlbum() the_artist = the_music.tag._getArtist() the_title = the_music.tag._getTitle() # print the_teg # print the_title # print the_artist b = the_title.replace(' ','+') # print b a = the_artist.replace(' ','+') #print urlencode(str(b)) if isinstance(a,unicode): a = a.encode('utf8') song_url = "http://box.zhangmen.baidu.com/x?op=12&count=1&title="+b+"$$"+a+"$$$$ " the_file.write(song_url+'\n') page = urllib2.urlopen(song_url).read() print page theid = 0 lrcid = re.compile('
(.*?)
',re.S).findall(page) have_lrc = True if lrcid != []: theid = lrcid[0] else: nolrc_file.write(the_title+'\n') have_lrc = False print theid if have_lrc: firstid = int(theid)/100 lrcurl = "http://box.zhangmen.baidu.com/bdlrc/"+str(firstid)+"/"+theid+".lrc" print lrcurl lrc = urllib2.urlopen(lrcurl).read() if(lrc.find('html')== -1): lrcfile = open(lrc_path+"\\"+the_title+".lrc",'w') lrcfile.writelines(lrc) lrcfile.close() else: nolrc_file.write(the_title+'\n') the_file.close()nolrc_file.close()print "end!"
The first step is to get the request in xml format, so I was thinking about parsing xml to get the lrcid. but in the implementation process, I encountered various problems, and other problems are easy, this is the longest waste of time. after the tangle fails, you can only use a regular expression to obtain it... I can only explain whether learning is not refined.
Original article: blog of past days» use python to scan local music and download lyrics