Some time ago I read a blog post describing how the author used Java to implement the basic functionality of Shazam, the famous music search tool. That article also led me to a paper and another blog about Shazam. After reading them I found that the principle is not very complex, but the method is remarkably robust to noise. Out of curiosity, I decided to implement a simple music search tool in Python myself: Song Finder. Its core functionality is encapsulated in SFEngine, and the only third-party dependency is scipy.
Tool Demo
This demo shows the tool in use under IPython. The project is named Song Finder, and its indexing and search functions are encapsulated in SFEngine. First, some simple preparation:
In [1]: from SFEngine import *

In [2]: engine = SFEngine()
Next we index the existing songs. I prepared dozens of songs (.wav files) in the original directory as the music library:
In [3]: engine.index('original') # index all songs in this directory
With the index built, we ask Song Finder to search for a recording made with background noise, in this case a recording of the passage of "Maple" starting at 1 minute 15 seconds:
The tool returns:
In [4]: engine.search('record/record0.wav')
original/Jay - Maple 73
original/Jay - Maple 31
original/Jay - Maple 10
original/Jay - Maple 28
original/I Want to Be Happy - Hui Mei 28
The output shows the song name and the position of the clip within the song (in seconds). You can see that the tool not only retrieved the correct song, but also found the clip's correct position within it.
And for a recording of "Fairy Tale" starting at 1 minute 05 seconds, with much heavier background noise, the tool returns:
In [5]: engine.search('record/record8.wav')
original/Guang Liang - Fairy Tale 67
original/Guang Liang - Fairy Tale 39
original/Guang Liang - Fairy Tale 33
original/Guang Liang - Fairy Tale 135
original/Guang Liang - Fairy Tale 69
Despite the heavy noise, the tool still successfully identifies the song and locates the clip's correct position within it, showing good robustness in noisy environments!
Project home: Github
How Song Finder Works
Retrieving a recorded fragment from a given music library is a search problem through and through, but searching audio is not as straightforward as searching documents or structured data. To search for music, the tool needs to accomplish three tasks:
- Extract features from every song in the music library
- Extract features from the recorded clip in the same way
- Search the music library using the clip's features, returning the most similar song and the clip's position within it
Feature extraction? Discrete Fourier transform!
To extract features from music (audio), a straightforward idea is to capture pitch information, and pitch physically corresponds to the frequency of the sound wave. A direct way to obtain this information is the discrete Fourier transform: slide a window over the sampled audio and apply a DFT to the data in each window, transforming time-domain information into the frequency domain. scipy's interface makes this easy. We then divide the spectrum into bands and extract, within each band, the frequency with the largest amplitude:
```python
def extract_feature(self, scaled, start, interval):
    end = start + interval
    dst = fft(scaled[start:end])
    length = len(dst) // 2
    normalized = abs(dst[:length - 1])
    feature = [normalized[:50].argmax(),
               50 + normalized[50:100].argmax(),
               100 + normalized[100:200].argmax(),
               200 + normalized[200:300].argmax(),
               300 + normalized[300:400].argmax(),
               400 + normalized[400:].argmax()]
    return feature
```
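As a sanity check on the band-peak idea, here is a self-contained sketch of my own (not part of SFEngine) that runs the same band-wise argmax over a synthetic 440 Hz tone using numpy's FFT. With an 8192-sample window at 44100 Hz, bin k corresponds to k * 44100 / 8192 ≈ k * 5.38 Hz, so the peak should land near bin 82:

```python
import numpy as np

def band_peaks(window):
    # Band-wise peak bins over the first half of the spectrum,
    # mirroring the six bands used by the feature extractor above.
    spectrum = np.abs(np.fft.fft(window)[: len(window) // 2 - 1])
    bands = [(0, 50), (50, 100), (100, 200),
             (200, 300), (300, 400), (400, len(spectrum))]
    return [lo + int(spectrum[lo:hi].argmax()) for lo, hi in bands]

rate, n = 44100, 8192                   # sample rate and window size from the post
t = np.arange(n) / rate
window = np.sin(2 * np.pi * 440.0 * t)  # a pure 440 Hz tone
feature = band_peaks(window)
# Bin k maps to k * rate / n ≈ k * 5.38 Hz, so the peak of the
# 50-100 band should sit near bin 82 (≈ 441 Hz).
```

The same feature vector is stable under moderate noise, since noise rarely displaces the strongest bin within a band.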
For each sliding window, then, I extract 6 peak frequencies as its feature. For the whole audio clip, we call this function repeatedly:
```python
def sample(self, filename, start_second, duration=5, callback=None):
    start = start_second * 44100
    if duration == 0:
        end = 1e15
    else:
        end = start + 44100 * duration
    interval = 8192
    scaled = self.read_and_scale(filename)
    length = scaled.size
    while start < min(length, end):
        feature = self.extract_feature(scaled, start, interval)
        if callback is not None:
            callback(filename, start, feature)
        start += interval
```
Here 44100 is the sampling rate of the audio files themselves, and 8192 is the sampling window size I chose (yes, hardcoding like this is bad practice). callback is a function passed in by the caller; this parameter is needed because different scenarios require different follow-up handling of the resulting features.
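To make the role of the callback concrete, here is a tiny sketch with hypothetical names (not SFEngine's actual code): the same windowing loop can feed different consumers, one per scenario:

```python
collected = []

def on_window(filename, start, feature):
    # One possible follow-up: record what each window produced.
    # An indexing callback would store this in the library's hash table;
    # a search callback would gather it for the matching step.
    collected.append((filename, start, feature))

# Stand-in for the windowing loop, advancing by the 8192-sample
# window size, with made-up features:
for start, feature in [(0, [30, 82]), (8192, [31, 83])]:
    on_window('record/clip.wav', start, feature)
```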
Matching Against the Music Library
With a large number of features extracted from both the songs and the recording, the question becomes how to search efficiently. An effective approach is to build a special hash table whose keys are feature frequencies and whose values are lists of (song name, time) tuples. Each entry records that a certain song exhibits a certain feature frequency at a certain time, but indexed by the frequency rather than by song name or time.
The advantage is that whenever a feature frequency is extracted from the recording, we can look up in this hash table every song and time associated with that frequency!
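A minimal sketch of such an inverted index, assuming hypothetical song names and made-up feature frequencies (not SFEngine's actual data structures):

```python
from collections import defaultdict

# feature frequency bin -> list of (song name, time in window units)
index = defaultdict(list)

def add_features(song, time, feature):
    # Register each of the window's peak frequencies under that frequency.
    for freq in feature:
        index[freq].append((song, time))

# Toy library: a few windows from two songs.
add_features('original/Jay - Maple', 10, [30, 82, 150])
add_features('original/Jay - Maple', 11, [31, 82, 151])
add_features('original/Guang Liang - Fairy Tale', 40, [12, 82, 130])

# Everything that ever peaked at bin 82 is one lookup away:
hits = index[82]
```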
Of course, the hash table alone is not enough. We cannot simply pull out every song associated with the recording's feature frequencies and count which one is hit most often, because that would completely ignore the timing information within the songs and introduce false matches.
Our approach is this: for a feature frequency f at time t in the recording, we pull all the (song name, time) tuples associated with f from the library's hash table, for example

[(s1, t1), (s2, t2), (s3, t3)]

We then align them in time, subtracting t to get

[(s1, t1-t), (s2, t2-t), (s3, t3-t)]

which we write as

[(s1, t1'), (s2, t2'), (s3, t3')]
We do this for every feature frequency at every point in time in the recording, producing one large list:

[(s1, t1'), (s2, t2'), (s3, t3'), ..., (sn, tn')]

Counting this list shows which song collects the most hits, and the (song name, time) pairs with the most hits are returned to the user.
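The alignment-and-count step can be sketched as follows (a toy example with a hypothetical index, not the project's code). Because every window of the true song votes for the same (song, offset) pair, the correct match dominates the count:

```python
from collections import Counter

def best_match(record_features, index):
    # record_features: list of (t, [feature frequencies]) from the recording.
    # index: feature frequency -> list of (song, t1) entries from the library.
    votes = Counter()
    for t, feature in record_features:
        for freq in feature:
            for song, t1 in index.get(freq, []):
                votes[(song, t1 - t)] += 1  # align: shift library time by t
    return votes.most_common(1)[0]

# Toy index: the recording is songA starting 30 windows in.
index = {
    100: [('songA', 30), ('songB', 5)],
    200: [('songA', 31)],
    300: [('songA', 32), ('songB', 50)],
}
recording = [(0, [100]), (1, [200]), (2, [300])]
(song, offset), count = best_match(recording, index)
# songA wins with a consistent offset of 30 and 3 votes.
```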
Limitations
This little tool is a hack written over a few hours, and there is plenty of room for improvement, for example:
- Currently only WAV files are supported, for both the music library and recordings.
- All data is kept in memory; as the music library grows, better back-end storage will be needed.
- Indexing and matching should both be parallelized; the matching step is a typical map-reduce.