A music search tool in 90 lines of Python

Source: Internet
Author: User

Some time ago I read a blog post describing how its author used Java to implement the basic functions of Shazam, the well-known music search tool. That post also led me to a paper and another blog post about Shazam. After reading them I found that the principle is not very complicated, yet the method is remarkably robust to noise. Out of curiosity I decided to implement a simple music search tool of my own in Python: Song Finder. Its core functions are encapsulated in SFEngine, and the only third-party dependency is scipy.

Tool Demo

This demo shows the tool in use under IPython. The project is named Song Finder, and its indexing and search functions are encapsulated in SFEngine. First, some simple preparation:

In [1]: from SFEngine import *
In [2]: engine = SFEngine()

Next we index the existing songs. I prepared a few dozen songs (.wav files) in the directory original to serve as the music library:

In [3]: engine.index('original')  # index all songs in this directory

Once indexing is complete, we use Song Finder to search for a recording of a song made with background noise. For this recording, taken about 1 minute 15 seconds into "Maple":

The tool returns:

In [4]: engine.search('record/record0.wav')
original/Jay - Maple 73
original/Jay - Maple 31
original/Jay - Maple 10
original/Jay - Maple 28
original/I want to be happy. - Hui Mei 28

Each line shows a song name and the position of the clip within that song, in seconds. The tool retrieved the correct song name and also found the clip's correct position in the song.

And for this noisier recording, taken about 1 minute 5 seconds into "Fairy Tale":

The tool returns:

In [5]: engine.search('record/record8.wav')
original/Light Liang - Fairy Tale 67
original/Light Liang - Fairy Tale 39
original/Light Liang - Fairy Tale 33
original/Light Liang - Fairy Tale 135
original/Light Liang - Fairy Tale 69

Despite the heavy noise, the tool still identifies the right song and the right position within it, which shows that it is quite robust in noisy environments.

Project home: Github

How Song Finder Works

Retrieving a recorded clip from a given music library is a search problem through and through, but searching audio is not as straightforward as searching documents or data. To search for music, the tool needs to complete the following 3 tasks:

    • Extract features for all songs in the music library
    • Extract features from the recording clip in the same way
    • Search the music library using the features of the recording clip, returning the most similar song and the clip's position in it

Feature extraction? Discrete Fourier transform!

To extract features from music (audio), a very direct idea is to capture its pitch, and pitch corresponds physically to the frequency of the sound wave. A very direct way to obtain this information is to analyze the sound with the discrete Fourier transform: slide a window over the sampled sound and apply the discrete Fourier transform to the data in each window, converting time-domain information into frequency-domain information. scipy's interface makes this easy. We then split the frequencies into bands and, within each band, pick the frequency with the largest amplitude:

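def extract_feature(self, scaled, start, interval):
    # fft is assumed to be imported at module level, e.g. scipy.fftpack.fft
    end = start + interval
    dst = fft(scaled[start:end])
    # Keep the first half of the spectrum; // keeps the slice index an integer
    length = len(dst) // 2
    normalized = abs(dst[:(length - 1)])
    # Strongest bin in each of six bands, offset back to an absolute bin index
    feature = [normalized[:50].argmax(),
               50 + normalized[50:100].argmax(),
               100 + normalized[100:200].argmax(),
               200 + normalized[200:300].argmax(),
               300 + normalized[300:400].argmax(),
               400 + normalized[400:].argmax()]
    return feature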

Thus, for each sliding window, I extract 6 frequencies as its feature. For a whole piece of audio, we call this function repeatedly:

def sample(self, filename, start_second, duration=5, callback=None):
    start = start_second * 44100
    # duration == 0 means "read to the end of the file"
    if duration == 0:
        end = 1e15
    else:
        end = start + 44100 * duration
    interval = 8192
    scaled = self.read_and_scale(filename)
    length = scaled.size
    # Slide the window over the audio, one feature per window
    while start < min(length, end):
        feature = self.extract_feature(scaled, start, interval)
        if callback is not None:
            callback(filename, start, feature)
        start += interval

Here 44100 is the sampling frequency of the audio files themselves and 8192 is the window size I chose (yes, hard-coding these is bad practice). callback is a function passed in by the caller; it is needed because different scenarios do different things with the extracted features.
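To see what the six-band peak picking returns, here is a minimal, self-contained sketch on a synthetic signal. It is not the project's code: it swaps scipy's fft for a naive pure-Python DFT so that it runs without dependencies, and it uses a 1024-sample window instead of 8192 purely for speed. The names naive_dft_magnitudes, peak_bins and bin_to_hz are made up for this illustration.

```python
import cmath
import math

SAMPLE_RATE = 44100  # sampling frequency quoted in the post


def naive_dft_magnitudes(window):
    """Magnitudes of the first half of the DFT (a slow stand-in for scipy's fft)."""
    n = len(window)
    # Precompute the n complex roots of unity used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * j / n) for j in range(n)]
    half = n // 2
    # Mirrors abs(dst[:(length - 1)]) from the post: bins 0 .. half - 2.
    return [abs(sum(x * twiddle[(k * i) % n] for i, x in enumerate(window)))
            for k in range(half - 1)]


def peak_bins(magnitudes, edges=(0, 50, 100, 200, 300, 400)):
    """Return the index of the strongest bin inside each band."""
    bounds = list(edges) + [len(magnitudes)]
    peaks = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        band = magnitudes[lo:hi]
        peaks.append(lo + band.index(max(band)))
    return peaks


def bin_to_hz(k, window_size, sample_rate=SAMPLE_RATE):
    """Centre frequency in Hz of DFT bin k for the given window size."""
    return k * sample_rate / window_size


# Synthetic window (assumption): one pure tone placed in each of the six bands.
N = 1024
tone_bins = [10, 70, 120, 250, 350, 450]
signal = [sum(math.sin(2 * math.pi * b * i / N) for b in tone_bins)
          for i in range(N)]

feature = peak_bins(naive_dft_magnitudes(signal))
print(feature)                                   # the six peak bins
print([bin_to_hz(k, N) for k in feature])        # the same peaks in Hz
```

The extracted feature is exactly the list of bins where the tones were placed, [10, 70, 120, 250, 350, 450]. With the post's 8192-sample window at 44100 Hz, one bin is about 5.4 Hz wide, so the band edges at bins 50, 100, 200, 300 and 400 fall near 269, 538, 1077, 1615 and 2153 Hz, and each window covers roughly 0.19 seconds of audio.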

Matching Against the Music Library

Once we have a large number of features from songs and recordings, searching them efficiently becomes the problem. An effective approach is to build a special hash table in which the key is a frequency and the value is a list of (song name, time) tuples. Each entry records that a certain song has a certain feature frequency at a certain time, but the table is keyed by frequency rather than by song name or time.

The advantage is that when a feature frequency is extracted from a recording, we can look up the songs and times associated with that frequency directly in this hash table.
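A minimal sketch of such a frequency-keyed table, using a plain Python dict. The function name build_index and the toy data are hypothetical, and the project's actual storage layout may differ.

```python
from collections import defaultdict


def build_index(features_by_song):
    """Map each feature frequency to the (song name, time) pairs where it occurs."""
    index = defaultdict(list)
    for song, windows in features_by_song.items():
        for time, feature in windows:
            for freq in feature:
                index[freq].append((song, time))
    return index


# Hypothetical toy library: song name -> list of (time, six peak frequencies).
library = {
    "song_a": [(0, [10, 70, 120, 250, 350, 450]),
               (1, [12, 70, 130, 260, 340, 420])],
    "song_b": [(0, [10, 60, 150, 210, 330, 410])],
}

index = build_index(library)
print(index[10])  # every (song name, time) where frequency 10 was a peak
print(index[70])
```

Looking up a frequency is a single dictionary access, regardless of how many songs are indexed.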

Of course, the hash table alone is not enough. We cannot simply pull out every song associated with the recording's feature frequencies and see which one is hit most often, because that would ignore the songs' timing information entirely and introduce false matches.

Our approach is as follows. For a feature frequency f found at time t in the recording, look up all the (song name, time) tuples associated with f in the music library. For example, suppose we get

[(s1, t1), (s2, t2), (s3, t3)]

We align these by subtracting the recording time t, which gives the list

[(s1, t1-t), (s2, t2-t), (s3, t3-t)]

which we write as

[(s1, t1`), (s2, t2`), (s3, t3`)]

We do this for every feature frequency at every point in time, which gives one large list:

[(s1, t1`), (s2, t2`), (s3, t3`), ..., (sn, tn`)]

Counting the entries in this list shows which (song name, time) pair has the most hits, and the pairs with the most hits are returned to the user.
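The alignment and counting steps can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the function name search, its top parameter, and the toy index and recording are all made up here.

```python
from collections import Counter


def search(index, recording, top=5):
    """Count time-aligned hits and return the most common (song name, offset) pairs.

    index:     dict mapping frequency -> list of (song name, time in song)
    recording: list of (time in recording, list of feature frequencies)
    """
    hits = Counter()
    for t, feature in recording:
        for freq in feature:
            for song, song_time in index.get(freq, ()):
                # Subtracting t aligns the hit to where the recording would start.
                hits[(song, song_time - t)] += 1
    return hits.most_common(top)


# Hypothetical toy index: frequency -> [(song name, time), ...]
index = {
    10: [("song_a", 5), ("song_b", 2)],
    70: [("song_a", 6)],
    120: [("song_a", 7), ("song_b", 9)],
}
# Hypothetical recording: one feature frequency at each of three time steps.
recording = [(0, [10]), (1, [70]), (2, [120])]

print(search(index, recording, top=1))
```

Here song_b shares two frequencies with the recording, but they land on different offsets (2 and 7), so they never add up. All three hits for song_a land on offset 5, so ("song_a", 5) wins with 3 hits. This is why counting aligned pairs, rather than raw frequency hits, suppresses false matches.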

Limitations

This little tool is a hack written in a few hours, and there is plenty of room for improvement, for example:

    • Currently the music library and the recordings must both be in WAV format.
    • All data is held in memory; a better back-end store is needed as the music library grows.
    • Indexing and matching should both be parallelized; the matching model is a typical map-reduce.

Project Home

Github
