Research status of music retrieval:
By search target, music retrieval can be divided into two major categories: searching symbolic data and searching audio data. My understanding of these two categories is that the former retrieves scores, while the latter retrieves audio.
Searching Symbolic Data
1 String-based methods for monophonic melodies
Because these methods turn music into strings, retrieval can draw on many mature string-comparison algorithms, such as computing edit distance, finding the longest common substring, or counting the occurrences of one string in another.
1.1 Distance measurement
1.1.1 Exact match method: the user's input must exactly match a segment of a piece in the database. Matching is done with string-search algorithms such as the KMP algorithm (Knuth-Morris-Pratt) and the BM algorithm (Boyer-Moore).
Representative system: Kornstadt's Themefinder.
This music search engine is aimed at users with professional music knowledge, because the musical parameters must be entered accurately for a search to succeed.
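As a rough illustration of exact matching, here is a minimal KMP sketch over melody strings. Representing each melody as a list of MIDI pitch numbers and searching on the intervals between notes (so that a transposed query still matches) is an assumption made for this example; the names `kmp_find` and `intervals` are hypothetical and are not taken from Themefinder.

```python
def kmp_find(text, pattern):
    """Return the start index of every exact occurrence of pattern in text."""
    if not pattern:
        return []
    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    matches = []
    k = 0  # number of pattern symbols matched so far
    for i, symbol in enumerate(text):
        while k > 0 and symbol != pattern[k]:
            k = failure[k - 1]
        if symbol == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches


def intervals(pitches):
    """Assumed representation: semitone steps between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


melody = [60, 62, 64, 60, 62, 64, 65]   # made-up database melody
query = [67, 69, 71]                    # same shape, transposed up a fifth
print(kmp_find(intervals(melody), intervals(query)))
```

The match is exact in the sense described above: a single wrong interval in the query produces no hit at all, which is why this style of search suits users who can enter the melody precisely.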
1.1.2 Fuzzy matching method: the user's input is matched approximately against the music in the database. The distance measure used is the edit distance.
Representative system: Prechelt and Typke's Musipedia.
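To make the distance measure concrete, the following is a minimal sketch of the classic (Levenshtein) edit distance with unit costs. It compares two whole strings; a real system would more likely search for the best-matching segment and might weight operations differently, so treat this as an illustration rather than Musipedia's actual implementation. The function name `edit_distance` is hypothetical.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1,       # delete x
                            curr[j - 1] + 1,   # insert y
                            substitution))
        prev = curr
    return prev[-1]


# Two melodic contour strings that differ in a single symbol.
print(edit_distance("*UUDDR", "*UUDUR"))
```

A small distance means the query is close to the stored melody, so results can be ranked by increasing distance instead of being accepted or rejected outright.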
Music search engine: Musipedia
Musipedia is a MIDI-based music search engine.
Musipedia mainly searches its own library, but it can also search the whole web; its web search is built on the Alexa search platform. If you have a melody in mind but cannot remember exactly which piece or song it is, you can try to find it here. It offers four ways to search:
Keyboard input Search (Keyboard search)
Play the notes on the on-screen keyboard at the top of the page and then search; usage instructions are given further down the page.
Melodic contour search (Contour search)
This search describes the melodic contour with Parsons Code, which simply records the rises and falls between consecutive notes: when a note is higher than the previous one it is encoded as U, and when it is lower as D (standard Parsons Code also uses R for a repeated note). The first note of a piece has nothing before it to compare with, so in Parsons Code it is written as an asterisk (*).
Humming Search (Sing or Whistle)
You can sing or whistle the melody, or use a recording directly.
Tapping search (Rhythm search)
After pressing Start tapping, tap the rhythm on the keyboard to search.
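The contour encoding used by the contour search above can be sketched in a few lines. The input format (a list of MIDI pitch numbers) and the function name `parsons_code` are assumptions for illustration; the R symbol for a repeated note comes from standard Parsons Code.

```python
def parsons_code(pitches):
    """Encode a pitch sequence as Parsons Code: * for the first note, then U/D/R."""
    if not pitches:
        return ""
    code = ["*"]  # the first note has no predecessor to compare with
    for prev, curr in zip(pitches, pitches[1:]):
        if curr > prev:
            code.append("U")   # up
        elif curr < prev:
            code.append("D")   # down
        else:
            code.append("R")   # repeat
    return "".join(code)


# Opening of "Twinkle, Twinkle, Little Star": C C G G A A G
print(parsons_code([60, 60, 67, 67, 69, 69, 67]))  # *RURURD
```

Because only the direction of each step is kept, the code is unaffected by transposition and tolerant of inaccurate interval sizes, which suits queries from untrained users.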
It is worth mentioning how the query is compared with the MIDI data: mainly with the edit distance and the Earth Mover's Distance.
Musipedia, formerly known as Melodyhound, has been developed by Rainer Typke since 1997. In 2006 the search engine was renamed Musipedia and gained the ability to search for MIDI music on the WWW. Melodyhound is still available at the moment.
1.2 Index Establishment
1.2.1 Strings can often be indexed using methods such as inverted files or B-trees. Music has no equivalent of words, but this is overcome by simply cutting melodies into n-grams (Downie, 1999) and indexing those.
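A minimal sketch of the n-gram idea, assuming each melody is stored as a list of semitone intervals and the index is an in-memory inverted file (a dict from n-gram to melody ids). The names `ngrams`, `build_index` and `search`, the choice n = 3 and the ranking by shared n-gram count are illustrative assumptions, not details from Downie's thesis.

```python
from collections import defaultdict


def ngrams(sequence, n):
    """All overlapping windows of length n, as hashable tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def build_index(melodies, n=3):
    """Inverted file: n-gram -> set of ids of the melodies containing it."""
    index = defaultdict(set)
    for melody_id, intervals in melodies.items():
        for gram in ngrams(intervals, n):
            index[gram].add(melody_id)
    return index


def search(index, query, n=3):
    """Rank melody ids by how many distinct query n-grams they contain."""
    counts = defaultdict(int)
    for gram in set(ngrams(query, n)):
        for melody_id in index.get(gram, ()):
            counts[melody_id] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


melodies = {                      # made-up interval sequences
    "a": [2, 2, -4, 2, 2, 1],
    "b": [0, 7, 0, 2, 0, -2],
    "c": [2, 2, 1, 2, 2],
}
index = build_index(melodies)
print(search(index, [2, 2, 1, 2]))
```

The n-grams play the role that words play in text retrieval: only melodies sharing at least one n-gram with the query are ever touched, so the lookup cost does not grow with the whole collection.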
References:
J. S. Downie. Evaluating a simple approach to music information retrieval: Conceiving melodic n-grams as text. PhD thesis, University of Western Ontario, London, Ontario, Canada, 1999.
R. Typke, P. Giannopoulos, R. C. Veltkamp, F. Wiering, and R. van Oostrum. Using transportation distances for measuring melodic similarity. In ISMIR Proceedings, pages 107–114, 2003.
2 Set-based methods for polyphonic music
In this approach, music is treated as a set of note events, each described by attributes such as pitch, onset time, and duration.
2.1 Distance Measurement
2.1.1 Finding supersets
M. Clausen, R. Engelbrecht, D. Meyer, and J. Schmitz. PROMS: A web-based tool for searching in polyphonic music. In ISMIR Proceedings, 2000.
2.1.2 Earth Mover's Distance
R. Typke, P. Giannopoulos, R. C. Veltkamp, F. Wiering, and R. van Oostrum. Using transportation distances for measuring melodic similarity. In ISMIR Proceedings, pages 107–114, 2003.
Representative system: Typke's Orpheus. In this system each note is represented as a two-dimensional vector containing its onset time and pitch, the Earth Mover's Distance serves as the distance measure, and vantage objects are used for indexing.
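To give a feel for the Earth Mover's Distance on note sets, here is a deliberately simplified sketch. It assumes both sets have the same number of notes and that every note carries the same weight; in that special case the EMD reduces to the cheapest one-to-one assignment divided by the number of notes, which is found here by brute force over permutations (only practical for a handful of notes). The Euclidean ground distance on unscaled (onset, pitch) pairs and the name `emd_equal_weights` are assumptions; the general EMD with unequal weights requires a transportation solver and is not shown.

```python
from itertools import permutations
from math import hypot


def emd_equal_weights(notes_a, notes_b):
    """EMD between two same-sized note sets where every note has weight 1/n.

    Each note is an (onset_time, pitch) pair. Brute force: O(n!) assignments.
    """
    if len(notes_a) != len(notes_b) or not notes_a:
        raise ValueError("this sketch needs two non-empty sets of equal size")
    n = len(notes_a)
    cheapest = min(
        sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(notes_a, perm))
        for perm in permutations(notes_b)
    )
    return cheapest / n  # average distance each unit of weight is moved


melody = [(0, 60), (1, 62), (2, 64)]
shifted = [(0, 61), (1, 63), (2, 65)]    # same melody, one semitone higher
print(emd_equal_weights(melody, melody))
print(emd_equal_weights(melody, shifted))
```

Since the notes are treated as an unordered set, the result does not depend on the order in which they are listed, which is what makes the measure usable for polyphonic material.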
2.2 Index Establishment
2.2.1 Inverted files
M. Clausen, R. Engelbrecht, D. Meyer, and J. Schmitz. PROMS: A web-based tool for searching in polyphonic music. In ISMIR Proceedings, 2000.
2.2.2 Triangle Inequality for indexing
R. Typke, P. Giannopoulos, R. C. Veltkamp, F. Wiering, and R. van Oostrum. Using transportation distances for measuring melodic similarity. In ISMIR Proceedings, pages 107–114, 2003.
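The idea behind triangle-inequality indexing with vantage objects can be sketched as follows, assuming the distance function is a true metric. Distances from every database item to a few fixed vantage objects are precomputed; at query time, |d(q, v) - d(x, v)| is a lower bound on d(q, x), so any item whose bound exceeds the search radius can be skipped without computing the expensive distance. The function names and the toy 2-D points with Euclidean distance are stand-ins for illustration, not the cited paper's setup.

```python
from math import dist  # Euclidean distance, used here as a stand-in metric


def build_vantage_table(items, vantage_objects, distance):
    """Precompute the distance from every database item to every vantage object."""
    return {key: [distance(obj, v) for v in vantage_objects]
            for key, obj in items.items()}


def range_search(query, items, vantage_objects, table, distance, radius):
    """Return (matches within radius, number of full distance computations)."""
    query_to_vantage = [distance(query, v) for v in vantage_objects]
    matches, computed = [], 0
    for key, obj in items.items():
        # Triangle inequality: d(q, x) >= |d(q, v) - d(x, v)| for every v.
        lower_bound = max(abs(qv - xv)
                          for qv, xv in zip(query_to_vantage, table[key]))
        if lower_bound > radius:
            continue  # cannot be within the radius, skip the expensive call
        computed += 1
        d = distance(query, obj)
        if d <= radius:
            matches.append((key, d))
    return sorted(matches, key=lambda m: m[1]), computed


items = {"a": (0, 0), "b": (1, 0), "c": (10, 0), "d": (10, 1)}
vantage = [(0, 0), (10, 10)]
table = build_vantage_table(items, vantage, dist)
print(range_search((0.5, 0), items, vantage, table, dist, radius=1.0))
```

In this toy run only two of the four items need a full distance computation; with a costly measure such as the Earth Mover's Distance, that pruning is where the time is saved.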
3 Probabilistic Matching
A hidden Markov model is trained and then used to calculate the similarity between the query and the entries in the database.
3.1 Distance measurement
First, the hidden Markov model is trained on the database entries; the degree of similarity is then obtained by calculating the posterior probability of the query.
Representative system: Hoos's GUIDO/MIR
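As a hedged illustration of scoring a query against a trained model, the sketch below uses the standard forward algorithm to compute the probability of an observation sequence under a hidden Markov model. Training is omitted, the two-state model and all of its probabilities are made-up toy values, the observations are assumed to be contour symbols (U, D, R), and the result is the likelihood of the query given the model rather than a full posterior. None of this is taken from GUIDO/MIR itself.

```python
def forward_likelihood(observations, start, transition, emission):
    """P(observations | model), computed with the forward algorithm.

    start[s]          probability of starting in hidden state s
    transition[p][s]  probability of moving from state p to state s
    emission[s][o]    probability of emitting symbol o while in state s
    """
    if not observations:
        return 1.0
    states = range(len(start))
    alpha = [start[s] * emission[s][observations[0]] for s in states]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in states) * emission[s][symbol]
            for s in states
        ]
    return sum(alpha)


# Toy model with two hidden states (all numbers are illustrative only).
start = [0.6, 0.4]
transition = [[0.7, 0.3],
              [0.4, 0.6]]
emission = [{"U": 0.6, "D": 0.3, "R": 0.1},
            {"U": 0.2, "D": 0.5, "R": 0.3}]

print(forward_likelihood("UUD", start, transition, emission))
```

With one model per database entry (or per cluster), the query can be scored against each model and the entries ranked by decreasing probability.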
3.2 Index Establishment
Hierarchical clustering is used, which organizes the index as a tree.
References:
H. Hoos, K. Renz, and M. Görg. GUIDO/MIR: an experimental musical information retrieval system based on GUIDO Music Notation. In ISMIR Proceedings, pages 41–50, 2001.
Searching Audio Data
1 Extracting perceptually relevant features
The audio is cut into short segments, perceptual features are extracted from each segment, and retrieval is performed by comparing the resulting feature sequences. The main features are loudness, pitch, tone, mel-frequency cepstral coefficients (MFCCs), and derivatives.
Representative system: Jang's Super MBox
The system first extracts the fundamental frequency of the music, then compares the similarity of two fundamental-frequency sequences by dynamic time warping.
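The dynamic time warping step can be sketched as follows. This is the textbook algorithm with an absolute-difference local cost and no slope or window constraints; expressing the pitch track in semitones (MIDI numbers) instead of Hz is an assumption for readability, and the code is not taken from Super MBox.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


sung = [60, 60, 62, 64]           # made-up pitch track of a hummed query
stored = [60, 62, 62, 62, 64]     # same notes held for different lengths
print(dtw_distance(sung, stored))       # 0.0: only the timing differs
print(dtw_distance([60, 62], [61, 63]))
```

Because the alignment may stretch or compress either sequence, a query sung faster or slower than the stored melody still obtains a low cost, which is the property that makes the method suitable for humming queries.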
2 Audio Fingerprinting
In complex, noisy environments, audio fingerprints can achieve better results. This is also the direction I plan to focus on in my own study.
Representative system: Wang's Shazam
3 Set-based methods
Retrieval is performed with a set of audio features.
4 Self-organizing map
The SOM is a very common artificial neural network algorithm, mainly used in unsupervised learning; here it is used to cluster and classify similar audio.
Representative system: Rauber's SOMeJB, the SOM-enhanced JukeBox