Find CD information online

Source: Internet
Author: User
Tags: cddb, locale

I. Overview

CDDB stands for CD database. As the name suggests, CDDB is an online database: music enthusiasts and CD publishers all over the world can submit CD information to it over the network, or query it for CD information, including the album name, artist, year of publication, genre, the title of each track, and so on.

CDDB serves several purposes, including but not limited to the following.

It makes sharing music over the network more convenient. One could even say that CDDB grew up together with online music sharing. If everyone bought CDs to listen to, all the information would naturally be printed on the CD cover (please do not bring up unscrupulous pirate sellers to refute this point), so would there be any need to search the Internet? Shared music is different: whether you download a CD image over peer-to-peer, FTP, or some other protocol, if the person sharing it forgot to scan the cover and forgot to type the CD information into a text file to go with the image, you will be completely lost. At that point anyone with a little initiative will naturally think of searching online.

It makes CD playback more convenient. When playing CDs on a computer, some people want more than just the music; they also want to see the title of each track while listening. Software such as Foobar, Windows Media Player, and Real Player can display CD information by querying a CDDB online while playing a CD.

It helps in producing and playing music files such as MP3 and WMA. After ripping, files named Track01, Track02 and so on are tedious, and once there are many of them they become hard to manage. EAC therefore queries CDDB for the CD information when ripping and automatically generates meaningful file names, which looks much nicer. Foobar can also write the information from CDDB into MP3 files (the MP3 format carries ID3v1/v2 tags), so that when an MP3 file is played on its own, the player, whether hardware or software, can display the track title, album name, artist, and so on.

It helps in building and managing your own media library. With a lot of media files stored on your hard disk, management is a hassle. If I had to enter the information by hand for every image and every MP3, I for one would not do it. If the relevant information can be queried online and filled in automatically, things naturally become much easier.

The purpose of this article is certainly not to promote music piracy. I only hope that, by introducing the common CDDBs, readers will understand their characteristics and query methods well enough to obtain useful information from them when needed. Of course, this information must be used only for legitimate purposes.

II. Common CDDBs

1. FreeDB

Official website: http://www.freedb.org
Query page: http://www.freedb.org/freedb_search.php
Chinese support: Very bad
Notable users: EAC (Exact Audio Copy), Foobar

FreeDB is a non-profit organization that maintains the FreeDB website and its CD database, which offers free access to CD information. If you cannot find a CD in the FreeDB database, you can enter its information manually and submit it to FreeDB, enriching the database in the spirit of "all for one, and one for all."

FreeDB can not only be queried; all of its query protocols are open and documented in detail on its website. That is why free software, including EAC and Foobar, uses FreeDB to query CD information, and why it is so popular.

Besides the open protocol, even FreeDB's database content is open: the complete database can be downloaded for free for offline queries. Probably only a website whose name starts with "free" would do such a thing; a .com website would not.

However, FreeDB's operating model, which relies entirely on submissions from music enthusiasts, also limits its content. In my own experience, FreeDB can find most foreign licensed CDs (including licensed imports sold in China), but it finds almost no domestic Chinese releases, probably because few Chinese music enthusiasts submit information to FreeDB.

As for rating its Chinese support "very bad", I mean this: if you type Chinese directly into its query page as a keyword, nothing is found. According to its development documentation, UTF-8 encoding is supported from CDDB protocol level 6 onward, but most software currently uses a protocol level below 6, so I have not yet seen software that supports querying FreeDB with Chinese keywords. If you query by CDID (a CD identifier computed from the start positions of the CD's tracks and the length of the CD, equivalent to the CD's "fingerprint" and normally used to identify each CD in a CDDB), you can retrieve Chinese CD information from FreeDB. But because calculating the CDID requires reading the disc, this can only be done in software (such as Foobar) and is hard to do from the query page.

2. Gracenote

Official website: http://www.gracenote.com
Query page: http://www.gracenote.com/music/
Chinese support: Very good
Notable users: Real Player

The name Gracenote may be unfamiliar to some, but its predecessor, cddb.org, was famous as the earliest, largest, and most complete CDDB on the Internet. cddb.org, however, represented an idealistic era of free sharing; having since moved to a commercial .com operating model, it simply changed its name and went proprietary.

To integrate Gracenote's CDDB queries into your software, you must obtain a Gracenote license. Free licenses are hard to get (my own application, at least, was never answered), so Gracenote is rarely seen in freeware and appears mostly in commercial software such as Real Player.

Gracenote's Chinese support is very good: the web query not only accepts Chinese keywords, but a search with Simplified Chinese keywords even finds CD information entered in Traditional Chinese. If you query European CD information programmatically through the control Gracenote provides and the result contains special Western European characters, Gracenote automatically converts them to pinyin characters so that they display properly in a Chinese environment. This feature seems to be unique to Gracenote; I have not seen it anywhere else. Later, when I wrote Cuecode, a tool for fixing garbled cue files, I borrowed this trick.

In my personal experience, Gracenote undoubtedly has more data than FreeDB. Compared with Microsoft's CDDB, each has its strengths: for foreign CDs the two are about equal; for domestic CDs, sometimes Gracenote finds a disc that Microsoft cannot, and sometimes the reverse.

3. Microsoft CDDB

Official website: said to be under construction and not yet officially announced; probably http://metaservices.windowsmedia.com
Query page: not officially published. I captured the following URL from Windows Media Player, and it may stop working at any time:
http://metaservices.windowsmedia.com/CDWizard/CDWizard1.asp
Chinese support: Very good
Notable users: Windows Media Player

This is Microsoft's own CD database, and it is where Windows Media Player looks up CD information. The rich do things differently after all: Microsoft's database feels larger than FreeDB's, and it can often supply a CD cover image as well, presumably because the data comes directly from CD manufacturers. So far, however, Microsoft has neither published the interface specification for CD queries nor permitted free use, so apart from Microsoft's own Windows Media Player, no publicly available software has been seen using this CDDB.

Microsoft's CDDB also supports Chinese well: the web query not only accepts Chinese keywords, but the page itself is in Chinese, which beats Gracenote. Occasionally it can even find the cover of a Chinese CD.

III. Programming-related

Everything above is done by querying directly through IE, but some queries cannot be made through IE and require dedicated query software developed according to the CDDB query protocol. The technical discussion that follows is only for interested readers.

1. Query process

A web query of a CDDB usually goes like this: enter some keywords, such as a song or album name, or a singer or band name; click "Query"; wait for the results page; then click one of the listed album names to view its contents.

When a CD query is implemented in a program, keyword queries like this are only part of the picture. Most software, such as EAC, Foobar, Real Player, and Windows Media Player, queries by information on the CD itself: after the user inserts the CD into the optical drive, the software automatically reads the disc information, combines it into a CDID, and submits that to the CDDB. Users rarely need to enter keywords themselves.

For FreeDB and Gracenote, the album information is uploaded by users (Gracenote, being commercial, may now obtain data directly from CD publishers, but the foundation laid in its free era presumably remains), so duplicates are inevitable, and the CDID algorithm itself can produce collisions. The FreeDB and Gracenote development documentation therefore requires developers to handle "fuzzy" results, which in practice means multiple matches for one query. The usual approach is to pop up a list and let the user choose, as EAC does. Software that queries FreeDB has to implement this itself, whereas the control provided by Gracenote includes a selection interface for duplicate results.
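To make the handling of multiple matches concrete, here is a minimal Python sketch that parses the text reply to a CDDB-protocol query. It assumes the commonly documented response codes (200 for a single exact match, 210 and 211 for a list of matches terminated by a line containing only a dot, 202 for no match). The function name and the sample reply are made up for illustration and do not come from a real server.

```python
def parse_query_response(text):
    """Return a list of (category, disc_id, title) tuples from a query reply."""
    lines = text.strip().splitlines()
    if not lines:
        return []
    code, _, rest = lines[0].partition(" ")
    if code == "200":
        # Single exact match: the match is on the status line itself.
        category, disc_id, title = rest.split(" ", 2)
        return [(category, disc_id, title)]
    if code in ("210", "211"):
        # Several candidates: one per line until the terminating "." line.
        matches = []
        for line in lines[1:]:
            if line == ".":
                break
            category, disc_id, title = line.split(" ", 2)
            matches.append((category, disc_id, title))
        return matches
    # 202 (no match) and any error code: nothing to offer the user.
    return []


# Invented sample reply with two candidates for the same disc.
SAMPLE_REPLY = (
    "211 Found inexact matches, list follows (until terminating `.')\r\n"
    "rock 0c025803 Some Artist / Some Album\r\n"
    "misc 0c025803 Some Artist / Some Album (Reissue)\r\n"
    ".\r\n"
)
candidates = parse_query_response(SAMPLE_REPLY)
```

When the returned list has more than one entry, a program would show the titles and let the user pick one, which is the pop-up list behavior described above.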

With Microsoft's CDDB I have not come across any duplicates. Perhaps this is because Microsoft gets its data directly from the original CD publishers and its CDID algorithm is more rigorous, so the query results are more accurate. Perhaps...

2. TOC and CDID

As mentioned above, querying a CDDB by the CD's own information means reading the disc information, producing a CDID (a unique identifier for the CD), and then querying with it as the key.

In FreeDB's development documentation the CDID is called the discid, and the algorithm is described there in detail. After studying it, I also analyzed the CDID algorithms of Gracenote and Microsoft's CDDB and found them similar: read the CD's TOC (Table of Contents) through the optical drive to get the number of tracks and the length and start time of each track, then loop over the TOC entries and derive the final CDID from the track lengths and start times.

Although the algorithms are similar, the results differ: FreeDB's CDID is only 32 bits, while those of Gracenote and Microsoft's CDDB are up to 128 bits. I have suspected that this is one reason FreeDB's query results contain more duplicates and inaccuracies, but it is hard to prove. In addition, mapping a TOC to a CDID is clearly a many-to-one hash, which inevitably produces collisions, so fuzzy queries have to be allowed.
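As an illustration of the loop over TOC entries, here is a minimal Python sketch of the 32-bit FreeDB disc ID, following the algorithm as it is commonly described in FreeDB's documentation. The function names and the sample TOC are made up for this example. Offsets are assumed to be in frames (1/75 second) and to include the 2-second lead-in.

```python
FRAMES_PER_SECOND = 75


def digit_sum(n):
    """Sum of the decimal digits of n, e.g. 202 -> 4."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def freedb_disc_id(offsets, leadout):
    """Compute the 8-hex-digit disc ID from track start offsets and lead-out (frames)."""
    # Checksum: digit sums of every track's start time in whole seconds.
    checksum = sum(digit_sum(o // FRAMES_PER_SECOND) for o in offsets)
    # Total playing length in seconds, from the first track to the lead-out.
    length = leadout // FRAMES_PER_SECOND - offsets[0] // FRAMES_PER_SECOND
    disc_id = ((checksum % 0xFF) << 24) | (length << 8) | len(offsets)
    return f"{disc_id:08x}"


# Made-up TOC: three tracks of 200 seconds each, first track at 2 seconds.
example_id = freedb_disc_id([150, 15150, 30150], 45150)
```

Only the checksum modulo 255, the length in seconds, and the track count survive in the result, so two different discs that agree on those three values get the same ID. That is the kind of collision the text refers to.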

So, with no CD at hand and only the APE and cue files ripped from it, can you calculate the CDID and query a CDDB with it?
The answer is: in most cases yes, in a few cases no.

The reason is that the current cue file format lacks two key pieces of information.

First, the starting position of the first track. On a real CD the first track cannot start at 00:00:00, but the cue file (in fact, the ripping process) skips the initial gap, so the first track in a cue file always starts at 00:00:00.

Second, the length of the last track (accurate to 1/75 second). The length of every other track can be calculated as the start time of the next track minus its own start time, but the last track's length cannot be calculated this way.

The first gap affects CD burning, and both affect the calculation of the CDID used for CDDB queries.

The common workaround is to assume that the first track starts at 2 seconds, and to calculate the length of the last track from the APE file.
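The two assumptions can be sketched in Python as follows. This is only an illustration: it assumes a single-file cue sheet with one INDEX 01 line per track and 44.1 kHz audio, and it takes the total number of samples as a parameter, since reading that value from an APE file is outside the scope of the sketch. The function name and the sample cue sheet are invented.

```python
import re

FRAMES_PER_SECOND = 75
SAMPLES_PER_FRAME = 588                       # 44100 samples/s divided by 75 frames/s
ASSUMED_PREGAP_FRAMES = 2 * FRAMES_PER_SECOND  # assumption: first track starts at 2 seconds

INDEX_RE = re.compile(r"INDEX\s+01\s+(\d+):(\d{2}):(\d{2})")


def toc_from_cue(cue_text, total_samples):
    """Rebuild track offsets and the lead-out (in frames) from a cue sheet."""
    offsets = []
    for minutes, seconds, frames in INDEX_RE.findall(cue_text):
        position = (int(minutes) * 60 + int(seconds)) * FRAMES_PER_SECOND + int(frames)
        # Shift every track by the assumed 2-second gap the ripper skipped.
        offsets.append(position + ASSUMED_PREGAP_FRAMES)
    # The end of the audio data stands in for the missing last-track length.
    leadout = total_samples // SAMPLES_PER_FRAME + ASSUMED_PREGAP_FRAMES
    return offsets, leadout


SAMPLE_CUE = """\
FILE "album.ape" WAVE
  TRACK 01 AUDIO
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    INDEX 01 03:20:00
  TRACK 03 AUDIO
    INDEX 01 06:40:00
"""
offsets, leadout = toc_from_cue(SAMPLE_CUE, 600 * 44100)
```

If the real disc did not start at 2 seconds, or the audio file was re-encoded and its length changed, the reconstructed values will be off by a few frames or more, and the CDID computed from them may not match.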

With these two assumptions, in most cases you can look up CD information in a CDDB from APE+CUE; after all, even if the values are slightly off, fuzzy queries help. In a few cases, though, there are problems.

Although the first track of most CDs starts at 2 seconds, not all do. I have one CD whose first track has a starting position of 1/75 second; it cannot be found in Microsoft's CDDB by CDID, but a manual keyword query finds it.

When the last track's length is calculated from the APE, the result is often inaccurate if the APE has been split into tracks, or worse, converted to MP3 and back. For example, I tried querying the 10-CD set "Teresa Teng Music Codex". For the discs ripped as a single whole-disc APE file without track splitting, querying worked reasonably well, and all were found except discs 9 and 10. Of the track-split rips, discs 1 and 3 could never be found, presumably because the length of the last track had changed.

So until software that generates and reads cue files is improved to record the two pieces of information mentioned above, you can only pray that the discs you download start at 2 seconds and that the APE files were ripped from a genuine original disc rather than from a disc burned after conversion (conversion may change the length of the last track).

The problem can also be put this way. Suppose you buy a CD and insert it into the optical drive. If Windows Media Player finds it directly, I cannot say whether the CD is pirated or not. But if it is not found from the disc, and yet entering the album name, artist, track title, and so on in Windows Media Player does find it, then I would bet you a penny that this CD was almost certainly burned by an unscrupulous dealer from APE or even MP3 files. With FreeDB and Gracenote, fuzzy queries mean that a not-too-accurate copy may still slip through; with Microsoft's CDDB it is much harder.

On the subject of cue files, I have written a post, "cue documents," published in the Kenter Red Fast Green Music forum, but few people seem to have read it.

3. Programming interface

FreeDB's query protocol is fully open, and the relevant technical documentation is available on its website. Many people have written query code according to the protocol, and some of it is open source and can be used directly. When I wrote Freemp3tag, I used PJ Naughter's mfccddb, but improved the socket connection part to raise the connection success rate.
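For readers who have not seen the protocol, here is a minimal Python sketch of how a disc query can be assembled, based on the publicly documented command form "cddb query discid ntrks off1 off2 ... nsecs" and its HTTP variant with cmd, hello, and proto parameters. The server URL, user name, host name, and client name below are placeholders, not real endpoints, and the function names are invented for this example.

```python
from urllib.parse import urlencode

FRAMES_PER_SECOND = 75


def build_query_command(disc_id, offsets, leadout):
    """Build the text of a CDDB 'query' command from a TOC given in frames."""
    nsecs = leadout // FRAMES_PER_SECOND  # total disc length in seconds
    parts = ["cddb", "query", disc_id, str(len(offsets))]
    parts += [str(offset) for offset in offsets]
    parts.append(str(nsecs))
    return " ".join(parts)


def build_http_query(base_url, command, user, host, client, version, proto=6):
    """Wrap a command in the HTTP form of the protocol (spaces become '+')."""
    params = {
        "cmd": command,
        "hello": f"{user} {host} {client} {version}",
        "proto": str(proto),  # level 6 is the one the text says supports UTF-8
    }
    return base_url + "?" + urlencode(params)


command = build_query_command("0c025803", [150, 15150, 30150], 45150)
url = build_http_query(
    "http://cddb.example.invalid/~cddb/cddb.cgi",  # placeholder server
    command, "someuser", "somehost", "DemoClient", "0.1",
)
```

The sketch only builds the request; sending it and handling timeouts or retries, which is where the connection success rate mentioned above comes in, is left out.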

Gracenote's programming interface is simpler: after applying for a non-commercial ID, you can download an ActiveX control together with usage examples in various development languages, embed the control in your own program, and call the query interface as the examples show. The trouble is that once development is complete, officially releasing your software requires applying for a separate release ID. I do not know whether anyone else has obtained one; my own application never received a reply.

Microsoft's query interface has never been published, so there is no directly relevant documentation or code to refer to. But for anyone who can read HTML source, it is not hard to work out just by watching the traffic between Windows Media Player and the server.

4. Discussion on Chinese support

If you query CDs by CDID, there is of course no Chinese problem, because a CDID is just a string of numbers computed from the TOC. But when you query by keyword (album name, track title, artist, and so on), the three CDDBs differ in their level of support, as mentioned earlier. Here I would like to explore why.

I personally think the core of the Chinese problem is encoding. Chinese keywords entered by the user must be sent to the CDDB server. Because the bytes of Chinese characters have the high bit set, IE must first encode the keywords before sending them, and the server, on receiving the query request, must decode what IE encoded. This encode-then-decode process requires the decoder to know exactly which code page the encoder used. Unfortunately, FreeDB does not handle this well, so everything is scrambled after decoding. To give a simple example: the user enters "Teresa Teng" (邓丽君), and IE converts it from the Simplified Chinese code page to UTF-8 and sends it. The FreeDB server receives the request but does not know that the UTF-8 string was encoded from Simplified Chinese, so it can only decode according to whatever code page it has configured. If it decodes as Japanese, for example, the result is three entirely different, garbled characters. With the wrong keyword, the query naturally finds nothing.
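The mismatch described here is easy to reproduce. The following Python sketch encodes the three characters as UTF-8 and then decodes the bytes once with the right codec and once with a Japanese one (Shift-JIS is chosen here only as an example of a wrong code page; which code page the FreeDB server actually used is not known).

```python
keyword = "\u9093\u4e3d\u541b"  # the three characters of the artist name

# What the browser sends: the UTF-8 bytes of the keyword.
sent_bytes = keyword.encode("utf-8")

# A server that knows the encoding recovers the keyword exactly.
decoded_correctly = sent_bytes.decode("utf-8")

# A server that assumes a different code page gets something else entirely.
# errors="replace" substitutes a marker for bytes that are invalid in Shift-JIS.
decoded_wrongly = sent_bytes.decode("shift_jis", errors="replace")

print(sent_bytes.hex(" "))
print(decoded_correctly == keyword)
print(decoded_wrongly == keyword)
```

The bytes on the wire are identical in both cases; only the decoder's assumption differs, which is why the search can fail even though the user typed the keyword correctly.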

Here are the result URLs from querying the three CDDBs for "Teresa Teng" (邓丽君):

http://www.freedb.org/freedb_search.php?words=%26%2337011%3b%26%2320029%3b%26%2321531%3b&allfields=no&fields=artist&fields=title&allcats=yes&grouping=none

http://www.gracenote.com/music/search-adv.html?q=&qartist=%e9%82%93%e4%b8%bd%e5%90%9b&qdisc=&qtrack=&n=10&x=49&y=15

http://metaservices.windowsmedia.com/CDWizard/CDWizard2.asp?Wmpfriendly=true&locale=804&searchtype=artist&searchstring=-28525,20029,21531&mode=displayartistalbums&albumid=&artistid={b50f45ab-1870-480a-b1e6-6c626b98709b}&version=1.0&Svolume=1&scdtoc=

Microsoft's searchstring contains the decimal Unicode code points of the three characters of "Teresa Teng" (邓丽君); the first, 37011, appears as the signed 16-bit value -28525. Gracenote's qartist contains the 3-byte UTF-8 encoding of each of the three characters, which follows this bit pattern:
U-00000800 - U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx

I have not been able to work out FreeDB's encoding.
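The three parameter values can be decoded with a few lines of Python. The strings below are copied from the URLs above; the function names are invented for this sketch. One observation, offered with caution: URL-decoding the FreeDB value appears to yield HTML numeric character references (of the form &#NNNNN;), which would mean the browser fell back to character references because it could not represent the characters in the page's encoding.

```python
import html
from urllib.parse import unquote

FREEDB_WORDS = "%26%2337011%3b%26%2320029%3b%26%2321531%3b"
GRACENOTE_QARTIST = "%e9%82%93%e4%b8%bd%e5%90%9b"
MICROSOFT_SEARCHSTRING = "-28525,20029,21531"


def decode_freedb(value):
    """URL-decode, then resolve HTML numeric character references."""
    return html.unescape(unquote(value))


def decode_gracenote(value):
    """URL-decode the percent-escaped bytes as UTF-8."""
    return unquote(value, encoding="utf-8")


def decode_microsoft(value):
    """Treat each number as a signed 16-bit UTF-16 code unit."""
    return "".join(chr(int(number) & 0xFFFF) for number in value.split(","))


raw_freedb = unquote(FREEDB_WORDS)           # the character references themselves
from_freedb = decode_freedb(FREEDB_WORDS)
from_gracenote = decode_gracenote(GRACENOTE_QARTIST)
from_microsoft = decode_microsoft(MICROSOFT_SEARCHSTRING)
```

All three values decode to the same three characters, so the keyword itself left the browser intact in every case; the difference lies in how each server interprets what it receives.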

In fact, when querying through the web, the HTTP request header that IE sends to the CDDB server includes the following line: Accept-Language: zh-cn
My guess is that Microsoft and Gracenote use this to determine that the query comes from a Simplified Chinese page, while FreeDB makes no such check. Having determined this, Microsoft simply puts the LCID for Simplified Chinese (804) into the locale field of the URL (on the Simplified Chinese version of Windows XP, the default value under the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\Language is 804).
