Solution to mp3 garbled characters in Ubuntu

Source: Internet
Author: User
Tags: rhythmbox

I recently tried Listen and Banshee and found that garbled mp3 tags are an even more serious problem there than in Rhythmbox. To understand and solve the problem thoroughly, two things must be clear: first, the mp3 tag types and their encodings, and second, how each player reads mp3 tags. Every player presumably has developer documentation that describes this, but I used the crudest method instead: testing them one by one and drawing conclusions from the results. Truth comes from practice, after all.

1. Understand the mp3 tag type and encoding

First, the tag types and encodings. The main standards are ID3v1, ID3v2.3, ID3v2.4, and APEv2. ID3v1 supports only ISO-8859-1 encoding, so strictly speaking it does not support Chinese. That does not mean it cannot hold Chinese text: in practice, the ID3v1 tags of Chinese mp3 files store GBK/GB18030-encoded bytes in these fields. ID3v2 added support for UTF-16, and version 2.4 added UTF-8. However, ID3v2 does not mandate a single encoding for tag content: a version 2.4 tag, for example, may use ISO-8859-1, UTF-16, or UTF-8. APEv2 handles this best. It is extensible and it fixes the encoding to UTF-8, so any player that can read APEv2 will display an APEv2-tagged mp3 file without garbling.
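The garbling itself can be reproduced in a few lines of Python. This is only a sketch of the mechanism, not of any player's code: the sample title is an arbitrary example, and the "reader" is simply a decode step that follows the ID3v1 rule of treating the bytes as ISO-8859-1.

```python
# A Chinese title stored as GBK bytes in a field the standard defines
# as ISO-8859-1. The title is an arbitrary example.
title = "青花瓷"

raw = title.encode("gbk")            # what a GBK-writing tagger stores
garbled = raw.decode("iso-8859-1")   # what a standards-following reader shows

# ISO-8859-1 maps every byte to exactly one character, so the misreading
# loses nothing and can be undone.
recovered = garbled.encode("iso-8859-1").decode("gbk")

print(repr(garbled))
print(recovered)
```

Because the misreading is lossless, the original text can always be restored once the real encoding is known, which is what the conversion tools discussed later rely on.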

2. How various players read mp3 tags

This section looks at how well various players support these standards. The players tested are: Rhythmbox 0.10.0 (shipped with GNOME), Listen 0.5, Banshee 0.12.1+dfsg-3, Quod Libet 0.24, Exaile! 0.2.8, GMPC 0.20, and Audacious 1.2.2.

The test method is simple. I took one mp3 file and wrote different kinds of tags to it (more than 20 combinations in all), writing Chinese text into ID3v1 and ID3v2.3/2.4 tags in different encodings (GBK, for example), and then read the file in each player. Judging by the results, Rhythmbox has the best support for the various mp3 tags, mainly because it can read APEv2 tags. Banshee behaves exactly like the remaining players: none of them can read APEv2. This explains why an mp3 file displays correctly in Rhythmbox but is garbled in the other players. Many mp3 files carry both ID3v1 and APEv2 tags for compatibility. Rhythmbox would garble the ID3v1 tag too, but it reads the APEv2 tag in preference. Banshee and the other players do not support APEv2 and can only read ID3v1, so naturally the text comes out garbled.
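To check which tag blocks a given file actually carries, the fixed markers of each format are enough. The sketch below is a simplified illustration and not how any of the tested players work: `detect_tags` is a hypothetical helper, and it assumes the APEv2 footer sits at the very end of the file or directly before the ID3v1 block, ignoring other trailing data such as Lyrics3.

```python
def detect_tags(data: bytes) -> list:
    """Return the names of the tag blocks found in raw mp3 file bytes."""
    found = []
    # ID3v2 sits at the start: "ID3", then a major version and a revision byte.
    if len(data) >= 10 and data[:3] == b"ID3":
        found.append(f"ID3v2.{data[3]}")
    # ID3v1 is a fixed 128-byte block at the very end, starting with "TAG".
    end = len(data)
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        found.append("ID3v1")
        end -= 128
    # An APEv2 footer is 32 bytes beginning with "APETAGEX". Assumed here to
    # sit at the end of the file, or just before the ID3v1 block if present.
    if end >= 32 and data[end - 32:end - 24] == b"APETAGEX":
        found.append("APEv2")
    return found


# Synthetic bytes standing in for a file that carries all three tag types.
sample = (b"ID3\x03\x00\x00" + bytes(4) + bytes(64)
          + b"APETAGEX" + bytes(24)
          + b"TAG" + bytes(125))
print(detect_tags(sample))  # ['ID3v2.3', 'ID3v1', 'APEv2']
```

For a real file, pass the result of `open(path, "rb").read()`. A file reporting both ID3v1 and APEv2 is exactly the case that displays correctly only in a player that reads APEv2.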

What these players have in common is that the libid3tag library they depend on reads tag content strictly according to the ID3 standards. Whichever tag standard is used, Unicode-encoded Chinese content is read without problems. GBK/GB18030-encoded Chinese content, however, is still read as ISO-8859-1, so it comes out garbled.

P.S. WMP on Vista cannot read ID3v2.4 or APEv2 tags, but it is smart about it: when it cannot read a tag, it falls back to the file name. Jingting can read the full range of tags but cannot write ID3v2.4; I do not know whether the upcoming 5.0 release changes this. Foobar2000 v0.9.4.3 reads the full range of tags and writes ID3v2.4 (UTF-8) by default.

3. Solution

Now that the cause of the garbled text is clear, we need a solution. One approach is to do what players on Windows do: decode according to the local encoding or some other transcoding mechanism, or let the user choose the preferred tag reading order. Among the players tested above, only Audacious supports reading with a custom encoding. The other approach is to convert the mp3 tags to Unicode. This is simple and standards-compliant, and it is the one I recommend. A player that can display the file path, as Banshee does, also works around the garbled text, but that does not address the root cause.
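The first approach, decoding by a local encoding, can be sketched as a fallback rule. This is a heuristic illustration under stated assumptions, not the logic of Audacious or any Windows player: `repair_tag_text` is a hypothetical helper, and GB18030 is assumed as the local encoding.

```python
def repair_tag_text(text: str, fallback: str = "gb18030") -> str:
    """Re-decode text that was read as ISO-8859-1 but was really `fallback`."""
    try:
        raw = text.encode("iso-8859-1")
    except UnicodeEncodeError:
        # Contains characters outside ISO-8859-1, so it was read as Unicode.
        return text
    if raw.isascii():
        return text  # plain ASCII reads the same in every encoding involved
    try:
        return raw.decode(fallback)
    except UnicodeDecodeError:
        return text  # not valid in the fallback encoding; leave it alone


garbled = "青花瓷".encode("gbk").decode("iso-8859-1")
print(repair_tag_text(garbled))    # 青花瓷
print(repair_tag_text("Hello"))    # Hello
```

A rule like this can misfire on genuine Western European text whose bytes happen to form valid GB18030 sequences, which is one reason converting the tags to Unicode once is the more robust fix.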

Currently, two tools are available to convert tags to Unicode encoding, and both support batch conversion.

1) The first is ID3iconv 0.2.1, written in Java by Zhou Feng. It was last updated on 2004/2/20.

Usage:
java -jar ~/id3iconv-0.2.1.jar -e gbk *.mp3

To convert all mp3 files in the current directory, including subdirectories:
find . -iname "*.mp3" -execdir java -jar ~/id3iconv-0.2.1.jar -e gbk {} \;

Note that the location of ~/id3iconv-0.2.1.jar above depends on your own setup.
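Before running a batch conversion over a whole music collection, it can help to preview which files would be touched. The following sketch builds the same per-file commands as the find call above without executing anything. `id3iconv_commands` is a hypothetical helper, and the default jar path is the one used in this article, so adjust it to your own setup.

```python
import os


def id3iconv_commands(root, jar="~/id3iconv-0.2.1.jar", encoding="gbk"):
    """Yield (directory, argv) pairs, one per mp3 file found under root."""
    jar = os.path.expanduser(jar)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Case-insensitive match, like find -iname "*.mp3".
            if name.lower().endswith(".mp3"):
                yield dirpath, ["java", "-jar", jar, "-e", encoding, name]
```

Printing each pair gives a dry run. To perform the conversion, each command could be passed to `subprocess.run(argv, cwd=directory, check=True)`, which runs it from the file's own directory much as -execdir does.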

I believe that most mp3 tags found in mainland China are encoded in GBK/GB18030, so -e gbk is sufficient. You can of course also use -e gb18030.

The -e gbk parameter means that GBK-encoded tags are converted to Unicode; tags that are already Unicode are not converted. To convert files in another encoding, change the parameter yourself, for example to Big5.

In my tests, ID3v2 tags are converted to version 2.3 with UTF-16 encoding.
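For reference, this is roughly what a UTF-16 text frame looks like in an ID3v2.3 tag. The sketch follows the ID3v2.3 frame layout (4-character frame ID, 4-byte big-endian size, 2 flag bytes, then an encoding byte and the text). It is not ID3iconv's code: the function names are hypothetical, and little-endian byte order is an arbitrary choice, since the byte order mark tells readers which order was used.

```python
import codecs
import struct


def v23_text_frame(frame_id: str, text: str) -> bytes:
    """Build one ID3v2.3 text frame whose payload is UTF-16 with a BOM."""
    # Payload: encoding byte 0x01 (UTF-16 with BOM), the BOM, then the text.
    body = b"\x01" + codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    # Header: frame ID, payload size as a big-endian integer, two flag bytes.
    header = frame_id.encode("ascii") + struct.pack(">I", len(body)) + b"\x00\x00"
    return header + body


def read_v23_text_frame(frame: bytes) -> str:
    """Decode the text of a frame produced by v23_text_frame."""
    size = struct.unpack(">I", frame[4:8])[0]
    body = frame[10:10 + size]
    if body[0] != 1:
        raise ValueError("this sketch only handles UTF-16 frames")
    return body[1:].decode("utf-16")  # the BOM selects the byte order


frame = v23_text_frame("TIT2", "青花瓷")   # TIT2 is the title frame
print(frame[:4], frame[10], read_v23_text_frame(frame))
```

Since the encoding byte declares UTF-16 explicitly, a standards-following reader such as libid3tag no longer has to guess, and the text displays correctly.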

2) The other is Mutagen, written in Python. The latest version is currently 1.11, and the Ubuntu 7.04 repositories carry Mutagen 1.10. You can install it with this command:
sudo apt-get install python-mutagen

P.S. This package is also required when installing Quod Libet and Listen.
