Analysis of the ID3v1 and ID3v2 Information Structures of MP3 Files



-- Wu Juntao 2005/05/05

E-mail: QQ: 29248671

Main Page: (with source code)

I am a programmer and wanted to write a VB.NET program that reads the information stored in MP3 files, but I did not know the file structure. I searched online for a long time (several months) without finding anything for VB. Later I came across a VC article with an analysis of the structure, which was a great help.

I. Analysis of "ID3v1" Information

The basic MP3 song information is contained in the last 128 bytes of the MP3 file. Its structure is as follows:

Public Structure ID3v1Info
    ' Fields are listed in the order in which they appear in the file.
    Dim Id3v1Tag As String  ' identifier, the three letters "TAG", 3 bytes
    Dim Title As String     ' song title, 30 bytes
    Dim Artist As String    ' artist, 30 bytes
    Dim Album As String     ' album, 30 bytes
    Dim Year As String      ' year, 4 bytes
    Dim Comments As String  ' comment, 28 bytes (sometimes 30 bytes)
    Dim Reserved As String  ' reserved, 1 byte (sometimes part of the comment)
    Dim Track As String     ' track number, 1 byte (sometimes part of the comment)
    Dim Genre As String     ' genre (music style), 1 byte
End Structure

The ID3v1 information is laid out as follows (Figure 1):

Figure 1: ID3v1 information of an MP3 file

Bytes 1-3: identifier ("TAG")
Bytes 4-33: song title (Take me to your heart)
Bytes 34-63: artist (Michael learns to rock)
Bytes 64-93: album (Take me to your heart)
Bytes 94-97: year (2004)
Bytes 98-125: comment
Byte 126: reserved. Here it is 0, which indicates that a track number is present in the next byte.
Byte 127: track number, i.e. the position of the song in the album (0C)
Byte 128: genre (66)

Winamp's ID3v1 song information dialog (Figure 1) shows the following fields:
Title (song name)
Artist (artist name)
Album (album name)
Year
Comment (remarks)
Genre (song style; stored as a number, an index into the genre list)
Track # (the position of the song in the album, what we usually call "song number such-and-such")

Title, Artist, Album, Year, and Comment can all be read from the 128 bytes as text. But where are Genre and Track? Attentive readers will have noticed that the fields above account for only the first 125 of the 128 bytes; these two fields are stored in the remaining bytes, 126-128. Byte 127 holds the Track and byte 128 holds the Genre. Neither is stored as characters: both are numbers, which matters when extracting them. For example, if byte 128 of a song contains 0x0D, that is 13, meaning genre number 13, Pop.

By now you have probably guessed that if bytes 127 and 128 are meaningful, byte 126 is too. Saying that the ID3v1 comment occupies 28 bytes is not entirely accurate, because ID3v1 allows a comment of up to 30 bytes. Does that mean the ID3v1 information can be 130 bytes long? No: ID3v1 is always exactly 128 bytes. What happens instead is that a comment longer than 28 bytes borrows bytes 126-127. So the comment is either 28 or 30 bytes, and byte 126 tells the two cases apart: if byte 126 is 0x00, the comment is 28 bytes; otherwise it is 30 bytes. Remember that byte 127 normally stores the Track. When the comment is 30 bytes, byte 127 is part of the comment, so the song has no Track information and byte 127 must not be interpreted as a track number. Pay special attention to this when writing a program that reads ID3v1.

To summarize: byte 126 tells us whether the comment is 28 or 30 bytes, byte 127 holds the Track (when the comment is 28 bytes), and byte 128 holds the Genre. With this, the structure can be pinned down precisely.
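The layout described above can be sketched in a few lines of Python. This is only an illustration, not the article's VB.NET source: the names `parse_id3v1` and `read_id3v1` are made up for this example, and decoding text as Latin-1 is an assumption (tags written on Chinese systems may need `encoding="gbk"` instead).

```python
import os
from typing import Optional


def parse_id3v1(tag: bytes, encoding: str = "latin-1") -> Optional[dict]:
    """Parse a 128-byte ID3v1 block; return None if it is not one."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Text fields are padded with 0x00 (some writers use spaces instead).
        return raw.split(b"\x00", 1)[0].decode(encoding, errors="replace").rstrip()

    info = {
        "title": text(tag[3:33]),    # bytes 4-33
        "artist": text(tag[33:63]),  # bytes 34-63
        "album": text(tag[63:93]),   # bytes 64-93
        "year": text(tag[93:97]),    # bytes 94-97
        "genre": tag[127],           # byte 128: a number, not a character
    }
    if tag[125] == 0x00:
        # Byte 126 is 0x00: 28-byte comment, byte 127 is the track number.
        info["comment"] = text(tag[97:125])
        info["track"] = tag[126]
    else:
        # Byte 126 is in use: 30-byte comment, no track number.
        info["comment"] = text(tag[97:127])
        info["track"] = None
    return info


def read_id3v1(path: str) -> Optional[dict]:
    """Read the last 128 bytes of a file and parse them as ID3v1."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None
        f.seek(-128, os.SEEK_END)
        return parse_id3v1(f.read(128))
```

Note that the byte numbers in the article are 1-based while Python slices are 0-based, so "byte 126" is `tag[125]`. A track value of 0 in the 28-byte case can be treated as "no track number set".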

A common pitfall: deciding whether an MP3 file contains ID3v1 information
In other words, how do we tell that "this MP3 file has no ID3v1 information"? The usual method is to read the last 128 bytes of the MP3 file and check whether the first 3 of those bytes are "TAG". Many people consider this check sufficient. However, the problem is not that simple.

Winamp and other MP3 players can both read and write MP3 information, but some software that writes ID3v1 silently appends the 128-byte block as soon as it opens a file. That is, merely opening an MP3 file with such software adds a 128-byte ID3v1 structure, starting with "TAG", to the end of the file (Figure 3). Clearly, checking the three bytes "TAG" alone cannot tell us whether the MP3 really has ID3v1 information; we also need to examine the 125 bytes after "TAG". The ID3v1 structure generated by such software usually consists entirely of 0x00 bytes or entirely of spaces. If that is the case, then even though the block contains the letters "TAG", it is not valid ID3v1 information, and the MP3 file should still be treated as having none. I think this is worth pointing out.
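A minimal Python sketch of this stricter check follows. The function name and the `ignore_genre` parameter are hypothetical. The parameter is an addition of this example rather than something the article describes: some writers fill the genre byte with 0xFF when no genre is chosen, so you may want to leave that byte out of the test.

```python
def has_meaningful_id3v1(tag: bytes, ignore_genre: bool = False) -> bool:
    """Return True only if the block starts with 'TAG' and is not pure filler."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return False
    # The 125 bytes after "TAG"; optionally leave out the final genre byte.
    body = tag[3:127] if ignore_genre else tag[3:]
    # A block that holds nothing but 0x00 bytes and spaces carries no information.
    return any(b not in (0x00, 0x20) for b in body)
```

For example, `has_meaningful_id3v1(b"TAG" + b"\x00" * 125)` returns `False`, although the block passes the simple three-byte test.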

II. ID3v2 Information Extraction
The previous section covered the ID3v1 information of MP3 files. That structure is easy to read, and writing it back to a file is not hard either. However, it is rigid and has very little room (only 128 bytes). As you may know, MP3 files can carry a second information structure that is highly extensible and has no fixed total length. This is the ID3v2 information (as opposed to ID3v1). Because ID3v1 already occupies the last 128 bytes of the file, ID3v2 could not be placed at the end, so it is stored at the beginning of the file.

Storing and reading ID3v2 information is much more complex than with ID3v1. The information no longer has a fixed length, and because it sits at the beginning of the file, rewriting it is far more troublesome than rewriting ID3v1.

I will try to explain how to read ID3v2 concisely and clearly. So far ID3v2 has four versions, but popular MP3 players support only the third, ID3v2.3, so that is the version we will read. ID3v2 consists of two parts: the header and the data body. The header occupies a fixed 10 bytes.

Each ID3v2 tag consists of a tag header followed by several tag frames, optionally with an extended header. Track information such as the title and the artist is stored in separate tag frames. The extended header is optional, but every tag must have a tag header and at least one tag frame. The tag header and the tag frames are stored in sequence at the start of the MP3 file.

Its structure is as follows:

(1) Tag header

Private Structure ID3v2Header
    Dim Header() As Byte  ' ID3v2 identifier, 3 bytes, must be the three letters "ID3"
    Dim Ver As Byte       ' version number; 3 for ID3v2.3
    Dim Revision As Byte  ' revision number; 0 for this version
    Dim Flag As Byte      ' flag byte; this version defines only three bits
    Dim Size() As Byte    ' tag size, 4 bytes, excluding the 10 bytes of the tag header
                          ' (some articles say it includes them; I verified that it does not, see the source code)
End Structure

These 10 bytes are used as follows:

1. Header(2): the identifier, normally "ID3"; if it is anything else, the file has no ID3v2 information.

2. Flag: the flag byte is usually 0. Its bits are laid out as abc00000:

a: whether unsynchronisation is used
b: whether an extended header is present. Usually it is not (Winamp does not write one), so this bit is normally clear.
c: whether the tag is experimental (99.9% of tags are not, so this bit is normally clear)

3. Size(3): the tag size. It takes four bytes, but each byte uses only 7 bits; the most significant bit is unused and always 0, so the format is:

0xxxxxxx 0xxxxxxx 0xxxxxxx 0xxxxxxx

To compute the tag size, drop the leading 0 of each byte and join the remaining bits into a 28-bit binary number. The formulas are:

① For VC: ID3size = (Size[0] & 0x7F) * 0x200000 + (Size[1] & 0x7F) * 0x4000 + (Size[2] & 0x7F) * 0x80 + (Size[3] & 0x7F);

② For VB: ID3size = Size(0) * (2 ^ 21) + Size(1) * (2 ^ 14) + Size(2) * (2 ^ 7) + Size(3) * (2 ^ 0)

In VB, I have declared a function ByteToLong in the class that does this conversion; you can simply call it.

By parsing this header, we can know whether an MP3 file has ID3v2 information. If so, we can know the total length of ID3v2 data body.
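As an illustration of the header layout and the 7-bits-per-byte size, here is a short Python sketch. The names `ID3v2Header`, `synchsafe_to_int`, and `parse_id3v2_header` are invented for this example and do not come from the article's source code.

```python
from typing import NamedTuple, Optional


class ID3v2Header(NamedTuple):
    version: int             # 3 for ID3v2.3
    revision: int            # 0 for ID3v2.3
    unsynchronisation: bool  # flag bit a
    extended_header: bool    # flag bit b
    experimental: bool       # flag bit c
    size: int                # tag size, excluding the 10-byte header


def synchsafe_to_int(size_bytes: bytes) -> int:
    """Join the low 7 bits of each byte into one number (28 bits for 4 bytes)."""
    value = 0
    for b in size_bytes:
        value = (value << 7) | (b & 0x7F)
    return value


def parse_id3v2_header(data: bytes) -> Optional[ID3v2Header]:
    """Parse the first 10 bytes of a file; return None if there is no ID3v2 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    version, revision, flags = data[3], data[4], data[5]
    return ID3v2Header(
        version=version,
        revision=revision,
        unsynchronisation=bool(flags & 0x80),  # a in abc00000
        extended_header=bool(flags & 0x40),    # b in abc00000
        experimental=bool(flags & 0x20),       # c in abc00000
        size=synchsafe_to_int(data[6:10]),
    )
```

As a worked example, the size bytes 00 00 02 01 give 2 * 128 + 1 = 257, not the 513 that an ordinary 8-bit reading would give. The first frame then starts at offset 10 in the file, and the data body ends at offset 10 + size.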

(2) Tag frame

Next we parse the tag frames of ID3v2. Don't worry: although it looks complicated, it is not as painful as you might think. The ID3v2 data body is divided into many frames that all share the same structure.

Each tag frame consists of a 10-byte frame header followed by content of variable length, at least one byte. Frames are stored in the file one after another, with no special separator between the tag header and the first frame or between frames. The only way to read a complete frame is to read its header and then read exactly as many bytes as its size field says. Take care with the size so that you do not read into the content or header of the next frame. A frame header is defined as follows:

Private Structure ID3v2Frame
    Dim FrameID As String  ' four characters identifying the frame and its content; common identifiers are listed below
    Dim Size() As Byte     ' 4 bytes: size of the frame content, excluding the frame header, never less than 1; computed with the formula given below
    Dim Flags() As Byte    ' 2 bytes of flags, of which only 6 bits are defined; explained in detail below
End Structure

1. FrameID, the frame identifier: four characters indicating what the frame contains. Common identifiers:

TEXT: Lyricist
TENC: Encoded by
WXXX: URL link
TCOP: Copyright
TOPE: Original artist
TCOM: Composer
TDAT: Date
TPE3: Conductor
TPE2: Band
TPE1: Artist (equivalent to ID3v1 Artist)
TPE4: Interpreted, remixed, or otherwise modified by
TYER: Year (equivalent to ID3v1 Year)
USLT: Lyrics
TALB: Album (equivalent to ID3v1 Album)
TIT1: Content group description
TIT2: Title (equivalent to ID3v1 Title)
TIT3: Subtitle
TCON: Genre (equivalent to ID3v1 Genre; its storage format is discussed below)
TBPM: Beats per minute
COMM: Comment (equivalent to ID3v1 Comment)
TDLY: Playlist delay
TRCK: Track number (equivalent to ID3v1 Track)
TFLT: File type
TIME: Time
TKEY: Initial key
TLAN: Language
TLEN: Length
TMED: Media type
TOAL: Original album
TOFN: Original file name
TOLY: Original lyricist
TORY: Original release year
TOWN: File owner (licensee)
TPOS: Part of a set
TPUB: Publisher
TRDA: Recording dates
TRSN: Internet radio station name
TRSO: Internet radio station owner
TSIZ: Size
TSRC: ISRC (International Standard Recording Code)
TSSE: Software/hardware and settings used for encoding
UFID: Unique file identifier
AENC: Audio encryption

Note the role of the FrameID. In ID3v1, each piece of information is identified by its fixed position and fixed number of bytes. ID3v2 makes the information "dynamic" for better extensibility: the length of each item is not preset but stored in the 4-byte Size field, so it is no longer fixed. Both designs are worth considering when we define file formats of our own. If the structure is small and holds little data, the ID3v1 approach will do. If the stored information is not fixed and needs to be extensible, the ID3v2 approach is the better choice. In fact, many file formats store their data in a way very similar to ID3v2.

2. Size(), the frame content size: unlike the size in the tag header, where each byte uses only its low 7 bits, this size uses all 8 bits of each byte. The format is:

xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx

The formulas are:

① For VC: FSize = Size[0] * 0x1000000 + Size[1] * 0x10000 + Size[2] * 0x100 + Size[3];

② For VB: FSize = Size(0) * (2 ^ 24) + Size(1) * (2 ^ 16) + Size(2) * (2 ^ 8) + Size(3) * (2 ^ 0)

In VB, I have declared a function ByteToLong in the class for this, which is very convenient; you can simply call it.
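To show how the frame header and the 8-bit size drive the reading loop, here is a hedged Python sketch that walks the frames of an ID3v2.3 data body. The names `frame_size` and `iter_id3v23_frames` are illustrative, and the sketch assumes the body contains no extended header and no unsynchronisation.

```python
from typing import Iterator, Tuple


def frame_size(size_bytes: bytes) -> int:
    """ID3v2.3 frame size: an ordinary big-endian number using all 8 bits."""
    return int.from_bytes(size_bytes, "big")


def iter_id3v23_frames(body: bytes) -> Iterator[Tuple[str, int, bytes]]:
    """Yield (frame_id, flags, content) for each frame in the data body."""
    pos = 0
    while pos + 10 <= len(body):
        header = body[pos:pos + 10]
        if header[0] == 0x00:
            break  # zero bytes after the last frame are padding
        frame_id = header[:4].decode("latin-1")
        size = frame_size(header[4:8])
        flags = int.from_bytes(header[8:10], "big")
        start = pos + 10
        end = start + size
        if end > len(body):
            break  # truncated or corrupt frame: stop rather than over-read
        yield frame_id, flags, body[start:end]
        pos = end  # the next frame header follows immediately
```

Here `body` is the data that follows the 10-byte tag header, with the length given by the tag size. The size bytes 00 00 01 00 mean 256 in a frame header, whereas the same bytes in the tag header would mean 128.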

3. Flags(), the frame flags: only 6 bits are defined and the other 10 bits are 0; in most cases all 16 bits are 0. The format is:

abc00000 ijk00000

a: tag alter preservation. When set, the frame should be discarded if the tag is altered and the frame is unknown to the software.

b: file alter preservation. When set, the frame should be discarded if the file (excluding the tag) is altered and the frame is unknown to the software.

c: read-only. When set, the frame is not meant to be modified (I have not seen it set yet).

i: compression. When set, the frame content is compressed with zlib, and 4 bytes giving the decompressed size are added to the frame header.

j: encryption (rarely used in practice).

k: grouping identity. When set, the frame belongs to a group with other frames.
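The six flag bits can be pulled out with simple masks. The following Python sketch assumes the ID3v2.3 layout abc00000 ijk00000, with the two flag bytes read as one big-endian 16-bit number; the function name and the dictionary keys are made up for this example.

```python
def decode_frame_flags(flag_bytes: bytes) -> dict:
    """Decode the two flag bytes of an ID3v2.3 frame header."""
    flags = int.from_bytes(flag_bytes[:2], "big")
    return {
        "tag_alter_preservation": bool(flags & 0x8000),   # a
        "file_alter_preservation": bool(flags & 0x4000),  # b
        "read_only": bool(flags & 0x2000),                # c
        "compression": bool(flags & 0x0080),              # i
        "encryption": bool(flags & 0x0040),               # j
        "grouping": bool(flags & 0x0020),                 # k
    }
```

Since most files have both bytes set to 0, a reader can often get away with skipping any frame whose compression or encryption bit is set instead of trying to decode it.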

For details, visit

4. Frame content (data body)

The frame header is followed by the frame content. Take the first 10 bytes of the data body as an example: the FrameID stored there is TIT2, which according to the table above means the frame holds the song title. Its Size is 00 00 00 17, which is 23 in decimal, so the song title is the 23 bytes after the frame header: "Take me to your heart". The FrameID of the next frame is TPE1, which means the artist, and its Size is 00 00 00 17, so its content is also 23 bytes: "Michael learns to rock". And so on. Keep in mind that a Chinese character occupies two bytes, so when writing data we have to count bytes, not characters. I have written a function, ByteSize, that can be used directly for this.

Note that the first four bytes (five in my tests) of the content of the ID3v2 comment frame (FrameID COMM) are not the comment text but the language of the comment; in this example we see "eng\0". The fifth byte is accounted for by the text-encoding byte that precedes the three-letter language code; the "\0" after the language code terminates a short content description, which is empty here. These bytes must be skipped when parsing. The ID3v2 genre (FrameID TCON) also needs care: MP3 players do not all write it the same way, and they do not necessarily write the genre name as plain text. For example, this song's genre is Classic Rock, but players may write it as "(1)", "1", or "(1)Classic Rock", so the parser has to cope with several formats. In addition, Winamp puts a '\0' in front of the frame content when saving and expects it when reading, and counts that byte in the frame content size; this leading byte is the text-encoding byte that ID3v2 defines for text frames (0x00 means ISO-8859-1). This is why the artist name "Michael learns to rock", which is 22 bytes long, occupies 23 bytes.
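The three special cases above (the leading encoding byte, the COMM prefix, and the varied TCON formats) might be handled along the following lines. This is a sketch under stated assumptions, not the article's implementation: the function names are invented, and only the two text encodings defined by ID3v2.3 are handled (0x00 for ISO-8859-1, 0x01 for UTF-16 with a byte order mark).

```python
import re
from typing import Optional, Tuple

_ENCODINGS = {0: "latin-1", 1: "utf-16"}


def decode_text_frame(content: bytes) -> str:
    """Decode a text frame (TIT2, TPE1, ...): one encoding byte, then the text."""
    if not content:
        return ""
    encoding = _ENCODINGS.get(content[0], "latin-1")
    return content[1:].decode(encoding, errors="replace").rstrip("\x00")


def _split_terminated(data: bytes, encoding_byte: int) -> Tuple[bytes, bytes]:
    """Split at the first string terminator (00 00 on a 2-byte boundary for UTF-16)."""
    if encoding_byte == 1:
        for i in range(0, len(data) - 1, 2):
            if data[i:i + 2] == b"\x00\x00":
                return data[:i], data[i + 2:]
        return data, b""
    head, _, tail = data.partition(b"\x00")
    return head, tail


def decode_comm_frame(content: bytes) -> Tuple[str, str, str]:
    """Return (language, description, text) from the content of a COMM frame."""
    encoding = _ENCODINGS.get(content[0], "latin-1")
    language = content[1:4].decode("latin-1")
    raw_desc, raw_text = _split_terminated(content[4:], content[0])
    description = raw_desc.decode(encoding, errors="replace")
    text = raw_text.decode(encoding, errors="replace").rstrip("\x00")
    return language, description, text


def parse_genre(value: str) -> Tuple[Optional[int], Optional[str]]:
    """Normalize 'Classic Rock', '1', '(1)' or '(1)Classic Rock' to (index, name)."""
    match = re.fullmatch(r"\(([0-9]+)\)(.*)", value)
    if match:
        return int(match.group(1)), match.group(2).strip() or None
    if re.fullmatch(r"[0-9]+", value):
        return int(value), None
    return None, value or None
```

With these helpers, `decode_comm_frame(b"\x00eng\x00Nice song")` returns `("eng", "", "Nice song")`: the five skipped bytes are the encoding byte, the language code, and the terminator of the empty description. Likewise, `parse_genre("(1)Classic Rock")` returns `(1, "Classic Rock")`.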

The source code has been released and can be downloaded from the home page of this site, which also has examples; interested readers are welcome to take a look. If you have any questions, send me an e-mail and we can discuss them together.

[Reproduced] by: Wu Juntao time: 2005/05/05
