5. MPEG audio tags
MPEG audio tags come in two kinds: ID3v1, a fixed 128-byte block stored at the end of the file, and ID3v2, a successor to ID3v1 that is stored at the start of the file and has a variable length.
1. ID3v1
An ID3v1 tag describes the MPEG audio file: artist, title, album, year of release, and genre, plus a short free-form comment. It sits at the end of the audio file and is always exactly 128 bytes long, so you can obtain the tag by reading the last 128 bytes of the file.
The structure is as follows:
AAABBBBB BBBBBBBB BBBBBBBB BBBBBBBB
BCCCCCCC CCCCCCCC CCCCCCCC CCCCCCCD
DDDDDDDD DDDDDDDD DDDDDDDD DDDDDEEE
EFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFG
Symbol | Length (bytes) | Position (bytes) | Description
A | 3 | 0-2 | Tag identifier. If a tag is present and valid, this field contains the string 'TAG'.
B | 30 | 3-32 | Title
C | 30 | 33-62 | Artist
D | 30 | 63-92 | Album
E | 4 | 93-96 | Year
F | 30 | 97-126 | Comment
G | 1 | 127 | Genre
The specification requires unused space in each field to be padded with null characters (ASCII 0). Not all applications follow this rule; Winamp, for example, pads with spaces (ASCII 32) instead.
ID3v1.1 makes one change to this structure: the last byte of the comment field holds the track number within the album, and the byte before it is set to 0 so that the two layouts can be told apart. If the track number is unknown, a null character (ASCII 0) is stored instead.
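The layout above can be decoded with a few fixed offsets. The following is a minimal sketch, not a reference implementation: the `Id3v1Tag` struct and the function names are hypothetical, and the ID3v1.1 check (byte 125 is zero and byte 126 is non-zero) is the usual convention rather than something stated in this article. Padding with either NUL or space is stripped, since both occur in practice.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the decoded fields (names are illustrative). */
typedef struct {
    char title[31];
    char artist[31];
    char album[31];
    char year[5];
    char comment[31];
    int track;            /* 0 when absent or unknown */
    unsigned char genre;  /* raw genre code */
} Id3v1Tag;

/* Copy a fixed-width field, then strip trailing NUL and space padding. */
static void copy_field(char *dst, const unsigned char *src, size_t len)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
    while (len > 0 && (dst[len - 1] == '\0' || dst[len - 1] == ' '))
        dst[--len] = '\0';
}

/* Decode a 128-byte ID3v1 block. Returns 1 on success, 0 if "TAG" is missing. */
int id3v1_parse(const unsigned char buf[128], Id3v1Tag *tag)
{
    if (memcmp(buf, "TAG", 3) != 0)
        return 0;
    copy_field(tag->title, buf + 3, 30);
    copy_field(tag->artist, buf + 33, 30);
    copy_field(tag->album, buf + 63, 30);
    copy_field(tag->year, buf + 93, 4);
    tag->track = 0;
    if (buf[125] == 0 && buf[126] != 0) {   /* assumed ID3v1.1 convention */
        tag->track = buf[126];
        copy_field(tag->comment, buf + 97, 28);
    } else {
        copy_field(tag->comment, buf + 97, 30);
    }
    tag->genre = buf[127];
    return 1;
}

/* Read the last 128 bytes of a file and decode them. */
int id3v1_read(const char *path, Id3v1Tag *tag)
{
    unsigned char buf[128];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    int ok = fseek(f, -128L, SEEK_END) == 0 && fread(buf, 1, 128, f) == 128;
    fclose(f);
    return ok && id3v1_parse(buf, tag);
}
```

Keeping `id3v1_parse` separate from the file access makes it possible to exercise the decoding on an in-memory buffer.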
The genre is stored as a one-byte numeric code, which is one of the following:
0 | 'Blues' | 20 | 'Alternative' | 40 | 'AlternRock' | 60 | 'Top 40'
1 | 'Classic Rock' | 21 | 'Ska' | 41 | 'Bass' | 61 | 'Christian Rap'
2 | 'Country' | 22 | 'Death Metal' | 42 | 'Soul' | 62 | 'Pop/Funk'
3 | 'Dance' | 23 | 'Pranks' | 43 | 'Punk' | 63 | 'Jungle'
4 | 'Disco' | 24 | 'Soundtrack' | 44 | 'Space' | 64 | 'Native American'
5 | 'Funk' | 25 | 'Euro-Techno' | 45 | 'Meditative' | 65 | 'Cabaret'
6 | 'Grunge' | 26 | 'Ambient' | 46 | 'Instrumental Pop' | 66 | 'New Wave'
7 | 'Hip-Hop' | 27 | 'Trip-Hop' | 47 | 'Instrumental Rock' | 67 | 'Psychadelic'
8 | 'Jazz' | 28 | 'Vocal' | 48 | 'Ethnic' | 68 | 'Rave'
9 | 'Metal' | 29 | 'Jazz+Funk' | 49 | 'Gothic' | 69 | 'Showtunes'
10 | 'New Age' | 30 | 'Fusion' | 50 | 'Darkwave' | 70 | 'Trailer'
11 | 'Oldies' | 31 | 'Trance' | 51 | 'Techno-Industrial' | 71 | 'Lo-Fi'
12 | 'Other' | 32 | 'Classical' | 52 | 'Electronic' | 72 | 'Tribal'
13 | 'Pop' | 33 | 'Instrumental' | 53 | 'Pop-Folk' | 73 | 'Acid Punk'
14 | 'R&B' | 34 | 'Acid' | 54 | 'Eurodance' | 74 | 'Acid Jazz'
15 | 'Rap' | 35 | 'House' | 55 | 'Dream' | 75 | 'Polka'
16 | 'Reggae' | 36 | 'Game' | 56 | 'Southern Rock' | 76 | 'Retro'
17 | 'Rock' | 37 | 'Sound Clip' | 57 | 'Comedy' | 77 | 'Musical'
18 | 'Techno' | 38 | 'Gospel' | 58 | 'Cult' | 78 | 'Rock & Roll'
19 | 'Industrial' | 39 | 'Noise' | 59 | 'Gangsta' | 79 | 'Hard Rock'
Winamp extends this table with the following codes:
80 | 'Folk' | 92 | 'Progressive Rock' | 104 | 'Chamber Music' | 116 | 'Ballad'
81 | 'Folk-Rock' | 93 | 'Psychedelic Rock' | 105 | 'Sonata' | 117 | 'Power Ballad'
82 | 'National Folk' | 94 | 'Symphonic Rock' | 106 | 'Symphony' | 118 | 'Rhythmic Soul'
83 | 'Swing' | 95 | 'Slow Rock' | 107 | 'Booty Bass' | 119 | 'Freestyle'
84 | 'Fast Fusion' | 96 | 'Big Band' | 108 | 'Primus' | 120 | 'Duet'
85 | 'Bebob' | 97 | 'Chorus' | 109 | 'Porn Groove' | 121 | 'Punk Rock'
86 | 'Latin' | 98 | 'Easy Listening' | 110 | 'Satire' | 122 | 'Drum Solo'
87 | 'Revival' | 99 | 'Acoustic' | 111 | 'Slow Jam' | 123 | 'A Capella'
88 | 'Celtic' | 100 | 'Humour' | 112 | 'Club' | 124 | 'Euro-House'
89 | 'Bluegrass' | 101 | 'Speech' | 113 | 'Tango' | 125 | 'Dance Hall'
90 | 'Avantgarde' | 102 | 'Chanson' | 114 | 'Samba' | |
91 | 'Gothic Rock' | 103 | 'Opera' | 115 | 'Folklore' | |
Other extensions
126 | 'Goa' | 132 | 'BritPop' | 138 | 'Black Metal' | 144 | 'Thrash Metal'
127 | 'Drum & Bass' | 133 | 'Negerpunk' | 139 | 'Crossover' | 145 | 'Anime'
128 | 'Club-House' | 134 | 'Polsk Punk' | 140 | 'Contemporary Christian' | 146 | 'JPop'
129 | 'Hardcore' | 135 | 'Beat' | 141 | 'Christian Rock' | 147 | 'Synthpop'
130 | 'Terror' | 136 | 'Christian Gangsta Rap' | 142 | 'Merengue' | |
131 | 'Indie' | 137 | 'Heavy Metal' | 143 | 'Salsa' | |
Any other value is treated as "unknown".
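In code, the genre tables reduce to a single array indexed by the genre byte. The sketch below is one possible arrangement; the array and function names are hypothetical, and the spellings follow the commonly published genre list, so they may differ slightly from what a particular player displays.

```c
#include <stddef.h>

/* Genre names indexed by code: 0-79 standard, 80-125 Winamp, 126-147 others. */
static const char *const GENRES[] = {
    "Blues", "Classic Rock", "Country", "Dance",                   /* 0 */
    "Disco", "Funk", "Grunge", "Hip-Hop",                          /* 4 */
    "Jazz", "Metal", "New Age", "Oldies",                          /* 8 */
    "Other", "Pop", "R&B", "Rap",                                  /* 12 */
    "Reggae", "Rock", "Techno", "Industrial",                      /* 16 */
    "Alternative", "Ska", "Death Metal", "Pranks",                 /* 20 */
    "Soundtrack", "Euro-Techno", "Ambient", "Trip-Hop",            /* 24 */
    "Vocal", "Jazz+Funk", "Fusion", "Trance",                      /* 28 */
    "Classical", "Instrumental", "Acid", "House",                  /* 32 */
    "Game", "Sound Clip", "Gospel", "Noise",                       /* 36 */
    "AlternRock", "Bass", "Soul", "Punk",                          /* 40 */
    "Space", "Meditative", "Instrumental Pop", "Instrumental Rock",/* 44 */
    "Ethnic", "Gothic", "Darkwave", "Techno-Industrial",           /* 48 */
    "Electronic", "Pop-Folk", "Eurodance", "Dream",                /* 52 */
    "Southern Rock", "Comedy", "Cult", "Gangsta",                  /* 56 */
    "Top 40", "Christian Rap", "Pop/Funk", "Jungle",               /* 60 */
    "Native American", "Cabaret", "New Wave", "Psychadelic",       /* 64 */
    "Rave", "Showtunes", "Trailer", "Lo-Fi",                       /* 68 */
    "Tribal", "Acid Punk", "Acid Jazz", "Polka",                   /* 72 */
    "Retro", "Musical", "Rock & Roll", "Hard Rock",                /* 76 */
    "Folk", "Folk-Rock", "National Folk", "Swing",                 /* 80 */
    "Fast Fusion", "Bebob", "Latin", "Revival",                    /* 84 */
    "Celtic", "Bluegrass", "Avantgarde", "Gothic Rock",            /* 88 */
    "Progressive Rock", "Psychedelic Rock", "Symphonic Rock", "Slow Rock", /* 92 */
    "Big Band", "Chorus", "Easy Listening", "Acoustic",            /* 96 */
    "Humour", "Speech", "Chanson", "Opera",                        /* 100 */
    "Chamber Music", "Sonata", "Symphony", "Booty Bass",           /* 104 */
    "Primus", "Porn Groove", "Satire", "Slow Jam",                 /* 108 */
    "Club", "Tango", "Samba", "Folklore",                          /* 112 */
    "Ballad", "Power Ballad", "Rhythmic Soul", "Freestyle",        /* 116 */
    "Duet", "Punk Rock", "Drum Solo", "A Capella",                 /* 120 */
    "Euro-House", "Dance Hall", "Goa", "Drum & Bass",              /* 124 */
    "Club-House", "Hardcore", "Terror", "Indie",                   /* 128 */
    "BritPop", "Negerpunk", "Polsk Punk", "Beat",                  /* 132 */
    "Christian Gangsta Rap", "Heavy Metal", "Black Metal", "Crossover", /* 136 */
    "Contemporary Christian", "Christian Rock", "Merengue", "Salsa",    /* 140 */
    "Thrash Metal", "Anime", "JPop", "Synthpop"                    /* 144 */
};

/* Map a genre byte to its name; codes outside the table are "Unknown". */
const char *id3v1_genre_name(unsigned char code)
{
    size_t count = sizeof GENRES / sizeof GENRES[0];
    return code < count ? GENRES[code] : "Unknown";
}
```

Deriving the bound from the array size means a reader that only trusts the standard 80 genres can simply truncate the table.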
2. ID3v2
ID3v2 has gone through four versions so far, but popular playback software generally supports only the third, ID3v2.3. Because ID3v1 is stored at the end of the MP3 file, ID3v2 had to be stored at the start (if an ID3v3 is released one day, I really don't know where it would go). For this reason, operating on ID3v2 is slower than on ID3v1. The ID3v2 structure is also much more complex than ID3v1, but it is more comprehensive and more extensible.
The following describes ID3v2.3.
Each ID3v2.3 tag consists of a tag header, an optional extended header, and a number of tag frames. Information about the track, such as title and artist, is stored in separate frames. The extended header is not required, but every tag must contain at least one frame. The tag header and the frames are stored sequentially at the start of the MP3 file.
(1) Tag header
The 10-byte ID3v2.3 tag header is stored at the very start of the file. Its data structure is as follows:
char Header[3]; /* must be "ID3"; otherwise no tag is present */
char Ver; /* major version number; ID3v2.3 stores 3 */
char Revision; /* revision number; stored as 0 for this version */
char Flag; /* flags byte; this version defines only three bits, detailed below */
char Size[4]; /* tag size: everything after this 10-byte header (extended header, frames, and padding); the header itself is not counted */
(1). Flags byte
The flags byte is usually 0 and is defined as follows:
abc00000
a -- unsynchronisation flag: set when the tag data has been altered so that it cannot be mistaken for an MPEG frame sync pattern (generally not set)
b -- extended header flag: set when an extended header is present. Usually there is none (Winamp, at least, does not write one), so it is generally not set.
c -- experimental flag: set when the tag is experimental (99.99% of tags are not, so it is generally not set)
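The three flag bits can be tested with simple masks. This is a small sketch; the constant and function names are hypothetical, and rejecting a byte with any of the five low bits set is a choice made here for illustration.

```c
#include <stdio.h>

/* Bit masks for the tag header flags byte (layout abc00000). */
enum {
    ID3V2_FLAG_A = 0x80,  /* a: unsynchronisation */
    ID3V2_FLAG_B = 0x40,  /* b: extended header present */
    ID3V2_FLAG_C = 0x20   /* c: third defined bit */
};

/* Returns 1 when none of the five undefined low bits is set. */
int id3v2_tag_flags_valid(unsigned char flags)
{
    return (flags & 0x1F) == 0;
}

/* Print which of the defined bits are set. */
void id3v2_print_tag_flags(unsigned char flags)
{
    printf("a=%d b=%d c=%d\n",
           (flags & ID3V2_FLAG_A) != 0,
           (flags & ID3V2_FLAG_B) != 0,
           (flags & ID3V2_FLAG_C) != 0);
}
```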
(2). Tag size
The size occupies four bytes, but only the low 7 bits of each byte are used; the most significant bit is always 0. The format is as follows:
0xxxxxxx 0xxxxxxx 0xxxxxxx 0xxxxxxx
To compute the tag size, drop the leading 0 bit of each byte and concatenate the remaining bits into a 28-bit binary number. The formula is as follows:
int total_size;
total_size = (Size[0] & 0x7F) * 0x200000
+ (Size[1] & 0x7F) * 0x4000
+ (Size[2] & 0x7F) * 0x80
+ (Size[3] & 0x7F);
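The same computation can be written with shifts and combined with a check of the 10-byte header. This is a sketch under stated assumptions: the `Id3v2Header` struct and both function names are hypothetical, and rejecting size bytes whose top bit is set is a defensive choice based on the rule that the bit is always 0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form of the 10-byte tag header. */
typedef struct {
    unsigned char version;   /* 3 for ID3v2.3 */
    unsigned char revision;
    unsigned char flags;
    uint32_t size;           /* decoded 28-bit size field */
} Id3v2Header;

/* Combine four 7-bit groups into one 28-bit number. */
uint32_t synchsafe_decode(const unsigned char s[4])
{
    return ((uint32_t)(s[0] & 0x7F) << 21) |
           ((uint32_t)(s[1] & 0x7F) << 14) |
           ((uint32_t)(s[2] & 0x7F) << 7)  |
            (uint32_t)(s[3] & 0x7F);
}

/* Decode the header. Returns 1 on success, 0 if it is not a tag header. */
int id3v2_parse_header(const unsigned char buf[10], Id3v2Header *h)
{
    if (memcmp(buf, "ID3", 3) != 0)
        return 0;
    for (int i = 6; i < 10; i++)
        if (buf[i] & 0x80)       /* top bit must be 0 in every size byte */
            return 0;
    h->version = buf[3];
    h->revision = buf[4];
    h->flags = buf[5];
    h->size = synchsafe_decode(buf + 6);
    return 1;
}
```

For example, the size bytes 00 00 02 01 decode to 2 * 128 + 1 = 257.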
(2) Tag frames
Each tag frame consists of a 10-byte frame header followed by at least one byte of variable-length content. Frames are stored sequentially in the file, and no special character separates the tag header from the frames or one frame from the next. The extent of a frame is known only from its header: the content starts right after the frame header and runs for the number of bytes given in the size field. You can therefore read the content only after you have read the size, and you must respect that size when reading so that you do not run into the header or content of another frame.
The frame header is defined as follows:
char FrameID[4]; /* four characters identifying the frame and its content; a table of common IDs is given below */
unsigned char Size[4]; /* size of the frame content, excluding the frame header; must be at least 1 */
unsigned char Flags[2]; /* flags; only 6 bits are defined, detailed below */
(1). Frame ID
A frame is identified by four characters that indicate its content. The most common IDs are:
TIT2 = title: the title of the song (the entries below follow the same pattern)
TPE1 = artist
TALB = album
TRCK = track, in the format N/M, where N is the position of the track in the album and M is the total number of tracks in the album; N and M are ASCII digits
TYER = year, as ASCII digits
TCON = genre (content type), given directly as a string
COMM = comment, in the format "eng\0comment", where eng is the natural language used for the comment
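As an illustration of the TRCK format, the N/M text can be split with `sscanf`. This is a minimal sketch; the function name is hypothetical, and accepting a bare "N" without the "/M" part is an assumption made here for robustness.

```c
#include <stdio.h>

/* Parse "N" or "N/M". Returns how many numbers were read (0, 1, or 2). */
int parse_trck(const char *text, int *track, int *total)
{
    *track = 0;
    *total = 0;
    int n = sscanf(text, "%d/%d", track, total);
    return n < 0 ? 0 : n;   /* sscanf returns EOF on empty input */
}
```

For the text "3/12" this yields track 3 of 12; for "7" only the track number is filled in and the total stays 0.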
(2). Size
Unlike the size in the tag header, the frame size needs no special handling: all 8 bits of each byte are used. The format is as follows:
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
The algorithm is as follows:
unsigned int FSize;
FSize = (unsigned char)Size[0] * 0x1000000u
+ (unsigned char)Size[1] * 0x10000u
+ (unsigned char)Size[2] * 0x100u
+ (unsigned char)Size[3];
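Putting the frame header and the size calculation together, frames can be walked one after another through the tag body. The following is a sketch, not a complete parser: the `Id3v2Frame` struct and the function name are hypothetical, `data` is assumed to point just past the 10-byte tag header (and past any extended header), and a zero byte where a frame ID should start is treated as padding that ends the list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one frame inside the tag body. */
typedef struct {
    char id[5];                    /* four-character ID plus NUL */
    uint32_t size;                 /* content size, header excluded */
    unsigned char flags[2];
    const unsigned char *content;  /* points into the caller's buffer */
} Id3v2Frame;

/*
 * Read the frame that starts at data[pos].
 * Returns the offset of the next frame, or 0 when no valid frame starts
 * at pos (end of data, padding, or a size that does not fit).
 */
size_t id3v2_read_frame(const unsigned char *data, size_t len,
                        size_t pos, Id3v2Frame *f)
{
    if (pos > len || len - pos < 10)
        return 0;
    const unsigned char *h = data + pos;
    if (h[0] == 0)                 /* assumed: zero padding after the frames */
        return 0;
    memcpy(f->id, h, 4);
    f->id[4] = '\0';
    f->size = ((uint32_t)h[4] << 24) | ((uint32_t)h[5] << 16) |
              ((uint32_t)h[6] << 8)  |  (uint32_t)h[7];
    f->flags[0] = h[8];
    f->flags[1] = h[9];
    if (f->size == 0 || f->size > len - pos - 10)
        return 0;                  /* never read past the tag body */
    f->content = h + 10;
    return pos + 10 + f->size;
}
```

A caller would start at offset 0 and keep feeding the returned offset back in until the function returns 0.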
(3). Flags
Only 6 bits are defined; the other 10 bits must be 0. In most cases all 16 bits are 0. The format is as follows:
abc00000 ijk00000
a -- tag alter preservation flag. When set, the frame should be discarded if the tag is altered and the frame is unknown to the software.
b -- file alter preservation flag. When set, the frame should be discarded if the file (other than the tag) is altered and the frame is unknown to the software.
c -- read-only flag. When set, the frame should not be modified (but I have not found any software that pays attention to this flag)
i -- compression flag. When set, the frame content is compressed with zlib, and 4 bytes holding the decompressed size are added after the frame header.
j -- encryption flag (I have never seen an MP3 file whose tag was encrypted)
k -- grouping flag. When set, this frame belongs to a group with other frames.
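The two flag bytes can be combined into one 16-bit value and tested with masks that follow the abc00000 ijk00000 layout. This is a small sketch; the constant and function names are hypothetical.

```c
#include <stdint.h>

/* Masks for the 16-bit frame flags (first byte in the high 8 bits). */
enum {
    FRAME_FLAG_A = 0x8000,     /* a */
    FRAME_FLAG_B = 0x4000,     /* b */
    FRAME_FLAG_C = 0x2000,     /* c: read-only */
    FRAME_FLAG_I = 0x0080,     /* i: compression */
    FRAME_FLAG_J = 0x0040,     /* j: encryption */
    FRAME_FLAG_K = 0x0020,     /* k: grouping */
    FRAME_FLAG_DEFINED = 0xE0E0
};

/* Combine the two flag bytes of a frame header into one value. */
uint16_t frame_flags(const unsigned char f[2])
{
    return (uint16_t)((f[0] << 8) | f[1]);
}

/* Returns 1 when none of the 10 undefined bits is set. */
int frame_flags_valid(uint16_t flags)
{
    return (flags & ~FRAME_FLAG_DEFINED) == 0;
}
```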
It is worth mentioning that when Winamp saves and reads frame content, it puts a '\0' byte in front of the content and counts this byte in the frame content size. In ID3v2.3 text frames this leading byte is the text encoding indicator, and 0 means ISO-8859-1.
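When extracting the text of a frame such as TIT2, that leading byte has to be skipped. The sketch below treats it as the text encoding indicator of ID3v2.3 text frames (an assumption taken from the ID3v2.3 specification, not from the description above) and handles only the value 0; the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the text of a frame whose first content byte is 0 into out.
 * Returns the length of the copied string, or -1 when the content is
 * empty, the output buffer has no room, or the leading byte is not 0.
 */
int id3v2_text_frame_latin1(const unsigned char *content, size_t size,
                            char *out, size_t out_len)
{
    if (size < 1 || out_len == 0 || content[0] != 0)
        return -1;
    size_t n = size - 1;           /* the leading byte is not text */
    if (n > out_len - 1)
        n = out_len - 1;           /* truncate to fit the buffer */
    memcpy(out, content + 1, n);
    out[n] = '\0';
    return (int)strlen(out);
}
```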
Appendix: Descriptions of the frame IDs
(4). Declared ID3v2 frames
The following frames are declared in this draft.
AENC Audio encryption
APIC Attached picture
COMM Comments
COMR Commercial frame
ENCR Encryption method registration
EQUA Equalization
ETCO Event timing codes
GEOB General encapsulated object
GRID Group identification registration
IPLS Involved people list
LINK Linked information
MCDI Music CD identifier
MLLT MPEG location lookup table
OWNE Ownership frame
PRIV Private frame
PCNT Play counter
POPM Popularimeter
POSS Position synchronisation frame
RBUF Recommended buffer size
RVAD Relative volume adjustment
RVRB Reverb
SYLT Synchronized lyric/text
SYTC Synchronized tempo codes
TALB Album/Movie/Show title
TBPM BPM (beats per minute)
TCOM Composer
TCON Content type
TCOP Copyright message
TDAT Date
TDLY Playlist delay
TENC Encoded by
TEXT Lyricist/Text writer
TFLT File type
TIME Time
TIT1 Content group description
TIT2 Title/songname/content description
TIT3 Subtitle/Description refinement
TKEY Initial key
TLAN Language(s)
TLEN Length
TMED Media type
TOAL Original album/movie/show title
TOFN Original filename
TOLY Original lyricist(s)/text writer(s)
TOPE Original artist(s)/performer(s)
TORY Original release year
TOWN File owner/licensee
TPE1 Lead performer(s)/Soloist(s)
TPE2 Band/orchestra/accompaniment
TPE3 Conductor/performer refinement
TPE4 Interpreted, remixed, or otherwise modified by
TPOS Part of a set
TPUB Publisher
TRCK Track number/Position in set
TRDA Recording dates
TRSN Internet radio station name
TRSO Internet radio station owner
TSIZ Size
TSRC ISRC (international standard recording code)
TSSE Software/Hardware and settings used for encoding
TYER Year
TXXX User defined text information frame
UFID Unique file identifier
USER Terms of use
USLT Unsynchronized lyric/text transcription
WCOM Commercial information
WCOP Copyright/Legal information
WOAF Official audio file webpage
WOAR Official artist/performer webpage
WOAS Official audio source webpage
WORS Official internet radio station homepage
WPAY Payment
WPUB Publishers official webpage
WXXX User defined URL link frame
Most of the text above comes from the Internet, with some of my own understanding added. If you find any mistakes, please point them out.
URLs of some reference articles:
http://mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm
http://www.codeproject.com/audio/MPEGAudioInfo.asp
http://le-hacker.org/hacks/mpeg-drafts/11172-3.pdf (ISO/IEC 11172-3. I expect many people are looking for this one, but note that the frame sync word defined here is 12 bits, because this is the older standard.)
http://webstore.iec.ch/preview/info_isoiec13818-3%7Bed2.0%7Den.pdf (ISO/IEC 13818-3. The website seems to charge for it, but this file can be downloaded directly; hopefully nobody will come after me for it.)