Human perception of sound (frequency, pitch, and so on) is non-linear, so a number of scales have been devised to measure sound as it is perceived (twelve-tone equal temperament is one example). This post focuses on the Bark scale and the Mel scale. At first I did not understand the difference between the two myself. I gradually came to understand the thinking behind each, and I share it briefly here.
The Bark frequency scale maps frequency in Hz onto the 24 critical bands of psychoacoustics; a 25th critical band covers roughly 16 kHz to 20 kHz. The width of one critical band equals one Bark. Put simply, the Bark scale converts physical frequency into psychoacoustic frequency. The center frequency and the lower and upper boundary frequencies of each critical band are shown in the following table:
| Bark band | Center frequency (Hz) | Lower bound frequency (Hz) | Upper bound frequency (Hz) |
|---|---|---|---|
| 1 | 50 | 0 | 100 |
| 2 | 150 | 100 | 200 |
| 3 | 250 | 200 | 300 |
| 4 | 350 | 300 | 400 |
| 5 | 450 | 400 | 510 |
| 6 | 570 | 510 | 630 |
| 7 | 700 | 630 | 770 |
| 8 | 840 | 770 | 920 |
| 9 | 1000 | 920 | 1080 |
| 10 | 1170 | 1080 | 1270 |
| 11 | 1370 | 1270 | 1480 |
| 12 | 1600 | 1480 | 1720 |
| 13 | 1850 | 1720 | 2000 |
| 14 | 2150 | 2000 | 2320 |
| 15 | 2500 | 2320 | 2700 |
| 16 | 2900 | 2700 | 3150 |
| 17 | 3400 | 3150 | 3700 |
| 18 | 4000 | 3700 | 4400 |
| 19 | 4800 | 4400 | 5300 |
| 20 | 5800 | 5300 | 6400 |
| 21 | 7000 | 6400 | 7700 |
| 22 | 8500 | 7700 | 9500 |
| 23 | 10500 | 9500 | 12000 |
| 24 | 13500 | 12000 | 15500 |
| 25 | 18775 | 15500 | 22050 |
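To make the table easier to use, here is a minimal Python sketch that looks up which critical band a frequency falls in. The upper band edges are copied from the table; the function name `bark_band` and the rule that a frequency sitting exactly on an edge belongs to the higher band are my own assumptions, not something the table specifies.

```python
from bisect import bisect_right

# Upper bound frequency (Hz) of bands 1..25, copied from the table.
UPPER_EDGES_HZ = [
    100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
    1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400,
    7700, 9500, 12000, 15500, 22050,
]


def bark_band(freq_hz):
    """Return the 1-based critical band number that contains freq_hz.

    Assumption: a frequency exactly on a band edge is assigned to the
    higher band (for example, 100 Hz is placed in band 2).
    """
    if not 0 <= freq_hz < UPPER_EDGES_HZ[-1]:
        raise ValueError("frequency must be in the range [0, 22050) Hz")
    # Count how many upper edges are <= freq_hz; the next band holds it.
    return bisect_right(UPPER_EDGES_HZ, freq_hz) + 1


if __name__ == "__main__":
    for f in (50, 440, 1000, 8000):
        print(f, "Hz -> band", bark_band(f))
```

A binary search over the edge list keeps the lookup short and avoids a long chain of comparisons.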
There are a number of formulas that try to model the table above; the most widely used one is (Zwicker & Terhardt, 1980):
\[b = 13\tan^{-1}\left(\frac{0.76f}{1000}\right) + 3.5\tan^{-1}\left(\left(\frac{f}{7500}\right)^{2}\right)\]
Here f is the frequency in Hz, taken at the center frequency of each band. When I evaluated the formula in MATLAB, the results for the first 5 Bark bands deviated noticeably from the table; I do not know the reason.
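For readers who want to repeat the comparison, here is a small Python sketch of the Zwicker and Terhardt formula (the original check was done in MATLAB; Python is used here only for illustration). The function name `hz_to_bark` is my own. The comparison rests on one assumption: that band n spans Bark values n-1 to n, so the center frequency would be expected near n-0.5 and the upper edge near n.

```python
import math


def hz_to_bark(freq_hz):
    """Zwicker & Terhardt (1980) approximation; freq_hz is in Hz."""
    return (13.0 * math.atan(0.76 * freq_hz / 1000.0)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


# (band number, center frequency, upper bound frequency) for the
# first five rows of the table, all in Hz.
FIRST_BANDS = [
    (1, 50, 100),
    (2, 150, 200),
    (3, 250, 300),
    (4, 350, 400),
    (5, 450, 510),
]

if __name__ == "__main__":
    for band, center_hz, upper_hz in FIRST_BANDS:
        # Assumed reading: compare b(center) with band - 0.5 and
        # b(upper edge) with band.
        print(f"band {band}: "
              f"b(center) = {hz_to_bark(center_hz):.3f}, "
              f"b(upper edge) = {hz_to_bark(upper_hz):.3f}")
```

Printing both values per band makes it easy to see how far the approximation drifts from the tabulated band edges at low frequencies.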
The Mel frequency scale is also a perceptual frequency mapping: it describes the non-linear relationship between physical frequency and perceived pitch, and is expressed as follows:
\[m = 1127.01048{\log _e}\left ({1 + \frac{f}{{700}}} \right) \]
One thing to note is that the frequency f here is in Hz, and that 1 kHz is the reference point at which the Mel value and the true frequency in Hertz coincide (1000 Hz = 1000 mel). The name Mel is derived from the musical term melody: the scale measures the distance between frequency components in terms of perceived pitch.
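The Mel formula can be checked with a few lines of Python. This is only a sketch: the names `hz_to_mel` and `mel_to_hz` are my own, f is taken to be in Hz, and the inverse function is not given in the text above; it is obtained by solving the formula for f.

```python
import math

MEL_FACTOR = 1127.01048  # constant from the formula above


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mel using the formula above."""
    return MEL_FACTOR * math.log(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Algebraic inverse of hz_to_mel (derived here, not from the text)."""
    return 700.0 * (math.exp(mel / MEL_FACTOR) - 1.0)


if __name__ == "__main__":
    # The reference point: 1000 Hz should map to about 1000 mel.
    print(hz_to_mel(1000.0))
    # Round trip: converting to mel and back should return about 440 Hz.
    print(mel_to_hz(hz_to_mel(440.0)))
```

The first call is a quick sanity check of the 1 kHz reference point, and the round trip confirms that the two functions are consistent with each other.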
This article is from Icoolmedia. To discuss related algorithms, please join the audio and video algorithm discussion group (374737122).
Scale--bark scale and Mel scale in audio processing