Algorithm Series 23: Audio playback and spectral display of discrete Fourier transform

Last Update:2015-03-29 Source: Internet

Author: User

- Lead
- What is spectrum
  - 1 Principle of spectrum
  - 2 Selection of Spectrum
  - 3 Calculation of the spectrum
- Show Dynamic Spectrum
  - 1 Implementation Methods
  - 2 Miscellaneous Notes
- Results show

Lead

A spectrum display and an equalizer are almost mandatory in a media player, and a player without these two features is considered unprofessional. All mainstream players now have both, and the 18-band equalizer of Foobar 2000 has fascinated many people. Building on the earlier introduction to the discrete Fourier transform, this article explains how the spectrum display works. The next article will cover the equalizer.

1 What is spectrum

Spectrum is a technical term in digital signal processing, but the spectrum described in this article is not quite the same thing. When I first played music with Winamp (AOL ended Winamp support on December 20, 2013), the first thing that caught my eye was the beating spectrum on its interface (Figure 1). Almost every mainstream media player has one, and it is this display that the article introduces.

Figure 1 The beating spectrum in Winamp

1.1 Principle of spectrum

Let us start with the principle. Since it is called a spectrum, it must have something to do with frequency, right? Yes: the beating spectrum is the power distribution, in the frequency domain, of the short piece of audio currently being played. Drums and string instruments cover a wide frequency range. When a drum beat sounds in the music, the low and middle bands of the spectrum jump high, which shows that the power in that part of the frequency range is high. Similarly, when a high-pitched violin sounds, the high bands jump, which shows higher power at high frequencies. Because of this relationship, the spectrum always moves in step with the music being played.

To display the beating spectrum, the player needs to know the power at each frequency in the audio data. Ordinary audio data is a time-domain signal, so it must be converted into a frequency-domain signal before it can be analyzed. The earlier article in this series on cracking a phone number by listening to its sound explained that the discrete Fourier transform converts a time-domain sound signal into a frequency-domain power distribution, and gave the corresponding algorithm. That is the basis of the spectrum display introduced here.
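As a reminder of what that conversion involves, here is a minimal sketch of a power spectrum computed with a naive O(N^2) discrete Fourier transform. It is not the function from the earlier article: the name NaivePowerSpectrum and the 1/N scaling are assumptions chosen for illustration, and a real player would use an FFT instead.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive O(N^2) DFT power spectrum (illustrative only, not the article's code).
// Returns |X[k]|^2 / N for k = 0 .. N/2 - 1. Only the first half is returned
// because, for real-valued input, the second half mirrors it.
std::vector<double> NaivePowerSpectrum(const std::vector<double>& samples)
{
    const std::size_t n = samples.size();
    assert(n >= 2 && n % 2 == 0);
    const double kPi = 3.14159265358979323846;

    std::vector<double> power(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = 2.0 * kPi * double(k) * double(t) / double(n);
            re += samples[t] * std::cos(angle);
            im -= samples[t] * std::sin(angle);
        }
        power[k] = (re * re + im * im) / double(n);  // assumed scaling
    }
    return power;
}
```

Feeding it a pure sine wave that completes exactly 4 cycles in the buffer puts essentially all of the power into element 4 of the result, which is the behaviour the spectrum display relies on.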

1.2 Selection of Spectrum

The PowerSpectrums() function given in the phone number article performs a 2048-point discrete Fourier transform on audio sampled at 44100Hz and returns the power distribution at 1024 effective frequency points (the other 1024 points are symmetric with them). These points cover the range 0Hz to 22050Hz. A player usually has a very small interface, so showing all 1024 bands from 0 to 22050Hz is unrealistic. It is also unnecessary, because most people can hear only from about 20Hz to 20kHz, and frequencies outside that range can be ignored. A typical spectrum display shows at most 32 bands (the Winamp 2.91 I use has only 19), which raises another question: how to choose 32 values out of the 1024 for display.

The principle is to choose representative frequencies whose center frequencies are not too close together. The selection may be evenly spaced or not. Of the many possible methods, the simplest is to take one value every 32 frequency points, which yields exactly 32 power values to map onto the 32 bands. With a 2048-point transform at a 44100Hz sample rate, the spectral resolution is 44100 / 2048, or about 21.5Hz, so 32 points span about 689Hz. In other words, this method picks one frequency point every 689Hz and "simply and rudely" discards most of the data, which makes the beating spectrum lack coherence.
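The simple selection method can be sketched as follows. The helper names BinFrequency and PickEveryNth are hypothetical and do not come from the article's demo program. The sketch only shows how a frequency point index maps to a frequency and how one value is kept out of every group of points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Frequency in Hz represented by frequency point `bin` of an fftSize-point
// transform (hypothetical helper).
double BinFrequency(std::size_t bin, double sampleRate, std::size_t fftSize)
{
    assert(fftSize > 0);
    return double(bin) * sampleRate / double(fftSize);
}

// Simplest selection: keep one power value out of every `stride` points
// and drop the rest (hypothetical helper).
std::vector<double> PickEveryNth(const std::vector<double>& power,
                                 std::size_t stride)
{
    assert(stride > 0);
    std::vector<double> picked;
    for (std::size_t i = 0; i < power.size(); i += stride)
        picked.push_back(power[i]);
    return picked;
}
```

Calling PickEveryNth on the 1024 power values with a stride of 32 returns 32 values, one per display band, and BinFrequency(1024, 44100.0, 2048) gives the 22050Hz upper limit mentioned above.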

1.3 Calculation of the spectrum

This article instead divides the 1024 points into 32 bands of 32 frequency points each. In each band it finds the center frequency point, takes the two neighbouring points on its left and the two on its right, and computes a value from these 5 points. Each of the 5 points is given a weight, highest in the middle and lower toward the sides, and their weighted average is mapped to the display as the spectral power of the band. The weighted average reflects the actual power of the band, which is about 689Hz wide, more faithfully than a single point does, and in the final display the spectrum produced this way is more coherent.

The UpdateSpectrum() function implements this algorithm. Given a piece of audio data in the sampleData parameter, it first calls PowerSpectrums() to get the power distribution of the audio, then divides it into BAND_COUNT bands, and finally computes the weighted average of the frequency-domain data in each band. The weights are 0.5 for the center point, 0.15 for the two points next to it, and 0.1 for the two outermost points. m_spectrumWnd is the spectrum window object, and the computed results are passed to the window through its SetBandLevel() function.

    void UpdateSpectrum(short *sampleData, int totalSamples, int channels)
    {
        float power[FFT_SIZE];
        if (PowerSpectrums(&m_hFFT, sampleData, totalSamples, channels, power))
        {
            int fpFen = FFT_SIZE / 2 / BAND_COUNT;  // frequency points per band
            int level[BAND_COUNT];
            for (int i = 0; i < BAND_COUNT; i++)
            {
                int centPos = i * fpFen + fpFen / 2;  // center point of band i
                // Weighted average of the center point and its four neighbours.
                double bandTotal = power[centPos - 2] * 0.1
                                 + power[centPos - 1] * 0.15
                                 + power[centPos] * 0.5
                                 + power[centPos + 1] * 0.15
                                 + power[centPos + 2] * 0.1;
                level[i] = (int)(bandTotal + 0.5);  // round to nearest integer
            }
            m_spectrumWnd.SetBandLevel(level, BAND_COUNT);
        }
    }

2 Show Dynamic spectrum

A dynamic spectrum is simply a spectrum that is updated at a fixed interval so that it appears to move. Figure 2 shows a Winamp-style spectrum display window. Seen at one static moment, each band consists of three parts: the background, the current intensity level, and a thin line that falls slowly (top_bar).

2.1 Implementation Methods

The principle of the spectrum display window is simple, and anyone familiar with Windows GDI programming can implement one easily. The window first needs to keep data for the 32 bands: a list that records the current intensity level and the current top_bar position of each band. Each time a buffer is played, UpdateSpectrum() computes the power of each band and refreshes the list of intensity levels. How often this happens depends on the chosen buffer size, and it should be about 5 to 10 times per second. Meanwhile, an internal update timer periodically lowers the intensity level and the top_bar position of each band. To make the display a little smoother, this timer should fire more often than the intensity levels are refreshed, generally more than 15 times per second.

The top_bar position and the intensity level both decrease over time, but in different ways. The intensity level can simply drop by a fixed amount on each tick. The top_bar first hovers: its position does not change during the hover time. When the hover time is over, it falls with gradually increasing speed, and it eventually catches up with the intensity level before that level reaches 0, which makes the display look lively and interesting. The following update timer handler, used in this article's example, is for reference only:

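    void CSpectrumWnd::UpdateLevelOnTimer()
    {
        for (int i = 0; i < BAND_COUNT; i++)
        {
            // The intensity level falls by a fixed amount on every tick.
            if (m_curLevel[i] >= m_levelStep)
                m_curLevel[i] -= m_levelStep;
            else
                m_curLevel[i] = 0;

            if (m_topBar[i].wait > 0)
            {
                m_topBar[i].wait--;  // still hovering
            }
            else
            {
                m_topBar[i].level = (m_topBar[i].level > m_topBar[i].step)
                    ? (m_topBar[i].level - m_topBar[i].step) : 0;
                // The top_bar never drops below the intensity level.
                if (m_topBar[i].level <= m_curLevel[i])
                    m_topBar[i].level = m_curLevel[i];
                // The numeric limit in this comparison is missing from the
                // source text. MAX_TOPBAR_STEP is only a placeholder for it.
                if (m_topBar[i].step < MAX_TOPBAR_STEP)
                    m_topBar[i].step += (m_topBar[i].step / 2);
            }
        }
    }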

m_levelStep is the amount by which the intensity level is reduced on each tick. The wait property of the top_bar is the hover count, which controls the hover time. When it reaches 0 the top_bar starts to fall, and each drop is 1.5 times the previous one, so the fall accelerates.
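To see the hover and the accelerating fall in isolation, the behaviour of a single band can be sketched as below. This is not the demo program's code: the TopBar struct, the TickTopBar function, and the kMaxStep cap are assumptions, and the step is stored as a double here so that the 1.5 times growth also holds for small steps, where integer division would truncate the increment.

```cpp
#include <cassert>

// Hypothetical per-band state. The field names follow the article's code,
// but the types are assumptions made for this sketch.
struct TopBar {
    int level;    // current top_bar height
    int wait;     // remaining hover ticks
    double step;  // how far the bar falls on the next tick
};

// Assumed cap on the fall speed; the article's actual value is not known.
const double kMaxStep = 16.0;

// One timer tick for a single band. Returns the new top_bar level.
int TickTopBar(TopBar& bar, int curLevel)
{
    assert(bar.level >= 0 && curLevel >= 0);
    if (bar.wait > 0) {
        --bar.wait;  // still hovering, position unchanged
        return bar.level;
    }
    const int drop = static_cast<int>(bar.step);
    bar.level = (bar.level > drop) ? bar.level - drop : 0;
    if (bar.level <= curLevel)
        bar.level = curLevel;      // never sink below the intensity level
    if (bar.step < kMaxStep)
        bar.step += bar.step / 2;  // fall 1.5 times faster next tick
    return bar.level;
}
```

Starting from a level of 100 with a hover count of 2 and a step of 2, successive ticks give 100, 100, 98, 95, 91: two ticks of hovering followed by drops of 2, 3 and 4.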

2.2 Miscellaneous Notes

The spectrum display window must be redrawn at high speed. Drawing it directly with GDI functions has proven inefficient and is not recommended. A window that refreshes this fast is usually handled with a bitmap buffer: the spectrum image is "generated" in a block of bitmap data by setting color values directly, and a GDI blit function then "pastes" the bitmap onto the window DC.
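The "generate" half of that approach can be sketched without any GDI calls. In the sketch below, the function name RenderSpectrum, the 32-bit pixel layout, and the 1-pixel gap between bands are all assumptions made for illustration. Copying the finished buffer to the window DC is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Renders vertical bars into a 32-bit pixel buffer (row 0 is the top row).
// levels[i] is the bar height in pixels for band i. Each band is bandWidth
// pixels wide, with the rightmost pixel left as a gap. Illustrative only.
std::vector<std::uint32_t> RenderSpectrum(const std::vector<int>& levels,
                                          int bandWidth, int height,
                                          std::uint32_t background,
                                          std::uint32_t barColor)
{
    assert(bandWidth > 1 && height > 0);
    const int width = static_cast<int>(levels.size()) * bandWidth;
    std::vector<std::uint32_t> pixels(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height),
        background);

    for (std::size_t band = 0; band < levels.size(); ++band) {
        int barHeight = levels[band];
        if (barHeight < 0) barHeight = 0;
        if (barHeight > height) barHeight = height;  // clamp to the window
        const int x0 = static_cast<int>(band) * bandWidth;
        for (int y = height - barHeight; y < height; ++y)
            for (int x = x0; x < x0 + bandWidth - 1; ++x)
                pixels[static_cast<std::size_t>(y) * width + x] = barColor;
    }
    return pixels;
}
```

Because the whole frame is built in memory first, the window needs only one blit per refresh instead of one drawing call per bar.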

Because the ear and the eye pass signals to the brain differently, people respond to sound and to images with a time difference, and sound and light themselves travel at very different speeds. To give a better sensory experience, the timing of the spectrum display therefore needs adjusting. Generally speaking, the sound should be played before the spectrum is shown, which raises a question: how large should the pieces be into which the audio data is divided? This is really the question of the player's audio buffer size.

The buffer cannot be too large. With a buffer of more than 0.5 seconds, for example, the spectrum appears only 0.5 seconds later and no longer matches what is heard: the drum has long since sounded before the spectrum reacts, which certainly feels wrong. A buffer that is too small is not good either. First, the discrete Fourier transform is computationally heavy and processing the audio data takes time, so a very small buffer may not leave enough time for the calculation, although CPUs are now so fast that this is no longer the main problem. Second, a very small buffer makes the spectrum refresh too often, so the display looks incoherent and mechanical.

I have no theoretical data to support this, but in my practical experience an audio buffer of 0.05 to 0.2 seconds gives a good visual experience. The example in this article uses a 0.1s buffer, and to my eye the effect is acceptable. I would be grateful to any reader who can point me to theoretical data.
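Turning a buffer duration into a buffer size is simple arithmetic, sketched below. The function names are hypothetical, and 16-bit PCM samples are assumed because the article's code passes the audio data as short values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of sample frames (one sample per channel) that cover `seconds`
// of audio at `sampleRate`, rounded to the nearest frame.
std::size_t FramesForDuration(int sampleRate, double seconds)
{
    assert(sampleRate > 0 && seconds > 0.0);
    return static_cast<std::size_t>(sampleRate * seconds + 0.5);
}

// Buffer size in bytes, assuming 16-bit PCM samples.
std::size_t BufferBytes(int sampleRate, int channels, double seconds)
{
    assert(channels > 0);
    return FramesForDuration(sampleRate, seconds)
         * static_cast<std::size_t>(channels)
         * sizeof(std::int16_t);
}
```

At 44100Hz a 0.1s buffer holds 4410 frames, which is 17640 bytes for 16-bit stereo. It also means the spectrum is recalculated 10 times per second, and each buffer contains more than the 2048 samples that the transform consumes.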

3 Results show

The example program written for this article is a WAV file player that plays the audio and displays a beating spectrum. Its appearance imitates Winamp, and the shape of the spectrum it draws is close to Winamp's display. Figure 2 shows the final effect of the demo program. That is all for this article. The next one goes on to the implementation of the audio equalizer.

One last announcement: at the request of enthusiastic readers, the demo code for every article in this column will be packaged for download. I am currently looking for a place to host it, so please watch the blog for updates.

Figure 2 Spectrum Display Demo window

This article is an English version of an article originally published in Chinese on aliyun.com.
