Algorithm Series 23: Audio playback and spectral display of discrete Fourier transform

Source: Internet
Author: User

A spectrum display and an equalizer are almost mandatory features of a media player; a player without them is hardly considered professional. Today's mainstream players all have both, and foobar's 18-band equalizer has fascinated many people. I used to play music with Winamp (AOL has since stopped supporting Winamp), and the first thing that attracted me was the beating spectrum on the playback interface, as shown in Figure (1). For a long time I wondered how it was implemented, until I learned that there is such a thing as the discrete Fourier transform.

Figure (1) The beating spectrum in Winamp

In this article we discuss the spectrum first. Since it is a spectrum, it must be related to frequency, right? Yes: the beating spectrum is the power distribution in the frequency domain of the short piece of audio currently being played. Music with drums and strings covers a very wide frequency range. When a loud drum beat sounds, the low and middle bands of the spectrum jump high, indicating that the power in that part of the frequency range is relatively high. Similarly, when a high-pitched violin sounds, the high bands jump, indicating higher power in the high-frequency part. It is because of this relationship that the spectrum always moves in step with the music being played.

To display a beating spectrum in the player, you need to know the power at each frequency in the audio data, but common audio data is a time-domain signal that must be converted into a frequency-domain signal for analysis. The article "listening to the voice crack phone number" introduced how the discrete Fourier transform converts a time-domain sound signal into a frequency-domain power distribution and gave the corresponding algorithm. That algorithm is the basis of the spectrum display described in this article.
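
As a rough illustration of what such a transform produces, here is a minimal sketch of a direct DFT that returns the power in the first N/2 bins of a real signal. The name naivePowerSpectrum and the 1/N scaling are assumptions made for this example; it is not the function from the earlier article, and a real player would normally use an FFT instead of this O(N²) loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: power of bins 0 .. N/2-1 of a real signal,
// computed with a direct DFT (slow, for illustration only).
std::vector<double> naivePowerSpectrum(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    assert(n >= 2);
    const double pi = std::acos(-1.0);
    std::vector<double> power(n / 2);
    for (std::size_t k = 0; k < n / 2; ++k) {
        double re = 0.0, im = 0.0;
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = 2.0 * pi * double(k) * double(t) / double(n);
            re += x[t] * std::cos(angle);
            im -= x[t] * std::sin(angle);
        }
        // Squared magnitude, scaled by 1/N (scaling conventions vary).
        power[k] = (re * re + im * im) / double(n);
    }
    return power;
}
```

Feeding it a pure sine wave that completes 5 cycles in 64 samples concentrates the power in bin 5 and leaves the neighboring bins close to zero, which is exactly the behavior the spectrum display relies on.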

With the PowerSpectrums() function given in the article "listening to the voice crack phone number", an audio signal sampled at 44100 Hz that passes through a 2048-point discrete Fourier transform yields the power distribution at 1024 effective frequency points (the other 1024 are symmetric). The corresponding frequency range is 0 Hz to 22050 Hz.

The player's interface is usually very small. Displaying all 1024 frequency points from 0 to 22050 Hz in such an interface is unrealistic and also unnecessary, since most people can hear only from 20 Hz to 20 kHz and frequencies outside this range can be ignored. A typical spectrum display shows only a small number of bands (the same is true of the Winamp 2.91 I use). This raises another problem: how to choose the values used for display from the 1024 spectral values. The principle is to choose representative frequencies whose center frequencies are not too close together; they can be spaced evenly or unevenly.

There are many ways to do this. The simplest is to take one value every 32 frequency points, which gives 32 power values that are mapped onto 32 spectrum bands. For a 44100 Hz signal and a 2048-point discrete Fourier transform, the spectral resolution is 44100 / 2048 ≈ 21.53 Hz, so 32 points span about 689 Hz. In other words, this method picks one frequency point roughly every 689 Hz and "simply and crudely" discards most of the data, which makes the beating spectrum look incoherent.
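
The bin-to-frequency mapping and the simplest selection method can be sketched as follows. The function names binFrequency and pickEveryNth are hypothetical and chosen only for this example; the spacing between adjacent bins is the sample rate divided by the transform size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Frequency in Hz of DFT bin k for an fftSize-point transform.
double binFrequency(std::size_t k, double sampleRate, std::size_t fftSize)
{
    assert(fftSize > 0);
    return double(k) * sampleRate / double(fftSize);
}

// Simplest selection: keep one power value out of every `stride` bins
// and drop everything in between.
std::vector<double> pickEveryNth(const std::vector<double>& power, std::size_t stride)
{
    assert(stride > 0);
    std::vector<double> bands;
    for (std::size_t k = 0; k < power.size(); k += stride)
        bands.push_back(power[k]);
    return bands;
}
```

With a sample rate of 44100 and a transform size of 2048, binFrequency(1024, 44100.0, 2048) is 22050 Hz, the upper end of the range described above, and pickEveryNth applied to 1024 power values with a stride of 32 returns 32 values. Everything between the picked bins is ignored, which is why this method tracks the music poorly.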

A better approach is to divide the 1024 points into 32 bands of 32 frequency points each. Within each band, find the center frequency point and take two more adjacent points on each side of it, so that together with the center point the values of 5 frequency points are collected. These 5 points are given different weights, highest for the middle point and decreasing toward the sides, and their weighted average is used as the spectral power of the band shown on the display. The weighted average reflects the actual power of the band (32 points × 44100 / 2048 Hz, about 689 Hz wide) better than a single point does, and the spectrum obtained this way looks more coherent.

The UpdateSpectrum() function implements this algorithm. The sampleData parameter supplies a piece of audio data. The function first calls PowerSpectrums() to get the power distribution of this audio, then divides it into segments according to the BAND_COUNT constant, and finally computes the weighted average of the frequency-domain data in each segment. The weights assigned to the 5 points are 0.5 for the center point, 0.15 for the two points next to it, and 0.1 for the two outermost points. m_spectrumWnd is the spectrum window object; the computed results are passed to the spectrum window through its SetBandLevel() function.

// Compute the level of each band for one block of audio and pass the
// result to the spectrum window.
void UpdateSpectrum(short *sampleData, int totalSamples, int channels)
{
    float power[FFT_SIZE];

    if (PowerSpectrums(&m_hfft, sampleData, totalSamples, channels, power))
    {
        int fpFen = FFT_SIZE / 2 / BAND_COUNT;   // frequency points per band
        int level[BAND_COUNT];
        for (int i = 0; i < BAND_COUNT; i++)
        {
            int centPos = i * fpFen + fpFen / 2; // center point of band i
            // 5-point weighted average around the center point
            double bandTotal = power[centPos - 2] * 0.1 + power[centPos - 1] * 0.15
                             + power[centPos] * 0.5
                             + power[centPos + 1] * 0.15 + power[centPos + 2] * 0.1;
            level[i] = (int)(bandTotal + 0.5);   // round to the nearest integer
        }
        m_spectrumWnd.SetBandLevel(level, BAND_COUNT);
    }
}
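
To try the weighting outside the player class, the same calculation can be written as a free function. This is only a sketch: the name weightedBandLevels, the std::vector interface, and the size check are assumptions for the example, while the weights are the ones given above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Free-function sketch of the 5-point weighted average per band.
// `power` holds the usable half of the spectrum (for example 1024 values).
std::vector<int> weightedBandLevels(const std::vector<double>& power, std::size_t bandCount)
{
    assert(bandCount > 0);
    const std::size_t perBand = power.size() / bandCount;
    assert(perBand >= 5);  // keeps center-2 .. center+2 inside each band
    std::vector<int> level(bandCount);
    for (std::size_t i = 0; i < bandCount; ++i) {
        const std::size_t c = i * perBand + perBand / 2;  // center point of band i
        const double avg = power[c - 2] * 0.1 + power[c - 1] * 0.15 + power[c] * 0.5
                         + power[c + 1] * 0.15 + power[c + 2] * 0.1;
        level[i] = int(avg + 0.5);  // round to the nearest integer level
    }
    return level;
}
```

Because the five weights add up to 1, a flat spectrum keeps its value in every band. A single peak of 100 exactly at a band's center point contributes 50 to that band, and the same peak one point off center contributes 15.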

Designing the spectrum display window involves no technical difficulty; anyone familiar with Windows GDI programming should be able to implement one. The display of each band has three parts: the background, the current intensity level, and a slowly falling thin line (top_bar). Besides a list that records the current intensity level of each band, another list is needed to record the top_bar position of each band. Whenever a buffer finishes playing, the UpdateSpectrum() function computes the intensity of each band and refreshes the list of current intensity levels. Depending on the buffer size, this refresh should happen about 5 to 10 times per second. Meanwhile, an internal position-update timer periodically lowers the intensity level and the top_bar position of each band. To make the display smoother, the update timer should fire more often than the intensity levels are refreshed, generally 15 times per second or more.

The top_bar position and the intensity level are both lowered continuously, but not in the same way. The intensity level can simply decrease by a fixed amount each time. The top_bar first hovers: its position does not change during the hover time. When the hover time is over, it falls by a gradually increasing amount, so that it catches up with the intensity level before the intensity level reaches 0, which makes the spectrum display look lively and interesting. The following is the update-timer code used in this article's example, for reference only:

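// Called on each update-timer tick: lower every band's level and its top bar.
void CSpectrumWnd::UpdateLevelOnTimer()
{
    for (int i = 0; i < BAND_COUNT; i++)
    {
        // The intensity level drops by a fixed step, never below 0.
        if (m_curLevel[i] >= m_levelStep)
            m_curLevel[i] -= m_levelStep;
        else
            m_curLevel[i] = 0;

        if (m_topBar[i].wait > 0)
            m_topBar[i].wait--;   // still hovering
        else
        {
            m_topBar[i].level = (m_topBar[i].level > m_topBar[i].step) ? (m_topBar[i].level - m_topBar[i].step) : 0;
            // The top bar never falls below the current intensity level.
            if (m_topBar[i].level <= m_curLevel[i])
                m_topBar[i].level = m_curLevel[i];

            // Each drop is 1.5 times the previous one. MAX_TOPBAR_STEP is an
            // assumed cap; the original bound is not recoverable here.
            if (m_topBar[i].step < MAX_TOPBAR_STEP)
                m_topBar[i].step += (m_topBar[i].step / 2);
        }
    }
}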


m_levelStep is the amount by which the intensity value is reduced each time. The wait property of top_bar is the hover count, which controls the hover time. When it reaches 0, the top_bar position starts to drop, and each drop is 1.5 times the previous one, so the fall accelerates.
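
The hover-then-fall behavior of a single top_bar can be tried in isolation with the sketch below. The field names follow the article, but the functions raiseTopBar and tickTopBar, the reset logic, and the three constants are assumptions made for this example, not values from the demo program.

```cpp
#include <cassert>

// Hypothetical state for one band's top bar.
struct TopBar {
    int level;  // current bar position
    int wait;   // remaining hover ticks
    int step;   // amount to drop on the next falling tick
};

const int HOVER_TICKS = 3;       // assumed hover duration, in timer ticks
const int INITIAL_STEP = 2;      // assumed size of the first drop
const int MAX_TOPBAR_STEP = 16;  // assumed cap on the drop size

// Called when a new intensity level arrives: push the bar up and restart the hover.
void raiseTopBar(TopBar& bar, int newLevel)
{
    assert(newLevel >= 0);
    if (newLevel >= bar.level) {
        bar.level = newLevel;
        bar.wait = HOVER_TICKS;
        bar.step = INITIAL_STEP;
    }
}

// Called on every timer tick: hover first, then fall by a step that grows by half.
void tickTopBar(TopBar& bar, int curLevel)
{
    if (bar.wait > 0) {
        --bar.wait;
        return;
    }
    bar.level = (bar.level > bar.step) ? bar.level - bar.step : 0;
    if (bar.level < curLevel)
        bar.level = curLevel;  // never fall below the intensity level
    if (bar.step < MAX_TOPBAR_STEP)
        bar.step += bar.step / 2;
}
```

With these assumed constants, a bar raised to 100 stays at 100 for three ticks and then falls to 98, 95 and 91 on the next three ticks, as the step grows from 2 to 3 to 4. Because the step is an integer, an initial step of 1 would never grow, so the starting value has to be at least 2.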

The spectrum display window has to be redrawn at high speed. Drawing it directly with GDI functions has proven inefficient and is not recommended. A window that refreshes this often is usually handled with a bitmap buffer: the spectrum image is "generated" by writing color values directly into a block of bitmap data, and a GDI bitmap function then "pastes" it onto the window DC in one step.
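
The "generate the bitmap in memory" step might look like the following sketch. The Frame structure, the drawBars function, and the 32-bit row-major pixel layout are all assumptions made for this example; the code is platform-independent and leaves out the final copy to the window.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical off-screen frame: 32-bit pixels, row-major, top row first.
struct Frame {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;
    Frame(int w, int h, std::uint32_t bg)
        : width(w), height(h), pixels(std::size_t(w) * std::size_t(h), bg) {}
};

// Fill each band's bar directly in memory; levels are bar heights in pixels.
void drawBars(Frame& frame, const std::vector<int>& levels,
              int barWidth, int gap, std::uint32_t color)
{
    assert(barWidth > 0 && gap >= 0);
    for (std::size_t b = 0; b < levels.size(); ++b) {
        const int x0 = int(b) * (barWidth + gap);  // left edge of bar b
        int h = levels[b];
        if (h > frame.height) h = frame.height;    // clip to the frame
        for (int y = frame.height - h; y < frame.height; ++y)
            for (int x = x0; x < x0 + barWidth && x < frame.width; ++x)
                frame.pixels[std::size_t(y) * frame.width + x] = color;
    }
}
```

On Windows, the finished buffer would then be copied to the window DC in a single call, for example with SetDIBitsToDevice, or with BitBlt from a memory DC. That step is omitted here.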

Finally, a light digression. Sound and images reach the brain along different paths, so the brain responds to them with a time difference, and sound and light themselves travel at very different speeds. For the spectrum to give a good sensory experience, the timing of the spectrum display therefore needs to be adjusted. Generally speaking, the sound should be played slightly before the spectrum is shown. This raises a question: how long should each segment of audio data be? This is really the problem of choosing the player's audio buffer size.

The buffer cannot be too large. With a buffer of more than 0.5 seconds, the spectrum is displayed only after 0.5 seconds of audio has played, and the picture no longer matches the sound: the drum has been sounding for a long while before the spectrum reacts, which certainly feels wrong. A buffer that is too small is not good either. First, the discrete Fourier transform is computationally heavy and processing the audio data takes time, so with a very small buffer there may not be enough time to finish the calculation. Of course, today's hardware is powerful and this is not the main problem. The main problem is that a very small buffer makes the spectrum refresh too frequently, which makes the display look incoherent and mechanical.

I have no theoretical data to support this, but in my practical experience an audio buffer of 0.05 to 0.2 seconds gives a good visual experience. The example in this article uses a 0.1 s audio buffer, and to me the effect is acceptable. If any reader has theoretical data on this, I would be grateful to hear about it.
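
The relationship between buffer duration, buffer size, and refresh rate is simple arithmetic, sketched below. The function names are hypothetical, and the code assumes 16-bit PCM samples and one spectrum refresh per buffer.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes in an audio buffer lasting `seconds` of 16-bit PCM.
std::size_t bufferBytes(int sampleRate, int channels, double seconds)
{
    assert(sampleRate > 0 && channels > 0 && seconds > 0.0);
    const std::size_t frames = std::size_t(sampleRate * seconds + 0.5);
    return frames * std::size_t(channels) * sizeof(short);
}

// Spectrum refreshes per second if the spectrum is updated once per buffer.
double refreshesPerSecond(double seconds)
{
    assert(seconds > 0.0);
    return 1.0 / seconds;
}
```

For 44100 Hz stereo audio, a 0.1 s buffer holds 4410 sample frames, which is 17640 bytes where short is 2 bytes, and gives about 10 refreshes per second. The 0.05 s to 0.2 s range above corresponds to between 20 and 5 refreshes per second.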

The example program written for this article is a WAVE file player that plays audio and displays a beating spectrum. Its appearance imitates Winamp, and the spectrum is drawn to look close to Winamp's display. Figure (2) shows the final result of the demo program. That is all for this article; the next one discusses the implementation of the audio equalizer.

Figure (2) Spectrum Display Demo window

