Basic spectral subtraction de-noising

Source: Internet
Author: User

Spectral subtraction is one of the most commonly used methods for speech denoising. It was developed early and is by now a mature algorithm. It assumes that the noise is additive, uncorrelated with the speech, and statistically stationary. Under these assumptions, the noise spectrum estimated from speech-free gaps is used in place of the noise spectrum during speech, and it is subtracted from the spectrum of the noisy speech to give an estimate of the clean speech spectrum. The algorithm is simple and computationally cheap, so it is easy to run fast and can achieve a high output signal-to-noise ratio, which is why it is widely used. The drawback of the classical form is that it produces "musical noise": a residual with a rhythmic fluctuation that sounds somewhat like music.
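
The assumptions above can be written out as a short derivation. The symbols s(n), v(n), S(k) and V(k) are notation assumed here for illustration (clean speech, noise, and their spectra); x(n) is the noisy speech, as in the description further down.

$$
\begin{aligned}
x(n) &= s(n) + v(n) \\
X(k) &= S(k) + V(k) \\
|X(k)|^2 &= |S(k)|^2 + |V(k)|^2 + 2\,\operatorname{Re}\{S(k)\,V^*(k)\} \\
E\big[|X(k)|^2\big] &= E\big[|S(k)|^2\big] + E\big[|V(k)|^2\big] \\
|\hat{S}(k)|^2 &= |X(k)|^2 - \hat{E}\big[|V(k)|^2\big]
\end{aligned}
$$

The cross term drops out of the fourth line only on average, when speech and noise are uncorrelated and the noise has zero mean. In a single frame the last line can therefore come out negative, which is why some rule for negative values is needed.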

After the subtraction, negative values are set to zero (half-wave rectification), which leaves small isolated peaks at random frequencies in the spectrum of each frame. When converted back to the time domain, these peaks sound like tones whose frequencies change randomly from frame to frame, especially in unvoiced segments. This artifact of half-wave rectification is what is called "musical noise". Fundamentally, the causes of musical noise usually include:
(1) Non-linear processing of the negative values in the spectral subtraction algorithm
(2) Inaccurate estimation of the noise spectrum
(3) Large variability of the suppression function (gain function)


A few recommended blog posts that introduce spectral subtraction:

Sound denoising based on spectral subtraction: http://blog.csdn.net/xiahouzuoxin/article/details/41124245

Principle of speech noise reduction by spectral subtraction: http://blog.csdn.net/leixiaohua1020/article/details/47276353

A classical spectral subtraction is described below:

Let x(n) be the time-domain sequence of the noisy speech signal. After windowing and framing with frame length N, the i-th frame is x_i(m). Spectral subtraction works in the frequency domain, so each frame x_i(m) is transformed with the DFT:

$$X_i(k) = \sum_{m=0}^{N-1} x_i(m)\, e^{-j 2\pi k m / N}, \qquad k = 0, 1, \dots, N-1$$
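
The framing and transform step can be sketched in plain Python as follows. The function names `hamming`, `enframe` and `dft`, the choice of a Hamming window, and the hop size `inc` are assumptions made for illustration; the text only says that the signal is windowed, framed, and transformed. The direct O(N^2) DFT is used to keep the sketch dependency-free; a real implementation would normally call an FFT routine.

```python
import cmath
import math

def hamming(n):
    """Symmetric Hamming window of length n (assumed window type)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * m / (n - 1)) for m in range(n)]

def enframe(x, wlen, inc):
    """Split x into frames of length wlen with hop inc, each multiplied by the window."""
    win = hamming(wlen)
    n_frames = (len(x) - wlen) // inc + 1
    return [[x[i * inc + m] * win[m] for m in range(wlen)]
            for i in range(n_frames)]

def dft(frame):
    """Direct DFT: X(k) = sum over m of x(m) * exp(-j*2*pi*k*m/N)."""
    n = len(frame)
    return [sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

# Toy input: a sinusoid with exactly 4 cycles per 32-sample frame.
x = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
frames = enframe(x, wlen=32, inc=16)
spectra = [dft(f) for f in frames]

# Strongest bin in the lower half of the first frame's spectrum.
peak_bin = max(range(17), key=lambda k: abs(spectra[0][k]))
```

With a frame length of 32 and a hop of 16, consecutive frames overlap by half, and the sinusoid shows up as a peak at bin 4 of every frame.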


Next, two quantities are needed for the subsequent steps: the amplitude and the phase angle of each component. The amplitude is |X_i(k)| and the phase angle is

$$\theta_i(k) = \arctan\left[\frac{\operatorname{Im}\,X_i(k)}{\operatorname{Re}\,X_i(k)}\right]$$

Both sets of values are kept for use later in the spectral subtraction.
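
A minimal sketch of splitting a spectrum into amplitude and phase, using only the standard library. The helper name `magnitude_and_phase` and the sample values are made up for illustration. `cmath.phase` computes the angle with `atan2`, which resolves the quadrant that a bare arctangent of Im/Re leaves ambiguous.

```python
import cmath

def magnitude_and_phase(spectrum):
    """Return (amplitudes, phase angles in radians) for a list of complex bins."""
    mags = [abs(c) for c in spectrum]
    phases = [cmath.phase(c) for c in spectrum]  # atan2(Im, Re), range [-pi, pi]
    return mags, phases

# Hypothetical spectrum values, chosen so the results are easy to check by hand.
spectrum = [3 + 4j, -1 + 0j, 0 - 2j]
mags, phases = magnitude_and_phase(spectrum)

# Amplitude and phase together carry the full information of each bin.
rebuilt = [cmath.rect(m, p) for m, p in zip(mags, phases)]
```

Because the two lists together reproduce the original bins, the amplitudes can be modified on their own and recombined with the unchanged phases afterwards.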

The duration of the leading speech-free segment (the noise segment) is known to be IS, and it corresponds to NIS frames. The average energy of the noise segment can then be calculated as

$$D(k) = \frac{1}{NIS} \sum_{i=1}^{NIS} |X_i(k)|^2$$
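
The noise estimate can be sketched as a per-bin average over the leading frames. Both function names are hypothetical, and the conversion from the duration IS (in seconds) to the frame count NIS is an assumption based on the usual frame-count formula for a given frame length `wlen` and hop `inc`; the text does not state how NIS is derived.

```python
def leading_frame_count(is_seconds, fs, wlen, inc):
    """Number of whole frames that fit in the leading noise-only segment (assumed formula)."""
    return int((is_seconds * fs - wlen) // inc + 1)

def noise_power_estimate(power_frames, nis):
    """Average the power spectrum |X_i(k)|^2 over the first nis frames, bin by bin."""
    n_bins = len(power_frames[0])
    return [sum(power_frames[i][k] for i in range(nis)) / nis
            for k in range(n_bins)]

# Illustrative values: 0.25 s of leading noise at 8 kHz, 200-sample frames, hop 80.
nis_example = leading_frame_count(0.25, fs=8000, wlen=200, inc=80)

# Toy power spectra with two bins per frame; only the first two frames are noise.
power_frames = [[1.0, 4.0], [3.0, 0.0], [100.0, 100.0]]
noise = noise_power_estimate(power_frames, nis=2)
```

Only the first NIS frames enter the average, so the estimate is fixed for the rest of the signal. This is where the assumption of statistically stationary noise is relied on.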


The next step is to subtract the noise component from the spectrum of the noisy speech. The subtraction rule uses two constants, a and b: a is the over-subtraction factor and b is the gain compensation factor.
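
The formula itself is not reproduced here, so the sketch below assumes one common form of the rule, in which bins that would fall below zero are floored at a fraction b of the noise estimate:

$$
|\hat{X}_i(k)|^2 =
\begin{cases}
|X_i(k)|^2 - a\,D(k), & |X_i(k)|^2 \ge a\,D(k) \\
b\,D(k), & \text{otherwise}
\end{cases}
$$

The function name `spectral_subtract` and the values of a and b are illustrative only. Other variants floor with a fraction of the noisy power instead of the noise estimate.

```python
import math

def spectral_subtract(power, noise, a, b):
    """Estimated clean amplitudes for one frame (assumed form of the rule).

    power: |X_i(k)|^2 per bin, noise: D(k) per bin,
    a: over-subtraction factor, b: gain compensation factor.
    """
    amplitudes = []
    for p, d in zip(power, noise):
        if p >= a * d:
            est = p - a * d   # enough energy: subtract the scaled noise estimate
        else:
            est = b * d       # assumption: floor at b * D(k) instead of going negative
        amplitudes.append(math.sqrt(est))
    return amplitudes

# Toy frame with two bins: one dominated by speech, one at the noise level.
power = [10.0, 1.0]
noise = [1.0, 1.0]
clean_amp = spectral_subtract(power, noise, a=4.0, b=0.01)
```

In the first bin the noisy power exceeds a times the noise estimate, so the subtraction applies; in the second it does not, so the floor is used.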

At this point we have the amplitude spectrum of the cleaned speech in the frequency domain, and an inverse fast Fourier transform (IFFT) is all that is needed to obtain the time-domain speech sequence. This is where the saved phase angles are used: because the ear is not sensitive to the phase of a speech signal, the phase of the noisy signal can be applied directly to the spectrum after subtraction.
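
The reconstruction step can be sketched as follows: each frame's amplitudes are recombined with the saved phases, transformed back with an inverse DFT, and the frames are summed at their original positions (overlap-add). The function names are hypothetical, the overlap-add method is an assumption (the text only mentions the inverse transform), and no compensation for the analysis window is applied here.

```python
import cmath
import math

def dft(frame):
    """Direct DFT, used only to build the example input."""
    n = len(frame)
    return [sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft_real(spectrum):
    """Inverse DFT, keeping the real part (the input is conjugate-symmetric)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)).real / n
            for m in range(n)]

def resynthesize(mags, phases, inc):
    """Overlap-add frames rebuilt from per-frame amplitudes and phases (all N bins)."""
    wlen = len(mags[0])
    out = [0.0] * ((len(mags) - 1) * inc + wlen)
    for i, (mag, ph) in enumerate(zip(mags, phases)):
        frame = idft_real([cmath.rect(m, p) for m, p in zip(mag, ph)])
        for m in range(wlen):
            out[i * inc + m] += frame[m]
    return out

# Round trip on one toy frame with unmodified amplitudes.
frame = [0.0, 1.0, 0.5, -0.25, -1.0, 0.0, 0.75, 0.25]
spec = dft(frame)
mags = [[abs(c) for c in spec]]
phases = [[cmath.phase(c) for c in spec]]
restored = resynthesize(mags, phases, inc=4)
```

With the amplitudes left untouched, the round trip returns the original frame. In the denoising pipeline the amplitudes would be the ones produced by the subtraction step, while the phases stay those of the noisy signal.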

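The overall process is: window and frame the noisy speech, take the FFT of each frame, separate amplitude and phase, estimate the noise from the leading frames, apply the spectral subtraction, recombine the result with the saved phase, and take the IFFT to obtain the denoised speech.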


The speech after noise reduction contains obvious "musical noise". Increasing the over-subtraction factor a can sometimes reduce the musical noise, but too large a value distorts the waveform, so a compromise value has to be chosen. Also, because the noise superimposed on the speech signal is random, the noise added is different on every run.

References:
1. The Application of MATLAB in Speech Signal Analysis and Synthesis
2. http://blog.csdn.net/tbl1234567/article/details/51841841
