I. Echo cancellation algorithm model
First, we analyze the main components of adaptive echo cancellation. The echo cancellation model can be divided into two parts:
- Transversal filter structure
- Adaptive filter-coefficient update and step-size control
The transversal filter convolves its impulse response ŵ(n) (elsewhere also called the echo-path estimate) with the far-end loudspeaker signal u(n) to obtain an echo estimate, denoted y(n). The microphone output serves as the desired response d(n); subtracting the filter's "synthetic echo" y(n) from d(n) yields the error signal e(n). By continually adjusting the filter coefficients ŵ(n) to minimize the mean-square value of the error signal, the error signal becomes an approximate estimate of the local (near-end) speech. This is why the structure removes the echo.
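The structure above can be sketched in a few lines of NumPy. This is a minimal illustration of the transversal-filter/LMS loop, not Speex's implementation; the filter length, step size, and the synthetic echo path in the demo are arbitrary choices for the example.

```python
import numpy as np

def lms_echo_canceller(u, d, num_taps=64, mu=0.05):
    """Transversal-filter echo canceller with LMS adaptation.

    u: far-end (loudspeaker) signal; d: microphone signal (desired response).
    Returns the error signal e, which approximates the near-end speech,
    and the final tap weights w (the echo-path estimate).
    """
    w = np.zeros(num_taps)        # adaptive tap weights ŵ(n)
    e = np.zeros(len(u))
    buf = np.zeros(num_taps)      # sliding window of recent far-end samples u(n)
    for n in range(len(u)):
        buf = np.roll(buf, 1)
        buf[0] = u[n]
        y = w @ buf               # synthetic echo y(n) = ŵ(n)^T u(n)
        e[n] = d[n] - y           # error = desired response minus echo estimate
        w += mu * e[n] * buf      # LMS update: ŵ(n+1) = ŵ(n) + μ u(n) e(n)
    return e, w

# Demo: echo-only scene (no near-end speech); the error should go to zero.
rng = np.random.default_rng(0)
u = rng.standard_normal(20000)
true_path = rng.standard_normal(8) * 0.5   # hypothetical short echo path
d = np.convolve(u, true_path)[:len(u)]     # microphone picks up echo only
e, w = lms_echo_canceller(u, d, num_taps=8, mu=0.01)
print("residual echo power:", np.mean(e[-2000:] ** 2))
```

With no near-end signal present, the residual error power drops toward zero and the tap weights converge to the true echo path.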
II. Optimal step-size analysis for the LMS algorithm
Let us focus on step-size control in the adaptive update of the LMS filter coefficients, that is, on how to obtain the optimal step size. First, write the expected response in vector form:
\[d(n) = W^T u(n) + v(n)\]
where W is the unknown parameter vector of the transversal filter and v(n) is the additive interference noise. The tap weights computed by the LMS filter, ŵ(n), are an estimate of W; the weight-error vector measures the mismatch between them:
\[\varepsilon(n) = W - \hat{w}(n)\]
Substituting the adaptive update of the transversal filter coefficients,
\[\hat{w}(n + 1) = \hat{w}(n) + \mu u(n) e(n)\]
we obtain
\[\varepsilon(n + 1) = \varepsilon(n) - \mu u(n) e(n)\]
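For completeness, this recursion follows directly by subtracting the update equation from the true vector W:

```latex
\[\varepsilon(n+1) = W - \hat{w}(n+1)
               = W - \hat{w}(n) - \mu u(n) e(n)
               = \varepsilon(n) - \mu u(n) e(n)\]
```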
We choose the optimal step size by minimizing the change in the mean-square weight error from iteration n to iteration n+1, that is, by requiring
\[E\{|\varepsilon(n + 1)|^2\} - E\{|\varepsilon(n)|^2\}\]
to be minimized. Expanding this difference gives
\[E\{|\varepsilon(n + 1)|^2\} - E\{|\varepsilon(n)|^2\} = \mu^2 E\{e^2(n)|u(n)|^2\} - 2\mu E\{e(n)\varepsilon^T(n)u(n)\}\]
Define the undisturbed error (in some places also called the noise-free error, i.e., the filter output error that would be observed if there were no noise) in terms of the weight-error vector:
\[\xi(n) = \varepsilon^T(n) u(n)\]
Then
\[E\{|\varepsilon(n + 1)|^2\} - E\{|\varepsilon(n)|^2\} = \mu^2 E\{e^2(n)|u(n)|^2\} - 2\mu E\{e(n)\xi(n)\}\]
Differentiating both sides with respect to μ and setting the derivative to zero readily gives the optimal step size
\[\mu_{opt}(n) = \frac{E\{e(n)\xi(n)\}}{E\{e^2(n)|u(n)|^2\}}\]
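Explicitly, writing $A = E\{e^2(n)|u(n)|^2\}$ and $B = E\{e(n)\xi(n)\}$, the difference above is the quadratic $\mu^2 A - 2\mu B$, and its minimizer is found by

```latex
\[\frac{d}{d\mu}\left(\mu^2 A - 2\mu B\right) = 2\mu A - 2B = 0
  \quad\Rightarrow\quad \mu_{opt} = \frac{B}{A}\]
```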
Moreover, the undisturbed error can be regarded as the difference between the system output error and the additive interference noise:
\[\xi(n) = e(n) - v(n)\]
Assuming the additive interference noise is independent of the undisturbed error, the optimal step size can be written as
\[\mu_{opt}(n) \approx \frac{E\{e(n)\xi(n)\}}{|u(n)|^2 E\{e^2(n)\}} = \frac{E\{e^2(n)\} - E\{v^2(n)\}}{|u(n)|^2 E\{e^2(n)\}}\]
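The numerator simplification uses $\xi(n) = e(n) - v(n)$ together with the independence assumption $E\{\xi(n)v(n)\} = 0$:

```latex
\begin{aligned}
E\{e(n)\xi(n)\} &= E\{e^2(n)\} - E\{e(n)v(n)\}\\
                &= E\{e^2(n)\} - E\{[\xi(n)+v(n)]\,v(n)\}\\
                &= E\{e^2(n)\} - E\{v^2(n)\}
\end{aligned}
```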
It can be seen that when the noise signal is absent, the optimal step size of the LMS algorithm reduces to 1/|u(n)|², i.e., to the fixed-step NLMS algorithm.
III. Optimal step-size analysis for the NLMS algorithm
We now analyze the optimal step size of the NLMS filter. The adaptive update of the NLMS transversal filter coefficients is
\[\hat{w}(n + 1) = \hat{w}(n) + \frac{\mu}{|u(n)|^2} u(n) e(n)\]
Subtracting both sides from W and rearranging gives
\[\varepsilon(n + 1) = \varepsilon(n) - \frac{\mu}{|u(n)|^2} u(n) e(n)\]
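A minimal sketch of this NLMS update, for comparison with the LMS loop above. The regularization constant `delta` is an added practical safeguard against division by zero, not part of the derivation; the filter length and demo echo path are arbitrary example choices.

```python
import numpy as np

def nlms_echo_canceller(u, d, num_taps=64, mu=0.5, delta=1e-6):
    """NLMS echo canceller: the step size is normalized by the input
    energy |u(n)|^2 at every iteration."""
    w = np.zeros(num_taps)        # estimate of the echo path W
    e = np.zeros(len(u))
    buf = np.zeros(num_taps)      # sliding window of far-end samples
    for n in range(len(u)):
        buf = np.roll(buf, 1)
        buf[0] = u[n]
        e[n] = d[n] - w @ buf     # error = mic signal minus synthetic echo
        # NLMS update: ŵ(n+1) = ŵ(n) + (μ / |u(n)|²) u(n) e(n)
        w += (mu / (buf @ buf + delta)) * buf * e[n]
    return e, w

# Demo on a synthetic, noiseless echo-only scene.
rng = np.random.default_rng(0)
u = rng.standard_normal(10000)
path = np.array([0.5, -0.3, 0.2, 0.1])   # hypothetical echo path
d = np.convolve(u, path)[:len(u)]
e, w = nlms_echo_canceller(u, d, num_taps=4, mu=0.5)
```

Because the step is normalized, convergence speed is largely insensitive to the input signal level, which is the practical advantage of NLMS over plain LMS.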
As before, we select the optimal step size by minimizing the change in the mean-square weight error from iteration n to iteration n+1. Expanding the difference gives
\[E\{|\varepsilon(n + 1)|^2\} - E\{|\varepsilon(n)|^2\} = \mu^2 E\left\{\frac{|e(n)|^2}{|u(n)|^2}\right\} - 2\mu E\left\{\frac{e(n)\xi(n)}{|u(n)|^2}\right\}\]
Minimizing this by the same differentiation as in the LMS analysis finally gives the optimal NLMS step size
\[\mu_{opt} = \frac{E\{e(n)\xi(n)/|u(n)|^2\}}{E\{|e(n)|^2/|u(n)|^2\}}\]
To simplify the calculation of the optimal step size, we assume the input signal energy fluctuates little from one iteration to the next, so that the following approximations hold:
\[E\{e(n)\xi(n)/|u(n)|^2\} \approx \frac{E\{e(n)\xi(n)\}}{E\{|u(n)|^2\}}\]
and
\[E\{|e(n)|^2/|u(n)|^2\} \approx \frac{E\{|e(n)|^2\}}{E\{|u(n)|^2\}}\]
The optimal step size can be rewritten as
\[\mu_{opt} = \frac{E\{e(n)\xi(n)\}}{E\{|e(n)|^2\}}\]
Using again the independence of the additive interference noise and the undisturbed error, the optimal step size becomes
\[\mu_{opt} = \frac{E\{e(n)\xi(n)\}}{E\{|e(n)|^2\}} = \frac{E\{[\xi(n) + v(n)]\xi(n)\}}{E\{|e(n)|^2\}} = \frac{E\{|\xi(n)|^2\}}{E\{|e(n)|^2\}} = \frac{E\{|\varepsilon^T(n)u(n)|^2\}}{E\{|e(n)|^2\}}\]
If we further assume that, in the spectrum of the input signal u(n), each frequency bin contributes equally to the corresponding bin of the weight-error spectrum, then
\[E\{|\varepsilon^T(n)u(n)|^2\} \approx E\{|\varepsilon(n)|^2\}\, E\{|u(n)|^2\}\]
The resulting optimal step size can be approximated as
\[\mu_{opt} = \frac{E\{|\varepsilon^T(n)u(n)|^2\}}{E\{|e(n)|^2\}} \approx \frac{E\{|\varepsilon(n)|^2\}\, E\{|u(n)|^2\}}{E\{|e(n)|^2\}}\]
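Since, under the independence assumption, E{|ξ(n)|²} = E{|e(n)|²} − E{|v(n)|²}, the step-size ratio derived above can be estimated at run time from error- and noise-power estimates. Below is a sketch of a variable-step NLMS built on this idea; it is an illustration, not Speex's implementation. The noise power E{v²} is assumed known here (in practice it must itself be estimated, e.g. during far-end silence), and the smoothing factor and demo signals are arbitrary example choices.

```python
import numpy as np

def vs_nlms(u, d, num_taps, noise_power, alpha=0.99, delta=1e-6):
    """Variable-step NLMS: the step is estimated per sample as
    (error power - noise power) / error power, i.e. an estimate of
    E{|xi|^2} / E{|e|^2}, clipped to [0, 1]."""
    w = np.zeros(num_taps)
    e = np.zeros(len(u))
    buf = np.zeros(num_taps)
    p_e = 1.0                                 # smoothed estimate of E{e^2}
    for n in range(len(u)):
        buf = np.roll(buf, 1)
        buf[0] = u[n]
        e[n] = d[n] - w @ buf
        p_e = alpha * p_e + (1 - alpha) * e[n] ** 2
        mu = np.clip((p_e - noise_power) / p_e, 0.0, 1.0)
        w += (mu / (buf @ buf + delta)) * buf * e[n]
    return e, w

# Demo: echo plus additive noise; the step shrinks as e(n) nears the noise floor.
rng = np.random.default_rng(1)
u = rng.standard_normal(30000)
path = rng.standard_normal(8) * 0.5           # hypothetical echo path
v = 0.1 * rng.standard_normal(30000)          # interference noise, E{v^2} = 0.01
d = np.convolve(u, path)[:len(u)] + v
e, w = vs_nlms(u, d, num_taps=8, noise_power=0.01)
print("steady-state error power:", np.mean(e[-3000:] ** 2))
```

As the error power approaches the noise power, the estimated step goes to zero, so adaptation freezes near the optimum instead of being perturbed by noise.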
Careful readers may have noticed that this conclusion agrees with the optimal step size derived in the in-depth analysis of the Speex echo cancellation principle (the notation differs, but this does not affect the meaning). This shows that, under the criterion of minimizing the mean-square deviation of the tap weight vector, the optimal step size comes out the same no matter from which angle the analysis is carried out.
IV. Ideas for improvement
Now that the principle has been analyzed clearly, let us consider what improvements could be made to Speex's implementation of it. My level is limited, so I share these ideas here first; criticism of their shortcomings is welcome!
- The optimal-initial-value problem: although Speex uses MDF (multidelay block frequency-domain) filtering as a long-delay filter, it is in essence still time-domain filtering, merely carried out in the frequency domain. To converge as quickly as possible at startup, the initial value of the filter weight vector matters, and choosing a good one is not straightforward.
- The echo affects each frequency bin differently, so a single leakage factor cannot represent it. Processing the spectrum in segments, with a different leakage factor for each segment, should be a feasible idea.
- In the time domain the echo path is sparse, but Speex does not exploit this sparsity to speed up the convergence process.
- The nonlinearity of the echo path differs from one speaker-to-microphone pair to another, and this effect is obvious on mobile phones. Applying nonlinear processing to the far-end reference signal could weaken it.
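As one hypothetical illustration of the last point, the far-end reference could be passed through a memoryless saturation that mimics loudspeaker clipping before it is fed to the adaptive filter, so the filter sees a reference closer to what the speaker actually emitted. This is only a sketch of the idea; the tanh shape and the `limit` level are assumptions, not anything Speex does.

```python
import numpy as np

def harden_reference(u, limit=0.8):
    """Memoryless soft-clip (tanh-shaped) applied to the far-end reference.

    'limit' is a hypothetical saturation level modeling loudspeaker clipping:
    small samples pass through almost unchanged, large ones saturate at ±limit.
    """
    return limit * np.tanh(np.asarray(u) / limit)
```

The preprocessed signal would then replace u(n) as the adaptive filter's input, leaving only the (linear) room path for the filter to model.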
This article comes from icoolmedia. My level is limited; friends interested in the above issues are welcome to join the audio and video algorithm discussion group (374737122) for further discussion!
(Original title: Theoretical analysis of the optimal step size of LMS and NLMS, and possible improvement ideas for Speex echo cancellation)