By cause, communication echo can be divided into acoustic echo and line echo, and the corresponding techniques are called acoustic echo cancellation (AEC) and line echo cancellation (LEC). Acoustic echo arises when, in a hands-free or conference setup, the loudspeaker's sound is fed back into the microphone, often several times over (this one is easy to picture). Line echo is caused by coupling in the 2-wire/4-wire conversion of the physical electronic circuit (this one is harder to picture). So there are two main causes of echo: 1. Acoustic echo from acoustic reflections in the room (see figure below):
In the figure, the man speaks and his voice signal (SPEECH1) is played in the lady's room. Reflections in the room turn SPEECH1 into an echo (ECHO) that re-enters the microphone, where it is superimposed on the lady's voice signal (SPEECH2). The man then hears his own voice superimposed on the lady's, which degrades call quality. In this case, an echo cancellation module is used in the lady's room to cancel the man's echo so that he hears only the lady's voice. 2. Line echo introduced by the 2-wire/4-wire conversion (see figure below):
Both the ADSL modem and the switch contain a 2-wire/4-wire conversion circuit. Because this circuit is never perfectly matched, part of the signal leaks back and forms an echo. If the switch side has no echo cancellation, the caller will hear their own voice. Whatever the cause, the task is the same for a voice terminal or a voice relay switch: before sending, remove the unwanted echo from the voice stream. Picture a voice stream in which at least two sounds are mixed, which must be separated so that one of them can be removed. That sounds very hard. It is like pouring a bottle of blue ink and a bottle of red ink together and then being asked to take the red ink back out, which is probably impossible. So it is no surprise that echo cancellation is seen as a mysterious, impenetrable technique. Admittedly, if all we had were the single signal in which voice and echo are mixed, removing the echo would be impossible (even the most advanced blind signal separation cannot do it). In fact, though, besides the mixed signal we also have the original signal that produced the echo, even though it differs from the echo itself. Consider the following AEC (acoustic echo cancellation) block diagram (the image is reproduced from another source).
In the diagram we have two signals. One is mixed signal 1 (blue and red), the voice stream in which the speech we actually want to send is mixed with the echo we do not want. The other is dashed signal 2, the original voice that caused the echo. At this point one might say: so echo cancellation is simple after all, just subtract dashed signal 2 from mixed signal 1. But note that dashed signal 2 and the echo are not the same, and subtracting one directly would distort the speech beyond recognition. We call mixed signal 1 the near-end signal NE and dashed signal 2 the far-end reference signal FE. Without FE, echo cancellation cannot do its job at all. Although FE and the echo are not identical, the two are highly correlated, which is why an echo is called an echo. At the very least, the echo carries the same words as the reference signal, so it is still intelligible; it is just unpleasant to hear your own words come back a moment after you said them. Since the echo is highly correlated with FE and is in fact caused by FE, we can write the echo as a mathematical function of FE: Echo = f(FE). The function f is called the echo path. In acoustic echo cancellation, f represents the physical process of sound reflecting repeatedly off walls, ceiling, and so on; in line echo cancellation, f represents the coupling in the 2-wire/4-wire conversion of the electronic circuit. Clearly, the next job is to solve for f. Once f is known, we can compute the echo from FE and subtract it from mixed signal 1 to cancel the echo.
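As a rough illustration of why subtracting FE directly fails while subtracting f(FE) works, here is a minimal Python sketch. It assumes the echo path f can be modeled as a short FIR filter; the coefficients in `h`, the random test signals, and the function names are hypothetical and do not come from the original.

```python
import random

def apply_echo_path(fe, h):
    """Echo = f(FE), with f modeled (as an assumption) as an FIR filter h."""
    echo = []
    for n in range(len(fe)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * fe[n - k]
        echo.append(acc)
    return echo

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

random.seed(0)
N = 2000
fe = [random.uniform(-1, 1) for _ in range(N)]            # far-end reference FE
speech = [0.3 * random.uniform(-1, 1) for _ in range(N)]  # near-end speech
h = [0.0, 0.0, 0.6, 0.0, -0.3, 0.15]  # hypothetical path: delay plus reflections

echo = apply_echo_path(fe, h)
ne = [s + e for s, e in zip(speech, echo)]  # mixed signal 1 (NE)

naive = [m - r for m, r in zip(ne, fe)]     # subtract FE directly
ideal = [m - e for m, e in zip(ne, echo)]   # subtract f(FE)

# Error energy of each result relative to the clean near-end speech.
naive_err = energy([a - s for a, s in zip(naive, speech)])
ideal_err = energy([a - s for a, s in zip(ideal, speech)])
print("naive error energy:", naive_err)
print("ideal error energy:", ideal_err)
```

In this toy setup, direct subtraction leaves more error than the echo itself, while subtracting f(FE) recovers the speech up to rounding. The catch is that a real system does not know `h` in advance, which is why f has to be estimated.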
Although echo cancellation is a very complex technique, the process can be described simply:
1. The audio conferencing system in room A receives the sound from room B.
2. The sound is sampled; this sample is called the echo cancellation reference.
3. The sound is then sent to room A's loudspeaker and to the acoustic echo canceller.
4. Room B's sound, together with room A's own sound, is picked up by room A's microphone.
5. The microphone signal is sent to the acoustic echo canceller, which compares it with the original sample and removes room B's sound.
The process of solving for the echo path function f is probably harder to put into words than into mathematical formulas. Since explaining formulas in plain language is harder than discovering them, I will not attempt a full explanation. The following section describes how f is solved using the adaptive filter principle.
Adaptive Filter
An adaptive filter is an algorithm or device that, based on estimates of the statistical characteristics of its input and output signals, automatically adjusts its filter coefficients to achieve the best filtering characteristics. Adaptive filters can be continuous-domain or discrete-domain. A discrete-domain adaptive filter consists of a tapped delay line, variable weighting coefficients, and a mechanism that adjusts the coefficients automatically. The figure shows the signal flow graph of a discrete-domain adaptive filter used to model an unknown discrete system. For each sample of the input sequence x(n), the adaptive filter updates its weighting coefficients according to a specific algorithm so that the mean square error between the output sequence y(n) and the desired output sequence d(n) is minimized, that is, so that y(n) approximates d(n).
The coefficients of an adaptive filter designed under the minimum mean square error criterion can be obtained from the Wiener-Hopf equation. A method proposed by B. Widrow solves for the adaptive filter coefficients in real time, and its result approximates the solution of the Wiener-Hopf equation. This algorithm is called the least mean square algorithm, or LMS. It uses the steepest descent method: from the filter coefficient vector at the current moment and a gradient estimate of the mean square error, it computes the coefficient vector for the next moment.
The update formula is W(n+1) = W(n) + k_s ∇[ε²(n)], where ε(n) = d(n) - y(n) is the error. Here k_s is a negative number whose value determines the convergence of the algorithm, and ∇[ε²(n)] is the gradient estimate of the mean square error.
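The gradient estimate can be written out explicitly. The following is a sketch under stated assumptions: a real-valued FIR filter of length L, a tap-input vector X(n) (a symbol introduced here, not used in the original), and the instantaneous squared error standing in for the mean square error, as is usual for LMS.

```latex
\begin{aligned}
y(n) &= W^{T}(n)\,X(n), \qquad X(n) = [\,x(n),\ x(n-1),\ \dots,\ x(n-L+1)\,]^{T} \\
\varepsilon(n) &= d(n) - y(n) \\
\hat{\nabla}\!\left[\varepsilon^{2}(n)\right]
  &= \frac{\partial\,\varepsilon^{2}(n)}{\partial W(n)} = -2\,\varepsilon(n)\,X(n) \\
W(n+1) &= W(n) + k_s\,\hat{\nabla}\!\left[\varepsilon^{2}(n)\right]
        = W(n) - 2k_s\,\varepsilon(n)\,X(n)
\end{aligned}
```

Because k_s is negative, the quantity -2k_s is a positive step size (often written as mu), so each update moves the coefficients in the direction that reduces the error.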
Adaptive filters are applied in communications to automatic equalization, echo cancellation, and antenna array beamforming, and in other areas of signal processing to parameter identification, noise cancellation, spectral estimation, and so on. Across applications only the input signal and the desired signal differ; the basic principle is the same. From the above, solving for the echo path function f amounts to letting the adaptive filter W(n) converge. The input signal x(n) is FE and the desired signal is the echo; once the adaptive filter has converged, W(n) is the echo path function f. After convergence, when a real echo occurs, we pass FE through W(n) to get a very accurate estimate of the echo, subtract it from the mixed signal, and obtain the speech that actually needs to be sent, which completes the echo cancellation task. Two points are worth noting:
1. During the convergence stage, the desired signal must be pure echo, with no speech mixed in. Speech is uncorrelated with FE, so it disturbs the convergence of W(n). In other words, the algorithm should converge very quickly at the start, ideally as soon as you begin talking and before the other party has had time to speak. After convergence, if the other party starts to speak, so that speech is mixed in, the coefficients W should no longer change; they need to stay stable.
2. The echo path may change. When it does, the echo cancellation algorithm must be able to detect the change, because the adaptive filter has to start learning again: W(n) needs a new convergence process to approximate the new echo path function f.
These two points pull in opposite directions. The first requires the adaptive filter to hold its coefficients steady after convergence so that speech does not disturb them; the second requires it to keep updating at all times so that it can track changes in the echo path.
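The convergence described above can be sketched in Python with a plain LMS update. This is a minimal illustration, not a production canceller: the echo path `h`, the white-noise far-end signal, the step size `mu` (equal to -2 k_s in the notation above), and the function name `lms_echo_canceller` are all assumptions made for the example.

```python
import random

def lms_echo_canceller(fe, ne, taps, mu):
    """Return (out, w): NE with the estimated echo removed, and final weights."""
    w = [0.0] * taps    # adaptive coefficients W(n)
    buf = [0.0] * taps  # tapped delay line: buf[k] holds fe[n - k]
    out = []
    for x, d in zip(fe, ne):
        buf = [x] + buf[:-1]
        y = sum(wk * xk for wk, xk in zip(w, buf))        # estimated echo
        e = d - y                                         # residual to send
        w = [wk + mu * e * xk for wk, xk in zip(w, buf)]  # LMS update
        out.append(e)
    return out, w

random.seed(1)
h = [0.0, 0.5, -0.25, 0.1]  # hypothetical echo path f
fe = [random.uniform(-1, 1) for _ in range(5000)]

# Convergence stage: the near end is silent, so NE is pure echo.
ne = [sum(h[k] * fe[n - k] for k in range(len(h)) if n - k >= 0)
      for n in range(len(fe))]

out, w = lms_echo_canceller(fe, ne, taps=len(h), mu=0.05)
residual = sum(v * v for v in out[-500:]) / 500  # mean residual echo power
print("estimated path:", [round(v, 4) for v in w])
print("residual echo power:", residual)
```

With a silent near end, the weights `w` settle on the assumed path `h` and the residual echo shrinks toward zero. A real echo path is far longer than four taps, and real speech is not white noise, so convergence in practice is slower than this toy case suggests.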
So echo cancellation is hard even at the level of the mathematical algorithm alone. Put simply, an adaptive filter for echo cancellation must have two contradictory properties, fast convergence and high stability, and achieving both at once is the main design challenge.
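The trade-off can be made concrete by running the same LMS filter with two step sizes. The sketch below is only an illustration under simplifying assumptions: the echo path `h` is hypothetical, both talkers are modeled as white noise, and the step sizes 0.2 and 0.005 are arbitrary choices. It tracks the misalignment, the squared distance between the weights and the true path, first with echo only and then with near-end speech mixed in.

```python
import random

def run_lms(fe, ne, h, mu):
    """Run LMS and return the misalignment ||w - h||^2 after each sample."""
    taps = len(h)
    w = [0.0] * taps
    buf = [0.0] * taps  # tapped delay line: buf[k] holds fe[n - k]
    trace = []
    for x, d in zip(fe, ne):
        buf = [x] + buf[:-1]
        e = d - sum(wk * xk for wk, xk in zip(w, buf))
        w = [wk + mu * e * xk for wk, xk in zip(w, buf)]
        trace.append(sum((wk - hk) ** 2 for wk, hk in zip(w, h)))
    return trace

random.seed(2)
h = [0.0, 0.5, -0.25, 0.1]  # hypothetical echo path
N = 6000
fe = [random.uniform(-1, 1) for _ in range(N)]
echo = [sum(h[k] * fe[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(N)]

# First half: echo only. Second half: near-end speech is mixed in.
speech = [0.0] * (N // 2) + [random.uniform(-1, 1) for _ in range(N // 2)]
ne = [e + s for e, s in zip(echo, speech)]

fast = run_lms(fe, ne, h, mu=0.2)    # large step size
slow = run_lms(fe, ne, h, mu=0.005)  # small step size

early_fast, early_slow = fast[100], slow[100]  # shortly after start-up
late_fast = sum(fast[-1000:]) / 1000           # while speech is mixed in
late_slow = sum(slow[-1000:]) / 1000
print("misalignment at n=100 :", early_fast, early_slow)
print("misalignment with talk:", late_fast, late_slow)
```

The large step size gets close to the true path within the first hundred samples but is knocked around once speech is mixed in; the small step size stays steady under speech but takes far longer to converge. Practical designs typically try to get both, for example by varying the step size or pausing adaptation while the near end is talking.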