Current results: with the speaker playing audio read from a local file and the mic recording in real time, about 90% of the echo is eliminated. aecdelay still needs tuning; NS and VAD both work normally. The first goal is to get AEC working properly, and the next step is to feed the speaker real-time network audio. Double talk is still a long way off, so keep going.
Effect Diagram:
Real-time audio echo cancellation Flowchart:
Key points:
1. The purpose of echo cancellation is to enable two-way intercom (double talk).
2. Hardware requirements: on low-end devices, running video and audio together with AEC pushes CPU usage to 99%, the audio stutters, and AEC cannot keep up with the computation. For now only two-way voice intercom is supported; turning off video recording on the device resolves this.
3. When the speaker's amplifier circuit plays the data at too high an energy level, AEC cannot cancel the echo precisely, and both the real voice and the echo end up being removed. During debugging, manually lowering the speaker amplification parameter resolved this.
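Point 3 lowers the amplification on the speaker side. If the amplifier parameter cannot be changed, one possible software analogue is to attenuate the far-end PCM before it is played. The following is only a sketch under that assumption: attenuate_pcm16 and its Q15 gain argument are hypothetical names, not part of the demo or of WebRTC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale 16-bit PCM in place by gain_q15 / 32768 (Q15 fixed point).
 * Only attenuation is allowed (0 <= gain_q15 <= 32768), so the result
 * always fits in int16_t and no clamping is needed.
 * Example: gain_q15 = 16384 halves the amplitude (about -6 dB). */
static void attenuate_pcm16(int16_t *samples, size_t count, int32_t gain_q15)
{
    assert(gain_q15 >= 0 && gain_q15 <= 32768);
    for (size_t i = 0; i < count; i++) {
        /* 32-bit intermediate: |sample * gain| <= 2^30, no overflow. */
        int32_t scaled = ((int32_t)samples[i] * gain_q15) / 32768;
        samples[i] = (int16_t)scaled;
    }
}
```

Applied to each far-end frame before playback, the same attenuated samples should also be the ones passed to the AEC as the reference signal, so that the reference matches what the speaker actually emits.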
------------------------------------------------------Demo Start------------------------------------------------------------------------------------//
while (1)
{
    /* Read one 200 ms far-end frame (3200 bytes); stop at end of file. */
    if (3200 == fread(far_frame_c, sizeof(char), 3200, fp_far))
    {
        fread(near_frame_c, sizeof(char), 3200, fp_near);
        for (i = 0; i < NN*10; i++) {
            far_frame_s[i] = (far_frame_c[i*2+1] << 8) | (far_frame_c[i*2] & 0xff); // combine two chars into one short (little-endian)
            far_frame[i] = far_frame_s[i]; // the interface requires float
            near_frame_s[i] = (near_frame_c[i*2+1] << 8) | (near_frame_c[i*2] & 0xff);
            near_frame[i] = near_frame_s[i];
        }
        /* Process the frame as 10 blocks of NN samples each. */
        for (i = 0; i < 10; i++) {
            NOTICE("aec_bufferfarend......\n");
            Ewebrtcaec_bufferfarend(handleaec, far_frame+NN*i, NN); // buffer the reference (far-end) signal that produces the echo
            NOTICE("aec_processs......\n");
            Ewebrtcaec_process(handleaec, near_frame+NN*i, 1, aecout_frame+NN*i, NN, aecdelay, 0); // echo cancellation
        }
        for (i = 0; i < NN*10; i++) {
            aecout_frame_s[i] = aecout_frame[i]; // float back to short
        }
        fwrite(aecout_frame_s, sizeof(short), NN*10, fp_outaec);
#if 0 // NS
        NOTICE("ns_processs......\n");
        Ewebrtcns_analyze(handlens, near_frame); // noise estimation
        Ewebrtcns_process(handlens, aecout_frame, 1, nsout_frame);
        for (i = 0; i < NN; i++) {
            nsout_frame_s[i] = nsout_frame[i];
        }
        fwrite(nsout_frame_s, sizeof(short), NN, fp_outns);
#endif
#if 0 // delay
        NOTICE("getdelaymetrics......\n");
        Ewebrtcaec_getdelaymetrics(handleaecdelay, median, std, frac);
        Ewebrtcaec_getdelaymetrics(handleaec, median, std, frac);
        NOTICE("median%d std%d frac%f\n", median[0], std[0], frac[0]);
#endif
#if 0 // VAD is OK
        NOTICE("vad_process......\n");
        if (0 < Ewebrtcvad_process(handlevad, sample_rate, far_frame_s, NN)) {
            NOTICE("voice~~~~~ sum%d\n", Circulargetsum());
            Circularadddata(1);
        } else {
            NOTICE("silence~~~~~ sum%d\n", Circulargetsum());
            Circularadddata(0);
        }
        if ( < Circulargetsum()) { /* the threshold value is missing in the original */
            int newvalue = MAX(0, median[0]-20);
            NOTICE("newvalue=%d aecdealy=%d\n", newvalue, aecdelay);
            /* Update the delay only when it has moved by at least 8. */
            if (ABS(newvalue-aecdelay) >= 8) {
                aecdelay = MAX(newvalue, 0);
                NOTICE("Aecdelay%d\n", aecdelay);
            }
        }
#endif
    }
    else
    {
        break;
    }
}
------------------------------------------------------Demo End------------------------------------------------------------------------------------//
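The demo calls Circularadddata and Circulargetsum without showing them. One plausible reading is a fixed-size ring buffer of recent VAD decisions whose sum counts the voiced frames in the window. The sketch below assumes exactly that; the window length VAD_WINDOW and the names circular_add_data, circular_get_sum, and update_aec_delay are hypothetical. The delay rule mirrors the demo: subtract 20 from the median delay, clamp at 0, and accept the new value only when it differs from the current one by at least 8.

```c
#include <assert.h>
#include <stdlib.h>

#define VAD_WINDOW 32  /* hypothetical window length, in frames */

static int vad_ring[VAD_WINDOW]; /* last VAD_WINDOW decisions (0 or 1) */
static int vad_pos = 0;          /* next slot to overwrite */
static int vad_sum = 0;          /* running count of voiced frames */

/* Record one VAD decision, dropping the oldest one from the sum. */
static void circular_add_data(int voiced)
{
    assert(voiced == 0 || voiced == 1);
    vad_sum += voiced - vad_ring[vad_pos];
    vad_ring[vad_pos] = voiced;
    vad_pos = (vad_pos + 1) % VAD_WINDOW;
}

/* Number of voiced frames among the last VAD_WINDOW decisions. */
static int circular_get_sum(void)
{
    return vad_sum;
}

/* Return the delay to use next, given the current delay and the
 * median delay reported by the AEC delay metrics. Small changes
 * (less than 8) are ignored to avoid jitter. */
static int update_aec_delay(int current_delay, int median_delay)
{
    int candidate = median_delay - 20;
    if (candidate < 0)
        candidate = 0;
    if (abs(candidate - current_delay) >= 8)
        return candidate;
    return current_delay;
}
```

Gating the update on the voiced-frame count means the delay is only retuned while the far end is actually producing speech, which is when the delay metrics are likely to be meaningful.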
Basic parameters:
Parameter name | Data type | Description
samplerate | int | Sample rate: 8000
Sampleperframe | int | 160*10
Channelcount | int | Mono: 1
audioencoding | int | 2 bytes: encoding_pcm_16bit
Codecid | int | audio_codec_gsm
At 8000 Hz, each frame contains 1600 samples, i.e. 3200 bytes. That gives 5 frames per second at 200 ms each; this frame size was chosen because of CPU performance. Each 3200-byte frame is split into ten 320-byte blocks that are GSM-encoded in turn, since GSM requires 320-byte input.
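The frame arithmetic can be checked with a few lines of C. This is a minimal sketch: the constant and function names are made up for illustration, and only the numbers (8000 Hz, 16-bit mono, 200 ms frames, 160-sample GSM blocks) come from the parameters listed here.

```c
#include <assert.h>
#include <stddef.h>

enum {
    SAMPLE_RATE_HZ    = 8000, /* samplerate */
    BYTES_PER_SAMPLE  = 2,    /* 16-bit PCM, mono */
    FRAME_MS          = 200,  /* one frame per read */
    GSM_BLOCK_SAMPLES = 160   /* 20 ms of audio per GSM block */
};

/* Samples in one 200 ms frame: 8000 * 200 / 1000 = 1600. */
static size_t samples_per_frame(void)
{
    return (size_t)SAMPLE_RATE_HZ * FRAME_MS / 1000;
}

/* Bytes in one frame: 1600 * 2 = 3200. */
static size_t bytes_per_frame(void)
{
    return samples_per_frame() * BYTES_PER_SAMPLE;
}

/* Bytes in one GSM input block: 160 * 2 = 320. */
static size_t gsm_block_bytes(void)
{
    return (size_t)GSM_BLOCK_SAMPLES * BYTES_PER_SAMPLE;
}

/* Number of GSM blocks per frame: 3200 / 320 = 10. */
static size_t gsm_blocks_per_frame(void)
{
    return bytes_per_frame() / gsm_block_bytes();
}

/* Pointer to the index-th 320-byte block inside a 3200-byte frame. */
static const unsigned char *gsm_block(const unsigned char *frame, size_t index)
{
    assert(index < gsm_blocks_per_frame());
    return frame + index * gsm_block_bytes();
}
```

A caller would loop index from 0 to gsm_blocks_per_frame() - 1 and hand each block returned by gsm_block to the GSM encoder.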