After years of validation, Code Excited Linear Prediction (CELP) has become one of the most popular speech coding models for reconstructing high-quality speech. Speex, like other CELP codecs, is built on this model. What are the main ideas of the CELP model?
1. Use a linear prediction (LP) model to model the vocal tract;
2. Following the principles of speech production, use an adaptive codebook and a fixed codebook as the excitation input to the LP model;
3. In the perceptually weighted domain, perform a closed-loop search of the codebooks to find the quantization vector that minimizes the overall error.
[Figure: general CELP decoder block diagram (image not available)]
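As a stand-in for the block diagram, the decoder side of the model can be sketched in a few lines of Python. This is a minimal illustration, not Speex's implementation: the function names, the sign convention A(z) = 1 + sum of a_k z^-k, and the simple handling of the excitation history are all assumptions made for the example.

```python
def lp_synthesis(excitation, lpc, state=None):
    """All-pole LP synthesis filter 1/A(z), with A(z) = 1 + sum_k lpc[k-1] * z^-k.

    The sign convention is an assumption of this sketch.
    """
    order = len(lpc)
    # state[0] holds s[n-1], state[1] holds s[n-2], and so on.
    state = list(state) if state is not None else [0.0] * order
    speech = []
    for e in excitation:
        s = e - sum(a * past for a, past in zip(lpc, state))
        speech.append(s)
        state = ([s] + state)[:order]
    return speech, state


def decode_subframe(past_exc, lag, pitch_gain, fixed_vec, fixed_gain, lpc, state=None):
    """Build the excitation from both codebooks, then run LP synthesis."""
    if not 1 <= lag <= len(past_exc):
        raise ValueError("lag must lie within the stored excitation history")
    buf = list(past_exc)
    start = len(buf)
    for c in fixed_vec:
        # Adaptive codebook: the excitation delayed by the pitch lag.
        adaptive = pitch_gain * buf[len(buf) - lag]
        # Fixed codebook: a scaled entry that models what the pitch cannot.
        buf.append(adaptive + fixed_gain * c)
    excitation = buf[start:]
    speech, state = lp_synthesis(excitation, lpc, state)
    return excitation, speech, state


if __name__ == "__main__":
    exc, speech, _ = decode_subframe(
        past_exc=[1.0, 0.0, 0.0, 0.0], lag=4, pitch_gain=0.5,
        fixed_vec=[0.0, 1.0, 0.0, 0.0], fixed_gain=2.0, lpc=[-0.5],
    )
    print(exc, speech)
```

The encoder runs this same synthesis inside its search loop, which is what makes the codebook search closed-loop.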
Speex differs from other current CELP codecs in three respects:
1. Adaptive codebook search
Most CELP codecs search the adaptive codebook with an integer-delay plus fractional-delay algorithm, whereas Speex uses a third-order (3-tap) long-term pitch predictor with an integer-delay search. An integer-delay adaptive codebook is generally equivalent to a first-order long-term pitch predictor, but its pitch resolution is limited, which degrades the quality of the synthesized speech. There are two main ways to improve the accuracy: raise the order of the long-term pitch predictor, or use a fractional-delay adaptive codebook. Most CELP codecs, especially low bit rate ones, take the second approach; Speex takes the first and vector-quantizes the three gain coefficients.
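The difference between a first-order and a third-order predictor is easy to see in code. The sketch below is only an illustration: the function name `pitch_predict`, the tap ordering (gains applied to delays T+1, T, T-1), and the restriction that the lag exceeds the subframe length are assumptions of the example, not details taken from the Speex source.

```python
def pitch_predict(history, lag, gains, n):
    """Predict n excitation samples from past excitation.

    gains = [g] gives a first-order (1-tap) predictor;
    gains = [g0, g1, g2] gives a third-order (3-tap) predictor whose taps
    sit at delays lag+1, lag, lag-1 (ordering assumed for this sketch).
    """
    center = len(gains) // 2
    if lag - center < n or lag + center > len(history):
        raise ValueError("lag out of range for this simplified sketch")
    out = []
    for i in range(n):
        acc = 0.0
        for k, g in enumerate(gains):
            delay = lag + center - k
            # history[-1] is the sample just before the current subframe.
            acc += g * history[len(history) + i - delay]
        out.append(acc)
    return out


if __name__ == "__main__":
    history = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    print(pitch_predict(history, lag=4, gains=[1.0], n=2))
    print(pitch_predict(history, lag=4, gains=[0.25, 0.5, 0.25], n=2))
```

With three taps the gains can weight the neighbouring integer delays, which lets the predictor approximate a pitch period that falls between two integer lags without an explicit fractional-delay search.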
2. Quantization of the fixed codebook gain
Most CELP codecs quantize the fixed codebook gain with a moving average (MA) predictor, which keeps the gain smooth and continuous but introduces a dependence on previous frames. Speex instead encodes the fixed codebook gain as a global excitation gain for the frame combined with a correction value for each subframe.
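A rough sketch of the Speex-style scheme follows: one gain is coded for the whole frame, and each subframe only codes a small correction relative to it, so nothing depends on earlier frames. The half-octave step size, the four-entry `CORRECTIONS` table, and the function names are hypothetical values chosen for the example; they are not Speex's actual tables.

```python
import math

# Hypothetical correction table (ratios to the global gain), not from Speex.
CORRECTIONS = [0.5, 0.71, 1.0, 1.41]


def encode_gains(subframe_gains):
    """Code one global gain per frame plus a correction index per subframe."""
    rms = math.sqrt(sum(g * g for g in subframe_gains) / len(subframe_gains))
    # Quantize the global gain in half-octave steps (assumed step size).
    global_index = round(2.0 * math.log2(max(rms, 1e-9)))
    global_gain = 2.0 ** (global_index / 2.0)
    corr = [
        min(range(len(CORRECTIONS)),
            key=lambda j: abs(g / global_gain - CORRECTIONS[j]))
        for g in subframe_gains
    ]
    return global_index, corr


def decode_gains(global_index, corr):
    """Rebuild each subframe gain as global gain times its correction."""
    global_gain = 2.0 ** (global_index / 2.0)
    return [global_gain * CORRECTIONS[j] for j in corr]


if __name__ == "__main__":
    index, corr = encode_gains([1.0, 2.0, 1.0, 2.0])
    print(index, corr, decode_gains(index, corr))
```

Because the decoder needs only the current frame's indices, a lost packet does not corrupt the gains of the frames that follow, which is the trade-off against the smoother MA-predicted gain.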
3. Quantization of the fixed codebook
There are many ways to quantize the fixed codebook, such as the popular algebraic codebook search used in ITU G.729, G.723.1, and AMR, and the vector-sum codebook search used in the TIA 8 kbps VSELP speech coder in the United States. To stay clear of the patents covering these methods, Speex has to fall back on a method similar to split vector quantization: the excitation signal of each subframe is divided into several sub-vectors, which are quantized separately. This is clearly not optimal vector quantization: each sub-vector search finds a locally optimal vector, but the combination may not be globally optimal. It is a compromise forced by patents.
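The split approach can be sketched as follows. This is a simplified illustration under stated assumptions: it matches sub-vectors by plain squared error in the excitation domain, whereas a real CELP encoder searches in the perceptually weighted domain, and the tiny codebook and function names are made up for the example.

```python
def nearest(codebook, target):
    """Index of the codebook entry with the smallest squared error to target."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - t) ** 2 for c, t in zip(codebook[i], target)),
    )


def split_vq_encode(excitation, codebook, sub_len):
    """Quantize each sub-vector of the subframe independently."""
    if len(excitation) % sub_len != 0:
        raise ValueError("subframe length must be a multiple of sub_len")
    return [
        nearest(codebook, excitation[start:start + sub_len])
        for start in range(0, len(excitation), sub_len)
    ]


def split_vq_decode(indices, codebook):
    """Concatenate the selected codebook entries back into one excitation vector."""
    out = []
    for i in indices:
        out.extend(codebook[i])
    return out


if __name__ == "__main__":
    # Toy 2-dimensional codebook (hypothetical, for illustration only).
    codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    indices = split_vq_encode([0.9, 0.1, 0.2, 0.8], codebook, sub_len=2)
    print(indices, split_vq_decode(indices, codebook))
```

Each call to `nearest` picks the best entry for its own sub-vector only, which is exactly why the concatenated result is locally but not necessarily globally optimal.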
(Speex study notes... to be continued)
References
1. Jean-Marc Valin, "Speex: A Free Codec For Free Speech"
2. Bao Changchun, "Low Bit Rate Digital Speech Coding Basics"
Comments and discussion are welcome!