Speech Codecs (G.711, G.723, G.726, G.729, iLBC)

Source: Internet
Author: User
Many codecs are in wide use across different fields. This article compares the bit rates and compression ratios of the common ones. If anything here is incorrect, corrections are welcome.
Speech codecs:
The major speech codecs in current use include G.711, G.723, G.726, G.729, iLBC, QCELP, EVRC, AMR, and SMV.

Major audio codecs include:
RealAudio, AAC, AC3, MP3, WMA, SBC, and others. Each codec has its own main field of application.

This article summarizes the key figures for the speech codecs:
The ITU publishes the G.7xx series of speech codecs. Those in wide use today are G.711, G.723, G.726, and G.729. Each has several variants; G.729, for example, comes as G.729, G.729A, G.729B, and G.729AB.

G.711:
G.711 applies nonlinear (logarithmic) quantization to the sampled speech signal. There are two variants, G.711 A-law and G.711 μ-law, and different countries and regions have standardized on one or the other. The G.711 bit rate is 64 kbps. For details, see the ITU specification. The main performance parameters are:
G.711 (PCM: Pulse Code Modulation)
• Sampling rate: 8 kHz
• Bit rate: 64 kbps per channel
• Theoretical delay: 0.125 msec
• Quality: MOS 4.10
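
The figures in the list above follow from simple arithmetic, and the nonlinear quantization can be illustrated with the continuous μ-law curve. The Python sketch below is only an illustration: it assumes 8 bits per sample and μ = 255, and the function names are made up for this example. It is not the bit-exact segmented encoder table defined in the G.711 specification.

```python
import math

SAMPLE_RATE_HZ = 8000   # sampling rate from the parameter list
BITS_PER_SAMPLE = 8     # assumption: one 8-bit code word per sample

bit_rate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # 64000 bit/s
sample_period_ms = 1000 / SAMPLE_RATE_HZ          # 0.125 ms, the theoretical delay

MU = 255  # mu-law constant (assumed for this sketch)

def mu_law_compress(x: float) -> float:
    """Continuous mu-law curve for x in [-1, 1] (illustrative, not bit-exact)."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

if __name__ == "__main__":
    print(bit_rate_bps, sample_period_ms)
    # Small amplitudes are stretched, so they get finer quantization steps.
    for x in (0.01, 0.1, 0.5, 1.0):
        print(x, round(mu_law_compress(x), 3))
```

The curve gives quiet signals a larger share of the code space than loud ones, which is why 8 logarithmic bits can stand in for a wider linear PCM word.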

G.723.1:
G.723.1 is a dual-rate speech coder recommended by the ITU-T for compressing speech or other audio signals in low-bit-rate multimedia services.
Its target applications include multimedia communication systems such as H.323 and H.324, and it has become one of the mandatory algorithms in IP phone systems. The encoder frame length is 30 ms with a look-ahead of 7.5 ms, so the algorithmic delay of the encoder is 37.5 ms. The encoder first band-limits the signal to the traditional telephone bandwidth (per G.712), samples it at the conventional 8000 Hz rate (as in G.711), and converts it to 16-bit linear PCM as the encoder input.
The decoder performs the inverse operations on the output to reconstruct the speech signal. The high-rate coder (6.3 kbps) uses multi-pulse maximum likelihood quantization (MP-MLQ), and the low-rate coder (5.3 kbps) uses algebraic code-excited linear prediction (ACELP). Both the encoder and the decoder must support both rates and can switch between them at any frame boundary.
The coder can also compress and decompress music and other audio signals, but it is optimized for speech. It uses silence compression for discontinuous transmission, which means that artificial noise is generated during silent periods. Besides saving bandwidth, this technique keeps the sender's modem working continuously and avoids switching the carrier signal on and off.
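
The delay and per-frame payload figures can be checked with a few lines of Python. This is a rough sketch: the 30 ms frame and 7.5 ms look-ahead come from the text, the 6.3 kbps and 5.3 kbps values are the two G.723.1 rates, and the helper name `frame_bytes` is invented for this example.

```python
import math

FRAME_MS = 30.0      # frame length
LOOKAHEAD_MS = 7.5   # look-ahead

# Algorithmic delay is one full frame plus the look-ahead.
algorithmic_delay_ms = FRAME_MS + LOOKAHEAD_MS

def frame_bytes(bit_rate_bps: int, frame_ms: float) -> int:
    """Bytes needed to carry one coded frame, rounded up to a whole byte."""
    bits = bit_rate_bps * frame_ms / 1000
    return math.ceil(bits / 8)

if __name__ == "__main__":
    print("algorithmic delay (ms):", algorithmic_delay_ms)
    for rate in (6300, 5300):  # MP-MLQ and ACELP rates
        print(rate, "bit/s ->", frame_bytes(rate, FRAME_MS), "bytes per frame")
```

Because both rates share the same 30 ms frame, a sender can change rate from one frame to the next without changing its packetization interval.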

G.726:
G.726 is an adaptive differential pulse code modulation (ADPCM) codec with four bit rates: 40, 32, 24, and 16 kbit/s. The most commonly used rate is 32 kbit/s; because that is only half the G.711 rate, it doubles the usable capacity of the network. G.726 specifies how a 64 kbps A-law or μ-law PCM signal is converted to an ADPCM channel of 40, 32, 24, or 16 kbps. Of these, the 24 and 16 kbps channels are used for voice in digital circuit multiplication equipment (DCME), and the 40 kbps channel is used for data modem signals in DCME (especially modems running at 4800 bit/s or higher).
The input to the G.726 encoder is normally the output of a G.711 encoder: 64 kbps A-law or μ-law. The algorithm is essentially ADPCM with adaptive quantization.
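
Since G.726 keeps the 8 kHz sampling of its G.711 input, each bit rate corresponds to a fixed number of bits per sample. The Python sketch below works this out along with the capacity gain over a 64 kbps G.711 channel; the function names are made up for this illustration.

```python
SAMPLE_RATE_HZ = 8000   # same sampling rate as the G.711 input
G711_RATE_BPS = 64000

def adpcm_bits_per_sample(bit_rate_bps: int) -> int:
    """ADPCM code word width at the given channel rate."""
    return bit_rate_bps // SAMPLE_RATE_HZ

def capacity_gain(bit_rate_bps: int) -> float:
    """How many such channels fit in the bandwidth of one G.711 channel."""
    return G711_RATE_BPS / bit_rate_bps

if __name__ == "__main__":
    for rate in (40000, 32000, 24000, 16000):
        print(rate, adpcm_bits_per_sample(rate), capacity_gain(rate))
```

At 32 kbps this gives 4 bits per sample and a gain of 2.0, which is the doubling of network capacity mentioned above.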

G.729:
The G.729 speech compression and coding algorithm
CS-ACELP is an algorithm based on the CELP coding model. It achieves high (toll-quality) speech at a low algorithmic delay: the frame length is 10 ms and the encoder uses a 5 ms look-ahead, for an algorithmic delay of 15 ms. In most operating conditions the reconstructed speech quality is equivalent to 32 kbps ADPCM (G.726), with a MOS score of about 4.0. The encoder takes 16-bit PCM speech as input and outputs a binary bit stream; the decoder takes the bit stream as input and outputs 16-bit PCM speech. The input is speech sampled at 8 kHz and coded as 16-bit linear PCM (128 kbps). After compression the data rate is 8 kbps, a compression ratio of 16:1.
The G.729 family is widely used in VoIP and has many variants. Source code and related documents can be obtained directly from the ITU website.
G.729 (CS-ACELP: Conjugate Structure Algebraic Code Excited Linear Prediction)
• Sampling rate: 8 kHz
• Bit rate: 8 kbps per channel
• Frame length: 10 msec
• Theoretical delay: 15 msec
• Quality: MOS 3.9
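
The G.729 figures can be cross-checked with a short calculation. This Python sketch uses only the values given in this section (8 kHz sampling, 16-bit linear PCM input, 8 kbps output, 10 ms frames, 5 ms look-ahead); the variable names are chosen for this example only.

```python
SAMPLE_RATE_HZ = 8000
INPUT_BITS_PER_SAMPLE = 16   # 16-bit linear PCM input
CODED_RATE_BPS = 8000        # G.729 output rate
FRAME_MS = 10
LOOKAHEAD_MS = 5

input_rate_bps = SAMPLE_RATE_HZ * INPUT_BITS_PER_SAMPLE   # uncompressed input rate
compression_ratio = input_rate_bps / CODED_RATE_BPS       # relative to linear PCM
frame_bits = CODED_RATE_BPS * FRAME_MS // 1000            # coded bits per frame
algorithmic_delay_ms = FRAME_MS + LOOKAHEAD_MS

if __name__ == "__main__":
    print("input rate (bit/s):", input_rate_bps)
    print("compression ratio:", compression_ratio)
    print("bits per frame:", frame_bits, "=", frame_bits // 8, "bytes")
    print("algorithmic delay (ms):", algorithmic_delay_ms)
```

Measured against 64 kbps G.711 rather than 16-bit linear PCM, the same 8 kbps stream corresponds to a ratio of 8:1.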

iLBC (Internet Low Bitrate Codec):
Developed by Global IP Sound, a well-known voice engine provider, iLBC is a low-bit-rate codec that remains robust when packets are lost. iLBC provides speech quality equal to or better than G.729 and G.723.1, and it tolerates packet loss better than other low-bit-rate codecs. iLBC runs at 13.3 kbps (30 ms frames) or 15.2 kbps (20 ms frames), which makes it suitable even for dial-up connections.
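
The two iLBC rates follow from its fixed frame sizes. The Python sketch below assumes the frame payloads given in the iLBC specification (RFC 3951), 50 bytes for a 30 ms frame and 38 bytes for a 20 ms frame; the dictionary and function names are invented for this example.

```python
# Assumed frame payload sizes from RFC 3951: frame length in ms -> bytes.
ILBC_FRAME_BYTES = {30: 50, 20: 38}

def bit_rate_bps(frame_bytes: int, frame_ms: int) -> float:
    """Codec bit rate implied by a fixed frame size and frame duration."""
    return frame_bytes * 8 * 1000 / frame_ms

if __name__ == "__main__":
    for frame_ms, size in ILBC_FRAME_BYTES.items():
        rate_kbps = bit_rate_bps(size, frame_ms) / 1000
        print(f"{frame_ms} ms frame, {size} bytes -> {rate_kbps:.2f} kbps")
```

The 30 ms mode works out to 13.33 kbps, which is usually rounded to the 13.3 kbps quoted above.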
The main advantage of iLBC is its handling of packet loss. iLBC encodes each speech packet independently, which makes it well suited to packet-switched networks. In normal operation, iLBC records the relevant parameters and the excitation signal of the current frame for use if later data is lost. When the current frame arrives normally but the previous packet was lost, iLBC smooths the decoded speech against the previously generated substitute speech to remove the discontinuity. When the current packet is lost, iLBC takes the previously recorded excitation signal and mixes it with a random signal to obtain a substitute excitation signal, from which speech that replaces the lost frame is synthesized. Overall, iLBC reproduces the original speech signal more naturally and clearly than standard low-bit-rate codecs, and it is regarded as a high-quality speech codec for packet-switched networks.
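
The two loss-handling cases described above can be sketched as a toy model in Python. This is only an illustration of the idea, not the concealment procedure of the actual iLBC decoder: the function names, the `noise_mix` and `decay` parameters, and the linear cross-fade are all assumptions made for this example.

```python
import random

def conceal_lost_frame(prev_excitation, noise_mix=0.3, decay=0.9, rng=None):
    """Toy substitute excitation: the stored excitation mixed with random
    noise of similar amplitude, then attenuated (hypothetical parameters)."""
    rng = rng or random.Random(0)
    peak = max((abs(s) for s in prev_excitation), default=0.0)
    return [
        decay * ((1.0 - noise_mix) * s + noise_mix * rng.uniform(-peak, peak))
        for s in prev_excitation
    ]

def crossfade(concealed, decoded):
    """Toy smoothing: fade linearly from the concealed signal into the
    first samples of the next correctly decoded frame."""
    n = min(len(concealed), len(decoded))
    out = []
    for i in range(n):
        w = (i + 1) / n   # weight of the decoded signal, rising to 1.0
        out.append((1.0 - w) * concealed[i] + w * decoded[i])
    return out

if __name__ == "__main__":
    stored = [0.0, 0.5, -0.5, 0.25]          # excitation kept from the last good frame
    substitute = conceal_lost_frame(stored)  # current packet lost
    print(substitute)
    print(crossfade(substitute, [0.1, 0.2, 0.3, 0.4]))  # next packet arrived
```

Keeping the last good excitation in the decoder is what lets each packet be handled on its own: concealment needs nothing from the lost packet itself.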
In addition, most standard low-bit-rate codecs, such as G.723.1 and G.729, encode only the 300 Hz to 3400 Hz frequency range. Within this range, the speech quality achieved by G.711 is that of a voice call over the traditional PSTN.
iLBC makes full use of the 0 to 4000 Hz bandwidth for coding, which gives clearer speech than codecs limited to the traditional 300 Hz to 3400 Hz range.
One of the core technologies of the popular Skype Internet phone is the iLBC speech codec. According to Global IP Sound, the codec's speech quality is superior to that of the PSTN, and it can withstand up to 30% packet loss.
In general, under the same packet-switched network conditions, iLBC speech quality is better than that of G.729, G.723.1, and G.711. The voice is fuller and rounder, and the higher the packet loss rate, the more obvious iLBC's quality advantage becomes.
Many VoIP device and application vendors in the international market, such as Skype and Nortel, have already integrated iLBC into their products. In the domestic market, no VoIP manufacturer had officially launched a gateway device supporting iLBC; Xunshi is the first to launch a trunk gateway and IAD devices that support iLBC.

For more information, visit:
www.itu.int
http://www.ilbcfreeware.org/documentation.html#presentations
http://itbbs-arch.pconline.com.cn/topic.jsp?Tid=2648071
http://bbs.sdgb.cn/ShowThread.aspx?Postid=11843
http://en.wikipedia.org/wiki/G.726
http://www.itu.int/rec/T-REC-G.726/e



