Skype SILK Codec Overview

Source: Internet
Author: User

I recently read through the Skype SILK codec algorithm. The basic principles and processing flow are now reasonably clear to me. I will look at the details more closely later, so today is only a brief introduction. SILK is a speech and audio codec that offers a lot of flexibility in audio bandwidth, network bandwidth, and algorithmic complexity. It supports four sampling rates (8 kHz, 12 kHz, 16 kHz, and 24 kHz) and three complexity levels (low, medium, and high). The encoding bit rate ranges from 6 to 40 kbps, with a different range for each sampling rate, and the codec includes VAD (voice activity detection), DTX (discontinuous transmission), FEC (forward error correction), and other modules. Most importantly, it comes with fixed-point C code, which is very helpful for porting to and optimizing on ARM and DSP platforms.

 

[Figure: SILK principle flowchart]

 

 

After reading the SILK code, my impression is that it feels like a mix of iLBC and Speex, although of course it is not that simple. Overall it uses the classic source-filter model, which is the basis for modeling the speech production system, with two stages of filtering. The first stage is a long-term prediction (LTP) filter, which removes the periodic component of voiced speech; this step is of course not needed for unvoiced speech. The second stage is short-term prediction (LPC), which removes the redundancy between neighboring samples. Here the LPC coefficients are computed with the Burg algorithm (CELP codecs generally use the autocorrelation method) and then quantized with multi-stage vector quantization (CELP codecs generally use split vector quantization).

What remains after these two stages of filtering is the excitation signal. CELP codecs usually quantize it with an adaptive codebook plus a fixed codebook, which approximate the near-periodic and noise-like components of the excitation respectively. This model is such a classic that CELP keeps excellent sound quality at 8 kbps and above, and the different ways of quantizing the fixed codebook are what give the various CELP variants their names. I will stop there, otherwise this would turn into an introduction to CELP.

SILK handles the excitation differently. It is not the same as iLBC either, but the idea is quite similar: find the point of maximum energy within the subframe, quantize the gain and normalize, and then range-code the normalized signal. Range coding is a lossless compression algorithm whose principle and performance are similar to those of arithmetic coding; it is used mainly to avoid the patents on arithmetic coding. In addition, the VAD, DTX, FEC, and noise suppression modules all work well. The way the bit rate is controlled is similar to Speex.
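To make the Burg step mentioned above more concrete, here is a minimal floating-point sketch in Python. It is not SILK's fixed-point C implementation and does not follow its frame structure; the function names `burg_lpc`, `lpc_residual`, and `make_ar2_signal` are hypothetical and chosen only for this illustration. It estimates the short-term predictor directly from the samples, without forming an autocorrelation sequence first, and then applies the resulting analysis filter to get the short-term residual.

```python
import random


def burg_lpc(x, order):
    """Estimate LPC coefficients with Burg's method (illustrative sketch).

    Returns (a, err) where a[0] == 1.0 and the residual is
    e[n] = sum(a[k] * x[n - k] for k in range(order + 1)).
    err is the mean residual energy predicted by the recursion.
    """
    n = len(x)
    f = list(x)  # forward prediction error
    b = list(x)  # backward prediction error
    a = [1.0]
    err = sum(v * v for v in x) / n
    for m in range(order):
        # Reflection coefficient that minimizes the sum of forward and
        # backward error energies at this stage.
        num = 0.0
        den = 0.0
        for i in range(m + 1, n):
            num += f[i] * b[i - 1]
            den += f[i] * f[i] + b[i - 1] * b[i - 1]
        k = -2.0 * num / den if den > 0.0 else 0.0
        # Levinson-style update of the predictor polynomial.
        padded = a + [0.0]
        a = [padded[i] + k * padded[m + 1 - i] for i in range(m + 2)]
        # Update both error sequences in place; iterating downward keeps
        # b[i - 1] at its previous-stage value when it is read.
        for i in range(n - 1, m, -1):
            fi = f[i]
            f[i] = fi + k * b[i - 1]
            b[i] = b[i - 1] + k * fi
        err *= 1.0 - k * k
    return a, err


def lpc_residual(x, a):
    """Apply the analysis filter A(z); samples before x[0] are taken as zero."""
    return [
        sum(a[k] * x[i - k] for k in range(len(a)) if i - k >= 0)
        for i in range(len(x))
    ]


def make_ar2_signal(count, seed=0):
    """Hypothetical test signal: x[n] = 1.5*x[n-1] - 0.7*x[n-2] + noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(count):
        x.append(1.5 * x[-1] - 0.7 * x[-2] + rng.gauss(0.0, 1.0))
    return x


signal = make_ar2_signal(4000)
coeffs, _ = burg_lpc(signal, 2)
residual = lpc_residual(signal, coeffs)
print("estimated A(z) coefficients:", coeffs)
```

Because the synthetic signal is generated by a known second-order recursion, the estimate should land near the coefficients 1, -1.5, 0.7, and the residual should carry much less energy than the input. In a real codec the input would be a windowed speech frame and the order would be higher; those details are left out of this sketch.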

 

 

 

Well, that is all for today; I have work tomorrow. I will study the details later.

 
