Speech Signal Processing

Source: Internet
Author: User

Audio and video are two major carriers of information, and both have come into wide use as computer technology has developed.
My master's degree focused on digital speech signal processing, speech recognition, and speech coding. After graduating, I continued to work in digital speech signal processing. Currently, my main work is porting and optimizing the algorithms of various speech and audio coding standards for practical applications.

After being introduced to digital speech signal processing, I grew very fond of the field and decided to build my career in it. I have worked on many aspects of it: speech recognition, speech coding, speech enhancement, and audio effects (echo, 3D, etc.). In addition, I have studied various speech endpoint detection algorithms in depth.

If you want to make progress in this field, you should study the following foundations:
1. Digital signal processing
2. Random processes
3. Several books dedicated to speech signal processing
In addition, you need to read plenty of relevant literature, in both Chinese and English, and carefully work through and implement the algorithms yourself. As for programming languages, given my own background, I usually use C and Matlab.

I have been working for more than a year and a half, mainly on speech and audio encoding and decoding. I have become familiar with the following codecs:
WMA encoder; WMA decoder (Standard, Professional, and Lossless); MP3 encoder/plugin; AMR-NB encoder/decoder; Audio encoder/decoder; FLAC (Free Lossless Audio Codec) decoder; AAC Plus; AC-3.
My work is mainly project development on ARM cores, and I have used the ARMv4 through ARMv6 instruction sets. I now need to learn the ARMv7 instruction set to keep up with the times.

To sum up, among all speech signal processing systems, studying audio coding in depth is a quick way to improve your own skills. Coding removes redundancy by making full use of the mathematical model of speech and of the time-domain correlation of the speech signal.
Speech codecs, main algorithms: 1. the linear prediction (LP) model, solved with the Levinson-Durbin algorithm; 2. conversion between LP, LSP, and LSF coefficients; 3. vector quantization; 4. post-filtering.
Representative codecs: G.726, the G.729 series, AMR-NB/WB, SMV, and other variable-rate codecs.
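As an illustration of the first item in the speech codec list, here is a minimal C sketch of the Levinson-Durbin recursion, together with the autocorrelation step that usually feeds it. The function names, the `LPC_MAX_ORDER` limit, and the sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p are assumptions made for this example; they are not taken from any particular codec's reference code, which typically works in fixed point and adds windowing and bandwidth expansion.

```c
#include <assert.h>
#include <stddef.h>

#define LPC_MAX_ORDER 32  /* hypothetical upper bound for this sketch */

/* Autocorrelation r[0..order] of a (pre-windowed) frame x[0..n-1]. */
void autocorrelation(const double *x, size_t n, int order, double *r)
{
    assert(x != NULL && r != NULL);
    for (int lag = 0; lag <= order; lag++) {
        double sum = 0.0;
        for (size_t i = (size_t)lag; i < n; i++)
            sum += x[i] * x[i - (size_t)lag];
        r[lag] = sum;
    }
}

/*
 * Levinson-Durbin recursion.
 *   r[0..order] : autocorrelation values
 *   a[0..order] : output predictor polynomial, a[0] = 1
 * Returns the residual prediction error energy,
 * or -1.0 if the input is invalid or the recursion degenerates.
 */
double levinson_durbin(const double *r, int order, double *a)
{
    double tmp[LPC_MAX_ORDER + 1];
    double err;

    assert(r != NULL && a != NULL);
    if (order < 1 || order > LPC_MAX_ORDER || r[0] <= 0.0)
        return -1.0;

    a[0] = 1.0;
    for (int i = 1; i <= order; i++)
        a[i] = 0.0;
    err = r[0];

    for (int i = 1; i <= order; i++) {
        /* Reflection coefficient for stage i. */
        double acc = r[i];
        for (int j = 1; j < i; j++)
            acc += a[j] * r[i - j];
        double k = -acc / err;

        /* Update the lower-order coefficients using the old values. */
        for (int j = 1; j < i; j++)
            tmp[j] = a[j] + k * a[i - j];
        for (int j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;

        /* The error energy shrinks by (1 - k^2) at every stage. */
        err *= (1.0 - k * k);
        if (err <= 0.0)
            return -1.0;
    }
    return err;
}
```

With this convention, the predicted sample is the negated weighted sum of past samples, x̂[n] = -(a[1]x[n-1] + ... + a[p]x[n-p]). For a typical narrowband speech codec the order p would be around 10, but that value is codec specific.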

Audio codecs, main algorithms: 1. subband filtering; 2. MDCT/IMDCT; 3. quantization (with adaptive adjustment of the quantization step size); 4. Huffman encoding/decoding; 5. multi-channel coding methods: M/S coding for two channels (which exploits the correlation between the two channels' data), and the multi-channel transform, in which the channel data is multiplied by a matrix, extending the two-channel idea to multiple channels.
Representative codecs: the WMA series, the AAC series, and the MP3 series.
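The two-channel M/S idea in item 5 can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function names are hypothetical, the 1/2 scaling is one common choice (real codecs differ in scaling), and actual encoders usually apply the transform to spectral coefficients, band by band, rather than to raw time-domain samples.

```c
#include <assert.h>
#include <stddef.h>

/*
 * M/S encode: mid = (L + R) / 2, side = (L - R) / 2.
 * When the two channels are highly correlated, the side signal is
 * close to zero and costs few bits after quantization.
 * Output buffers may alias the input buffers (in-place use is safe).
 */
void ms_encode(const float *left, const float *right,
               float *mid, float *side, size_t n)
{
    assert(left != NULL && right != NULL && mid != NULL && side != NULL);
    for (size_t i = 0; i < n; i++) {
        float l = left[i];
        float r = right[i];
        mid[i]  = 0.5f * (l + r);
        side[i] = 0.5f * (l - r);
    }
}

/* M/S decode: L = mid + side, R = mid - side. */
void ms_decode(const float *mid, const float *side,
               float *left, float *right, size_t n)
{
    assert(mid != NULL && side != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < n; i++) {
        float m = mid[i];
        float s = side[i];
        left[i]  = m + s;
        right[i] = m - s;
    }
}
```

The encode step is just a multiplication of each (L, R) pair by a fixed 2x2 matrix, which is why the same idea extends to more channels with a larger matrix.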

The above covers only the main algorithmic models; there are still many algorithmic details to learn. For example, frame length varies greatly between codecs, from 32 samples to 8192 samples. In short, many algorithms are reused across codecs: once you have learned one well, you can apply it many times, so it pays to study each one thoroughly.



