Audio Encoding (reprint)

Source: Internet
Author: User

Frequency
[Figure: sine waves of different frequencies; the lower wave has a higher frequency than the upper one.]
Frequency is a measure of how many times a repeating event occurs per unit time. In physics it is usually denoted by the Latin letter f or the Greek letter ν, and its international unit is the hertz (Hz). When an event repeats n times in a time T, it occurs at a frequency of f = n/T Hz. And because the period is defined as the minimum interval between recurrences, frequency can also be expressed as the reciprocal of the period, that is, f = 1/T, where T is the period. For example, an event that repeats n times in 60 seconds has a frequency of f = n/60 Hz.
In the International System of Units, the unit of frequency, the hertz, is named after Heinrich Rudolf Hertz; 1 Hz indicates that an event occurs once per second.
One vibration cycle per second is 1 Hz, and the range audible to the human ear is roughly 20 Hz to 20 kHz, that is, 20 to 20,000 vibration cycles per second.
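
These relationships are easy to sanity-check in code. A minimal sketch of my own (the function names and example values are illustrative, not from the article):

    # Frequency from repetition count: f = n / T
    def frequency_from_count(n, duration_s):
        return n / duration_s

    # Frequency as the reciprocal of the period: f = 1 / T
    def frequency_from_period(period_s):
        return 1.0 / period_s

    print(frequency_from_count(120, 60))  # 2.0 Hz: 120 repetitions in 60 s
    print(frequency_from_period(0.5))     # 2.0 Hz: one cycle every 0.5 s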

Audio: Sound information digitization
Audio digitization converts the analog sound waveform into digital form for computer processing. It comprises three steps: sampling, quantization, and encoding.
(1) Sampling
The amplitude of the analog signal is read at fixed time intervals (the sampling period). Sampling yields a discrete sequence of amplitude samples, whose values are still analog quantities. The higher the sampling frequency, the better the fidelity of the sound, but the larger the amount of sampled data. In the MPC standard, the sampling frequencies are fixed at 11.025 kHz, 22.05 kHz, and 44.1 kHz.
(2) Quantization
The sampled amplitude values are converted from analog quantities to digital ones. The number of bits in the digital quantity is the quantization precision. In the MPC standard, the quantization precision is set to 8 or 16 bits.
Together, the sampling and quantization steps are called analog-to-digital (A/D) conversion.
(3) Encoding
The digital sound information is represented in a specific data format; in practice this means compressing the data with one of various compression methods. [1]
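
To make the three steps concrete, here is a small sketch of my own (it assumes NumPy is available; the 440 Hz tone, the 8-bit depth, and the names are illustrative, not from the article):

    import numpy as np

    SAMPLE_RATE = 44_100   # samples per second (one of the MPC rates)
    DURATION = 0.01        # seconds of audio
    FREQ = 440.0           # Hz, a pure test tone

    # (1) Sampling: read the analog waveform at fixed intervals.
    t = np.arange(0, DURATION, 1.0 / SAMPLE_RATE)
    analog = np.sin(2 * np.pi * FREQ * t)          # amplitude in [-1.0, 1.0]

    # (2) Quantization: map each sample onto one of 2^8 = 256 discrete levels.
    BITS = 8
    levels = 2 ** BITS
    quantized = np.round((analog + 1.0) / 2.0 * (levels - 1)).astype(np.uint8)

    # (3) Encoding: here simply the raw bytes of the quantized samples (PCM-style).
    encoded = quantized.tobytes()
    print(len(t), "samples ->", len(encoded), "bytes")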
Audio: Impact factors
(1) Sampling frequency: the number of samples taken per unit time. The higher the sampling frequency, the smaller the interval between sample points and the more faithful the digitized sound, but the larger the corresponding data volume. Sound sampling frequency is measured in kHz (kilohertz).
(2) Quantization bits (sample size): the number of data bits used once the analog quantity has been converted to a digital one. The quantization bits express the amplitude of the sound: the more bits, the finer the sound quality and the larger the corresponding data volume. Common quantization depths are 8 and 16 bits.
(3) Number of channels: whether the processed sound is mono or stereo. Mono carries a single data stream, while stereo needs two data streams, one each for the left and right channels. Stereo obviously sounds better, but its data volume is double that of mono.
Sound data is generally described as massive: the higher the sound-quality requirements, the greater the amount of data. [1]
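
These three factors combine into the standard formula for the raw data rate: data rate = sample rate × quantization bits × channels / 8 bytes per second. A small sketch of my own (the function name is illustrative):

    def pcm_bytes_per_second(sample_rate_hz, bits, channels):
        # Uncompressed PCM data rate in bytes per second.
        return sample_rate_hz * bits * channels // 8

    # CD-quality audio: 44.1 kHz, 16-bit, stereo.
    rate = pcm_bytes_per_second(44_100, 16, 2)
    print(rate)                     # 176400 bytes/s, i.e. 176.4 KB/s
    print(rate * 60 / 1000 / 1024)  # about 10.34 MB per minute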

Interpreting Audio Properties
Sampling accuracy
What is sampling accuracy? WAV stores a digital signal, that is, it describes the original analog signal with a sequence of numbers. Every sound has a waveform; digitization "takes a point" on the original analog waveform at each moment in time and gives that point a value. This is sampling, and the full set of points can then describe the analog signal. Clearly, the more points taken in a given time, the more accurately the waveform is described, and this scale is what we call sampling accuracy. The value we use most often is 44.1 kHz, which means sampling 44,100 times per second. This value was adopted because repeated experiments found it the most appropriate: below it there is a noticeable loss, while above it the human ear can hardly tell the difference, and the digital audio only takes up more space.

To get "extremely accurate" results, 48 kHz or even 96 kHz sampling is also used, although the difference between 96 kHz and 44.1 kHz is nowhere near as large as that between 44.1 kHz and 22 kHz. The sampling standard for the CDs we use is 44.1 kHz, which remains the most common standard, and some believe 96 kHz will be the trend in recording circles in the future.

Bit depth
Bit depth is a commonly heard term: digital recording generally uses 16-bit, 20-bit, or 24-bit music production. What is a "bit"? We know that sound varies between loud and soft, and the physical element behind loudness is amplitude, so a digital recording must be able to describe the amplitude of the waveform precisely. The bit is the unit for this: 16 bits means the amplitude range of the waveform is divided into 2^16 = 65,536 levels, and each sample is assigned to a level according to the loudness of the analog signal, so that it can be represented by a number. As with sampling accuracy, the higher the bit depth, the more finely the changes in the music are rendered. 20 bits gives 1,048,576 levels, which has no trouble expressing music as dynamic as a symphony.

We just used the word "dynamic". It refers to how large the contrast between the loudest and softest passages of a piece of music can be; we often speak of the "dynamic range", whose unit is the dB. Dynamic range is tightly bound to the bit depth used when recording: with a very low bit depth, only a few levels are available to describe the strength of the sound, and of course no large contrast can be heard. Dynamic range increases by 6 dB for each additional bit. So with 1-bit recording the dynamic range would be only 6 dB, and such music would be impossible to listen to. At 16 bits the dynamic range is 96 dB, which meets general requirements. At 20 bits it is 120 dB, enough to cope with a high-contrast symphony, with power to spare for most music. Audiophile-grade recorders also use 24 bits, but as with sampling accuracy, the change over 20 bits is not significant: in theory 24 bits gives a 144 dB dynamic range, but in practice this is hard to achieve, because any device inevitably generates noise; at least at this stage, 24 bits has difficulty delivering its expected effect.
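
To put numbers on that 6 dB-per-bit rule, here is a tiny sketch of my own (the exact factor is 20·log10(2) ≈ 6.02 dB per bit, which is why the printed values slightly exceed the rounded figures above):

    import math

    # Dynamic range of linear PCM grows by ~6.02 dB per bit of depth.
    def dynamic_range_db(bits):
        return 20 * math.log10(2 ** bits)

    for bits in (1, 16, 20, 24):
        print(f"{bits:>2} bits -> {dynamic_range_db(bits):6.1f} dB")
    # 1 -> 6.0 dB, 16 -> 96.3 dB, 20 -> 120.4 dB, 24 -> 144.5 dB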
Features of the audio file format
To play or process audio files on a computer, the sound files must undergo digital-to-analog and analog-to-digital conversion, a process again made up of sampling and quantization. The sound the human ear can hear runs from a lowest frequency of 20 Hz to a highest of 20 kHz; above 20 kHz the ear cannot hear, so the maximum bandwidth of audio is 20 kHz, and the sampling rate therefore needs to lie between 40 and 50 kHz. Each sample also needs enough quantization bits. The standard for audio digitization is 16 bits per sample (16-bit, or a 96 dB signal-to-noise ratio), using linear pulse-code modulation (PCM), in which every quantization step has equal length. Audio files are produced to this standard.
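
As an illustration of that standard, here is a sketch of my own using Python's standard-library wave module to write one second of a 440 Hz tone as 44.1 kHz, 16-bit linear PCM (the file name tone.wav is hypothetical):

    import math
    import struct
    import wave

    SAMPLE_RATE = 44_100   # Hz, the CD standard
    BITS = 16              # 16-bit linear PCM, ~96 dB SNR
    FREQ = 440.0           # Hz, a pure test tone

    with wave.open("tone.wav", "wb") as wav:
        wav.setnchannels(1)              # mono
        wav.setsampwidth(BITS // 8)      # 2 bytes per sample
        wav.setframerate(SAMPLE_RATE)
        amplitude = 2 ** (BITS - 1) - 1  # 32767 for 16 bits
        frames = b"".join(
            struct.pack("<h", int(amplitude * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)))
            for n in range(SAMPLE_RATE)  # one second of samples
        )
        wav.writeframes(frames)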

Audio encoding
Sounds in nature are very complex, with extremely complex waveforms. We usually encode them with pulse-code modulation, that is, PCM encoding. PCM converts a continuously varying analog signal into a digital code through the three steps of sampling, quantization, and encoding.
Lossy and Lossless
From the sampling rate and sample size it follows that, relative to the signal in nature, audio encoding can only approach it infinitely closely; at least current technology can do no better. In that sense, relative to nature's signal, any digital audio coding scheme is lossy, because it can never restore the signal completely. In computer applications, the highest fidelity level achievable is PCM encoding, which is widely used for preserving material and for music appreciation; CDs, DVDs, and our common WAV files all use it. PCM is therefore conventionally regarded as lossless encoding, because PCM represents the best fidelity level in digital audio; it does not mean that PCM guarantees the signal is absolutely faithful, only that PCM can come as close as possible. We habitually place MP3 in the category of lossy audio coding, which is relative to PCM coding. The point of stressing that lossiness and losslessness are relative is that true losslessness is very hard to achieve: like expressing pi with digits, no matter how high the precision, it only approaches pi infinitely and is never truly equal to it.
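
The pi analogy can be made literal. A toy illustration of my own (it measures the error against Python's double-precision value of pi, which is itself only an approximation):

    import math

    # Keeping more digits gets closer to pi, but the error never reaches zero,
    # just as higher sampling rates and bit depths only approach the analog signal.
    for digits in (2, 4, 8, 12):
        approx = round(math.pi, digits)
        print(f"{digits:>2} digits: {approx:<16} error = {abs(math.pi - approx):.2e}")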
Why use audio compression technology
Calculating the bitrate of a PCM audio stream is very easy: sample rate × sample size × number of channels, in bps. For a WAV file with a 44.1 kHz sample rate, 16-bit samples, and two channels of PCM encoding, the data rate is 44.1k × 16 × 2 = 1411.2 kbps. The "128K" we often speak of for MP3 corresponds to this 1411.2 kbps for WAV. This parameter is also called the data bandwidth, the same concept as ADSL bandwidth. Dividing the bitrate by 8 gives the data rate of the WAV, 176.4 KB/s. That means storing one second of two-channel PCM-encoded audio at a 44.1 kHz sample rate and 16-bit sample size requires 176.4 KB of space, and one minute requires about 10.34 MB, which is unacceptable to most users, especially those who like listening to music on a computer. To reduce disk usage there are only two ways: lower the sampling parameters or compress. Lowering the parameters is undesirable, so experts have developed a variety of compression schemes. Because of their different uses and target markets, the various audio compression codecs achieve different sound quality and compression ratios, which we will come to below. One thing is certain: they have all been compressed.
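
The arithmetic above, as a sketch of my own, including the compression ratio against a "128K" MP3:

    wav_bps = 44_100 * 16 * 2       # 1,411,200 bps = 1411.2 kbps
    wav_bytes_per_s = wav_bps // 8  # 176,400 B/s = 176.4 KB/s
    per_minute_mb = wav_bytes_per_s * 60 / 1000 / 1024

    mp3_bps = 128_000               # a "128K" MP3
    print(wav_bps / 1000)           # 1411.2 kbps
    print(round(per_minute_mb, 2))  # 10.34 MB per minute
    print(wav_bps / mp3_bps)        # ~11x smaller at the same duration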

The relationship between frequency and sampling rate
The sample rate is the number of times per second the original signal is sampled; the sample rate we often see on audio files is 44.1 kHz. What does that mean? Suppose we have two sine-wave signals, at 20 Hz and 20 kHz, each lasting one second, corresponding to the lowest and highest frequencies we can hear, and we sample each of them at 40 kHz. What results do we get? The 20 Hz signal is sampled 40k/20 = 2000 times per vibration, while the 20 kHz signal is sampled only 2 times per vibration. Obviously, at the same sample rate, the information recorded at low frequencies is much more detailed than at high frequencies.
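
The ratio in that example is simply the sample rate divided by the signal frequency. A tiny sketch of my own:

    SAMPLE_RATE = 40_000  # Hz

    def samples_per_period(signal_freq_hz, sample_rate_hz=SAMPLE_RATE):
        # How many samples land on each vibration of a sine wave.
        return sample_rate_hz / signal_freq_hz

    print(samples_per_period(20))      # 2000.0 samples per cycle at 20 Hz
    print(samples_per_period(20_000))  # 2.0 samples per cycle at 20 kHz, the bare minimum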
PCM encoding
PCM is the abbreviation of pulse-code modulation. We described the general PCM workflow above; we do not need to care how the final PCM code is calculated, only what the advantages and disadvantages of a PCM-encoded audio stream are. PCM's biggest advantage is good sound quality; its biggest drawback is large size. Our common audio CDs use PCM encoding, and a single disc can hold only about 74 minutes of music.

WAV
DivX, which we encounter more often, is a video codec: AVI can compress its video stream with DivX encoding, or of course with other codecs. In the same way, WAV can compress its audio stream with a variety of audio codecs. What we commonly see are WAVs whose audio stream has been PCM-encoded, but that does not mean WAV can only use PCM encoding; MP3 encoding can also be used in a WAV, just as in AVI. As long as the corresponding decoder is installed, you can enjoy these WAVs.
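
One way to see that WAV is a container is to read the format tag in its "fmt " chunk, which names the codec of the audio stream. A sketch of my own (the tag values come from the RIFF registry: 1 is PCM, 0x0055 is MPEG Layer 3; tone.wav refers to the file written in the earlier sketch):

    import struct

    def wav_format_tag(path):
        # Return the wFormatTag of a WAV's "fmt " chunk (1 = PCM, 0x0055 = MP3).
        with open(path, "rb") as f:
            riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
            assert riff == b"RIFF" and wave_id == b"WAVE", "not a WAV file"
            while True:
                header = f.read(8)
                if len(header) < 8:
                    raise ValueError("no fmt chunk found")
                chunk_id, size = struct.unpack("<4sI", header)
                if chunk_id == b"fmt ":
                    (tag,) = struct.unpack("<H", f.read(2))
                    return tag
                f.seek(size + (size & 1), 1)  # chunks are word-aligned

    print(wav_format_tag("tone.wav"))  # 1 for a PCM WAV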
MP3 encoding
MP3 Introduction
MP3, as the most popular audio compression format, is widely accepted.

Coding: the process of representing information as a code.
Decoding: the process of recovering encoded information to its pre-encoding state.
In digital audio, the initial form of the signal is PCM (for example, WAV), but PCM is bulky and not conducive to transmission, so it is encoded to change its volume, for example a WAV encoded into an MP3.
Decoding is the inverse of encoding: playing an MP3, for instance, first decodes the MP3 into PCM, which is then played.
Audio formats also divide into lossless and lossy. MP3 is a lossy audio encoding: the PCM decoded from an MP3 differs from the PCM that existed before encoding, which is perceived as a drop in sound quality.
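
A round trip like the one described can be sketched with the ffmpeg command-line tool (this assumes ffmpeg is installed; the file names are hypothetical):

    import subprocess

    # Encode: PCM WAV -> MP3 (lossy), shrinking the volume of the data.
    subprocess.run(["ffmpeg", "-y", "-i", "input.wav", "-b:a", "128k", "output.mp3"], check=True)

    # Decode: MP3 -> PCM WAV again. The result plays normally, but it is NOT
    # bit-identical to input.wav: the information discarded by the lossy step is gone.
    subprocess.run(["ffmpeg", "-y", "-i", "output.mp3", "roundtrip.wav"], check=True)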

