"Reprint" Audio Basics

Source: Internet
Author: User
Tags value of pi

Audio is the English word you might see next to the output or input jacks on the back panel of a VCR or VCD player. A popular way to understand it: anything we can hear and that can be carried as a sound signal is audio. The physical properties of sound are too specialized to cover here, so please refer to other material. Natural sound is very complex and so are its waveforms. To digitize it we usually use pulse code modulation, or PCM. PCM converts a continuously varying analog signal into a digital code in three steps: sampling, quantization, and encoding.

First, the basic concept of audio

1. What are sample rate and sample size (bit depth)?

Sound is a kind of energy wave, so it has both frequency and amplitude. Frequency corresponds to the time axis and amplitude to the level axis. The wave is infinitely smooth and can be regarded as a curve made of countless points. Because storage space is limited, the digital encoding process has to sample points from that curve. Sampling means reading the value of the wave at a given instant; clearly, the more points we take per second, the more frequency information we capture. To reconstruct a waveform we need at least 2 sample points per cycle. The highest frequency the human ear can perceive is about 20 kHz, so to satisfy human hearing we need at least 40,000 samples per second, written as 40 kHz. This figure is the sample rate. The common CD uses a sample rate of 44.1 kHz. Frequency information alone is not enough: we must also record the energy (amplitude) at each sample point and quantize it to represent signal strength. The number of quantization levels is an integer power of 2; the common CD uses a 16-bit sample size, that is, 2 to the 16th power levels. Sample size is harder to grasp than sample rate because it is more abstract, so here is an example. Suppose a wave is sampled 8 times and the sample points have eight different energy values, A1 to A8. With a 2-bit sample size we can only represent 4 distinct levels, so the eight values must be rounded to those 4 levels and the finer differences are lost. With a 3-bit sample size we have 8 levels, just enough to record all 8 values. The larger the sample rate and sample size, the closer the recorded waveform is to the original signal.
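
The sampling and quantization steps described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names are made up for this example, the signal is an ideal sine wave with amplitude 1, and the quantizer simply rounds to evenly spaced levels.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Read the value of a unit sine wave at regular time intervals."""
    count = int(round(sample_rate_hz * duration_s))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits even levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)          # distance between adjacent levels
    index = round((value + 1.0) / step)
    return -1.0 + index * step

# Eight sample points from one cycle of a 1 kHz wave sampled at 8 kHz.
samples = sample_sine(freq_hz=1000, sample_rate_hz=8000, duration_s=0.001)

coarse = [quantize(s, bits=2) for s in samples]   # at most 4 distinct values
fine = [quantize(s, bits=16) for s in samples]    # very close to the original

worst_coarse_error = max(abs(a - b) for a, b in zip(samples, coarse))
worst_fine_error = max(abs(a - b) for a, b in zip(samples, fine))
```

Comparing `worst_coarse_error` with `worst_fine_error` shows how a larger sample size brings the recorded values closer to the original ones.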

2. Lossy and Lossless

From the discussion of sample rate and sample size it follows that, relative to a natural signal, audio encoding can only approximate it ever more closely, at least with current technology. In that sense any digital audio coding scheme is lossy, because the original cannot be restored completely. In computer applications the highest fidelity available is PCM coding, which is widely used for archiving material and for music appreciation; CDs, DVDs and the common WAV file all use it. By convention, therefore, PCM is treated as lossless, because it represents the best fidelity level in digital audio. This does not mean that PCM guarantees an absolutely faithful signal; PCM only gets as close as possible. We habitually classify MP3 as lossy audio coding, and that is relative to PCM. The point of stressing that lossy and lossless are relative is that true losslessness is hard to achieve. It is like expressing pi with digits: however high the precision, the number only gets ever closer and never really equals pi.

3. Why use audio compression technology

Calculating the bitrate of a PCM audio stream is easy: sample rate x sample size x number of channels, in bps. For a PCM-encoded WAV file with a 44.1 kHz sample rate, 16-bit sample size and two channels, the bitrate is 44.1k x 16 x 2 = 1411.2 kbps. When we speak of a "128K" MP3, the corresponding figure for WAV is this 1411.2 kbps. This parameter is also called data bandwidth, the same concept as the bandwidth of an ADSL line. Dividing the bitrate by 8 gives the data rate of this WAV in bytes: 176.4 KB/s. In other words, storing one second of 44.1 kHz, 16-bit, two-channel PCM audio takes 176.4 KB, and one minute takes about 10.58 MB (176.4 KB x 60). Most users find that unacceptable, especially those who like listening to music on the computer. There are only 2 ways to reduce disk usage: lower the sampling parameters or compress. Lowering the parameters is undesirable, so experts have developed a variety of compression schemes. Because their uses and target markets differ, the various audio codecs achieve different sound quality and compression ratios, which we will cover later in this article. One thing is certain: they all compress.
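
The arithmetic in this section is easy to check with a short script. The function name below is made up for this sketch, and the units are decimal (1 KB = 1000 bytes, 1 MB = 1,000,000 bytes).

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of an uncompressed PCM stream in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

bitrate = pcm_bitrate_bps(44_100, 16, 2)       # bits per second
bytes_per_second = bitrate // 8                # bytes per second
mb_per_minute = bytes_per_second * 60 / 1_000_000

print(f"{bitrate / 1000} kbps")                # 1411.2 kbps
print(f"{bytes_per_second / 1000} KB/s")       # 176.4 KB/s
print(f"{mb_per_minute:.3f} MB per minute")    # 10.584 MB per minute
```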

4. The relationship between frequency and sampling rate

The sample rate is the number of times the original signal is sampled per second. The audio files we usually see have a sample rate of 44.1 kHz; what does that mean? Suppose we have 2 sine wave signals, 20 Hz and 20 kHz, each one second long, corresponding to the lowest and highest frequencies we can hear, and we sample each at 40 kHz. What do we get? The 20 Hz signal is sampled 40k/20 = 2000 times per cycle, while the 20 kHz signal is sampled only 2 times per cycle. Obviously, at the same sample rate, low frequencies are recorded in far more detail than high frequencies. That is why some audio enthusiasts complain that CDs have an unrealistic "digital sound": the CD's 44.1 kHz sampling does not guarantee that high-frequency signals are recorded well. Recording high frequencies better would seem to require a higher sample rate, so some people use a 48 kHz sample rate when ripping CD tracks. This is not advisable! It does nothing for sound quality. For ripping software, keeping the CD's own 44.1 kHz sample rate is one of the guarantees of best quality; raising it is not an improvement. A higher sample rate only helps when the source is an analog signal. If the signal being sampled is already digital, do not try to increase the sample rate.
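
The samples-per-cycle figures above follow from one division. A minimal sketch (the function name is chosen only for this example):

```python
def samples_per_cycle(sample_rate_hz, signal_hz):
    """How many sample points fall within one cycle of the signal."""
    return sample_rate_hz / signal_hz

low = samples_per_cycle(40_000, 20)        # 20 Hz tone sampled at 40 kHz
high = samples_per_cycle(40_000, 20_000)   # 20 kHz tone sampled at 40 kHz
cd_high = samples_per_cycle(44_100, 20_000)

print(low, high, cd_high)
```

Even at the CD rate of 44.1 kHz, a 20 kHz tone gets only slightly more than 2 samples per cycle.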

5. Streaming characteristics

With the growth of the network, people want to listen to music online, so audio files need to be playable while they are being read, without first reading the whole file. That way you can listen without downloading. It is also possible to encode and play at the same time, and it is this property that makes live online broadcasting possible and turns setting up your own digital radio station into a reality.

 

Several complementary concepts:

What is a crossover?
A crossover (frequency divider) separates the sound signal into different frequency bands, which are amplified separately and sent to the speaker driver for each band. High-quality sound reproduction requires electronic crossover processing. There are two types: (1) Power (passive) crossover: placed after the power amplifier, inside the speaker cabinet, it uses an LC filter network to split the amplifier's output into bass, midrange and treble and feeds each to its own driver. It is simple to connect and easy to use, but it consumes power, produces dips in the audio response and causes crossover distortion. Its parameters depend directly on the speaker impedance, which is a function of frequency and deviates considerably from the nominal value, so the error is large and adjustment is difficult. (2) Electronic (active) crossover: a device that splits the low-level audio signal ahead of the power amplifier. Each band then gets its own independent power amplifier and is sent to the corresponding driver. Because the currents are small, it can be built from low-power active filters, is easy to adjust, and reduces power loss and interference between drivers, so signal loss is small and sound quality is good. However, each band needs its own power amplifier, so the cost is high and the circuitry complex; it is used in professional sound reinforcement systems. (Excerpt from Av_world)
What is an exciter?
An exciter is a harmonic generator: a sound processing device that uses the psychoacoustic characteristics of human hearing to modify and beautify the sound signal. By adding high-frequency harmonic components it can improve sound quality and timbre, increase the penetrating power of the sound, and add a sense of space. Modern exciters not only create high-frequency harmonics but also offer low-frequency extension and music style functions, making the bass fuller and the music more expressive. An exciter improves clarity, intelligibility and expressiveness, makes the sound more pleasing, reduces listening fatigue and increases loudness. Although it adds only about 0.5 dB of harmonic content, the sound seems about 10 dB louder. It increases perceived loudness, the three-dimensionality of the sound image and the separation between sounds, improves localization and layering, and improves the quality of reproduced sound and the playback quality of tapes. During transmission and recording the signal loses high-frequency harmonic components and picks up high-frequency noise. In the first case the exciter is used to compensate the signal beforehand; in the second, a filter first removes the high-frequency noise and the exciter then recreates the treble content, to ensure playback quality. Adjusting an exciter requires the sound engineer to judge the sound quality and timbre of the system and then adjust according to subjective listening evaluation.
What is an equalizer?
An equalizer is an electronic device that adjusts the amplification of individual frequency components. By adjusting electrical signals of different frequencies it compensates for defects in the speakers and the sound field, and compensates and modifies various sound sources, among other special uses. An ordinary mixer can only adjust three bands separately: high, mid and low. Equalizers fall into three categories: graphic equalizers, parametric equalizers and room equalizers. 1. Graphic equalizer: the layout of the sliders on the panel visually reflects the equalization curve, so the boost and cut at each frequency is clear at a glance. It uses constant-Q technology: each frequency point has a slide potentiometer, and the filter bandwidth stays the same whether the frequency is boosted or cut. Professional graphic equalizers usually divide the 20 Hz to 20 kHz signal into 10, 15, 27 or 31 bands, and people choose the number of bands that suits their requirements. Generally speaking, a 10-band equalizer has its frequency points spaced one octave apart and is used for general purposes; a 15-band equalizer is a 2/3-octave equalizer, used for professional sound reinforcement; a 31-band equalizer is a 1/3-octave equalizer, mostly used where fine compensation matters. The graphic equalizer is simple in structure and intuitive, so it is widely used in professional audio. 2. Parametric equalizer: an equalizer in which the various equalization parameters can be finely adjusted. It is mostly built into mixers, but standalone units also exist. The adjustable parameters include frequency band, frequency point, gain and quality factor (Q). It can beautify (or deliberately color) and modify the sound, making the style of the sound or music more distinctive and colorful to achieve the desired artistic effect. 3. Room equalizer: used to adjust the frequency response curve of a room. Because decorative materials absorb (or reflect) different frequencies differently, and because of the sound coloration caused by normal-mode room resonances, a room equalizer is needed to compensate objectively for the resulting defects in the frequency response. The finer the frequency bands, the sharper the adjustment peaks, that is, the higher the Q (quality factor), and the more precise the compensation. The coarser the bands, the wider the peaks, and the harder it is to compensate when the frequency response of the sound field is complicated.
What is a compression limiter?
The compression limiter is the collective name for the compressor and the limiter. It is an audio processing device that compresses or limits the dynamic range of an audio signal. A compressor is a variable-gain amplifier whose gain varies automatically with the strength of the input signal, in inverse relation to it. Once the input signal reaches a certain level (the threshold), the output either rises more slowly than the input, which is called compression (compressor), or does not rise at all, which is called limiting (limiter). Older compressor/limiters used hard-knee technology: as soon as the input reached the threshold the gain dropped immediately, so the dynamics changed abruptly at the knee (the point where the gain changes) and the ear clearly noticed strong signals being suddenly compressed. To overcome this shortcoming, modern compressor/limiters use soft-knee technology, in which the compression ratio changes smoothly and gradually around the threshold, so the onset of compression is hard to detect and sound quality improves. In recording, a compressor/limiter keeps the volume of the instruments and the singer in balance and evens out the levels of the various signals. It is sometimes also used to remove a singer's sibilance, or, by varying the attack and release times, to create the "reverse sound" effect of a sound growing from soft to loud. In broadcasting it compresses program signals with a wide dynamic range to raise the average transmission level while preventing modulation distortion and transmitter overload. In the sound reinforcement systems of song and dance halls, it reduces the dynamics of the music by compressing the signal while preserving the style of the original program, to meet the requirements of the sound system and the performance. Although the compressor/limiter has many uses, and modern units generally use soft-knee and other new techniques that further reduce its side effects, this does not mean its damaging effect on sound quality has disappeared. So in a sound reinforcement system do not overuse the compressor/limiter, and even when you do use it, process the signal with it as sparingly as possible. This is needed not only to protect the amplifiers and speakers but also to preserve sound quality.
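The difference between hard knee, soft knee and limiting can be made concrete with a static input/output curve. The sketch below is an assumption for illustration, not the circuit of any particular device: it uses a commonly seen textbook form in which the soft knee is a quadratic blend of width `knee_db` centered on the threshold, and all names and example values are made up here.

```python
def compress_db(level_db, threshold_db, ratio, knee_db=0.0):
    """Static compressor curve: input level in dB -> output level in dB."""
    over = level_db - threshold_db
    if knee_db > 0 and abs(2 * over) <= knee_db:
        # Soft knee: blend smoothly from no compression to full compression.
        return level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    if over <= 0:
        return level_db                     # below threshold: unchanged
    return threshold_db + over / ratio      # above threshold: slope 1/ratio

hard = compress_db(-10, threshold_db=-20, ratio=4)               # -17.5
soft = compress_db(-20, threshold_db=-20, ratio=4, knee_db=10)   # -20.9375
limited = compress_db(0, threshold_db=-20, ratio=float("inf"))   # -20.0
```

With a ratio of 4, a signal 10 dB over the threshold comes out only 2.5 dB over it. With the soft knee, gain reduction already begins a little below the threshold, which is why the transition is harder to hear. An infinite ratio turns the compressor into a limiter.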
What is signal-to-noise ratio (S/N)?
SNR is the ratio of the signal power at a reference point in the line to the noise power inherent there when no signal is present, expressed in decibels (dB). The higher the value the better, indicating less noise.
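As a small worked example of this definition, the ratio of two powers is converted to decibels with 10 x log10. The function name and the power values are made up for illustration.

```python
import math

def snr_db(signal_power_w, noise_power_w):
    """Signal-to-noise ratio in decibels from two power values."""
    return 10 * math.log10(signal_power_w / noise_power_w)

# A 1 W signal over 1 microwatt of noise gives roughly 60 dB.
example_snr = snr_db(1.0, 1e-6)
```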
What is a decibel?
The decibel (dB) is the standard unit for expressing relative power or amplitude level. The larger the decibel value, the louder the sound. In decibel arithmetic, each increase of 10 dB means the sound intensity is about 10 times the original.
dB: decibel. Used to express the relative level of two voltages, powers, or sounds.
dBm: a variant of the decibel; 0 dBm = 1 mW into the reference load impedance (in ohms).
dBv: a variant of the decibel; 0 dBv = 0.775 volts.
dBV: a variant of the decibel; 0 dBV = 1 volt.
dB/octave: decibels per octave. Expresses the slope of a filter; the more decibels per octave, the steeper the slope.

The concept is relatively complex, so we use physics to illustrate:

To describe how strong a sound is, physics introduced the concept of sound intensity, measured as the amount of sound energy passing perpendicularly through a unit area in 1 second. It is denoted by the letter I and its unit is W/m^2. By definition, if the sound energy passing perpendicularly through the unit area in 1 second doubles, the sound intensity becomes twice the original. Sound intensity is therefore an objective physical quantity that does not change with people's perception.
Although sound intensity is an objective physical quantity, its magnitude differs greatly from how strong a sound feels subjectively. To match people's subjective sense of loudness, physics introduced the concept of sound intensity level. The decibel is a unit of sound intensity level; it is one tenth of a bel.
How is sound intensity level defined? How does it relate to sound intensity?
Measurement shows that the ear's sensitivity differs for sound waves of different frequencies; it is most sensitive to sound waves around 3000 Hz. A sound of this frequency becomes audible to the human ear once its intensity reaches I0 = 10^-12 W/m^2. Sound intensity level takes this faintest audible intensity I0 as the reference: the intensity I0 = 10^-12 W/m^2 is defined as level zero, that is, 0 bel (also 0 dB). When the sound intensity doubles from I0 to 2 I0, the loudness perceived by the ear does not double. Only when the intensity reaches 10 I0 does the perceived loudness rise by one step, and the corresponding level is 1 bel = 10 dB. When the intensity becomes 100 I0, the perceived loudness rises by two steps and the level is 2 bel = 20 dB; at 1000 I0 it rises by three steps and the level is 3 bel = 30 dB, and so on. The maximum sound intensity the human ear can withstand is 1 W/m^2 = 10^12 I0, which corresponds to a sound intensity level of 12 bel = 120 dB.
Formula: sound pressure level (dB) = 20 lg (measured sound pressure / reference sound pressure)
Old Fish's note: when the measured value equals the reference sound pressure, taking the logarithm gives 0 dB. On analog audio equipment the level can exceed 0 dB, but on digital equipment it cannot: digital arithmetic needs a fixed scale, and unbounded values are not possible. So in the digital devices and software we use, 0 dB becomes the reference standard value.
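
The two decibel formulas in this section can be tried out directly. The sketch below uses the reference intensity I0 = 10^-12 W/m^2 from the text; the reference sound pressure of 20 micropascals is the usual convention for sound in air and is an assumption added here, as are the function names.

```python
import math

I0 = 1e-12   # reference sound intensity in W/m^2 (from the text)
P0 = 20e-6   # reference sound pressure in Pa (assumed, conventional value)

def intensity_level_db(intensity_w_m2):
    """Sound intensity level: 10 lg (I / I0)."""
    return 10 * math.log10(intensity_w_m2 / I0)

def sound_pressure_level_db(pressure_pa):
    """Sound pressure level: 20 lg (p / p0)."""
    return 20 * math.log10(pressure_pa / P0)

print(round(intensity_level_db(I0), 2))        # 0.0, threshold of hearing
print(round(intensity_level_db(2 * I0), 2))    # 3.01, doubling adds about 3 dB
print(round(intensity_level_db(10 * I0), 2))   # 10.0
print(round(intensity_level_db(1.0), 2))       # 120.0, upper limit in the text
print(round(sound_pressure_level_db(P0), 2))   # 0.0, measured equals reference
```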

Second, the common audio format and player introduction

Characteristics and adaptability of mainstream audio formats

Each audio codec has its own technical characteristics and suits different occasions; here we roughly explain how to apply them flexibly.

4-1 PCM-encoded WAV

As mentioned earlier, PCM-encoded WAV offers the best sound quality, and on the Windows platform all audio software supports it. Many functions in the Windows API can play WAV directly, so when developing multimedia software WAV is often used for event sound effects and background music. PCM-encoded WAV achieves the best sound quality for a given sample rate and sample size, so it is also widely used in audio editing, non-linear editing and other fields.

Features: very good sound quality; supported by a large amount of software.

Suitable for: multimedia development; saving music and sound material.

4-2 MP3

MP3 has a good compression ratio. Medium to high bitrate MP3 encoded with LAME sounds very close to the source WAV file, and with the right parameters LAME-encoded MP3 is ideal for music appreciation. Because MP3 has been around for a long time and combines good quality with a good compression ratio, many games also use it for event sounds and background music. Almost all well-known audio editing software supports MP3, so it can be used much like WAV. But because MP3 encoding is lossy, sound quality drops sharply after several rounds of editing, so MP3 is not suitable for saving source material, although it is very good for demos of finished work. Its long history and good sound quality have made MP3 one of the most widely used lossy codecs: a large number of MP3 resources can be found on the network, and MP3 players are increasingly fashionable. Many VCD players, DVD players and even mobile phones can play MP3, making it one of the best-supported codecs. MP3 is not perfect, and it does not perform well at low bitrates. MP3 also has the basic characteristics of streaming media and can be played online.

Features: good sound quality, fairly high compression ratio, supported by a large amount of software and hardware, widely used.

Suitable for: music appreciation with fairly high requirements.

4-3 OGG

Ogg is a codec with great potential. It performs impressively at all bitrates, especially at low bitrates. Besides good sound quality, Ogg is completely free, which lays the foundation for wider support. Ogg has a very good algorithm and achieves better sound quality at a smaller bitrate: 128 kbps Ogg can outperform MP3 at 192 kbps or even higher. Ogg's treble has a certain metallic flavor, a flaw that shows when encoding solo instruments with a lot of high-frequency content. Ogg has the basic characteristics of streaming media, but no media server software supports it yet, so Ogg-based digital broadcasting is not yet possible. Ogg's current support, in both software and hardware, is still not good enough and cannot compare with MP3.

Features: achieves better sound quality than MP3 at a smaller bitrate; performs well at high, medium and low bitrates.

Suitable for: better sound quality in less storage space (relative to MP3).

4-4 MPC

Like Ogg, MPC competes with MP3. At medium and high bitrates MPC delivers better sound quality than its competitors: at medium bitrates it is no worse than Ogg, and at high bitrates it stands alone. MPC's advantage lies mainly in the high frequencies, which are much more delicate than MP3's and free of Ogg's metallic flavor. It is currently the lossy codec best suited to music appreciation. Because it is a new codec, it shares Ogg's fate and lacks broad software and hardware support. MPC encodes efficiently: encoding time is much shorter than with Ogg or LAME.

Features: the best sound quality among lossy codecs; excellent high-frequency performance at high bitrates.

Suitable for: music appreciation with the best sound quality while still saving a lot of space.

4-6 WMA

WMA, developed by Microsoft, is also a favorite of many. At low bitrates it sounds much better than MP3, and its arrival immediately wiped out the once-popular VQF codec. With Microsoft behind it, WMA enjoys good software and hardware support: Windows Media Player can play WMA and can tune in to digital radio stations based on WMA encoding. Because that player is on almost every PC, more and more music sites are happy to make WMA their first choice for online listening. Besides its well-supported environment, WMA also performs very well at 64-128 kbps. Many demanding listeners are not satisfied with it, but far more listeners with modest requirements have accepted the codec, and WMA spread quickly.

Features: sound quality at low bitrates is hard to match.

Suitable for: setting up digital radio, online listening, music appreciation with modest requirements.

4-7 mp3PRO

As an improved version of MP3, mp3PRO shows quite good quality, with full treble. Although that treble is interpolated during playback by SBR technology, it actually sounds quite good. It may seem a little thin, but mp3PRO has no rival at 64 kbps and even surpasses 128 kbps MP3. Unfortunately mp3PRO's low-frequency performance is the same as MP3's; fortunately SBR's high-frequency interpolation more or less covers up the defect, so mp3PRO's low-frequency weakness is less obvious than WMA's. You can hear the difference clearly when you use the PRO switch of the RCA mp3PRO Audio Player to toggle between PRO mode and normal mode. Overall, 64 kbps mp3PRO reaches the sound quality of 128 kbps MP3, with a slight edge in the high frequencies.

Features: the king of sound quality at low bitrates.

Suitable for: music appreciation with modest requirements.

4-8 APE

A new lossless audio codec that offers a compression ratio of 50-70%. That is hardly worth mentioning next to lossy codecs, but it is a great boon for those who pursue perfection. APE is truly lossless, not merely lossless to the ear, and its compression ratio is better than that of similar lossless formats.

Features: very good sound quality.

Suitable for: music appreciation and collection at the highest quality.

 

Third, the audio signal encoding processing

1. PCM coding

 

PCM is the abbreviation for pulse code modulation. We described the general PCM workflow earlier. We do not need to care how the final PCM code is computed; we only need to know the advantages and disadvantages of a PCM-encoded audio stream. Its biggest advantage is good sound quality; its biggest drawback is large size. The common audio CD uses PCM encoding, and a single disc can only hold 72 minutes of music.

As you know, however powerful today's multimedia computers are, internally they can only handle digital information, while the sound we hear is an analog signal. How can we get the computer to process sound data? What is the difference between analog audio and digital audio? What are the advantages of digital audio? These are the topics introduced below.

Converting analog audio into digital audio is called sampling in computer music. The main hardware device used in the process is the analog-to-digital converter (ADC). Sampling actually converts the electrical signal of ordinary analog audio into a series of binary digits, 0 and 1, called bits, and these make up a digital audio file. In the figure, the sinusoidal curve represents the original audio waveform and the color-filled squares represent the result of sampling; the more closely the two match, the better the sampling result.


The horizontal axis is the sampling frequency and the vertical axis is the sampling resolution. From left to right the grid becomes progressively finer: first the density along the horizontal axis increases, then the density along the vertical axis. Obviously, the smaller the unit on the horizontal axis, that is, the shorter the interval between two samples, the better the original sound is preserved. In other words, the higher the sampling frequency, the better the sound quality. Similarly, the smaller the unit on the vertical axis, the better the sound quality; that is, the more bits per sample, the better.

Please note that 8-bit does not mean the vertical axis is divided into 8 parts but into 2^8 = 256 parts; likewise 16-bit means 2^16 = 65536 parts and 24-bit means 2^24 = 16777216 parts. Now let's do a calculation to see how much data a digital audio file holds. Let's say we use 44.1 kHz, 16-bit, stereo (i.e. two channels)
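
As a rough sketch of this kind of calculation, the script below lists the number of quantization levels for each bit depth and estimates the raw PCM size of 72 minutes of CD-quality audio, the disc length mentioned earlier. The function names are made up for this example, and the result ignores any file header or disc overhead.

```python
def quantization_levels(bits):
    """Number of distinct amplitude values a sample of this size can hold."""
    return 2 ** bits

def pcm_size_bytes(sample_rate_hz, bits, channels, seconds):
    """Raw size of uncompressed PCM audio, without any file header."""
    return sample_rate_hz * bits * channels * seconds // 8

levels = {bits: quantization_levels(bits) for bits in (8, 16, 24)}
cd_bytes = pcm_size_bytes(44_100, 16, 2, 72 * 60)

print(levels)                      # {8: 256, 16: 65536, 24: 16777216}
print(cd_bytes)                    # 762048000
print(round(cd_bytes / 1e6), "MB") # 762 MB
```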

2. WAVE

This is a long-established audio file format developed by Microsoft. WAV conforms to the RIFF (Resource Interchange File Format) specification. Every WAV file has a header that records the encoding parameters of the audio stream. WAV does not mandate a particular encoding for the audio stream: besides PCM, almost any codec that supports the ACM specification can encode a WAV audio stream. Many people are unfamiliar with this idea, so let us use AVI as an illustration, because AVI and WAV are very similar in file structure, except that AVI also carries a video stream. We come across many kinds of AVI and often need to install a decoder to watch some of them. DivX, which we meet often, is a video codec; an AVI can compress its video stream with DivX and of course with other codecs as well. Similarly, WAV can compress its audio stream with a variety of audio codecs. The WAV files we usually see carry PCM audio, but that does not mean WAV can only use PCM: MP3 encoding can also be used in WAV, and, as with AVI, once the corresponding decoder is installed you can play these WAV files.
On the Windows platform, PCM-encoded WAV is the best-supported audio format and all audio software handles it perfectly. Because it meets high sound quality requirements, WAV is also the preferred format for music editing and is suitable for saving music material. PCM-encoded WAV is therefore often used as an intermediate format when converting between other codecs, such as MP3 to WMA.
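
To see the header parameters of a PCM WAV file, Python's standard `wave` module is enough. The sketch below writes one second of silence into memory and reads the header back; the helper name `describe_wav` is made up for this example. Note that the `wave` module handles PCM-encoded WAV, so it does not demonstrate the other codecs mentioned above.

```python
import io
import wave

def describe_wav(file_obj):
    """Read the encoding parameters stored in a PCM WAV header."""
    with wave.open(file_obj, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "frames": wav.getnframes(),
        }

# Build a small CD-style WAV in memory: 2 channels, 16-bit, 44.1 kHz.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)                        # 2 bytes = 16 bits
    wav.setframerate(44_100)
    wav.writeframes(b"\x00\x00" * 2 * 44_100)  # one second of silence

header_start = buffer.getvalue()[:12]          # RIFF tag, size, WAVE tag
buffer.seek(0)
info = describe_wav(buffer)
print(header_start[:4], header_start[8:12])    # b'RIFF' b'WAVE'
print(info)
```

The first bytes show the RIFF container tag, and the values read back match the ones written into the header.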

3. MP3 coding

MP3 is the most popular audio compression format and is widely accepted. MP3-related software is plentiful, and more and more hardware supports MP3: many of the VCD/DVD players on sale can play it, and there are ever more portable MP3 players. Several major music companies detest this open format, but they cannot stop it from surviving and spreading. MP3 has been in development for 10 years. The name is short for MPEG (Moving Picture Experts Group) Audio Layer-3; it is a coding scheme derived from MPEG-1, developed successfully in 1993 by the Fraunhofer IIS institute in Germany together with Thomson. MP3 achieves a striking 12:1 compression ratio while keeping basically listenable sound quality. In the days when hard disk space was precious MP3 was quickly taken up by users, and as the network spread it was accepted by hundreds of millions. When first released, MP3 coding technology was far from perfect: for lack of research into sound and human hearing, almost all early MP3 encoders worked crudely and sound quality suffered badly. With the continual introduction of new techniques MP3 coding has been improved again and again, including 2 significant technical advances.
VBR: MP3 files have an interesting property: they can be played while being read, which matches the most basic characteristic of streaming media. The player does not need to read the whole file in advance; it reads whatever part it is playing, and it can keep playing even if part of the file is damaged. An MP3 file may have a file header, but the header is not essential to the format. Because of this, each frame of an MP3 file can have its own data rate without requiring any special decoding scheme. Hence the technique called VBR (variable bitrate), which lets each section or even each frame of an MP3 file have its own bitrate. The benefit is the smallest possible file size for a given quality. The advantage is obvious, but the technique is hard to apply, because the encoder must know how to allocate the bitrate to each section; for an encoder without waveform analysis it is a sham. For that reason VBR did not look dazzling when it first appeared.
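
To illustrate per-frame bitrates, the sketch below estimates the average bitrate and size of a short VBR stream. It assumes MPEG-1 Layer III frames of 1152 samples each, uses made-up per-frame bitrates, and ignores frame padding and tags, so the size is only approximate.

```python
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III frame length in samples

def frame_duration_s(sample_rate_hz):
    """Playing time of one frame in seconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz

def average_bitrate_bps(frame_bitrates_bps):
    """All frames last equally long, so the plain mean is the average rate."""
    return sum(frame_bitrates_bps) / len(frame_bitrates_bps)

def stream_size_bytes(frame_bitrates_bps, sample_rate_hz):
    """Approximate size: each frame contributes bitrate x duration / 8 bytes."""
    seconds = frame_duration_s(sample_rate_hz)
    return sum(rate * seconds / 8 for rate in frame_bitrates_bps)

# Hypothetical encoder decisions: more bits for complex frames, fewer for simple ones.
frames = [128_000, 192_000, 64_000, 128_000]

average = average_bitrate_bps(frames)
size = stream_size_bytes(frames, 44_100)
```

Here the average works out to 128 kbps even though individual frames range from 64 to 192 kbps, which is the point of VBR: bits are spent where the signal needs them.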

Through long-term acoustic research, experts found that the human ear exhibits a masking effect. Sound is a wave of energy travelling through air or another medium. The ear's most direct response to sound energy is loudness, or sound pressure, that is, how loud the sound seems; it is expressed in decibels (dB). Even at the same sound pressure, sounds seem louder or quieter depending on their frequency. The ear is most sensitive at around 4000 Hz; moving either higher or lower in frequency, a sound of the same level seems progressively quieter. Once the level falls below a certain threshold, the ear cannot hear the sound at all, and that threshold is different for each frequency.
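The article gives no formula for this threshold. As a rough sketch, the code below assumes a widely quoted approximation of the threshold in quiet attributed to Terhardt; the formula and the function names are not from the original text, and real hearing varies from person to person.

```python
import math

def hearing_threshold_db(freq_hz):
    """Approximate threshold in quiet, in dB SPL (Terhardt-style formula, assumed)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

def is_audible(freq_hz, level_db):
    """True if a lone tone at this level lies above the approximate threshold."""
    return level_db > hearing_threshold_db(freq_hz)

for f in (100, 1000, 4000, 15000, 20000):
    print(f"{f:>6} Hz: threshold about {hearing_threshold_db(f):6.1f} dB")
```

With this approximation the threshold is lowest in the region of 3 to 4 kHz and rises steeply toward both ends of the spectrum, which matches the behaviour described above.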

Plotted against frequency, this threshold forms a roughly V-shaped curve. Above 15000 Hz the ear perceives sounds as very quiet, and many people whose hearing is less than perfect cannot hear 20000 Hz at all, however loud it is. When the ear hears two sounds of different frequency and different loudness at the same time, the quieter one may go unnoticed. For example, during the day we do not hear the cooling fan in a computer, but at night it becomes a source of noise. Using this principle, an encoder can filter out many inaudible sounds, which simplifies the information and raises the compression ratio without a noticeable loss of sound quality. This is called the simultaneous masking effect. When sound A is masked by sound B, the masking is more pronounced if A lies within a certain range centered on B; this range is called the critical bandwidth. The critical bandwidth differs from frequency to frequency: the higher the frequency, the wider the critical bandwidth.

Frequency (Hz)   Critical Bandwidth (Hz)   Frequency (Hz)   Critical Bandwidth (Hz)
50               80                        1850             280
150              100                       2150             320
350              100                       2500             380
450              110                       3400             550
570              120                       4000             700
700              140                       4800             900
840              150                       5800             1100
1000             160                       7000             1300
1170             190                       8500             1800
1370             210                       10500            2500
1600             240                       13500            3500
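The table can be turned into a small lookup to illustrate the idea of simultaneous masking. This is a simplified sketch, not how a real MP3 encoder decides: it assumes linear interpolation between table entries and treats a quieter tone as "likely masked" whenever it falls inside the critical band centered on a louder tone. The function names are invented for the example.

```python
from bisect import bisect_left

# (center frequency in Hz, critical bandwidth in Hz), taken from the table above.
CRITICAL_BANDS = [
    (50, 80), (150, 100), (350, 100), (450, 110), (570, 120), (700, 140),
    (840, 150), (1000, 160), (1170, 190), (1370, 210), (1600, 240),
    (1850, 280), (2150, 320), (2500, 380), (3400, 550), (4000, 700),
    (4800, 900), (5800, 1100), (7000, 1300), (8500, 1800),
    (10500, 2500), (13500, 3500),
]
_FREQS = [f for f, _ in CRITICAL_BANDS]

def critical_bandwidth_hz(freq_hz):
    """Critical bandwidth at freq_hz, linearly interpolated between table rows."""
    if freq_hz <= _FREQS[0]:
        return CRITICAL_BANDS[0][1]
    if freq_hz >= _FREQS[-1]:
        return CRITICAL_BANDS[-1][1]
    i = bisect_left(_FREQS, freq_hz)
    (f0, b0), (f1, b1) = CRITICAL_BANDS[i - 1], CRITICAL_BANDS[i]
    return b0 + (b1 - b0) * (freq_hz - f0) / (f1 - f0)

def likely_masked(quiet_hz, quiet_db, loud_hz, loud_db):
    """Crude test: the quieter tone sits inside the louder tone's critical band."""
    half_band = critical_bandwidth_hz(loud_hz) / 2
    return quiet_db < loud_db and abs(quiet_hz - loud_hz) <= half_band

print(likely_masked(1050, 40, 1000, 70))  # 50 Hz away from a louder 1000 Hz tone
print(likely_masked(2000, 40, 1000, 70))  # 1000 Hz away from the same tone
```

A real psychoacoustic model also accounts for how far the levels differ and how masking spreads beyond the critical band; the sketch only captures the "inside the band" idea from the text.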

Based on this effect, experts designed a psychoacoustic model of human hearing and built it into MP3 encoders. The result was a revolution in sound quality: MP3 had long carried a reputation for poor sound, and that stigma has now gradually been washed away. At this point VBR, which had long been overlooked, finally showed its worth: combined with the psychoacoustic model, it became a genuinely powerful and attractive technique.
For a long time many people had a poor impression of MP3, and quite a few believed that WMA offers the best sound quality and beats MP3. This is not correct. At high bitrates a properly encoded MP3 is much better than WMA and comes very close to CD quality; on playback hardware of modest quality, few people can tell such an MP3 from the CD. This is not a myth: in the past you could easily distinguish an MP3 from a CD, but today you cannot be sure of telling them apart correctly. MP3 is an excellent coding scheme whose merits were simply buried for a long time.

4. OGG Code

A new audio encoding called Ogg Vorbis has appeared on the Internet, billed as the MP3 killer. What exactly is Ogg Vorbis? Ogg is the name of a large multimedia development project that covers codec development for video, audio, and other areas. The whole Ogg project aims to give everyone a completely free multimedia coding scheme; its creed is "open" and "free". Vorbis is a character in Terry Pratchett's fantasy novel "Small Gods", and the name was adopted as the official name of the audio encoding in the Ogg project. Vorbis has now been developed successfully, and an encoder is available.

Ogg Vorbis is a high-quality audio coding scheme. Official figures show that it can achieve better sound quality than MP3 at relatively low data rates. Ogg Vorbis is also considerably more advanced than MP3, a success of the 1990s, in that it supports multichannel audio. What does that mean? It means that, given SACD, DTS CD, or DVD-Audio capture software (which is not yet available), Ogg Vorbis could encode all the channels, whereas MP3 can encode only 2. The rise of multichannel music brings a revolutionary change to music appreciation, especially for symphonic works, where it adds realism. This is a change that MP3 cannot adapt to.

Like MP3, Ogg Vorbis is a flexible, open audio codec: after the coding scheme has been fixed, the sound quality can still be tuned and new algorithms added, so its sound quality will keep improving. As with MP3, Ogg Vorbis is more like an audio coding framework into which new techniques can be brought over time. Like MP3, Ogg Vorbis also supports VBR.

5. MPC Code

MPC is another contender of impressive strength. Its rise has been very low-key, with no complicated background story; it exists for a single purpose: better sound in a smaller file. MPC was formerly known as MP+, a name that made its intended rival obvious. Anyone who has used this encoding comes away with a deep impression of its outstanding sound quality.

6. mp3PRO Code

On June 14, 2001, Thomson Multimedia SA and the Fraunhofer Institute released a new music format named mp3PRO. It is an improvement built on MP3 coding technology, and the features listed in the official announcement look quite appealing. According to various sources, mp3PRO is not a completely new format: it is based on traditional MP3 coding, and its main technical highlight is SBR (Spectral Band Replication), a new enhancement algorithm for audio coding. SBR makes it possible to improve audio and speech coding at low bitrates, either by widening the audio bandwidth or by raising coding efficiency at a given bitrate. Its biggest advantage is very efficient coding at low data rates. Unlike traditional encoding techniques, SBR works more like post-processing, so the quality of the decoder's algorithm directly affects the sound. The high frequencies are actually generated by the decoder (the player); the SBR data is more like a set of instructions for the high frequencies, or guidance information for the signal source, which is somewhat like the way MIDI works. An mp3PRO stream is thus a mix of an MP3 signal stream and an SBR signal stream. Published figures indicate that SBR improves high-frequency sound quality at low data rates by about 30%. However that 30% was measured, we can expect the improvement to bring a 64 kbps MP3 up to the sound quality of a 128 kbps MP3. (Note: under the same encoding conditions, sound quality does not rise in proportion to the data rate, at least not as the ear perceives it.) This is broadly consistent with the official claim that 64 kbps mp3PRO is comparable to 128 kbps MP3.
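The idea that the decoder generates the high frequencies from a few instructions can be shown with a toy model. This is only an illustration of the principle and not the mp3PRO algorithm: it assumes a spectrum given as a plain list of magnitudes, an encoder that keeps the low band plus one average level per group of high-band bins, and a decoder that copies low-band bins upward and rescales them. All names are hypothetical, and the high band is assumed to divide evenly into groups.

```python
def sbr_encode(spectrum, crossover, groups):
    """Keep the low band; describe the high band by one mean level per group."""
    low, high = spectrum[:crossover], spectrum[crossover:]
    size = len(high) // groups  # assumption: len(high) is a multiple of groups
    envelope = [sum(high[g * size:(g + 1) * size]) / size for g in range(groups)]
    return low, envelope

def sbr_decode(low, envelope, high_len):
    """Rebuild the high band by copying low-band bins and scaling each group."""
    size = high_len // len(envelope)
    high = []
    for g, target in enumerate(envelope):
        # Copy a patch of low-band bins, wrapping around if the band is short.
        patch = [low[(g * size + i) % len(low)] for i in range(size)]
        mean = sum(patch) / size
        gain = target / mean if mean else 0.0
        high.extend(m * gain for m in patch)
    return low + high

# Hypothetical 16-bin magnitude spectrum that falls off toward high frequencies.
original = [8, 7, 6, 5, 4, 3, 2, 1, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
low, envelope = sbr_encode(original, crossover=8, groups=2)
rebuilt = sbr_decode(low, envelope, high_len=8)
print(len(low) + len(envelope), "values stored instead of", len(original))
```

The stored data is much smaller than the full spectrum, and the quality of the rebuilt high band depends entirely on how the decoder fills it in, which is the point the paragraph above makes about decoder quality.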

7. WMA

WMA is the Windows Media Audio file format, developed by Microsoft. WMA is aimed not at standalone playback but at the network, where its competitor is RealNetworks, a well-known name in the online media market. Microsoft claims that at a bitrate of only 64 kbps, WMA achieves sound quality close to CD. Unlike the encodings described earlier, WMA supports copy protection: protection can be added through Windows Media Rights Manager, which can limit the playback period, the number of plays, and even the machines a file can be played on. WMA supports streaming, that is, playing while reading, so it is easy to use for online broadcasting. Because it is Microsoft's own product, Microsoft added WMA support to Windows. WMA has strong technical features and, with Microsoft's heavy promotion, the format is being accepted by more and more people.

8. RA

RA is the RealAudio format, one that many Internet enthusiasts come across constantly: most music websites use RealAudio for online previews. The format is aimed squarely at the network media market and supports a rich set of features. Its biggest highlight is that it can adjust its bitrate to the listener's bandwidth, maximizing sound quality while keeping playback smooth. RA can carry several audio encodings, including ATRAC3. Like WMA, RA supports playing while reading, and it can also use special protocols to hide a file's real network address, so that a file can be played online but not downloaded. This matters to record companies and record retailers, and RA and WMA are the most popular audio formats for online listening on the Internet.
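Matching the bitrate to the listener's bandwidth can be sketched as a simple selection rule. This is not RealAudio's actual mechanism, which the article does not describe; the list of available bitrates, the headroom factor, and the function name are all assumptions made for illustration.

```python
# Hypothetical set of bitrates at which one stream has been encoded.
AVAILABLE_BITRATES_KBPS = [16, 32, 64, 96, 128]

def pick_bitrate_kbps(bandwidth_kbps, available=AVAILABLE_BITRATES_KBPS, headroom=0.8):
    """Pick the highest bitrate that fits within a safety margin of the bandwidth.

    Falls back to the lowest available bitrate when nothing fits.
    """
    budget = bandwidth_kbps * headroom
    fitting = [b for b in available if b <= budget]
    return max(fitting) if fitting else min(available)

for link in (10, 56, 256, 1000):
    print(f"{link:>5} kbps link -> {pick_bitrate_kbps(link)} kbps stream")
```

Leaving some headroom keeps playback smooth when the measured bandwidth fluctuates, at the cost of not always using the best quality the link could carry.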

9. APE

APE is the lossless compression format of Monkey's Audio. Monkey's Audio provides a Winamp plug-in, which means that a compressed file is no longer just a compressed archive but an audio file that can be played directly, just like an MP3. The compression ratio of this format is much lower than that of the other formats described here, but it is truly lossless, which has won it the favor of many audiophiles. Among the many lossless compression schemes available, APE stands out for its satisfactory compression ratio and fast compression speed, and it has become the preferred choice of many enthusiasts for exchanging audiophile music privately.
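What "truly lossless" means can be checked with a round trip: the decompressed data must match the original PCM bit for bit. The sketch below uses Python's general-purpose zlib as a stand-in; it is not the Monkey's Audio algorithm, and the test signal and function name are assumptions made for the example.

```python
import math
import struct
import zlib

def sine_pcm16(freq_hz=440.0, seconds=0.5, sample_rate=44_100):
    """Generate a mono 16-bit little-endian PCM sine tone as raw bytes."""
    n = int(seconds * sample_rate)
    samples = (int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
               for i in range(n))
    return struct.pack(f"<{n}h", *samples)

pcm = sine_pcm16()
packed = zlib.compress(pcm, 9)        # stand-in lossless compressor
restored = zlib.decompress(packed)

# Lossless means every byte of the original PCM comes back unchanged.
assert restored == pcm
print(f"original {len(pcm)} bytes, compressed {len(packed)} bytes")
```

A lossy format such as MP3 would fail the equality check by design, since it discards information the psychoacoustic model judges inaudible.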

By:yangchen
