Codec learning notes (1): Basic Concepts

Source: Internet
Author: User

Media services are among the core services of modern networks. With the rise of mobile Internet businesses in particular, media traffic is extremely heavy for both operators and application developers, and media codec work touches requirements analysis, application development, and even license fees. A recent project required me to get a clear picture of media codecs. At first I read the operators' specifications and standards on the Douding document-sharing site, but the same operator's business has different requirements in different documents, and some requirements appear to be historical leftovers that are rarely used today, so Douding alone did not make things clear. Wikipedia helped more: the Chinese Wikipedia has limited and very short coverage, while the English Wikipedia is far more detailed, and the abridged Chinese versions lose too much. Online I also ran across a knock-off Chinese Wikipedia, very similar in appearance, calling itself a "global wiki". The Chinese articles are still quite good, but I recommend reading the English versions afterwards.

These notes summarize what I learned about media codecs. Most of the material comes from Wikipedia, and a small part from online blogs. I give the source of the information where possible; for material that has been reposted several times, I can only cite the trail I found.

Basic Concepts

Codec

A codec is a device or program that transforms a signal or data stream. The transformation includes encoding a signal or data stream into an encoded stream (usually for transmission, storage, or encryption), and decoding, i.e., recovering from the encoded stream a form suitable for viewing or further processing. Codecs are widely used in video conferencing, streaming media, and similar applications.

Container

Many multimedia data streams must carry both audio and video, often together with metadata for audio/video synchronization, such as subtitles. These three kinds of data may be produced by different programs, processes, or hardware, but for transmission or storage they are typically encapsulated together. This encapsulation is usually realized by a video file format, for example the common *.mpg, *.avi, *.mov, *.mp4, *.rm, *.ogg, or *.tta. Some of these formats are tied to particular codecs, but most act as containers that can hold streams produced by a variety of codecs.

FourCC stands for Four-Character Code. It consists of 4 characters (4 bytes) and uniquely identifies the format of a video data stream. WAV and AVI files contain a FourCC field that describes which codec was used to encode the data, which is why codes such as "IDP3" appear in many WAV and AVI files.
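As a small illustration of the "4 characters = 4 bytes" idea, the sketch below packs a FourCC into the little-endian 32-bit integer form used inside RIFF (WAV/AVI) chunk headers and recovers it again. The function names are my own, not from any library:

```python
import struct

def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into the little-endian 32-bit
    integer form used inside RIFF (WAV/AVI) chunk headers."""
    if len(code) != 4:
        raise ValueError("a FourCC must be exactly 4 characters")
    return struct.unpack("<I", code.encode("ascii"))[0]

def int_to_fourcc(value: int) -> str:
    """Recover the four-character code from its integer form."""
    return struct.pack("<I", value).decode("ascii")
```

For example, the RIFF file signature "RIFF" packs to 0x46464952, because the first character ends up in the least significant byte.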

Video is an important part of multimedia systems on computers. To meet the needs of video storage, people have defined various video file formats that put video and audio into a single file for synchronized playback. A video file is in effect a container holding different tracks, and the container format used determines the extensibility of the video file.

Parameter Introduction

Sampling Rate

The sampling rate (also known as the sampling speed or sampling frequency) is the number of samples extracted per second from a continuous signal to form a discrete signal, expressed in Hz. The reciprocal of the sampling frequency is the sampling period or sampling time, i.e., the interval between samples. Do not confuse the sampling rate with the bit rate.

The sampling theorem states that the sampling frequency must be more than twice the bandwidth of the sampled signal; an equivalent statement is that the Nyquist frequency (half the sampling frequency) must exceed the bandwidth of the sampled signal. If the signal bandwidth is 100 Hz, the sampling frequency must be greater than 200 Hz to avoid aliasing. In other words, the sampling frequency must be at least twice the highest frequency in the signal, or the original signal cannot be reconstructed from its samples.
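The aliasing effect described above can be sketched numerically: a frequency component above the Nyquist limit "folds" back into the 0..Nyquist band. A minimal illustration (the function name is mine, not from the original article):

```python
def alias_frequency(f_signal: float, f_sample: float) -> float:
    """Apparent frequency after sampling: components above the
    Nyquist limit (f_sample / 2) fold back into the 0..Nyquist band."""
    f = f_signal % f_sample
    return min(f, f_sample - f)
```

With a 200 Hz sampling rate, a 150 Hz tone is indistinguishable from a 50 Hz tone, while an 80 Hz tone (below the 100 Hz Nyquist limit) passes through unchanged.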

For Voice Sampling:

  • 8,000 Hz – telephone; adequate for human speech

  • 11,025 Hz – one quarter of the CD sampling rate
  • 22,050 Hz – radio broadcasting
  • 32,000 Hz – miniDV digital camcorders and DAT (LP mode)
  • 44,100 Hz – audio CD; also commonly used for MPEG-1 audio (VCD, SVCD, MP3)
  • 47,250 Hz – the world's first commercial PCM recorder, developed by Nippon Columbia (Denon)
  • 48,000 Hz – miniDV, digital TV, DVD, DAT, film sound, and professional audio
  • 50,000 Hz – the first commercial digital recorders, developed by 3M and Soundstream in the late 1970s
  • 50,400 Hz – the Mitsubishi X-80 digital recorder
  • 96,000 or 192,000 Hz – DVD-Audio, some LPCM DVD tracks, Blu-ray Disc audio tracks, and HD DVD audio tracks
  • 2.8224 MHz – SACD; the 1-bit sigma-delta modulation process known as Direct Stream Digital, developed jointly by Sony and Philips

In analog video, the sampling rate is defined as the frame rate and field rate, rather than a notional pixel clock. The image sampling frequency is the repetition rate of the sensor's integration period. Because the integration period can be much shorter than the repetition interval, the sampling frequency may differ from the reciprocal of the sampling (integration) time.

  • 50 Hz – PAL video

  • 59.94 Hz (60/1.001 Hz) – NTSC video

When analog video is converted to digital video, a different sampling process takes place, this time at the pixel frequency. Some common pixel sampling rates include:

  • 13.5 MHz-CCIR 601, D1 video

Resolution

Resolution refers to the ability of a measurement or display system to distinguish detail. The concept applies in time, space, and other domains, but is most often used for image sharpness. The higher the resolution, the better the image quality and the more detail it shows; however, the more information recorded, the larger the file. Today, image-processing software such as Photoshop or PhotoImpact can be used to resize and edit images on a personal computer.

Image resolution:

Describes the detail an image holds. It applies to digital images, film images, and other types of images, and is often measured in lines per millimeter or lines per inch. More commonly, resolution is given as the pixel count in each direction, such as 640×480. It can also be expressed as pixels per inch (ppi) together with the physical length and width of the image, for example 72 ppi and 8×6 inches.
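The ppi form and the pixel-count form are related by a simple multiplication. A toy conversion (the function name is mine):

```python
def pixel_dimensions(ppi: float, width_in: float, height_in: float):
    """Pixel dimensions of an image, given its pixel density
    (pixels per inch) and its physical size in inches."""
    return round(ppi * width_in), round(ppi * height_in)
```

For instance, a 72 ppi image measuring 8×6 inches is 576×432 pixels.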

Video resolution:

The image size of a video across the various TV specifications is called its resolution. Digital video is measured in pixels, while analog video is measured by the number of horizontal scan lines. SD digital video resolution is 720/704/640×480i60 (NTSC) or 768/720×576i50 (PAL/SECAM). The newer HD TV resolution is 1920×1080p60, that is, each horizontal scan line contains 1920 pixels, each frame has 1080 scan lines, and playback runs at 60 frames per second.
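These figures also show why compression (covered below) is necessary. A toy calculation of the uncompressed data rate, assuming 24 bits per pixel (the function name is mine):

```python
def raw_video_rate_bps(width: int, height: int, fps: int,
                       bits_per_pixel: int = 24) -> int:
    """Uncompressed video data rate in bits per second:
    pixels per frame x frames per second x bits per pixel."""
    return width * height * fps * bits_per_pixel
```

For 1920×1080 at 60 frames per second this comes to 2,985,984,000 bits per second, roughly 3 Gbit/s before any compression.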

FPS

Frame rate, often rendered as "image update rate", refers to the number of still images displayed per second in a video format. Typical frame rates range from 6 or 8 frames per second (fps) up to 120 fps. PAL (the television broadcast standard in Europe, Asia, Australia, and elsewhere) and SECAM (used in France, Russia, parts of Africa, and elsewhere) specify an update rate of 25 fps, while NTSC (used in the United States, Canada, Japan, and elsewhere) specifies 29.97 fps. Film is shot at the slightly slower rate of 24 fps, which forces broadcasters in the various regions to run movies through conversion procedures (see telecine). About 10 fps is the minimum needed to create an impression of continuous motion.
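As one concrete example of the conversion problem: a common PAL transfer simply plays 24 fps film frame-for-frame at 25 fps, which shortens the runtime by a factor of 24/25. A toy calculation (the function name is mine; this is only one of several telecine variants):

```python
def pal_speedup_runtime(film_seconds: float) -> float:
    """Runtime when 24 fps film is played frame-for-frame at 25 fps
    (the simple PAL 'speedup' transfer): duration shrinks by 24/25."""
    return film_seconds * 24 / 25
```

A 100-minute film (6000 seconds) plays in 96 minutes (5760 seconds) on PAL television this way.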


Compression Method

Lossy compression and lossless compression

The concepts of lossy and lossless in video compression are essentially the same as for still images. Lossless compression means the data after decompression is exactly identical to the data before compression; most lossless schemes use the run-length encoding (RLE) algorithm. Lossy compression means the decompressed data differs from the original: image or audio information to which human eyes and ears are insensitive is discarded during compression, and the lost information cannot be recovered. Almost all high-ratio compression algorithms are lossy, since that is the only way to reach low data rates. The amount lost is tied to the degree of compression: the stronger the compression, the more data is lost and the worse the decompressed result. In addition, repeatedly applying a lossy algorithm compounds the data loss each time.

  • Lossless formats, such as WAV, PCM, TTA, FLAC, AU, APE, TAK, WavPack (WV)

  • Lossy formats, such as MP3, Windows Media Audio (WMA), Ogg Vorbis (OGG), and AAC
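The run-length encoding mentioned above can be sketched in a few lines. A minimal illustration (function names are my own, not from any standard library):

```python
def rle_encode(data: bytes):
    """Run-length encode: collapse each run of identical bytes
    into a (count, value) pair."""
    runs = []
    for b in data:
        if runs and runs[-1][1] == b:
            runs[-1] = (runs[-1][0] + 1, b)  # extend the current run
        else:
            runs.append((1, b))              # start a new run
    return runs

def rle_decode(runs) -> bytes:
    """Invert rle_encode exactly -- the scheme is lossless."""
    return b"".join(bytes([v]) * n for n, v in runs)
```

Decoding always reproduces the input byte-for-byte, which is exactly the lossless property described above; the savings depend on how long the runs are.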

Intra-frame compression and inter-frame Compression

Intra-frame (intraframe) compression is also called spatial compression. When compressing a frame, only the data of the current frame is considered, without exploiting redundancy between adjacent frames; this is essentially the same as still-image compression. Intra-frame compression is generally lossy. Because frames are compressed independently of one another, intra-frame-compressed video can still be edited frame by frame. On its own, intra-frame compression generally cannot achieve very high compression ratios.

Inter-frame (interframe) compression exploits the correlation between consecutive frames: in most video or animation, adjacent frames change very little, which means consecutive frames contain large amounts of redundant information. Removing this redundancy between adjacent frames further increases the achievable compression. Inter-frame compression is also called temporal compression, since it compresses across frames along the time axis; it is generally lossless. Frame differencing is a typical temporal compression method: by comparing the current frame with an adjacent frame and recording only the differences between them, the amount of data can be greatly reduced.
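The frame-differencing idea above can be sketched as follows, treating a frame as a flat list of pixel values (function names and the dict-based diff format are my own simplification, not a real codec's):

```python
def frame_diff(prev, curr):
    """Record only the pixels that changed since the previous frame,
    as {pixel_index: new_value}."""
    return {i: v for i, (p, v) in enumerate(zip(prev, curr)) if p != v}

def apply_diff(prev, diff):
    """Reconstruct the current frame from the previous frame plus the
    recorded differences -- no information is lost."""
    frame = list(prev)
    for i, v in diff.items():
        frame[i] = v
    return frame
```

If only one pixel changes between two frames, the diff stores a single entry instead of the whole frame, which is where the large savings for mostly-static video come from.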

Symmetric encoding and asymmetric Encoding

Symmetry is a key property of a compression codec. Symmetric means that compression and decompression consume similar computing power and time; symmetric algorithms suit real-time compression and transmission of video, and are used, for example, in video conferencing. In electronic publishing and other multimedia applications, video is usually compressed in advance and played back later, so asymmetric encoding can be used. Asymmetric means that compression requires far more processing power and time than decompression, which can play back in real time; that is, compression and decompression run at different speeds. In general, compressing a video clip takes much longer than decompressing it: compressing a three-minute clip may take more than ten minutes, while playing it back takes only the three minutes of real time.

 

Sources other than wiki: http://tech.lmtw.com/csyy/Using/200411/3142.html
