For a project at my company, I had to pick up some knowledge of the H264 video codec. The project previously used the FFmpeg multimedia library and did video encoding and decoding on the CPU, commonly known as software encoding and decoding. This approach is common, but it consumes CPU resources and its codec efficiency is not high. Systems generally provide a GPU or a dedicated processor to encode and decode video streams, which is called hardware encoding and decoding. Before iOS 8.0, Apple did not open up hardware encoding and decoding capabilities on iOS, but Mac OS has always had a framework called Video Toolbox for hardware encoding and decoding; with iOS 8.0, Apple finally brought this framework to iOS.
As a result, developers can use the interfaces provided by the Video Toolbox framework on iOS to do hardware encoding and decoding, which is convenient for video codecs in VoIP video calls and video streaming.
(PS: According to Apple's WWDC 2014 Session 513, "Direct Access to Video Encoding and Decoding", Apple's earlier AVFoundation framework also used hardware to encode and decode video, but the encoded data was written directly to a file and decoded data was displayed directly. The Video Toolbox framework, by contrast, exposes the encoded frames and the raw images after decoding, so it offers much more flexibility for doing image processing.)
One, Video Toolbox basic data structures.
Data structures used by Video Toolbox before and after encoding and decoding:
(1) CVPixelBuffer: uncompressed image data, used before encoding and after decoding.
(2) CMTime, CMClock and CMTimebase: timestamp-related; time is expressed as a 64-bit value over a 32-bit timescale.
(3) CMBlockBuffer: the data structure that holds the image after encoding.
(4) CMVideoFormatDescription: describes the image format, the codec, and other format information.
(5) CMSampleBuffer: a container data structure for storing video images both before and after encoding and decoding.
Figure 1.1 H264 video data structures before and after encoding and decoding
1.1, Before and after encoding and decoding, the video image is wrapped in a CMSampleBuffer: an encoded image is stored as a CMBlockBuffer, while a decoded image is stored as a CVPixelBuffer. The CMSampleBuffer also carries timing information (CMTime) and a video format description (CMVideoFormatDesc).
Two, how to use hardware decoding.
A typical application, shown in Figure 2.1, demonstrates how to use the hardware decoding interface. The scenario is receiving an H264-encoded video stream from the network and displaying it on the phone screen.
Figure 2.1 Typical H264 application scenario
1, Turn the H264 bitstream into the CMSampleBuffer needed for decoding.
As shown in Figure 1.1, the CMSampleBuffer before decoding = CMTime + FormatDesc + CMBlockBuffer. These three pieces of information must be extracted from the H264 bitstream and then assembled into a CMSampleBuffer that is passed to the hardware decoding interface.
An H264 bitstream is made up of NALU units, which contain the video image data and the H264 parameter information. The video image data becomes the CMBlockBuffer, and the H264 parameter information can be combined into the FormatDesc. Specifically, the parameter information consists of the SPS (Sequence Parameter Set) and the PPS (Picture Parameter Set). Figure 2.2 shows the structure of an H264 bitstream.
Figure 2.2 H264 bitstream structure
(1) Extract the SPS and PPS to generate the format description.
a, The start code of each NALU is 0x00 00 01; NALUs are located by searching for this start code.
b, Find the SPS and PPS by their type and extract them: the lower 5 bits of the first byte after the start code give the NALU type, 7 for SPS and 8 for PPS.
c, Use the CMVideoFormatDescriptionCreateFromH264ParameterSets function to build the CMVideoFormatDescriptionRef. The specific code can be seen in the demo; a sketch is also shown below.
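Assuming the raw SPS and PPS payloads (start codes already stripped) have been located as described above, a minimal sketch of step c might look like this; the pointer and size variable names are only illustrative:

    // pSPS/spsSize and pPPS/ppsSize point at the raw SPS/PPS payloads
    // (start codes removed) -- these names are only illustrative.
    const uint8_t *parameterSetPointers[2] = { pSPS, pPPS };
    const size_t parameterSetSizes[2] = { spsSize, ppsSize };
    CMVideoFormatDescriptionRef formatDesc = NULL;
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(
        kCFAllocatorDefault,
        2,                     // number of parameter sets (SPS + PPS)
        parameterSetPointers,
        parameterSetSizes,
        4,                     // the NALU length field used in step (2) is 4 bytes
        &formatDesc);
    if (status != noErr) {
        NSLog(@"Failed to create format description: %d", (int)status);
    }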
(2) Extract the video image data and generate the CMBlockBuffer.
a, Locate the NALU by its start code.
b, After confirming that its type is video data, replace the start code with the length of the NALU (4 bytes).
c, Use the CMBlockBufferCreateWithMemoryBlock interface to construct the CMBlockBufferRef. The specific code can be seen in the demo; a sketch is also shown below.
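A minimal sketch of step c, assuming naluData/naluSize cover the NALU whose start code has already been replaced by a 4-byte big-endian length (variable names are illustrative):

    CMBlockBufferRef blockBuffer = NULL;
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(
        kCFAllocatorDefault,
        naluData,              // memory block holding the length-prefixed NALU
        naluSize,              // total length of the block
        kCFAllocatorNull,      // CoreMedia must not free this memory
        NULL,                  // no custom block source
        0,                     // offset to the data within the block
        naluSize,              // length of data to use
        0,                     // flags
        &blockBuffer);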
(3) Generate the CMTime information as needed. (During actual testing, adding the time information produced unstable images, while leaving it out did not; this needs further study, so for now it is recommended not to add the time information.)
From the steps above we have the CMVideoFormatDescriptionRef, the CMBlockBufferRef and optional time information. Use the CMSampleBufferCreate interface to obtain the CMSampleBuffer, the raw data to be decoded. See Figure 2.3 for the H264 data conversion; a sketch of the assembly follows the figure.
Figure 2.3 Converting the H264 bitstream into a CMSampleBuffer
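A minimal sketch of assembling the CMSampleBuffer from the pieces above, with the timing information omitted as suggested in (3); formatDesc and blockBuffer come from the earlier steps, and naluSize is the same length used for the CMBlockBuffer:

    CMSampleBufferRef sampleBuffer = NULL;
    const size_t sampleSizes[] = { naluSize };   // size of the single sample
    OSStatus status = CMSampleBufferCreate(
        kCFAllocatorDefault,
        blockBuffer,          // encoded data from step (2)
        true,                 // data is ready
        NULL, NULL,           // no make-data-ready callback
        formatDesc,           // format description from step (1)
        1,                    // one sample
        0, NULL,              // no timing information
        1, sampleSizes,
        &sampleBuffer);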
2, Display the hardware-decoded image.
There are two ways to display the image after hardware decoding:
(1) Decode and display via the system-provided AVSampleBufferDisplayLayer.
AVSampleBufferDisplayLayer is a display layer provided by Apple specifically for displaying encoded H264 data. It is a subclass of CALayer and is therefore used much like any other CALayer. The layer has hardware decoding built in: it decodes the original CMSampleBuffer and displays the resulting image directly on the screen, which is very simple and convenient. Figure 2.4 shows this decoding process.
Figure 2.4 AVSampleBufferDisplayLayer displaying the image after hardware decoding
The interface used is [_avsLayer enqueueSampleBuffer:sampleBuffer]; a short setup sketch follows.
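A minimal sketch of creating the layer and feeding it samples; the _avsLayer ivar and the parent view are assumptions of this example:

    // Create the display layer once and add it to the view hierarchy.
    _avsLayer = [[AVSampleBufferDisplayLayer alloc] init];
    _avsLayer.frame = self.view.bounds;
    _avsLayer.videoGravity = AVLayerVideoGravityResizeAspect;
    [self.view.layer addSublayer:_avsLayer];

    // For each CMSampleBuffer assembled from the H264 stream:
    if ([_avsLayer isReadyForMoreMediaData]) {
        [_avsLayer enqueueSampleBuffer:sampleBuffer];
    }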
(2) Use the VTDecompression interface to decode the CMSampleBuffer into an image, and display the image via a UIImageView or OpenGL.
a, Initialize the VTDecompressionSession and configure the decoder. The initialization needs the FormatDescription carried in the CMSampleBuffer, as well as the storage format of the decoded image; the demo sets a CGBitmap format stored as RGB. After an encoded frame is decoded, a callback function is invoked and the decoded image is passed to it for further processing; in this callback we forward the decoded image for display. The callback pointer is passed as a parameter to the create function during initialization, and the session is finally created with the create interface.
b, The callback function described in a converts the CGBitmap image into a UIImage, which is sent through a queue to the view for display.
c, Call the VTDecompressionSessionDecodeFrame interface to perform the decoding. The decoded image is handed to the callback set up in steps a and b for further processing; a sketch of these steps follows.
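A minimal sketch of steps a–c under the assumptions above, with a BGRA output format and a callback named didDecompress (all names are illustrative):

    // Decode callback: receives the decoded CVPixelBuffer.
    static void didDecompress(void *decompressionOutputRefCon,
                              void *sourceFrameRefCon,
                              OSStatus status,
                              VTDecodeInfoFlags infoFlags,
                              CVImageBufferRef imageBuffer,
                              CMTime presentationTimeStamp,
                              CMTime presentationDuration) {
        if (status == noErr && imageBuffer != NULL) {
            // Convert imageBuffer into a UIImage / texture and hand it to the view.
        }
    }

    // a, Create the session from the stream's format description.
    NSDictionary *destAttrs = @{ (id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA) };
    VTDecompressionOutputCallbackRecord callbackRecord = { didDecompress, NULL };
    VTDecompressionSessionRef decodeSession = NULL;
    VTDecompressionSessionCreate(kCFAllocatorDefault,
                                 formatDesc,                       // built from the SPS/PPS
                                 NULL,
                                 (__bridge CFDictionaryRef)destAttrs,
                                 &callbackRecord,
                                 &decodeSession);

    // c, Decode one CMSampleBuffer; the result arrives in didDecompress.
    VTDecodeInfoFlags infoFlags = 0;
    VTDecompressionSessionDecodeFrame(decodeSession, sampleBuffer, 0, NULL, &infoFlags);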
Figure 2.5 shows the process steps for hard decoding.
Figure 2.5 VTDecompression hard decoding process
Three, how to use hardware encoding.
Hardware encoding is also described through a typical application scenario. First, images are captured from the camera; then the captured images are encoded in hardware; finally, the encoded data are assembled into an H264 bitstream and transmitted over the network.
1, Capture data from the camera.
For camera capture, the iOS system provides AVCaptureSession to collect image data from the camera. Set the capture resolution on the session, then set the input and the output. When setting the output, a delegate and an output queue need to be specified, and the captured images are processed in the delegate method.
Note that the images are output in the form of unencoded CMSampleBuffers; a setup sketch is shown below.
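A minimal sketch of the capture setup described above; the preset, the queue name, the _captureSession ivar and the fact that the delegate is self are assumptions of this example:

    - (void)setupCaptureSession {
        _captureSession = [[AVCaptureSession alloc] init];   // keep a strong reference (assumed ivar)
        _captureSession.sessionPreset = AVCaptureSessionPreset1280x720;   // capture resolution

        // Input: the default video camera.
        AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
        AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:camera error:nil];
        if ([_captureSession canAddInput:input]) {
            [_captureSession addInput:input];
        }

        // Output: uncompressed frames delivered to a delegate on a serial queue.
        AVCaptureVideoDataOutput *output = [[AVCaptureVideoDataOutput alloc] init];
        dispatch_queue_t captureQueue = dispatch_queue_create("capture.queue", DISPATCH_QUEUE_SERIAL);
        [output setSampleBufferDelegate:self queue:captureQueue];
        if ([_captureSession canAddOutput:output]) {
            [_captureSession addOutput:output];
        }

        [_captureSession startRunning];
    }

    // Delegate method: each captured frame arrives as an unencoded CMSampleBuffer.
    - (void)captureOutput:(AVCaptureOutput *)output
    didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
           fromConnection:(AVCaptureConnection *)connection {
        // Hand the frame to the encoder (see section 2 below).
    }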
2, Hardware encoding with VTCompressionSession.
(1) Initialize the VTCompressionSession.
When initializing the VTCompressionSession, you generally need to supply the width, the height, the encoder type kCMVideoCodecType_H264, and so on. Properties such as the frame rate are then set by calling the VTSessionSetProperty interface; the demo provides some reference settings, which seemed to have little effect during testing and may need further tuning. Finally, a callback function is set, which is called after a video frame has been encoded successfully. When everything is ready, use VTCompressionSessionCreate to create the session; a sketch follows.
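A minimal sketch of the initialization, assuming a 1280x720 stream and an output callback named didCompress (names are illustrative, and the property values are only reference settings):

    VTCompressionSessionRef encodeSession = NULL;
    VTCompressionSessionCreate(kCFAllocatorDefault,
                               1280, 720,                  // width, height
                               kCMVideoCodecType_H264,     // encoder type
                               NULL, NULL, NULL,
                               didCompress,                // called after each frame is encoded
                               (__bridge void *)self,      // refcon passed back to the callback
                               &encodeSession);

    // Optional reference settings.
    VTSessionSetProperty(encodeSession, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);
    VTSessionSetProperty(encodeSession, kVTCompressionPropertyKey_ProfileLevel,
                         kVTProfileLevel_H264_Baseline_AutoLevel);
    VTSessionSetProperty(encodeSession, kVTCompressionPropertyKey_ExpectedFrameRate,
                         (__bridge CFTypeRef)@(25));
    VTCompressionSessionPrepareToEncodeFrames(encodeSession);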
(2) Feed the raw image data captured by the camera into the VTCompressionSession for hardware encoding.
The image acquired by the camera arrives as a CMSampleBuffer. The CVPixelBufferRef is extracted from it with the system interface CMSampleBufferGetImageBuffer, and the hardware encoding interface VTCompressionSessionEncodeFrame is then used to encode the frame; when encoding succeeds, the callback function set at session initialization is called automatically. A sketch follows.
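A minimal sketch of feeding one captured frame to the encoder; frameCount is an illustrative counter used to build a presentation timestamp at an assumed 25 fps:

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CMTime pts = CMTimeMake(frameCount++, 25);   // illustrative timestamp
    VTEncodeInfoFlags infoFlags;
    VTCompressionSessionEncodeFrame(encodeSession,
                                    imageBuffer,
                                    pts,
                                    kCMTimeInvalid,   // duration not specified
                                    NULL,             // no per-frame properties
                                    NULL,             // no source frame refcon
                                    &infoFlags);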
(3) In the callback function, convert the successfully encoded CMSampleBuffer into an H264 bitstream and transmit it over the network.
This is essentially the reverse of hard decoding: parse out the SPS and PPS parameter sets and prepend the start code to assemble them into NALUs; extract the video data and convert the length field back into a start code so that it too becomes a NALU; then send the NALUs out. A sketch of the callback follows.
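A minimal sketch of the encode callback under these assumptions: the keyframe check is omitted, and writing bytes to the network is left as a hypothetical helper:

    static void didCompress(void *outputCallbackRefCon, void *sourceFrameRefCon,
                            OSStatus status, VTEncodeInfoFlags infoFlags,
                            CMSampleBufferRef sampleBuffer) {
        if (status != noErr) return;
        const char startCode[4] = {0x00, 0x00, 0x00, 0x01};

        // For a keyframe (check omitted here), fetch the SPS/PPS from the format
        // description and send them first, each prefixed with startCode.
        CMFormatDescriptionRef desc = CMSampleBufferGetFormatDescription(sampleBuffer);
        const uint8_t *sps, *pps; size_t spsSize, ppsSize;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(desc, 0, &sps, &spsSize, NULL, NULL);
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(desc, 1, &pps, &ppsSize, NULL, NULL);

        // Walk the block buffer, replacing each 4-byte big-endian length with a start code.
        CMBlockBufferRef dataBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        size_t totalLength = 0; char *dataPointer = NULL;
        CMBlockBufferGetDataPointer(dataBuffer, 0, NULL, &totalLength, &dataPointer);
        size_t offset = 0;
        while (offset + 4 < totalLength) {
            uint32_t naluLength;
            memcpy(&naluLength, dataPointer + offset, 4);
            naluLength = CFSwapInt32BigToHost(naluLength);
            // Send startCode followed by the naluLength bytes starting at
            // dataPointer + offset + 4 over the network (hypothetical helper).
            offset += 4 + naluLength;
        }
    }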
Figure 2.6 shows the entire hardware encoding processing flow.
Figure 2.6 Hardware encoding processing flow
Four, some coding notes on the hardware codec.
Video Toolbox is a low-level C library in the Core Foundation style, so it is written and used the same way as the other Core Foundation APIs. I remember that someone on GitHub wrapped it in a form that can be called conveniently from Objective-C, but I have forgotten the address; if I come across it again I will post the link.
How do I implement video hard decoding on the iOS platform?
On the iOS platform, there are generally three options for video decoding:
1, Software decoding: FFmpeg
Disadvantage: CPU consumption is too high; on an iPhone 4S it generally cannot keep up with 720P above 20 fps.
2, Hardware decoding scheme 1: use the private VideoToolbox interface
Advantage: very low CPU consumption, high decoding efficiency
Disadvantage: using the private VideoToolbox interface requires a jailbroken iOS device
3, Hardware decoding scheme 2: a combination of AVPlayer + HTTP server + HTTP Live Streaming
Advantage: very low CPU consumption, high decoding efficiency
Disadvantage: the video has latency, so it is not suitable for real-time video communication
Below is a flowchart for hardware decoding scheme 2:
I have implemented this scheme and verified its stability: on an iPhone 5, decoding 720P at 25 fps uses about 3% CPU.
The implementation source code is not open-sourced for the time being; if you need it, you can contact me. QQ: 349260360, email: [email protected]
Correction: with AVPlayer, the screen glitches when switching between TS segments. To join the TS slices seamlessly you must use AVQueuePlayer; this part of the solution still needs to be worked out...
On iOS, video recorded with the camera is in MOV format. Although MOV is compatible with MP4, some requirements call for MP4 video files. After searching StackOverflow, I found code to convert the video format, and I record it here.
AVURLAsset *avAsset = [AVURLAsset URLAssetWithURL:[NSURL fileURLWithPath:path] options:nil];
NSArray *compatiblePresets = [AVAssetExportSession exportPresetsCompatibleWithAsset:avAsset];
if ([compatiblePresets containsObject:AVAssetExportPresetLowQuality]) {
    // Pass-through export: the samples are not re-encoded, only re-wrapped into an MP4 container.
    AVAssetExportSession *exportSession = [[AVAssetExportSession alloc] initWithAsset:avAsset
                                                                           presetName:AVAssetExportPresetPassthrough];
    NSString *exportPath = [NSString stringWithFormat:@"%@/%@.mp4",
                            [NSHomeDirectory() stringByAppendingString:@"/tmp"], @"1"];
    exportSession.outputURL = [NSURL fileURLWithPath:exportPath];
    NSLog(@"%@", exportPath);
    exportSession.outputFileType = AVFileTypeMPEG4;
    [exportSession exportAsynchronouslyWithCompletionHandler:^{
        switch ([exportSession status]) {
            case AVAssetExportSessionStatusFailed:
                NSLog(@"Export failed: %@", [[exportSession error] localizedDescription]);
                break;
            case AVAssetExportSessionStatusCancelled:
                NSLog(@"Export cancelled");
                break;
            case AVAssetExportSessionStatusCompleted:
                NSLog(@"Conversion success");
                break;
            default:
                break;
        }
    }];
}