Implementation Scheme of MPEG-2 Audio and Video Editing Software Based on DirectShow/DES


Introduction

Both the Digital Video Broadcasting (DVB) standards established in Europe in 1994 and the Advanced Television Systems Committee (ATSC) standards approved by the US Federal Communications Commission (FCC) in 1996 adopt MPEG-2 for their video sections, and China's digital CATV network also follows the DVB standard. The appearance of the MPEG-2 standard has promoted the development of digital video services. More and more program exchange is done in MPEG-2 compressed form, so studio production involves a great deal of processing of MPEG-encoded programs; for example, non-linear editing involves switching between different programs, cutting and splicing, adding subtitles to the picture, adding logos, and fading in and out. Program cutting and serial (assemble) editing are the basis of the other editing functions and are therefore especially important.

For MPEG-2 audio and video, DirectShow Editing Services (DES), part of DirectShow (Microsoft's development kit for streaming-media processing on the Windows platform), is used here to implement cutting and serial editing of MPEG-2 audio and video programs. As an enhancement layered on top of DirectShow, DES simplifies tedious video editing and fills a gap in non-linear editing application software, allowing audio and video files to be edited easily; it is therefore an effective way to edit MPEG-2 audio and video files.

DES Introduction

Timeline Model
The internal structure of DES is a timeline-based model. It introduces the concepts of the timeline and the track and organizes all multimedia edits onto a virtual timeline, which in effect represents the final audio/video production. To the programmer, the timeline and tracks are exposed as Component Object Model (COM) interfaces; they are pure virtual classes from which the functions needed for editing can be inherited, derived, and extended. Figure 1 shows the internal model of the DES timeline.

This is a tree structure. In this tree, audio and video files are the leaf nodes and are called media sources. One or more sources form a track, and each track outputs a single, uniform media format. One or more tracks are in turn combined into a composition.
Each composition can perform various complex edits on all of its sub-compositions or tracks. Top-level compositions or tracks constitute a group, and each group outputs a media stream in a single format. All groups together form the timeline. The timeline represents a video editing project and is the root node of the tree. A timeline project must contain at least one group; in the most typical case there are two groups, an audio group and a video group.

Figure 2 shows a typical timeline-based media track chart; the arrow indicates the direction of the timeline. This timeline consists of two groups, each containing two media source tracks. In the video group, the tracks have priorities (track 0 has the lowest priority, and so on). At run time, the media source content in the higher-priority track is always output; if the higher-priority track has no media source to output at that moment, the media source in the lower-priority track is output instead. The output sequence of the video group in Figure 2 is therefore media source A → media source C → media source B. For the audio group, the outputs of all its tracks are simply mixed together. Using this timeline principle, we assign the media material to the corresponding media sources one by one, organize the media sources onto tracks with different priorities, and finally the required composite program is output under the timeline model. This is the core model of the MPEG-2 audio and video editing function.

Time Concept
DES generally uses three types of time:
(1) Timeline time: the time relative to the entire timeline project;
(2) Media time: the time relative to the media source. For example, if the media source is a file, the media time refers to a position within that file, counted from the start of the file;
(3) Parent time: the time relative to an object's parent object in the timeline.
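As a concrete illustration of the first two, a minimal sketch is given below using the source-object interfaces that appear later in the implementation (pVideoSrcObj, pVideoSrc); the 100-nanosecond REFERENCE_TIME unit is DirectShow's default, and the specific values are purely illustrative.

// Illustrative only: place the segment of a source file running from media time
// 5 s to 15 s at timeline time 0 s to 10 s. REFERENCE_TIME is in 100-ns units.
const REFERENCE_TIME UNITS = 10000000;                  // 100-ns units per second
hr = pVideoSrcObj->SetStartStop(0 * UNITS, 10 * UNITS); // timeline time
hr = pVideoSrc->SetMediaTimes(5 * UNITS, 15 * UNITS);   // media time within the source file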

Design Scheme

The design scheme is as follows. First, the software must be able to play MPEG-2 audio and video files; while browsing, the in and out points of the required audio and video clips are marked. Second, the software must be able to preview the edited result; if the result is satisfactory, it is saved to a file. For cutting, it is only necessary to set the in and out points of the file during playback and then press the preview or save button. For serial editing, the files to be edited are opened in turn, the in and out points of each are set, and then preview or save is pressed.

Implementation Process

Implementing the playback function with DirectShow is straightforward. In this module, the GetCurrentPosition() method is used to obtain the times of the in point and the out point, which supply the media start and stop times for the subsequent editing. The following describes how to preview and save the result using DES.
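One way to capture these times during playback is through the IMediaSeeking interface of the playback graph. The following is a minimal sketch, in which pGraphPlay, inDot, and outDot are illustrative names and the returned positions are in the default 100-ns units.

IMediaSeeking *pSeeking = NULL;
LONGLONG inDot = 0, outDot = 0;
hr = pGraphPlay->QueryInterface(IID_IMediaSeeking, (void**)&pSeeking);
// When the user marks the in point:
hr = pSeeking->GetCurrentPosition(&inDot);   // current playback position
// When the user marks the out point:
hr = pSeeking->GetCurrentPosition(&outDot);
pSeeking->Release();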

Timeline Construction
To use DES to preview or save a video clip, a timeline model must first be built. The system provides a virtual interface for this (an interface that declares many pure virtual methods and exposes only an application-level contract), called IAMTimeline. What we need to do is follow the timeline structure in Figure 1, define the attributes and functions we need, and create our timeline objects. The basic node types include group, composition, track, and source.

(1) First create a timeline object.
IAMTimeline *pTL = NULL;
hr = CoCreateInstance(CLSID_AMTimeline, NULL, CLSCTX_INPROC_SERVER, IID_IAMTimeline, (void**)&pTL);
At this point we have an empty timeline framework. Next we fill in the "branches and leaves" of our timeline "tree" according to our own needs.

(2) Use the interface method IAMTimeline::CreateEmptyNode to create the various DES objects, including IAMTimelineGroup (video group pVideoGroup, audio group pAudioGroup), IAMTimelineComp (video pVideoComp, audio pAudioComp), IAMTimelineTrack (video pVideoTrack, audio pAudioTrack), and IAMTimelineSrc (video pVideoSrc, audio pAudioSrc).
The following example shows the code for the video group; the audio group is handled similarly.
IAMTimelineGroup *pVideoGroup = NULL;
IAMTimelineObj *pVideoGroupObj = NULL;
hr = pTL->CreateEmptyNode(&pVideoGroupObj, TIMELINE_MAJOR_TYPE_GROUP);
hr = pVideoGroupObj->QueryInterface(IID_IAMTimelineGroup, (void**)&pVideoGroup);
After IAMTimeline::CreateEmptyNode succeeds we obtain an IAMTimelineObj interface pointer; in other words, every DES object created this way implements the IAMTimelineObj interface.
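Before tracks are added, the new group typically also needs an uncompressed output media type and must be attached to the timeline. A minimal sketch, setting only the major media type, is shown below; the article does not show this step, so it is given here as an assumption about the usual DES usage.

AM_MEDIA_TYPE mtGroup;
ZeroMemory(&mtGroup, sizeof(AM_MEDIA_TYPE));
mtGroup.majortype = MEDIATYPE_Video;       // uncompressed video output for this group
hr = pVideoGroup->SetMediaType(&mtGroup);  // set the group's output media type
hr = pTL->AddGroup(pVideoGroupObj);        // attach the group to the timeline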

(3) Add tracks to the group.
hr = pVideoComp->VTrackInsBefore(pVideoTrackObj, -1);
hr = pVideoTrackObj->QueryInterface(IID_IAMTimelineTrack, (void**)&pVideoTrack);
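The source objects used in the next step can be created in the same way; a minimal sketch for the video source is given below (the names pVideoSrcObj and pVideoSrc match those used in step 4).

IAMTimelineObj *pVideoSrcObj = NULL;
IAMTimelineSrc *pVideoSrc = NULL;
hr = pTL->CreateEmptyNode(&pVideoSrcObj, TIMELINE_MAJOR_TYPE_SOURCE);       // create a source node
hr = pVideoSrcObj->QueryInterface(IID_IAMTimelineSrc, (void**)&pVideoSrc);  // obtain the source interface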

(4) This is the most critical step: set each media source's cut (media) times and its times on the timeline, and place it on the corresponding track. For serial editing, the author designed a class to record the file name, clip length, in point, and out point of each media source, and manages the instances with the CArray template class.
class CFileInfo
{
public:
    CString  fileName;   // media source file name
    LONGLONG clipLen;    // clip length; after the loop below it holds the cumulative timeline end time
    LONGLONG outDot;     // out point (media time)
    LONGLONG inDot;      // in point (media time)
    CFileInfo();
    virtual ~CFileInfo();
};
CArray<CFileInfo, CFileInfo&> fileArray;

// Element 0 is used as a base whose clipLen (0) marks the timeline start;
// the clips occupy indices 1 through GetSize() - 1. A separate source node
// (pVideoSrcObj / pVideoSrc, created as shown above) is needed for each clip.
fileArray[0].clipLen = 0;
for (i = 1; i < fileArray.GetSize(); i++)
{
    hr = pVideoSrcObj->SetStartStop(fileArray[i-1].clipLen,
             fileArray[i-1].clipLen + fileArray[i].clipLen);                 // set the timeline time
    hr = pVideoSrc->SetMediaTimes(fileArray[i].inDot, fileArray[i].outDot);  // set the media source time
    hr = pVideoSrc->SetMediaName(T2W(fileArray[i].fileName));                // set the media source file name
    hr = pVideoTrack->SrcAdd(pVideoSrcObj);                                  // add the media source to the track
    fileArray[i].clipLen = fileArray[i-1].clipLen + fileArray[i].clipLen;    // the next clip's timeline start is this clip's timeline end
}

Implementation of the preview function
After the timeline is built, the basic render engine IRenderEngine is created; it builds a filter graph for preview, or for output to a file, from the timeline just created, so the timeline must be passed to it. The rest of the process is simple: call ConnectFrontEnd to build the filters connected to the timeline, then call RenderOutputPins. At this point the filter graph has been created successfully; to preview, it is only necessary to call the Run() method of the IMediaControl interface.
IRenderEngine *pRenderEngine = NULL;
hr = CoCreateInstance(CLSID_RenderEngine, NULL, CLSCTX_INPROC_SERVER, IID_IRenderEngine, (void**)&pRenderEngine); // create the basic render engine
hr = pRenderEngine->SetTimelineObject(pTL);   // set the timeline to be rendered
hr = pRenderEngine->ConnectFrontEnd();        // build the front end of the graph
hr = pRenderEngine->RenderOutputPins();       // connect the front-end pins to the audio and video renderers according to their media types
IGraphBuilder *pGraph = NULL;
IMediaControl *pControl = NULL;
hr = pRenderEngine->GetFilterGraph(&pGraph);
hr = pGraph->QueryInterface(IID_IMediaControl, (void**)&pControl);
hr = pControl->Run();                         // run the filter graph

Save
After the timeline and the front end have been created, the front end outputs uncompressed audio and video streams, but what we want to save is compressed data. Therefore an MPEG-2 audio encoder, a video encoder, and a multiplexer must be added to the filter graph.

Step 1: add the video encoder, audio encoder, multiplexer, and file writer filters to the filter graph.
hr = AddFilterByCLSID(pGraph, CLSID_Video_Encoder, L"MPEG Video Encoder", &pVideoEncoder); // CLSID of the MPEG-2 video encoder filter in use; the other filters are added in the same way
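AddFilterByCLSID here is a small helper of the kind found in the DirectShow SDK samples; the author's exact implementation is not shown in the article, so the following is only a sketch matching the call above.

// Requires <dshow.h>. Creates a filter from its CLSID and adds it to the graph.
HRESULT AddFilterByCLSID(IGraphBuilder *pGraph, REFGUID clsid,
                         LPCWSTR wszName, IBaseFilter **ppF)
{
    if (!pGraph || !ppF) return E_POINTER;
    *ppF = NULL;
    IBaseFilter *pFilter = NULL;
    HRESULT hr = CoCreateInstance(clsid, NULL, CLSCTX_INPROC_SERVER,
                                  IID_IBaseFilter, (void**)&pFilter);
    if (FAILED(hr)) return hr;
    hr = pGraph->AddFilter(pFilter, wszName);        // add the filter under the given name
    if (FAILED(hr)) { pFilter->Release(); return hr; }
    *ppF = pFilter;                                  // caller releases the returned filter
    return S_OK;
}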
Step 2: obtain the number of groups and their output pin pointers, and connect each pin to the corresponding encoder according to the pin's media type.
long numGroups;
pTL->GetGroupCount(&numGroups);
IPin *pPin;
for (i = 0; i < numGroups; i++)
{
    if (pRenderEngine->GetGroupOutputPin(i, &pPin) == S_OK)
    {
        hr = GetMediaType(pPin);   // author's helper: returns TRUE if the pin outputs a video stream
        if (hr == TRUE)
            ConnectFilters(pGraph, pPin, pVideoEncoder, TRUE);
        else
            ConnectFilters(pGraph, pPin, pAudioEncoder, TRUE);
    }
}
Step 3: connect the video encoder and audio encoder filters to the multiplexer filter.
hr = ConnectFilters(pGraph, pVideoEncoder, pMux, TRUE);
hr = ConnectFilters(pGraph, pAudioEncoder, pMux, TRUE);
Step 4: connect the multiplexer to the file writer filter.
hr = ConnectFilters(pGraph, pMux, pFileWriter, TRUE);
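ConnectFilters is likewise a helper in the style of the DirectShow samples; the article does not show it, so the sketch below covers only the pin-to-filter variant used in step 2, and the filter-to-filter calls in steps 3 and 4 presumably first locate the upstream filter's unconnected output pin and then call the same routine. The fourth parameter is an assumption: TRUE selects Intelligent Connect, which lets the graph insert intermediate filters automatically.

HRESULT ConnectFilters(IGraphBuilder *pGraph, IPin *pOut,
                       IBaseFilter *pDest, BOOL bIntelligent)
{
    // Find the first unconnected input pin on the destination filter.
    IEnumPins *pEnum = NULL;
    IPin *pIn = NULL;
    HRESULT hr = pDest->EnumPins(&pEnum);
    if (FAILED(hr)) return hr;
    while (pEnum->Next(1, &pIn, NULL) == S_OK)
    {
        PIN_DIRECTION dir;
        IPin *pTmp = NULL;
        pIn->QueryDirection(&dir);
        if (dir == PINDIR_INPUT && pIn->ConnectedTo(&pTmp) == VFW_E_NOT_CONNECTED)
            break;                               // found a free input pin
        if (pTmp) pTmp->Release();
        pIn->Release();
        pIn = NULL;
    }
    pEnum->Release();
    if (pIn == NULL) return E_FAIL;              // no free input pin on the destination
    hr = bIntelligent ? pGraph->Connect(pOut, pIn)              // Intelligent Connect
                      : pGraph->ConnectDirect(pOut, pIn, NULL); // direct connection
    pIn->Release();
    return hr;
}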
Step 5: create the MPEG output file.
IFileSinkFilter *pSink = NULL;
hr = pFileWriter->QueryInterface(IID_IFileSinkFilter, (void**)&pSink);
hr = pSink->SetFileName(T2W(strSaveFile), NULL);
Finally, call the Run() method of the IMediaControl interface to perform the save.
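The save graph should be left running until the whole timeline has been written to the file. One common way to wait for completion is through IMediaEvent; the following is a minimal sketch, assuming pGraph and pControl are the graph and control interfaces obtained as in the preview section.

IMediaEvent *pEvent = NULL;
long evCode = 0;
hr = pGraph->QueryInterface(IID_IMediaEvent, (void**)&pEvent);
hr = pControl->Run();                               // start writing the file
hr = pEvent->WaitForCompletion(INFINITE, &evCode);  // blocks until EC_COMPLETE or an error event
pControl->Stop();                                   // stop the graph once writing has finished
pEvent->Release();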

Test Results

Figure 3 shows the MPEG-2 audio and video editing interface. The list box shows the files to be edited; double-clicking a file name displays the first frame of that file, and clicking the play button plays back the local file. Clicking the "in point" button sets the start point of the clip to be cut; clicking the "out point" button pauses the video and records the end point of the cut. After setting the in and out points for several files in turn and clicking the preview button, playback is smooth, with no mosaic artifacts and no obvious delay. Clicking the save button saves the serial-editing result as an MPEG-2 audio and video stream; when the saved file is played back with the Storm audio and video player, the picture is smooth and the quality is satisfactory.

Conclusion

Audio and video editing is attractive both to TV station program editing and production and to ordinary home users, so making the video editing function practical and widely available is of real significance. Using DES makes timeline management and effects compositing more convenient and reliable. At the same time, DES uses a plug-in mechanism, so many third-party plug-ins that support Microsoft DirectX can be used, which makes the software easier to extend and offers users more choices. DES also makes the program more modular, which facilitates software development and makes it easier to integrate new functions during successive upgrades. Of course, this approach to video editing also has its limitations: the editing accuracy is lower than with the traditional hardware-based approach, so it is not suitable for editing that demands very high precision. This is a general shortcoming of software-based video editing.
