live555 currently supports only a limited set of audio/video file formats, such as MPG, MKV, and WebM. FFmpeg can be used to extend the set of file formats that live555 supports. After more than a month of work,
MP4 and AVI are finally supported, with the media formats MPEG-4, H.264, MP3, and AAC. The implementation follows the way live555 handles MPG files.
1. Main steps for extending the media server
1) Define an RTSP server class, MyRTSPServer, that inherits from DynamicRTSPServer. Its purpose is to re-implement the lookupServerMediaSession function and add the support code for AVI and MP4.
2) Following the MPG implementation, define the following classes:
MyDemux -> Medium:
Calls FFmpeg functions to parse the file and separate the data of each media stream in it. Each client connection corresponds to one MyDemux instance.
MyDemuxedElementaryStream -> FramedSource:
Acts as the source; it calls the MyDemux instance to obtain the data it needs. Each stream corresponds to one MyDemuxedElementaryStream instance.
MyServerDemux -> Medium:
A service class used to create the MyDemux, MyDemuxedElementaryStream, and subsession instances.
3) Subsessions
For each media format, you must implement a subsession and re-implement the virtual function createNewStreamSource to create your own source.
H.264: the subsession inherits from H264VideoFileServerMediaSubsession. The H.264 elementary stream can be obtained from the packet and handed directly to H264VideoStreamFramer for processing.
MPEG-4: the subsession inherits from MPEG4VideoFileServerMediaSubsession. Note that the data obtained from the packet is not a strict ES stream. Two classes handle MPEG-4 ES streams: MPEG4VideoStreamFramer and MPEG4VideoStreamDiscreteFramer. Select MPEG4VideoStreamDiscreteFramer here. For details, refer to the previous article "live555 MPEG4 processing".
MP3: the subsession inherits from MP3AudioFileServerMediaSubsession. Note that you also need to implement the seekStreamSource function; its body can be empty. When the MP3AudioFileServerMediaSubsession is created, the generateADUs parameter should be false and interleaving should be NULL.
AAC: the subsession inherits from FileServerMediaSubsession. Both createNewStreamSource and createNewRTPSink must be implemented. For the implementation of createNewRTPSink, see how live555 handles AAC in Matroska files.
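The relationship between one demux and its per-stream sources can be sketched without live555 or FFmpeg. The following is a minimal sketch only: Packet, SketchDemux, and the readPacket callback are hypothetical stand-ins (the callback plays the role of FFmpeg's packet reader), and packets for other streams are simply queued until their own source asks for them, in the spirit of the MPG demux that the design follows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical packet type standing in for a demuxed FFmpeg packet.
struct Packet {
    int streamIndex;
    std::vector<std::uint8_t> data;
};

// Hypothetical demux: one instance per client connection.
class SketchDemux {
public:
    // readPacket fills `out` with the next packet from the file and
    // returns false at end of file.
    explicit SketchDemux(std::function<bool(Packet&)> readPacket)
        : readPacket_(std::move(readPacket)) {
        assert(static_cast<bool>(readPacket_));
    }

    // Delivers the next packet of `streamIndex` into `out`.
    // Packets read for other streams are parked in their own queue.
    bool nextFor(int streamIndex, Packet& out) {
        std::deque<Packet>& queue = pending_[streamIndex];
        if (!queue.empty()) {
            out = std::move(queue.front());
            queue.pop_front();
            return true;
        }
        Packet p;
        while (readPacket_(p)) {
            if (p.streamIndex == streamIndex) {
                out = std::move(p);
                return true;
            }
            pending_[p.streamIndex].push_back(std::move(p));
            p = Packet();  // reset the moved-from packet before reuse
        }
        return false;  // end of file and nothing queued for this stream
    }

private:
    std::function<bool(Packet&)> readPacket_;
    std::map<int, std::deque<Packet> > pending_;
};
```

Each elementary-stream source would hold a stream index and call nextFor with it. A real implementation would also need to bound the queues so that a stalled stream cannot grow memory without limit.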
2. Notes on handling AVI and MP4 files
Data obtained through FFmpeg is generally stored in a packet. For AVI files the processing is relatively simple: the data in the packet can be extracted directly for subsequent processing. MP4 files are a little more complicated.
When H.264 data is extracted from MP4, the packet does not contain the SPS and PPS; they are stored in the extradata field of AVCodecContext. SPS and PPS in standard form can be obtained through the bitstream filter "h264_mp4toannexb". In addition, the data in the packet is not in standard start-code NALU form: the first four bytes are a length field and must be replaced with the start code 0x00000001. For details, see "FFMPEG extracting h264 NALU from MP4".
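As an illustration of the start-code replacement, here is a minimal sketch that rewrites one MP4-style packet into start-code form. It assumes every NAL unit is prefixed by a 4-byte big-endian length, and it handles packets that hold more than one NAL unit; the function name avccToAnnexB is made up for this example. It does not insert SPS or PPS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts length-prefixed NAL units into start-code-prefixed NAL units.
// Returns false and leaves `out` empty if a length field is truncated or
// points past the end of the input.
bool avccToAnnexB(const std::vector<std::uint8_t>& in,
                  std::vector<std::uint8_t>& out) {
    assert(&in != &out);  // `out` is cleared, so it must not alias `in`
    static const std::uint8_t kStartCode[4] = {0x00, 0x00, 0x00, 0x01};
    out.clear();
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < 4) {  // truncated length field
            out.clear();
            return false;
        }
        const std::uint32_t nalSize =
            (static_cast<std::uint32_t>(in[pos]) << 24) |
            (static_cast<std::uint32_t>(in[pos + 1]) << 16) |
            (static_cast<std::uint32_t>(in[pos + 2]) << 8) |
            static_cast<std::uint32_t>(in[pos + 3]);
        pos += 4;
        if (nalSize > in.size() - pos) {  // length runs past the buffer
            out.clear();
            return false;
        }
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), in.begin() + pos, in.begin() + pos + nalSize);
        pos += nalSize;
    }
    return true;
}
```

Writing into a separate output buffer keeps the original packet untouched, which is convenient when the same packet has to be reused.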
For MPEG-4 video in MP4, the packets contain only VOP-level (0x000001B6) data, while the VOS (0x000001B0) header is stored in the extradata field of AVCodecContext. Therefore you only need to add the data in extradata to the beginning of the stream.
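Adding the header once at the beginning of the stream could look like the sketch below. HeaderPrepender and startsWithStartCode are hypothetical helpers written for this example; skipping the header when the first packet already begins with a VOS start code is an extra precaution assumed here, not something the original code is known to do.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// True if `data` begins with the MPEG-4 start code 00 00 01 <code>.
bool startsWithStartCode(const Bytes& data, std::uint8_t code) {
    return data.size() >= 4 && data[0] == 0x00 && data[1] == 0x00 &&
           data[2] == 0x01 && data[3] == code;
}

// Hypothetical helper: emits the extradata once, ahead of the first packet.
class HeaderPrepender {
public:
    explicit HeaderPrepender(Bytes extradata)
        : extradata_(std::move(extradata)), firstPacketSeen_(false) {}

    // Returns the bytes to pass on for this packet.
    Bytes next(const Bytes& packet) {
        assert(!packet.empty());  // a demuxed packet should carry data
        Bytes out;
        // Prepend only before the first packet, and only if that packet
        // does not already start with a VOS start code (00 00 01 B0).
        if (!firstPacketSeen_ && !startsWithStartCode(packet, 0xB0)) {
            out = extradata_;
        }
        firstPacketSeen_ = true;
        out.insert(out.end(), packet.begin(), packet.end());
        return out;
    }

private:
    Bytes extradata_;
    bool firstPacketSeen_;
};
```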
For AAC audio in MP4, the audio configuration information (config) is also stored in extradata.
3. AAC audio configuration information (config)
AAC from MP4 is handed to MPEG4GenericRTPSink. Pay attention to the parameters passed in; see how live555 handles Matroska files:
    RTPSink* AACAudioMatroskaFileServerMediaSubsession
    ::createNewRTPSink(Groupsock* rtpGroupsock,
                       unsigned char rtpPayloadTypeIfDynamic,
                       FramedSource* /*inputSource*/) {
      MatroskaTrack* track = fOurDemux.lookup(fTrackNumber);
      return MPEG4GenericRTPSink::createNew(envir(), rtpGroupsock,
                                            rtpPayloadTypeIfDynamic,
                                            track->samplingFrequency, // sampling frequency
                                            "audio", "AAC-hbr", fConfigStr,
                                            track->numChannels);      // number of channels
    }
When MPEG4GenericRTPSink is created, fConfigStr carries the AAC configuration information, which describes how the AAC frames are encoded. The config can be transmitted in two ways. If it is carried inside the RTP payload, this is called in-band transmission; otherwise it is called out-of-band transmission, and the config must then be transmitted in the SDP. Out-of-band transmission saves bandwidth, because the config is not repeated in the RTP stream. config is a hexadecimal octet string that represents the MPEG-4 audio payload configuration data defined by ISO/IEC 14496-3 (AudioSpecificConfig for the MPEG4-GENERIC payload format; "StreamMuxConfig" is the form used by the MP4A-LATM payload format). The MP4A-LATM SDP also has a cpresent parameter, a Boolean that indicates whether the audio payload configuration data has been multiplexed into the RTP payload: 0 means it has not, 1 means it has. The default value of this parameter is 1, so in-band transmission is assumed unless cpresent=0 is given.
The MPEG4GenericRTPSink constructor shows how the configuration information is written into the SDP:
    MPEG4GenericRTPSink
    ::MPEG4GenericRTPSink(UsageEnvironment& env, Groupsock* RTPgs,
                          unsigned char rtpPayloadFormat,
                          u_int32_t rtpTimestampFrequency,
                          char const* sdpMediaTypeString,
                          char const* mpeg4Mode, char const* configString,
                          unsigned numChannels)
      : MultiFramedRTPSink(env, RTPgs, rtpPayloadFormat,
                           rtpTimestampFrequency, "MPEG4-GENERIC", numChannels),
        fSDPMediaTypeString(strDup(sdpMediaTypeString)),
        fMPEG4Mode(strDup(mpeg4Mode)), fConfigString(strDup(configString)) {
      // Check whether "mpeg4Mode" is one that we handle:
      if (mpeg4Mode == NULL) {
        env << "MPEG4GenericRTPSink error: NULL \"mpeg4Mode\" parameter\n";
      } else if (strcmp(mpeg4Mode, "AAC-hbr") != 0) { // note: only the "AAC-hbr" mode is accepted
        env << "MPEG4GenericRTPSink error: Unknown \"mpeg4Mode\" parameter: \""
            << mpeg4Mode << "\"\n";
      }

      // Set up the "a=fmtp:" SDP line for this stream:
      char const* fmtpFmt =
        "a=fmtp:%d "
        "streamtype=%d;profile-level-id=1;"
        "mode=%s;sizelength=13;indexlength=3;indexdeltalength=3;"
        "config=%s\r\n"; // the config property
      unsigned fmtpFmtSize = strlen(fmtpFmt)
        + 3 /* max char len */
        + 3 /* max char len */
        + strlen(fMPEG4Mode)
        + strlen(fConfigString); // note
      char* fmtp = new char[fmtpFmtSize];
      sprintf(fmtp, fmtpFmt,
              rtpPayloadType(),
              strcmp(fSDPMediaTypeString, "video") == 0 ? 4 : 5,
              fMPEG4Mode,
              fConfigString); // note
      fFmtpSDPLine = strDup(fmtp);
      delete[] fmtp;
    }
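To see what the resulting SDP line looks like, the formatting step can be reproduced outside live555. This is a sketch only: buildFmtpLine is a hypothetical free function, not part of live555, and it copies the field layout of the constructor's format string.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Builds an "a=fmtp:" line with the same fields as the constructor's
// format string. streamtype is 4 for "video" and 5 for anything else.
std::string buildFmtpLine(unsigned payloadType, const char* mediaType,
                          const char* mode, const char* configHex) {
    assert(mediaType != NULL && mode != NULL && configHex != NULL);
    const int streamType = std::strcmp(mediaType, "video") == 0 ? 4 : 5;
    return "a=fmtp:" + std::to_string(payloadType) +
           " streamtype=" + std::to_string(streamType) +
           ";profile-level-id=1;mode=" + mode +
           ";sizelength=13;indexlength=3;indexdeltalength=3;config=" +
           configHex + "\r\n";
}
```

For example, buildFmtpLine(96, "audio", "AAC-hbr", "1210") yields the line that a client would use to configure its AAC decoder; the payload type 96 is just an illustrative dynamic value.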
The config information obtained by FFmpeg is stored in AVCodecContext.extradata and occupies two bytes. For 44.1 kHz stereo AAC-LC audio, the two bytes are, in order, 0x12 and 0x10. They must be converted to a hexadecimal string before being stored in config, so the final value is config="1210".
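The conversion from extradata bytes to the config string, and what the two bytes mean, can be sketched as follows. The function and struct names are made up for this example, and the decoder only covers the simple two-byte case (object type below 31, sampling frequency index below 15).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Converts raw bytes to a lowercase hexadecimal string, two digits per byte.
std::string toHexConfig(const std::uint8_t* data, std::size_t size) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        out.push_back(kDigits[data[i] >> 4]);
        out.push_back(kDigits[data[i] & 0x0F]);
    }
    return out;
}

struct AacConfig {
    unsigned objectType;      // 2 = AAC-LC
    unsigned frequencyIndex;  // 4 = 44100 Hz
    unsigned channelConfig;   // 2 = stereo
};

// Decodes the leading 13 bits of a two-byte AudioSpecificConfig:
// 5 bits object type, 4 bits frequency index, 4 bits channel configuration.
AacConfig decodeAacConfig(const std::uint8_t* data, std::size_t size) {
    assert(data != NULL && size >= 2);
    const unsigned bits = (static_cast<unsigned>(data[0]) << 8) | data[1];
    AacConfig c;
    c.objectType = bits >> 11;
    c.frequencyIndex = (bits >> 7) & 0x0F;
    c.channelConfig = (bits >> 3) & 0x0F;
    return c;
}
```

With the bytes 0x12, 0x10, toHexConfig gives "1210" and decodeAacConfig gives object type 2, frequency index 4, and channel configuration 2.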
4. MP3 and AAC frame playback time
Each AAC frame contains 1024 samples, so the frame duration in microseconds can be calculated with the following formula:
frame_duration = 1024 * 1000000 / sample_rate
For example, when sample_rate = 44100 Hz, the result is about 23.22 ms.
Each MP3 frame contains 1152 samples, so:
frame_duration = 1152 * 1000000 / sample_rate
For example, when sample_rate = 44100 Hz, the result is 26.122 ms, which is why the MP3 frame playback time is so often quoted as a fixed 26 ms.
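The two formulas differ only in the number of samples per frame, so one helper covers both. A minimal sketch (the function name is made up for this example); the result is in microseconds and is truncated by integer division:

```cpp
#include <cassert>
#include <cstdint>

// Duration of one audio frame in microseconds.
// samplesPerFrame is 1024 for AAC and 1152 for MP3, as in the formulas above.
std::uint64_t frameDurationMicros(unsigned samplesPerFrame, unsigned sampleRate) {
    assert(sampleRate > 0);
    return static_cast<std::uint64_t>(samplesPerFrame) * 1000000u / sampleRate;
}
```

At 44100 Hz this gives 23219 microseconds for AAC and 26122 microseconds for MP3.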
Attached: a table of the number of samples per frame for each MPEG audio version and layer.
Open issues:
One problem remains. When VLC plays MPEG-4 video from an AVI or MP4 file, playback stutters badly and CPU usage reaches 100%, whereas playback is normal with ffplay and mplayer.
PS:
The code has been uploaded to http://download.csdn.net/detail/gavinr/4320175