FLV File Format: The Official Specification in Detail

Source: Internet
Author: User
Tags flv file

If you want to learn something new, the official manual may be the quickest route. Reading other people's summaries on the web may get you started faster, but for accuracy, thoroughness, and completeness you should still read the official manual. What follows is an analysis and summary of the official document, Video File Format Specification Version 10. Along the way, an FLV file converted with ffmpeg was used as a case study.

In an FLV file, each tag type belongs to one stream. That is, an FLV file has at most one audio stream and one video stream; a single file cannot carry several independent audio or video streams. (MP4 seems to allow this.) In addition, the FLV file format uses big-endian byte order.

Note: in the data types below, UI denotes an unsigned integer, and the number that follows is its length in bits. For example, UI8 is an unsigned integer one byte long, and UI24 is three bytes long. UB denotes a bit field, so UB5 is 5 bits of one byte. Compare bit-field structures in C.

FLV Header
Field Type Comment
Signature UI8 'F' (0x46)
Signature UI8 'L' (0x4C)
Signature UI8 'V' (0x56)
Version UI8 FLV version. 0x01 indicates FLV version 1.
Reserved UB5 The first five bits must be 0.
Has audio UB1 Flag: 1 if an audio stream is present.
Reserved UB1 Must be 0.
Has video UB1 Flag: 1 if a video stream is present.
Header size UI32 Size of the FLV header in bytes, including these four bytes. It is 9 for FLV version 1; the field exists so that later FLV versions can extend the header. The data starts at this offset from the beginning of the file.
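
The header table above can be read with a few lines of Python. This is a minimal sketch rather than a reference implementation; the function name `parse_flv_header` and the dictionary keys are made up for illustration.

```python
import struct


def parse_flv_header(buf: bytes) -> dict:
    """Parse the 9-byte FLV header (hypothetical helper, names are illustrative)."""
    if len(buf) < 9:
        raise ValueError("need at least 9 bytes for the FLV header")
    # ">3sBBI": big-endian, 3-byte signature, version, flags byte, header size
    signature, version, flags, header_size = struct.unpack(">3sBBI", buf[:9])
    if signature != b"FLV":
        raise ValueError("signature is not 'FLV'")
    return {
        "version": version,
        # Flags byte, most significant bit first: UB5 reserved, UB1 audio,
        # UB1 reserved, UB1 video.
        "has_audio": bool(flags & 0x04),
        "has_video": bool(flags & 0x01),
        "header_size": header_size,
    }


if __name__ == "__main__":
    # Version 1, audio and video flags set, header size 9.
    print(parse_flv_header(b"FLV\x01\x05\x00\x00\x00\x09"))
```

A real reader would seek to `header_size` rather than assume 9, since the field exists precisely so the header can grow.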
FLV File Body

The body consists of a sequence of tags. Each tag is followed by a 4-byte field, PreviousTagSize, that records the size of the tag just before it (11 bytes of tag header plus the data size). This field is used when reading the file backwards. (The original figure showing this layout is not reproduced here.)

Note: the four bytes immediately after the file header are also a PreviousTagSize field. Because no tag precedes it, its value is 0.

FLV Tag Structure
Field Type Comment
Tag type UI8 8: audio; 9: video; 18: script data (descriptive information). All other values are reserved.
Data size UI24 Size of the data area, not including the tag header. The tag header is 11 bytes in total.
Timestamp UI24 Timestamp of the current tag in milliseconds, relative to the first tag in the FLV file. The timestamp of the first tag is always 0. This is an absolute value, not a timestamp delta of the kind used in RTMP.
Timestamp extension UI8 Used when the timestamp is greater than 0xFFFFFF. This byte is the high 8 bits of the timestamp, and the three bytes above are the low 24 bits.
Stream ID UI24 Always 0
Data UI8[n] The data area; n is the data size given above.
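
To make the tag layout concrete, here is a small sketch that walks the body of an FLV buffer held in memory. The names `parse_tag_header` and `iter_tags` are hypothetical, and the code assumes the whole file fits in a `bytes` object; it stops quietly at a truncated tag instead of attempting recovery.

```python
TAG_HEADER_SIZE = 11  # tag type (1) + data size (3) + timestamp (3+1) + stream ID (3)


def parse_tag_header(buf: bytes, offset: int = 0) -> dict:
    """Decode the 11-byte tag header starting at `offset`."""
    if len(buf) - offset < TAG_HEADER_SIZE:
        raise ValueError("truncated tag header")
    ts_low = int.from_bytes(buf[offset + 4:offset + 7], "big")
    ts_ext = buf[offset + 7]  # high 8 bits of the 32-bit timestamp
    return {
        "tag_type": buf[offset],
        "data_size": int.from_bytes(buf[offset + 1:offset + 4], "big"),
        "timestamp_ms": (ts_ext << 24) | ts_low,
        "stream_id": int.from_bytes(buf[offset + 8:offset + 11], "big"),
    }


def iter_tags(buf: bytes, header_size: int = 9):
    """Yield (header, data, previous_tag_size) for each complete tag."""
    pos = header_size + 4  # skip the file header and the first PreviousTagSize (0)
    while pos + TAG_HEADER_SIZE <= len(buf):
        hdr = parse_tag_header(buf, pos)
        start = pos + TAG_HEADER_SIZE
        end = start + hdr["data_size"]
        if end + 4 > len(buf):
            break  # truncated tag or missing trailing PreviousTagSize
        prev_size = int.from_bytes(buf[end:end + 4], "big")
        yield hdr, buf[start:end], prev_size
        pos = end + 4
```

For a well-formed file, each yielded `previous_tag_size` should equal `11 + data_size` of the tag it follows, which gives a cheap consistency check.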
Audio Data
Field Type Comment
Audio format UB4 0 = Linear PCM, platform endian
1 = ADPCM
2 = MP3
3 = Linear PCM, little endian
4 = Nellymoser 16-kHz mono
5 = Nellymoser 8-kHz mono
6 = Nellymoser
7 = G.711 A-law logarithmic PCM
8 = G.711 mu-law logarithmic PCM
9 = reserved
10 = AAC
11 = Speex
14 = MP3 8-kHz
15 = Device-specific sound
Formats 7, 8, 14, and 15 are reserved for internal use. In practice FLV therefore does not support G.711 A-law; if such audio has to be carried, linear PCM may be used instead.
Sample rate UB2 0 = 5.5 kHz; 1 = 11 kHz; 2 = 22 kHz; 3 = 44 kHz. For AAC: always 3.
Sample size UB1 0 = snd8Bit; 1 = snd16Bit
Channel UB1 0 = mono; 1 = stereo (two channels). For AAC: always 1.
Sound data UI8[n] For linear PCM, each 16-bit sample is stored as a signed little-endian value. If the audio format is AAC, the data is an AAC audio data structure; otherwise it is a byte array whose layout depends on the format.
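
The four bit fields in the table above share the first byte of an audio tag's data area. The following sketch unpacks that byte; `parse_audio_header` and the lookup tables are illustrative names, and the sample-rate labels simply repeat the nominal values from the table.

```python
SOUND_FORMATS = {
    0: "Linear PCM, platform endian", 1: "ADPCM", 2: "MP3",
    3: "Linear PCM, little endian", 4: "Nellymoser 16-kHz mono",
    5: "Nellymoser 8-kHz mono", 6: "Nellymoser",
    7: "G.711 A-law logarithmic PCM", 8: "G.711 mu-law logarithmic PCM",
    9: "reserved", 10: "AAC", 11: "Speex", 14: "MP3 8-kHz",
    15: "Device-specific sound",
}
SAMPLE_RATES = ("5.5 kHz", "11 kHz", "22 kHz", "44 kHz")


def parse_audio_header(first_byte: int) -> dict:
    """Split the first byte of an audio tag into UB4 + UB2 + UB1 + UB1."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    fmt = first_byte >> 4
    return {
        "format": fmt,
        "format_name": SOUND_FORMATS.get(fmt, "unknown"),
        "sample_rate": SAMPLE_RATES[(first_byte >> 2) & 0x03],
        "sample_bits": 16 if (first_byte >> 1) & 0x01 else 8,
        "stereo": bool(first_byte & 0x01),
    }


if __name__ == "__main__":
    # 0xAF = 1010 11 1 1: AAC, 44 kHz, 16-bit, stereo
    print(parse_audio_header(0xAF))
```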
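AAC Audio Data

In the official specification, this structure starts with a one-byte AAC packet type (0: AAC sequence header; 1: AAC raw data), followed by the data itself.

Video Data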
Field Type Comment
Frame type UB4 1: keyframe (for AVC, a seekable frame), that is, an H.264 IDR frame from which decoding can restart
2: inter frame (for AVC, a non-seekable frame), that is, an ordinary H.264 frame
3: disposable inter frame (H.263 only)
4: generated keyframe (reserved for server use only)
5: video info/command frame
Codec ID UB4 The encoding in use:
1: JPEG (currently unused)
2: Sorenson H.263
3: Screen video
4: On2 VP6
5: On2 VP6 with alpha channel
6: Screen video version 2
7: AVC
Video data UI8[n] If the codec is AVC, see the description of AVCVIDEOPACKET below.
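
As with audio, the frame type and codec ID are packed into the first byte of a video tag's data area. A minimal sketch of the split follows; `parse_video_header` and the two dictionaries are names chosen for this example only.

```python
FRAME_TYPES = {
    1: "keyframe",
    2: "inter frame",
    3: "disposable inter frame",
    4: "generated keyframe",
    5: "video info/command frame",
}
CODEC_IDS = {
    1: "JPEG", 2: "Sorenson H.263", 3: "Screen video", 4: "On2 VP6",
    5: "On2 VP6 with alpha channel", 6: "Screen video version 2", 7: "AVC",
}


def parse_video_header(first_byte: int) -> dict:
    """Split the first byte of a video tag into frame type (high 4 bits)
    and codec ID (low 4 bits)."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("expected a single byte value")
    frame_type = first_byte >> 4
    codec_id = first_byte & 0x0F
    return {
        "frame_type": frame_type,
        "frame_type_name": FRAME_TYPES.get(frame_type, "unknown"),
        "codec_id": codec_id,
        "codec_name": CODEC_IDS.get(codec_id, "unknown"),
    }


if __name__ == "__main__":
    print(parse_video_header(0x17))  # AVC keyframe
    print(parse_video_header(0x27))  # AVC inter frame
```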
AVCVIDEOPACKET
Field Type Comment
AVC packet type UI8 0: AVC sequence header; 1: AVC NALU; 2: AVC end of sequence (a lower-level NALU sequence ender is not required).
CTS SI24 If the AVC packet type is 1, the composition time (CTS) offset, explained below; otherwise 0.
Data UI8[n] If the AVC packet type is 0, the decoder configuration, carrying the SPS and PPS. If it is 1, one or more NALUs in the format described below.
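
The CTS field is a signed 24-bit integer, which is easy to misread as unsigned. The sketch below decodes the four-byte AVCVIDEOPACKET prefix and derives a presentation time by adding the offset to the tag timestamp, following this article's reading that the tag timestamp is the DTS and that CTS is in milliseconds. The function name `parse_avc_packet_header` and the returned keys are assumptions made for illustration.

```python
def parse_avc_packet_header(body: bytes, tag_timestamp_ms: int) -> dict:
    """`body` is the video tag data after the frame type / codec ID byte."""
    if len(body) < 4:
        raise ValueError("AVCVIDEOPACKET needs at least 4 bytes")
    # SI24: signed, big-endian, three bytes
    cts_ms = int.from_bytes(body[1:4], "big", signed=True)
    return {
        "packet_type": body[0],   # 0: sequence header, 1: NALU, 2: end of sequence
        "cts_ms": cts_ms,
        "dts_ms": tag_timestamp_ms,
        "pts_ms": tag_timestamp_ms + cts_ms,
        "data": body[4:],
    }


if __name__ == "__main__":
    # A frame decoded at 10 ms with a 30 ms composition offset is shown at 40 ms.
    pkt = parse_avc_packet_header(b"\x01\x00\x00\x1e" + b"payload", 10)
    print(pkt["dts_ms"], pkt["cts_ms"], pkt["pts_ms"])
```

The numbers in the usage example mirror the P4 frame in the CTS discussion that follows (decoding time 10, display time 40).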
About CTS: this is a harder concept that has to be understood together with PTS and DTS. First, the three terms, PTS (presentation timestamp), DTS (decoding timestamp), and CTS (composition time):

PTS: the time at which the receiver shows the frame on the display. The unit is 1/90000 second.
DTS: the decoding time, taken here to be the timestamp carried in the RTP packet; it gives the decoding order. The unit is 1/90000 second.
CTS: on the understanding set out below, the composition time in the standard is the PTS, and the CTS offset is CTS = (PTS - DTS) / 90. The CTS unit is milliseconds.

PTS and DTS should differ only when the stream contains B-frames, that is, for Main profile and above. Baseline profile does not have this issue: its PTS and DTS are always identical, so CTS is always 0.

The timestamp in the FLV tag is the DTS.

ISO/IEC 14496-12:2005(E), section 8.15, Time to Sample Boxes, suggests that composition time is the presentation timestamp under a different name. This needs further confirmation.

In the figure (not reproduced here), CT is the PTS, the display time, and DT is the decoding timestamp, the RTP timestamp. I1 is the first frame, B2 the second, and so on; the sequence numbers follow the camera output order, which determines the display order. DT follows the encoding order. With B-frames in particular, P4 has to be decoded second, because B2 and B3 depend on it, yet P4 is displayed after B3 because it comes later in display order. This is the difference between the display time CT (PTS) and the decoding time DT, and it is why a CT offset exists. For example, P4 has a decoding time of 10 but a display time of 40.

Data format inside AVCVIDEOPACKET:
Field Type Comment
Length UI32 Length of the NALU, not including this length field.
NALU data UI8[n] The NALU itself, without the four-byte start code; it begins directly with the H.264 NAL header byte, for example: 65 * * * * * * * * * * * *
Length UI32 Length of the NALU, not including this length field.
NALU data UI8[n] The NALU itself, without the four-byte start code; it begins directly with the H.264 NAL header byte, for example: 65 * * * * * * * * * * * *
... ... ...
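
Splitting the length-prefixed NALU list from the table above takes only a short loop. This is a sketch under the assumption that the length field is 4 bytes, as the UI32 in the table indicates; `split_nalus` and `nal_unit_type` are hypothetical helper names.

```python
def split_nalus(data: bytes, length_size: int = 4) -> list:
    """Return the NALUs stored as [length][payload][length][payload]..."""
    nalus = []
    pos = 0
    while pos < len(data):
        if pos + length_size > len(data):
            raise ValueError("truncated NALU length field")
        size = int.from_bytes(data[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(data):
            raise ValueError("truncated NALU payload")
        nalus.append(data[pos:pos + size])
        pos += size
    return nalus


def nal_unit_type(nalu: bytes) -> int:
    """The low five bits of the first NALU byte give the H.264 NAL unit type."""
    return nalu[0] & 0x1F


if __name__ == "__main__":
    sample = b"\x00\x00\x00\x02\x65\x88" + b"\x00\x00\x00\x01\x41"
    for nalu in split_nalus(sample):
        print(nal_unit_type(nalu), nalu.hex())
```

A first byte of 0x65 yields NAL unit type 5, an IDR slice, which is why keyframe payloads typically begin with 65.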
Script Data Tags

The main item to note is the onMetaData information.

AVCDecoderConfigurationRecord

This is the data carried by an AVCVIDEOPACKET whose packet type is 0. It holds control information, recording the SPS and PPS. It usually appears in the second tag, immediately after onMetaData. A typical sequence:

0000190: 0900 0033 0000 0000 0000 0017 0000 0000
00001a0: 0164 002a ffe1 001e 6764 002a acd9 4078
00001b0: 0227 e5ff c389 4388 0400 0003 0028 0000
00001c0: 0978 3c60 c658 0100 0568 ebec b22c 0000

09 00 00 33 00 00 00 00 00 00 00: the 11-byte tag header (video tag, data size 0x33, timestamp 0, stream ID 0).
17: frame type 1 and codec ID 7, that is, H.264 keyframe (IDR) data.
00: AVC packet type 0, the AVC sequence header.
00 00 00: CTS is 0.
// AVCDecoderConfigurationRecord
01: version number.
64 00 2a: profile and level information, three bytes copied from the SPS. 0x64 means H.264 High profile, and 0x2a is the level.
ff: NALU length size. The low two bits hold the length size minus one, here 3, so each NALU length field is 4 bytes; this matches the UI32 length field in the AVCVIDEOPACKET data format.
e1: the low five bits give the number of SPS entries, here 1. The SPS array, sps[n], follows.
00 1e: the two-byte SPS length; the SPS that follows is 0x1e (30) bytes long.
6764 002a acd9 4078 0227 e5ff c389 4388 0400 0003 0028 0000 0978 3c60 c658: the SPS data.
Because there is only one SPS, the PPS information comes next.
01: the number of PPS entries, here 1. The PPS array, pps[n], follows.
00 05: the PPS size, 5 bytes.
68 eb ec b2 2c: the PPS data.
00 00 ...: the bytes after this tag, that is, the PreviousTagSize field followed by the next tag.
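
The byte-by-byte walk above can be checked mechanically. The sketch below parses the leading fields of an AVCDecoderConfigurationRecord and runs on the record bytes from the hex dump (everything after the 17 00 00 00 00 prefix). `parse_avc_decoder_config` and `SAMPLE_RECORD` are names invented for this example, and the parser ignores any optional trailing fields that some profiles may append.

```python
def parse_avc_decoder_config(rec: bytes) -> dict:
    """Extract profile, level, NALU length size, SPS list, and PPS list."""
    if len(rec) < 7:
        raise ValueError("record too short")
    pos = 6
    sps_list = []
    for _ in range(rec[5] & 0x1F):           # low 5 bits: number of SPS
        size = int.from_bytes(rec[pos:pos + 2], "big")
        pos += 2
        sps_list.append(rec[pos:pos + size])
        pos += size
    num_pps = rec[pos]
    pos += 1
    pps_list = []
    for _ in range(num_pps):
        size = int.from_bytes(rec[pos:pos + 2], "big")
        pos += 2
        pps_list.append(rec[pos:pos + size])
        pos += size
    return {
        "version": rec[0],
        "profile": rec[1],
        "level": rec[3],
        "nalu_length_size": (rec[4] & 0x03) + 1,  # low 2 bits: size minus one
        "sps": sps_list,
        "pps": pps_list,
    }


# Record bytes taken from the hex dump in this article.
SAMPLE_RECORD = bytes.fromhex(
    "0164 002a ffe1 001e"
    "6764 002a acd9 4078 0227 e5ff c389 4388"
    "0400 0003 0028 0000 0978 3c60 c658"
    "01 0005 68eb ecb2 2c"
)

if __name__ == "__main__":
    cfg = parse_avc_decoder_config(SAMPLE_RECORD)
    print(hex(cfg["profile"]), hex(cfg["level"]), cfg["nalu_length_size"])
    print(len(cfg["sps"][0]), cfg["pps"][0].hex())
```

The record is 46 bytes long; together with the 5-byte prefix that makes 51 bytes, which agrees with the data size 0x33 in the tag header.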
