The example in this article reads video data from the PC's webcam and pushes it out as a live stream over the RTMP protocol. It covers:
1. The use of FFmpeg's libavdevice
2. The basic flow of decoding, encoding, and pushing video
so it is fairly comprehensive.
To use the libavdevice functions, you first need to register the relevant components:
avdevice_register_all();
Next we list the DShow devices available on the computer:
AVFormatContext *pFmtCtx = avformat_alloc_context();
AVDeviceInfoList *device_info = NULL;
AVDictionary *options = NULL;
av_dict_set(&options, "list_devices", "true", 0);
AVInputFormat *iformat = av_find_input_format("dshow");
printf("Device Info=============\n");
avformat_open_input(&pFmtCtx, "video=dummy", iformat, &options);
printf("========================\n");
As you can see, the steps for opening a device are basically the same as for opening a file. The AVDictionary set in the code above has the same effect as entering the following command on the command line:
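ffmpeg -list_devices true -f dshow -i dummy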
On my machine, the output lists only a few DShow devices created by a virtual-camera application and no audio devices.
It is worth noting that libavdevice has an avdevice_list_devices() function that enumerates the system's capture devices, including the device name and device description, which would be well suited for letting the user pick a device; however, it does not support DShow devices, so it is not used here.
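For formats that do support it, a minimal sketch of how avdevice_list_devices() could be used looks roughly like this (variable names are illustrative and error handling is minimal):

AVDeviceInfoList *device_list = NULL;
int n = avdevice_list_devices(pFmtCtx, &device_list);   // returns the number of detected devices, negative on error
if (n >= 0) {
    for (int i = 0; i < device_list->nb_devices; i++)
        printf("%s: %s\n",
               device_list->devices[i]->device_name,
               device_list->devices[i]->device_description);
    avdevice_free_list_devices(&device_list);            // free the list when finished
}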
The next step is to open the chosen device by name as the input and initialize it, just as you would for a normal file:
av_register_all();
// Register devices
avdevice_register_all();
avformat_network_init();
// Show DShow devices
show_dshow_device();
printf("\nChoose capture device: ");
if (gets(capture_name) == 0) {
    printf("Error in gets()\n");
    return -1;
}
sprintf(device_name, "video=%s", capture_name);
ifmt = av_find_input_format("dshow");
// Set your own video device's name
if (avformat_open_input(&ifmt_ctx, device_name, ifmt, NULL) != 0) {
    printf("Couldn't open input stream.\n");
    return -1;
}
// Input initialize
if (avformat_find_stream_info(ifmt_ctx, NULL) < 0) {
    printf("Couldn't find stream information.\n");
    return -1;
}
videoindex = -1;
for (i = 0; i < ifmt_ctx->nb_streams; i++)
    if (ifmt_ctx->streams[i]->codec->codec_type == AVMEDIA_TYPE_VIDEO) {
        videoindex = i;
        break;
    }
if (videoindex == -1) {
    printf("Couldn't find a video stream.\n");
    return -1;
}
if (avcodec_open2(ifmt_ctx->streams[videoindex]->codec,
        avcodec_find_decoder(ifmt_ctx->streams[videoindex]->codec->codec_id), NULL) < 0) {
    printf("Could not open codec.\n");
    return -1;
}
After the input device has been selected and initialized, the output must be initialized accordingly. FFmpeg treats network protocols and files uniformly; since the RTMP protocol is used for transmission, we specify FLV as the output format and H.264 as the encoder.
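Here out_path is simply the RTMP address of your streaming server; the exact URL depends on your own setup, for example (hypothetical address):

const char *out_path = "rtmp://localhost:1935/live/livestream";   // hypothetical RTMP publish URL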
// Output initialize
avformat_alloc_output_context2(&ofmt_ctx, NULL, "flv", out_path);
// Output encoder initialize
pCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
if (!pCodec) {
    printf("Can not find encoder!\n");
    return -1;
}
pCodecCtx = avcodec_alloc_context3(pCodec);
pCodecCtx->pix_fmt = PIX_FMT_YUV420P;
pCodecCtx->width = ifmt_ctx->streams[videoindex]->codec->width;
pCodecCtx->height = ifmt_ctx->streams[videoindex]->codec->height;
pCodecCtx->time_base.num = 1;
pCodecCtx->time_base.den = 25;
pCodecCtx->bit_rate = 400000;
pCodecCtx->gop_size = 250;
/* Some formats, for example FLV, want stream headers to be separate. */
if (ofmt_ctx->oformat->flags & AVFMT_GLOBALHEADER)
    pCodecCtx->flags |= CODEC_FLAG_GLOBAL_HEADER;
// H.264 codec parameters
//pCodecCtx->me_range = 16;
pCodecCtx->max_qdiff = 4;
pCodecCtx->qcompress = 0.6;
pCodecCtx->qmin = 10;
pCodecCtx->qmax = 51;
// Optional param
pCodecCtx->max_b_frames = 3;
// Set H.264 preset and tune
AVDictionary *param = 0;
av_dict_set(&param, "preset", "fast", 0);
av_dict_set(&param, "tune", "zerolatency", 0);
if (avcodec_open2(pCodecCtx, pCodec, &param) < 0) {
    printf("Failed to open encoder!\n");
    return -1;
}
// Add a new stream to output; should be called by the user before avformat_write_header() for muxing
video_st = avformat_new_stream(ofmt_ctx, pCodec);
if (video_st == NULL) {
    return -1;
}
video_st->time_base.num = 1;
video_st->time_base.den = 25;
video_st->codec = pCodecCtx;
// Open output URL; set before avformat_write_header() for muxing
if (avio_open(&ofmt_ctx->pb, out_path, AVIO_FLAG_READ_WRITE) < 0) {
    printf("Failed to open output file!\n");
    return -1;
}
// Show some information
av_dump_format(ofmt_ctx, 0, out_path, 1);
// Write file header
avformat_write_header(ofmt_ctx, NULL);
With both input and output initialized, we can formally start the decode/encode/stream loop. One important point is that camera data is usually in an RGB format and needs to be converted to YUV420P, so the following preparation is required:
// Prepare before decode and encode
dec_pkt = (AVPacket *)av_malloc(sizeof(AVPacket));
// Camera data has a pix_fmt of RGB; convert it to YUV420P
img_convert_ctx = sws_getContext(ifmt_ctx->streams[videoindex]->codec->width,
    ifmt_ctx->streams[videoindex]->codec->height,
    ifmt_ctx->streams[videoindex]->codec->pix_fmt,
    pCodecCtx->width, pCodecCtx->height, PIX_FMT_YUV420P,
    SWS_BICUBIC, NULL, NULL, NULL);
pFrameYUV = avcodec_alloc_frame();
uint8_t *out_buffer = (uint8_t *)av_malloc(avpicture_get_size(PIX_FMT_YUV420P, pCodecCtx->width, pCodecCtx->height));
avpicture_fill((AVPicture *)pFrameYUV, out_buffer, PIX_FMT_YUV420P, pCodecCtx->width, pCodecCtx->height);
Now the actual decoding, encoding, and streaming can begin.
// Start decode and encode
int64_t start_time = av_gettime();
while (av_read_frame(ifmt_ctx, dec_pkt) >= 0) {
    if (exit_thread)
        break;
    av_log(NULL, AV_LOG_DEBUG, "Going to reencode the frame\n");
    pframe = av_frame_alloc();
    if (!pframe) {
        ret = AVERROR(ENOMEM);
        return -1;
    }
    //av_packet_rescale_ts(dec_pkt, ifmt_ctx->streams[dec_pkt->stream_index]->time_base,
    //    ifmt_ctx->streams[dec_pkt->stream_index]->codec->time_base);
    ret = avcodec_decode_video2(ifmt_ctx->streams[dec_pkt->stream_index]->codec, pframe,
        &dec_got_frame, dec_pkt);
    if (ret < 0) {
        av_frame_free(&pframe);
        av_log(NULL, AV_LOG_ERROR, "Decoding failed\n");
        break;
    }
    if (dec_got_frame) {
        sws_scale(img_convert_ctx, (const uint8_t* const*)pframe->data, pframe->linesize,
            0, pCodecCtx->height, pFrameYUV->data, pFrameYUV->linesize);

        enc_pkt.data = NULL;
        enc_pkt.size = 0;
        av_init_packet(&enc_pkt);
        ret = avcodec_encode_video2(pCodecCtx, &enc_pkt, pFrameYUV, &enc_got_frame);
        av_frame_free(&pframe);
        if (enc_got_frame == 1) {
            //printf("Succeed to encode frame: %5d\tsize:%5d\n", framecnt, enc_pkt.size);
            framecnt++;
            enc_pkt.stream_index = video_st->index;

            // Write PTS
            AVRational time_base = ofmt_ctx->streams[videoindex]->time_base;   // {1, 1000}
            AVRational r_framerate1 = ifmt_ctx->streams[videoindex]->r_frame_rate;   // {50, 2}
            AVRational time_base_q = { 1, AV_TIME_BASE };
            // Duration between 2 frames (us)
            int64_t calc_duration = (double)(AV_TIME_BASE)*(1 / av_q2d(r_framerate1));   // internal timestamp
            // Parameters
            //enc_pkt.pts = (double)(framecnt*calc_duration)*(double)(av_q2d(time_base_q)) / (double)(av_q2d(time_base));
            enc_pkt.pts = av_rescale_q(framecnt*calc_duration, time_base_q, time_base);
            enc_pkt.dts = enc_pkt.pts;
            enc_pkt.duration = av_rescale_q(calc_duration, time_base_q, time_base);   //(double)(calc_duration)*(double)(av_q2d(time_base_q))/(double)(av_q2d(time_base));
            enc_pkt.pos = -1;

            // Delay
            int64_t pts_time = av_rescale_q(enc_pkt.dts, time_base, time_base_q);
            int64_t now_time = av_gettime() - start_time;
            if (pts_time > now_time)
                av_usleep(pts_time - now_time);

            ret = av_interleaved_write_frame(ofmt_ctx, &enc_pkt);
            av_free_packet(&enc_pkt);
        }
    }
    else {
        av_frame_free(&pframe);
    }
    av_free_packet(dec_pkt);
}
The decoding part is relatively simple; the encoding part needs to compute pts and dts, which is more involved. Here pts and dts are computed from the frame rate.
First, the time interval between two frames is computed from the frame rate, but expressed in FFmpeg's internal time base. The internal time base is AV_TIME_BASE, defined as
#define AV_TIME_BASE 1000000
Converting a time value in seconds to FFmpeg's internal time base is therefore just converting it to microseconds:
timestamp = AV_TIME_BASE * time(s)
So we have:
// Duration between 2 frames (us)
int64_t calc_duration = (double)(AV_TIME_BASE) * (1 / av_q2d(r_framerate1));   // internal timestamp
Because enc_pkt is ultimately written to the output stream, its pts and dts should be expressed in the ofmt_ctx->streams[videoindex]->time_base time base, so a conversion between time bases is needed:
enc_pkt.pts = av_rescale_q(framecnt * calc_duration, time_base_q, time_base);
which is actually
enc_pkt.pts = (double)(framecnt * calc_duration) * (double)(av_q2d(time_base_q)) / (double)(av_q2d(time_base));
A very simple mathematical conversion.
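For example, assuming a 25 fps input and an FLV time base of {1, 1000}: calc_duration = 1000000 * (1/25) = 40000 microseconds, so the 10th frame gets pts = 10 * 40000 * (1/1000000) / (1/1000) = 400, i.e. 400 ms expressed in the output time base.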
Also, because transcoding may run much faster than real-time playback, in order to keep playback smooth we compare the dts with the elapsed wall-clock time and delay accordingly, as follows:
// Delay
int64_t pts_time = av_rescale_q(enc_pkt.dts, time_base, time_base_q);
int64_t now_time = av_gettime() - start_time;
if (pts_time > now_time)
    av_usleep(pts_time - now_time);
This is the same conversion as before, this time from ofmt_ctx->streams[videoindex]->time_base back to FFmpeg's internal time base, because av_gettime() returns time in microseconds.
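For example, if enc_pkt.dts is 400 in a {1, 1000} time base, pts_time is 400000 microseconds; if only 350000 microseconds of wall-clock time have passed since start_time, the loop sleeps for the remaining 50000 microseconds before writing the packet.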
After the main loop completes, the only thing left is to flush the encoder and write out the frames still buffered inside it:
// Flush encoder
ret = flush_encoder(ifmt_ctx, ofmt_ctx, 0, framecnt);
if (ret < 0) {
    printf("Flushing encoder failed\n");
    return -1;
}

// Write file trailer
av_write_trailer(ofmt_ctx);

// Clean up
if (video_st)
    avcodec_close(video_st->codec);
av_free(out_buffer);
avio_close(ofmt_ctx->pb);
avformat_free_context(ifmt_ctx);
avformat_free_context(ofmt_ctx);
The contents of flush_encoder() are as follows:
int flush_encoder(AVFormatContext *ifmt_ctx, AVFormatContext *ofmt_ctx, unsigned int stream_index, int framecnt) {
    int ret;
    int got_frame;
    AVPacket enc_pkt;
    if (!(ofmt_ctx->streams[stream_index]->codec->codec->capabilities & CODEC_CAP_DELAY))
        return 0;
    while (1) {
        enc_pkt.data = NULL;
        enc_pkt.size = 0;
        av_init_packet(&enc_pkt);
        ret = avcodec_encode_video2(ofmt_ctx->streams[stream_index]->codec, &enc_pkt, NULL, &got_frame);
        av_frame_free(NULL);
        if (ret < 0)
            break;
        if (!got_frame) {
            ret = 0;
            break;
        }
        printf("Flush encoder: Succeed to encode 1 frame!\tsize:%5d\n", enc_pkt.size);

        // Write PTS
        AVRational time_base = ofmt_ctx->streams[stream_index]->time_base;   // {1, 1000}
        AVRational r_framerate1 = ifmt_ctx->streams[stream_index]->r_frame_rate;   // {50, 2}
        AVRational time_base_q = { 1, AV_TIME_BASE };
        // Duration between 2 frames (us)
        int64_t calc_duration = (double)(AV_TIME_BASE)*(1 / av_q2d(r_framerate1));   // internal timestamp
        // Parameters
        enc_pkt.pts = av_rescale_q(framecnt*calc_duration, time_base_q, time_base);
        enc_pkt.dts = enc_pkt.pts;
        enc_pkt.duration = av_rescale_q(calc_duration, time_base_q, time_base);
        /* Copy packet: convert pts/dts */
        enc_pkt.pos = -1;
        framecnt++;
        ofmt_ctx->duration = enc_pkt.duration * framecnt;
        /* Mux encoded frame */
        ret = av_interleaved_write_frame(ofmt_ctx, &enc_pkt);
        if (ret < 0)
            break;
    }
    return ret;
}
As you can see, it basically repeats the encoding steps from the main loop.
At this point, live streaming of the camera data is working.
Of course, you can also use multithreading to add control functions such as "press Enter to stop streaming".
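A minimal sketch of that idea, assuming a global exit_thread flag like the one checked in the main loop and a POSIX-threads build (on Windows you could use CreateThread or _beginthreadex instead):

#include <stdio.h>
#include <pthread.h>

static volatile int exit_thread = 0;

// Runs in a separate thread: block until the user presses Enter, then tell the encode loop to stop
static void *wait_for_enter(void *arg) {
    getchar();
    exit_thread = 1;
    return NULL;
}

// Before entering the encode loop:
// pthread_t tid;
// pthread_create(&tid, NULL, wait_for_enter, NULL);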
The source code of the project.