Tutorial: Using the Python Django Framework for Video Processing
Stickyworld's web application has supported video playback for some time, but only through embedded YouTube videos. We have started building a new version with its own video support, so that our users no longer depend on the YouTube service.
I once worked on a project where the customer needed video transcoding. This is not an easy requirement: you have to read up on a large number of video, audio, and container formats before you can output video that suits web playback and browser preferences.
With this in mind, we decided to hand the transcoding work over to Encoding.com. The service lets you encode up to 1 GB of video for free; anything beyond 1 GB is charged under tiered pricing.
The code we developed is shown below. I first tested it with a small, two-second video to check that the code ran. Once that test passed without exceptions, I moved on to larger external files.
Phase 1: Users upload video files
The new version provides a quick, HTML5-based upload mechanism. The following CoffeeScript uploads a file from the client to the server:
    $scope.upload_slide = (upload_slide_form) ->
      file = document.getElementById("slide_file").files[0]
      reader = new FileReader()

      # Once the file has been read, POST it to the server as a data URL.
      reader.onload = (event) ->
        payload =
          data: event.target.result
          name: file.name
          room_id: $scope.room.id
        $.post "/world/upload_slide", payload, (response_data) ->
          if response_data.success? is not yes
            console.error "There was an error uploading the file", response_data
          else
            console.log "Upload successful", response_data

      reader.onloadstart = ->
        console.log "onloadstart"

      reader.onprogress = (event) ->
        console.log "onprogress", event.total, event.loaded, (event.loaded / event.total) * 100

      reader.onabort = ->
        console.log "onabort"

      reader.onerror = ->
        console.log "onerror"

      reader.onloadend = (event) ->
        console.log "onloadend", event

      # Start reading only after all handlers are attached.
      reader.readAsDataURL file
It is best to loop over every entry in document.getElementById("slide_file").files and upload each file in its own POST request, rather than sending all files in a single POST. We will explain this later.
Phase 2: Verify and upload data to Amazon S3
On the backend we run Django and RabbitMQ. The main packages are installed as follows:
    $ pip install 'django>=1.5.2' 'django-celery>=3.0.21' \
        'django-storages>=1.1.8' 'lxml>=3.2.3' 'python-magic>=0.4.3'
I created two models: SlideUploadQueue stores each upload, and SlideVideoMedia stores each converted video file.
    from datetime import datetime

    import pytz
    from django.contrib.auth.models import User
    from django.db import models


    class SlideUploadQueue(models.Model):
        created_by = models.ForeignKey(User)
        created_time = models.DateTimeField(db_index=True)
        original_file = models.FileField(
            upload_to=filename_sanitiser, blank=True, default='')
        media_type = models.ForeignKey(MediaType)
        encoding_com_tracking_code = models.CharField(
            default='', max_length=24, blank=True)

        STATUS_AWAITING_DATA = 0
        STATUS_AWAITING_PROCESSING = 1
        STATUS_PROCESSING = 2
        STATUS_AWAITING_3RD_PARTY_PROCESSING = 5
        STATUS_FINISHED = 3
        STATUS_FAILED = 4

        STATUS_LIST = (
            (STATUS_AWAITING_DATA, 'Awaiting data'),
            (STATUS_AWAITING_PROCESSING, 'Awaiting processing'),
            (STATUS_PROCESSING, 'Processing'),
            (STATUS_AWAITING_3RD_PARTY_PROCESSING,
                'Awaiting 3rd-party processing'),
            (STATUS_FINISHED, 'Finished'),
            (STATUS_FAILED, 'Failed'),
        )

        status = models.PositiveSmallIntegerField(
            default=STATUS_AWAITING_DATA, choices=STATUS_LIST)

        class Meta:
            verbose_name = 'Slide'
            verbose_name_plural = 'Slide upload queue'

        def save(self, *args, **kwargs):
            # Stamp new records with a timezone-aware UTC creation time.
            if not self.created_time:
                self.created_time = \
                    datetime.utcnow().replace(tzinfo=pytz.utc)
            return super(SlideUploadQueue, self).save(*args, **kwargs)

        def __unicode__(self):
            if self.id is None:
                return 'new <SlideUploadQueue>'
            return '<SlideUploadQueue> %d' % self.id


    class SlideVideoMedia(models.Model):
        converted_file = models.FileField(
            upload_to=filename_sanitiser, blank=True, default='')

        FORMAT_MP4 = 0
        FORMAT_WEBM = 1
        FORMAT_OGG = 2
        FORMAT_FL9 = 3
        FORMAT_THUMB = 4

        supported_formats = (
            (FORMAT_MP4, 'MPEG 4'),
            (FORMAT_WEBM, 'WebM'),
            (FORMAT_OGG, 'OGG'),
            (FORMAT_FL9, 'Flash 9 Video'),
            (FORMAT_THUMB, 'Thumbnail'),
        )

        mime_types = (
            (FORMAT_MP4, 'video/mp4'),
            (FORMAT_WEBM, 'video/webm'),
            (FORMAT_OGG, 'video/ogg'),
            (FORMAT_FL9, 'video/mp4'),
            (FORMAT_THUMB, 'image/jpeg'),
        )

        format = models.PositiveSmallIntegerField(
            default=FORMAT_MP4, choices=supported_formats)

        class Meta:
            verbose_name = 'Slide video'
            verbose_name_plural = 'Slide videos'

        def __unicode__(self):
            if self.id is None:
                return 'new <SlideVideoMedia>'
            return '<SlideVideoMedia> %d' % self.id
All our models pass filename_sanitiser to their FileField columns. It rewrites each file name to the form <model>/<uuid4>.<extension>, which sanitises the name and guarantees that it is unique. We also serve files through signed, time-limited URLs, which lets us control who can access our content and for how long.
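    import uuid


    def filename_sanitiser(instance, filename):
        # Folder is the lower-cased model name, e.g. 'slideuploadqueue'.
        folder = instance.__class__.__name__.lower()
        ext = 'jpg'  # fallback when the name has no usable extension

        if '.' in filename:
            t_ext = filename.split('.')[-1].strip().lower()

            if t_ext != '':
                ext = t_ext

        return '%s/%s.%s' % (folder, str(uuid.uuid4()), ext)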
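To see what the sanitiser produces without setting up Django, here is a minimal sketch. The SlideUploadQueue class below is only a hypothetical stand-in for the real model (the function reads nothing but the class name), and the input file names are made up for illustration.

```python
import re
import uuid


def filename_sanitiser(instance, filename):
    """Return '<model>/<uuid4>.<extension>' for an uploaded file."""
    folder = instance.__class__.__name__.lower()
    ext = 'jpg'  # fallback when the name has no usable extension
    if '.' in filename:
        t_ext = filename.split('.')[-1].strip().lower()
        if t_ext != '':
            ext = t_ext
    return '%s/%s.%s' % (folder, str(uuid.uuid4()), ext)


class SlideUploadQueue(object):
    """Stand-in for the Django model; only the class name is used."""


first = filename_sanitiser(SlideUploadQueue(), 'Testing.MOV')
second = filename_sanitiser(SlideUploadQueue(), 'Testing.MOV')
no_ext = filename_sanitiser(SlideUploadQueue(), 'README')

# The same input yields a different UUID each time, so names never collide.
pattern = re.compile(r'^slideuploadqueue/[0-9a-f-]{36}\.mov$')
print(first)
print(bool(pattern.match(first)), first != second, no_ext.endswith('.jpg'))
```

Note that the user's original file name is discarded entirely; only its extension survives, lower-cased.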
A file named testing.mov, for example, is uploaded via the Django Storages module and ends up at a URL such as: https://our-bucket.s3.amazonaws.com/slideuploadqueue/3fe27193-e87f-4244-9aa2-66409f70ebd3.mov
We use python-magic to verify the files uploaded from the user's browser. It detects the file type from the file's content.
    import base64

    import magic


    @verify_auth_token
    @return_json
    def upload_slide(request):
        # The browser sends a data URL; keep only the base64 payload.
        file_data = request.POST.get('data', '')
        file_data = base64.b64decode(file_data.split(';base64,')[1])
        description = magic.from_buffer(file_data)
If the detected type is an MPEG v4 system file or an Apple QuickTime movie, we know that transcoding should not run into major problems. If it is anything else, we flag it to the user.
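The decoding and type check can be sketched on their own as follows. This is an illustration, not the production view: decode_data_url and is_accepted_video are hypothetical helper names, and the matched substrings are an assumption, since the exact descriptions returned by libmagic vary between versions. To stay self-contained, the sketch takes the description as a plain string instead of calling python-magic.

```python
import base64

# Substrings assumed to appear in libmagic's description of accepted uploads.
ACCEPTED_DESCRIPTIONS = ('MPEG v4 system', 'Apple QuickTime movie')


def decode_data_url(data_url):
    """Return the raw bytes carried by a base64 data URL."""
    marker = ';base64,'
    if marker not in data_url:
        raise ValueError('not a base64 data URL')
    return base64.b64decode(data_url.split(marker, 1)[1])


def is_accepted_video(description):
    """True when a file-type description names an accepted format."""
    return any(name in description for name in ACCEPTED_DESCRIPTIONS)


# Build a tiny data URL the way FileReader.readAsDataURL would.
payload = ('data:video/quicktime;base64,' +
           base64.b64encode(b'demo').decode('ascii'))
raw = decode_data_url(payload)

print(raw)
print(is_accepted_video('ISO Media, Apple QuickTime movie'))
print(is_accepted_video('PNG image data, 16 x 16'))
```

Raising ValueError for a malformed payload avoids the IndexError that a bare split(';base64,')[1] produces when the marker is missing.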
Next, we save the video through the SlideUploadQueue model and queue a task via RabbitMQ. Because we use the Django Storages module, the file is uploaded to Amazon S3 automatically.
        slide_upload = SlideUploadQueue()
        ...
        slide_upload.status = SlideUploadQueue.STATUS_AWAITING_PROCESSING
        slide_upload.save()
        slide_upload.original_file.\
            save('anything.%s' % file_ext, ContentFile(file_data))
        slide_upload.save()

        task = ConvertRawSlideToSlide()
        task.delay(slide_upload)
Phase 3: Send the video to a third party
RabbitMQ handles the call to task.delay(slide_upload).
Now we only need to send the video file's URL and the output formats to Encoding.com. The service replies with a job ID that we can use to check the transcoding progress.
    class ConvertRawSlideToSlide(Task):
        queue = 'backend_convert_raw_slides'
        ...

        def _handle_video(self, slide_upload):
            mp4 = {
                'output': 'mp4',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'mpeg4',
                'profile': 'main',
                'vcodecparameters': 'no',
                'audio_codec': 'libfaac',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'mp4',
                'hint': 'no',
            }

            webm = {
                'output': 'webm',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_sample_rate': '44100',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libvpx',
                'profile': 'baseline',
                'vcodecparameters': 'no',
                'audio_codec': 'libvorbis',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'preset': '6',
                'file_extension': 'webm',
                'acbr': 'no',
            }

            ogg = {
                'output': 'ogg',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_sample_rate': '44100',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libtheora',
                'profile': 'baseline',
                'vcodecparameters': 'no',
                'audio_codec': 'libvorbis',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'ogg',
                'acbr': 'no',
            }

            flv = {
                'output': 'fl9',
                'size': '320x240',
                'bitrate': '256k',
                'audio_bitrate': '64k',
                'audio_channels_number': '2',
                'keep_aspect_ratio': 'yes',
                'video_codec': 'libx264',
                'profile': 'high',
                'vcodecparameters': 'no',
                'audio_codec': 'libfaac',
                'two_pass': 'no',
                'cbr': 'no',
                'deinterlacing': 'no',
                'keyframe': '300',
                'audio_volume': '100',
                'file_extension': 'mp4',
            }

            thumbnail = {
                'output': 'thumbnail',
                'time': '5',
                'video_codec': 'mjpeg',
                'keep_aspect_ratio': 'yes',
                'file_extension': 'jpg',
            }

            encoder = Encoding(settings.ENCODING_API_USER_ID,
                               settings.ENCODING_API_USER_KEY)
            resp = encoder.add_media(source=[slide_upload.original_file.url],
                                     formats=[mp4, webm, ogg, flv, thumbnail])

            media_id = None

            if resp is not None and resp.get('response') is not None:
                media_id = resp.get('response').get('MediaID')

            if media_id is None:
                slide_upload.status = SlideUploadQueue.STATUS_FAILED
                slide_upload.save()
                log.error('Unable to communicate with encoding.com')
                return False

            slide_upload.encoding_com_tracking_code = media_id
            slide_upload.status = \
                SlideUploadQueue.STATUS_AWAITING_3RD_PARTY_PROCESSING
            slide_upload.save()
            return True
Encoding.com recommends some Python wrappers for communicating with its service. I have modified parts of one, but it still needs a few more changes before I am satisfied with it. This is the modified code currently in use:
    import httplib
    from lxml import etree
    import urllib
    from xml.parsers.expat import ExpatError
    import xmltodict

    ENCODING_API_URL = 'manage.encoding.com:80'


    class Encoding(object):

        def __init__(self, userid, userkey, url=ENCODING_API_URL):
            self.url = url
            self.userid = userid
            self.userkey = userkey

        def get_media_info(self, action='GetMediaInfo', ids=[],
                headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'mediaid': ','.join(ids),
            }

            query = self._build_tree(etree.Element('query'), nodes)
            results = self._execute_request(query, headers)

            return self._parse_results(results)

        def get_status(self, action='GetStatus', ids=[], extended='no',
                headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'extended': extended,
                'mediaid': ','.join(ids),
            }

            query = self._build_tree(etree.Element('query'), nodes)
            results = self._execute_request(query, headers)

            return self._parse_results(results)

        def add_media(self, action='AddMedia', source=[], notify='', formats=[],
                instant='no',
                headers={'Content-Type': 'application/x-www-form-urlencoded'}):
            query = etree.Element('query')

            nodes = {
                'userid': self.userid,
                'userkey': self.userkey,
                'action': action,
                'source': source,
                'notify': notify,
                'instant': instant,
            }

            query = self._build_tree(etree.Element('query'), nodes)

            for format in formats:
                format_node = self._build_tree(etree.Element('format'), format)
                query.append(format_node)

            results = self._execute_request(query, headers)
            return self._parse_results(results)

        def _build_tree(self, node, data):
            for k, v in data.items():
                if isinstance(v, list):
                    for item in v:
                        element = etree.Element(k)
                        element.text = item
                        node.append(element)
                else:
                    element = etree.Element(k)
                    element.text = v
                    node.append(element)

            return node

        def _execute_request(self, xml, headers, path='', method='POST'):
            params = urllib.urlencode({'xml': etree.tostring(xml)})

            conn = httplib.HTTPConnection(self.url)
            conn.request(method, path, params, headers)
            response = conn.getresponse()
            data = response.read()
            conn.close()
            return data

        def _parse_results(self, results):
            try:
                return xmltodict.parse(results)
            except ExpatError, e:
                print 'Error parsing encoding.com response'
                print e
                return None
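To make the request body concrete, the following Python 3 sketch builds the kind of XML that add_media sends. It uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages. build_query is a hypothetical helper, and the credentials and source URL are placeholders, not real values.

```python
import xml.etree.ElementTree as ET


def build_query(userid, userkey, action, source, formats):
    """Build an Encoding.com-style <query> element (illustrative helper)."""
    query = ET.Element('query')
    for tag, text in (('userid', userid),
                      ('userkey', userkey),
                      ('action', action)):
        ET.SubElement(query, tag).text = text
    # One <source> element per input URL, as _build_tree does for lists.
    for url in source:
        ET.SubElement(query, 'source').text = url
    # One <format> element per requested output, keys in sorted order.
    for fmt in formats:
        node = ET.SubElement(query, 'format')
        for key in sorted(fmt):
            ET.SubElement(node, key).text = fmt[key]
    return query


query = build_query(
    'USER_ID', 'USER_KEY', 'AddMedia',
    source=['https://example.com/video.mov'],
    formats=[{'output': 'mp4', 'size': '320x240'}])

xml_body = ET.tostring(query, encoding='unicode')
print(xml_body)
```

In the wrapper, this string is URL-encoded under the xml form field before being posted.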
Still to do: communicate with Encoding.com over HTTPS only, with strict SSL verification, and write some unit tests.
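One possible shape for the HTTPS-only change, sketched for Python 3 (where httplib is named http.client). The helper names are hypothetical, and the sketch assumes the API host from the wrapper also answers on port 443, which should be checked against Encoding.com's documentation. No network traffic happens until a request is actually sent.

```python
import http.client
import ssl

# Host taken from ENCODING_API_URL in the wrapper; HTTPS support is assumed.
ENCODING_API_HOST = 'manage.encoding.com'


def make_strict_context():
    """Return an SSL context that verifies the certificate and host name."""
    return ssl.create_default_context()


def open_verified_connection(host=ENCODING_API_HOST, timeout=30):
    """Return an HTTPS connection object; it connects lazily on first use."""
    return http.client.HTTPSConnection(
        host, 443, timeout=timeout, context=make_strict_context())


context = make_strict_context()
conn = open_verified_connection()

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
print(conn.host, conn.port)
```

With this context, a certificate that fails validation or does not match the host name raises an ssl.SSLError when the request is sent, instead of being silently accepted.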
Phase 4: Download all new video file formats
A periodic task, scheduled through RabbitMQ, checks the transcoding progress every 15 seconds:
    class CheckUpOnThirdParties(PeriodicTask):
        run_every = timedelta(seconds=settings.THIRD_PARTY_CHECK_UP_INTERVAL)
        ...

        def _handle_encoding_com(self, slides):
            format_lookup = {
                'mp4': SlideVideoMedia.FORMAT_MP4,
                'webm': SlideVideoMedia.FORMAT_WEBM,
                'ogg': SlideVideoMedia.FORMAT_OGG,
                'fl9': SlideVideoMedia.FORMAT_FL9,
                'thumbnail': SlideVideoMedia.FORMAT_THUMB,
            }

            encoder = Encoding(settings.ENCODING_API_USER_ID,
                               settings.ENCODING_API_USER_KEY)

            job_ids = [item.encoding_com_tracking_code for item in slides]
            resp = encoder.get_status(ids=job_ids)

            if resp is None:
                log.error('Unable to check up on encoding.com')
                return False
We then check each part of Encoding.com's response before continuing:
            if resp.get('response') is None:
                log.error('Unable to get response node from encoding.com')
                return False

            resp_id = resp.get('response').get('id')

            if resp_id is None:
                log.error('Unable to get media id from encoding.com')
                return False

            slide = SlideUploadQueue.objects.filter(
                status=SlideUploadQueue.STATUS_AWAITING_3RD_PARTY_PROCESSING,
                encoding_com_tracking_code=resp_id)

            if len(slide) != 1:
                log.error('Unable to find a single record for %s' % resp_id)
                return False

            resp_status = resp.get('response').get('status')

            if resp_status is None:
                log.error('Unable to get status from encoding.com')
                return False

            if resp_status != u'Finished':
                log.debug("%s isn't finished, will check back later" % resp_id)
                return True

            formats = resp.get('response').get('format')

            if formats is None:
                log.error("No output formats were found. Something's wrong.")
                return False

            for format in formats:
                try:
                    assert format.get('status') == u'Finished', \
                        "%s is not finished. Something's wrong." % format.get('id')

                    output = format.get('output')
                    assert output in ('mp4', 'webm', 'ogg', 'fl9', 'thumbnail'), \
                        'Unknown output format %s' % output

                    s3_dest = format.get('s3_destination')
                    assert 'http://encoding.com.result.s3.amazonaws.com/' \
                        in s3_dest, 'Suspicious S3 url: %s' % s3_dest

                    https_link = \
                        'https://s3.amazonaws.com/encoding.com.result/%s' % \
                        s3_dest.split('/')[-1]
                    file_ext = https_link.split('.')[-1].strip()

                    assert len(file_ext) > 0, \
                        'Unable to get file extension from %s' % https_link

                    count = SlideVideoMedia.objects.filter(
                        slide_upload=slide,
                        format=format_lookup[output]).count()

                    if count != 0:
                        print 'There is already a %s file for this slide' % output
                        continue

                    content = self.download_content(https_link)

                    assert content is not None, \
                        'There is no content for %s' % format.get('id')
                except AssertionError, e:
                    log.error('A format did not pass all assertions: %s' % e)
                    continue
At this point everything has checked out, so we can store the video file:
                media = SlideVideoMedia()
                media.format = format_lookup[output]
                media.converted_file.save('blah.%s' % file_ext,
                                          ContentFile(content))
                media.save()
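The URL handling inside the loop can be isolated into two small functions, sketched here with hypothetical names. Unlike the substring test in the task, this version uses startswith, which is slightly stricter, and raises ValueError instead of relying on assert (assert statements are skipped when Python runs with -O). The example file name is made up.

```python
RESULT_BUCKET_PREFIX = 'http://encoding.com.result.s3.amazonaws.com/'
HTTPS_TEMPLATE = 'https://s3.amazonaws.com/encoding.com.result/%s'


def to_https_link(s3_dest):
    """Map Encoding.com's result URL to a path-style HTTPS S3 URL."""
    if not s3_dest.startswith(RESULT_BUCKET_PREFIX):
        raise ValueError('Suspicious S3 url: %s' % s3_dest)
    return HTTPS_TEMPLATE % s3_dest.split('/')[-1]


def file_extension(link):
    """Return the extension of the last path segment, without the dot."""
    name = link.split('/')[-1]
    if '.' not in name:
        raise ValueError('Unable to get file extension from %s' % link)
    ext = name.rsplit('.', 1)[1].strip()
    if not ext:
        raise ValueError('Unable to get file extension from %s' % link)
    return ext


link = to_https_link(
    'http://encoding.com.result.s3.amazonaws.com/abc123.mp4')
print(link)
print(file_extension(link))
```

The path-style form matters because the bucket name contains dots, which the wildcard certificate for *.s3.amazonaws.com does not cover when the bucket is used as a host name.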
Phase 5: Video playback through HTML5
We added a page with an HTML5 video element to our front end. We use Video.js to display the videos because of its broad browser support.
    $ bower install video.js
    bower caching git://github.com/videojs/video.js-component.git
    bower cloning git://github.com/videojs/video.js-component.git
    bower fetching video.js
    bower checking out video.js#v4.0.3
    bower copying /home/mark/.bower/cache/video.js/5ab058cd60c5615aa38e8e706cd0f307
    bower installing video.js#4.0.3
Our home page template also includes the other dependencies:
    !!! 5
    html(lang="en", class="no-js")
      head
        meta(http-equiv='Content-Type', content='text/html; charset=UTF-8')
        ...
        link(rel='stylesheet', type='text/css', href='/components/video-js-4.1.0/video-js.css')
        script(type='text/javascript', src='/components/video-js-4.1.0/video.js')
In our Angular.js/Jade templates, we include a <video> tag with <source> child tags. Each video shows a thumbnail through the poster attribute of the <video> tag. The thumbnail is captured from a frame a few seconds into the video.
    #main.span12
      video#example_video_1.video-js.vjs-default-skin(controls, preload="auto", width="640", height="264", poster="{{video_thumbnail}}", data-setup='{"example_option":true}', ng-show="videos")
        source(ng-repeat="video in videos", src="{{video.src}}", type="{{video.type}}")
Every converted format is listed in its own <source> tag. Video.js decides which format to play based on the user's browser.
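How the videos list and video_thumbnail reach the template is not shown here. As one possible sketch, the backend could assemble them like this; build_player_context is a hypothetical helper, the MIME types mirror the mime_types tuple on SlideVideoMedia, and the URLs are placeholders.

```python
# MIME types mirror the mime_types tuple on SlideVideoMedia.
MIME_TYPES = {
    'mp4': 'video/mp4',
    'webm': 'video/webm',
    'ogg': 'video/ogg',
    'fl9': 'video/mp4',
}


def build_player_context(converted):
    """Split converted files into <source> entries and a poster URL.

    `converted` is a list of (format_name, url) pairs.
    """
    videos = [{'src': url, 'type': MIME_TYPES[name]}
              for name, url in converted if name in MIME_TYPES]
    posters = [url for name, url in converted if name == 'thumbnail']
    return {
        'videos': videos,
        'video_thumbnail': posters[0] if posters else None,
    }


context = build_player_context([
    ('mp4', 'https://example.com/a.mp4'),
    ('webm', 'https://example.com/a.webm'),
    ('thumbnail', 'https://example.com/a.jpg'),
])
print(context['videos'])
print(context['video_thumbnail'])
```

Each dictionary in videos supplies the src and type attributes that the ng-repeat on the <source> tag expects, and the thumbnail is kept apart for the poster attribute.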
We still have a lot of work to do: writing unit tests and making our communication with the Encoding.com service more robust. If this kind of work interests you, get in touch with me.