How Can Machine Learning Pry Open the Hundred-Billion-Yuan Commercial Video Market?

Last Update:2016-08-10 Source: Internet

Author: User

(Pictured: Chieh, CEO of Viscovery, an intelligent video and image analysis start-up)

Online video is rapidly growing into a huge market. According to the Cisco Visual Networking Index (VNI) report released in June 2016, video will account for 82% of consumer Internet traffic by 2020, while Internet video surveillance traffic will grow 10-fold and global virtual reality traffic 61-fold between 2015 and 2020. In China, a market research firm predicts that the online video market will reach hundreds of billions of yuan in 2018.

With a video market this large, how to commercialize video is a focus for operators and Internet companies alike. In the past, video on demand, pre-roll and in-stream advertising, and live broadcasting were the main forms of video commercialization. In the age of artificial intelligence, the next big trend is to use machine learning to capture and identify images in video in real time, so that new business models such as advertising and e-commerce shopping can be matched to content more accurately. Major advances in machine learning algorithms and the underlying hardware are what make this possible.

Sundar Pichai, Google's CEO, has said that machine learning is a core, transformative way by which Google is rethinking everything it does. Viscovery, a start-up that Google has named a "successful and innovative enterprise", has been using Intel technology since 2011 to develop VDS, an intelligent video discovery platform that captures and recognizes images in video in real time.

Exploration of intelligent video recognition based on machine learning

Viscovery is a start-up that brings together high-end talent from the United States, China, and Taiwan, and it has been researching image recognition technology since 2011. CEO Chieh said that Viscovery's goal is to parse video content automatically through big data mining, enabling applications such as accurate ad matching, video shopping and social features, and monitoring for pornographic and violent content.

After years of research and development in image recognition and extensive work with customers, Viscovery built the intelligent video discovery platform VDS. Its full-range video content recognition engine, FITAMOS, performs multi-modal recognition of face, image (pictures and trademarks), text, audio (sound, dialogue, and music), motion, object, and scene.

By identifying these seven kinds of objects in a video, VDS automatically generates information, labels, and product listings, removing the need for manual audio and video tagging. It then matches the classified targets and object information to the appropriate channels so that ads, e-commerce links, and social features are placed accurately. The result is higher ad or e-commerce revenue, turning video traffic into tangible income.

In short, VDS can analyze a 1-hour video automatically, picking out the earrings, necklaces, notebooks, smartphones, and other items in it, along with the second at which each appears and the kind of scene it appears in. This helps advertisers and video sites find better ad opportunities more accurately. "In the past, manual processing could handle perhaps 100 or 1,000 films. With this system, 1 million or 10 million videos can be processed, and ad points can be found in time to achieve better delivery," said Viscovery CEO Chieh.
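
The output described above can be pictured as a list of time-stamped detections that an ad-matching step then filters. The following is a minimal sketch in Python and not Viscovery's actual API: the `Detection` record, the `find_ad_slots` function, the minimum-duration rule, and the sample data are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One recognized item in a video (hypothetical record layout)."""
    label: str      # e.g. "necklace"
    start_s: float  # second at which the item first appears
    end_s: float    # second at which it disappears
    scene: str      # scene in which it was seen


def find_ad_slots(detections, wanted_labels, min_duration_s=2.0):
    """Return detections that match an advertiser's labels and stay
    on screen for at least min_duration_s seconds (assumed rule)."""
    wanted = set(wanted_labels)
    slots = [
        d for d in detections
        if d.label in wanted and (d.end_s - d.start_s) >= min_duration_s
    ]
    # Earliest opportunities first.
    return sorted(slots, key=lambda d: d.start_s)


# Made-up detections for a one-hour video.
detections = [
    Detection("smartphone", 125.0, 131.5, "office"),
    Detection("necklace", 40.0, 41.0, "party"),
    Detection("necklace", 900.0, 912.0, "restaurant"),
    Detection("notebook", 300.0, 304.0, "office"),
]

slots = find_ad_slots(detections, ["necklace", "smartphone"])
for d in slots:
    print(f"{d.label} at {d.start_s:.0f}s in {d.scene}")
```

Here the one-second necklace appearance is dropped as too short, and the remaining matches are listed in playback order. A real system would presumably rank opportunities with far richer signals than a single duration threshold.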

VDS is currently offered in three ways. The first is a lightweight SaaS service: the user uploads a video and the analysis results are returned. The second is for large Internet companies with millions of videos, which can deploy the VDS system directly on a cluster in their own data center. The third is to send videos to Viscovery for processing in its own machine room, which is built on an Intel-based high-performance computing cluster.

High-performance computing boosts machine learning

"We face more challenges than others because we have to deal with billions of images," he added. "After 2012 and 2013, more and more people began to use neural networks to process images. Whether it is GoogLeNet or VGG, or Caffe or Torch, running deep learning experiments across so many architectures takes a week or a month before you know the results."

Chieh noted in particular that in the 2015 ImageNet competition, Microsoft's "deep residual network" reduced the error rate of image recognition to about 3.57%, lower than the human error rate of 5.1%, which was a major breakthrough. One key factor was a deep neural network of up to 152 layers. Generally speaking, a 1U machine with one of the GPU cards common on the market today can train a neural network of about 15 to 20 layers, so it is difficult to reach a depth of 100 or 200 layers.

In June 2016, at the ISC International Supercomputing Conference, Intel unveiled the second-generation Xeon Phi processor, codenamed Knights Landing (KNL). This family of x86 CPUs with up to 72 cores is the first Xeon Phi generation available as standalone processors, which means a highly scalable machine learning cluster can be built CPU-only, without GPUs. KNL also comes with 16 GB of MCDRAM high-bandwidth memory delivering 490 GB/s of memory bandwidth, plus 6 DDR4 memory channels for up to 384 GB of memory. KNL is also the first processor to support the new AVX-512 instruction set, which greatly accelerates deep learning.

GPUs can also be clustered, with the GPU servers connected via Ethernet or InfiniBand. For high-performance computing, however, Intel has developed the Omni-Path high-speed interconnect, with bandwidth of up to 100 Gb/s. Both computation and transmission speeds far exceed those of earlier technology. Viscovery chief scientist Chen Yansian stressed that buying 100 GPU machines for deep learning training may yield only a 30-fold speedup, whereas KNL-based machines connected through the Omni-Path architecture scale almost linearly: 100 machines can be 80 to 90 times faster.
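
The scaling figures quoted above can be expressed as a parallel efficiency, defined as the speedup divided by the number of machines. The short Python sketch below only does that arithmetic on the numbers from the quote. The function name `parallel_efficiency` is an illustrative choice, and the inputs are the speaker's estimates rather than measured benchmarks.

```python
def parallel_efficiency(speedup, machines):
    """Fraction of ideal linear speedup actually achieved."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return speedup / machines


machines = 100

# Figures as quoted in the article (estimates, not benchmarks).
gpu_eff = parallel_efficiency(30, machines)
knl_low = parallel_efficiency(80, machines)
knl_high = parallel_efficiency(90, machines)

print(f"GPU cluster: {gpu_eff:.0%}")
print(f"KNL cluster: {knl_low:.0%} to {knl_high:.0%}")
```

On these estimates, the 100-machine GPU cluster delivers about 30% of ideal linear scaling, while the KNL cluster delivers 80% to 90%.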

Chieh said: "If you are just training a simple, small-scale 20-layer neural network, use a GPU. But when the challenge is hundreds of millions of images and tens of thousands of objects, and you need faster training to serve enterprises, you need a more complete architecture that unites computing, storage, and network transmission in order to do deeper machine learning." Chen Yansian added that Viscovery is also studying thousand-layer neural networks. Put simply, this means connecting ten hundred-layer neural networks in parallel so that 10 objects in a video image are recognized at the same time.

Viscovery was the first to attempt a video deep learning platform on the new CPU-only architecture, which is 3 to 6 times more energy efficient for video stream recognition than traditional solutions. Chieh said that by integrating hardware and software, Viscovery can serve the needs of video and broadcast platforms more comprehensively and efficiently, allowing deep learning to be widely used in business settings.

At Computex, Intel, Quanta, and Viscovery jointly presented a complete video analytics solution that integrates Intel Xeon E5 and Xeon Phi processors, Quanta's system design, and Viscovery's software. It covers servers, algorithm libraries, and open source software, and is intended for large-scale deployments.

Intel's machine learning "ambition"

In a blog post dated April 18, 2016, Joe Spisak, Intel's director of machine learning strategy and business development, cited Sundar Pichai's well-known assertion. While Internet giants such as Google are using machine learning to rethink the future, Intel's approach to machine learning is not an investment in just one or two chip product lines but a complete strategy.

Spisak says Intel's machine learning strategy includes the underlying Intel Xeon E5 and Xeon Phi families, SSDs, next-generation memory technologies, the Omni-Path architecture, and more, which together make up single machine learning nodes or clusters. The Intel Xeon E5 processor, the first-generation Xeon Phi coprocessor, and the second-generation Xeon Phi processor family provide a cost-effective hybrid x86 server solution for building machine learning clusters.

Coupled with Lustre-based parallel file systems, MCDRAM high-speed integrated memory, and the HPC Orchestrator installation software, the Intel Scalable System Framework (SSF) organizes these underlying compute, storage, and networking technologies in a balanced way. It can accommodate supercomputers ranging from small systems to the largest in the TOP500, as well as a variety of compute-intensive and data-intensive scenarios.

On top of that, Intel offers highly optimized software libraries and tools to get the most performance out of the underlying hardware. The Intel Math Kernel Library (MKL) is an optimized library of basic math routines, and the Intel Data Analytics Acceleration Library (DAAL) provides an optimized set of machine learning algorithms. These libraries abstract away the hardware and the instruction set architecture (ISA), hiding the complexity of the underlying hardware and simplifying programming.

Intel also actively works with open source projects related to machine learning and contributes code to the open source community. These include Caffe from UC Berkeley, Theano from the University of Montreal, Torch7 as used by Facebook and Twitter, Microsoft's CNTK, and Google's TensorFlow. At a higher level, Intel also helps businesses and developers accelerate machine learning with the open source Trusted Analytics Platform (TAP). TAP provides resources ranging from big data infrastructure and cluster management tools to model development and training, and application development and deployment.

In terms of development tools, the Intel Parallel Studio XE tool suite simplifies code design, development, debugging, and optimization, and uses parallel processing to improve application performance. On compatible Intel processors and coprocessors, it helps improve the performance of C++ and Fortran applications more efficiently.

For developers, the biggest benefit of Intel's unified architecture is a single programming model and programming language. Chen Yansian noted that GPU-accelerated code cannot be executed on the CPU, so in traditional deep learning solutions the GPU is often fully loaded while the CPU sits idle. KNL can also be used as a coprocessor, with multiple KNL units alongside a main CPU, so the same code can be distributed directly across different compute nodes without recompiling. GPUs, by contrast, are not cheap per unit and also require a special programming language (CUDA).

Recently, Viscovery worked with Jiangsu Satellite TV and Proud Broadcast to provide an app for viewers of the show "We Fight". While watching with the app, at any moment you can find out about the clothes, hats, or shoes worn by Wang Kai, Xiao Jingteng, Missy, and others: one tap tells you immediately where to buy them, turning the video into an interactive scene. "This is a scenario that could not be implemented at scale in the past," Chieh said.

Intelligent video analytics has a bright future in the world of IoT. With advances in machine learning algorithms, software, and hardware, machine learning will undoubtedly be the best strategy for commercializing the hundred-billion-yuan video market. (Text: Ningchuang, "Cloud Technology Age", ID: Cloudtechtime)

This article is from the "Cloud Technology Age" blog. Please be sure to keep this source: http://cloudtechtime.blog.51cto.com/10784015/1836188
