When Machine Learning Encounters Machine Vision (2)

Last Update:2015-03-17 Source: Internet

Author: User



This blog post, the second on this topic, was written by Jamie Shotton, Antonio Criminisi, and Sebastian Nowozin of Microsoft Research in Cambridge.

In the last article, we introduced the field of machine vision and discussed a very effective algorithm, pixel-wise classification with decision trees, which has been widely used in medical image processing and in Kinect. In this article, we look at the recently popular deep neural networks (deep learning) and their successful application to machine vision, and then at how machine vision and machine learning may develop in the future.

Deep Neural Networks

The training datasets used in machine vision research have improved greatly in recent years, in both quality and quantity. These improvements owe a great deal to crowdsourcing, which has raised the number of labeled images into the millions. One good example is ImageNet, a dataset that contains millions of labeled images across tens of thousands of categories.

After several years of slow progress in the ImageNet community, Krizhevsky and colleagues shook up the field in 2012. They showed that a few small algorithmic changes, combined with general-purpose GPU computing, make it possible to train convolutional neural networks with more layers than before. Their accuracy on the 1000-category ImageNet test was a landmark leap. The result attracted a great deal of attention from the mass media and even led to a number of start-up acquisitions. Since then, deep learning has become a hot topic in machine vision, and many recent papers have extended the approach to object localization, face recognition, and human pose estimation.

Future Prospects

There is no doubt that deep convolutional neural networks are powerful, but can they completely solve the machine vision problem? We can be sure that deep learning will remain popular and will drive the development of related technologies over the next few years, but we believe there is still some way to go. Although we can only speculate about what the future holds, certain trends are already visible.

Representation: At present, these neural networks can identify only relatively simple image content. They do not yet have a deeper understanding of the relationships between the objects in a picture or of the role that specific objects play in our lives (for example, a network cannot simply conclude that people's hair is wet because their hair looks shiny in the picture and they are holding hairdryers). New datasets such as Microsoft's "COCO" can improve the situation by providing more detailed labels for individual objects in "atypical" images, such as images that contain multiple objects, none of which is in the most prominent position.

Efficiency: Although a deep neural network can be evaluated in parallel on an image to achieve relatively fast execution, a neural network differs from the decision trees discussed in our previous article: every test sample passes through every node of the network to produce an output. In addition, even with the fastest GPU clusters, training a neural network takes days or weeks, which limits how quickly we can experiment.
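To make the cost difference concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tiny fully connected network, its weights, and the tree depth are all made up for the example and are not taken from any real system. It counts how many units a forward pass computes and compares that with the number of split nodes a sample visits in a balanced decision tree.

```python
# Illustrative sketch only: the weights and sizes below are hypothetical.

def forward(layers, x):
    """Run a small fully connected ReLU network and count the units computed.

    layers: list of (weights, biases); weights holds one row per unit.
    """
    units_evaluated = 0
    for weights, biases in layers:
        x = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        units_evaluated += len(x)  # every unit in the layer is computed
    return x, units_evaluated

layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]),  # 3 hidden units
    ([[1.0, 1.0, 1.0]], [0.0]),                                 # 1 output unit
]
output, units_evaluated = forward(layers, [2.0, 1.0])

# A full binary tree with `depth` levels of splits stores 2**depth - 1 split
# nodes, yet a single test sample visits only one split node per level.
depth = 10
tree_split_nodes = 2 ** depth - 1
tree_nodes_visited = depth

print(output, units_evaluated, tree_split_nodes, tree_nodes_visited)
```

In the network, the work per sample grows with the total number of units. In the tree, it grows only with the depth, however many nodes the tree stores.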

Structure learning: At present, deep convolutional neural networks have a carefully designed, fixed structure that has been refined over many years of research. If we want to change it, we can only change the size of each layer and the number of layers (that is, the depth of the network), both of which do have a large effect on the predictive accuracy of the whole network. Beyond tuning these simple network parameters, we would like to learn a more flexible network structure directly from the data.

Recently, we have begun to tackle the problems above, especially the last two. We are particularly excited about our recent work on the decision jungle algorithm: an ensemble of decision directed acyclic graphs (DAGs). You can think of a decision DAG as a decision tree in which a child node may have multiple parent nodes. We have shown that, compared with decision trees, this reduces memory consumption by an order of magnitude and also improves generalization. Although a DAG looks much like a neural network, there are two important differences. First, the structure of the DAG can be trained at the same time as the parameters of the model. Second, a DAG retains the efficient evaluation of a decision tree: each test sample follows only one path through the DAG rather than traversing every node as in a neural network. We are actively investigating whether decision jungles, combined with other forms of deep learning, can yield more efficient deep neural networks.
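The following Python sketch illustrates the two properties of a decision DAG described above: a child node may have more than one parent, and a test sample follows a single path. It is a hand-built toy, not the decision jungle training algorithm. The Node class, the thresholds, and the labels are hypothetical, and the joint learning of structure and parameters is not shown.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass
class Node:
    """A split node (feature/threshold/children) or a leaf (label set)."""
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None  # set only on leaves

def predict(root: Node, x: Sequence[float]) -> Tuple[str, int]:
    """Follow one path from the root to a leaf; return label and splits visited."""
    node, visited = root, 0
    while node.label is None:
        visited += 1
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.label, visited

# Hypothetical hand-built DAG. `shared` is reachable from both `root` and
# `left_only`, so it has two parents; a tree would need a copy for each.
leaf_a = Node(label="A")
leaf_b = Node(label="B")
shared = Node(feature=1, threshold=0.5, left=leaf_a, right=leaf_b)
left_only = Node(feature=1, threshold=0.2, left=leaf_a, right=shared)
root = Node(feature=0, threshold=0.5, left=left_only, right=shared)

print(predict(root, [0.9, 0.7]))
print(predict(root, [0.1, 0.3]))
print(predict(root, [0.1, 0.1]))
```

Because `shared` is stored once and reused, the graph needs fewer nodes than the equivalent tree. This is the intuition behind the memory savings mentioned above.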

If you are interested in applying decision jungles to your own problem, you can explore them further through the decision jungle modules in Azure ML.

All in all, the bright prospects of machine vision owe a great deal to advances in machine learning. The recent rapid progress in machine vision has been remarkable, but we believe its future is still an exciting open book.

Jamie, Antonio and Sebastian.

(Responsible editor: Mengyishan)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
