When Machine Learning Meets Computer Vision (Part 2)
This post, the second in the series, was written by Jamie Shotton, Antonio Criminisi, and Sebastian Nowozin of Microsoft Research Cambridge.
In the previous post, we introduced the field of computer vision and discussed a very effective algorithm, decision trees for per-pixel classification, which has been widely used in medical image analysis and in Kinect. In this post, we look at the recently popular deep neural networks (deep learning) and their successful application in computer vision, and then consider where computer vision and machine learning are headed.
Deep Neural Networks
The training datasets used in computer vision research have improved dramatically in recent years, in both quality and quantity. These improvements owe much to crowdsourcing, which has scaled labeled image collections to millions of images. A good example is ImageNet, which contains millions of labeled images spanning tens of thousands of categories.
After several years of steady progress on the ImageNet dataset, in 2012 Krizhevsky et al. set the field alight. They showed that small algorithmic changes, combined with general-purpose GPU computing, made it possible to train convolutional neural networks with many more layers than before. Their accuracy on ImageNet's 1000-category classification task was a landmark leap. This attracted a great deal of media attention and even a wave of startup acquisitions. Since then, deep learning has become a hot topic in computer vision, and many recent papers have extended the approach to object localization, face recognition, and human pose estimation.
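To make the convolutional building block at the heart of these networks concrete, here is a minimal sketch in plain NumPy (not the authors' code): a single convolution filter followed by a ReLU nonlinearity, the operation such networks stack many layers deep. The edge-detector kernel and image size are illustrative choices.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product of the kernel with one image patch
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear unit: the nonlinearity used between layers."""
    return np.maximum(x, 0.0)

image = np.random.randn(8, 8)
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])  # simple vertical-edge detector
feature_map = relu(conv2d(image, edge_kernel))
print(feature_map.shape)  # (6, 6)
```

A real network learns the kernel weights from data and applies many such filters per layer; this sketch only shows the forward computation of one.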
Future prospects
There is no doubt that deep convolutional neural networks are powerful, but can they solve computer vision outright? We can be sure that deep learning will remain popular over the next few years and will drive related technologies forward, but we believe it still has some way to go. Although we can only speculate about the future, certain trends are already visible.
Representation: at present, these networks can recognize relatively simple image content, but they lack a deeper understanding of the relationships between the objects in a picture and the roles specific objects play in our lives (for example, inferring that a person's hair is wet because they are holding a hairdryer). New datasets such as Microsoft's COCO should further improve the situation by providing more detailed labels for the individual objects in "non-iconic" images, such as scenes containing multiple objects, none of which occupies the most prominent position.
Efficiency: although a deep neural network can be parallelized to achieve relatively fast execution, it differs from the decision trees discussed in our previous post: here every test sample must pass through every node of the network. Moreover, even with the fastest GPU clusters, training a network takes days or weeks, which limits how quickly we can experiment.
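As a rough illustration of this cost difference (the sizes below are invented for the sketch, not measurements from the post): a fully connected layer computes every unit for every sample, whereas a binary decision tree of depth d visits only d of its internal nodes per sample.

```python
import numpy as np

# A tiny fully connected network: every test sample activates every unit.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((16, 4))   # 4 inputs -> 16 hidden units
W2 = rng.standard_normal((3, 16))   # 16 hidden units -> 3 outputs

def mlp_forward(x):
    h = np.maximum(W1 @ x, 0.0)     # all 16 hidden units are computed
    return W2 @ h                   # all 3 outputs are computed

# A balanced binary decision tree of depth d evaluates only d split
# nodes per sample, out of 2**d - 1 internal nodes in total.
tree_depth = 10
nodes_visited_tree = tree_depth         # one root-to-leaf path
nodes_in_tree = 2 ** tree_depth - 1     # total internal nodes
print(nodes_visited_tree, nodes_in_tree)  # 10 1023
```

The gap widens with depth: the tree's per-sample cost grows linearly in d while its capacity grows exponentially, which is part of what makes tree-based classifiers so cheap at test time.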
Structure learning: today's deep convolutional networks use carefully hand-designed, stable architectures refined over many years of research. If we want to change them, we can usually only vary the size of each layer and the number of layers (that is, the network's depth), which admittedly has a large effect on predictive accuracy. Beyond tuning such parameters, we hope to learn more flexible network structures directly from data.
Recently, we have begun to tackle the problems above, especially the last two. We are particularly excited about our recent work on the decision jungle algorithm: an ensemble of rooted decision directed acyclic graphs (DAGs). You can think of a decision DAG as a decision tree in which each child node is allowed to have multiple parents. Compared to decision trees, we have shown that this reduces memory consumption by an order of magnitude while also improving the algorithm's generalization. Although a DAG looks very much like a neural network, there are two important differences. First, the structure of the DAG can be learned jointly with the model's parameters. Second, the DAG retains the efficient test-time behavior of a decision tree: each test sample follows only a single path through the DAG rather than traversing every node as in a neural network. We are actively studying whether decision jungles, combined with other forms of deep learning, can yield more efficient deep networks.
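To illustrate the single-path evaluation described above, here is a toy decision DAG sketch. The node layout, thresholds, and class labels are invented for illustration and do not come from the decision jungle work itself; the point is that one child node can be shared by several parents (saving memory), yet prediction still walks a single root-to-leaf path.

```python
# Toy decision DAG ("jungle"): split nodes may share children, so a node
# can have multiple parents. Splits are (feature_index, threshold,
# left_child, right_child); leaves carry a class label.
dag = {
    0: ("split", 0, 0.5, 1, 2),
    1: ("split", 1, 0.3, 3, 4),
    2: ("split", 1, 0.7, 4, 5),   # node 4 is shared by parents 1 and 2
    3: ("leaf", "cat"),
    4: ("leaf", "dog"),
    5: ("leaf", "bird"),
}

def predict(dag, x, root=0):
    """Follow one path from the root to a leaf; no other node is touched."""
    node = root
    entry = dag[node]
    while entry[0] == "split":
        _, feat, thresh, left, right = entry
        node = left if x[feat] <= thresh else right
        entry = dag[node]
    return entry[1]

print(predict(dag, [0.2, 0.9]))  # path 0 -> 1 -> 4, prints "dog"
```

Training a real jungle jointly optimizes both the split parameters and which children the parents share; this sketch only shows test-time evaluation.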
If you are interested in trying decision jungles on your own problems, you can explore them further through the Gemini module in Azure ML.
In summary, the promise of computer vision is largely tied to advances in machine learning. The recent pace of progress in computer vision has been astonishing, and we believe its future remains an exciting open book.