Editor's note: In the previous blog post, we introduced the capabilities and future applications of Project Adam: intelligently identifying objects with computer vision. Now let's take a closer look at how a large-scale distributed system can effectively train giant deep neural networks (DNNs) to make this possible.
Translated (abridged) from: http://research.microsoft.com/en-us/news/features/dnnvision-071414.aspx
Can you tell the difference between the two breeds of Welsh Corgi? If you are a dog lover, you can probably answer this question, but for most people it is difficult. They may not even know that such a dog exists, let alone that there are exactly two breeds of Welsh Corgi. Both breeds are named after counties in Wales: one is called the Pembroke, the other the Cardigan. Few people can tell the two apart.
But Microsoft's Adam can do just that. Adam is a project initiated by Microsoft researchers and engineers to demonstrate that a large-scale distributed system built from commodity hardware can effectively train giant deep neural networks (DNNs). To achieve this, Microsoft's researchers created the world's most advanced photo classification system, trained on 14 million images from the ImageNet database, which sorts its images into 22,000 categories, several of which cover different kinds of dogs.
Adam knows dogs and can identify them in pictures. It recognizes different types of dogs and can even identify specific breeds, such as a Pembroke or a Cardigan corgi.
That may sound like déjà vu. A few years ago, The New York Times reported that Google had a network of 16,000 computers that had learned to identify pictures of cats. That is a difficult task for computers, and Google's achievement was remarkable.
A paper on Project Adam, currently under review, reports that Adam is 50 times faster than Google's system and more than twice as accurate. Adam is also extremely efficient, using only one-thirtieth as many machines, and it is more scalable; both were weak points of Google's system.
Trishul Chilimbi, the Microsoft researcher who spearheaded Project Adam, explains: "Our initial idea was to build an extremely efficient and highly scalable distributed system out of commodity PCs, with training speed, scalability, and task accuracy at world-class levels, in order to accomplish important, major tasks. We focused on visual tasks because that is where we had the largest public data set."
"Then consider that if we built a truly scalable system, it would be worth proving its usability. The challenge is whether we can use this large-scale system to effectively train large-scale models and make them master large datasets. In the end, all of this has been resolved. Our system is a universal system that supports the training of various DNN architectures. It can also be used to train large DNN to complete tasks such as speech recognition and text processing. "
"The machine learning models we used to train were very, very small, especially compared to the number of neurons connected to the human brain," says Chilimbi. Google's project has shown that if you train a larger model and get more data, you will be better able to accomplish the difficult AI tasks of classifying them. ”
"On the Adam Project, we tried to build a system that was much more scalable and efficient than ever before, using it to train larger models, to get more data, and to improve task accuracy." The overall goal of this study is to build a scalable training system that proves that it is a promising path for the system to master large amounts of data through training, and that you do not have to be a machine learning expert to accomplish such tasks very accurately. A system-driven approach that leverages robust computing, model sizing, and data size is really a viable path. "
Many people had said this was an impossible task, which makes the result all the more gratifying to Microsoft.
"When machine learning experts were initially invited to use distributed systems for machine learning, we were challenged. "Chilimbi said. "The basic machine learning training algorithm is synchronous. They are usually run on the same machine at all times. At the time, there were questions about the feasibility of the distributed, that synchronization costs would make the process slow, and it would never be possible to achieve high performance or scalability. ”
"What we want is not only to build a Asynchronous algorithm, but also to carry out the work to the end. Finally, we found a way to help improve robustness by not only learning but also getting better at asynchronous algorithms. The key to learning is not the training data set, but the ability to effectively sum up invisible data. "
Team members who worked on implementing the asynchronous DNN algorithms (from left to right): Karthik Kalyanaraman, Trishul Chilimbi, Johnson Apacible, Yutaka Suzue
Asynchronous techniques bring another advantage as well. "The async algorithms also helped us get out of plateaus where task accuracy stopped improving," Chilimbi said. "It's much like what happens when humans learn a new task: after a period of rapid progress, people tend to find themselves stuck."
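The asynchronous update pattern Chilimbi describes can be sketched in a few lines. Below is a minimal illustration of lock-free asynchronous SGD, in which several workers update a shared parameter vector with no synchronization at all. The model here (plain linear regression) and all names are illustrative assumptions, not Project Adam's actual implementation; the point is only that training converges despite the unsynchronized updates.

```python
import threading
import numpy as np

# Synthetic regression data: y = X @ true_w + noise.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0])
X = rng.normal(size=(1000, 2))
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(2)   # shared parameters; workers update these without locks
lr = 0.01

def worker(seed, steps=2000):
    r = np.random.default_rng(seed)
    for _ in range(steps):
        i = int(r.integers(len(X)))          # pick one training example
        grad = (X[i] @ w - y[i]) * X[i]      # gradient of the squared error
        w[:] = w - lr * grad                 # unsynchronized in-place update

# Four workers hammer on the same parameter vector concurrently.
# (CPython's GIL serializes the bytecode, but the update pattern,
# read-compute-write with no coordination, is the same.)
threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

loss = float(np.mean((X @ w - y) ** 2))
print(round(loss, 4))  # converges near the noise floor despite no locking
```

Workers occasionally overwrite one another's updates, yet the stochastic gradients still pull the parameters toward the optimum, which is the intuition behind why asynchrony can act as a mild regularizer rather than a fatal flaw.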
Johnson Apacible, engineering manager for Microsoft Research Special Projects, elaborated on Adam's approach: "When you train with pictures to recognize a car, each picture must show a complete car. But if you're an adult, even a glimpse of part of a car is enough to recognize it, because you've been trained. When a car passes at high speed the image is a little blurry, but you still know it's a car."
"Our system is to do that. This enables the system to train for different types of data, different types of situations, and to make the model more robust. ”
The project was launched 18 months ago to realize the vision of a full-featured system: one with an end-to-end scenario, able to run continuously for days at a time. The project also set new records for model size, training speed, and classification accuracy on the massive ImageNet data set.
Beyond that, Project Adam proves that deep learning, a technique that had previously proven its usefulness in speech, also applies to the visual realm. In the process, the researchers gained a deeper understanding of how DNNs actually work.
Chilimbi said: "We found that with the increase in the number of DNN layers, accuracy will be improved, but after a certain number of layers, the accuracy will no longer improve." From two convolutional layers to three convolutional layers, and then to five or six convolution layers, the effect seems to be the best. Interestingly, scientists studied the human visual cortex, where they found that the human visual cortex is located in the depths of about six neurons within the brain. ”
"It's interesting because each layer of the neural network automatically learns a higher level of functionality based on the previous layer. The topmost layer learns high-level concepts such as plants, written text, or Flash objects. It seems that by the time we go deeper, we will go into a declining function. From a biological point of view, this seems reasonable. “
Returning to Adam's ability to identify corgis, the layers work roughly as follows: the first layer learns the dog's outlines; the second may learn textures and fur; the third, body parts such as the shapes of ears and eyes; the fourth, more complex combinations of parts; and the fifth, high-level recognizable concepts like a dog's face. Information passes upward layer by layer, gradually forming an increasingly complex visual understanding.
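To make the first step of that hierarchy concrete, here is a tiny sketch of what an early convolutional layer computes: a small filter slides over the image and responds to a local pattern. The hand-coded vertical-edge detector below stands in for the outline filters a trained first layer would learn on its own; deeper layers stack such operations to build textures, parts, and eventually whole dog faces. The image and kernel are illustrative toys, not anything from Adam.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Toy "image": a bright square on a dark background.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0

# Vertical-edge kernel: responds where brightness changes from left to right.
edge = np.array([[1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0]])

resp = conv2d(img, edge)
print(np.abs(resp).max())  # strongest responses sit on the square's vertical edges
```

A real network learns dozens of such filters per layer from data rather than having them hand-coded, and feeds their responses into the next layer, which is exactly the outline-to-texture-to-parts progression described above.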
When asked how far DNNs might reshape today's computing landscape, Chilimbi returned to his view of the two prior computing eras: ever-faster computers and Moore's Law drove the first, while the second was dominated by the Internet, communications, and connectivity.
"Both times have brought profound changes, and many inventions have sprung up in both eras," he said. Today, we are at the very beginning of the real AI era, and artificial intelligence will bring about similar changes in these two eras. Of course, these are inseparable from the foundation of the previous revolution. It requires the support of powerful computing power, as well as the connectivity and availability of data to support computer learning of interesting things. ”
"So far, computers have shown excellent data processing capabilities. Now, we are starting to train computer recognition patterns, and combining the two will open a new world of applications. Imagine that if a blind person uses a mobile phone to point to a scene, the phone can describe the scene and help blind people to observe the world around them. If we take the food we're going to eat, the cell phone will be able to understand the nutritional information of the food and help us make smarter choices. ”
As for the changes deep learning will bring, Apacible says the key is scale.
"When the computer came out, people used the electronic tube to program, followed by the Assembly language, which brought some help to programmers." The resulting C language greatly facilitated the writing of the Code. Now that we are in the data age, we can get more and more data. Developing a product like Bing requires hundreds of machine learning experiences to produce a model that has a good correlation. ”
"When it comes to millions of images, it may take hundreds of thousands of machine learning experts to develop a model. The implementation of the Adam system proves to us that DNN can control such a large scale. We don't need a machine learning expert to figure out what makes this picture look like a dog. The system will learn this automatically. This is a big advantage. “
We can provide the system with massive amounts of data, such as images, voice, and text, and use that scale to train it so that it can express, understand, and help us interpret the world around us.
Now, any discussion of DNNs eventually turns to the mystery behind the magic of deep learning.
Chilimbi said: "What we have not yet understood is, when we only provide to dnn a picture and tell it ' this is a Pembroke Welsh Corgi dog ', how does it decompose this image into different feature layers? ”
"We did not provide any instructions in this regard. We just trained the algorithm and indicated ' This is the image, this is the label '. It will automatically recognize these hierarchical features. This is still a puzzle, and we know very little about this process. But it took millions of of years for nature to magically shape the brain, so it would take some time to uncover the mystery. “
DNNs are not a unique case, however.
"It's like quantum physics at the beginning of the 20th century, where experimenters and practitioners walked in front of theorists who couldn't explain the results," Chilimbi said. ”
"DNN seems to be at a similar stage. We are implementing the power and capabilities of DNN, but we still do not understand the basic workings of DNN. The internet is a powerful example of how we tend to overestimate the short-term impact of disruptive technology, but underestimate its long-term impact. As for deep learning, we still have a lot of theoretical work to do. ”
____________________________________________________________________________________
Related reading
Video: Research team member Trishul Chilimbi on Project Adam
Adam Project showcases new breakthroughs in Microsoft Research AI
Microsoft challenges Google's Artificial Brain with 'Project Adam'
Follow us:
Microsoft Research Asia Renren homepage: http://page.renren.com/600674137
Microsoft Research Asia Weibo: http://t.sina.com.cn/msra
From: http://blog.sina.com.cn/s/blog_4caedc7a0102uxol.html
Pembroke Welsh Corgi Dogs, computer vision, and the power of deep learning